Interesting article on floating-point performance in AMD and Intel chips

bru · 22 Jan 2014 at 14:51

The peak CPU performance will depend on the SIMD ISA that your code was written and compiled for. We consider three cases: SSE, AVX (without FMA) and AVX with FMA (either FMA3 or FMA4).

CPU floating-point peak performance

I know the Kaveri chips were underwhelming in some respects, but I really did expect more than this: the FP performance in Kaveri is no better than that of the cores that came before it. In fact it's slightly worse, but that is possibly down to the slight difference in clock speed.

It is no secret that AMD's Bulldozer family cores (Steamroller in Kaveri and Piledriver in Trinity) are no match for recent Intel cores in FP performance due to the shared FP unit in each module. As a comparison point, one core in Haswell has the same floating point performance per cycle as two modules (or four cores) in Steamroller.
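To make that per-cycle comparison concrete, here is a rough back-of-the-envelope sketch. The pipe counts and vector widths are my own assumptions rather than figures quoted in this post (two 256-bit FMA pipes per Haswell core, two 128-bit FMA pipes shared by each Steamroller module), and the helper name `peak_flops_per_cycle` is made up for illustration.

```python
def peak_flops_per_cycle(vector_bits, element_bits, fma_pipes):
    """Peak FLOPs per cycle for a group of FMA pipes.

    Each pipe handles vector_bits // element_bits lanes per cycle, and one
    fused multiply-add counts as two floating-point operations.
    """
    lanes = vector_bits // element_bits
    return lanes * 2 * fma_pipes


# Assumed hardware figures (not taken from the post):
# Haswell core: 2 x 256-bit FMA pipes; Steamroller module: 2 x 128-bit FMA pipes.
haswell_core = peak_flops_per_cycle(256, 32, 2)        # fp32, per core
steamroller_module = peak_flops_per_cycle(128, 32, 2)  # fp32, per module

# How many Steamroller modules it takes to match one Haswell core per cycle.
modules_per_haswell_core = haswell_core // steamroller_module

print(haswell_core, steamroller_module, modules_per_haswell_core)
```

Under those assumptions the ratio works out to two modules per Haswell core, which lines up with the article's claim. It says nothing about real-world throughput, only about the theoretical peak per clock.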

Now onto GPU peaks. Here, for Haswell, we chose to include both GT2 and GT3e variants.

The fp64 support situation is a bit of a mess because some GPUs only support fp64 under some APIs. The fp64 rate of Intel's GPUs does not appear to be published but David Kanter provides an estimate of 1/4 speed compared to fp32. However Intel only enables fp64 under DirectCompute but does not enable fp64 under OpenCL for any of its GPUs.

Situation on AMD's Trinity/Richland is even more complicated. fp64 support under OpenCL is not standards-compliant and depends upon using a proprietary extension (cl_amd_fp64). Trinity/Richland do not appear to support fp64 under DirectCompute (and MS C++ AMP implementation) from what I can tell. From an API standpoint, Kaveri's GCN GPUs should work fine for fp64 under all APIs.
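For anyone wanting to check this on their own hardware, a minimal sketch of the distinction follows. It assumes you already have the device's space-separated extension string (what OpenCL reports for `CL_DEVICE_EXTENSIONS`); the function name `fp64_support` and the sample strings are hypothetical, and `cl_khr_fp64` is the Khronos extension name as opposed to AMD's `cl_amd_fp64`.

```python
def fp64_support(extensions):
    """Classify fp64 support from an OpenCL device extension string.

    `extensions` is the space-separated list a device reports for
    CL_DEVICE_EXTENSIONS. Whole names are compared, not substrings.
    """
    names = set(extensions.split())
    if "cl_khr_fp64" in names:
        return "standard"   # Khronos fp64 extension
    if "cl_amd_fp64" in names:
        return "amd-only"   # AMD's proprietary fp64 extension
    return "none"


# Hypothetical extension strings, for illustration only.
print(fp64_support("cl_khr_byte_addressable_store cl_khr_fp64"))
print(fp64_support("cl_khr_byte_addressable_store cl_amd_fp64"))
print(fp64_support("cl_khr_byte_addressable_store"))
```

Checking the standard name first means a device that reports both extensions is treated as standards-compliant.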

I didn't know that Intel didn't enable fp64 under OpenCL on any of its GPUs. It certainly doesn't help OpenCL gain ground in the marketplace when the biggest player doesn't fully support it.

I do have to highlight this sentence, though, as it made me laugh out loud, but I will refrain from comment so as not to seem antagonistic.

Situation on AMD's Trinity/Richland is even more complicated. fp64 support under OpenCL is not standards-compliant and depends upon using a proprietary extension (cl_amd_fp64).

Original article

http://www.anandtech.com/show/7711/...f-kaveri-and-other-recent-amd-and-intel-chips
