"AMD - 5870 is better than fermi"

http://vr-zone.com/forums/496780/amd-ati-hd-5870-is-better-than-nvidia-fermi.html
12558204291df7f95851.jpg
 
rofl @ GFLOPs, I'd like to see Fermi released first.

Nvidia have already stated their numbers, and honestly the only way it will change is if they go down due to releasing at lower clocks than expected.

It is rather hilarious that Nvidia have gone all out to build a GPGPU with massive double precision power, and AMD with little to no extra effort have got a card with around two thirds the double precision power, and higher single precision power, with no sacrifices being made.

Considering GPGPU is 1-2% of Nvidia's business and GPU the massive majority, it's odd that Nvidia have shifted so heavily so quickly. AMD/Intel can make cards for that market when it's worth more than the $78 in revenue and almost no profit that Nvidia made last year from GPGPUs.
 
Nvidia "shader cores" aren't comparable with AMD's. The 240-core GTX280 was faster than the 800-core 4800 after all. AMD is well aware of this fact, and apparently chooses to ignore it.

Also, if Fermi has a maximum throughput of 1.5TF I will eat my own ****. Not sure where they get that number from...

I guess this is one of the pitfalls of announcing your product and not providing any performance numbers - it lets your rivals run wild with speculation!
 
I thought it was Nvidia who have themselves stated that the Fermi has 8x the single precision power of the GTX280?

In which case that does indeed equate to 1.5TF.
 
Total guesswork.

So, AMD have 'estimated' that their card is better. Whoopee-doo.


ATi have based their predictions on a pdf of the specs for Fermi on NVidia's web site. It's true that not all the specs are there, so some of the performance estimates have to be inferred, but mostly it's correct according to NVidia's own released spec.
 
The GTX280 had a single precision throughput of 0.93TF, so no, that doesn't add up.

Oops I got it slightly wrong, teaches me to post from memory :o

Nvidia claim in their white paper that Fermi has 8x the double precision performance of the 2xx series, not single precision as I posted.

http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIAFermiArchitectureWhitepaper.pdf

256 FMA ops/clock double precision, 512 FMA ops/clock single precision......8x the peak double precision floating point performance over GT200

Taking 96 GFLOP as the GTX285's double precision performance:


8*96 = 768 GFLOP, which is where ATI have got their figure from for the chart. Then again, use the Nvidia white paper, which states that the Fermi has twice the performance in single versus double precision, and you get to 1.5TFLOPS for the single precision performance.

The difference is that the GTX285 had almost 10 times the single precision performance compared to its double precision (933GFLOPS single vs 96GFLOPS double). Clearly Nvidia have concentrated on boosting the double precision of the Fermi, and the ratio is only double now.
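For anyone who wants to check the arithmetic, here is a minimal Python sketch of the estimate above. The 96 and 933 GFLOPS figures are the ones quoted in this post, the 8x and 2x factors are the whitepaper claims as read here, and the name `fermi_estimate` is made up purely for illustration.

```python
# Figures quoted in the post above; none of these are measured values.
GTX285_DP_GFLOPS = 96.0    # double precision baseline assumed by the poster
GTX285_SP_GFLOPS = 933.0   # single precision figure quoted in the post
DP_SPEEDUP = 8             # whitepaper claim: 8x peak DP over GT200
SP_PER_DP = 2              # whitepaper: 512 SP vs 256 DP FMA ops/clock


def fermi_estimate(dp_baseline_gflops):
    """Return (dp, sp) peak estimates in GFLOPS from a GT200 DP baseline."""
    dp = DP_SPEEDUP * dp_baseline_gflops
    sp = SP_PER_DP * dp
    return dp, sp


dp, sp = fermi_estimate(GTX285_DP_GFLOPS)
print(f"Fermi DP estimate: {dp:.0f} GFLOPS")
print(f"Fermi SP estimate: {sp / 1000:.2f} TFLOPS")
print(f"GTX285 SP:DP ratio: {GTX285_SP_GFLOPS / GTX285_DP_GFLOPS:.1f}")
```

With the 96 GFLOPS baseline this gives 768 GFLOPS double and roughly 1.5 TFLOPS single, so the result is only as good as that baseline.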

So unless Nvidia are doing a screwball with their white paper, I would indeed agree with ATI's comparison of the single and double precision speeds, although, as said, number of cores is irrelevant.

EDIT: I look forward to seeing pics of you eating your hat? ;)
 
From where did you get the GTX280 performance being 96GFLOP? It's off by about 20%.

The GTX280 can perform 3 double precision operations per (shader) clock, and had 30 dp processing units. At 1296 MHz this equates to 116.64GF [1296*30*3 MF]. Alternatively, you can simply divide the quoted single-precision performance by 8 (since there is one dp unit for every 8 sp units), which gives you 116.625GF. The small difference is accounted for by rounding of the single-precision performance to 3 significant figures.
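The two routes to roughly 116.6 GF can be sketched in Python as below. The 3 operations per clock is this post's assumption rather than a confirmed spec, and `peak_gflops` is a hypothetical helper name.

```python
SHADER_CLOCK_MHZ = 1296    # GTX280 reference shader clock, as quoted
DP_UNITS = 30              # one double precision unit per SM, as quoted
DP_OPS_PER_CLOCK = 3       # assumption made in this post
SP_GFLOPS_QUOTED = 933.0   # quoted single precision figure (3 sig. figs)
SP_UNITS_PER_DP_UNIT = 8   # one dp unit for every 8 sp units


def peak_gflops(clock_mhz, units, ops_per_clock):
    """Peak throughput in GFLOPS: MHz * units * ops/clock gives MFLOPS."""
    return clock_mhz * units * ops_per_clock / 1000.0


# Route 1: count the units directly.
from_units = peak_gflops(SHADER_CLOCK_MHZ, DP_UNITS, DP_OPS_PER_CLOCK)

# Route 2: divide the quoted single precision figure by 8.
from_sp = SP_GFLOPS_QUOTED / SP_UNITS_PER_DP_UNIT

# How far the 96 GFLOPS figure falls short of route 1.
shortfall = 1 - 96.0 / from_units

print(f"from units: {from_units:.2f} GF")
print(f"from SP/8:  {from_sp:.3f} GF")
print(f"96 GF is {shortfall:.1%} below the unit-count figure")
```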


AMD state in that picture that they assume a 1500 MHz shader clock for Fermi. This, along with the information from the whitepaper (8x the performance of GT200 clock-for-clock), implies that the double precision performance is 1080GF [i.e. (1500/1296)*8*116.63]. If we further assume double the single-precision performance (as is also stated in the whitepaper) then we arrive at around 2.16TF for the Fermi. This is more in the expected ballpark.

Of course, the shader-clock speed is the 'great unknown' in these computations, but AMD explicitly state that they assume a 1500 MHz shader speed (which won't be far off the truth). Again, I state that AMD's computation of single- and double-precision performance is inconsistent.
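As a rough sketch, the clock-scaled projection works out as follows in Python. The 1500 MHz clock is AMD's assumption, the 116.64 GF baseline is the figure argued for in this post, and `project_fermi` is an illustrative name only.

```python
GT200_CLOCK_MHZ = 1296.0   # GTX280 reference shader clock
GT200_DP_GFLOPS = 116.64   # baseline argued for in this post (disputed)
FERMI_CLOCK_MHZ = 1500.0   # AMD's assumed shader clock, not a confirmed spec
DP_SPEEDUP = 8             # whitepaper: 8x GT200, read as clock-for-clock
SP_PER_DP = 2              # whitepaper: SP rate is double the DP rate


def project_fermi(dp_baseline_gflops, clock_mhz,
                  base_clock_mhz=GT200_CLOCK_MHZ):
    """Scale a GT200 DP baseline by clock ratio and the 8x claim.

    Returns (dp, sp) peak estimates in GFLOPS.
    """
    clock_ratio = clock_mhz / base_clock_mhz
    dp = clock_ratio * DP_SPEEDUP * dp_baseline_gflops
    sp = SP_PER_DP * dp
    return dp, sp


dp, sp = project_fermi(GT200_DP_GFLOPS, FERMI_CLOCK_MHZ)
print(f"Fermi DP: {dp:.0f} GF, SP: {sp / 1000:.2f} TF")
```

Changing `dp_baseline_gflops` or `clock_mhz` shows how sensitive the headline number is to those two inputs.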
 
Err that's wrong though isn't it? It can't perform 3 double precision operations per (shader) clock, only 2 plus one single precision.

GTX 280, reference clocked at 1296 MHz. Notice that Port 0 instructions can be multiply-adds (2 flop/cycle) and Port 1 instructions are just multiplies (1 flop/cycle):

Single precision:

1296 MHz/s * 30 SM * (8 SP/SM * 2 flop/cycle per SP + 2 SFU * 4 FPU/SFU * 1 flop/cycle per FPU)
= Port 0 throughput + Port 1 throughput = 622080 Mflop/s + 311040 Mflop/s = 933 GFlop/s single precision

For double precision:

1296MHz/s * 30 SM * 1 double precision FPU * 2 flop/cycle = 78 GFlop/s

The Port 1 units can be co-issued with double precision instructions, so can also process 311GFlop/s of single precision multiplies while doing double precision multiply-adds. [That’s probably not terribly useful without single precision adds though.]
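The per-port breakdown quoted above can be reproduced with a few lines of Python. This is only a sketch of that arithmetic; the variable names are invented here, and the unit counts are the ones given in the quote.

```python
CLOCK_MHZ = 1296        # GTX280 reference shader clock
SM_COUNT = 30           # streaming multiprocessors

# Single precision, Port 0: 8 SPs per SM, multiply-add = 2 flop/cycle.
port0_mflops = CLOCK_MHZ * SM_COUNT * 8 * 2

# Single precision, Port 1: 2 SFUs per SM, 4 FPUs each, multiply = 1 flop/cycle.
port1_mflops = CLOCK_MHZ * SM_COUNT * 2 * 4 * 1

sp_mflops = port0_mflops + port1_mflops

# Double precision: 1 DP unit per SM, multiply-add = 2 flop/cycle.
dp_mflops = CLOCK_MHZ * SM_COUNT * 1 * 2

print(f"Port 0: {port0_mflops} Mflop/s")
print(f"Port 1: {port1_mflops} Mflop/s")
print(f"SP total: {sp_mflops / 1000:.2f} Gflop/s")
print(f"DP total: {dp_mflops / 1000:.2f} Gflop/s")
```

On these counts the double precision peak is 2 flop/cycle per SM rather than 3, which is the whole disagreement in a single line.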

You have wrongly assumed that the dp unit can process as many flops/cycle as the sp units when in fact it's only two thirds (116.64 x 2/3 = 77.76 GFLOPS).

Anyway, I'm right unfortunately, so unless Nvidia is lying with their 8 times faster dp speed then it will only be 1.5TFLOPS for single, assuming a 1500 MHz shader speed.

I don't know where you have got your "expected ballpark" for the Fermi. The only way that it can be faster than 1.5TFLOPS is if Nvidia are lying about the 8x performance of the GTX2xx series or the shader speed is a lot more than 1500.

Please show me where I am wrong if you think I am? :p

Oh and this which seems to confirm that I am not the only one who calculates DP performance this way:

Speaking of double precision, the Fermi has implemented IEEE 754-2008-compliant double-precision floating point operations. As we discussed in our Radeon HD 5870 exposé, gigaFLOPS stands for one billion FLoating point Operations Per Second. A floating point operation is a basic calculation used by the CPU to process code, especially “scientific” ones like computer AI, video encoding and physics. Double-precision FLOPs ensure a high degree of accuracy in these calculations, which translates to more accurate rendering or encoding. We guess Fermi will be north of 700 billion precision FLOPS, while the HD 5870 weighs in at 544 billion. On the other hand, the HD 5870 will deliver an assbeating in the altogether less useful single-precision category with nearly twice the performance.

http://icrontic.com/articles/nvidia_fermi_dissected
 
News shock "AMD claims its newest technology is better than its closest rival". :rolleyes:

I refer you to this post, good sir. ATI have only based it on what Nvidia themselves have told us. It's a little more than mere fiction.

ATi have based their predictions on a pdf of the specs for Fermi on NVidia's web site. It's true that not all the specs are there, so some of the performance estimates have to be inferred, but mostly it's correct according to NVidia's own released spec.
 