
Fermi In Trouble?

We all know it is constantly being delayed - but at the end of all those delays, what we were expecting was a stellar product from Nvidia. However, with today's press release, certain inconvenient details are revealed. Let's forget about the delays for now, and just consider the product itself.

The first Fermi GPU - GF100 - as we have known for a while now, is a 3 billion transistor giant with a die size of around 500 mm2. Compare this with the 2.15 billion transistor, 330 mm2 Cypress on the same 40nm TSMC process, and you would expect a different class of product. Unfortunately, the details revealed today cast an uncertain shadow over this basic assumption.

The first thing worth noticing is the complete absence of single precision performance figures, or of any comparison to the direct competition - i.e. ATI's GPGPUs. It is clear that Fermi's real performance advantage would have been double precision - had it hit the right clock speeds.

However, today's press release suggests Nvidia have missed target speeds by a lot. To be fair, Tesla products do clock lower, though not by much. In fact, GTX 280 and Tesla C1060 were clocked the same. Even taking a generous increase for Geforce products, things are still uncertain. As a result, DP performance is rated at between 520 GFlops and 630 GFlops. Suddenly, ATI Radeon HD 5870 - which wasn't even supposed to be a direct competitor - is performing right on par with 544 GFlops against Fermi's supposed strong point.

Consider single precision - far more important for gaming graphics - and things turn rather ugly. GF100's target speed was reported to be 1.5 GHz for the shaders. Based on the 520 / 630 GFlops figures, the shader clocks can only be estimated at 1015 MHz and 1230 MHz respectively.

The SP theoretical performance from 512 CUDA cores? Between 1.05 TFlops and 1.26 TFlops. Even less than 1.05 TFlops, considering the lesser part is likely to have units disabled. Now, no amount of overclock can bridge the enormous gap to the smaller, already available HD 5870, which stands pretty at 2.72 TFlops. Even the mainstream HD 5770 clocks in at 1.36 TFlops! Barring a different clock speed for SP units, or other technology we are unaware of, this is a dismal performance from the Fermi shaders. Sure, Nvidia's shaders are much more efficient, but this is just too massive a gap to claw back.
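The clock and SP estimates in the last two paragraphs are simple arithmetic, and can be sanity-checked with a quick sketch. This assumes Fermi's 512 CUDA cores each retire one SP FMA (2 flops) per cycle with DP at half the SP rate - so peak DP flops per cycle equals the core count - which lands within rounding of the 1015/1230 MHz and 1.05/1.26 TFlops figures above:

```python
# Back out the shader clock implied by the quoted DP figures, assuming
# 512 CUDA cores with DP at half the SP FMA rate: peak DP flops/cycle
# equals the core count, so DP GFlops = cores * clock (GHz), SP = 2x DP.
CORES = 512

def shader_clock_mhz(dp_gflops):
    """Shader clock (MHz) implied by a DP throughput figure."""
    return dp_gflops / CORES * 1000

def sp_tflops(dp_gflops):
    """SP peak is simply double the DP figure at half-rate DP."""
    return 2 * dp_gflops / 1000

for dp in (520, 630):
    print(dp, "GFlops DP ->", round(shader_clock_mhz(dp)), "MHz,",
          sp_tflops(dp), "TFlops SP")
```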

Then comes the price. The previous-gen C1060 launched at $1699, falling to $1199. Compare this with its fellow Geforce model, the GTX 280, released at $649, falling quickly to $500, and finally $300. The price of the next-gen C2070 is a whopping $3999 - more than double the launch price of the previous-generation C1060. Clearly, these are expensive products to make, so how much can Nvidia sell a Geforce version of Fermi for? Even the cheapest Tesla 20 variant, the C2050, costs $2499, nearly 50% more than the GT200-based C1060 flagship. Can Nvidia sell the $3999 Tesla product at $399 as a Geforce product?

So far, we have been comparing GF100 to Cypress, when in reality GF100 should be compared to Hemlock. 4.64 TFlops vs. 1.26 TFlops is not much of a comparison at all - CF limitations and ATI's less efficient shaders aside.

The other, much less common rumour is the possibility of a dual-GPU Fermi product. Considering the "typical" power of Tesla 20 is 190W, this is highly unlikely, at least for a while - not to mention that Geforce products might end up clocked higher. An HD 5870's peak TDP is 188W, lower than GF100's "typical" power! GF100's TDP is expected to be 220W at least, and that is just too hot for a dual-GPU product.

And we have not factored in the fact that GF100 is nowhere to be seen, and is unlikely to be on shelves in quantity for at least 4-5 months. Any further delays, and we will be looking at new products from AMD.

In the end, AMD have a solid product already available that is efficient, economical and scalable. Nvidia have ink on paper - and even that is not looking as promising as we might have hoped. At this moment, we can only hope for "hidden" or "magical" gaming features which might bring about a revolution in how GPUs work. Short of that, all signs point to Fermi being in real trouble.

http://vr-zone.com/articles/fermi-in-trouble-/8054.html?doc=8054
 
They really are in trouble if they can't get the shader clocks up... especially as the high end part was rumoured to come with a 1600MHz shader clock (not 1500).


EDIT: Unless that was 1500 for Tesla and 1600 for the gaming part... don't have enough info to tell.
 
So they are saying it's going to be very similar in performance to the 285gtx, which can do 933 gigaflops single precision? Can't see that, can you? Biggest flop ever for gamers if it's true :(
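For reference, GT200's SP peak can be checked the same way, assuming each of the 240 cores dual-issues an FMA plus a MUL (3 flops per cycle). For what it's worth, 933 GFlops actually corresponds to the GTX 280's 1296 MHz shader clock; the 285's 1476 MHz clock puts it nearer 1063:

```python
# GT200 SP peak: cores * 3 flops/cycle (FMA + MUL dual issue) * clock.
CORES = 240
FLOPS_PER_CYCLE = 3

def sp_gflops(clock_mhz):
    """SP peak GFlops at a given shader clock (MHz)."""
    return CORES * FLOPS_PER_CYCLE * clock_mhz / 1000

print(sp_gflops(1296))  # GTX 280 shader clock
print(sp_gflops(1476))  # GTX 285 shader clock
```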
 
Very interesting read. Time is running out and nVidia need to do something, otherwise ATi are going to be sitting even prettier than they currently are. Roll on the next few months - should be interesting.
 
It's certainly interesting... either nVidia is in serious trouble, or they are trying to throw up a massive smoke screen to keep ATI on their toes...
 
Oh look, I've only been mentioning the 40nm leakage problem, and how Nvidia will be utterly screwed with their higher shader clock speeds, for months.

The theoretical SP performance means fairly little: the 4870 had much higher theoretical performance than the 280GTX, yet they ended up performing fairly similarly in some games, with the 280GTX ahead in quite a lot too. The problem being that ATI's theoretical numbers assume each and every shader in a cluster is executing an instruction; in real life that's a VERY rare occurrence, though only one instruction per cluster being executed is just as rare - somewhere in the middle is what you can expect.

But again, as I've been saying for months, ATi had their insanely sized monolithic R600, and when TSMC screwed up 65nm, they were owned, big time. ATi learned; they drastically changed their architecture: small, efficient, a huge shader count at lower clocks. These are all things that maximise yields and performance no matter how bad the manufacturing is. Nvidia saw the trouble, then ran into trouble themselves at 65nm, 55nm, and now 40nm. It's not their fault TSMC screwed up (by all accounts 32nm is doing incredibly badly already, which may cause AMD to move GPU production to Chartered Semiconductor for a short time before switching to, probably, the New York fab in 2012), but it is COMPLETELY Nvidia's fault for seeing the problem, being able to anticipate the problems, and ploughing forwards with an architecture that at every level is deeply susceptible to manufacturing problems.

SP performance numbers aren't as bad as you think (also not sure quite where they came up with them, tbh). Likewise, if you actually look at DP performance of the 4870 vs the 280GTX - and most importantly that generation of cards' own SP vs DP performance - you'll see that despite being pushed as a decent GPGPU, Nvidia's last generation had a much worse comparative SP-to-DP ratio than ATi had. The biggest increase on the GPGPU side is where their DP now sits relative to their SP. I.e. say their last gen was 500 GFlops SP and 50 GFlops DP; this gen, call it 1000 GFlops SP but 500 GFlops DP. They've just made sure they have a little more core logic so basically every shader is capable of working in DP mode - last time around they couldn't.

But also keep in mind ATi's DP numbers have the same issue: fully filled instruction slots are almost unachievable in real life. Nvidia have a much more realistic real-life performance number in terms of shaders.
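To put a rough number on the utilisation argument above, here is a small sketch for Cypress. The cluster count and clock are the 5870's public specs; the slots-filled values are illustrative guesses, not measurements:

```python
# Cypress SP throughput as a function of how many of the 5 VLIW slots
# actually issue each cycle. The 2.72 TFlops headline assumes all 5.
CLUSTERS = 320        # 5-wide VLIW units on the 5870
CLOCK_GHZ = 0.85      # 850 MHz core clock
FLOPS_PER_SLOT = 2    # one MAD (multiply-add) per slot per cycle

def sp_tflops(slots_filled):
    """SP TFlops given the average number of VLIW slots filled."""
    return CLUSTERS * slots_filled * FLOPS_PER_SLOT * CLOCK_GHZ / 1000

print(sp_tflops(5.0))   # headline peak
print(sp_tflops(3.0))   # a middling fill rate
print(sp_tflops(1.0))   # the (equally rare) worst case
```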

I still say Nvidia are largely screwed. I've said for a long time they will have serious issues with clock speeds, and I suggested that this is a reason we won't see gaming performance numbers until AT LEAST A3 silicon, which they are praying will have higher clock speeds. The GPGPU segment is two things: tiny, and entirely performance dependent. The guys who'll pay £20k for a bunch of these rigs make money off the computers - more performance = more money for them, so a 5% increase could be worth it. 5% won't persuade anyone to buy a new £400 GPU, which is why we're seeing GPGPU numbers now and not gaming benchmarks.

If A3 doesn't bump up speeds, well, you're looking at real problems. As I've said before, you very rarely double the old generation's performance in games: doubling shaders isn't doubling everything onboard, and you have the same CPU/mobo regardless, so double performance is just not very likely. If you have a new GPU that at best would be around double the performance, but at a 20-30% clock reduction, you're looking at an expensive new product, with terrible yields, that simply isn't offering any kind of value for upgrading.

This wouldn't be so bad if the GT200b were economically viable: instead of getting a single Fermi (call it a 380gtx) for £450, you could get two 275gtxs for £300 and have 30% better clock speeds and performance. But they can't sell them anymore.

This is again the reason I **** off TSMC on an almost daily basis. We'd likely see a 5870 at beyond 1GHz stock clocks if the process was better, the same way we should have had a 1GHz R600 at the time the 8800gtx was released.

This is why ATi have done so ridiculously well: they saw the issue - manufacturing - and completely changed their gameplan to reduce the impact TSMC has on their cards. Nvidia really have to learn that lesson, and really damn soon, for everyone's sake, not least theirs.

This is exactly why Nvidia are really considering GlobalFoundries for production, despite the fact it basically means AMD make cash for every discrete gpu sold no matter the brand.

Make no mistake, frankly I think Fermi will be a great GPGPU - because its DP number is real and usable, it will be powerful there. The problem being their entire GPGPU turnover last year was $80ish million, which is basically nothing - it's 2% of their business, not even close to enough to keep them going.

You need to scale down ATi's numbers to account for the difficulty in extracting full performance. In SP that would still probably put ATi ahead, or pretty similar, but scale down their DP the same and they will be a decent amount behind Nvidia.

The question, which I don't know the answer to, is this: though it's very hard to program games to use the full performance of an ATi shader cluster, can a GPGPU program be coded to use the available power much more effectively? If so, Nvidia could easily lose on both fronts.

Also not sure why VRzone are bringing up the size of the cores; the difference in size between Fermi and the 5870 is the same ratio as the 280gtx and 4870. Nothing's changed there - both architectures have "roughly" doubled in size over the previous-gen cards (in transistors), which again is not unexpected.
 
Prices will still fall on the 58** series even if nVidia fail, because at the moment ATi are still the underdogs market share wise (the Steam hardware survey shows ATi at 29% market share vs nVidia at 63%).
It makes more business sense for ATi to concentrate on converting nVidia owners thinking about an upgrade, so they can build their reputation and brand.
Revenue and market share are more important than profit margins - not that they won't be making money hand over fist anyway.

It's only if the GPU battle for the 68** series goes massively in ATi's favour too that we need to worry.
 
So they are saying it's going to be very similar in performance to the 285gtx, which can do 933 gigaflops single precision? Can't see that, can you? Biggest flop ever for gamers if it's true :(
Maybe that's what you call a tera-flop ;)
.
.
.
Can't believe I just wrote that.
 
Conjecture with few facts, really. Let's wait for Fermi to be released.


For example, the price of the Tesla part has nothing to do with the gfx part. They are very different products aimed at different markets to do different things, with different hardware, different support and different marketing strategies. The earlier Tesla was released extremely cheaply to buy into the market; now that Nvidia are known as the market leader in this technology, they can increase the prices.

We simply have no information on the graphics part.
 
Conjecture with few facts, really. Let's wait for Fermi to be released.

Nothing new, a bunch of wannabe engineers throwing darts at a board and delivering their 'factual' :o theories.

These threads are funny most of the time when you just sit back and watch the BS fly around.
 
Conjecture with few facts, really. Let's wait for Fermi to be released.


For example, the price of the Tesla part has nothing to do with the gfx part. They are very different products aimed at different markets to do different things, with different hardware, different support and different marketing strategies. The earlier Tesla was released extremely cheaply to buy into the market; now that Nvidia are known as the market leader in this technology, they can increase the prices.

We simply have no information on the graphics part.

It's not so much conjecture as very educated estimation. Whilst the GeForce part may differ significantly from the Tesla part in terms of shader horsepower, that's unlikely, as every Tesla part so far has had exactly the same shader clock speed as its desktop counterpart. Unless Nvidia are manufacturing two different Fermi chips (which would mean we've probably got quite a wait on our hands for at least one of them), I doubt we'll see much deviation from past trends.
 
Maybe that's what you call a tera-flop ;)
.
.
.
Can't believe I just wrote that.

Oh dear, get your coat! :p

I really can't see it being only slightly better than the 285GTX and significantly worse than the 295GTX. Didn't Jen-Hsun Huang say that the 5870 "Wasn't all that fast"? That must point towards at least a competitive product?
 
Make no mistake, frankly I think Fermi will be a great GPGPU - because its DP number is real and usable, it will be powerful there. The problem being their entire GPGPU turnover last year was $80ish million, which is basically nothing - it's 2% of their business, not even close to enough to keep them going.

True, however I think the promise shown by the move into supercomputing will persuade the lenders to cover the period, as Intel etc. still don't have anything after the joke of Larrabee - which Intel hopes will be quietly forgotten.
 
This explains much about Nvidia's press releases recently. I hope this failure gives them an opportunity to catch a breather and come back with a seriously kick-ass chip for the next product cycle.
 
Anyone reckon Intel might get into the same business as GlobalFoundries? It would make one heck of a competitive environment.
 
Nvidia behind for one generation: total disaster, company in trouble.

ATI cards worse since the Radeon 9700 (until just now, so years and years): no problem.

:p
 