
Nvidia news direct from Jen-Hsun Huang!

Ejizz:

As I have pointed out, performance per mm^2 is only related to the cost of the die, and this is a small part of the retail sales price.
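
As a rough back-of-envelope illustration (all of these numbers are my own assumptions, not figures from anyone here): a 300mm wafer has roughly 70,000 mm^2 of usable area, so a ~330 mm^2 die gives you somewhere around 170 candidates per wafer, and at a mediocre 60% yield perhaps 100 good dies. If the wafer costs in the region of $5,000, that works out to roughly $50 per good die - a fairly small slice of a $350-400 retail card once memory, PCB, power circuitry, cooler and everyone's margins are added on top.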

Performance per watt is strongly related to the specifics of the architecture, and how it operates on a particular process. It is a tricky beast.

Yes it is: software processing is less efficient than dedicated hardware processing. That's a fact you can't get away from, no matter what you do to change your architecture.

As for the rest - I have made the point many times now that Fermi is the first generation of a new design paradigm, whereas Cypress is towards the end of an older design process (read my previous posts for more detail).

Fermi is in a different situation: GPGPU architectures like Fermi rely on software-based multi-function processing, and software solutions are inherently less efficient than fixed-function hardware solutions.
GPGPU architectures have an inherent efficiency disadvantage you can't get away from.
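
To make the fixed-function vs. software point concrete, here's a minimal CUDA sketch of my own (purely illustrative - the image size, coordinates and kernel names are all assumptions, nothing from any actual driver or game): the same bilinear texture sample is computed once by hand in general-purpose kernel code, and once by a single tex2D fetch that the dedicated texture-filtering hardware services.

Code:
// Illustrative sketch: software bilinear filtering vs. a fixed-function texture fetch.
#include <cstdio>
#include <cuda_runtime.h>

// Software path: address math, four loads and the blend all done with
// general-purpose instructions.
__global__ void bilinear_sw(const float* img, int w, int h, float u, float v, float* out)
{
    float x = u * (w - 1), y = v * (h - 1);
    int x0 = (int)x, y0 = (int)y;
    int x1 = min(x0 + 1, w - 1), y1 = min(y0 + 1, h - 1);
    float fx = x - x0, fy = y - y0;
    float top = img[y0 * w + x0] * (1 - fx) + img[y0 * w + x1] * fx;
    float bot = img[y1 * w + x0] * (1 - fx) + img[y1 * w + x1] * fx;
    *out = top * (1 - fy) + bot * fy;
}

// Hardware path: one fetch; the filtering is done by the fixed-function texture units.
__global__ void bilinear_hw(cudaTextureObject_t tex, float u, float v, float* out)
{
    *out = tex2D<float>(tex, u, v);
}

int main()
{
    const int w = 8, h = 8;                      // assumed toy image size
    float host[w * h];
    for (int i = 0; i < w * h; ++i) host[i] = (float)i;

    // Plain device buffer for the software path.
    float *d_img, *d_out;
    cudaMalloc(&d_img, sizeof(host));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_img, host, sizeof(host), cudaMemcpyHostToDevice);

    // CUDA array + texture object so the texture unit can do the filtering.
    cudaChannelFormatDesc cd = cudaCreateChannelDesc<float>();
    cudaArray_t arr;
    cudaMallocArray(&arr, &cd, w, h);
    cudaMemcpy2DToArray(arr, 0, 0, host, w * sizeof(float),
                        w * sizeof(float), h, cudaMemcpyHostToDevice);

    cudaResourceDesc rd = {};
    rd.resType = cudaResourceTypeArray;
    rd.res.array.array = arr;
    cudaTextureDesc td = {};
    td.filterMode = cudaFilterModeLinear;        // let the hardware interpolate
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 1;
    td.addressMode[0] = td.addressMode[1] = cudaAddressModeClamp;
    cudaTextureObject_t tex;
    cudaCreateTextureObject(&tex, &rd, &td, nullptr);

    float r;
    bilinear_sw<<<1, 1>>>(d_img, w, h, 0.4f, 0.6f, d_out);
    cudaMemcpy(&r, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("software bilinear: %f\n", r);

    bilinear_hw<<<1, 1>>>(tex, 0.4f, 0.6f, d_out);
    cudaMemcpy(&r, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("hardware bilinear: %f\n", r);

    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    cudaFree(d_img);
    cudaFree(d_out);
    return 0;
}

Both paths produce roughly the same sample (the hardware path uses texel-centre coordinates and limited-precision filter weights, so the values differ slightly), but the software version spends general-purpose instructions and four separate loads on work the fixed-function unit performs as part of a single fetch - which is the efficiency gap being described.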

Hence my point that Nvidia needs two fundamentally different GPU architectures if it wants to continue down the GPGPU route and still be competitive: with current chips now hitting hard TDP walls, the winning architecture will be the most efficient architecture.
 
Freddie1980 said:
That's no excuse. If Nvidia had done their homework before jumping headlong into 40nm, they would have known about the problems TSMC's 40nm process had printing dies of the size Nvidia had designed.

If you recall, the first 40nm video card was the HD 4770, and it was reported on news sites not long after launch that there were supply issues as a result of poor yields on the new 40nm process at TSMC. AMD were well aware of the issues at TSMC (needless to say, Nvidia would have known about this as well) and responded accordingly with their 5000 series. Again, if it was known that early that 40nm was broken, why didn't Nvidia revise Fermi? It's not as if they didn't have the time: the HD 4770 came out in April 2009, a full year before the GTX 480 and GTX 470.
Nvidia's expectations of the 40nm process were not met by TSMC's technical execution, but this problem has been covered countless times on this forum already.

We don't know whether Nvidia was made privy to the data which AMD had gathered. TSMC notoriously 'save face' as much as they can, so maybe they didn't even admit to the seriousness of the problem.

It's all a moot point really, since this is ancient history now.
They both seem to be dealing with it, and TSMC have been given a kick up the rear by GlobalFoundries.
 
And you know this because? If you have a degree in electronic design, then I will be inclined to believe you. If not, it's merely speculation.
You're spending power and transistors on functions which may not be used / may not be the most efficient implementation of said functions.
This is the benefit of ASICs vs. General Compute.
 
You're spending power and transistors on functions which may not be used / may not be the most efficient implementation of said functions.
This is the benefit of ASICs vs. General Compute.

That doesn't mean that they are inefficient. He doesn't even mention in what way they are inefficient.

I think Nvidia know how to design a GPU. So, unless you have some inside information, I think the engineers who design the GPUs kind of know best.
 
That doesn't mean that they are inefficient. He doesn't even mention in what way they are inefficient.
Well, as a relevant example: Tessellation performance on the current generation of cards.

The AMD cards have a smaller number of ASIC transistors to deal with these operations. In terms of transistors per unit of performance, they do very well.
Nvidia cards have the tessellation functions built into their PolyMorph engine, with one dedicated to each SM as a result.

Good summary here:
http://www.evga.com/forums/tm.aspx?m=224112&mpage=1&tree=true
 
I'd be more interested in knowing how many extra transistors and how much complication are required for all the GPGPU double-precision processing. I would guess a lot.

The architecture is on the right track; rendering systems have been getting more programmable and general-purpose for some time and will continue to do so. Unfortunately I think they've gone too far, too soon in pursuit of the GPGPU market. Rendering will catch up eventually and this sort of architecture will show its strengths, but will it be soon enough?
 
I'd be more interested in knowing how many extra transistors and how much complication are required for all the GPGPU double-precision processing. I would guess a lot.

It would be interesting indeed, considering they are of no use in GeForce product lines and are fused off.
Another plus for purpose-built, efficient architectures over a jack of all trades.
 
The architecture is on the right track; rendering systems have been getting more programmable and general-purpose for some time and will continue to do so. Unfortunately I think they've gone too far, too soon in pursuit of the GPGPU market.

I'd be inclined to agree with this statement (given the benefit of 20/20 hindsight, of course). I think it's likely that without the need to cater to the GPGPU market, Nvidia would have taken the "easier" path and released a scaled-up GT200 to compete with Evergreen, pushing Fermi to late 2010 or 2011. I'm just speculating though - we will probably never know for sure.

I still maintain that the data dissemination and scheduling of the Fermi architecture is a necessary step toward achieving the scalability that will be required to build the next several generations of GPU, and I strongly suspect that we will see a number of "Fermi-esque" features making their way into Northern Islands. Increased programmability will also come at a cost of more transistors, but it does in some ways go hand-in-hand with the increased control logic needed to disseminate an ever increasing number of parallel threads (i.e. more pixels in GPU rendering terms).
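
As a small, purely illustrative CUDA sketch of what "disseminating an ever increasing number of parallel threads" looks like in rendering terms (the framebuffer size, block shape and toy shading maths are all my own assumptions): each pixel gets its own thread, and the hardware scheduler decides how those threads are packed into warps and spread across the chip.

Code:
// Illustrative sketch: one thread per pixel over a 1080p framebuffer.
#include <cuda_runtime.h>

__global__ void shade(float* framebuffer, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    // One thread owns one pixel; warp scheduling and work distribution
    // across the chip happen in hardware underneath this mapping.
    framebuffer[y * width + x] = (float)((x ^ y) & 255) / 255.0f;  // toy shading
}

int main()
{
    const int w = 1920, h = 1080;                 // assumed framebuffer size
    float* fb;
    cudaMalloc(&fb, w * h * sizeof(float));
    dim3 block(16, 16);                           // 256 threads per block
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);
    shade<<<grid, block>>>(fb, w, h);             // ~2 million threads, one per pixel
    cudaDeviceSynchronize();
    cudaFree(fb);
    return 0;
}

The more pixels (threads) in flight, the more of the transistor and power budget goes on the control logic that keeps them all fed, which is the trade-off being discussed above.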
 
LMAO, seems to me like he (JHH) needs some good advice...

To be fair, despite all of Dear Leader JHH's failings over the last few years, it's people like him who have driven the industry forward and delivered consumers fast, inexpensive products and the software to match.
 