
Nvidia news direct from Jen-Hsun Huang!

Ejizz:

As I have pointed out, performance per mm^2 is only related to the cost of the die, and this is a small part of the retail sales price.
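
As a rough back-of-envelope illustration (all of these numbers are my own assumptions, not figures from anyone here): a 300mm wafer has roughly 70,000 mm^2 of usable area, so a ~330 mm^2 die gives you somewhere around 170 candidates per wafer, and at a mediocre 60% yield perhaps 100 good dies. If the wafer costs in the region of $5,000, that works out to roughly $50 per good die - a fairly small slice of a $350-400 retail card once memory, PCB, power circuitry, cooler and everyone's margins are added on top.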

Performance per watt is strongly related to the specifics of the architecture, and how it operates on a particular process. It is a tricky beast.

Yes it is: software processing is less efficient than dedicated hardware processing. That's a fact you can't get away from, no matter what you do to change your architecture.

As for the rest - I have made the point many times now that Fermi is the first generation of a new design paradigm, whereas Cypress is towards the end of an older design process (read my previous posts for more detail).

Fermi is in a different situation: GPGPU architectures like Fermi rely on software-based multi-function processing, and software solutions are inherently less efficient than fixed-function hardware solutions.
GPGPU architectures have an inherent efficiency disadvantage you can't get away from.
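
To make the fixed-function vs. software point concrete, here's a minimal CUDA sketch of my own (purely illustrative - the image size, coordinates and kernel names are all assumptions, nothing from any actual driver or game): the same bilinear texture sample is computed once by hand in general-purpose kernel code, and once by a single tex2D fetch that the dedicated texture-filtering hardware services.

Code:
// Illustrative sketch: software bilinear filtering vs. a fixed-function texture fetch.
#include <cstdio>
#include <cuda_runtime.h>

// Software path: address math, four loads and the blend all done with
// general-purpose instructions.
__global__ void bilinear_sw(const float* img, int w, int h, float u, float v, float* out)
{
    float x = u * (w - 1), y = v * (h - 1);
    int x0 = (int)x, y0 = (int)y;
    int x1 = min(x0 + 1, w - 1), y1 = min(y0 + 1, h - 1);
    float fx = x - x0, fy = y - y0;
    float top = img[y0 * w + x0] * (1 - fx) + img[y0 * w + x1] * fx;
    float bot = img[y1 * w + x0] * (1 - fx) + img[y1 * w + x1] * fx;
    *out = top * (1 - fy) + bot * fy;
}

// Hardware path: one fetch; the filtering is done by the fixed-function texture units.
__global__ void bilinear_hw(cudaTextureObject_t tex, float u, float v, float* out)
{
    *out = tex2D<float>(tex, u, v);
}

int main()
{
    const int w = 8, h = 8;                      // assumed toy image size
    float host[w * h];
    for (int i = 0; i < w * h; ++i) host[i] = (float)i;

    // Plain device buffer for the software path.
    float *d_img, *d_out;
    cudaMalloc(&d_img, sizeof(host));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_img, host, sizeof(host), cudaMemcpyHostToDevice);

    // CUDA array + texture object so the texture unit can do the filtering.
    cudaChannelFormatDesc cd = cudaCreateChannelDesc<float>();
    cudaArray_t arr;
    cudaMallocArray(&arr, &cd, w, h);
    cudaMemcpy2DToArray(arr, 0, 0, host, w * sizeof(float),
                        w * sizeof(float), h, cudaMemcpyHostToDevice);

    cudaResourceDesc rd = {};
    rd.resType = cudaResourceTypeArray;
    rd.res.array.array = arr;
    cudaTextureDesc td = {};
    td.filterMode = cudaFilterModeLinear;        // let the hardware interpolate
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 1;
    td.addressMode[0] = td.addressMode[1] = cudaAddressModeClamp;
    cudaTextureObject_t tex;
    cudaCreateTextureObject(&tex, &rd, &td, nullptr);

    float r;
    bilinear_sw<<<1, 1>>>(d_img, w, h, 0.4f, 0.6f, d_out);
    cudaMemcpy(&r, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("software bilinear: %f\n", r);

    bilinear_hw<<<1, 1>>>(tex, 0.4f, 0.6f, d_out);
    cudaMemcpy(&r, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("hardware bilinear: %f\n", r);

    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    cudaFree(d_img);
    cudaFree(d_out);
    return 0;
}

Both paths produce roughly the same sample (the hardware path uses texel-centre coordinates and limited-precision filter weights, so the values differ slightly), but the software version spends general-purpose instructions and four separate loads on work the fixed-function unit performs as part of a single fetch - which is the efficiency gap being described.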

Hence my point that Nvidia needs two fundamentally different GPU architectures if it wants to continue down the GPGPU route and still be competitive: with current chips now hitting hard TDP walls, the winning architecture will be the most efficient architecture.
 
Freddie1980 said:
That's no excuse. If Nvidia had done their homework before jumping headlong into 40nm, they would have known about the problems TSMC's 40nm process had printing dies of the size Nvidia had designed.

If you recall, the first 40nm video card was the HD 4770, and it was reported on news sites not long after launch that there were supply issues as a result of poor yields on the new 40nm process at TSMC. AMD were well aware of the issues at TSMC (needless to say, Nvidia would have known about this as well) and responded accordingly with their 5000 series. Again, if it was known that early that 40nm was broken, why didn't Nvidia revise Fermi? It's not as if they didn't have the time: the HD 4770 came out in April 2009, a full year before the GTX 480 and GTX 470.
Nvidia's expectations of the 40nm process were not met by TSMC's technical execution, but this problem has been covered countless times on this forum already.

We don't know whether Nvidia was made privy to the data which AMD had gathered. TSMC notoriously 'save face' as much as they can, so maybe they didn't even admit to the seriousness of the problem.

It's all a moot point really, since this is ancient history now.
They both seem to be dealing with it, and TSMC have been given a kick up the rear by GlobalFoundries.
 
And you know this because? If you have a degree in electronic design, then I will be inclined to believe you. If not, it's merely speculation.
You're spending power and transistors on functions which may not be used / may not be the most efficient implementation of said functions.
This is the benefit of ASICs vs. General Compute.
 
You're spending power and transistors on functions which may not be used / may not be the most efficient implementation of said functions.
This is the benefit of ASICs vs. General Compute.

That doesn't mean that they are inefficient. He doesn't even mention in what way they are inefficient.

I think Nvidia know how to design a GPU. So, unless you have some inside information, I think the engineers who design the GPUs kind of know best.
 
That doesn't mean that they are inefficient. He doesn't even mention in what way they are inefficient.
Well, as a relevant example: Tessellation performance on the current generation of cards.

The AMD cards have a smaller number of ASIC transistors to deal with these operations. In terms of transistors per unit of performance, they do very well.
Nvidia cards have the tessellation functions built into their PolyMorph engine, with one dedicated to each SM as a result.

Good summary here:
http://www.evga.com/forums/tm.aspx?m=224112&mpage=1&tree=true
 
I'd be more interested in knowing how many extra transistors and how much complication are required for all the GPGPU double-precision processing. I would guess a lot.

The architecture is on the right track; rendering systems have been getting more programmable and general-purpose for some time and will continue to do so. Unfortunately I think they've gone too far, too soon in pursuit of the GPGPU market. Rendering will catch up eventually and this sort of architecture will show its strengths, but will it be soon enough?
 
I'd be more interested in knowing how many extra transistors and how much complication are required for all the GPGPU double-precision processing. I would guess a lot.

It would be interesting indeed, considering they are of no use in GeForce product lines and are fused off.
Another plus for purpose-built, efficient architectures over a jack of all trades.
 
The architecture is on the right track; rendering systems have been getting more programmable and general-purpose for some time and will continue to do so. Unfortunately I think they've gone too far, too soon in pursuit of the GPGPU market.

I'd be inclined to agree with this statement (given the benefit of 20/20 hindsight, of course). I think it's likely that without the need to cater to the GPGPU market, Nvidia would have taken the "easier" path and released a scaled-up GT200 to compete with Evergreen, pushing Fermi to late 2010 or 2011. I'm just speculating though - we will probably never know for sure.

I still maintain that the data dissemination and scheduling of the Fermi architecture is a necessary step toward achieving the scalability that will be required to build the next several generations of GPU, and I strongly suspect that we will see a number of "Fermi-esque" features making their way into Northern Islands. Increased programmability will also come at a cost of more transistors, but it does in some ways go hand-in-hand with the increased control logic needed to disseminate an ever increasing number of parallel threads (i.e. more pixels in GPU rendering terms).
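
As a small, purely illustrative CUDA sketch of what "disseminating an ever increasing number of parallel threads" looks like in rendering terms (the framebuffer size, block shape and toy shading maths are all my own assumptions): each pixel gets its own thread, and the hardware scheduler decides how those threads are packed into warps and spread across the chip.

Code:
// Illustrative sketch: one thread per pixel over a 1080p framebuffer.
#include <cuda_runtime.h>

__global__ void shade(float* framebuffer, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    // One thread owns one pixel; warp scheduling and work distribution
    // across the chip happen in hardware underneath this mapping.
    framebuffer[y * width + x] = (float)((x ^ y) & 255) / 255.0f;  // toy shading
}

int main()
{
    const int w = 1920, h = 1080;                 // assumed framebuffer size
    float* fb;
    cudaMalloc(&fb, w * h * sizeof(float));
    dim3 block(16, 16);                           // 256 threads per block
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);
    shade<<<grid, block>>>(fb, w, h);             // ~2 million threads, one per pixel
    cudaDeviceSynchronize();
    cudaFree(fb);
    return 0;
}

The more pixels (threads) in flight, the more of the transistor and power budget goes on the control logic that keeps them all fed, which is the trade-off being discussed above.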
 
LMAO, seems to me like he (JHH) needs some good advice...

To be fair, despite all of Dear Leader JHH's failings over the last few years, it's people like him who have driven the industry forward and delivered consumers fast, inexpensive products and the software to match.
 