Article here.
Good news
From the double-precision power-efficiency claims, I'm guessing Kepler will have a larger ratio of DP to SP cores. As far as single precision goes, it would surprise me if they have improved by any more than a factor of 2x over the previous generation.
Would you mind saying the same thing in layman's terms please?
Well, from the process change (40nm to 28nm) I would expect less than 2x the overall power efficiency of 40nm Fermi. Design changes and tweaks can account for a little extra improvement, but 3x or more seems a bit unrealistic.
Nvidia have stated that the double precision power efficiency is 3x better than Fermi, not overall power efficiency, so I suspect that the number of double-precision-capable cores will have been increased, at least on the HPC version.
The HPC (Tesla) version of Fermi can perform half as many double precision computations as it can single precision, while the retail Fermi (GTX480 / 580 etc) can perform only 1/8th as many (double precision is not important for gaming). If I have only a small number of double precision units relative to single precision, then my double-precision power efficiency will be low. The most straightforward way to improve DP power efficiency is to increase the proportion of DP units relative to SP.
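A minimal sketch of that argument in Python. It assumes board power stays the same whatever the DP:SP ratio is (which may not hold in practice), and the throughput and wattage figures are made up purely to keep the arithmetic simple; only the 1/8 and 1/2 ratios come from the post above.

```python
from fractions import Fraction


def dp_per_watt(sp_gflops, dp_to_sp_ratio, board_watts):
    """DP GFLOPS per watt, assuming power does not change with the DP:SP ratio."""
    return sp_gflops * dp_to_sp_ratio / board_watts


# Hypothetical figures, not real specs for any card.
SP_GFLOPS = 1000
BOARD_WATTS = 250

retail = dp_per_watt(SP_GFLOPS, Fraction(1, 8), BOARD_WATTS)      # retail Fermi ratio
tesla = dp_per_watt(SP_GFLOPS, Fraction(1, 2), BOARD_WATTS)       # Tesla Fermi ratio
one_to_one = dp_per_watt(SP_GFLOPS, Fraction(1, 1), BOARD_WATTS)  # speculated 1:1 ratio

print(tesla / retail)      # relative DP/W gain from 1/8 to 1/2
print(one_to_one / tesla)  # relative DP/W gain from 1/2 to 1:1
```

Under those assumptions the ratio change alone gives 4x going from 1/8 to 1/2, and a further 2x going from 1/2 to 1:1, before any process or design gains are counted.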
Anyway, from these claims I'm expecting a 1:1 ratio between single and double precision, at least in the HPC version of the chip. I don't know whether we will get a cut-down version for the GTX680; I imagine it will depend on how they have adjusted the architecture to handle double precision. Certainly if it allows the retail unit to run at higher clockspeeds and/or use less power, they will go for it.
A 1:1 would give a lot more than a 3x increase in DP/W.
No one on earth cares about SP/W.
Also remember, as I said, they were stating 4x the performance/W 6 months ago. It's already down to 3x; by launch, it won't even be surprising if it's lowered again.
[Likely] paper launch does not mean 'Kepler by the end of the year'. OP was a bit optimistic with the thread title methinks.
What's GT610?
Not necessarily... Remember, by using the DP rather than SP shaders you are not running the GPU at full capacity. Power draw is not as high at "100% DP load" as it is with 100% SP.
Sure they do - Nvidia and AMD
I would love to see a 3x improvement in performance-per-watt for single precision, but I'm a little skeptical as to whether it's possible. Transistor power draw does not tend to scale as well as transistor packing-density, so I am not expecting a full 2x improvement simply from the manufacturing process. Anyway, I guess we'll see.
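To put a rough number on the process part, here is a back-of-envelope sketch. It treats the node names as literal feature sizes and assumes density scales with the square of the shrink, which is an idealisation, and it assumes process and design gains simply multiply. The helper name is made up for illustration.

```python
def ideal_density_gain(old_nm, new_nm):
    """Idealised transistor density gain from a linear shrink (area scales with the square)."""
    return (old_nm / new_nm) ** 2


density_gain = ideal_density_gain(40, 28)

# If perf/W improved by the full density gain (the optimistic case doubted above),
# this is the extra factor design changes would still have to supply to reach 3x.
design_factor_for_3x = 3 / density_gain

print(round(density_gain, 2))
print(round(design_factor_for_3x, 2))
```

So even in the best case the shrink gives only about 2x, and a 3x perf/W claim would need roughly another 1.5x from the design itself; if power per transistor scales worse than density, the design has to make up even more.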
The first part is irrelevant: that was as true for a 280gtx as for a potential 680gtx. If DP doesn't use the whole core, then it won't on new or old architectures.
Fermi pairs two 32-bit shaders to complete a single 64-bit operation, so it still uses ALL the shaders, and most of the rest of the core as well, same as gaming.
As for whether either cares about SP power efficiency: no, neither does.
100W or so of the top-end Fermis' power draw is supposed to be leakage, and AMD won't be that far behind. We'll have to see if the HP HKMG process makes a difference, but quite likely not: 28nm leakage should be higher than 40nm, so HKMG is a tool to fight the increase in leakage, much like the ones most processes get at most new nodes.
As a general rule of thumb, the smaller you make the manufacturing process the more difficult it is to prevent current leakage, since you need greater relative precision in manufacturing. I see no reason (yet) to convince me that will change with 28nm.