Apologies for the multiple posts, but I have things to do this arvo and wanted to point out a couple of things that some may find interesting before I forget.
OK.
Let's look at high-end problems and why mid-range cards seem to have no such problems.
Firstly, if you cast your mind back to when the LGA 1156 CPUs launched, you will remember that they were highly overclockable. So overclockable, in fact, that Intel did themselves damage.
At that time nearly all games only used two cores. That meant that if you bought a Clarkdale i3 (dual core with HT) and overclocked it to the obligatory 4GHz, it would perform bang on equal to an i5 that cost double.
There were many reasons for this. At a low level the i3 was worse in almost every way: it had fewer cores and fewer features. However, none of that mattered. Because it only had two cores it ran significantly cooler on the die, and because of that you could overclock the crap out of it. And with most games at that time only supporting dual cores and only really giving a damn about raw clock speed, the i3 was a complete success.
OK, now let's look at Sandy Bridge. Where are the unlocked i3s? Where are the unlocked or overclockable Pentium chips? They don't exist.
But why? Well, put simply, I would imagine Intel have realised the error of their ways. They do not want to put out a CPU that costs £80, can do 5GHz and makes their i5s that cost double look silly.
What I mean is, lower-end parts usually need less voltage. Also, because there is less going on on the core itself they run far cooler, which means one thing: clock speeds can be raised enormously.
The LGA 2011 chips do overclock very well. However, power consumption is diabolical, as is heat. I read a group test of coolers for the 2011 chips and it said that even a mighty NH-D14 cannot tame 2011.
The reason is simple. There is an awful lot going on on the die itself.
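To put some very rough numbers on that, here is a quick sketch. It assumes the usual rule of thumb that dynamic power scales with switched capacitance x voltage squared x frequency. The voltages, clocks and the "twice as much going on" figure are all made up for illustration and are not measurements of any real chip.

```python
def dynamic_power(switched_capacitance, voltage, frequency_ghz):
    """Relative dynamic power in arbitrary units: P ~ C * V^2 * f."""
    return switched_capacitance * voltage ** 2 * frequency_ghz


# Hypothetical chips: the "big" die has twice as much switching going on
# as the "small" one. Every number here is an illustrative assumption.
small_stock = dynamic_power(1.0, 1.20, 3.0)
big_stock = dynamic_power(2.0, 1.20, 3.0)

# Same overclock (more voltage, more clock) applied to both.
small_oc = dynamic_power(1.0, 1.35, 4.0)
big_oc = dynamic_power(2.0, 1.35, 4.0)

for name, power in [
    ("small @ stock", small_stock),
    ("big   @ stock", big_stock),
    ("small @ OC", small_oc),
    ("big   @ OC", big_oc),
]:
    print(f"{name}: {power:.2f}")
```

With these made-up figures the small chip, even overclocked, still burns less than the big chip at stock, while the big chip overclocked burns double again. That is the whole point: less on the die leaves far more headroom before heat and power get out of hand.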
Now, using that information let's look at Kepler vs Tahiti.
Kepler is, in pretty much every last way, lower spec than Tahiti. It has less memory bandwidth, a smaller die with far less on it, and hardly anything in the way of DirectCompute compared to the 7970. It also needs less power, fewer components and less metal for the cooler.
However, like the CPUs above, because it has less going on on the die itself it is tremendously overclockable. It comes with a clock speed out of the box of over 1GHz.
Now if we cast our minds back we can see that the card, in many, many pics, was clocked at just over 700MHz. However, due to the card being a cool customer and not needing more power (due to having less aboard), those clock speeds were massively increasable. To that end Nvidia simply clocked the balls off of it, and it comes as no surprise that it performs amazingly well.
The Fermi 480 and 470 were too hot, too loud and used too much power at launch. Compared to the 5870 and 5850 they were very underwhelming. Sure, they were just about faster. However, for everything they had bolted onto them they should have been a million miles better than they were. So what went wrong? Nvidia invested a lot of money, bolted on everything and the kitchen sink and then started thinking it could easily beat the 5970. It didn't; it was miles off.
Fermi was a disappointment IMO because of everything it promised on paper. Yet when it finally launched it delivered hardly anything. Too much heat, too much stuff bolted on. Due to that, clock speeds had to be lowered and performance suffered.
And then what happened? The 460. Cut down that die, cut back the crap and what are you left with? An enormously overclockable little card whose performance belies its price. Due to less crap being slapped on the die it could run at far higher frequencies. And CUDA and all of that other stuff Nvidia bolt on there does not help in gaming, so the 460 was a gamer's dream.
In terms of performance relative to "crap bolted on", the 7850 and 7870 are streets ahead of the 7970. They deliver performance that belies their spec. The reason for that is simple: less crap bolted on, less heat, higher clocks.