So let's see if I understood.
The transistors make up the GPU and provide the power. The nm figure, e.g. 40 or 28, is the size of the transistor. The transistors can only take up a certain physical area. Therefore with smaller transistors you can fit more into that physical area and therefore have more power.
If that's correct, then why don't we just have bigger cards with more power?
The main reason for bigger cards with more power not being an option is simply manufacturing.
Nvidia's top-end Fermi cards have a 529mm2 die, and wafers are 300mm diameter circles, so that's the manufacturing limit. Now, if you made a core stupid big that covered almost the entire wafer, it WOULD fail: you always get at least a few defects on a wafer, sometimes a lot more. Essentially, the smaller the core, the smaller the part of the wafer affected by each defect. In terms of the original Fermi wafers, you get around 100 Fermi cores on a single 300mm wafer, and because there were so many defects and the cores were so big, there were essentially no fully working Fermis on the entire wafer.
Almost a year later, after several multi-million-dollar respins, they've got working cores, though to be honest I'm not sure how good yields are.
Now if you make much smaller cores, you get say 150 cores per wafer AND more of them work. If you paid per good core this wouldn't be an issue, but you pay TSMC for each wafer you have made, however many cores on it turn out to work.
Essentially Fermi is bordering on the absolute limit of realistic die size on 300mm wafers. AMD were at a much safer 340mm2 or so (cheaper, higher yields, lower power, higher profits), and 380mm2 for the newer 6970s.
Basically there isn't a chance in hell you could make anything bigger than Fermi at all, and as it is, profits are low, production is low and yields aren't great. AMD have it about spot on in terms of size vs yield vs cost per core. If they made a core twice as big with twice the power, you'd go from 150 cores per wafer with 80% yields at $5k a wafer = $42 a core, to something like 75 cores per wafer at 20% yields (if lucky) at the same $5k a wafer = $333 a core.
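To make the arithmetic above explicit, here's a quick Python sketch. The function name `cost_per_good_core` is just something made up for illustration; the wafer cost, core counts and yields are the rough figures from this post, not official TSMC or AMD numbers.

```python
def cost_per_good_core(wafer_cost, cores_per_wafer, yield_fraction):
    """Wafer cost spread over the cores that actually work."""
    good_cores = cores_per_wafer * yield_fraction
    return wafer_cost / good_cores

# Rough figures from the post: $5k per wafer in both cases.
small_core = cost_per_good_core(5000, 150, 0.80)  # 120 good cores per wafer
big_core = cost_per_good_core(5000, 75, 0.20)     # 15 good cores per wafer

print(f"small core: ${small_core:.0f} each")
print(f"big core:   ${big_core:.0f} each")
print(f"the big core costs {big_core / small_core:.0f}x as much")
```

So doubling the core size in this example makes each working core about 8x as expensive, not 2x.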
That's why two small cores are MASSIVELY cheaper than one epically sized core. The relationship isn't linear: a single 380mm2 core isn't significantly more expensive than 2x 190mm2 cores, but it's an exponential curve, and above a certain size yields go to crap and cost goes up exponentially.
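If you want to see the shape of that curve, here's a rough sketch using the textbook Poisson yield model, where yield = exp(-die area x defect density). That model is an assumption swapped in for illustration, not something TSMC publishes. The defect density of 0.4 per cm2 is a made-up number, cores per wafer is just wafer area divided by die area (ignoring edge loss), and the $5k wafer cost is the figure used above.

```python
import math

WAFER_DIAMETER_MM = 300
WAFER_COST = 5000        # dollars per wafer, figure from the post
DEFECTS_PER_CM2 = 0.4    # hypothetical defect density, for illustration only

def poisson_yield(die_area_mm2, defects_per_cm2=DEFECTS_PER_CM2):
    """Fraction of dies with zero defects under a simple Poisson model."""
    return math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)

def cost_per_good_die(die_area_mm2):
    """Wafer cost divided by the expected number of working dies."""
    wafer_area = math.pi * (WAFER_DIAMETER_MM / 2) ** 2
    dies = wafer_area / die_area_mm2  # crude: ignores partial dies at the edge
    return WAFER_COST / (dies * poisson_yield(die_area_mm2))

def big_vs_two_small(die_area_mm2):
    """Cost of one big die relative to two dies of half the area."""
    return cost_per_good_die(die_area_mm2) / (2 * cost_per_good_die(die_area_mm2 / 2))

for area in (190, 380, 529, 760):
    print(f"{area}mm2: {poisson_yield(area):.0%} yield, "
          f"{big_vs_two_small(area):.2f}x the cost of two half-size dies")
```

Under these assumptions the penalty for going big keeps growing as the die grows, which is the non-linear behaviour described above. The real numbers depend entirely on the actual defect density.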
For now CrossFire/SLI is vastly better than a stupidly big core. 450mm wafers in the future will likely change the realistic top core size, but we're quite a few years from that at the moment.
28nm will bring WAY more than 20%, but I'm not convinced this time around that it will bring the circa 80% performance increases, though it's very hard to know because AMD's architecture is going through such a massive change at the same time.