It's not that simple; they use completely different architectures. Just as you can't directly compare AMD to Intel, I'd say it's even more the case with GPUs.
Are you really saying it's not explainable or do you not know the reason?
Also, does this mean AMD can make more profit per GPU, considering they use less silicon than Nvidia?
Could AMD have a potentially stronger GPU if they used larger die sizes or are they already limited in other ways?
Yes is the answer to those questions.
AMD could build a bigger, faster GPU than they currently have, but it might not be as profitable. The larger the GPU, the greater the risk of design problems and serious yield issues. It's just a different mindset: Nvidia are still designing the fastest GPU they can, which has backfired for the last few gens, while AMD are designing GPUs for specific price points.
Realistically, we would need engineers from both companies to explain it.
edit: http://www.anandtech.com/show/2679/1
Sorta deals with this issue.
Nvidia have for a long time put a lot of transistors towards general-purpose computing (GPGPU), whereas ATI (and now AMD) have always made more of a pure gaming card.
It's relatively simple really. Right now most of the decent fabs around the world are running 300mm wafers, which are 300mm-diameter circles of silicon. Past a certain die size you get so few chips per wafer that the faults in the silicon start wiping out a serious fraction of them. If you divide that 300mm circle into 300 chips and there are 15 faults in the silicon (15 non-working transistors, say), then the worst case is that you lose 15 chips; the best case is that some of those faults land in the same chip and you lose maybe half of that, though the reality would be closer to 15 lost chips.
Now take the same wafer and divide it into 30 chips. You still have 15 faults, so the worst case is that half the chips don't work; it's essentially unworkable. Wafers have a fixed cost, which is usually significantly higher when a process is new, which is also when yields are at their worst.
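To put rough numbers on that, here's a minimal sketch of the same defect arithmetic: D faults land uniformly at random on a wafer cut into N equal dies, and any die containing a fault is scrapped. The numbers are just the ones from the example above, not real fab data.

```python
# A minimal sketch of the defect arithmetic above: defects land uniformly at
# random on a wafer cut into equal dies, and any die with a defect is scrapped.
def expected_good_dies(num_dies: int, num_defects: int) -> float:
    # Probability that a given die dodges every defect.
    p_clean = ((num_dies - 1) / num_dies) ** num_defects
    return num_dies * p_clean

for n in (300, 30):   # many small dies vs. few big dies, 15 faults either way
    good = expected_good_dies(n, 15)
    print(f"{n} dies, 15 defects -> ~{good:.0f} good dies ({good / n:.0%} yield)")
```

With 300 small dies you lose roughly 5% of the wafer; with 30 big dies the same 15 faults wipe out around 40% of it.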
Basically you have to decide just how big you want to go and how close to the point where it simply doesn't work. The GTX 480 missed that point, quite badly. With increased yields and lower wafer costs that sensible yield line moved a bit, not loads, but enough to go from a 530mm2 die that didn't work at any yield to one that did.
Thing is, Nvidia have been skirting along that line for years, and it's caused them trouble before. It also cost AMD years ago with the 2900 XT, which was also huge (though that was mostly down to being designed for 65nm, which was way late, so it had to be moved back to 80nm).
Can AMD go bigger? Yes, there's literally no reason they couldn't make a chip at the same die size as Nvidia's. The question is whether it would be worth it, and probably not. If AMD made a 530mm2 Cayman-architecture chip it WOULD blow the GTX 580 away, but it would also likely prove completely unprofitable or even make a loss.
Nvidia don't make much cash on their high end, but this is offset by dominating the HUGELY higher-margin professional graphics sector. If AMD were competitive there, any loss in profit on a larger core would be offset by having easily the fastest card for the professional sector.
Expect that to change over the next couple of years would be my guess. Will AMD move up to 500mm2 dies anytime soon? Very unlikely. "Big" cores will likely have more and more trouble with almost every new process, though sometimes you'll get a poor process that cheaped out on the "new tech" like HKMG, and then the move down a node forces the use of SOI/HKMG/something else and you get an unusually big jump in process quality. Averaged out, I'd think 500mm2 dies on 22nm and below would be pretty much stupid. I wouldn't be surprised to see AMD's top end get a little bigger while Nvidia's comes down in size quite a bit as well, maybe converging around 400mm2 as a sensible, somewhat safe size.
The biggest problem there is that Nvidia's is a seriously less efficient architecture, so to reduce die size and compete with AMD they need a pretty drastic architectural change, and they are locked into a Fermi-type architecture for at least one more gen, probably two.
Essentially it boils down to two things: what's a "safe" size to make with good yields, and whether you can offset losses/lack of profit with profits in another segment.
If AMD get competitive in professional graphics, Nvidia are in trouble, because they'll make a lot less profit there while continuing to make (comparatively) sucky profits in desktop/mobile products. Likewise, if Nvidia drastically increase efficiency to get "in line" with AMD efficiency-wise but keep making 30% bigger cores, AMD are boned.
Theres no "right" way to do things, consumers win out no matter how badly or well a company is doing, its Nvidia making crap profits on 580gtx's, not us paying £600 for them and AMD aren't charging £400 for a 6970, but charging a sensible profit and giving us cheap cards, we win, AMD win one segment, Nvidia win another segment, at least until something gives.
I'm sure lots of technical issues with both brands have been adequately explained without the direct aid of an Nvidia/AMD employee.
Thanks, I'll have a look at that article now.
Actually, it is only the GF100- and GF110-based cards which have the enhanced GPGPU functionality. The cards below them are more orientated towards gaming and have reduced GPGPU abilities.
Likewise, the GPUs in the HD6900 series have enhanced GPGPU functionality when compared to the HD6800 series.
Nope, Cat-the-Fifth. You can take it on faith that what mmj_uk stated above is the basic reason. Or you can explore the architectures very thoroughly, but you'll also have to get into asymptotic analysis, processor architecture and compiler design to make proper sense of exactly why this is.
Let me try to simplify it:
The fundamental architecture of NVIDIA cards is just plain better suited to general-purpose computation. When you say "x is better suited towards gaming" you are comparing it to other NVIDIA cards, which is true. But even the NVIDIA GPUs not optimised for gpGPU are still better at gpGPU than AMD's.
E.g. AMD's stream processors (a VLIW architecture) are great for graphics: a given AMD card may combine 5 ALUs into each processor, and graphics processing is largely linear algebra. However, in a more general computational problem it is very difficult, even impossible, to write code that packs 5 useful arithmetic operations into every instruction word.
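To illustrate the packing problem (this is just a toy scheduler, nothing to do with AMD's actual shader compiler): operations can only share a VLIW bundle if they don't depend on each other's results, so a serial dependency chain, which is common in general-purpose code, leaves most of the 5 slots empty.

```python
# Toy VLIW bundle packer -- purely illustrative, not AMD's real compiler.
# Ops can share a 5-wide bundle only if their inputs come from earlier bundles.
def pack_vliw(ops, width=5):
    """Greedily pack (name, dependencies) pairs into VLIW bundles."""
    bundles, done = [], set()
    remaining = list(ops)
    while remaining:
        bundle = []
        for op, deps in list(remaining):
            if len(bundle) < width and deps <= done:
                bundle.append(op)
                remaining.remove((op, deps))
        if not bundle:          # unsatisfiable dependencies -- bail out
            break
        done |= set(bundle)
        bundles.append(bundle)
    return bundles

# Graphics-style work: five independent per-component operations.
graphics = [(f"mul{i}", set()) for i in range(5)]
# General-purpose work: a serial chain where each op needs the previous result.
chain = [("op0", set()), ("op1", {"op0"}), ("op2", {"op1"}),
         ("op3", {"op2"}), ("op4", {"op3"})]

for label, ops in (("independent ops", graphics), ("dependency chain", chain)):
    bundles = pack_vliw(ops)
    used = sum(len(b) for b in bundles)
    print(f"{label}: {len(bundles)} bundle(s), "
          f"{used / (len(bundles) * 5):.0%} slot utilisation")
```

The independent, graphics-style work fills all 5 slots in one bundle; the dependency chain manages 1 in 5, i.e. 20% utilisation.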
NVIDIA's CUDA architecture, on the other hand, has a sophisticated memory/cache organisation that makes general-purpose code easier to write. A good way to compare this is to think of the CPU: the CPU is optimised for everything BUT floating-point operations, while AMD's cards are optimised purely for FLOPS of the kind seen in graphics problems all the time. CUDA fits somewhere in between, which makes it about as good for general-purpose computation as it is for graphics.
For example, the GTX 580 has 512 CUDA cores. In comparison, the 5870 has 320 stream processors with 5 ALUs each (the so-called VLIW5 architecture).
Compare this to the 6970: AMD literally went down a notch and used what they call VLIW4 for their GPU machine language, which allows 4 operations per very long instruction word. This was done because it is difficult for any app (EVEN graphics apps) to take full advantage of long instruction words.
CUDA cores are 'meatier': while they are great at graphics, they are also great at general-purpose computation. This is the real reason why the theoretical FLOPS of AMD cards are so much higher, yet largely meaningless: hardly any real-world algorithm can make use of even a third of that FLOP performance, while many algorithms can use well over 50% of the CUDA architecture's FLOP performance.
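For a concrete sense of the gap, the usual back-of-envelope peak figure is ALUs x 2 ops per clock (multiply-add) x clock speed, using the commonly quoted reference shader clocks, so treat the numbers as approximate.

```python
# Peak single-precision throughput: ALUs x 2 ops/clock (multiply-add) x clock.
# Clocks are the commonly quoted reference shader clocks; treat as approximate.
def peak_tflops(alus: int, shader_clock_ghz: float) -> float:
    return alus * 2 * shader_clock_ghz / 1000.0

print(f"HD 5870 (320 x 5 ALUs @ 0.850 GHz): {peak_tflops(320 * 5, 0.850):.2f} TFLOPS")
print(f"GTX 580 (512 cores   @ 1.544 GHz): {peak_tflops(512, 1.544):.2f} TFLOPS")
```

On paper the 5870 is well ahead (~2.7 vs ~1.6 TFLOPS), but that peak assumes every VLIW slot is doing useful work every cycle, which general-purpose code rarely manages.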
So, as you can see, the real reason NVIDIA GPUs are larger than AMD GPUs has less to do with microelectronics (which happens to be my area) and more to do with the architecture itself (which is a different branch of electrical engineering, more correctly called "computer engineering")... I am pretty confident that if you look into it enough, or speak to someone who specialises in computer engineering rather than micro or power or whatever, you will get the same answer.
I am quite aware of the differences in architecture, as the pros and cons of each company's GPUs have been discussed ad nauseam for years.
However, I was actually comparing the GPUs within the same company's product range (which was quite obvious). So, TBH, your explanation was not really required in the first place.
GPUs like the GF104 and GF114 actually have reduced GPGPU functionality compared to the higher-end GF100 and GF110. They do DP calculations at a much lower rate than the higher-end GPUs and have much less cache too. The shader clusters have also been rearranged to make the lower-end Nvidia GPUs more efficient for gaming, and they are much smaller as a result.
I assumed, because of the thread title, that you were using that to say the refreshed Fermi isn't generally good at gpGPU.
At any rate, I have heard something along those lines: that the newer GPUs are designed to be less capable at gpGPU. I haven't looked into it closely, but you're probably right there.
Look at a GTX 560 Ti and a GTX 580, for example. The GTX 580 is around 30% faster than a GTX 560 Ti. However, the GF110 is around 530mm2 whereas the GF114 is around 330mm2, and the GF110 has around 3 to 3.2 billion transistors whereas the GF114 has under 2 billion.
A few extra things to note, though. Performance doesn't necessarily increase linearly with the number of transistors. Furthermore, the GTX 560 Ti is clocked considerably higher than a GTX 580. Then there is algorithmic design: it is often harder to design algorithms to fully exploit the high end, as they are generally designed for the mid-range and then scaled up and down. These other factors will also make a difference in performance vs chip size.
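Taking the quoted figures at face value (with all the caveats just mentioned about clocks and scaling), the ratios come out roughly like this:

```python
# Rough perf-per-area and perf-per-transistor ratios from the figures quoted
# above (GTX 560 Ti normalised to 1.0; all values approximate).
cards = {
    "GTX 560 Ti (GF114)": {"perf": 1.0, "area_mm2": 330, "transistors_bn": 2.0},
    "GTX 580 (GF110)":    {"perf": 1.3, "area_mm2": 530, "transistors_bn": 3.1},
}
for name, c in cards.items():
    print(f"{name}: {1000 * c['perf'] / c['area_mm2']:.2f} perf per 1000mm2, "
          f"{c['perf'] / c['transistors_bn']:.2f} perf per billion transistors")
```

By these crude numbers the smaller, gaming-focused GF114 delivers roughly 20-25% more gaming performance per mm2 and per transistor, which is the point about the GPGPU overhead carried by the big chip.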
Hence a gaming optimised Nvidia card will have a much smaller die than one which is optimised for non-gaming tasks too.
No doubt.
The same goes for the lower-end AMD GPUs. Barts, for example, cannot do DP, whereas the HD6900 series can. When AMD moves to the VLIW4 arrangement from top to bottom with the HD7000 series, you will see that the higher-end GPUs are relatively larger than the lower-end, more gaming-orientated ones. This is because the higher-end GPUs have to accommodate more functionality which, in most cases, is not useful for games.
Yep
However, one more thing: the AMD designs seem to be more transistor-dense than their Nvidia counterparts.
By definition the process describes how densely the transistors are packed, so I take exception to the term "transistor dense". This may just be a matter of semantics though, so let's see...
If you look at Barts (HD6870), which has 1.7 billion transistors, it is around 255mm2 in area. The GF106 (GTS450) has around 1.17 billion transistors and is 238mm2 in area. Both GPUs are fabbed on the same process by the same company.
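Just dividing the quoted figures gives the nominal density each way:

```python
# Nominal density from the quoted figures: transistor count / die area.
chips = {
    "Barts (HD 6870)": (1.70e9, 255),   # (transistors, die area in mm2)
    "GF106 (GTS 450)": (1.17e9, 238),
}
for name, (transistors, area_mm2) in chips.items():
    print(f"{name}: {transistors / area_mm2 / 1e6:.1f}M transistors per mm2")
```

That's roughly 6.7M/mm2 against 4.9M/mm2 on the same 40nm process, which is the discrepancy being discussed below.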
You may be on to something here, but I abhor the choice of terms.
Because to use "Transistor density" literally, you'd have to look elsewhere for explanations for this. I know quite a lot of sources tend to run away with examples like this and say x is less dense than y, but they are contradicting themselves. If you sit down and do a nanoscale design you will see why this isn't true and can't be true.
The reason you may find two chips of the same size with two different transistor counts will always be that either parts of the chip are disabled or parts are simply not etched with transistors. I suppose you could call that transistor density -- but I wouldn't, because the functional (denser) parts of both circuits will have EQUAL DENSITY.
PCB layout is a complex constrained optimisation problem in electronics engineering, and there is a whole frontier of research on it in computer science and applied mathematics. There are numerous techniques used to make more efficient use of PCB space, and whether there is a fast way to determine the optimal layout is in fact a very important open problem; there is a $1 million prize from the Clay Mathematics Institute for anyone who can show it can be done (or show it can't). Similarly, laying out substrates and metal on a CMOS circuit can be quite complex, even with the standard cell-based design and the automated VLSI layout synthesis tools we tend to use in the field.
Now, it could well be that NVIDIA's designs are technically less space-optimised, but this would have more to do with the silicon EDA tools used by NVIDIA than with the design itself. (If you look at the bottom, lowest level of silicon design, a lot of it consists of using standard cells to implement more complex functions. Therefore there is less room for poor optimisation in the design itself, though there is, as always, a lot of room for efficient use of mask area.)
Even if you were right and NVIDIA's designs are intrinsically less compact in their transistor usage in various areas, I would say that the GTS 450 vs an HD 6000 is not the best comparison, because the GTS 450 is 1) one gen older and 2) low-end compared to the 6870. I don't know too much about the GTS 450 or the 6870, and I'm sure you'll correct me if I'm wrong, but is it possible one has parts of its design disabled? If the GTS 450 has more parts disabled than the 6870, that could easily explain active transistor count vs die size.
There may also be other design reasons: say the GTS 450 is a design "stop off", a milestone on the way to some bigger design (as is common in industry), in which case it may contain spurious circuit elements that are also disabled. A lot of this is confidential information that we will NEVER really hear about. I am always wary of jumping to conclusions about someone else's design, and I often find it absolutely hilarious when people (especially a lot of these 'tech sites') do jump to such wild estimates and purport to tell industry players like NVIDIA and AMD what they should be doing with their chips. It makes for sensational journalese but is very poor form for an engineer or researcher; this is what, I think, is called hubris. Pick up a microelectronics design book and you will find none of that attitude there.
Indeed chip design is a very complex affair. I can tell you that microelectronic circuits are the most complex man-made objects in the universe. In the first nanoelectronics class I ever took (a rather specialised affair back in 2005), the professor told us how chips in the early days used to be laid out on the floors of aircraft hangars, because that was the only indoor space large enough to lay out a chip. He then paused and added, "And chips back then were a billion times less complex. Now you couldn't hope to do it if you had an entire city as your design floor."
Anyway, the point of that was to say that it would be best to compare top-end GPUs and see whether there is a difference in transistor count vs chip size, because lower-end or midrange chips are definitely not going to be representative, due to various complexities. If you still find a discrepancy then I'll grant you that NVIDIA's design is making less efficient use of chip space.