
Possible huge ATI tessellation performance boost with 10.5 drivers

nVidia has implemented scalable geometry processing within the shader processor pipeline; tessellation isn't done using traditional compute "emulated within the CUDA cores", as is the common misconception. If it were, performance would be terrible. To be very clear, the CUDA cores are an entirely separate part of the pipeline from the tessellator.
 
Well, tessellation in games like DiRT 2 is on par with a 5870, so what does that tell you? It tells me that when Fermi has to render a proper game environment, its tessellation performance is compromised by the cores/shader processor pipeline doing more work.
 
Pretty funny that they admit the tessellation emulation is practically useless in games but will add it anyway merely to boost benchmarks.
 
I would say there are too many variables to draw that conclusion - it could be that ATI is much more efficient than the nVidia cards at rendering a certain shader that's in the scene.

All the tessellation benchmarks - and some of those ARE heavy on shader and other processing - indicate that the GF100 is much better equipped for tessellation performance. There aren't enough games properly using it to draw a firm conclusion.
 
I'm not an expert on this bit - I'm happy for someone to show me I'm wrong - but I'm pretty sure that the way ATI are proposing to do tessellation on the SPs would mean they have to do triangle setup for the viewpoint twice: once for the original data and again for the tessellated output. A good part of hardware tessellation performance comes from the fact that the dedicated implementation only has to transform to the viewpoint once.
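
To make that concrete, here is a rough sketch in Python of the transform counting behind that argument. The triangle count and expansion factor are invented, and the two functions are only illustrative models of the two approaches, not how either vendor's pipeline actually works.

    def hw_tessellation_transforms(input_tris, expansion):
        # Dedicated tessellator: only the expanded output is set up/transformed, once.
        return input_tris * expansion * 3

    def shader_tessellation_transforms(input_tris, expansion):
        # Hypothetical shader-based path described above: the original geometry is
        # set up once to feed the SPs, then the tessellated output is set up again.
        return input_tris * 3 + input_tris * expansion * 3

    tris, factor = 100000, 16                             # made-up scene figures
    print(hw_tessellation_transforms(tris, factor))       # 4800000
    print(shader_tessellation_transforms(tris, factor))   # 5100000

How much that extra setup pass hurts obviously depends on the expansion factor and on how expensive setup is relative to the rest of the frame, so take it as a picture of the argument rather than a measurement.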
 
As said, it will probably mean nothing, but also as said, AMD stuck a tessellator in 3 years ago and it's gone unused. To stick in a massive tessellator would have been wasteful and would have reduced performance (assuming a bigger tessellator but the same size die) due to fewer shaders or missing core logic somewhere. To benefit a mere handful of games, that would have been ludicrous. Now that tessellation is ACTUALLY IN a game, as opposed to sitting in the hardware wasted for 3 years, now is the time AMD can go, "well, it's probably worth it now, let's bump up the tessellator performance".

Thing is, Nvidia's design can currently only do it with more shader clusters, and there's a lot of core logic involved with each shader cluster; AMD can increase tessellator performance massively in future hardware with a very small die size penalty.

AMD went WAY too early on tessellation 3 years ago (only because Nvidia had it removed from the spec). Nvidia finally added it, and with it a lot of devs will start adding it to future games, as now all hardware can do it. However, those games won't be here for some time; it won't be standard or heavily used for 2 generations. Nvidia could have 300 times the tessellation performance and games would still only use as much as AMD have right now, so it's wasted, just as AMD's tessellator was wasted for 3 years.


This exercise, as I said, is for willy waving in the Heaven 2 benchmark and probably not much else.

Even if it massively improves tessellation performance, it's still only in a couple of games, and it's probably nothing compared to what simply doubling or trebling the size of the tessellator in hardware would do.

For Nvidia to get double their current performance, they need to increase the core size by 80% with twice the shader clusters; AMD need to increase their core size by maybe 5%.

I've had my 5850 (at over 1GHz clock speeds) for over 6 months, and other than Heaven 2 I haven't played a game with tessellation enabled (the one I could have, Metro 2033, simply isn't worth it at all; even if I had two GTX 480s I wouldn't enable it).

Despite all that, I still think Nvidia's tessellation performance has ONLY shown an advantage in one benchmark and not in any tessellation-using games; crap as those games are, it still hasn't shown an advantage in tessellation performance there.

If AMD manage to spank the GTX 480 in Heaven 2, will I run around talking about how I've got the best tessellating card in the world...... no, because it's still not used effectively in anything, and until it is, I don't care. If they triple performance by doing it on shaders and there isn't a game I can use that performance in, I just don't care.

Now, if the GTX 480 is 50% faster in the first game that uses tessellation well, so be it; if AMD are, good for them. By the time that game is out it's very likely I'll have a 7850........ or a GTX 680, though I think we all know there's a VERY slim chance of the latter happening.

Tessellation is a great feature. If it had started being used 3 years ago, neither card available now, the 5870 or the GTX 480, would have the same tessellation hardware; if it had been featured from 3 years ago it would be standard, and both AMD and Nvidia would have since bumped up tessellation power (and optimised better for it, as real-world experience shows the best ways to use it). It's rubbish to say the current gen can't handle it; both COULD. If it were standard by now, the past 2 generations of both companies' cards would have implemented larger tessellation units within their cores, and by this generation they would either go for a bigger core for more tessellation power or sacrifice some other things for it.

That's the problem: if it were being used, current-gen cards simply wouldn't be what they are today. Now, as games use it and more and more devs implement it more and more heavily, both companies will increase tessellation power. The most important thing, by absolute miles, is that both companies HAVE tessellation units; the power doesn't matter. They both have it, so games will start to use it....... that's the first step. Ramping up usage and the performance to use it is the next step, and you can't take the next step without the first important step, which is making it a standard.
 
For Nvidia to get double their current performance, they need to increase the core size by 80% with twice the shader clusters; AMD need to increase their core size by maybe 5%.

Not actually true; again, that's the common misconception that it's "emulated on the CUDA cores". They only need to increase the PolyMorph engine component of each SM, which is less than a 20% increase in core size, not 80% (although, given the process issues, that is probably just as difficult to achieve). Another thing to bear in mind is that (I think) the PolyMorph engines are still running at half clock due to issues that couldn't be ironed out in time for release; with a better process and fewer heat/power issues they could bump them up to full speed.
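
As a back-of-envelope comparison (a Python sketch using only the figures claimed in this thread, none of them measured die data), the area cost of the two scaling routes looks something like this:

    # All numbers are assumptions taken from the posts above, not real die measurements.
    polymorph_share = 0.20      # claimed upper bound: PolyMorph engines < 20% of the core

    # Doubling only the PolyMorph engines to double geometry throughput:
    growth_polymorph_only = 1 + polymorph_share   # ~1.20x core size
    # Doubling whole shader clusters instead (the "80% bigger core" reading):
    growth_whole_clusters = 1.80

    print(growth_polymorph_only, growth_whole_clusters)   # 1.2 1.8

Running the PolyMorph engines at full clock instead of the assumed half clock would, on this same simplistic model, double geometry throughput with no area increase at all, which is presumably the point being made.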
 
Pretty funny that they admit the tessellation emulation is practically useless in games but will add it anyway merely to boost benchmarks.

That's because, in the madness we live in today, it seems that people will buy a card based on a benchmark program's score rather than actual game performance.
 
Is that right... care to explain how Nvidia does tessellation on Fermi cards? With no dedicated tessellation hardware, how does it do it... magic?

Nvidia does it using the Compute Unified Device Architecture. It shows how versatile the CUDA cores are that they can do everything, including tessellation, without the need for separate engines etc. This way CUDA cores can even be allocated to different tasks on a per-game basis.

ATI have no unity in their architecture, hence why it's all over the place, with inconsistent framerates over a broad range of games and poor minimum framerates. You would expect something with 1600 shaders to be at least 3-fold faster than Nvidia's measly 480 CUDA cores.

Yet in reality the 1600-shader monster can't lift a finger against the 480-CUDA-core pussy cat.

Quality over quantity.
 
AMD's first play with hardware tessellation was 9 years ago, when the 8500s were released :)

Any games out that actually used it?

Come to think of it, the 2900 was touted as having tessellation. Now that tessellation is being used, it turns out the 29** to 4-series cards are useless at it. Hopefully people didn't buy those cards hoping that one day they would automatically qualify for DX11 support with a simple driver update.
 
Nvidia does it using the Compute Unified Device Architecture. It shows how versatile the CUDA cores are that they can do everything, including tessellation, without the need for separate engines etc. This way CUDA cores can even be allocated to different tasks on a per-game basis.

ATI have no unity in their architecture, hence why it's all over the place, with inconsistent framerates over a broad range of games and poor minimum framerates. You would expect something with 1600 shaders to be at least 3-fold faster than Nvidia's measly 480 CUDA cores.

Yet in reality the 1600-shader monster can't lift a finger against the 480-CUDA-core pussy cat.

Quality over quantity.

Please read my post - nVidia cards do NOT do tessellation on the CUDA cores.
 
GF100’s entire graphics pipeline is designed to deliver high performance in tessellation and geometry throughput. GF100 replaces the traditional geometry processing architecture at the front end of the graphics pipeline with an entirely new distributed geometry processing architecture that is implemented using multiple “PolyMorph Engines”. Each PolyMorph Engine includes a tessellation unit, an attribute setup unit, and other geometry processing units. Each SM has its own dedicated PolyMorph Engine (we provide more details on the Polymorph Engine in the GF100 architecture sections below). Newly generated primitives are converted to pixels by four Raster Engines that operate in parallel (compared to a single Raster Engine in prior generation GPUs). On-chip L1 and L2 caches enable high bandwidth transfer of primitive attributes between the SM and the tessellation unit as well as between different SMs.
Tessellation and all its supporting stages are performed in parallel on GF100, enabling breathtaking geometry throughput.


from the GF100 whitepaper.... http://www.nvidia.com/object/GTX_400_architecture.html
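
To summarise the quoted layout as a tiny structural sketch (Python), here is roughly what that front end looks like. The counts are for a full 16-SM GF100 die, and the per-engine stage names beyond the tessellation and attribute setup units the quote mentions are my recollection of the same whitepaper, so treat them as approximate.

    from dataclasses import dataclass, field

    @dataclass
    class PolyMorphEngine:
        # Fixed-function geometry stages; one engine per SM, as the quote says.
        stages: tuple = ("vertex fetch", "tessellator", "viewport transform",
                         "attribute setup", "stream output")

    @dataclass
    class SM:
        cuda_cores: int = 32
        polymorph: PolyMorphEngine = field(default_factory=PolyMorphEngine)

    @dataclass
    class GF100:
        sms: tuple = field(default_factory=lambda: tuple(SM() for _ in range(16)))
        raster_engines: int = 4   # rasterise the generated primitives in parallel

    chip = GF100()
    print(len(chip.sms), "PolyMorph engines feeding", chip.raster_engines, "raster engines")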
 
You really think I've actually read the white paper on the GF100, or on any gfx card for that matter? Man, that piece of string you were on about earlier, I've got it wrapped over your finger and been tugging all this time. ;) Ouch, you didn't expect that to come back and hit you in that way now, did ya???
 