Fermi possible pricing

Care to quantify? I thought it had already been shown/rumoured that in heavily tessellated scenes Fermi ties up so many of its shaders that performance drops?

Hence I think it will be up and down as to where Fermi wins and loses.

If you look at the architecture, the GF100 load-balances tessellation over all the shader clusters, so it doesn't tie up the entire shader pipeline for the duration of the tessellation workload; individual units can continue execution when their share of the work is done, or if they aren't required for tessellation. So from the start, the actual performance hit on the shaders isn't as big as certain people make out. That's unlike ATI, where you're entirely dependent on the performance of the dedicated tessellation unit and have to wait until it's done its job before proceeding with anything that relies on tessellation (performance in the Heaven benchmark would tend to indicate that you're stalling much, if not all, of your shader pipeline when processing tessellation anyway, or that the tessellation unit is really weak).
 
From the very few benchmarked results released for the Fermi chip, tessellation is one of the areas where its main strength lies.

I understood that with old DX9-type apps there wouldn't be a huge difference over current-gen ATI, but with DX11, CUDA etc. it would show its strengths.

[Attached image: tesselation.jpg - Nvidia tessellation benchmark graphs]

 
From the very few benchmarked results released for the Fermi chip, tessellation is one of the areas where its main strength lies.

I understood that with old DX9-type apps there wouldn't be a huge difference over current-gen ATI, but with DX11, CUDA etc. it would show its strengths.

I think the tessellation performance will only be this good in synthetic benchmarks, not actual games.
 
So does that mean game developers will have to dumb down the tessellation settings in games for ATI's sake, even though Nvidia can probably use tessellation better?

I'm not 100% sure on this, so take it with a pinch of salt, but I think I've read that Nvidia can divert their programmable shaders to near-pure tessellation, whereas ATI can't. So in a real game, tessellation-heavy or not, the Nvidia shaders will be doing much more than pure tessellation.
 
So does that mean game developers will have to dumb down the tessellation settings in games for ATI's sake, even though Nvidia can probably use tessellation better?

Not really; one of the cool features of tessellation is that it's extremely easy to scale. In fact, one tessellation mode, called adaptive tessellation, adjusts the level of tessellation on the fly depending on how far away you are from an object.
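
As a rough illustration of that idea, here's a minimal sketch of a distance-based tessellation factor. The constants and the linear falloff are invented assumptions for the example, not values from any real engine or driver:

```python
# Minimal sketch of distance-adaptive tessellation (illustrative only;
# the constants and falloff curve are made-up assumptions, not values
# from any real engine or driver).

def adaptive_tess_factor(patch_centre, camera_pos,
                         max_factor=64.0, min_factor=1.0,
                         full_detail_dist=5.0, cutoff_dist=200.0):
    """Return a tessellation factor that falls off with distance.

    Patches closer than full_detail_dist get the maximum factor;
    patches beyond cutoff_dist get the minimum (effectively no extra
    subdivision). In between, detail falls off linearly.
    """
    dx, dy, dz = (p - c for p, c in zip(patch_centre, camera_pos))
    dist = (dx * dx + dy * dy + dz * dz) ** 0.5
    if dist <= full_detail_dist:
        return max_factor
    if dist >= cutoff_dist:
        return min_factor
    t = (dist - full_detail_dist) / (cutoff_dist - full_detail_dist)
    return max_factor + t * (min_factor - max_factor)

# A nearby patch gets heavy subdivision, a distant one much less:
print(adaptive_tess_factor((0, 0, 4), (0, 0, 0)))    # 64.0
print(adaptive_tess_factor((0, 0, 150), (0, 0, 0)))  # ~17.2
```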
 
Well, anyhow... I really hope that Nvidia can deliver some decent mid-price-range cards: something like the same price as a 5850 but with better performance, or the same performance as a 5850 at a lower price... unlikely as that may be.
 
I think the tessellation performance will only be this good in synthetic benchmarks, not actual games.

Probably the opposite, since the Fermi architecture is much better at balancing workloads, with a unified architecture for tessellation and shading. As games get more advanced, the Fermi architecture should pull further ahead in performance.
By then the next ATi chip should be out, and hopefully ATI will have a new architecture rather than DX11 tacked on.
 
Nvidia are obviously very serious about using tessellation to its full extent, and that may be the reason for such lengthy delays. At some stage ATI are going to have to think along the same lines.


The overall goal of DirectX 11 will be to ease the workload of current GPUs and of game developers. To achieve these goals, new types of shader are being added to the API, including a hull shader and a domain shader around a fixed-function tessellator, in addition to the compute shader which has been mentioned previously.
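
For reference, those new stages sit between the vertex and geometry shaders. Here's a minimal sketch of the stage ordering, with Python used purely as pseudocode; the functions are trivial stand-ins, not D3D11 API calls:

```python
# Illustrative sketch of where the DX11 tessellation stages sit in the
# pipeline. These are trivial stand-in functions, not D3D11 API calls.

def vertex_shader(patch):
    # Per-control-point transform (identity here for simplicity).
    return patch

def hull_shader(control_points, requested_factor):
    # Programmable stage: decides how finely to subdivide this patch.
    return max(1, int(requested_factor))

def fixed_tessellator(factor):
    # Fixed-function stage: generates (u, v) sample points on the patch.
    n = factor + 1
    return [(i / factor, j / factor) for i in range(n) for j in range(n)]

def domain_shader(control_points, uv):
    # Programmable stage: evaluates the surface at one sample point
    # (bilinear interpolation across a 4-point quad patch here).
    u, v = uv
    p00, p10, p01, p11 = control_points
    return tuple((1 - u) * (1 - v) * a + u * (1 - v) * b
                 + (1 - u) * v * c + u * v * d
                 for a, b, c, d in zip(p00, p10, p01, p11))

quad = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
cps = vertex_shader(quad)
factor = hull_shader(cps, 4)
verts = [domain_shader(cps, uv) for uv in fixed_tessellator(factor)]
print(len(verts))  # 25 vertices generated from a 4-point patch at factor 4
```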
 
hmmm, with no benchmarks leaked by Nvidia, I'm betting my money on the card sucking badly.

DX11 tacked on?? wtf?? It's either compliant with the standard or it isn't, lol
 
Nvidia threatens its partners at CeBIT
CeBIT 2010: Don't you dare talk GTX480

This is from SA; it doesn't fill me with confidence, and they're launching them later in the month.
 
If you look at the architecture, the GF100 load-balances tessellation over all the shader clusters, so it doesn't tie up the entire shader pipeline for the duration of the tessellation workload; individual units can continue execution when their share of the work is done, or if they aren't required for tessellation. So from the start, the actual performance hit on the shaders isn't as big as certain people make out. That's unlike ATI, where you're entirely dependent on the performance of the dedicated tessellation unit and have to wait until it's done its job before proceeding with anything that relies on tessellation (performance in the Heaven benchmark would tend to indicate that you're stalling much, if not all, of your shader pipeline when processing tessellation anyway, or that the tessellation unit is really weak).

Probably the opposite, since the Fermi architecture is much better at balancing workloads, with a unified architecture for tessellation and shading. As games get more advanced, the Fermi architecture should pull further ahead in performance.
By then the next ATi chip should be out, and hopefully ATI will have a new architecture rather than DX11 tacked on.

Big assumptions being made here - how does the load balancing work?

Does the card recognise that a game which has gone over, for the sake of argument, 80 fps gets cut off from over-using the shaders, thereby freeing up the remaining shaders for something like tessellation?

Roff, I don't understand how Nvidia would be immune to stalling; any job that depends on tessellation has to wait for that processing to be done before being executed, whether on Nvidia or ATi, so stalling is not unique to ATi.

How does load balancing work when you throw PhysX into the mix? I believe Nvidia's approach will not be noticeably better than ATi's method: better, yes, but not by a lot. It certainly won't make up for the shortcomings in price and power consumption, and perhaps even a shorter lifespan, since Nvidia is being iffy with the warranty on their new cards.
 
Big assumptions being made here - how does the load balancing work?

Does the card recognise that a game which has gone over, for the sake of argument, 80 fps gets cut off from over-using the shaders, thereby freeing up the remaining shaders for something like tessellation?

Roff, I don't understand how Nvidia would be immune to stalling; any job that depends on tessellation has to wait for that processing to be done before being executed, whether on Nvidia or ATi, so stalling is not unique to ATi.

It's a little simpler than that, actually:

Each frame has a specific amount of calculation work to do (mostly multiply-add computations of some kind) before it is sent to the back end to be rendered. The load-balancing algorithm makes an assessment of the amount of work required by each component (shaders, tessellation etc.) and assigns effort based on this assessment (i.e. dedicates certain clocks on certain shaders to one component or the other).

Of course, the load-balancing algorithm will never give a perfect estimate of the amount of work required by one component or the other, but it can be iteratively improved. That is, if the algorithm under-estimates the amount of work required by the shaders over several consecutive frames (i.e. the shaders are the last to finish their workload), it will adjust the balance for the next frame. Anyway, the point is that this happens on a frame-by-frame basis, so the actual framerate the game is running at has no bearing. The goal of the load balancing is to finish each frame as quickly as possible.
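
A toy sketch of that feedback loop (purely illustrative: the real allocation is done in hardware/driver, and every number here is invented, reflecting nothing about actual GF100 internals):

```python
# Toy feedback loop for frame-by-frame load balancing (illustrative only;
# real GPUs do this in hardware/driver, and these numbers are invented).

def rebalance(share, finish_times, rate=0.1):
    """Shift a fraction of capacity towards whichever component finished
    last (i.e. was under-provisioned) on the previous frame."""
    slowest = max(finish_times, key=finish_times.get)
    for component in share:
        if component == slowest:
            share[component] += rate * (1.0 - share[component])
        else:
            share[component] *= (1.0 - rate)
    total = sum(share.values())
    return {c: s / total for c, s in share.items()}  # renormalise

# Fixed per-frame workloads (arbitrary units) and an initially bad split:
work = {"shading": 8.0, "tessellation": 2.0}
share = {"shading": 0.5, "tessellation": 0.5}

for frame in range(20):
    finish = {c: work[c] / share[c] for c in work}  # time = work / capacity
    share = rebalance(share, finish)

print(share)  # settles around ~{'shading': 0.8, 'tessellation': 0.2}
```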


...As for the question of which approach to tessellation (ATI's or Nvidia's) is best: it's swings and roundabouts, really. In situations where you have very little tessellation to do, it makes sense to use an external tessellator (as ATI do), as it will always finish before the rest of the frame's workload and won't take away from GPU performance at all. If you have a LOT of tessellation, it's more efficient to balance it over the whole shader region (as Nvidia do). In that case, with an external tessellator, the rest of the GPU is left waiting for the tessellation unit to finish.

Anyway, that's what the Nvidia graphs above are showing. Of course they have chosen scenarios that show a massive improvement over the 5870, but really all they express is the different approach to tessellation taken by the two companies.



To summarise:

Very little tessellation: ATI's approach is more effective (no loss of GPU power).
Heavy tessellation: Nvidia's approach is more effective (can utilise almost the entire GPU for tessellation).

Where the threshold between the two occurs, and how much tessellation will be used in real-world games, is anyone's guess at this point.
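
To make that crossover concrete, here's a toy model with entirely invented throughput numbers: with a dedicated unit, tessellation runs in parallel with shading until it becomes the bottleneck; with a distributed approach, tessellation and shading share one pool of units.

```python
# Toy crossover model for dedicated vs distributed tessellation
# (all throughput numbers are invented for illustration).

SHADER_POOL = 100.0   # total shader throughput, arbitrary units/ms
DEDICATED   = 10.0    # dedicated tessellator throughput, units/ms

def frame_time_dedicated(shade_work, tess_work):
    # Tessellator runs alongside the shaders; the slower side wins.
    return max(shade_work / SHADER_POOL, tess_work / DEDICATED)

def frame_time_distributed(shade_work, tess_work):
    # Shaders do everything; the work is summed over the shared pool.
    return (shade_work + tess_work) / SHADER_POOL

shade_work = 100.0
for tess_work in (1, 5, 10, 20, 50):
    d = frame_time_dedicated(shade_work, tess_work)
    u = frame_time_distributed(shade_work, tess_work)
    print(f"tess={tess_work:>3}: dedicated={d:.2f} ms, distributed={u:.2f} ms")

# tess=  1: dedicated=1.00 ms, distributed=1.01 ms
# tess=  5: dedicated=1.00 ms, distributed=1.05 ms
# tess= 10: dedicated=1.00 ms, distributed=1.10 ms
# tess= 20: dedicated=2.00 ms, distributed=1.20 ms   <- crossover
# tess= 50: dedicated=5.00 ms, distributed=1.50 ms
```

In this made-up model the dedicated unit is free performance until the tessellation load saturates it, after which the distributed approach wins by a widening margin; where that threshold sits on real hardware is exactly the open question above.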
 
Big assumptions being made here - how does the load balancing work?

Does the card recognise that a game which has gone over, for the sake of argument, 80 fps gets cut off from over-using the shaders, thereby freeing up the remaining shaders for something like tessellation?

Roff, I don't understand how Nvidia would be immune to stalling; any job that depends on tessellation has to wait for that processing to be done before being executed, whether on Nvidia or ATi, so stalling is not unique to ATi.

How does load balancing work when you throw PhysX into the mix? I believe Nvidia's approach will not be noticeably better than ATi's method: better, yes, but not by a lot. It certainly won't make up for the shortcomings in price and power consumption, and perhaps even a shorter lifespan, since Nvidia is being iffy with the warranty on their new cards.

You don't really understand load balancing.

To simplify: this is the beauty of a unified shader architecture. You have clusters of processing units that can be applied to a broad range of processing needs, and each can work on a completely different task, independent of the others. In your average tessellation workload you're not going to load up every single cluster equally; some will finish earlier than others. Rather than waiting until they are all done, you can re-task the finished clusters onto whatever else needs working on.

This is why we moved away from fixed function pipelines in the first place.
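
As a rough software analogy (a minimal sketch only; real GPU scheduling is done in hardware and works nothing like Python threads), re-tasking idle clusters is essentially a shared work queue: any cluster that finishes its tessellation chunk immediately pulls the next pending job of any type.

```python
# Software analogy for re-tasking shader clusters (illustrative only;
# real GPU scheduling is done in hardware, not with Python threads).

import queue
import threading
import time

jobs = queue.Queue()
for i in range(6):
    jobs.put(("tessellate", i))   # uneven chunks of tessellation work
for i in range(6):
    jobs.put(("shade", i))        # other work queued up behind it

def cluster(name):
    # Each "cluster" pulls whatever job is next, regardless of type,
    # so none of them sits idle waiting for tessellation to finish.
    while True:
        try:
            kind, n = jobs.get_nowait()
        except queue.Empty:
            return
        time.sleep(0.01 * (n + 1))   # pretend chunks take uneven time
        print(f"{name} finished {kind} #{n}")

workers = [threading.Thread(target=cluster, args=(f"cluster-{c}",))
           for c in range(4)]
for w in workers: w.start()
for w in workers: w.join()
```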

PhysX is a different story: due to the way it works, you need to process all the physics calculations before you can even start rendering the scene.
 
PhysX is a different story: due to the way it works, you need to process all the physics calculations before you can even start rendering the scene.

This is an important point :)

The physics must be computed before anything else can be rendered, in order to know what exactly is being rendered! As a result, the loss of performance from PhysX will always be much greater than from other on-chip computation processes (like tessellation).
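
In terms of the toy frame-time models above (invented millisecond figures again), the difference is that physics adds serially to every frame, while tessellation can overlap with other work:

```python
# Toy comparison: serial physics vs overlappable tessellation
# (arbitrary millisecond figures, invented for illustration).

shading = 10.0   # ms of shader work per frame
tess    = 4.0    # ms of tessellation; can overlap with shading
physics = 4.0    # ms of PhysX; must finish before rendering starts

# Tessellation overlapping with shading: the longer of the two wins.
frame_with_tess = max(shading, tess)      # 10.0 ms

# Physics is a strict dependency: it adds on top, with no overlap.
frame_with_physics = physics + shading    # 14.0 ms

print(frame_with_tess, frame_with_physics)
```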
 

Good explanation.

We can pretty much take it for granted that those results show nVidia in the best light, and actual real-world results would be more like half of those.

Personally I think the dedicated tessellation unit on the 5 series is rather weak, weaker than people like to think; Heaven doesn't show it in a good light, tbh. You could potentially get almost the same performance doing the subdivision and deformation on a fast CPU.
 