Fermi possible pricing

Rroff · 1 Mar 2010 at 23:34

Greebo said:
Care to quantify? I thought it has been shown/rumoured already that in heavy tessellation scenes the Fermi uses up so many of its shaders the performance drops?

Hence I think it will be up and down as to where Fermi wins and loses.

If you look at the architecture, the GF100 load balances tessellation over all shader clusters, so it doesn't tie up the entire shader pipeline for the duration of the tessellation workload. Discrete units can continue execution when their workload is done or if they aren't required for tessellation, so from the start the actual performance hit on the shaders isn't as big as certain people make out. Unlike ATI, where you're entirely dependent on the overall performance of the dedicated tessellation unit and have to wait until it has done its job before proceeding with anything that relies on tessellation (performance in the Heaven benchmark would tend to indicate that you're stalling much, if not all, of your shader pipeline when processing tessellation anyhow, or that the tessellation unit is really weak).

Blackbadger · 1 Mar 2010 at 23:52

From the very few benchmark results released for the Fermi chip, tessellation is one of the areas where its main strength lies.

I understood that with old DX9-type apps there wouldn't be a huge difference over current gen/ATI, but with DX11, CUDA etc. it would show its strengths.

http://www.nvidia.com/object/IO_86775.html

straxusii · 2 Mar 2010 at 07:28

Blackbadger said:
From the very few benchmark results released for the Fermi chip, tessellation is one of the areas where its main strength lies.

I understood that with old DX9-type apps there wouldn't be a huge difference over current gen/ATI, but with DX11, CUDA etc. it would show its strengths.

I think the tessellation performance will only be this good in synthetic benchmarks, and not in actual games.

Marine-RX179 · 2 Mar 2010 at 08:01

straxusii said:
I think the tessellation performance will only be this good in synthetic benchmarks, and not in actual games.

So does that mean game developers will have to dumb down the tessellation settings in games for ATI's sake, even though Nvidia can probably use tessellation better?

straxusii · 2 Mar 2010 at 08:13

Marine-RX179 said:
So does that mean game developers will have to dumb down the tessellation settings in games for ATI's sake, even though Nvidia can probably use tessellation better?

I'm not 100% sure on this so take it with a pinch of salt, but I think I've read that Nvidia can divert their programmable shaders to ~pure tessellation whereas ATI can't. So in a real game, tessellation-heavy or not, the Nvidia shaders will be doing much more than pure tessellation.

Lightnix · 2 Mar 2010 at 08:17

Marine-RX179 said:
So does that mean game developers will have to dumb down the tessellation settings in games for ATI's sake, even though Nvidia can probably use tessellation better?

Not really, one of the cool things about tessellation is that it's extremely easy to scale. In fact one tessellation mode, called adaptive tessellation, adjusts the level of tessellation on the fly depending on how far away you are from an object.
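
As a rough illustration of that distance-based scaling, here is a minimal Python sketch. The function name, the near/far distances and the linear falloff are all assumptions made up for the example; only the upper limit of 64 corresponds to the maximum tessellation factor in Direct3D 11. In a real renderer this kind of calculation would normally be done per patch in the hull shader rather than on the CPU.

```python
def tess_factor(distance, near=5.0, far=100.0, min_factor=1.0, max_factor=64.0):
    """Pick a tessellation factor that falls off linearly with distance.

    near/far and the linear falloff are hypothetical choices for this sketch.
    """
    if distance <= near:
        return max_factor          # close objects get full detail
    if distance >= far:
        return min_factor          # distant objects are left almost untessellated
    # Blend linearly between the two limits.
    t = (distance - near) / (far - near)
    return max_factor + t * (min_factor - max_factor)


if __name__ == "__main__":
    for d in (2.0, 25.0, 52.5, 80.0, 150.0):
        print(f"distance {d:6.1f} -> factor {tess_factor(d):5.1f}")
```

A slower card could simply lower max_factor (or pull far in) to get the same scene with less geometry, which is the sense in which tessellation is easy to scale.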

Marine-RX179 · 2 Mar 2010 at 08:38

Well, anyhow... I really hope that Nvidia can deliver some decent mid-price-range cards: something at the same price as the 5850 but with better performance, or the same performance as the 5850 at a lower price... unlikely as that may be.

kitch9 · 2 Mar 2010 at 08:39

Blackbadger said:
From the very few benchmark results released for the Fermi chip, tessellation is one of the areas where its main strength lies.

I understood that with old DX9-type apps there wouldn't be a huge difference over current gen/ATI, but with DX11, CUDA etc. it would show its strengths.

http://www.nvidia.com/object/IO_86775.html

You are a marketer's wet dream.

D.P. · 2 Mar 2010 at 09:21

Greebo said:
Won't that be plus VAT so

£317 to £423 for the lower model and

£529 to £635 for the GTX480?

I can see these flying off the shelf.

No, if the price was quoted in euros it will include VAT at 20%.

D.P. · 2 Mar 2010 at 09:26

straxusii said:
I think the tessellation performance will only be this good in synthetic benchmarks, and not in actual games.

Probably the opposite, since the Fermi architecture is much better at balancing workloads through a unified architecture for tessellation and shading. As games get more advanced, the Fermi architecture will pull ahead further in performance.
By then the next ATI chip should be out, and hopefully ATI will have a new architecture rather than DX11 tacked on.

Admiral Huddy · 2 Mar 2010 at 09:33

Nvidia are obviously very serious about using tessellation to its full extent, and that may be the reason for such lengthy delays. At some stage ATI are going to have to think along the same lines.

The overall goal of DirectX 11 will be to ease the workload of current GPUs and of game developers. To achieve this, new types of shader are being added to the API, including a hull shader and a domain shader for tessellation, in addition to the compute shader which has been mentioned previously.

buttkinz · 2 Mar 2010 at 09:39

Hmmm, with no benchmarks leaked by Nvidia I'm betting my money on the card sucking badly.

dx11 tacked on?? wtf?? its either compliant to the standard or not lol

Admiral Huddy · 2 Mar 2010 at 11:40

It would be premature to release benchmarks before any driver and/or hardware-related issues are sorted and the cards are ready for public release.

queamin · 2 Mar 2010 at 11:51

Nvidia threatens its partners at CeBIT
CeBIT 2010 Don't you dare talk GTX480

This is from SA. It doesn't fill me with confidence, and they're launching them later in the month.

twist3d0n3 · 2 Mar 2010 at 12:04

queamin said:
Nvidia threatens its partners at CeBIT
CeBIT 2010 Don't you dare talk GTX480

This is from SA. It doesn't fill me with confidence, and they're launching them later in the month.

shock and awe?

Pendu · 2 Mar 2010 at 12:49

Rroff said:
If you look at the architecture, the GF100 load balances tessellation over all shader clusters, so it doesn't tie up the entire shader pipeline for the duration of the tessellation workload. Discrete units can continue execution when their workload is done or if they aren't required for tessellation, so from the start the actual performance hit on the shaders isn't as big as certain people make out. Unlike ATI, where you're entirely dependent on the overall performance of the dedicated tessellation unit and have to wait until it has done its job before proceeding with anything that relies on tessellation (performance in the Heaven benchmark would tend to indicate that you're stalling much, if not all, of your shader pipeline when processing tessellation anyhow, or that the tessellation unit is really weak).

D.P. said:
Probably the opposite, since the Fermi architecture is much better at balancing workloads through a unified architecture for tessellation and shading. As games get more advanced, the Fermi architecture will pull ahead further in performance.
By then the next ATI chip should be out, and hopefully ATI will have a new architecture rather than DX11 tacked on.

Big assumptions being made here. How does the load balancing work?

Does the card recognise that a game has gone over, for the sake of argument, 80 fps, and cut it off from over-using the shaders, thereby freeing up the remaining shaders for something like tessellation?

Rroff, I don't understand how nVidia would be immune to stalling. Any job that is dependent on tessellation would have to wait for that processing to be done before being executed, whether on nVidia or ATi, so stalling is not unique to ATi.

How does load balancing work when you throw PhysX into the mix? I believe nVidia's approach will not be noticeably better than ATi's method: better, yes, but not by a lot. It certainly won't make up for the shortcomings in price, power consumption and perhaps even a shorter lifespan, since nVidia is being iffy with the warranty on their new cards.

Duff-Man · 2 Mar 2010 at 13:06

Pendu said:
Big assumptions being made here. How does the load balancing work?

Does the card recognise that a game has gone over, for the sake of argument, 80 fps, and cut it off from over-using the shaders, thereby freeing up the remaining shaders for something like tessellation?

Rroff, I don't understand how nVidia would be immune to stalling. Any job that is dependent on tessellation would have to wait for that processing to be done before being executed, whether on nVidia or ATi, so stalling is not unique to ATi.

It's a little simpler than that actually:

Each frame has a specific amount of calculation work to do (mostly multiply-add computations of some kind) before it is sent to the backend to be rendered. The "load balancing" algorithm will make an assessment of the amount of work required by each component (shaders, tessellation etc.), and assign effort based on this assessment (i.e. dedicate certain clocks on certain shaders to one component or the other).

Of course, the load balancing algorithm will never give a perfect estimate of the amount of work required by one component or the other, but it can be iteratively improved. That is, if the algorithm underestimates the amount of work required by the shaders in several consecutive frames (i.e. the shaders are the last to finish their workload), it will adjust the balance at the next frame. Anyway, the point is that this will happen on a frame-by-frame basis, so the actual framerate the game is running at has no bearing. The goal of the load balancing is to finish each frame as quickly as possible.
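
To make the frame-by-frame adjustment concrete, here is a toy Python sketch of that kind of feedback loop. It is only a model of the idea described above, not how any real driver or GPU scheduler works: the work amounts, the 16 units, the fixed 5% step and all function names are assumptions invented for the example.

```python
def frame_times(share, tess_work, shade_work, units=16):
    """Return (frame, tess, shade) times when `share` of the units tessellate."""
    tess_time = tess_work / (share * units)
    shade_time = shade_work / ((1.0 - share) * units)
    # The frame is done only when the slower of the two components is done.
    return max(tess_time, shade_time), tess_time, shade_time


def rebalance(share, tess_time, shade_time, step=0.05):
    """Move a little capacity toward whichever component finished last."""
    if tess_time > shade_time:
        share += step
    elif shade_time > tess_time:
        share -= step
    # Keep both components with at least some units.
    return min(0.95, max(0.05, share))


def simulate(tess_work, shade_work, frames=30, share=0.5):
    """Run the feedback loop and return the final share and per-frame times."""
    history = []
    for _ in range(frames):
        total, tess_time, shade_time = frame_times(share, tess_work, shade_work)
        history.append(total)
        share = rebalance(share, tess_time, shade_time)
    return share, history


if __name__ == "__main__":
    final_share, history = simulate(tess_work=20.0, shade_work=80.0)
    print(f"first frame: {history[0]:.2f}, best frame: {min(history):.2f}")
    print(f"tessellation share settles near {final_share:.2f}")
```

Note that nothing in the loop looks at the framerate. It only compares which component finished last in the previous frame, which is the point being made above.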

...As for the question of which approach to tessellation (ATI's or Nvidia's) is best: it's swings and roundabouts really. In situations where you have very little tessellation to do, it makes sense to use an external tessellator (like ATI do), as this will always finish before the rest of the frame's workload and won't take away from GPU performance at all. If you have a LOT of tessellation, then it's more efficient to balance it over the whole shader region (like Nvidia do). In this case, if you have an external tessellator, the rest of the GPU will just be waiting for the tessellation unit to finish.

Anyway, that's what the Nvidia graphs above are showing. Of course they have chosen scenarios that show a massive improvement over the 5870, but really all they are expressing is the different approach to tessellation that has been taken by the two companies.

To summarise:

Very little tessellation: ATI's approach is more effective (no loss of GPU power).
Heavy tessellation: Nvidia's approach is more effective (can utilise almost the entire GPU for tessellation).

Where the threshold between the two will occur, and how much tessellation will be used in real-world games, is anyone's guess at this point.
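
For anyone who wants to play with where that threshold might fall, here is a deliberately crude Python cost model of the two approaches as summarised above. Every number and name in it is hypothetical: the work units, the throughput figures and the assumption of perfect load balancing are illustration only and say nothing about the real 5870 or GF100.

```python
def dedicated_unit_time(tess_work, shade_work, tess_rate, shader_rate):
    """External tessellator running alongside the shaders.

    The frame finishes when the slower of the two finishes.
    """
    return max(tess_work / tess_rate, shade_work / shader_rate)


def balanced_time(tess_work, shade_work, shader_rate):
    """Tessellation spread over the shader array (assumes perfect balancing)."""
    return (tess_work + shade_work) / shader_rate


if __name__ == "__main__":
    shade_work, tess_rate, shader_rate = 100.0, 1.0, 10.0  # made-up figures
    for tess_work in (1.0, 5.0, 10.0, 20.0, 50.0):
        fixed = dedicated_unit_time(tess_work, shade_work, tess_rate, shader_rate)
        shared = balanced_time(tess_work, shade_work, shader_rate)
        winner = "dedicated unit" if fixed < shared else "balanced"
        print(f"tess work {tess_work:5.1f}: dedicated {fixed:6.2f}, "
              f"balanced {shared:6.2f} -> {winner}")
```

In this model the dedicated unit is "free" as long as it finishes before the shaders do, and becomes the bottleneck once its own time exceeds what the shared approach would need for both jobs combined.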

Rroff · 2 Mar 2010 at 13:17

Pendu said:
Big assumptions being made here. How does the load balancing work?

Does the card recognise that a game has gone over, for the sake of argument, 80 fps, and cut it off from over-using the shaders, thereby freeing up the remaining shaders for something like tessellation?

Rroff, I don't understand how nVidia would be immune to stalling. Any job that is dependent on tessellation would have to wait for that processing to be done before being executed, whether on nVidia or ATi, so stalling is not unique to ATi.

How does load balancing work when you throw PhysX into the mix? I believe nVidia's approach will not be noticeably better than ATi's method: better, yes, but not by a lot. It certainly won't make up for the shortcomings in price, power consumption and perhaps even a shorter lifespan, since nVidia is being iffy with the warranty on their new cards.

You don't really understand load balancing.

To simplify, this is the beauty of a unified shader architecture: you have clusters of processing that can be applied to a broad range of processing needs, and they can work on a different task completely independent of the others. In your average tessellation workload you're not going to be loading up every single cluster the same; some will be finished earlier than others. Rather than wait until they are all done, you can re-task the clusters that are done to whatever else needs working on.
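
A small Python sketch of the re-tasking idea, under heavy assumptions: tasks are just numbers representing cost, "clusters" are interchangeable workers, and the two scheduling functions are hypothetical names for this example. It compares clusters that pick up new work as soon as they are free against clusters that all wait for the slowest one in each batch.

```python
import heapq


def makespan_retask(task_costs, clusters):
    """Each cluster takes the next queued task as soon as it becomes free."""
    free_at = [0.0] * clusters      # time at which each cluster is next idle
    heapq.heapify(free_at)
    for cost in task_costs:
        start = heapq.heappop(free_at)          # earliest available cluster
        heapq.heappush(free_at, start + cost)
    return max(free_at)


def makespan_wait_for_all(task_costs, clusters):
    """Clusters work in batches and idle until the slowest task in a batch ends."""
    total = 0.0
    for i in range(0, len(task_costs), clusters):
        total += max(task_costs[i:i + clusters])
    return total


if __name__ == "__main__":
    tasks = [8, 1, 1, 1, 1, 1, 1, 1]   # one heavy task, several light ones
    print("re-tasking:   ", makespan_retask(tasks, clusters=2))
    print("wait for all: ", makespan_wait_for_all(tasks, clusters=2))
```

With an uneven workload like this one, the cluster that finishes its light task early keeps working through the queue instead of sitting idle behind the heavy task.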

This is why we moved away from fixed function pipelines in the first place.

PhysX is a different story - due to the way it works - you need to process all physics calculations before you can even start rendering the scene.

Duff-Man · 2 Mar 2010 at 13:20

Rroff said:
PhysX is a different story - due to the way it works - you need to process all physics calculations before you can even start rendering the scene.

This is an important point

The physics must be computed before anything else can be rendered, in order to know what exactly is being rendered! As a result, the loss of performance from PhysX will always be much greater than from other on-chip computation processes (like tessellation).

Rroff · 2 Mar 2010 at 13:25

Duff-Man said:
snip

Good explanation.

We can pretty much take for granted that those results are showing nVidia in their best light, and actual real-world results would be more like half those.

Personally I think the dedicated tessellation unit on the 5 series is rather weak, weaker than people like to think. Heaven doesn't show it in a good light tbh; you could potentially get almost the same performance doing the sub-division and deformation on a fast CPU.