vram stacking in sli/cf

jamiehavok · 1 Aug 2014 at 21:17

I was always under the impression that VRAM doesn't stack at all: the data is mirrored across the 2 cards and they each render their own frame with the VRAM data being the same. But this guy seems to think different!

Yes, texture caching is split evenly over the two cards (mirrored), but model rendering memory is still divided between the cards. In split frame rendering the "top card" manages the top half of the visible display and the second the bottom. In alternate frame rendering, one card is managing the current display while the other is rendering the next frame. While some of that cached data is certainly present in both scenarios, the PhysX details, light pathing, ray tracing, shader details and other various video card functions, many of which use very large chunks of VRAM, are most certainly not identical across both cards. Depending on what engine you are running, what cards you are running and what program you are actually viewing, you can get wildly different results, especially for 3D rendering platforms such as Autodesk Inventor.

so is there any truth in this!?

shankly1985 · 1 Aug 2014 at 21:20

DX11 doesn't stack, but Mantle has the possibility to stack, so DX12 could too.

jamiehavok · 2 Aug 2014 at 00:01

..so it's dependent on the API then? But the tech is still in its infancy as far as gaming is concerned? What about non-gaming purposes? I can't find anything about this. If this is true, VRAM not stacking is likely the biggest myth on the internetz!

Rroff · 2 Aug 2014 at 00:08

What he is saying doesn't really change the stacking v mirroring thing - each card will have some slight differences in what they have in memory depending on their current operations but they will still be mirroring the main data lump that they are originally working from.

This is one of the problems which can break multi GPU compatibility or performance i.e. when a texture is modified on one card to render the current frame and the other card also needs that new modified version of the data to do subsequent work.
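To picture that dependency, here is a toy Python sketch. It is only a model, not how any driver actually works: the function name `count_sync_copies`, the version counters and the round-robin frame assignment are all made up for illustration. It counts how often a card would have to fetch the updated texture from the other card before it can start its frame.

```python
def count_sync_copies(num_gpus, frames_that_modify, num_frames):
    """Count cross-GPU texture copies needed under round-robin AFR (toy model).

    Each GPU keeps its own copy of one texture, tagged with a version number.
    A frame must start from the latest version, so a GPU whose copy is stale
    has to fetch the current one from another card first.
    """
    latest_version = 0
    local_version = [0] * num_gpus
    copies = 0
    for frame in range(num_frames):
        gpu = frame % num_gpus  # frames alternate between the cards
        if local_version[gpu] != latest_version:
            copies += 1  # stale copy: fetch the up-to-date texture
            local_version[gpu] = latest_version
        if frame in frames_that_modify:
            # This frame changes the texture, so every other card is now stale.
            latest_version += 1
            local_version[gpu] = latest_version
    return copies


static_texture = count_sync_copies(2, set(), 6)
modified_every_frame = count_sync_copies(2, set(range(6)), 6)
single_gpu = count_sync_copies(1, set(range(6)), 6)
```

In this model a texture that never changes costs no copies, a single card never needs any, and a texture modified every frame forces a copy before almost every frame on a two-card setup, which is the kind of traffic that hurts multi-GPU scaling.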

jamiehavok · 2 Aug 2014 at 00:15

Rroff said:
What he is saying doesn't really change the stacking v mirroring thing - each card will have some slight differences in what they have in memory depending on their current operations but they will still be mirroring the main data lump that they are originally working from.

This is one of the problems which can break multi GPU compatibility or performance i.e. when a texture is modified on one card to render the current frame and the other card also needs that new modified version of the data to do subsequent work.

He's not saying the main data lump is mirrored though, he's saying "many of which use very large chunks of VRAM, are most certainly not identical across both cards".

basically saying only a small amount is mirrored!

David Bisset · 2 Aug 2014 at 09:57

Not being identical doesn't mean they don't both have a copy of the data - as explained already, the data can be duplicated and then modified, so they both have the data but it's not mirrored (identical).

Games use AFR almost exclusively so you require very close to everything being duplicated.

VRAM doesn't stack, and that is not a myth: it would only stack if you could add the two quantities up, which is never true. That isn't the same as never getting any benefit from the extra memory; that benefit is overlapping, not stacking.

The guy is correct, however, that in non-gaming situations the gains can be very significant.
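As a rough way to put numbers on "overlapping, not stacking", here is a toy Python model. The function name `unique_data_capacity_gb` and the 4 GB figures are assumptions for illustration only; it simply says every card holds the mirrored data plus whatever private data fits in the space left over.

```python
def unique_data_capacity_gb(card_gb, num_cards, mirrored_gb):
    """Toy model: total distinct data held across all cards, in GB.

    mirrored_gb is the data every card must keep its own copy of (an assumed
    figure). Whatever space is left on each card can hold data unique to it.
    """
    if not 0 <= mirrored_gb <= card_gb:
        raise ValueError("mirrored data must fit on a single card")
    private_per_card = card_gb - mirrored_gb
    return mirrored_gb + num_cards * private_per_card


# Two hypothetical 4 GB cards.
fully_mirrored = unique_data_capacity_gb(4.0, 2, mirrored_gb=4.0)    # no gain
mostly_mirrored = unique_data_capacity_gb(4.0, 2, mirrored_gb=3.5)   # small gain
nothing_mirrored = unique_data_capacity_gb(4.0, 2, mirrored_gb=0.0)  # true stacking
```

With everything mirrored you get 4 GB of distinct data, with 3.5 GB mirrored you get 4.5 GB, and only with nothing mirrored would you reach the full 8 GB. Note that any single frame still has to fit on one card in every case.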

Edit: Man, I wrote this badly.

I'll try to use an imaginary example. We want to draw a coin (flat faced 'cause it's easier) spinning using AFR.
Each GPU will have a copy of the coin textures (say heads, tails, side) - these are identical (mirrored).
Each GPU will also calculate what will be shown based on the coin's angle for its frame. This will be different on each GPU as they're drawing different frames where the coin is at a different angle, so this information is not mirrored. However, this info still takes up space on each GPU, so an individual frame cannot use more memory than it could on a single-GPU solution.
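The coin example can be sketched in Python like this. All the sizes, the spin speed and the names (`simulate_afr`, `peak_mb`) are made up for illustration, and it assumes each GPU only holds one frame's working data at a time.

```python
COIN_TEXTURES_MB = {"heads": 64, "tails": 64, "side": 16}  # made-up sizes
FRAME_DATA_MB = 32       # made-up size of one frame's working data
DEGREES_PER_FRAME = 15   # made-up spin speed


def simulate_afr(num_gpus, num_frames):
    """Hand frames out round-robin and record what each GPU ends up holding."""
    gpus = [
        {"textures": dict(COIN_TEXTURES_MB), "angles": []}
        for _ in range(num_gpus)
    ]
    for frame in range(num_frames):
        angle = (frame * DEGREES_PER_FRAME) % 360
        gpus[frame % num_gpus]["angles"].append(angle)
    return gpus


def peak_mb(gpu):
    """Memory in use while rendering: all textures plus one frame's data."""
    return sum(gpu["textures"].values()) + FRAME_DATA_MB


two_cards = simulate_afr(num_gpus=2, num_frames=4)
one_card = simulate_afr(num_gpus=1, num_frames=4)
```

Here both cards hold the same textures (mirrored) but different coin angles (not mirrored), and `peak_mb` comes out the same for each card as for the single card, so the second card buys speed rather than room for bigger frames.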

Now imagine we were using split-frame rendering.
Initially it looks like no change. However, we could optimise by splitting our textures so they're now top_heads, bottom_heads, top_tails, bottom_tails etc. We still need to do the calculations as above, but for less of the coin, so this takes slightly less space too. Woohoo, lower memory use! We could have a larger texture for the coin to make it look better. However, if the coin is ever to be drawn spun in a non-totally-flat way then we'd need the full texture, so that optimisation probably won't have been done, as we'd not know which divisions of the texture file we need. Also, imagine the coin is reflecting light onto whatever surface it's spinning on, using ray tracing (OK, unlikely, but it's an easy example, so sue me!). The bottom-image GPU should be drawing this onto the surface, but how can it know what to draw given it doesn't know what the top part of the coin is doing? (Especially if we imagine a coin with features like the head being slightly raised etc. rather than a flat coin.) Suddenly we've got a very tough problem that results in poor performance, when we could have stuck with AFR and never had the issue.

In non-gaming workloads things can look very different, especially non-realtime ones, as you can divide the tasks very differently and build up your result.

Note: While I am a programmer I don't do graphics so have almost certainly made a bit of a mess of this description, please feel free to correct/enhance it if you are an expert in the area!

Silent_Scone · 2 Aug 2014 at 10:14

Developers, even in the case of Mantle, have already said AFR is preferable. In other words, we won't see it any time soon.

Mantle CAN use asynchronous CF but the performance clearly isn't there or else they would jump on it.

Resa69 · 2 Aug 2014 at 16:16

MjFrosty said:
Developers, even in the case of Mantle, have already said AFR is preferable. In other words, we won't see it any time soon.

Mantle CAN use asynchronous CF but the performance clearly isn't there or else they would jump on it.

And VRAM is not much of a problem at the moment, with future cards all having 4GB+.
