
Poll: ** The AMD VEGA Thread **

On or off the hype train?

  • (off) Train has derailed

    Votes: 207 39.2%
  • (on) Overcrowding, standing room only

    Votes: 100 18.9%
  • (never ever got on) Chinese escalator

    Votes: 221 41.9%

  • Total voters
    528
Removal of what limitation?


No one is going to sell a GPU that has a massive total die area, even if it is composed of many smaller dies.




There are also some fundamental issues in scaling something that large. The Nvidia Volta GV100 is limited in two ways: the die can't physically be made any bigger, and the interposer is already at its maximum size. In fact, the HBM2 memory stacks supposedly overhang the interposer because the interposer can't be made any larger.


And then there are some unfortunate practicalities. For example, let's say you put two dies on one interposer with shared HBM memory. If both GPUs need the same resource, such as a texture, to render part of the scene, then the data will be duplicated to each GPU, and the bandwidth is shared between GPUs, so each GPU is effectively getting half the bandwidth it would have had operating alone. This is why CrossFire does work: because the memory is duplicated, the effective bandwidth is duplicated too. If you don't scale the memory, you don't scale performance.
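To put rough numbers on the bandwidth point above, here is a minimal sketch using assumed, illustrative figures (not real Vega specifications):

```cpp
// Assumed illustrative numbers: one shared HBM2 pool vs duplicated per-card VRAM.
#include <cstdio>

int main() {
    const double poolBandwidthGBs = 480.0; // assumed total bandwidth of one shared HBM2 pool
    const int dies = 2;

    // Two dies on one interposer sharing a single HBM pool: both pull the same
    // data through the same stacks, so each die sees roughly half the bandwidth.
    const double sharedPerDie = poolBandwidthGBs / dies;

    // CrossFire-style setup: each card carries its own copy of the data in its own
    // memory, so per-GPU bandwidth is not divided (at the cost of duplicating VRAM).
    const double duplicatedPerGpu = poolBandwidthGBs;

    std::printf("Shared pool, per die:      %.0f GB/s\n", sharedPerDie);
    std::printf("Duplicated VRAM, per GPU:  %.0f GB/s\n", duplicatedPerGpu);
    return 0;
}
```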

That's the limitation they're removing: the size limit of a single monolithic die. They could potentially build multiple smaller chips that together equate to something much larger than any single die possible today. This would bring cost reductions and performance benefits.

I understand it won't be a linear performance increase even if they get it working well, but in theory it could reduce or eliminate the die size/cost/failure-rate barrier the industry has been up against for the last five years or so.
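The die-size/cost/failure-rate point can be illustrated with the standard Poisson yield model; a minimal sketch, assuming an illustrative defect density rather than any foundry's real figure:

```cpp
// Simple Poisson yield model: yield = exp(-defect_density * die_area).
// Both numbers below are assumed, illustrative values.
#include <cmath>
#include <cstdio>

int main() {
    const double defectsPerMm2 = 0.002;  // assumed defect density (~0.2 per cm^2)
    const double monolithicMm2 = 800.0;  // one large die near the reticle limit
    const double chipletMm2 = 200.0;     // one of four smaller dies covering the same area

    const double monolithicYield = std::exp(-defectsPerMm2 * monolithicMm2);
    const double chipletYield = std::exp(-defectsPerMm2 * chipletMm2);

    std::printf("%.0f mm^2 monolithic die yield: %.1f%%\n", monolithicMm2, monolithicYield * 100.0);
    std::printf("%.0f mm^2 chiplet die yield:    %.1f%%\n", chipletMm2, chipletYield * 100.0);
    // A defect now scraps 200 mm^2 of silicon instead of 800 mm^2, and partially
    // broken small dies can still be salvaged for cut-down parts.
    return 0;
}
```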
 
Does anyone know if AMD have explained why there is such a chronic shortage, other than mining? It's not as if the cards are coming into stock and then selling out, from what I can see. Could a member of OcUK staff shed some light on the situation? Have they been on order from the supplier with nothing turning up, or is OcUK just not ordering because the wholesale price is through the roof?

I am trying to gauge whether I have to be quick getting a Vega 64 because the price will go through the roof after a short time, or whether I have time to wait for a good set of reviews, AIB cards and matching water blocks.
 
How likely do you think it is that retailers will try it on with way-over-retail prices, like twice retail as happened with the 580? I want an upgrade but I don't want to pay some over-inflated price. OcUK, please don't do it to us lowly gamers; work something out so we can pay retail, like volume limits per household. Gamers will be the ones still here after this craze has gone.
 
How likely do you think it is that retailers will try it on with way-over-retail prices, like twice retail as happened with the 580? I want an upgrade but I don't want to pay some over-inflated price. OcUK, please don't do it to us lowly gamers; work something out so we can pay retail, like volume limits per household. Gamers will be the ones still here after this craze has gone.

Very likely. They aren't your friend, and regardless of all the marketing about doing this and that for gamers, it's just marketing. Prime gouging season is about to open, send in the suckers!
 
And then there are some unfortunate practicalities. For example, let's say you put two dies on one interposer with shared HBM memory. If both GPUs need the same resource, such as a texture, to render part of the scene, then the data will be duplicated to each GPU, and the bandwidth is shared between GPUs, so each GPU is effectively getting half the bandwidth it would have had operating alone. This is why CrossFire does work: because the memory is duplicated, the effective bandwidth is duplicated too. If you don't scale the memory, you don't scale performance.
If my memory is correct, that is exactly what AMD has been talking about in the past: getting to the point where multiple dies can share memory without all of them having to load each other's frames, or parts of frames, into every GPU's memory; getting rid of the PLX chip on dual-GPU cards and having the dies talk directly to each other; and having multiple dies appear as one to the operating system. Part of the push towards multi-die chips is coming from Vulkan and DX12; DX11 and older were really limited on that front. Like you said, GPUs are like multi-core chips already, so is it that big a deal to add another chip to the card if they can communicate with enough bandwidth, especially if they are presented as one? I would think not.

AMD is bad at games because they have 4 shader engines per GPU, so every frame is rendered in 4 parts. As not every part has as much rendering in it, the whole GPU doesn't operate at maximum capability when playing games. Having multiple dies, each with its own 4 shader engines, would help them break the frame into smaller pieces and utilise more of their shader power. The talk has been that Navi will be the first real multi-chip product. There seems to be a mention in the Linux drivers of a PLX board with a Vega X2, so I don't think Vega is a real multi-GPU chip yet. The Zen architecture has already shown multi-die to be the path forward: the yields are out of this world and you can use pretty much all of the dies, even partially broken ones.
 
If my memory is correct, that is exactly what AMD has been talking about in the past: getting to the point where multiple dies can share memory without all of them having to load each other's frames, or parts of frames, into every GPU's memory; getting rid of the PLX chip on dual-GPU cards and having the dies talk directly to each other; and having multiple dies appear as one to the operating system. Part of the push towards multi-die chips is coming from Vulkan and DX12; DX11 and older were really limited on that front. Like you said, GPUs are like multi-core chips already, so is it that big a deal to add another chip to the card if they can communicate with enough bandwidth, especially if they are presented as one? I would think not.

AMD is bad at games because they have 4 shader engines per GPU, so every frame is rendered in 4 parts. As not every part has as much rendering in it, the whole GPU doesn't operate at maximum capability when playing games. Having multiple dies, each with its own 4 shader engines, would help them break the frame into smaller pieces and utilise more of their shader power. The talk has been that Navi will be the first real multi-chip product. There seems to be a mention in the Linux drivers of a PLX board with a Vega X2, so I don't think Vega is a real multi-GPU chip yet. The Zen architecture has already shown multi-die to be the path forward: the yields are out of this world and you can use pretty much all of the dies, even partially broken ones.

You are conflating a hardware limitation with a software solution that doesn't do what you think it does.
To get full bandwidth to all the smaller dies, the memory controller would have to be off-chip and accessible by all of the smaller dies, which would introduce latency issues.
Zen uses cache per die, and CPUs need a lot less dedicated memory than a GPU does, so what works for a CPU won't be directly applicable to a GPU design.
Textures are the single biggest use of VRAM, so breaking a frame into smaller sections to render doesn't offer massive savings, because for best performance all of the GPUs still need access to all of the textures on a frame-to-frame basis.

All of this needs to be solved on the board itself; the API being used to access it is largely irrelevant. DX12/Vulkan is not a solution to any of this.
 
Large-scale miners are bulk buying cards from the manufacturers and distributors before they even get to retail.

Not according to Gibbo -

Thank you. I can also tell you the professional miners we supply never return product: they mine on it until the card is either superseded, at which point they flog it, or it fails, at which point they bin it, as the cards are generally returning them a profit within weeks. Professional miners know how to make quick money; don't ask me how, go and find out for yourselves as it's all out there on the net.

These guys are so cheeky they put money into our bank account; one has 100k sitting on our account. We ship them cards that gamers generally do not buy but the miner is very happy to get. That is how crazy these guys are: they essentially want to give us money, and as and when we have stock that we feel gamers are not interested in, or that our system integration departments cannot make use of, we check with the miner whether it's acceptable, they approve, and we ship.

Some miners try it on and try to steal the good stuff, so we used voucher codes in the short term to stop this happening.

Moving forward we're just hoping supply will improve enough to keep everyone happy. This week we have 2000+ RX 580s landing, which should be enough to drop resale prices back to normal levels and let gamers and miners buy them at will; that should last a week or two. We've also got some fancy new super-fast card arriving as well. My guess is you all know what this is, but of course I cannot say, but there will be plenty to go round. :D
 
You are conflating a hardware limitation with a software solution that doesn't do what you think it does.
To get full bandwidth to all the smaller dies, the memory controller would have to be off-chip and accessible by all of the smaller dies, which would introduce latency issues.
Zen uses cache per die, and CPUs need a lot less dedicated memory than a GPU does, so what works for a CPU won't be directly applicable to a GPU design.
Textures are the single biggest use of VRAM, so breaking a frame into smaller sections to render doesn't offer massive savings, because for best performance all of the GPUs still need access to all of the textures on a frame-to-frame basis.
The whole point is to provide unified memory for all the GPUs, where they don't have to load the whole screen. Multiple smaller GPUs won't need uber-high bandwidth then. Breaking the screen into multiple parts helps them get maximum utilisation out of all the shader cores. I don't know why AMD cards don't have more than 4 shader engines; maybe it's the architecture, but another reason is that it wouldn't benefit compute workloads. It seems you edited your post: yes, the memory problem needs to be addressed by the board, but DX12 and Vulkan take the use of multiple GPUs to another level.
Since the launch of SLI, a long time ago, utilization of multiple GPUs was handled automatically by the display driver. The application always saw one graphics device object no matter how many physical GPUs were behind it. With DirectX 12, this is not the case anymore. But why start doing something manually that has been working automatically? Because, actually, for a good while before DirectX 12 arrived, the utilization of multiple GPUs has not been that automatic anymore.

As rendering engines have grown more sophisticated, the distribution of rendering workload automatically to multiple GPUs has become problematic. Namely, temporal techniques that create data dependencies between consecutive frames make it challenging to execute alternate frame rendering (AFR), which still is the method of choice for distribution of work to multiple GPUs. In practice, the display driver needs hints from the application to understand which resources it must copy from one GPU to another and which it should not. Data transfer bandwidth between GPUs is very limited and copying too much stuff can make the transfers the bottleneck in the rendering process. Giving hints to the driver can be implemented with NVAPI or by making additional Clear() or Discard() calls for selected resources.

Consequently, even when you didn’t have explicit control over multiple GPUs, you had to understand what happened implicitly and give the driver hints for doing it efficiently in order to get the desired performance out of multi-GPU setups. Now with DirectX 12, you can take full and explicit control of what is happening. And you are no longer limited to AFR. You are free to invent new ways of making use of multiple GPUs that better suit your application.
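As a rough illustration of the explicit control the quoted piece describes, here is a minimal sketch (assuming the Windows SDK and d3d12.lib; error handling trimmed) of how a DirectX 12 application can see the physical GPUs behind one linked-adapter device as nodes and target each with its own command queue:

```cpp
#include <windows.h>
#include <d3d12.h>
#include <cstdio>

int main() {
    ID3D12Device* device = nullptr;
    // nullptr = default adapter; 11_0 is the minimum feature level D3D12 accepts.
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device)))) {
        std::printf("No D3D12 device available\n");
        return 1;
    }

    // One device can span several physical GPUs ("nodes") when they are linked.
    UINT nodeCount = device->GetNodeCount();
    std::printf("GPU nodes behind this device: %u\n", nodeCount);

    // The application, not the driver, decides where work runs: bit N of the
    // NodeMask selects node N for this queue (and likewise for resources/heaps).
    for (UINT node = 0; node < nodeCount; ++node) {
        D3D12_COMMAND_QUEUE_DESC desc = {};
        desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
        desc.NodeMask = 1u << node;
        ID3D12CommandQueue* queue = nullptr;
        if (SUCCEEDED(device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queue)))) {
            std::printf("Created a direct queue on node %u\n", node);
            queue->Release();
        }
    }
    device->Release();
    return 0;
}
```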
 
The whole point is to provide unified memory for all the GPUs, where they don't have to load the whole screen. Multiple smaller GPUs won't need uber-high bandwidth then. Breaking the screen into multiple parts helps them get maximum utilisation out of all the shader cores. I don't know why AMD cards don't have more than 4 shader engines; maybe it's the architecture, but another reason is that it wouldn't benefit compute workloads.

The only way I can respond to this is to repeat my previous post: you aren't understanding the fundamental issue. Loading textures on the fly constantly causes texture pop-in and/or hitching; breaking the scene into smaller chunks doesn't solve this on a frame-to-frame basis.

Shaders running as fast as they like won't present a clean image to the end user if the textures aren't there.
 
The only way I can respond to this is to repeat my previous post: you aren't understanding the fundamental issue. Loading textures on the fly constantly causes texture pop-in and/or hitching; breaking the scene into smaller chunks doesn't solve this on a frame-to-frame basis.

Shaders running as fast as they like won't present a clean image to the end user if the textures aren't there.
I don't really understand what you are trying to say, when I have already written twice that the point of the coming real multi-GPU chips is for them not to have to load the frame for both GPUs. The whole point is to get them to use the same memory, load the frame only once and have both GPUs share it. If they can share the memory, we don't need double the memory, nor the bandwidth.
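A minimal sketch of the capacity trade-off being argued over here, with an assumed working-set size:

```cpp
// Assumed illustrative numbers: per-frame working set shared vs duplicated.
#include <cstdio>

int main() {
    const double workingSetGB = 6.0;  // assumed textures/buffers needed for one frame
    const int gpus = 2;

    // Today's CrossFire/AFR model: every GPU keeps its own full copy of the data.
    const double duplicatedTotalGB = workingSetGB * gpus;

    // The unified-memory idea: both dies address a single copy in one shared pool.
    const double sharedTotalGB = workingSetGB;

    std::printf("Duplicated VRAM across %d GPUs: %.1f GB\n", gpus, duplicatedTotalGB);
    std::printf("Single shared pool:             %.1f GB\n", sharedTotalGB);
    // The capacity saving is real, but both dies then contend for the bandwidth
    // of that one pool, which is the objection raised earlier in the thread.
    return 0;
}
```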
 