AMD Fiji HBM limited to 4GB stacked memory

Well, aside from the fact that the bigger the memory bus, the slower it will generally be due to power concerns (slower in terms of the maximum clocks you'll hit on the memory), the best 512-bit bus to date has produced what, 320GB/s? That isn't close, in any realistic way, to the 512GB/s that should be easy to achieve, let alone the 640GB/s being rumoured (likely incorrectly, IMHO).
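
Quick sanity check on those numbers, since peak bandwidth is just bus width (in bytes) times per-pin data rate. A rough sketch, all figures illustrative:

```python
# Peak memory bandwidth = (bus width in bytes) x (per-pin data rate).
# Illustrative sketch only; real cards vary.

def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    return bus_width_bits / 8 * data_rate_gbps

print(peak_bandwidth_gbs(512, 5.0))       # 320.0 -> Hawaii-class 512-bit GDDR5 at 5Gbps
print(peak_bandwidth_gbs(512, 8.0))       # 512.0 -> what 512GB/s would demand of GDDR5
print(4 * peak_bandwidth_gbs(1024, 1.0))  # 512.0 -> four HBM1 stacks, 1024-bit at 1Gbps each
```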

The trouble is that 512GB/s of bandwidth from HBM will use less power than the 320GB/s that 4GB Hawaii provides, and certainly less than trying to push the memory clocks up further. At the very best, GDDR5 is probably looking at 400GB/s with a significant increase in clocks, and even higher power usage to go with it.

That is where HBM wins: for any amount of bandwidth GDDR5 can provide, HBM can do it in about 30% of the power. 512GB/s has ALWAYS been achievable with GDDR5; it would just take a likely 768-bit bus, or a 512-bit bus with insane memory speeds, and it would probably use up 100-125W of the power on the card... leaving only 125-150W realistically for the GPU itself. HBM can provide that same bandwidth in 30-40W, which in the same situation would leave 210-220W for the GPU inside the same 250W power budget.
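
To put that budget maths in one place, a back-of-the-envelope sketch using the wattage estimates above (all assumed figures, not measurements):

```python
# Back-of-the-envelope: GPU power left over after the memory subsystem,
# using the rough estimates from this post (not measured figures).

TOTAL_BOARD_POWER_W = 250

memory_options = {
    "GDDR5 @ ~512GB/s": (100, 125),  # estimated W for a 768-bit or very fast 512-bit setup
    "HBM   @ ~512GB/s": (30, 40),    # estimated W for the same bandwidth on HBM
}

for name, (low_w, high_w) in memory_options.items():
    print(f"{name}: {TOTAL_BOARD_POWER_W - high_w}-{TOTAL_BOARD_POWER_W - low_w}W left for the GPU")
# GDDR5 @ ~512GB/s: 125-150W left for the GPU
# HBM   @ ~512GB/s: 210-220W left for the GPU
```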

GDDR5 is completely and utterly uncompetitive. If AMD or Nvidia produced both an HBM and a GDDR5 version of their latest-gen 250W cards, the HBM card would spank the GDDR5 card silly, because the power HBM saves could go towards 40-50% higher GPU clocks or a 40-50% larger shader/ROP/TMU count.

For your typical 40nm GDDR5, yes - but they are now moving to a new revision on 20nm that uses a lot less power and runs at higher frequencies. IIRC, albeit in lab conditions, they were hitting over 700GB/s in a 512-bit configuration with hand-picked (overclocked) modules - obviously you won't see that on retail GPUs.

Sure, this is a last gasp and the end of the road for GDDR5 - no one is saying anything different - but it's not quite as obsolete just yet as some people make out.

EDIT: The 320GB/s you're referring to is, I believe, on a significantly downclocked workstation card (as is typical of workstation cards, since they are generally configured for stability and/or have extra error-correction capabilities, etc.) and not really representative of what GDDR5 could potentially do in a gaming-card configuration.
 
If any of the current rumours regarding new cards from Nvidia and AMD have any shred of truth, then alongside DX12 I'm wondering if I should just keep my 290 for now.

DX12 will give the 290 a new lease of life in games that are coded for DX12.

The 380X, if it is a rebadged faster 290, does not appeal to me at all.

If AMD hold off the 390X till late in the year, then, having owned a 290 since release, I can happily wait a few more months AFTER the 390X and see what Nvidia come out with in Pascal (yes, I know that's 2016; I can wait).

So basically I think I'm pinning my hopes on the 390X or 395X2 being a superbeast, especially if DX12 gives the performance I think it might on my 290 in games designed for it; otherwise it's going to be Nvidia's newest tech. Tbh I'm at the point where I may just wait for the next die shrink to bother upgrading.

Lots of questions with no real answers, I guess, until we see a) AMD's new batch of card specs, b) DX12 in real-world use, and maybe even c) something new from Nvidia?
 
If any of the current rumours regarding new cards from Nvidia and AMD have any shred of truth, then alongside DX12 I'm wondering if I should just keep my 290 for now.

DX12 will give the 290 a new lease of life in games that are coded for DX12.

The 380X, if it is a rebadged faster 290, does not appeal to me at all.

If AMD hold off the 390X till late in the year, then, having owned a 290 since release, I can happily wait a few more months AFTER the 390X and see what Nvidia come out with in Pascal (yes, I know that's 2016; I can wait).

So basically I think I'm pinning my hopes on the 390X or 395X2 being a superbeast, especially if DX12 gives the performance I think it might on my 290 in games designed for it; otherwise it's going to be Nvidia's newest tech. Tbh I'm at the point where I may just wait for the next die shrink to bother upgrading.

Lots of questions with no real answers, I guess, until we see a) AMD's new batch of card specs, b) DX12 in real-world use, and maybe even c) something new from Nvidia?

All your questions have the same answer, really:

Wait until your games no longer run the way you would like them to, and then it will be upgrade time.
 
If any of the current rumours regarding new cards from Nvidia and AMD have any shred of truth, then alongside DX12 I'm wondering if I should just keep my 290 for now.

DX12 will give the 290 a new lease of life in games that are coded for DX12.

The 380X, if it is a rebadged faster 290, does not appeal to me at all.

If AMD hold off the 390X till late in the year, then, having owned a 290 since release, I can happily wait a few more months AFTER the 390X and see what Nvidia come out with in Pascal (yes, I know that's 2016; I can wait).

So basically I think I'm pinning my hopes on the 390X or 395X2 being a superbeast, especially if DX12 gives the performance I think it might on my 290 in games designed for it; otherwise it's going to be Nvidia's newest tech. Tbh I'm at the point where I may just wait for the next die shrink to bother upgrading.

Lots of questions with no real answers, I guess, until we see a) AMD's new batch of card specs, b) DX12 in real-world use, and maybe even c) something new from Nvidia?

I would say upgrading every product cycle, like-for-like tier-wise, is never really a satisfying experience, so unless it is a drastic improvement I'd wait it out.
 
Putting wild speculation aside for a sec, we need to see what actually shows up in retail and how the cards perform in legit reviews. We can't condemn a product that hasn't yet been released. By all means condemn the GTX 970 and R9 285 - they are released and are pants, etc. (for different reasons) - but at least give AMD a chance to release this new HBM card first before putting the boot in.
 
Putting wild speculation aside for a sec, we need to see what actually shows up in retail and how the cards perform in legit reviews. We can't condemn a product that hasn't yet been released. By all means condemn the GTX 970 and R9 285 - they are released and are pants, etc. (for different reasons) - but at least give AMD a chance to release this new HBM card first before putting the boot in.

Pretty much what I'm thinking: if, when it comes out, it's pants, then sure, give it some stick, but at least let's have a look before we judge.
 
For your typical 40nm GDDR5, yes - but they are now moving to a new revision on 20nm that uses a lot less power and runs at higher frequencies. IIRC, albeit in lab conditions, they were hitting over 700GB/s in a 512-bit configuration with hand-picked (overclocked) modules - obviously you won't see that on retail GPUs.

Sure, this is a last gasp and the end of the road for GDDR5 - no one is saying anything different - but it's not quite as obsolete just yet as some people make out.

Lab stuff really doesn't matter, not least because there is another mode GDDR5 can operate in whose name I completely forget. Basically, there is a third mode that no one uses because it increases power more than it increases bandwidth. You might need that in some NASA/military system where you must have 500GB/s at any power cost, but it's worthless inside a consumer product. Considering Samsung are releasing the first 20nm chips at 8Gbps (over 7Gbps previously), you aren't going to suddenly get a massive increase in efficiency... so it's most likely lab-made specific chips, accessed in a particular way and using higher-power modes that aren't relevant to GPU discussions.

However, the biggest power saving from HBM isn't the memory stacking, the clock speed, or the memory itself: it's moving the memory on-package. The biggest power saving comes from not communicating off-die. GDDR5, even at 20nm, will not come close to HBM's overall power usage because of that communication cost, nor will it suddenly become way more efficient. And quite obviously HBM will also move to smaller process nodes and make back everything it "lost" against GDDR5 in that sense.

Another point to make is that HBM (and HMC) access the data banks quite differently to GDDR5, and the interface is both lower latency and higher efficiency. 500GB/s isn't always 500GB/s. Depending on access block size and the like, it's not unlike HDD/SSD access: you might get 500MB/s reads at a large block size, but you won't get above 40MB/s for 4KB random reads at low queue depth. All memory is the same in that the headline figure is a maximum theoretical throughput; HBM and HMC have made a lot of advances and should be fundamentally more efficient per theoretical GB/s than GDDR5 ever was or could hope to be.
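
As a toy illustration of that point - a fixed per-access overhead swallows small transfers. Every number here is invented purely to show the shape of the curve:

```python
# Toy model: effective vs theoretical bandwidth when each access pays a
# fixed overhead (latency, command/row overhead). All numbers invented.

PEAK_BYTES_PER_NS = 512.0   # 512GB/s == 512 bytes per nanosecond
OVERHEAD_NS = 50.0          # assumed fixed cost per access

def effective_gbs(transfer_bytes):
    transfer_ns = transfer_bytes / PEAK_BYTES_PER_NS
    return transfer_bytes / (transfer_ns + OVERHEAD_NS)

for size in (256, 4096, 1 << 20):
    print(f"{size:>8}B accesses: ~{effective_gbs(size):5.1f}GB/s effective")
# Small accesses land nowhere near the 512GB/s headline figure.
```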

GDDR5 could be at 7nm tomorrow and it still wouldn't beat the overall memory-subsystem power usage of switching to 30nm (I think) HBM.

EDIT: 320GB/s is what Hawaii has; again, in general, the bigger the bus, the slower the memory, for power reasons. What can be achieved in a lab needs the right chip (I would presume a very simple test chip designed to maximise bandwidth, nothing like real-world usage). Yes, Hawaii could use 8GHz clocks, but you'd find the power would be insane. Again, the bulk of the power usage is in the signalling, not the chip. More chips on a wider bus with slower signals use less power, and the chips themselves will also be more power efficient at 5GHz than at 8GHz.
 
Something that people are forgetting is that HBM and interposer-mounted memory have reduced latency, which improves performance by reducing idle ticks on the xPU the memory is connected to.

As well as the bandwidth increase, which raises the amount of data moved per transfer, and the latency decrease, which cuts the time each transfer takes, HBM can also perform parallel reads and writes to each stack, further reducing effective latency per chip.

HBM is more than just a bandwidth increase, and will give a larger performance improvement than a pure bandwidth increase alone would.
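
To make the idle-ticks point concrete, here's a deliberately crude model - every figure is invented, it just shows how shaving access latency shrinks total stall time:

```python
# Crude model: frame time = compute time + (memory accesses x access latency).
# All numbers invented purely to illustrate the latency effect.

def frame_time_ms(compute_ms, accesses, latency_ns):
    return compute_ms + accesses * latency_ns / 1_000_000

ACCESSES = 50_000
print(frame_time_ms(10.0, ACCESSES, 120))  # 16.0ms with a higher-latency GDDR5-ish path
print(frame_time_ms(10.0, ACCESSES, 80))   # 14.0ms with a lower-latency HBM-ish path
```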
 
Putting wild speculation aside for a sec, we need to see what actually shows up in retail and how the cards perform in legit reviews. We can't condemn a product that hasn't yet been released. By all means condemn the GTX 970 and R9 285 - they are released and are pants, etc. (for different reasons) - but at least give AMD a chance to release this new HBM card first before putting the boot in.

I suspect there could be problems, as it is new tech, and it may take a while to iron them out.
 
Guess what I was so poorly trying to say is: I'm hoping DX12 gives ageing cards enough of a new lease of life to tide us over till the newer tech :)

But Kaaps is right: once it stops performing in your chosen games, it's upgrade time.
 
I don't understand that 3D picture. How can that ever work properly, unless memory modules make good heatsinks nowadays?

Fud simply doesn't know what he's talking about, at all. The 3D-stacked picture is more than possible in the sense that people can make chips like that, but they will only do it for watches/wearables and ultra-compact/expensive phones for the next several years at least. It's not remotely viable for high-powered chips, and Nvidia's pictures for Volta (cancelled) and their mock-up of Pascal both show HBM chip stacks with the GPU separate, all on the same package. Pascal WILL use the same method as AMD, like the HBM picture on the right. He's getting confused by stacked RAM, because he's an idiot.

HBM is stacked memory: it has four DRAM dies stacked on top of a logic chip. So when Nvidia say they are using stacked memory, it's because they are - HBM is fundamentally 3D-stacked memory. He's just reading Nvidia saying that (though they've also VERY clearly stated they are using HBM), taking it in isolation, and ignoring the fact that HBM and HMC (the Intel/Micron semi-alternative) are both based on stacked memory.


Something that people are forgetting is that HBM and interposer-mounted memory have reduced latency, which improves performance by reducing idle ticks on the xPU the memory is connected to.

As well as the bandwidth increase, which raises the amount of data moved per transfer, and the latency decrease, which cuts the time each transfer takes, HBM can also perform parallel reads and writes to each stack, further reducing effective latency per chip.

HBM is more than just a bandwidth increase, and will give a larger performance improvement than a pure bandwidth increase alone would.

Yup, you can't always compare one generation of GPUs to another, because internally their memory controllers and architecture can use the available bandwidth, even from the same type of memory, with different levels of efficiency (Tonga/Maxwell being a good example of that). But you also can't compare 512GB/s of GDDR5 to 512GB/s of HBM, because the latter will (used correctly) offer higher efficiency along with lower latency.
 
Really naive question here. If your VRAM has twice the bandwidth, is it functionally similar to having twice the VRAM? Or is the amount of VRAM still going to be a limiting factor no matter how high the bandwidth is?

Imagine VRAM is water in a tank; the bandwidth is the size of the pipe used to drain water out of the tank and fill it back up.

The bigger the pipe, the quicker it can be filled and emptied.
 
Imagine VRAM is water in a tank; the bandwidth is the size of the pipe used to drain water out of the tank and fill it back up.

The bigger the pipe, the quicker it can be filled and emptied.

The problem comes when you have more water than you have space for in the tank. Then it does not matter how quickly you can fill or empty the tank, as it is still going to overflow.
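
In code, the tank analogy comes out as two independent limits (all sizes picked arbitrarily for illustration):

```python
# Tank analogy: capacity and bandwidth are separate limits.
# All sizes are arbitrary examples.

VRAM_GB = 4.0          # tank size (capacity)
BANDWIDTH_GBS = 512.0  # pipe size (how fast data moves in/out)

working_set_gb = 6.0   # how much the game wants resident

fill_time_ms = min(working_set_gb, VRAM_GB) / BANDWIDTH_GBS * 1000
print(f"Filling the tank takes ~{fill_time_ms:.1f}ms")

overflow_gb = max(0.0, working_set_gb - VRAM_GB)
if overflow_gb:
    # The overflow has to live in system RAM and cross the (much slower)
    # PCIe bus on demand, which no amount of VRAM bandwidth can hide.
    print(f"{overflow_gb:.1f}GB doesn't fit and spills over the bus")
```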
 
EDIT: 320GB/s is what Hawaii has; again, in general, the bigger the bus, the slower the memory, for power reasons. What can be achieved in a lab needs the right chip (I would presume a very simple test chip designed to maximise bandwidth, nothing like real-world usage). Yes, Hawaii could use 8GHz clocks, but you'd find the power would be insane. Again, the bulk of the power usage is in the signalling, not the chip. More chips on a wider bus with slower signals use less power, and the chips themselves will also be more power efficient at 5GHz than at 8GHz.

Keep in mind I'm not saying that GDDR5 is better than, the same as, or has any future compared to the HBM generations, but it's going to be at least one more generation till the strengths of HBM are really needed, or mature enough to really make the difference.

I think you're stuck a bit on the older 60-40nm GDDR5 chips; the new stuff is significantly better - sure, it's still a last hurrah, but it can do 8GHz as standard, with 8GB, in the same kind of situation where you're looking at 5.5GHz now, and potentially quite a bit more depending on how much you're prepared to pay for binning in a gaming GPU context. Unfortunately the power savings aren't as dramatic, but the feasible speed increases are just shy of 50%.
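
(Quick check on that last figure, assuming a 5.5GHz to 8GHz jump:)

```python
# 5.5Gbps -> 8Gbps effective memory speed:
print(f"{(8.0 / 5.5 - 1) * 100:.0f}% faster")  # ~45%, i.e. just shy of 50%
```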
 