Keep in mind I'm not saying that GDDR5 is better than, the same as, or has any future compared to HBM generations, but it's going to be at least one more generation until the strengths of HBM are really needed, or mature enough to really make the difference.
I think you're stuck a bit on the older 60-40nm GDDR5 chips; the new stuff is significantly better. Sure, it's still a last hurrah, but it can do 8GHz as standard with 8GB, in the same kind of situation where you're looking at 5.5GHz now, and potentially quite a bit more depending on how much you're prepared to pay for binning in a gaming GPU context. Unfortunately the power savings aren't as dramatic, but the feasible speed increases are just shy of 50%.
Create an architecture designed around 60% more bandwidth and it would suck with less. Not needing that bandwidth on today's products doesn't mean you won't need it on tomorrow's.
I'll also point out that Hawaii chose a bigger bus and lower-clocked chips because that also cuts down on signalling power usage (at lower clock speeds). They decided 512-bit + 5GHz was more power efficient than 384-bit + 7GHz. It's not that the chips couldn't do 7GHz; they used lower clocks on purpose to reduce total memory system power usage, and the chips themselves are the smaller part of the equation.
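A rough back-of-envelope sketch of that trade-off (treating the "GHz" figures as effective data rates in Gbps per pin, the way they're usually marketed):

```python
def gddr5_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: pins * Gbps per pin / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8

# Hawaii's choice: wide bus, lower clocks
hawaii = gddr5_bandwidth_gbs(512, 5)       # 320.0 GB/s
# The narrow-and-fast alternative it passed on
alternative = gddr5_bandwidth_gbs(384, 7)  # 336.0 GB/s

# Comparable bandwidth either way, but the lower clocks cut signalling power
print(hawaii, alternative)
```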
Then we get to the speed increase: chips already did 7GHz "as standard", and this is moving to 8GHz on a new process node. At just over 14%, it's certainly not a 50% increase, and it fails to take into account that the rest of the memory system's power usage will also rise at 8GHz vs 7GHz. Whether the power drop in the 20nm chips even outweighs the increase from the memory controller/signalling is questionable, particularly with everyone's 20nm chips basically sucking balls power-wise, with no massive gains vs previous nodes.
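The two percentages being argued over come from picking different baselines:

```python
# 7GHz -> 8GHz effective data rate: the step GDDR5 is actually taking
generational = (8 - 7) / 7 * 100        # ~14.3%, "just over 14%"

# 5.5GHz -> 8GHz: comparing against today's common clocks instead,
# which is what makes it look like a near-50% jump
vs_current = (8 - 5.5) / 5.5 * 100      # ~45.5%, "just shy of 50%"

print(round(generational, 1), round(vs_current, 1))
```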
Think about the other downsides of GDDR5 even at 20nm. A 32-bit connection per chip means 16 chips are needed to get a 512-bit bus, and with a 20nm minimum density of 1GB, that's 16GB of low-yield memory on the most expensive process yet, plus a huge memory controller to connect it all. HBM can achieve the same bandwidth in only 4 stacks: 4 connections to the GPU, with those connections using a fraction of the power. Yields on 1GHz HBM chips at 1.2V for lower power, then scale it: 2GHz for 256GB/s per stack is only a year or two away, because scaling to 2GHz is trivial for these chips, while scaling GDDR5 to 8GHz is not trivial (power/yield/price wise). It's 16 sets of traces on a PCB, with huge power wasted on signalling, vs 4 on-package connections that save a huge amount of power.
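To put numbers on the "same bandwidth in 4 stacks" claim, here's a sketch assuming the 1024-bit-per-stack interface of first-generation HBM:

```python
def bandwidth_gbs(interface_bits, data_rate_gbps):
    """Peak bandwidth in GB/s for a memory interface."""
    return interface_bits * data_rate_gbps / 8

# GDDR5 at 20nm: 16 chips x 32-bit = 512-bit bus at 8Gbps effective
gddr5 = bandwidth_gbs(16 * 32, 8)         # 512.0 GB/s over 16 sets of PCB traces

# First-gen HBM: 4 stacks, each with a 1024-bit interface, at 1Gbps
hbm_gen1 = bandwidth_gbs(4 * 1024, 1)     # 512.0 GB/s over 4 on-package links

# Double the HBM clock to 2Gbps: 256GB/s per stack, 1TB/s from 4 stacks
hbm_per_stack_2gbps = bandwidth_gbs(1024, 2)  # 256.0 GB/s
```

The point being that HBM gets its bandwidth from interface width at low clocks, so doubling from 1Gbps to 2Gbps is the easy direction to scale.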
HBM also has some early downsides: 1GB per stack is potentially a limitation at the other end of the scale, because they MIGHT be limited to 4 stacks (or they might just go with 6 or 8 stacks and even more bandwidth). However, HBM will scale brilliantly in a short space of time. Two years from now it should be pretty easy to put 16GB of memory providing 1TB/s of bandwidth in only 4 stacks. Not long after that we'll be looking at 32GB in only 4 stacks at the same bandwidth, or using 2-4 more stacks. This will all come with vastly simplified PCBs, making power components cheaper and layout cheap, and the process of making custom cooling will be quicker since the PCBs will be so simple. There are lots of other areas HBM will improve as a by-product of a very simple PCB.
There are many more negatives to GDDR5 that you aren't considering; the speed increase you suggested simply isn't close to accurate, and you've also ignored the power increase from the other parts running at higher speeds.
HBM's biggest advantage is not the stacking directly, nor the clock speeds, nor the lower power per GB that the memory itself uses. It's the connection method: the main power saving comes from being on-package, and GDDR5 will never claw that back. If the memory were magically made on a 0.1nm process and used 0.01W per chip, it would still use more power to run GDDR5 than current HBM. HBM clock speeds are so low they'll double with the next process drop, not go up 14%, and they'll do so at roughly the same power.