Good read there drunken! I sometimes like to read your massive paragraphs lol. But I do sometimes wonder how accurate the things you say are. Not that what you say isn't credible, but is some of it more your own thoughts than actual findings, if you know what I mean? Will HBM truly be a big saving on power? Will it offer much greater performance or just greater bandwidth? I know more bandwidth is better for bigger resolutions, but what about 1080p or possibly 1440p? Will it offer a great performance increase at these resolutions?
Also, in terms of DDR4 performance in the system, doesn't DDR4 use higher timings than DDR3 to get higher bandwidth? I always thought bigger timings were bad for gaming?
As games get more demanding, more bandwidth is required regardless of resolution. It's an ever-increasing thing. It's not like 128GB/s is fine for 1080p and always will be; it's fine in a game that doesn't really require more to achieve a certain frame rate, but a game a year later might want 200GB/s and another game a year after that might want 150GB/s at 1080p. It also depends on frame rate: pushing 120fps vs 60fps wants far higher bandwidth.
Some people are happy with 60fps, which is entirely fine, but for those who want the lowest persistence, lower motion blur and the smoothest performance, pushing 90fps+ is where LCD tech starts to look REALLY good.
More bandwidth is required for more performance, full stop. How much performance you need, and at what resolution, is and always has been personal choice. You might game on a 290X and only get 50-60fps in, say, Watch Dogs, with stuttering and generally poor performance. Getting 90 or 120fps in that game requires a bigger GPU, and it will also require more bandwidth to achieve that frame rate.
I'd take a 90+ frame rate and a 120+Hz screen over a higher-res, lower refresh rate, lower frame rate combination every time.
Increasing resolution increases bandwidth requirements, but so does increasing frame rate or turning up graphical options. Bandwidth requirements have never been static and never will be. Going from 1080p to 4K certainly jumps the bandwidth requirement more than going from 60 to 120fps at 1080p, but the latter still requires more bandwidth. Better performance will always require more bandwidth.
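To put a rough number on the resolution vs frame rate comparison, here's a quick sketch. It uses pixels drawn per second as a stand-in for bandwidth demand, which is my own simplifying assumption: real games don't scale this cleanly, so treat the ratios as ballpark only. The function and variable names are made up for the example.

```python
# Rough proxy: pixels drawn per second = width * height * fps.
# ASSUMPTION: bandwidth demand scales roughly in proportion to this figure.
# Real engines, settings and caches make the true scaling messier.

def pixels_per_second(width, height, fps):
    """Number of pixels rendered per second at a given mode."""
    return width * height * fps

# Baseline mode that everything else is compared against.
baseline = pixels_per_second(1920, 1080, 60)

scenarios = {
    "1080p @ 60fps": (1920, 1080, 60),
    "1080p @ 120fps": (1920, 1080, 120),
    "4K @ 60fps": (3840, 2160, 60),
}

# Pixel rate of each mode relative to 1080p at 60fps.
relative_load = {
    name: pixels_per_second(*mode) / baseline
    for name, mode in scenarios.items()
}

for name, ratio in relative_load.items():
    print(f"{name}: {ratio:.1f}x the 1080p/60 pixel rate")
```

On this proxy, doubling the frame rate doubles the pixel rate while going to 4K quadruples it, which is the same ordering as above: 4K is the bigger jump, but 120fps still costs more than 60fps.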
HBM and HMC also reduce latency and absolutely, not just as a guess, reduce power significantly. Even Intel's own papers on HMC talk about how HMC reduces power per GB/s significantly but HBM is better still on power usage than HMC... when even Intel will say HBM saves more power, that pretty much confirms it's true.
It's relatively basic electronic behaviour: thicker wires and longer distances require more power to push a signal. PCB-level traces, and the connections required to go from silicon to PCB scale, are significantly bigger than when it's all done at the silicon scale, which is what HBM does. You're talking 40-80nm traces with lengths of maybe 1-5cm vs traces that are massively larger at the PCB level and are pretty much up to 50cm long (there are multiple layers in a PCB and traces travel up and down several layers to get where they are going, so the straight-line distance isn't indicative of their actual length).
What would take around 80W for 4GB of GDDR5 to provide 512GB/s of bandwidth, HBM does in 30W. That's partially the power saving on the communication and partially the memory itself. Rather than using very high speed memory with a small bus, you use low speed memory with a very wide bus: 1GHz clock speeds at lower voltage instead of 2.5+GHz memory at higher voltage. As with most memory, lower clock speeds mean tighter timings, so you end up with lower-latency, significantly lower-power memory chips, all connected to a much wider bus to provide more bandwidth.
To cap it off, HBM is TINY compared to normal memory because it's stacked. I forget the exact comparison, but a 1GB stack is significantly smaller than a single 256MB memory chip, and it's all on the single package connected to the PCB. So we also get significantly neater and smaller PCBs, easier-to-cool chips (due to vastly less complex surface-mounted crap on the PCB) and so better-cooled VRMs, and more standardisation in designs, meaning it's more likely that a single waterblock or air cooler will fit multiple generations of cards. If you look up the pictures Nvidia showed of a mocked-up Pascal, the tiny PCB size and the extremely neat look to it are down to exactly this.
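For anyone who wants the wide-and-slow vs narrow-and-fast trade-off as arithmetic, here's a back-of-envelope sketch. The 512GB/s, 80W and 30W figures are the ones quoted above. The per-pin data rates (5 Gbit/s for GDDR5, 1 Gbit/s for HBM) are illustrative assumptions on my part, as are all the function names; peak bandwidth is taken as bus width times per-pin rate divided by 8.

```python
# Peak bandwidth (GB/s) = bus width (bits) * per-pin data rate (Gbit/s) / 8.
# Rearranged here to ask: how wide must the bus be to hit a target bandwidth?

def required_bus_width_bits(target_gb_per_s, gbit_per_pin):
    """Bus width in bits needed to reach the target peak bandwidth."""
    return target_gb_per_s * 8 / gbit_per_pin

def gb_per_s_per_watt(bandwidth_gb_per_s, watts):
    """Bandwidth delivered per watt of memory power."""
    return bandwidth_gb_per_s / watts

# Figures quoted in the post.
TARGET_GB_PER_S = 512
GDDR5_WATTS = 80
HBM_WATTS = 30

# ASSUMPTION: illustrative per-pin data rates, not taken from the post.
GDDR5_GBIT_PER_PIN = 5.0
HBM_GBIT_PER_PIN = 1.0

gddr5_width = required_bus_width_bits(TARGET_GB_PER_S, GDDR5_GBIT_PER_PIN)
hbm_width = required_bus_width_bits(TARGET_GB_PER_S, HBM_GBIT_PER_PIN)

gddr5_eff = gb_per_s_per_watt(TARGET_GB_PER_S, GDDR5_WATTS)
hbm_eff = gb_per_s_per_watt(TARGET_GB_PER_S, HBM_WATTS)

print(f"GDDR5: ~{gddr5_width:.0f}-bit bus, {gddr5_eff:.1f} GB/s per watt")
print(f"HBM:   ~{hbm_width:.0f}-bit bus, {hbm_eff:.1f} GB/s per watt")
print(f"HBM advantage: {hbm_eff / gddr5_eff:.2f}x bandwidth per watt")
```

Under those assumptions the slow memory needs a bus around five times wider to hit the same bandwidth, which is only practical when the connections are made at silicon scale, and the quoted power figures work out to roughly 2.7 times more bandwidth per watt.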
There are nothing but upsides to HBM, and HMC has many of the same ones, though not quite as many.