That really isn't what I was saying. I'd suspect it will be similar to how it is now: memory is split into multiple channels and each channel connects to a memory chip. If you cut the bus you cut the memory, because you can't just add chips independently. To go from 4GB to 8GB now you just use double density memory chips, and the same will be true of HBM: use a 2GB stack instead of a 1GB stack. I would presume the GPU needs to be designed for a specific number of stacks in the same way a 290X needs a specific number of memory chips. Capacity isn't hugely relevant, just how they talk to each other. Software wise it doesn't really matter which part of the chip you access, just where you send that message.
It shouldn't make any difference in that sense to current cards.
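To put rough numbers on that, here's a quick Python sketch of the arithmetic. It assumes the bus width is fixed by how many chips or stacks the GPU is wired for, and that capacity only changes with the density of each one. The `MemoryConfig` class and `with_denser_devices` helper are made up for illustration. The 16 x 32-bit GDDR5 layout matches a 290X, but the 4-stack HBM layout with 1024 bits per stack is my assumption, not a confirmed card design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryConfig:
    """Hypothetical model: a GPU wired for a fixed number of memory devices."""
    devices: int            # chips (GDDR5) or stacks (HBM) the GPU is designed for
    bits_per_device: int    # interface width of each chip or stack
    gb_per_device: float    # capacity of each chip or stack, in GB

    @property
    def bus_width_bits(self) -> int:
        # Bus width depends only on device count and per-device width.
        return self.devices * self.bits_per_device

    @property
    def capacity_gb(self) -> float:
        # Capacity depends on device count and per-device density.
        return self.devices * self.gb_per_device


def with_denser_devices(cfg: MemoryConfig, factor: float = 2.0) -> MemoryConfig:
    """Same device count and bus, each device holding `factor` times as much."""
    return MemoryConfig(cfg.devices, cfg.bits_per_device, cfg.gb_per_device * factor)


# 290X-style GDDR5: 16 chips, 32 bits each, 256MB per chip (illustrative).
gddr5_4gb = MemoryConfig(devices=16, bits_per_device=32, gb_per_device=0.25)
gddr5_8gb = with_denser_devices(gddr5_4gb)

# Assumed HBM layout: 4 stacks, 1024 bits each, 1GB per stack.
hbm_1gb_stacks = MemoryConfig(devices=4, bits_per_device=1024, gb_per_device=1.0)
hbm_2gb_stacks = with_denser_devices(hbm_1gb_stacks)

for name, cfg in [
    ("GDDR5 4GB", gddr5_4gb),
    ("GDDR5 8GB", gddr5_8gb),
    ("HBM 1GB stacks", hbm_1gb_stacks),
    ("HBM 2GB stacks", hbm_2gb_stacks),
]:
    print(f"{name}: {cfg.bus_width_bits}-bit bus, {cfg.capacity_gb:g}GB")
```

In both cases the doubled config keeps the exact same bus width, which is the point: going to denser chips or stacks changes capacity without touching how the GPU talks to the memory.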
What will be interesting more than anything is the packaging. Remember Nvidia's fake Pascal mock-up? With far fewer chips on the PCB (the 16 memory chips pretty much gone), power delivery all goes to the same point and thousands of traces are gone, so routing everything left over (mostly much simpler, lower trace count power components) becomes trivial and the PCB can become relatively tiny.
HBM should enable really small and efficient PCB designs, leaving more room for cooling and a better chance of generation to generation compatibility between coolers. It really only just occurred to me while writing that how damn small a dual GPU card can become... absolutely tiny in comparison to current dual GPU cards. What I really wonder is, can they stick two GPUs and two sets of memory all on one interposer, and have one package on the card with significantly reduced core to core latency and a huge bandwidth connection... could that lead to the end of microstutter... maybe.