Faud is a complete idiot. I trashed his article about how Nvidia was doing its memory differently from AMD, and he's already backtracked and changed his mind.
HBM.... wait for it....... is 3D memory. Not sure why people are getting confused about that or listening to Faud.
Putting stacks of memory on top of a GPU that draws anywhere from 50 to 250W is not going to happen, it's that simple. Nvidia don't like saying "HBM" much, nor do they like crediting AMD. They like to pretend, for years in advance, that they are doing something new that they came up with, and to be as misleading as possible about it.
Each individual stack of HBM is itself a 3D stack of anywhere from 5 to 9 chips (1 logic chip and 4-8 memory chips); that is your 3D part. 2.5D is how you connect those stacks to the GPU. You can make that connection in 3D as well, but it increases both the size and the cost of your GPU and makes cooling it much more difficult.
Unless Volta is a mobile-only chip, it won't have 3D memory stacked on top of it. 3D stacking of memory over a proper processor is pretty much limited to the <5W range; that might creep closer to 10W, but not much further. There is no fundamental reason to do it, because the saving from 2.5D to 3D is minuscule. Where going from off-die to on-package might drop the power spent on communication between GPU and memory from 50W to 2W, going from 2.5D to 3D would take it from 2W to 1W, or maybe more like 1.5W. Latency: very little difference. The only real thing you gain is a smaller package, and you pay for it with increased cost, increased design complexity, increased die size and increased production time.
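To put those numbers side by side, here is a quick back-of-the-envelope sketch. The wattages are just the rough figures from the paragraph above (ballpark guesses, not measurements), and the constant and function names are made up for illustration.

```python
# Ballpark link-power figures quoted in the post; assumptions, not measurements.
OFF_PACKAGE_W = 50.0  # GPU <-> memory communication with memory off the package
INTERPOSER_W = 2.0    # same link with HBM beside the GPU on the package (2.5D)
STACKED_W = 1.5       # same link with memory stacked on the GPU (3D); could be ~1.0


def saving(before_w, after_w):
    """Return (watts saved, percent of the 'before' figure saved)."""
    saved = before_w - after_w
    return saved, 100.0 * saved / before_w


# Step 1: moving the memory onto the package.
step1_watts, step1_pct = saving(OFF_PACKAGE_W, INTERPOSER_W)

# Step 2: moving from beside the GPU to on top of it.
step2_watts, step2_pct = saving(INTERPOSER_W, STACKED_W)

# Step 2 measured against the original off-package budget.
step2_of_original_pct = 100.0 * step2_watts / OFF_PACKAGE_W

print(f"off-package -> 2.5D: saves {step1_watts:.1f} W ({step1_pct:.0f}% of the link power)")
print(f"2.5D -> 3D:          saves {step2_watts:.1f} W ({step2_pct:.0f}% of what is left)")
print(f"2.5D -> 3D is {step2_of_original_pct:.0f}% of the original off-package figure")
```

With these assumed figures the first step recovers 48W and the second recovers half a watt, which is the whole argument in two lines.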
Actual 3D stacking in that sense will be pretty much limited to stupid-ass watches and stupid-ass wearable devices, where package size is critical to device size. A graphics card inside a PC, or even a laptop, has no such package size problem.