There are LOTS of things people don't understand about HBM, so I'll try to make a few points without too much detail and without going on too long... I said TRY.
1/ 512GB/s from GDDR5 and 512GB/s from HBM aren't equivalent. HBM uses its bandwidth more efficiently. For many reasons you never get the 'full' rated bandwidth; HBM and HMC are designed for higher efficiency, meaning you can use more of what's available. As an illustration, GDDR5 may only have 70% effective bandwidth where HBM might have 85%.
In that scenario, while 512GB/s of HBM is 60% more than 320GB/s of GDDR5 on paper, factoring in the efficiency you actually have about 435GB/s vs 224GB/s, which is over a 90% increase in effective bandwidth. In other words, the better efficiency makes the real gain from HBM bigger than the raw numbers suggest. I don't know the actual efficiency numbers for either; I have only seen in many papers/articles that HBM and HMC are designed to increase efficiency by a decent amount over GDDR5. Basically, 512GB/s of HBM is a significantly bigger increase in bandwidth than most think.
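To make the arithmetic in point 1 easy to check, here is a minimal Python sketch. The 70% and 85% efficiency figures are the illustrative guesses from the example, not measured values, and the function names are just made up for this post.

```python
def effective_bandwidth(peak_gbps, efficiency):
    """Usable bandwidth in GB/s, given peak bandwidth and an efficiency fraction (0..1)."""
    return peak_gbps * efficiency


def percent_increase(new, old):
    """How much bigger `new` is than `old`, as a percentage."""
    return (new / old - 1.0) * 100.0


# Illustrative guesses only: the real efficiency numbers are unknown.
HBM_PEAK, HBM_EFFICIENCY = 512.0, 0.85
GDDR5_PEAK, GDDR5_EFFICIENCY = 320.0, 0.70

hbm_effective = effective_bandwidth(HBM_PEAK, HBM_EFFICIENCY)
gddr5_effective = effective_bandwidth(GDDR5_PEAK, GDDR5_EFFICIENCY)

raw_gain = percent_increase(HBM_PEAK, GDDR5_PEAK)
effective_gain = percent_increase(hbm_effective, gddr5_effective)

print(f"HBM effective:   {hbm_effective:.0f} GB/s")    # 435 GB/s
print(f"GDDR5 effective: {gddr5_effective:.0f} GB/s")  # 224 GB/s
print(f"Raw gain:        {raw_gain:.0f}%")             # 60%
print(f"Effective gain:  {effective_gain:.0f}%")       # 94%
```

Plug in different efficiency guesses and the gap moves, but as long as HBM is the more efficient of the two, the effective gain stays above the 60% you get from the raw numbers.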
2/ With current GPUs you can scale memory clocks easily, so memory speed gets tuned to what the GPU actually needs at stock; anything beyond that is wasted power. HBM, by contrast, is running at the lowest clocks/voltages possible. All chips have upper and lower voltage/clock speed limits, and HBM likely isn't tuned to what the GPU core needs at stock speed; it's merely the lowest bandwidth possible with HBM, at the lowest clocks, with the lowest number of stacks possible to get 4GB currently. So while 99% of GPUs have just the bandwidth they need at stock, Fiji likely has (coupled with point 1) WAY more bandwidth than it currently requires at stock, enough excess that increasing core clocks won't make it bandwidth limited.
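A rough way to picture the headroom argument in point 2, as a Python sketch. It assumes bandwidth demand scales linearly with core clock, which is a simplification, and every number in it is hypothetical; nobody outside AMD knows what Fiji really needs at stock.

```python
def max_core_overclock(available_gbps, required_at_stock_gbps):
    """Largest core clock multiplier before the card becomes bandwidth limited.

    Assumes bandwidth demand grows linearly with core clock (a simplification).
    """
    if required_at_stock_gbps <= 0:
        raise ValueError("required bandwidth must be positive")
    return available_gbps / required_at_stock_gbps


# Hypothetical card whose memory is tuned right at what the core needs.
tuned_card = max_core_overclock(available_gbps=224.0, required_at_stock_gbps=220.0)

# Hypothetical card with a lot of excess bandwidth at stock.
excess_card = max_core_overclock(available_gbps=435.0, required_at_stock_gbps=300.0)

print(f"Tuned card:  bandwidth limited beyond {tuned_card:.2f}x core clock")
print(f"Excess card: bandwidth limited beyond {excess_card:.2f}x core clock")
```

With the tuned card, almost any core overclock also needs a memory overclock to keep up. With the excess card, the core can go a long way before memory bandwidth becomes the limit.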
3/ HBM, HMC, and really all stacked chips have complex temperature monitoring and throttling. Increasing clocks and voltage, and with them temperature, would improve speed for the top chips but ultimately cause more throttling on the bottom chips anyway. This would often cause uneven performance: if the data is in the top chip it's faster, but if it's in the bottom chip, which may be throttled or temporarily turned off, it can be slower.
So overclocking may not improve performance and could potentially decrease it.
4/ I honestly haven't seen it discussed anywhere, but HBM might well have static clocks anyway and not be overclockable at all. In the future this could prove a problem. Hopefully AMD/Nvidia will factor in some bandwidth headroom, but that may not happen: as bandwidth goes through the roof, architectures will be tuned to make better use of it. In 1-2 generations they might depend on massive bandwidth, be tuned right on the limit, and become bandwidth limited when overclocking.