Except you can. At the design stage, but Nvidia chose to design for 10GB.
You can, but it's not a straight win. It's always some kind of trade-off, and the end result is usually a compromise between different factors.
Typically you're trying to balance multiple things like:
1) The amount of memory
2) The memory speed (the per-pin data rate, in Gbps)
3) The memory bus width
4) How fast/demanding the GPU is
5) Power/thermal limits
You're not only targeting a certain memory capacity but also a certain total memory bandwidth. If you don't have enough memory bandwidth to keep the GPU fed with data, you bottleneck it and performance drops. Memory bandwidth is the total bus width multiplied by the per-pin memory speed. The total bus width in turn comes from the chips: each memory chip has its own narrower interface, in most cases 32-bit, so the bus width is set by how many chips you use.
The 3080, for example, has 10GB of VRAM, which is 10x 1GB chips, each with a 32-bit interface, giving you a total bus width of 10x32 = 320 bits. The memory speed is 19Gbps per pin, which in bytes is /8, so 2.375GB/s per pin, multiplied by the 320-bit bus width for your 760GB/s total memory bandwidth.
The 6800XT opted for 16GB total memory and used 2GB chips of GDDR6, but those chips, while larger, still have the same 32-bit interface. So it takes 8 chips of 2GB each to reach 16GB, and that only leaves you with a bus width of 8x32 = 256 bits. Add the fact that the chips are slower, at only 16Gbps per pin (2GB/s per pin), and you end up with only 512GB/s total memory bandwidth.
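To make the arithmetic for the two cards concrete, here is a rough sketch in Python. The function name `memory_config` and its parameters are made up for illustration, and it assumes, as above, that every chip has its own 32-bit interface and that speed is the per-pin data rate in Gbps.

```python
def memory_config(chips, gb_per_chip, gbps_per_pin, bits_per_chip=32):
    """Return (capacity in GB, bus width in bits, bandwidth in GB/s).

    Hypothetical helper: assumes one 32-bit interface per chip.
    """
    capacity_gb = chips * gb_per_chip
    bus_width_bits = chips * bits_per_chip
    # Gbps per pin * number of pins gives Gbit/s; divide by 8 for GB/s.
    bandwidth_gb_s = bus_width_bits * gbps_per_pin / 8
    return capacity_gb, bus_width_bits, bandwidth_gb_s


rtx_3080 = memory_config(chips=10, gb_per_chip=1, gbps_per_pin=19)
rx_6800xt = memory_config(chips=8, gb_per_chip=2, gbps_per_pin=16)

print(rtx_3080)   # (10, 320, 760.0)
print(rx_6800xt)  # (16, 256, 512.0)
```

Same formula both times; the only inputs that change are the chip count, the chip density and the per-pin speed.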
Normally this little memory bandwidth would bottleneck such a powerful GPU, so they spend a lot of silicon area on the chip for Infinity Cache, essentially a very large L3 cache that reduces demand on VRAM. But that has a trade-off too, because more silicon area spent on cache means less spent on transistors that do calculations, meaning less performance. And then on top of all of that there's cost. Faster memory costs more, and GDDR6X is more expensive than GDDR6. Higher-capacity chips cost more as well, and the downside is that they still have the same 32-bit interface, which means if you reach the same total capacity with double-density chips you need half as many of them and end up with half the bus width and half the effective bandwidth.
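The density point can be sketched the same way: fix the capacity and the speed, and only change the chip size. This is a simplified illustration with a made-up helper name, `bandwidth_for_capacity`, again assuming one 32-bit interface per chip; real boards can deviate from that.

```python
def bandwidth_for_capacity(capacity_gb, gb_per_chip, gbps_per_pin, bits_per_chip=32):
    """Return (chip count, bus width in bits, bandwidth in GB/s).

    Hypothetical helper: assumes one 32-bit interface per chip.
    """
    chips = capacity_gb // gb_per_chip
    bus_width_bits = chips * bits_per_chip
    return chips, bus_width_bits, bus_width_bits * gbps_per_pin / 8


# Same 16GB target, same 16Gbps memory, different chip density.
low_density = bandwidth_for_capacity(16, gb_per_chip=1, gbps_per_pin=16)
high_density = bandwidth_for_capacity(16, gb_per_chip=2, gbps_per_pin=16)

print(low_density)   # (16, 512, 1024.0)
print(high_density)  # (8, 256, 512.0)
```

Doubling the density halves the chip count, so the bus width and bandwidth halve with it, but a 512-bit bus with 16 chips brings its own cost, board-space and power problems.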
It's all one giant trade-off. You can target one thing and get it perfect, but often to the detriment of other things.