Memory allocation is purely down to the developer's implementation of the game engine: they're free to request more or less whatever they want from the GPU and driver stack, and then apply their own internal rules about how that memory is used. Yes, you can dynamically scale allocation up and down based on demand, but it's better not to do it frequently. Other apps and processes can also use VRAM, so if you're designing an engine that intends to use most of what's available, you don't want to faff about growing and shrinking your allocation; any memory you release risks being claimed by something else before you can get it back. Frequent resizing also forces the GPU and driver to do a bunch of memory management work, which is best done infrequently.
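To make that concrete, here's a minimal C++ sketch of the usual pattern: reserve one big pool once at startup, sized against what's free, then sub-allocate internally instead of going back to the driver. The `queryAvailableVramBytes` helper and the 85% headroom figure are hypothetical stand-ins; a real engine would ask the driver (e.g. DXGI's `QueryVideoMemoryInfo` or Vulkan's `VK_EXT_memory_budget`).

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical query -- a real engine would ask the driver, e.g. via
// IDXGIAdapter3::QueryVideoMemoryInfo or Vulkan's VK_EXT_memory_budget.
uint64_t queryAvailableVramBytes() {
    return 10ull * 1024 * 1024 * 1024; // pretend: 10 GB card, nothing else resident
}

// Size one big pool up front and sub-allocate from it internally,
// rather than growing/shrinking against the driver every few frames.
struct VramPool {
    uint64_t capacity = 0;
    uint64_t used = 0;

    void reserve(double fractionOfFree) {
        // Leave headroom for the OS compositor and other apps' VRAM use.
        capacity = static_cast<uint64_t>(queryAvailableVramBytes() * fractionOfFree);
    }

    // Returns an offset into the pool, or UINT64_MAX when full -- at which
    // point the engine evicts/streams rather than asking the driver for more.
    uint64_t suballocate(uint64_t bytes) {
        if (used + bytes > capacity) return UINT64_MAX;
        uint64_t offset = used;
        used += bytes;
        return offset;
    }
};

int main() {
    VramPool pool;
    pool.reserve(0.85); // grab ~85% of what's free, once, at startup
    printf("pool capacity: %.2f GB\n", pool.capacity / (1024.0 * 1024 * 1024));
    uint64_t off = pool.suballocate(512ull * 1024 * 1024); // a 512 MB texture heap
    printf("suballocated at offset %llu, used %.2f GB\n",
           (unsigned long long)off, pool.used / (1024.0 * 1024 * 1024));
    return 0;
}
```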
Take two hypothetical, otherwise identical GPUs, one with 10 GB and one with 13 GB. You'd find performance is the same as long as the "bloat" is memory that has been allocated but never filled. Two things can happen in that situation. The engine's allocation rules can simply notice there's less VRAM available on the 10 GB card and be more conservative, i.e. not over-provision as much. Or, because we now have unified/virtualised GPU memory, the engine can attempt to allocate that much memory anyway and let the GPU and driver handle paging assets into VRAM from disk or system memory as needed. As long as the over-provisioned memory isn't filled with anything useful, there's no performance penalty there either.
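The two strategies side by side, as a sketch (the function names, the 13 GB "desired" working set, and the 15% headroom are all illustrative assumptions, not any particular engine's policy):

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>

constexpr uint64_t GiB = 1024ull * 1024 * 1024;

// Strategy A: be conservative -- clamp the engine's desired working set
// to what the card actually has, and stream more aggressively instead.
uint64_t conservativeBudget(uint64_t desired, uint64_t vramBytes) {
    return std::min(desired, vramBytes * 85 / 100); // keep ~15% headroom
}

// Strategy B: over-provision anyway and let the driver page for us.
// Allocated-but-untouched memory costs nothing until it's actually
// filled; only resident, in-use pages compete for VRAM.
uint64_t overcommitBudget(uint64_t desired) {
    return desired; // e.g. ask for 13 GB of address space on a 10 GB card
}

int main() {
    uint64_t desired = 13 * GiB; // hypothetical working set the engine wants
    for (uint64_t vram : {10 * GiB, 13 * GiB}) {
        printf("%llu GB card: conservative=%llu GB, overcommit=%llu GB\n",
               (unsigned long long)(vram / GiB),
               (unsigned long long)(conservativeBudget(desired, vram) / GiB),
               (unsigned long long)(overcommitBudget(desired) / GiB));
    }
    return 0;
}
```

Either way, the 10 GB card only pays a cost once the extra 3 GB actually holds assets that get touched during rendering.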