Where does the extra 3GB come from? Is this the game engine requesting it as a reserve, or is it a bug in Windows that assigns more than is needed?
Let's assume that the game engine is requesting 12.5GB as a reserve. Are you saying that game developers don't know how to do their job and are requesting an excessive amount of VRAM that they do not need?
Has someone played through the whole of FS2020 to confirm that 9.5GB is the maximum it will ever need? Or did they benchmark a section and think that they can extrapolate it to apply to the rest of the game?
What about other games? Has anyone benchmarked an entire game to check how its VRAM requirements fluctuate?
Edit: As a side note, it is kind of funny to watch some people go from saying we should listen to the Nvidia engineer because they know more than us, to claiming they themselves know the requirements of a game engine better than the developer of said engine.
The extra memory allocation comes from developers writing the engine so that it reserves an estimated block of vRAM that is larger than what it knows it needs. The game engine then internally manages what is put into that vRAM, so it's abstracted away from the hardware. The engine just sees a big list of memory addresses it can use that don't conflict with other processes using the GPU, and from the GPU's standpoint, once that memory is assigned to an app it's "in use" and unavailable to any other process unless it is later released.
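To make the "allocated" vs "actually used" distinction concrete, here's a toy sketch in Python. It's not how any real engine or driver does it; the `VramPool` class and the asset names are made up for illustration, and the sizes are just the 12.5GB/9.5GB figures from this thread.

```python
class VramPool:
    """Toy model of an engine-managed vRAM reservation (sizes in MB).

    Hypothetical illustration only: real engines sub-allocate through
    graphics APIs, not a Python dict.
    """

    def __init__(self, reserved_mb):
        # What the OS and monitoring tools would report as "allocated".
        self.reserved_mb = reserved_mb
        # What the engine has actually placed inside the reservation.
        self.resources = {}

    @property
    def used_mb(self):
        return sum(self.resources.values())

    def load(self, name, size_mb):
        if self.used_mb + size_mb > self.reserved_mb:
            raise MemoryError(f"{name} does not fit in the reserved block")
        self.resources[name] = size_mb

    def unload(self, name):
        # Frees space inside the pool; the reservation itself is unchanged.
        self.resources.pop(name, None)


pool = VramPool(reserved_mb=12_500)
pool.load("terrain_textures", 6_000)   # made-up asset names and sizes
pool.load("aircraft_models", 2_500)
pool.load("render_targets", 1_000)

print(f"allocated: {pool.reserved_mb} MB, in use: {pool.used_mb} MB")
# allocated: 12500 MB, in use: 9500 MB
```

The point of the sketch is just that loading and unloading assets changes `used_mb` while `reserved_mb` stays put, which is why a tool reading the reservation tells you very little about what the game needs.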
How each developer uses that vRAM in their engine is going to be unique to them. In the old days you'd just throw in all the "level" assets required for everything in the game space you're in. Game spaces were separated by levels, and the "loading" process between them flushed vRAM and then loaded the next lot. As engines became more sophisticated and game installs (the assets) grew far larger than vRAM can hold, vRAM became more of a buffer into which you do predictive streaming of assets as players pass between zones. Once that concept was mastered you could in theory have infinite content and just stream what you need, which is why games, especially open world ones, went from 4-5GB to 100+GB in a few short years.
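As a rough sketch of the streaming idea, assume a fixed vRAM budget, a simple "current zone plus its neighbours" prediction, and least-recently-used eviction. All of those are my assumptions for the example (real engines keep their actual policies to themselves), and the `StreamingCache` class and zone layout are hypothetical.

```python
from collections import OrderedDict


class StreamingCache:
    """Toy predictive asset streamer with LRU eviction (sizes in MB)."""

    def __init__(self, budget_mb):
        self.budget_mb = budget_mb
        self.resident = OrderedDict()  # asset -> size, least recently used first

    def used_mb(self):
        return sum(self.resident.values())

    def request(self, asset, size_mb):
        if asset in self.resident:
            self.resident.move_to_end(asset)  # mark as recently used
            return
        if size_mb > self.budget_mb:
            raise ValueError(f"{asset} is larger than the whole budget")
        # Evict the least recently used assets until the new one fits.
        while self.used_mb() + size_mb > self.budget_mb:
            self.resident.popitem(last=False)
        self.resident[asset] = size_mb

    def enter_zone(self, zone, world):
        assets, neighbours = world[zone]
        # Prefetch neighbouring zones first, then touch the current zone
        # last so its assets are the most recently used.
        for other in neighbours:
            for asset, size in world[other][0].items():
                self.request(asset, size)
        for asset, size in assets.items():
            self.request(asset, size)


# Four zones in a line, 1200 MB of assets in total, 1000 MB budget.
world = {
    "A": ({"a_tex": 300}, ["B"]),
    "B": ({"b_tex": 300}, ["A", "C"]),
    "C": ({"c_tex": 300}, ["B", "D"]),
    "D": ({"d_tex": 300}, ["C"]),
}

cache = StreamingCache(budget_mb=1000)
for zone in ["A", "B", "C"]:
    cache.enter_zone(zone, world)

print(sorted(cache.resident), cache.used_mb())
# ['b_tex', 'c_tex', 'd_tex'] 900
```

By the time the player is in zone C, zone A's assets have been dropped and zone D's are already resident, even though the whole "install" never fits in the budget at once.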
But the fact is that no one other than the engine engineers really knows how this works deep down; there are lots of trade secrets, I'm sure. Even the game developers don't really know: they have abstracted tools that let them define zones and place loading/streaming areas, but they almost certainly have no idea what the engine is actually doing in vRAM. The point is that vRAM usage and game assets kind of became decoupled, and more and more of that vRAM is now dedicated purely to what the GPU needs to render the next frame. The better that prediction gets, the more vRAM is spent on that rather than on a large dumb cache. id Tech 5+ made use of this last gen, and the next-gen consoles (and Microsoft DirectStorage) will continue to capitalize on it into the next generation (Nvidia has integrated this as RTX IO).
On FS2020, the benchmarks were pulled from a generic list of benchmarks, so you'd assume they're representative, but I don't know that for sure. What I do know is that if they're not, and you fly over some area that hypothetically needs 14GB of vRAM, then the vRAM won't crap out first; the GPU will. It'll choke trying to give you a playable frame rate.
It's not just about capacity. Memory bus width has a direct correlation to performance, and a wider memory bus needs more physical memory chips on the PCB. Depending on what chip sizes are available, this may simply require the card to have more memory chips (and thus more capacity).
Yeah, this is what I alluded to earlier when I said architecture. Fundamentally, if you pick some bus width for your GPU/memory and chips are only available in certain sizes, then you end up with a fixed list of candidate memory configs for the card. My bet is that whatever the next config below 11GB would have been for the 1080 Ti was too small, and it's better to over-provision the memory than under-provision it. The result is a more expensive card, because at the end of the day the extra memory costs money, so it adds to the already heavy premium of the card. But what are you gonna do? It's an architecture limitation that's different for each card, depending on what RAM/bus width you pick and what chips are available at the time.
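The arithmetic behind that "fixed list of candidate configs" can be sketched quickly. This assumes one chip per 32 bits of bus (the usual GDDR arrangement, ignoring clamshell layouts), and the list of chip densities is purely illustrative, not a claim about what was actually on the market at the time. The 352-bit bus is the 1080 Ti's; the function name is my own.

```python
def candidate_capacities_gb(bus_width_bits, chip_densities_gbit, chip_interface_bits=32):
    """Map each chip density (in Gbit) to the total card capacity (in GB).

    Assumes one memory chip per `chip_interface_bits` of bus width.
    """
    chips = bus_width_bits // chip_interface_bits
    # 8 Gbit = 1 GB, so capacity in GB is chips * density / 8.
    return {density: chips * density / 8 for density in chip_densities_gbit}


# 352-bit bus as on the 1080 Ti; densities are illustrative assumptions.
caps = candidate_capacities_gb(352, [4, 8, 16])
for density, capacity in caps.items():
    print(f"{density} Gbit chips -> {capacity} GB")
# 4 Gbit chips -> 5.5 GB
# 8 Gbit chips -> 11.0 GB
# 16 Gbit chips -> 22.0 GB
```

Under those assumptions the steps are 5.5GB, 11GB or 22GB with nothing in between, which is the sort of gap I mean when I say the next config down would have been too small.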