But your Intel board doesn't actually have that many 'real' lanes. It has 24 chipset lanes, funnelled into a 4 lane link between the chipset and the CPU, plus 16 lanes directly from the CPU. So your 24 chipset lanes can only offer 4x worth of bandwidth at any one time, which makes it a serious bottleneck if you are using those devices at the same time as each other.
Let's take an example: you've got your board with 3 16x slots, one running at 16x for the GPU and the others running at 4x as you mentioned, and you chose to populate these with PCI-E SSDs (we know how much you dislike M.2 drives). Since both of the 4x slots are now going through the chipset, you are only going to have 50% of the bandwidth accessible per drive when both are busy, so they perform sub-optimally. If all you want to do is have lots of devices connected and only use one at a time (4x +), or lots of 1x devices, then that is fine, but otherwise what you are asking for is HEDT TR style PCI-E lanes.
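To put rough numbers on that 50% figure, here is a minimal Python sketch of the sharing arithmetic. It assumes the chipset uplink behaves like a PCIe 3.0 4x link, that busy devices split it evenly, and it ignores protocol overhead and other chipset traffic (SATA, USB). The function names are made up for illustration.

```python
# PCIe 3.0: 8 GT/s per lane with 128b/130b encoding, converted to GB/s.
PCIE3_GB_PER_LANE = 8 * (128 / 130) / 8


def link_bandwidth(lanes):
    """Theoretical PCIe 3.0 bandwidth in GB/s for a link of the given width."""
    return lanes * PCIE3_GB_PER_LANE


def share_uplink(device_lanes, uplink_lanes):
    """Split a shared uplink between devices that are all busy at once.

    Assumption: an even (max-min fair) split, where no device gets more
    than its own link can carry and any leftover goes to the others.
    Returns one bandwidth figure (GB/s) per device, in input order.
    """
    demands = [link_bandwidth(n) for n in device_lanes]
    remaining = link_bandwidth(uplink_lanes)
    shares = [0.0] * len(demands)
    # Serve the narrowest devices first so their unused share is passed on.
    active = sorted(range(len(demands)), key=lambda i: demands[i])
    while active:
        fair = remaining / len(active)
        i = active[0]
        if demands[i] <= fair:
            shares[i] = demands[i]
            remaining -= demands[i]
            active.pop(0)
        else:
            # Everyone left wants more than the fair share: split evenly.
            for j in active:
                shares[j] = fair
            break
    return shares


if __name__ == "__main__":
    for label, devices in [("one 4x SSD", [4]), ("two 4x SSDs", [4, 4])]:
        shares = share_uplink(devices, uplink_lanes=4)
        for lanes, got in zip(devices, shares):
            pct = 100 * got / link_bandwidth(lanes)
            print(f"{label}: {got:.2f} GB/s per drive ({pct:.0f}% of its own link)")
```

With one drive active it gets the whole uplink; with two 4x drives busy at once each ends up with half, which is the bottleneck described above.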
AMD actually have 20 usable lanes from the CPU, plus 4 lanes dedicated to the chipset link, for AM4, but the layout means they don't use a multiplier like Intel to turn the 4 chipset lanes into 24 'fake' lanes. So the boards might look inferior but actually offer the same peak performance, with an extra 4 lanes from the CPU for another 4x PCI-E device, such as an M.2 drive at full speed.
The main thing to remember is that the AMD CPU is an SoC and the chipset is not actually needed at all.
The peak performance isn't the issue, but the flexibility of configuration is. I wouldn't be utilising all SATA ports, all USB ports and all PCIe slots at the same time.
So it seems the issue is that the AMD boards, because of this lack of a multiplier, cannot offer this configuration on mainstream?
Given this severe limitation, it would seem that dedicating those 4 lanes from the CPU to M.2 is extremely wasteful.
I would route those 4 lanes to a PCIe slot instead, or even better a bridge chip which can multiply the lanes into virtual lanes. Then you have a flexible, multipurpose use for them; if you still want to use them for NVMe then you can do so via an NVMe M.2 card or a PCIe SSD.
Basically the equivalent of the entire chipset allocation of bandwidth is sent to a single m.2 slot?
Also, are you really sure they don't support multiplication? I mean my cheap B450 board even has 4 PCIe slots on top of 4 AMD SATA ports, 2 ASMedia SATA ports, plus USB etc. I expect only the primary PCIe slot is fed from the CPU, so the rest is coming from these 4 lanes. The other PCIe slots are PCI Express 2 so halved capacity, but there seems to be some multiplication still going on.
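For a rough check on the "halved capacity" point, this short Python sketch compares theoretical per-lane throughput for PCIe 2.0 and 3.0 using the published signalling rates and encodings. The function names are just for illustration, and real-world throughput will be lower than these figures.

```python
# (transfer rate in GT/s, encoding efficiency) for each PCIe generation.
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),     # PCIe 2.0: 5 GT/s, 8b/10b encoding
    3: (8.0, 128 / 130),  # PCIe 3.0: 8 GT/s, 128b/130b encoding
}


def lane_gb_per_s(gen):
    """Theoretical bandwidth of a single lane in GB/s."""
    rate, efficiency = PCIE_GENERATIONS[gen]
    return rate * efficiency / 8  # 8 bits per byte


def slot_gb_per_s(gen, lanes):
    """Theoretical bandwidth of a slot with the given generation and width."""
    return lanes * lane_gb_per_s(gen)


def gen3_lane_equivalent(gen, lanes):
    """How many PCIe 3.0 lanes' worth of bandwidth a slot can consume."""
    return slot_gb_per_s(gen, lanes) / lane_gb_per_s(3)


if __name__ == "__main__":
    for gen, lanes in [(2, 1), (2, 4), (3, 1), (3, 4)]:
        print(
            f"PCIe {gen}.0 {lanes}x: {slot_gb_per_s(gen, lanes):.2f} GB/s, "
            f"about {gen3_lane_equivalent(gen, lanes):.2f} PCIe 3.0 lanes"
        )
```

So a PCIe 2.0 lane carries roughly half of what a PCIe 3.0 lane does, and a PCIe 2.0 4x slot can use up about two PCIe 3.0 lanes' worth of uplink.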
So I think these new boards could have PCI Express 3 slots fed from the chipset (so only half a lane needed to feed a x1 slot).
A full sized PCI Express slot fed from those spare 4 lanes on the CPU, so it's a x4 that needs nothing from the chipset.
A full sized PCI Express slot fed from the chipset but v2, so it uses 2 lanes not 4. Share this with the M.2 via a bridge, so if both are in use at the same time they slow down or something.
One lane can feed 2 more PCI Express v3 x1 slots.
One lane USB, one lane SATA.
This would be almost as good, with just 1 less x1 PCIe slot than my current config, the sacrifice being the loss of the M.2 slot with its dedicated CPU lanes.
It really seems, though, that both Intel and AMD have been very tight on chipset lanes; 4 lanes seems a pittance.