Yep, they have a warp scheduler for a reason.
I believe wave32/64 is an internal AMD-only specification. Nvidia is a superscalar architecture; do they need to batch/queue instructions like that?
Seriously, the changes to FP32 in Ampere are really similar in concept to the changes to FP32 in RDNA3 - AMD were just a generation behind, and it's not going to be the cause of any unexpected loss of performance.