I still don't even know what the exact specs are for a Drive PX2; the consensus was that it used GP106-class GPUs. So you've got 2x 256-CUDA-core Tegra chips plus 2 discrete GP106s, giving 8TF FP32 total. That's roughly 3000 CUDA cores on 16nm FinFET for the Drive PX2 to achieve that performance.
Then the new one is supposed to be a single chip with 512 CUDA cores and have the same performance? Thing is, Nvidia themselves are seemingly the ones giving this out, and Anandtech have posted the same information... something is extremely odd about the numbers.
EDIT:- Anandtech and the Nvidia blog don't actually mention 8TF FP32, so I think wccftech and others have just assumed it didn't lose FP32 performance. That seems nearly impossible: for 512 CUDA cores to match ~3000, each core would suddenly need to be clocked 6 times higher, or be 600% more efficient, or some combination adding up to 6x more performance per core, which just isn't happening. Originally it was implied that the deep learning ops were based off total CUDA core power, so 8TF FP32 performance meant 3 ops per clock for deep learning and 24 DLTF. But the biggest gain was image processing, and the pictures showed 2 or 3 third-party chips on the PCB, one of which was a known image processing SoC.
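To sanity-check that 6x figure: FP32 throughput is roughly cores x 2 FLOPs per cycle (one FMA) x clock. The core counts and ~1.3GHz clock below are the rough figures from this thread, not official specs:

```python
def fp32_tflops(cuda_cores, clock_ghz):
    # Each CUDA core retires one FMA per cycle = 2 FLOPs
    return cuda_cores * 2 * clock_ghz / 1000

# Rough PX2 total discussed above: 2x 256-core Tegra iGPUs
# + 2x 1280-core GP106-class discrete GPUs = 3072 cores
px2_tflops = fp32_tflops(3072, 1.3)          # ~8 TFLOPS at ~1.3 GHz

# Clock needed for 512 cores to hit the same 8 TFLOPS:
required_ghz = 8.0 * 1000 / (512 * 2)        # ~7.8 GHz -- not plausible
print(px2_tflops, required_ghz)
```

So unless a 512-core part can run at nearly 8GHz, the "same FP32 performance" assumption doesn't hold up.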
If this chip doesn't have 8TF FP32 but has the same DLTF, it actually indicates that the vast majority of the deep learning ops are image processing, and that most of it (or maybe absolutely all of it) is done on this third-party image processing chip. Meaning, deep learning happens almost entirely off the Nvidia chips, so the 2 discrete GPUs on the PX2 were basically unused. It could also be the reason why the PX2 went from 250W to 80W out of nowhere... turn off the discrete GPUs because they were unnecessary? So two SoCs, which maybe were also wasted, turn into one lower-clocked, better-optimised SoC, and the work is still done on the third-party chip, hence no drop-off in deep learning performance.
In which case the PX2 was a con, and Nvidia are still using third-party chips to provide the performance they need. This isn't new either: almost all their automotive systems, afaik, used third-party chips to do most of the work.
Regardless of what you think about my level of bias against Nvidia, the only way I can think of for them to go from 250W and 3000 CUDA cores to 20W and 512 CUDA cores while losing zero deep learning performance... is if the performance simply doesn't come from their chips.