There are some implicit assumptions I have made to stop short of building a full-scale model:
- All rays hit the scene
- BVH hits are mutually exclusive, which means the search is aborted at the first hit and there are no diverging queries
The first assumption makes sense because Nvidia would have built an appropriate scene to churn out these theoretical numbers.
The second assumption, however, is an approximation and not strictly true in theory. Right now, with the denoiser, they can probably afford to do this without a discernible loss of visual quality.
Also, we don't know whether diverging queries are being treated as new rays (they should be if you are aiming for a scalable arch), in which case the difference can be compressed further.
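To make the second assumption a bit more concrete, here is a toy sketch of what "abort at first hit" saves compared with following every box the ray overlaps. Everything in it is my own illustration: the `Node` layout, the two-leaf scene, and the `count_box_tests` helper are hypothetical, leaf boxes stand in for primitives, and none of it reflects how the actual RT hardware traverses a BVH.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec = Tuple[float, float, float]

@dataclass
class Node:
    # Axis-aligned bounding box; a node without children is a leaf.
    lo: Vec
    hi: Vec
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None

def ray_hits_box(origin: Vec, direction: Vec, lo: Vec, hi: Vec) -> bool:
    """Standard slab test for a ray against an axis-aligned box."""
    t_near, t_far = 0.0, float("inf")
    for o, d, l, h in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def count_box_tests(root: Node, origin: Vec, direction: Vec,
                    stop_at_first_hit: bool) -> int:
    """Count box tests for one ray, with or without the early abort."""
    tests = 0
    stack = [root]
    while stack:
        node = stack.pop()
        tests += 1
        if not ray_hits_box(origin, direction, node.lo, node.hi):
            continue
        if node.is_leaf:
            if stop_at_first_hit:
                break  # assumption 2: first hit ends the search
            continue   # otherwise keep following the other branches
        stack.append(node.right)
        stack.append(node.left)  # left child is visited first
    return tests

# Hypothetical scene: two leaf boxes lined up along the ray.
leaf_a = Node((1.0, 0.0, 0.0), (2.0, 1.0, 1.0))
leaf_b = Node((3.0, 0.0, 0.0), (4.0, 1.0, 1.0))
root = Node((1.0, 0.0, 0.0), (4.0, 1.0, 1.0), leaf_a, leaf_b)

origin = (0.0, 0.5, 0.5)
direction = (1.0, 0.0, 0.0)

early = count_box_tests(root, origin, direction, stop_at_first_hit=True)
full = count_box_tests(root, origin, direction, stop_at_first_hit=False)
print(early, full)
```

In this tiny scene the early abort does 2 box tests and the full traversal does 3. The gap grows with the number of overlapping branches, and that extra work is what my estimate leaves out.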
So I would revise my estimate to roughly 38% slower, instead of at least 38% slower, than the RTX 3080 in RT.