*** The AMD RDNA 4 Rumour Mill ***

I don't know where this reasoning that Nvidia has "dedicated RT cores" and AMD doesn't comes from. It is not correct and it is not how this works: they both have "dedicated RT hardware". There is nothing emulated about AMD's RT, it is physical.

AMD doesn't have a dedicated RT pipeline. Available documentation suggests the TMU performs ray-intersection tests in hybrid mode, but it can only do one type of operation per clock cycle.


The difference is in how Nvidia and AMD build the BVH. This is really oversimplified because I'm not going to write a wall of text to explain it: AMD constructs the BVH over many branches, Nvidia does it over a very wide tree. It's a bit like 8 slow cores vs 4 fast cores: both can be equally fast, but not by the same method.
Nvidia also has a BVH traversal co-processor, which AMD lacks. And wider isn't always better in RT, because a deeper BVH lets you reject whole clusters of triangles in a single op, saving a lot of work. Nvidia seems to have adopted a wider BVH structure for the 40 series, but there might be something else that goes into their optimization algorithm (Nvidia has been leading innovation in the industry, so it's reasonable to assume they have a secret sauce).

And I believe the 50 series will further widen the performance/efficiency gap with respect to AMD's offerings.
 
I don't know where this reasoning that Nvidia has "dedicated RT cores" and AMD doesn't comes from. It is not correct and it is not how this works: they both have "dedicated RT hardware". There is nothing emulated about AMD's RT, it is physical.
The way I understood it (when RDNA 2 launched) was that Nvidia had more dedicated RT hardware that could be considered 'independent', whereas AMD had accelerators for ray tracing that were not (or were much less so).

This article describes it much the way I understood it:
The most notable change for the general consumer is that AMD now offer hardware acceleration for specific routines within ray tracing.

This part of the CU performs ray-box or ray-triangle intersection checks – the same as the RT Cores in Ampere. However, the latter also accelerates BVH traversal algorithms, whereas in RDNA 2 this is done via compute shaders using the SIMD 32 units.

...

The RA units are next to the texture processors, because they're actually part of the same structure. Back in July 2019, we reported on the appearance of a patent filed by AMD which detailed using a 'hybrid' approach to handling the key algorithms in ray tracing...

While this system does offer greater flexibility and removes the need to have portions of the die doing nothing when there's no ray tracing workload, AMD's first implementation of this does have some drawbacks. The most notable of which is that the texture processors can only handle operations involving textures or ray-primitive intersections at any one time.

Given that Nvidia's RT Cores now operate fully independently of the rest of the SM, this would seem to give Ampere a distinct lead, compared to RDNA 2, when it comes to grinding through the acceleration structures and intersection tests required in ray tracing.

Like I said earlier though, this stuff goes way over my head, "BVH traversal algorithm" - Whut? Is that for helping trains turn around?
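
For anyone else wondering what "BVH traversal" means in practice, here is a minimal Python sketch. It is only an illustration of the general idea, not how either vendor's hardware or driver works, and every name in it (Node, ray_hits_box, traverse, the tiny scene) is made up for the example. ray_hits_box is the kind of ray-box check the article says the RA units accelerate; the stack loop in traverse is the traversal part that, per the article, runs on the shaders in RDNA 2 and inside the RT Core on Ampere.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    lo: Vec3                                  # min corner of the bounding box
    hi: Vec3                                  # max corner of the bounding box
    children: List["Node"] = field(default_factory=list)
    triangle_id: Optional[int] = None         # set on leaf nodes only


def ray_hits_box(origin: Vec3, direction: Vec3, lo: Vec3, hi: Vec3) -> bool:
    """Slab test: does the ray (t >= 0) pass through the axis-aligned box?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, a, b in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < a or o > b:
                return False
            continue
        t1, t2 = (a - o) / d, (b - o) / d
        t_near = max(t_near, min(t1, t2))
        t_far = min(t_far, max(t1, t2))
    return t_near <= t_far


def traverse(root: Node, origin: Vec3, direction: Vec3):
    """Walk the tree; return (candidate triangle ids, number of box tests)."""
    hits, box_tests = [], 0
    stack = [root]
    while stack:
        node = stack.pop()
        box_tests += 1
        if not ray_hits_box(origin, direction, node.lo, node.hi):
            continue                          # whole subtree rejected by one test
        if node.triangle_id is not None:
            hits.append(node.triangle_id)     # a real tracer would now run a ray-triangle test
        else:
            stack.extend(node.children)
    return hits, box_tests


# Hypothetical scene: one leaf on the left, a two-leaf cluster on the right.
left = Node((0, 0, 0), (1, 1, 1), triangle_id=0)
right = Node((4, 0, 0), (5, 1, 1), children=[
    Node((4, 0, 0), (4.5, 1, 1), triangle_id=1),
    Node((4.5, 0, 0), (5, 1, 1), triangle_id=2),
])
root = Node((0, 0, 0), (5, 1, 1), children=[left, right])

print(traverse(root, (0.5, 0.5, -1.0), (0.0, 0.0, 1.0)))  # ([0], 3)
```

The tree has five nodes, but the ray only needs three box tests: one test on the right-hand cluster's box rules out both triangles inside it.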
 
^^^^ :D

I don't want to drag this off topic given the complaint already... AMD's shaders double up as RT cores, while Nvidia has specific, or "dedicated", RT cores. So in terms of wording, yes, it is accurate to say that, but in terms of functionality AMD's is still hardware with specific hardware extension functionality, i.e. "not emulated".

That's the last i will say on it here :)
 
I don't know where this reasoning that Nvidia has "dedicated RT cores" and AMD doesn't comes from. It is not correct and it is not how this works: they both have "dedicated RT hardware". There is nothing emulated about AMD's RT, it is physical.

The difference is in how Nvidia and AMD build the BVH. This is really oversimplified because I'm not going to write a wall of text to explain it: AMD constructs the BVH over many branches, Nvidia does it over a very wide tree. It's a bit like 8 slow cores vs 4 fast cores: both can be equally fast, but not by the same method.

The advantage of the wide approach is that it doesn't really matter which BVH construction you code for, you will always get the most out of being wide. The disadvantage is that wide requires more caching, which is why Ada has so much L2 cache. The advantage of the branch approach is that you don't need so much cache, but unless you specifically code for it, it's going to be slower.
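
To put rough numbers on a wide-versus-deep trade-off, here is a back-of-envelope Python sketch. It assumes a perfectly balanced tree where every node has the same number of children, and a ray that follows a single root-to-leaf path while testing every child box at each level. Real BVHs are not like that, and the function name and the one-million-triangle scene are made up for illustration.

```python
def bvh_cost(num_leaves: int, width: int):
    """Return (levels, box_tests) for an idealised balanced BVH.

    levels    = how many times a ray has to step down the tree
    box_tests = child boxes tested along one root-to-leaf path
    """
    levels, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= width
        levels += 1
    return levels, levels * width


for width in (2, 4, 8):
    levels, tests = bvh_cost(1_000_000, width)
    print(f"width {width}: {levels} levels, {tests} box tests per ray")
# width 2: 20 levels, 40 box tests per ray
# width 4: 10 levels, 40 box tests per ray
# width 8: 7 levels, 56 box tests per ray
```

Under these assumptions the wider tree needs fewer dependent steps per ray but as many or more box tests, so which layout wins depends on how many box tests the hardware can run at once and how well the nodes sit in cache.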

Now I'm sure AMD's thinking was: keep the die size down, it doesn't matter, as we own the consoles they are going to code for us. Hmm... well, they don't have to, and if the studio is packed with Nvidia cards they aren't going to.

Also, the idea that any game AMD does RT well in must be fake RT: no, not necessarily.

I was keeping my previous post very simple and even clarified that AMD has some dedicated hardware. What they leave out is hardware dedicated to BVH traversal.

And I don't need you to write a wall of text explaining it, because I can sum up your wall of text right now. It will be a load of technical-sounding gibberish basically giving a bunch of excuses for why AMD doesn't perform as well as Nvidia in ray tracing.

I'm sorry, but this isn't going to descend into a silly argument. If you actually know what you say you know, you know that there is no getting around the hardware limitation that AMD's solution has compared to Nvidia's. Nvidia's will always be faster, hence why it's changing for RDNA 4.

And I never said Fake RT either. Where did you even get that from?
 
Ah, the AMD defence force arrives.

I was keeping my previous post very simple and even clarified that AMD has some dedicated hardware. What they leave out is hardware dedicated to BVH traversal.

And I don't need you to write a wall of text explaining it, because I can sum up your wall of text right now. It will be a load of technical-sounding gibberish basically giving a bunch of excuses for why AMD doesn't perform as well as Nvidia in ray tracing.

I'm sorry, but this isn't going to descend into a silly argument. If you actually know what you say you know, you know that there is no getting around the hardware limitation that AMD's solution has compared to Nvidia's. Nvidia's will always be faster, hence why it's changing for RDNA 4.

And I never said Fake RT either. Where did you even get that from?


Read first line, ignores rest... goes back to reading something interesting.
 
I was keeping my previous post very simple and even clarified that AMD has some dedicated hardware. What they leave out is hardware dedicated to BVH traversal.

This has not changed for RDNA 3 and isn't expected to change for RDNA 4. AMD doesn't have any dedicated hardware for RT; the intersection tests are also done using hybrid units, which can perform only one of these per clock cycle:
- 4 texture mapping ops (TMU)
- 4 ray-box intersection tests
- 1 ray-triangle intersection test (AMD counts this as one Ray Accelerator op)
source: https://www.reddit.com/r/Amd/comments/ic4bn1/amd_ray_tracing_implementation/
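
Taking the per-clock figures in the list above at face value, a toy Python model shows why sharing one unit matters: every cycle spent on ray tests is a cycle not spent on texturing. The ops-per-cycle numbers come from the list; the workload, the function name, and the assumption of perfect packing with no stalls are mine, so treat it as a sketch rather than a description of the real scheduler.

```python
import math

# Ops the shared unit can issue per clock, one op type at a time (from the list above).
OPS_PER_CYCLE = {"texture": 4, "ray_box": 4, "ray_triangle": 1}


def cycles_needed(workload: dict) -> int:
    """Cycles for ONE shared unit to finish the workload.

    Assumes only one op type can issue per cycle and batches pack perfectly.
    """
    return sum(math.ceil(count / OPS_PER_CYCLE[op]) for op, count in workload.items())


# Hypothetical mix of work landing on a single unit.
frame = {"texture": 400, "ray_box": 240, "ray_triangle": 20}

shared = cycles_needed(frame)
# Idealised comparison: each op type on its own unit, all running side by side.
independent = max(cycles_needed({op: count}) for op, count in frame.items())

print(shared)       # 180  (100 texture + 60 ray-box + 20 ray-triangle cycles)
print(independent)  # 100  (bounded by the texture work alone)
```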

This can also be corroborated with aggregate numbers: the 7900 XTX, for example, has 384 TMUs and 96 Ray Accelerators (a 4:1 ratio), and the ratio must have been the same for the 6800 XT as well. RDNA 4 is not expected to bring a big structural change to the pipeline; they are expected to make minor tweaks, like the number of intersection tests that can be executed per clock cycle.

There's no separate mention of BVH hardware.
So in summary, they don't have a separate RT pipeline and no big changes are expected in RDNA 4.
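
The ratio check is simple enough to write down. In this small Python sketch the 7900 XTX figures are the ones quoted in the post; the 6800 XT figures (288 TMUs, 72 Ray Accelerators) are not from the post but are what I recall from public spec listings, so treat them as an assumption. The helper name is made up.

```python
from fractions import Fraction

# (TMUs, Ray Accelerators) per card.
cards = {
    "7900 XTX": (384, 96),  # figures quoted in the post
    "6800 XT": (288, 72),   # assumption: recalled from public spec listings
}


def tmu_per_ra(tmus: int, ras: int) -> Fraction:
    """TMUs per Ray Accelerator as an exact fraction."""
    return Fraction(tmus, ras)


for name, (tmus, ras) in cards.items():
    print(f"{name}: {tmu_per_ra(tmus, ras)}:1")
# 7900 XTX: 4:1
# 6800 XT: 4:1
```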
 
This has not changed for RDNA 3 and isn't expected to change for RDNA 4. AMD doesn't have any dedicated hardware for RT; the intersection tests are also done using hybrid units, which can perform only one of these per clock cycle:
- 4 texture mapping ops (TMU)
- 4 ray-box intersection tests
- 1 ray-triangle intersection test (AMD counts this as one Ray Accelerator op)
source: https://www.reddit.com/r/Amd/comments/ic4bn1/amd_ray_tracing_implementation/

This can also be corroborated with aggregate numbers: the 7900 XTX, for example, has 384 TMUs and 96 Ray Accelerators (a 4:1 ratio), and the ratio must have been the same for the 6800 XT as well. RDNA 4 is not expected to bring a big structural change to the pipeline; they are expected to make minor tweaks, like the number of intersection tests that can be executed per clock cycle.

There's no separate mention of BVH hardware.
So in summary, they don't have a separate RT pipeline and no big changes are expected in RDNA 4.

I didn't read too much into the changes. But with the improvements they were talking about and the secret sauce that Sony is supposedly bringing, I just presumed they were throwing more dedicated hardware at the problem.

Now I am hoping that the rumour mill is wrong.

Thanks for the info.
 
Read first line, ignores rest... goes back to reading something interesting.

Rather than take this thread further off topic, go into the Official Ray Tracing thread and prove to me that AMD's current method is faster. And as a show of good faith, I have removed the offending line from my previous post.
 
But with the improvements they were talking about and the secret sauce that Sony is supposedly bringing
There is a possibility, as reported somewhere, that they theoretically double the RA throughput and the TMU:RA ratio falls to 2:1 in RDNA 4. Still, the overall architecture doesn't change much, and the scheduler has to work extra hard to maintain good utilisation because it's ultimately a common pipeline, which increases complexity.
 
At the end of the day, not beating the 7900 XTX isn't an issue if it's priced accordingly. You won't be able to charge over $500 for a GPU that can't beat the 7900 XTX in 2025.
 

Navi 48 RX 8800 will not be faster than Navi 31 RX 7900 XTX, it will be between RX 7900 XT and XTX performance.

What a weird clickbait headline. It's a choice they made and we already know about it; they are about 6 months late...

The only thing that matters is price. In the grand scheme of things, no one cares about GPUs that cost more than £500.
 