
AMD Navi 23 ‘NVIDIA Killer’ GPU Rumored to Support Hardware Ray Tracing, Coming Next Year

Status
Not open for further replies.
I'm more interested in how they're going to avoid starving the GPU of memory bandwidth.
This is the biggie for me. I fully believe that RDNA 2 has been designed to not need the traditional levels of memory bandwidth, but how AMD, Sony and Microsoft have done it is the juicy detail. Big-ass lump o' cache.
 
I'm more interested in how they're going to avoid starving the GPU of memory bandwidth.

Two possible approaches:

a. some arcane compression technology
b. the rumoured Infinity Cache (grain of salt), which would reduce cache misses and, in turn, fetches from and writes to VRAM

A more pragmatic outlook is to simply disregard the leaks about gimped bandwidth.
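As a back-of-envelope illustration of option b (every number here is made up; nothing below is AMD's actual design): if some fraction of memory accesses hit a fast on-die cache, the average bandwidth the shader cores see can be far higher than what the VRAM bus alone provides.

```python
# Illustrative model only: how a large on-die cache amplifies effective
# memory bandwidth. All figures are hypothetical.

def effective_bandwidth(vram_gbps: float, cache_gbps: float, hit_rate: float) -> float:
    """Average bandwidth seen by the GPU when a fraction `hit_rate` of
    accesses are served from cache and the rest go out to VRAM."""
    return hit_rate * cache_gbps + (1.0 - hit_rate) * vram_gbps

# A hypothetical 256-bit GDDR6 bus (~448 GB/s) plus an assumed fast on-die cache.
vram = 448.0
cache = 2000.0  # assumed on-die cache bandwidth, not a real spec

for hit in (0.0, 0.3, 0.6):
    print(f"hit rate {hit:.0%}: ~{effective_bandwidth(vram, cache, hit):.0f} GB/s effective")
```

Even a modest hit rate on the assumed cache would let a narrower VRAM bus behave like a much wider one, which is one way the leaked bandwidth figures could be misleading.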
 
This is the biggie for me. I fully believe that RDNA 2 has been designed to not need the traditional levels of memory bandwidth, but how AMD, Sony and Microsoft have done it is the juicy detail. Big-ass lump o' cache.

You can thank the Zen engineers for that; AMD moved some of them to the GPU division to help out with power efficiency and performance. Zen gets a lot of its performance from large amounts of cache, and now they're getting RDNA to do the same.
 
That's why I don't understand why people rushed to buy the 3080/90 so fast. What is the damage in waiting to see what the competition has to offer? It's amazing what bias toward a company or the need to have the best and have it right now can do.

Because people need new and shiny!!! This hobby is full of bloody magpies!!
 
This is the biggie for me. I fully believe that RDNA 2 has been designed to not need the traditional levels of memory bandwidth, but how AMD, Sony and Microsoft have done it is the juicy detail. Big-ass lump o' cache.

This patent potentially gives us a small insight into what they will be doing for RDNA 2.

AMD has just had a patent granted for Adaptive Cache Reconfiguration Via Cluster. The filing date was March 2019, so this could very well be in RDNA2.
https://www.freepatentsonline.com/y2020/0293445.html

Nerdtechgasms tweets about it

Here is a paper on the concept https://adwaitjog.github.io/docs/pdf/sharedl1-pact20.pdf

A snippet from the paper
 
They've obviously delayed the 3070 release to coincide with the RDNA 2 desktop launch.

Maybe what AMD have isn't much better than the RTX 3070 - but NV still don't know for sure. It also gives them a bit more time to ensure a smoother launch than the RTX 3080's.

The thing about any new cache AMD may use is that it won't make up for lower specs in other areas, such as using just 64 ROPs or having a lower-than-expected CU count.
 
They've obviously delayed the 3070 release to coincide with the RDNA 2 desktop launch.

Maybe what AMD have isn't much better than the RTX 3070 - but NV still don't know for sure. It also gives them a bit more time to ensure a smoother launch than the RTX 3080's.

I think you'll find it's the other way round: what AMD have to show is a lot better than Ampere. The reason AMD are showing off Zen 3 first is so RDNA 2 isn't bottlenecked by current CPUs, i.e. RDNA 2 will need a very strong CPU to balance out its gaming horsepower.
 
This patent potentially gives us a small insight into what they will be doing for RDNA 2.

I went through the last paper. It suggests the following changes to the cache architecture:
1. A shared L1 for all cores; contemporary architectures give each core its own private L1
2. No data replication across L1s, which is not the case currently
3. L1 cache addresses mapped per core - not needed currently
4. Inter-core communication/bandwidth optimisations for fetching data from the L1 cache - not needed currently
5. Software-based methods for reconfiguring the shared L1 into private mode to run 'non-shared-friendly workloads' - not needed currently

Overall it focuses on IPC improvements rather than memory bandwidth optimisation. It estimates a 22% IPC increase for 'shared-friendly applications', by minimising cache misses while utilising the cache more efficiently through eliminating replication.

Edit: it also mentions a 16X cache, though I haven't looked at that in enough detail to conclude anything noteworthy.
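Points 1 and 2 above can be sketched with a toy capacity model (my own illustration, not from the paper, and all numbers are invented): with private L1s, a line that several cores all touch gets duplicated into each of their caches, while a shared L1 stores one copy, so the same total silicon holds more unique data.

```python
# Toy model: effective unique-data capacity of private vs shared L1 caches.
# Illustrative only; parameters are arbitrary, not taken from the paper.

def effective_capacity(num_cores, l1_kib, shared_fraction, sharers, shared_l1):
    """Unique data (KiB) that fits across all L1s.

    shared_fraction: fraction of cached data that `sharers` cores all access.
    shared_l1: True for one L1 shared by all cores, False for private L1s.
    """
    total = num_cores * l1_kib
    if shared_l1:
        return total  # one copy of everything, no duplication
    # Private caches: each shared line is replicated in `sharers` caches,
    # so those slots hold only 1/sharers as much unique data.
    return total * ((1 - shared_fraction) + shared_fraction / sharers)

# 10 cores, 16 KiB L1 each, half the data shared by groups of 4 cores.
print(effective_capacity(10, 16, 0.5, 4, shared_l1=False))  # replication wastes capacity
print(effective_capacity(10, 16, 0.5, 4, shared_l1=True))   # full capacity usable
```

Under these made-up numbers the private configuration holds 100 KiB of unique data versus 160 KiB shared, which is the kind of effective-capacity gain (and hence miss-rate reduction) the paper's IPC estimate rests on.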
 
What? AMD would've released at the same time as their new CPUs if they could have. People will try to invent other reasons for the later RDNA 2 launch, but they're unlikely to be true; AMD would sell more CPUs and GPUs by combining the launches.

Modern CPUs aren't really causing a bottleneck - you can still get 60 fps in nearly all games, even at higher resolutions.
 
I went through the last paper. It suggests the following changes to the cache architecture:
1. A shared L1 for all cores; contemporary architectures give each core its own private L1
2. No data replication across L1s, which is not the case currently
3. L1 cache addresses mapped per core - not needed currently
4. Inter-core communication/bandwidth optimisations for fetching data from the L1 cache - not needed currently
5. Software-based methods for reconfiguring the shared L1 into private mode to run 'non-shared-friendly workloads' - not needed currently

Overall it focuses on IPC improvements rather than memory bandwidth optimisation. It estimates a 22% IPC increase for 'shared-friendly applications' due to the changed architecture.

Going by the comments under the tweet about it, it seems mostly focused on changes needed for RDNA 3, the one that should be multi-die.
 
I went through the last paper. It suggests the following changes to the cache architecture:
1. A shared L1 for all cores; contemporary architectures give each core its own private L1
2. No data replication across L1s, which is not the case currently
3. L1 cache addresses mapped per core - not needed currently
4. Inter-core communication/bandwidth optimisations for fetching data from the L1 cache - not needed currently
5. Software-based methods for reconfiguring the shared L1 into private mode to run 'non-shared-friendly workloads' - not needed currently

Overall it focuses on IPC improvements rather than memory bandwidth optimisation. It estimates a 22% IPC increase for 'shared-friendly applications' due to the changed architecture.

The IPC improvements come from fewer L1 cache misses. If you have fewer cache misses, you're requesting less data from VRAM.
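That link between misses and memory traffic is easy to see with a minimal cache simulator (a generic direct-mapped sketch of my own, not how any real GPU cache works): every miss becomes a request to the level below, ultimately VRAM, so a cache big enough to hold the working set turns repeat accesses into hits.

```python
# Minimal direct-mapped cache simulator: counts misses, i.e. requests that
# must be served by the memory level below. Sizes and trace are arbitrary.

def count_misses(addresses, num_lines, line_size=64):
    cache = [None] * num_lines          # stored block tag per cache line
    misses = 0
    for addr in addresses:
        block = addr // line_size       # which memory block this address is in
        idx = block % num_lines         # direct-mapped: one possible slot
        if cache[idx] != block:         # miss -> fetch from the level below
            cache[idx] = block
            misses += 1
    return misses

# Walk a 256-block working set four times over.
trace = [i * 64 for i in range(256)] * 4
print(count_misses(trace, 128))   # working set doesn't fit: every access misses
print(count_misses(trace, 512))   # fits: only the 256 cold misses remain
```

With the too-small cache every one of the 1024 accesses goes to memory; with the larger cache only the first pass does, a 4x cut in downstream traffic, which is exactly the effect a big on-die cache would have on VRAM bandwidth demand.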
 
Lots of people are already saying the RTX 3070 is crap (AMD fanboys maybe?). Not sure why; it looks like a pretty nice GPU to me. It's unusual for the lowest-end card announced to meet or exceed the performance of the last gen's top performer. It will be interesting to see the performance of overclocked models.

I know NV could also release a GPU for gaming at lower resolutions, perhaps this year, but I'm just going with what's been announced so far.
 
Lots of people are already saying the RTX 3070 is crap (AMD fanboys maybe?). Not sure why; it looks like a pretty nice GPU to me. It's unusual for the lowest-end card announced to meet or exceed the performance of the last gen's top performer.
Definitely :p
 