RDNA 3 rumours Q3/4 2022

The advances in interconnect tech are one factor that helps make many approaches to MCM GPUs possible, but there are complications with game rendering that you can't easily solve in hardware alone. You definitely can't do it via a Zen-like chiplet approach in hardware alone unless you somehow convince developers to all adopt explicit multi-adapter and rework their old games.

Which annoyingly means Nvidia would probably have to do it first, or the consoles??
 
More rumours; a bunch of leakers are reporting the same thing today:

* Navi 33 is an 80CU monolithic GPU
* Navi 31 is a 2 x 80CU MCM GPU

Performance

* Navi 31 is targeted to be 150% faster than the RX6800 in Rasterization
* Navi 31 is targeted to be 100% faster than the RX6800 in Ray Tracing

Architecture

* RDNA 3 doesn't contain fixed-function hardware for DirectML / Super Resolution / DLSS-style upscaling; Super Resolution / DirectML use the same pipeline as they do on RDNA 2

Considering generational upgrades for rasterization have been in the +30 to 40% range over the last few years, +150% would be pretty fantastic, unbelievable in fact. However, if this really is an MCM design, I think it's possible...

For ray tracing, Nvidia are 37% ahead overall according to Tom's Hardware's 10-game average. Doubling the 6900XT's RT performance (+100%) gives it 103 FPS vs the 3090's 71; 103 / 71 = 1.45 (+45%), so the 7900XT would be 45% ahead of the 3090.

Nvidia managed a 48% increase in RT performance over last gen's 2080TI (48 FPS): 71 / 48 = 1.479 (+48%).

So AMD's second-gen RT would gain twice as much over its first gen as Nvidia achieved. Catchee catchee monkey?
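The arithmetic is easy to sanity-check; a minimal sketch, assuming only the Tom's Hardware figures above (71 FPS for the 3090, the 37% overall lead, 48 FPS for the 2080TI):

```python
# Sanity check of the RT maths above.
fps_3090 = 71.0
fps_6900xt = fps_3090 / 1.37             # ~51.8 FPS implied by the 37% gap

fps_doubled = fps_6900xt * 2.0           # rumoured +100% RT for RDNA 3
print(f"doubled 6900XT: {fps_doubled:.0f} FPS")                # ~104 (rounded to 103 above)
print(f"vs 3090: +{(fps_doubled / fps_3090 - 1) * 100:.0f}%")  # ~+46% (+45% above)

# Nvidia's own gen-on-gen RT gain, 2080TI (48 FPS) -> 3090 (71 FPS):
print(f"Turing -> Ampere: +{(71 / 48 - 1) * 100:.0f}%")        # +48%
```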

PS, if your raw rasterization performance is so huge, why bother with your own DLSS? It would nullify it.

 
I bet on 2 times the RT performance of RDNA2 and up to 40% more raster performance, considering they will use a smaller node and they can increase the power used by the cards. If they don't go full berserk on the power usage, I bet on 25% more raster performance and up to 100% more in RT.
Nvidia will have smaller gains unless they improve perf/watt a lot. But since they are already ahead in RT and they control the PC gaming industry, even the gain from the smaller node transition will be enough to keep them on top.
 
The advances in interconnect tech are one factor that helps make many approaches to MCM GPUs possible, but there are complications with game rendering that you can't easily solve in hardware alone. You definitely can't do it via a Zen-like chiplet approach in hardware alone unless you somehow convince developers to all adopt explicit multi-adapter and rework their old games.

Agree. Based on my reading, CoWoS and EMIB can carry exceptionally wide and high-speed buses, to the point that independent chiplets are nearly working as a single chip. I've seen it described as ending up with a die for each functional block in a "GPU", which is actually formed of several chiplets. I'm amazed that it can be done at volume, but I guess it makes economic sense with the defect rates on cutting-edge nodes and the cost per transistor not falling as it used to.

The Zen 3 interconnect approach, a fast set of serial links, is a non-starter in the inherently wide and parallel world of GPUs.
 
If these performance rumours are true and AMD get it out on time and before the competition, then Nvidia could be in trouble.
 
I bet on 2 times the RT performance of RDNA2 and up to 40% more raster performance, considering they will use a smaller node and they can increase the power used by the cards. If they don't go full berserk on the power usage, I bet on 25% more raster performance and up to 100% more in RT.
Nvidia will have smaller gains unless they improve perf/watt a lot. But since they are already ahead in RT and they control the PC gaming industry, even the gain from the smaller node transition will be enough to keep them on top.

The 2080TI has 68 RT cores and the 3090 has 82; that difference, plus some improvement in IPC / clock speed of those RT cores, is where Nvidia are getting their 48% higher RT performance.

IF Navi 31 (Giant Navi?) is an MCM design, we are talking about two RX 6900XTs with 80 RT cores each; that's where the +100% or 2x RT performance comes from, they are doubling the RT cores: 80 to 160.

The +150% (2.5x) rasterization also comes from doubling the 6900XT, so 80 CUs to 160 CUs; the extra .5 could come from increasing the IPC / clock speed of those CUs by 25% each.
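As a sketch of that decomposition (the 25% per-CU uplift is my assumption above, not a leaked figure):

```python
# Rumoured +150% raster decomposed: 2x the CUs times an assumed
# ~25% IPC / clock gain per CU.
cu_scaling = 160 / 80           # two 80CU dies vs one
per_cu_gain = 1.25              # assumed per-CU uplift
total = cu_scaling * per_cu_gain
print(f"{total:.1f}x total, i.e. +{(total - 1) * 100:.0f}%")   # 2.5x, +150%
```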
 
Agree. Based on my reading, CoWoS and EMIB can carry exceptionally wide and high-speed buses, to the point that independent chiplets are nearly working as a single chip. I've seen it described as ending up with a die for each functional block in a "GPU", which is actually formed of several chiplets. I'm amazed that it can be done at volume, but I guess it makes economic sense with the defect rates on cutting-edge nodes and the cost per transistor not falling as it used to.

The Zen 3 interconnect approach, a fast set of serial links, is a non-starter in the inherently wide and parallel world of GPUs.

Problem is, even if you get all that working - and with the advances in substrate technology and semiconductor nodes it is possible - you still have all the problems which plagued Crossfire and SLI. Even super-fast interconnects and direct memory access across boundaries don't solve them. ATI/AMD tried it a couple of times before with sideport-type functionality, and nVidia made some attempts with the NF200 bridge chip to shortcut transactions between GPUs; the resulting performance increases are <10% on top of whatever scaling you normally get from that kind of multi-adapter approach.

What you really need is some kind of dynamic resource pooling approach rather than fixed chiplets, etc.
 
Apple are moving on to 3nm towards the end of this year, at which point AMD gets a big slice, if not all, of 5nm production. TSMC are also rapidly expanding 5nm production, so it's almost a certainty that RDNA3 will be 5nm; just when is the question, as they might favour Zen4 I guess...


Apple are still the main buyer (by a lot) of 5nm - 3nm risk production isn't due to start till end 2021 at the very earliest, for an Apple 2023 product launch; we also know that TSMC 5nm is two-thirds more expensive per wafer than 7nm.
 
The 2080TI has 68 RT cores and the 3090 has 82; that difference, plus some improvement in IPC / clock speed of those RT cores, is where Nvidia are getting their 48% higher RT performance.

IF Navi 31 (Giant Navi?) is an MCM design, we are talking about two RX 6900XTs with 80 RT cores each; that's where the +100% or 2x RT performance comes from, they are doubling the RT cores: 80 to 160.

The +150% (2.5x) rasterization also comes from doubling the 6900XT, so 80 CUs to 160 CUs; the extra .5 could come from increasing the IPC / clock speed of those CUs by 25% each.
You can't calculate like that; what about the power draw? Will the 7900XT use 700W? If we double everything then we should also double the power usage (minus some watts from the better node) :)
Also, I don't think the 80 CUs from one generation are the same as the 80 CUs from the next generation, so you can't calculate like this: this one has 80, the next one has 160, so it's 2 times the performance.
It's not true even within the same generation: the 6900XT with its 80 CUs will not give you twice the performance of the 6700XT with its 40.
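A toy model of that point, a minimal sketch assuming (purely for illustration) that 10% of each frame doesn't scale with CU count (fixed-function work, bandwidth, sync):

```python
# Why 2x the CUs rarely means 2x the FPS: Amdahl's law with an
# illustrative 10% of frame time that doesn't scale with CUs.
def speedup(cu_ratio, non_scaling=0.10):
    return 1.0 / (non_scaling + (1.0 - non_scaling) / cu_ratio)

print(f"{speedup(2.0):.2f}x")   # ~1.82x from doubling the CUs, not 2.0x
```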

I expect the RT performance to be increased a lot by simply adding a lot more RT cores. AMD has the power budget to do this. Nvidia is already using a lot of power, so for their performance to increase a lot they will need to improve perf/watt; they can't just add a lot more hardware.
 
I expect the RT performance to be increased a lot by simply adding a lot more RT cores. AMD has the power budget to do this. Nvidia is already using a lot of power, so for their performance to increase a lot they will need to improve perf/watt; they can't just add a lot more hardware.
I can put it like this: if both AMD and Nvidia make a 450-500W card, AMD has a lot more room to add more hardware and close the gap to Nvidia. Going down the power-efficiency route again would be a mistake, because Nvidia can't afford for their next gen to be power efficient - it would have very small gains over Ampere. RDNA2's gains in power efficiency can be an advantage for AMD if they push the limit of power usage with their next gen.
 
I can put it like this: if both AMD and Nvidia make a 450-500W card, AMD has a lot more room to add more hardware and close the gap to Nvidia. Going down the power-efficiency route again would be a mistake, because Nvidia can't afford for their next gen to be power efficient - it would have very small gains over Ampere. RDNA2's gains in power efficiency can be an advantage for AMD if they push the limit of power usage with their next gen.


Unless Nvidia use Samsung's 7nm for the next card, as they won't pay TSMC prices.
 
Apple are still the main buyer (by a lot) of 5nm - 3nm risk production isn't due to start till end 2021 at the very earliest, for an Apple 2023 product launch; we also know that TSMC 5nm is two-thirds more expensive per wafer than 7nm.

They're also roughly doubling the output of N5, so there will be some capacity free, and the rumours here seem to be H2 2022 for RDNA3, which roughly matches the start of N3 mass production for Apple.

As for wafer cost, whilst not ideal it doesn't really change anything; certainly on the CPU side, and probably even the GPU side, they'll sell everything they can produce. Sticking on 7nm for either is a bad plan: cheaper, but unlikely to yield enough extra performance for a next-gen product, unless they release a GPU consuming more than the 3090... when suddenly power consumption will be important again :p
 
You can't calculate like that; what about the power draw? Will the 7900XT use 700W? If we double everything then we should also double the power usage (minus some watts from the better node) :)
Also, I don't think the 80 CUs from one generation are the same as the 80 CUs from the next generation, so you can't calculate like this: this one has 80, the next one has 160, so it's 2 times the performance.
It's not true even within the same generation: the 6900XT with its 80 CUs will not give you twice the performance of the 6700XT with its 40.

I expect the RT performance to be increased a lot by simply adding a lot more RT cores. AMD has the power budget to do this. Nvidia is already using a lot of power, so for their performance to increase a lot they will need to improve perf/watt; they can't just add a lot more hardware.

The 6700XT is the same generation as the 6900XT.

The 6700XT has the same 40 CUs as the 5700XT and is 33% faster; the 6900XT is 112% faster than the 5700XT (figures from TPU). So AMD doubled the performance going from 40 CUs to 80 CUs, with 12% on top of that.

The 3090 uses about 20% more power than the 6900XT, so clearly AMD have 20% of power headroom to grow into what people think is acceptable, and going from 7nm to 5nm alone will reduce the power of each 80CU chip. The 6900XT uses about the same power as the 5700XT, perhaps a little more, despite having twice as many CUs and being more than twice as fast.
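Those ratios are quick to check; a minimal sketch using only the TPU-derived figures above (5700XT = 1.0 baseline):

```python
# Scaling vs the 5700XT (40 CUs, RDNA 1, baseline = 1.0).
r_6700xt = 1.33    # 40 CUs, RDNA 2: +33%
r_6900xt = 2.12    # 80 CUs, RDNA 2: +112%

# Doubling the CUs delivered a bit more than double the performance,
# at roughly the same board power and on the same 7nm-class node,
# so perf/watt roughly doubled as well.
print(f"6900XT vs 5700XT: {r_6900xt:.2f}x perf with 2x the CUs")
print(f"per-CU architectural uplift (same CU count): +{(r_6700xt - 1) * 100:.0f}%")
```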
 
They're also roughly doubling the output of N5, so there will be some capacity free, and the rumours here seem to be H2 2022 for RDNA3, which roughly matches the start of N3 mass production for Apple.

As for wafer cost, whilst not ideal it doesn't really change anything; certainly on the CPU side, and probably even the GPU side, they'll sell everything they can produce. Sticking on 7nm for either is a bad plan: cheaper, but unlikely to yield enough extra performance for a next-gen product, unless they release a GPU consuming more than the 3090... when suddenly power consumption will be important again :p


Output has to come from somewhere, and whilst they are trying to build fabs as fast as they can, it still takes both time and money (GN covered this a week or so ago). All the noise from the USA and EU about local fabs - all the companies want money behind the demands. AMD have been on 7nm for a second generation, and TSMC are already running N7P and N7+, which I can honestly see AMD using as the 4xxx refreshes, until the MCM cards are ready. The question will be: a 7% improvement from N7P (or 10% less power), or redesign for N7+ and get 10%+ (or 15% less power). N7+ is EUV (N7P is DUV).

edit

Brain freeze today from an early start - RDNA 2 is on N7P, which would explain some of the improvement anyway. So I would 'speculate' N7+ for RDNA3, as costs can be reduced.
 
@nvidiamd how did AMD more than double the performance of the 5700XT at similar power consumption on the same node? How did they do that?

How is it that a 16-core CPU can exist with higher per-core performance than its 8-core competitor at half the power consumption? That's 4 times the performance per watt; how is that possible?

What you say cannot be done has been done more than once.
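The CPU example is just arithmetic; a minimal sketch, assuming per-core performance at least equal:

```python
# Perf/watt behind the 16-core example: 2x cores at (at least)
# equal per-core performance, at half the power draw.
perf = 16 / 8          # >= 2x total performance
power = 0.5            # half the power consumption
print(f"perf/watt: {perf / power:.0f}x")   # 4x
```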
 
Output has to come from somewhere, and whilst they are trying to build fabs as fast as they can, it still takes both time and money (GN covered this a week or so ago). All the noise from the USA and EU about local fabs - all the companies want money behind the demands. AMD have been on 7nm for a second generation, and TSMC are already running N7P and N7+, which I can honestly see AMD using as the 4xxx refreshes, until the MCM cards are ready. The question will be: a 7% improvement from N7P (or 10% less power), or redesign for N7+ and get 10%+ (or 15% less power). N7+ is EUV (N7P is DUV).

edit

Brain freeze today from an early start - RDNA 2 is on N7P, which would explain some of the improvement anyway. So I would 'speculate' N7+ for RDNA3, as costs can be reduced.

TSMC will have multiple fabs, or even areas within fabs, on very old processes; generally speaking these are upgraded over time, so the extra N5 output doesn't necessarily require any new buildings, 'just' the equipment.

There's also N6 as well :p I wouldn't be surprised to see a 'Zen3+' CPU on N7x or N6, but I doubt RDNA3 will be, even if that means RDNA3 is 'late'; fundamentally the GPU business is tiny to AMD, and Zen4 will be the priority if capacity is limited at all (just like now: Consoles > Zen3 > RDNA2).
 