Speculation - AMD isn't going to be able to offer really competitive PC (4K) or console performance until they have GPUs with >100 Compute Units

RDNA2 maxed out at 80 CUs.
RDNA3 maxed out at 96 CUs.

On consoles, the Series X GPU has 52 Compute Units.

Generally speaking, AMD has been improving performance as the number of CUs is scaled up with RDNA3, but they seem to have hit a wall at 96 CUs due to power constraints (already 355w for the RX 7900 XTX).

It makes sense (assuming the scaling is decent) that a GPU with 2x the compute units of the Series X console GPU would provide very good 4K performance on both consoles and desktop graphics cards, particularly because the RX 7900 XTX already performs well (94 FPS 1% lows at 4K in most games, according to Techspot's review: https://static.techspot.com/articles-info/2601/bench/4K-p.webp).
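As a rough sanity check, here's the back-of-envelope maths, assuming throughput scales more or less linearly with CUs x clock (real games only approximate this, and the 104 CU / 2.3GHz part is purely hypothetical):

```python
# Back-of-envelope only: assumes throughput scales linearly with CUs x clock,
# which real games only approximate. The 104 CU / 2.3 GHz part is hypothetical.
series_x_cus, series_x_clock = 52, 1.825      # Series X figures quoted above
candidate_cus, candidate_clock = 104, 2.3     # hypothetical "2x Series X" GPU

scale = (candidate_cus * candidate_clock) / (series_x_cus * series_x_clock)
print(f"Estimated throughput vs Series X: {scale:.2f}x")   # ~2.52x
```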

It seems likely that they will only be able to accomplish this with RDNA4, which will offer a die shrink to either a 3 or 4nm TSMC fabrication process, allowing AMD to reduce the power consumption a considerable amount. The main problem with RDNA3 is that on 5nm, it doesn't appear to scale all that much higher than RDNA2.

It's a good bet that 100 CU or greater GPUs will be a thing, even at the upper mid end (e.g. same tier as the 6800 XT), but I think it's much more likely to happen if they are able to use TSMC's future 3nm fabrication process.

It is true that the clock rate can also be scaled up on desktop GPUs (compared to the Series X GPU, which is already running at 200w clocked at 1825MHz), but based on AMD's analysis, it increases power consumption more than might be considered desirable.
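This isn't AMD's actual analysis, but the standard dynamic-power relation gives a feel for why chasing clocks gets expensive. The linear voltage-vs-clock assumption and the 200w baseline (the whole-console figure above) are both rough, so treat the output as illustrative only:

```python
# Toy CMOS dynamic-power model: P ~ C * V^2 * f. Assumes voltage must rise
# roughly in line with clock over this range, so power grows ~cubically.
# The 200 W / 1825 MHz baseline is the Series X figure quoted above.
def power_at(clock_mhz, base_clock_mhz=1825, base_power_w=200):
    ratio = clock_mhz / base_clock_mhz
    return base_power_w * ratio**3            # V^2 * f, with V proportional to f

print(f"{power_at(2500):.0f} W")              # ~514 W with this crude model
```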
 
I think probably the most exciting prospect for AMD is to produce a console GPU with 2x the performance of the Series X, whilst keeping power consumption within an acceptable range. That should allow them to deliver 60 FPS at 4K natively in existing console games, assuming games receive updates to cope with the new console hardware.

Presumably with further improved ray tracing, which has already had a nice boost with RDNA3.

Still, RDNA4 is only predicted to use an 'Advanced node' at the moment, so AMD doesn't yet seem confident enough to say 3nm will be used.
 
They could have done more now, but seem not to have bothered. The Navi 31 GCD is only 300mm² and made on TSMC 5nm. They had access to the same TSMC 4N process as Nvidia, but chose to make APUs on it instead.
 
If we ignore prayers that a company will slash its profit margin, they're not going to "compete" unless they greatly increase the maximum price they're willing to sell cards for.
 
If we ignore prayers that a company will slash its profit margin, they're not going to "compete" unless they greatly increase the maximum price they're willing to sell cards for.
AMD's developments tend to be linked to console hardware these days, with the exception of 'Infinity Cache' on desktop GPUs.
 
I wouldn't be surprised if they delayed RDNA4 until TSMC's 3nm can provide good yields of GPU dies, whilst also being affordable enough to mass produce for console GPUs + desktop cards.

Or, they could use 4nm for lower end/mid tier GPUs, and 3nm for the high end GPUs.

TSMC's 4nm isn't considered a 'full node', is it? It's an incremental improvement of 5nm, presumably with more EUV layers.
 
A TSMC process called N3E is apparently being readied for volume production in 2023.

The denser N3 process is already due in the first half of 2023.

But it seems they are reaching the limit of what can be accomplished with FinFET transistors.

One advantage of the N3E process seems to be '“very different” design rules intended to improve yield' (link: https://fuse.wikichip.org/news/7048/n3e-replaces-n3-comes-in-many-flavors/).

It's speculated that "Long term, the initial N3 node will likely fade into obscurity".
 
This is quite an interesting slide about the N3E process:

[image: tsmc-n3e-wc.png]
It suggests that if the GPU used in the RX 7900 XTX had an N3E equivalent, its power usage could be reduced to 230-248 watts (down from 355w).
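Working backwards from those numbers, the slide is presumably claiming roughly 30-35% lower power at the same speed versus N5 (that's my assumption, and board power also includes memory/VRM losses that wouldn't shrink with the logic):

```python
# Reverse-engineering the range above: 355 W only drops to 230-248 W if the
# slide is claiming roughly 30-35% lower power at the same speed (my assumption).
tbp_n5 = 355                                  # RX 7900 XTX board power
for reduction in (0.30, 0.35):
    print(f"{reduction:.0%} less power -> {tbp_n5 * (1 - reduction):.0f} W")
# 30% -> 248 W, 35% -> 231 W, matching the 230-248 W range
```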
 
This is quite an interesting slide about the N3E process:

[image: tsmc-n3e-wc.png]
It suggests that if the GPU used in the RX 7900 XTX had an N3E equivalent, its power usage could be reduced to 230-248 watts (down from 355w).

It'll be interesting to see if we'll reach the physical limits of silicon any time soon and will have to move onto some other material.
 
It does beg the question: why not just keep scaling up RDNA3 (more compute units) with the new N3E fabrication process?

Maybe it's not worth it at the moment (due to production costs?); perhaps they'd rather wait and combine this upgrade with RDNA4 GPUs.
 
They could have done more now, but seem not to have bothered. The Navi 31 GCD is only 300mm² and made on TSMC 5nm. They had access to the same TSMC 4N process as Nvidia, but chose to make APUs on it instead.

I agree with this. I don't think them stopping at 96 CUs is any sort of design limitation, it's a choice; it just so happens to be pretty much exactly 300mm².

No, AMD's whole thing is making things efficiently; they are obsessed by it, in a good way. It's how they were first to MCM x86 CPUs, 3D stacking and now MCM GPUs too.
Breaking what would be a large monolithic die into multiple smaller chunks means you get better wafer yields; past a certain size, more of the dies become defective. Once you know that, you realise AMD chose to make the logic die 300mm² and no larger.
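A toy defect-density model shows the idea (the 0.1 defects/cm² figure is purely illustrative, not a TSMC number):

```python
import math

# Toy Poisson yield model: fraction of defect-free dies = exp(-area * defect_density).
# The 0.1 defects/cm^2 figure is purely illustrative, not a TSMC number.
def die_yield(area_mm2, defects_per_cm2=0.1):
    return math.exp(-(area_mm2 / 100) * defects_per_cm2)

for area in (300, 600):
    print(f"{area} mm^2 -> {die_yield(area):.0%} defect-free")
# 300 mm^2 -> ~74%, 600 mm^2 -> ~55%: smaller chiplets keep more of the wafer usable
```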

Right now Nvidia's philosophy is more about piling on more and more shaders, making the dies larger and larger. AMD tries not to grow die sizes; instead they grow performance gen on gen by increasing IPC and/or clock speeds efficiently.
I'll give you an example.

RX 5700 XT: 2560 shaders @ 1.9GHz, 251mm² on 7nm, 225 watts, roughly equal to an RTX 2070
RX 6700 XT: 2560 shaders @ 2.6GHz, 335mm² on 7nm, 230 watts, roughly equal to an RTX 2080 Ti

It's grown in size because of the 96MB of L3 cache, but critically they are both on the same 7nm, and despite the die actually growing a bit and the card being about 40% faster, the power consumption is near identical.
That's purely architectural engineering.
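Quick check of that claim, taking the "~40% faster" figure above at face value:

```python
# Quick check of the claim above, taking "~40% faster" at face value.
perf_5700xt, watts_5700xt = 1.00, 225
perf_6700xt, watts_6700xt = 1.40, 230

gain = (perf_6700xt / watts_6700xt) / (perf_5700xt / watts_5700xt)
print(f"Perf/W gain on the same 7nm node: {gain:.2f}x")   # ~1.37x
```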

They did the same thing with Zen 2 vs Zen 3: very different per-core performance between those two CPUs, yet the power consumption is near identical and they are again on the same 7nm. Again, purely architectural engineering.

So will the next GPU get more CUs? Not unless AMD can squeeze them into the same space; what they will do is find other ways of increasing performance.
There is reportedly an RDNA 3.5 or 3+ in the oven clocking way over 3GHz efficiently, something that IMO RDNA3 was meant to do, but something went wrong and they couldn't get it fixed in time before it had to launch; whatever it was is now reportedly fixed.
 
Well, it wouldn't surprise me if AMD ends up using a custom variant of the N3E process for RDNA4.

Power usage remains their main issue, so I imagine they'd want even more optimisation there.
 
It looks like they are targeting 2H 2024 for RDNA4:

[image: 2022-06-10_2-31-13-740x416.png.webp]
AMD tends to be pretty straightforward about its release schedule.
 
Well, it wouldn't surprise me if AMD ends up using a custom variant of the N3E process for RDNA4.

Power usage remains their main issue, so I imagine they'd want even more optimisation there.

I wouldn't call power consumption an "issue" :)

RTX 4080: 16GB, 304 watts, 100% performance
7900XTX: 24GB, 356 watts, 105% performance
RTX 4090: 24GB, 411 watts, 127% performance

The 4080 has about 15% better performance per watt vs the 7900XTX
The 4090 about 5% better performance per watt
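Running the listed numbers through a quick perf-per-watt calculation (the percentages and wattages above are the only inputs):

```python
# Perf/W from the numbers listed above (RTX 4080 = 100% performance).
cards = {"RTX 4080": (100, 304), "7900 XTX": (105, 356), "RTX 4090": (127, 411)}
perf_per_watt = {name: perf / watts for name, (perf, watts) in cards.items()}

for name in ("RTX 4080", "RTX 4090"):
    ratio = perf_per_watt[name] / perf_per_watt["7900 XTX"]
    print(f"{name} vs 7900 XTX: {ratio:.2f}x perf/W")
# ~1.12x for the 4080 and ~1.05x for the 4090 with these figures
```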

Quite a lot of tech journalists talked about how efficient Ada is, but that's just them throwing the compulsory praise Nvidia's way so that Nvidia doesn't discipline them.
 