* The AMD RDNA 4 Rumour Mill *

Gerard · 2024-12-22T23:17:24+0000

Killer7xx said:
They kinda did with the 6900XT but blew it by not launching it with the 6950XT TDP and clocks. But I do hope that UDNA is a successful venture for them at the high end. Considering how their CDNA team has been able to match Nvidia's hardware almost blow for blow in the datacenter with the MI series GPUs, recombining the two teams together should yield better results than now.

He's referring to the naming conventions. Performance-wise, in raster the 6900 XT and 6950 XT traded blows with the 3090 and 3090 Ti.

mahius · 2024-12-22T23:22:23+0000

Legion said:
AMD's next-generation RDNA 4 flagship is allegedly labeled as the Radeon RX 9070 XT

AMD Radeon RX 9070 XT Is Allegedly The Top RDNA 4 GPU, Red Team Goes With Radeon 9000 Branding

AMD's next-generation RDNA 4 flagship is allegedly labeled as the Radeon RX 9070 XT as the red team adopts the Radeon 9000 branding.

wccftech.com

*facepalm*

I sure hope that's a typo. This dumb naming is one of the reasons why Nvidia is able to appeal to the masses; at least they've been smart and consistent with naming. The average joe knows that an x70 is mid-tier and an x90 is top-end. Meanwhile AMD keeps changing every few generations, so the average joe has no clue what's what when it comes to AMD tiers.

Just call it 8700xt or 8800xt and be done with it. Do a throw back to the good old days, back when 7970 was kicking rear.

jimhaumman · 2024-12-22T23:28:35+0000

RX 9070xt <RX 7900xt -$449-649
RX 9070xt >= RX 7900 gre
RX 9070=RX 7800 xt -$179-349

Grim5 · 2024-12-22T23:29:51+0000

mahius said:
*facepalm*

I sure hope that's a typo. This dumb naming is one of the reasons why Nvidia is able to appeal to the masses; at least they've been smart and consistent with naming. The average joe knows that an x70 is mid-tier and an x90 is top-end. Meanwhile AMD keeps changing every few generations, so the average joe has no clue what's what when it comes to AMD tiers.

Just call it 8700xt or 8800xt and be done with it. Do a throw back to the good old days, back when 7970 was kicking rear.

8000 is for rdna3.5
9000 is for rdna4

This is the reason for jumping to 9000

JediFragger · 2024-12-22T23:51:33+0000

Killer7xx said:
They kinda did with the 6900XT but blew it by not launching it with the 6950XT TDP and clocks.

True, but the only reason they had the opportunity was because Nvidia dropped the ball by using the trash Samdung 8nm (more like 10nm) process to save some $$$. Otherwise the Ampere would have been out of sight on the TSMC node.

muon · 2024-12-23T00:04:26+0000

humbug said:
That's very narrow performance testing and it reads more like Nvidia cope marketing for their investors.

The most powerful supercomputer on earth has AMD GPUs, not Nvidia. The people who build these billion-dollar computers don't take their advice from Forbes.

The most powerful supercomputers in the world aren't the listed ones used in research, science, etc., which are worried about cost. The traditional supercomputer is dead in terms of being the most powerful computing array.

They are now the massive farms owned by OpenAI, Meta, Microsoft, Amazon, Baidu etc. Guess what they all use. They don't worry about cost and spend tens of billions on their farms.

To further highlight the point:

https://www.amd.com/en/newsroom/pre...rts-third-quarter-2024-financial-results.html
https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-third-quarter-fiscal-2025

AMD reported $3.5bn in data center revenue (Instinct and Epyc) for 2024 Q3. Nvidia reported $30.8bn in data center revenue for its fiscal 2025 Q3. Remember, people spending tens of billions don't take their advice from Forbes, like you said.

Grim5 · 2024-12-23T04:13:23+0000

humbug said:
That's very narrow performance testing and it reads more like Nvidia cope marketing for their investors.

MI300X vs H100 vs H200 Benchmark Part 1: Training – CUDA Moat Still Alive

Intro SemiAnalysis has been on a five-month long quest to settle the reality of MI300X. In theory, the MI300X should be at a huge advantage over Nvidia’s H100 and H200 in terms of specifications an…

semianalysis.com

As per this review that was just released: going by AMD's marketing and GPU specs, the MI300X should be a slam dunk against anything Nvidia has. Yet as the review shows, reality is very different and it lags significantly behind Nvidia, and that's after AMD sent teams of engineers to update software to improve on-site performance, against Nvidia's out-of-the-box performance with no engineers required.

Key Findings

Comparing on paper FLOP/s and HBM Bandwidth/Capacity is akin to comparing cameras by merely examining megapixel count. The only way to tell the actual performance is to run benchmarking.
Nvidia’s Out of the Box Performance & Experience is amazing, and we did not run into any bugs during our benchmarks. Nvidia tasked a single engineer to us for technical support, but we didn’t run into any Nvidia software bugs as such we didn’t need much support.
AMD’s Out of the Box Experience is very difficult to work with and can require considerable patience and elbow grease to move towards a usable state. AMD's stable releases of AMD PyTorch is still broken and we needed workarounds.
If we weren’t supported by multiple teams of AMD engineers triaging and fixing bugs in AMD software that we ran into, AMD’s results would have been much lower than Nvidia’s.
For AMD, Real World Performance is nowhere close to its on paper marketed TFLOP/s.
Training performance is weaker, as demonstrated by the MI300X's matrix multiplication micro-benchmarks, and still lags that of Nvidia's H100 and H200.
AMD’s training performance is also held back as the MI300X does not deliver strong scale out performance. This is due to its weaker ROCm Compute Communication Library (RCCL) and AMD’s lower degree of vertical integration with networking and switching hardware compared to Nvidia’s strong integration of its Nvidia Collective Communications Library (NCCL), InfiniBand/Spectrum-X network fabric and switches.

CAT-THE-FIFTH · 2024-12-23T04:32:23+0000

Grim5 said:
MI300X vs H100 vs H200 Benchmark Part 1: Training – CUDA Moat Still Alive

Intro SemiAnalysis has been on a five-month long quest to settle the reality of MI300X. In theory, the MI300X should be at a huge advantage over Nvidia’s H100 and H200 in terms of specifications an…

semianalysis.com

Key Findings

Comparing on paper FLOP/s and HBM Bandwidth/Capacity is akin to comparing cameras by merely examining megapixel count. The only way to tell the actual performance is to run benchmarking.

Nvidia’s Out of the Box Performance & Experience is amazing, and we did not run into any Nvidia specific bugs during our benchmarks. Nvidia tasked a single engineer to us for technical support, but we didn’t run into any Nvidia software bugs as such we didn’t need much support.

AMD’s Out of the Box Experience is very difficult to work with and can require considerable patience and elbow grease to move towards a usable state. On most of our benchmarks, Public AMD stable releases of AMD PyTorch is still broken and we needed workarounds.

If we weren’t supported by multiple teams of AMD engineers triaging and fixing bugs in AMD software that we ran into, AMD’s results would have been much lower than Nvidia’s.

We ran unofficial MLPerf Training GPT-3 175B on 256 H100 in collaboration with Sustainable Metal Cloud to test the effects of different VBoost setting

For AMD, Real World Performance on public stable released software is nowhere close to its on paper marketed TFLOP/s. Nvidia’s real world performance also undershoots its marketing TFLOP/s, but not by nearly as much.

The MI300X has a lower total cost of ownership (TCO) compared to the H100/H200, but training performance per TCO is worse on the MI300X

Training performance is weaker, as demonstrated by the MI300X's matrix multiplication micro-benchmarks, and still lags that of Nvidia's H100 and H200.

MI300X performance is held back by AMD software.

AMD’s training performance is also held back as the MI300X does not deliver strong scale out performance. This is due to its weaker ROCm Compute Communication Library (RCCL) and AMD’s lower degree of vertical integration with networking and switching hardware compared to Nvidia’s strong integration of its Nvidia Collective Communications Library (NCCL), InfiniBand/Spectrum-X network fabric and switches.

Not sure what data centre has got to do with gaming cards, but it appears Oracle and IBM are fine with the AMD offerings:

AMD lands yet another major cloud deal as Oracle adopts thousands of Instinct MI300X GPUs to power new AI supercluster

Oracle Cloud Infrastructure will add the AI accelerator to its selection of bare metal instances

www.techradar.com

IBM Partners With AMD; Expands AI Accelerator Offerings

Companies launching AMD Instinct MI300X accelerators as a service in 2025

aibusiness.com

Vultr recently announced that it had ordered “thousands” of MI300X units, and now Oracle Cloud Infrastructure (OCI) says it has adopted AMD’s hardware for its new OCI Compute Supercluster instance, BM.GPU.MI300X.8.

The new supercluster is designed for massive AI models containing billions of parameters and supports up to 16,384 GPUs in a single cluster. This setup leverages the same high-speed technology used by other OCI accelerators, enabling large-scale AI training and inference with the memory capacity and throughput required for the most demanding tasks. The configuration makes it particularly suited for LLMs and complex deep learning operations.

The companies plan to make AMD Instinct MI300X accelerators available as a service to enterprise clients on IBM Cloud in the first half of 2025 for generative AI inferencing workloads.

The accelerators will integrate with IBM's watsonx AI and data platform to provide additional AI infrastructure resources for scaling AI workloads across hybrid cloud environments.

The larger customers with dedicated teams who have their own software ecosystems are buying whatever AMD is making. That means, longer term, future AMD cards will integrate far more easily into their ecosystems. A bit like how Sony works with AMD on its consoles.

The smaller teams probably are relying more on Nvidia for support and nobody is shocked if Nvidia has mature software.

Anyway, if you are a gamer, you shouldn't be cheering on any of these companies doing "well" in AI. It only means that the gaming cards will cost more, be delayed more, have less VRAM and be more cut down.

Despite all the "loyalty" marketing these companies make, the reality is gamers are treated as second-rate now. Mining, AI, supercomputers, commercial sales, etc. all seem to get priority over gamers now. If they do worse in those markets, it's better for gamers. Not sure why people are trying to flex non-gaming sales. It doesn't help us!!

humbug · 2024-12-23T06:04:43+0000

Grim5 said:
MI300X vs H100 vs H200 Benchmark Part 1: Training – CUDA Moat Still Alive

Intro SemiAnalysis has been on a five-month long quest to settle the reality of MI300X. In theory, the MI300X should be at a huge advantage over Nvidia’s H100 and H200 in terms of specifications an…

semianalysis.com

As per this review that was just released: going by AMD's marketing and GPU specs, the MI300X should be a slam dunk against anything Nvidia has. Yet as the review shows, reality is very different and it lags significantly behind Nvidia, and that's after AMD sent teams of engineers to update software to improve on-site performance, against Nvidia's out-of-the-box performance with no engineers required.

Key Findings

Comparing on paper FLOP/s and HBM Bandwidth/Capacity is akin to comparing cameras by merely examining megapixel count. The only way to tell the actual performance is to run benchmarking.

Nvidia’s Out of the Box Performance & Experience is amazing, and we did not run into any bugs during our benchmarks. Nvidia tasked a single engineer to us for technical support, but we didn’t run into any Nvidia software bugs as such we didn’t need much support.

AMD’s Out of the Box Experience is very difficult to work with and can require considerable patience and elbow grease to move towards a usable state. AMD's stable releases of AMD PyTorch is still broken and we needed workarounds.

If we weren’t supported by multiple teams of AMD engineers triaging and fixing bugs in AMD software that we ran into, AMD’s results would have been much lower than Nvidia’s.

For AMD, Real World Performance is nowhere close to its on paper marketed TFLOP/s.

Training performance is weaker, as demonstrated by the MI300X's matrix multiplication micro-benchmarks, and still lags that of Nvidia's H100 and H200.

AMD’s training performance is also held back as the MI300X does not deliver strong scale out performance. This is due to its weaker ROCm Compute Communication Library (RCCL) and AMD’s lower degree of vertical integration with networking and switching hardware compared to Nvidia’s strong integration of its Nvidia Collective Communications Library (NCCL), InfiniBand/Spectrum-X network fabric and switches.

This really isn't worth doubling down on.

Nvidia own 99% of the software, so Nvidia's GPUs are going to run better and faster than AMD's even if the compute power on them is lower. The first Key Finding almost gets to that conclusion, almost, but not quite. Either they don't actually understand what they are writing about, like 90% of writers these days, or they do and that's not the point: the point is reassurance to Nvidia's equally dumb investors that no matter how much AMD can capitalise on their breakthrough efforts, Nvidia still has the bigger willy.

Not everyone uses or even likes Nvidia's ecosystem. Intel used to do the same: they created an ecosystem and dictated it to their customers. It's kind of good because you get a stable, ready-made environment for your hardware. As for Intel, they get to lock you in, so you become dependent on that ecosystem and find it difficult to switch even if you wanted to.

AMD came along with a new idea, here it is: you tell us exactly what you need and we will build it for you. AMD have been doing that long enough now that other people's ideas, which AMD then turn into reality, are actually good, appealing ideas to many other people.
Beyond that, not everyone wants you to do it for them. Sometimes, quite often actually, all you need is the hardware; if the hardware is not black-boxed then you can program it yourself, old school.
Nvidia's hardware or ecosystem is not the be-all and end-all, it has its own problems, and yes, AMD's hardware is more powerful in TFLOP/s. That's why you're seeing that nonsense from Nvidia's marketing arms plastered all over the Internet now. People are starting to take notice of AMD's hardware and ideas, enough so that it's rattling Nvidia.

humbug · 2024-12-23T07:15:25+0000

I'll tell you another thing Intel have figured out, which is now starting to become an anvil around their necks and which Nvidia are clever enough to see as a concern.

Developing and maintaining a vast ecosystem infrastructure is very, very expensive. You can only do it with a combination of two things: very high margin returns and market domination.

Intel are losing vast amounts of money, but not because the CPUs cost more to make than they sell for, certainly not with multi-thousand-dollar chips. No matter how big or complex they are, they cost $3K to $15K a pop because of the costs associated with maintaining that ecosystem.

Intel have been forced to compete with much faster and more efficient CPUs. Aside from them actually failing in that, it's still pushed the development and manufacturing costs of their chips to heights never before seen, while at the same time reducing the price of those chips, effectively reducing the ecosystem service fee by quite a chunk. On top of that, Intel have now lost significant market share: AMD are now at 30% product share and 40% revenue share.

All that amounts to a significant chunk of Intel's revenue transferred to AMD. Years ago Intel made the same arguments you're seeing in these articles you're posting.
AMD don't have as many engineers and middle managers available on speed dial, they don't have as many people embedding their compilers into every bit of software, and they don't have as many marketing people bamboozling people with very underhanded slide presentations.

So AMD will never replace Intel, and they don't have to. All they have to do, and what they are doing, is make that unsustainable for Intel so that Intel loses its advantage. It's working.

Nvidia see what AMD are doing and they do recognise it as a real threat. AMD don't need to take large chunks of Nvidia's market share; all AMD need to do is make Nvidia fight for it in the same way they made Intel fight for it. You only need to nibble away at your competitor's 95% market share and reduce the revenue they gain from it to make it unsustainable.

Again these articles are the very same noises Intel was making years ago, Intel made them because they were rattled by what AMD was doing. Intel was right to be rattled.
