
RDNA 3 rumours Q3/4 2022

Though, with the kind of leaks we have seen, there's a 50:50 chance AMD might match or exceed Nvidia's raster performance.
As I see it, rumours aside, there's a reason NVIDIA pushed the 4090 to such high power use even though it barely gained any performance from it: about 5% or a bit more, for over 100W of extra power. As der8auer tested, you can drop its power limit to 300W-330W and lose only about that same 5%. Even he was puzzled why they went for 450W, which is well into hardcore overclocking territory. Do they know something we don't? Are they worried AMD will be way too close in performance? I can only guess, but a corporation like that never does something like this without a good reason.
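For a rough sense of the perf-per-watt maths behind that, here's a quick back-of-the-envelope sketch - the 450W stock figure and the ~5% loss at ~300-330W are just the rough numbers above, not precise measurements:

```python
# Back-of-the-envelope perf/W for the 4090 power-limit scenario described above.
# The 450W stock figure and the ~5% loss at ~300-330W are the rough numbers from
# the post (der8auer's testing as described there); real results vary per game/sample.

def perf_per_watt(relative_perf: float, power_w: float) -> float:
    """Relative performance (1.0 = stock) divided by board power in watts."""
    return relative_perf / power_w

stock = perf_per_watt(1.00, 450)      # stock: 100% performance at 450W
limited = perf_per_watt(0.95, 320)    # power limited: ~95% performance at ~320W

print(f"Stock efficiency:   {stock:.5f} perf/W")
print(f"Limited efficiency: {limited:.5f} perf/W")
print(f"Efficiency gain from the power limit: {limited / stock - 1:.0%}")
# Roughly a third better perf/W for a ~5% performance loss, which is what makes
# the 450W default such a strange choice on paper.
```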
 
As I see it, rumours aside, there's a reason NVIDIA pushed the 4090 to such high power use even though it barely gained any performance from it: about 5% or a bit more, for over 100W of extra power. As der8auer tested, you can drop its power limit to 300W-330W and lose only about that same 5%. Even he was puzzled why they went for 450W, which is well into hardcore overclocking territory. Do they know something we don't? Are they worried AMD will be way too close in performance? I can only guess, but a corporation like that never does something like this without a good reason.
I am not convinced that AMD can beat a 76 billion transistor monster of a GPU, and any victory in rasterization would be pyrrhic at best because of the huge feature-set disparity that gets factored into that transistor budget. At this stage the raster performance of the 4090 is looking good enough that any further gains there are reduced to a statistic for analytical discussions. What we need instead is a game changer to strengthen the relatively weak areas, like RT and inferential rendering that can save a lot of energy.
 
Yes, yes, you win... DLSS is better than native. Can you move on so we can keep this about RDNA3 instead of DLSS, please? I'm sure you will still get paid by Jensen for all your hard work derailing and spamming the AMD thread with "Nvidia is teh awsum". :rolleyes:

Good debunking as always! :cool: :D So I guess you won't address any of those points or that footage then.... :cry:

PS.

It was you who raised this in the first place:

Your post was very clear and accurate, but it applies to both Nvidia and AMD. I think you will agree that it is hardly a win to claim Nvidia are better than AMD based purely on the fact that one drops to 30 FPS while the other drops to 20 FPS. Neither is playable if 30 FPS is the average.

This is the difference between crashing into a brick wall at 150 miles per hour with a helmet on vs without one. Neither will save your life and neither is ideal.

We have now reached a stage where almost £2,000 gets you a so-called "top tier" GPU that has to use compression techniques to get playable frame rates with RT on. In fact, I can't tell if it is satire that we are being sold image-degrading compression techniques and told they are "better than native".

:cry:
 
It's perfectly relevant to the thread, as this is a discussion forum/thread on what AMD need to do to compete with Nvidia and claw back market share, i.e. having a competitor to frame generation/DLSS 3 now and not in 1-2 years' time......

Maybe you should go back to picking holes in the 4080 12GB, now that your 4090 hate spiel has fallen flat on its face and, funnily enough, you have gone quiet on that front.... much like what happened with your other "Intel and Nvidia bad" posts :cry:

How about addressing the points raised rather than the usual attacking of the poster? Not my fault if the truth hurts....
 
RT cores are supposed to be fixed-function blocks, so a correlation between compute and RT doesn't sound logical. In fact, if AMD is going the dedicated route, there should be minimal or no correlation with compute.
Eventually, though, whatever values the RT cores return have to go through the rendering pipeline and be written to a pixel.
Given the sequential and divergent nature of RT, this causes threads to stall and GPU utilization to fall off a cliff, because neither vendor is keen on load balancing both aspects.
It's something like two workers building different components of a finished product in parallel, where worker 1 takes 10x as long while worker 2 has dozed off.
The only way to address this is by load balancing RT and non-RT throughput, but given the divergent nature of RT and the fact that RT is still in a transitional phase, people are holding off on making big architectural bets.
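To put some illustrative numbers on that worker analogy, here's a toy model - the timings are made up purely to show the effect, not taken from any real GPU:

```python
# Toy model of the "worker 1 vs worker 2" point above: if the RT part of a frame
# takes far longer than the raster part and the hardware split between them is fixed,
# utilization collapses. Timings are invented purely for illustration.

def utilization(raster_ms: float, rt_ms: float) -> float:
    """Fraction of total capacity doing useful work when the two parts run side by
    side and the frame can only finish once both are done."""
    frame_time = max(raster_ms, rt_ms)      # the slowest part gates the frame
    useful_work = raster_ms + rt_ms         # total busy time across both units
    return useful_work / (2 * frame_time)   # capacity = 2 units * frame_time

print(f"Balanced  (5 ms raster, 5 ms RT):  {utilization(5, 5):.0%} utilized")
print(f"Divergent (1 ms raster, 10 ms RT): {utilization(1, 10):.0%} utilized")
# The second case is "worker 2 dozed off": one side idles while the other grinds,
# and overall utilization falls off a cliff.
```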
That's why NVIDIA introduced Shader Execution Reordering in Ada, which they compare to out-of-order execution on CPUs, to combat this problem of RT slowing down the whole pipeline because it parallelizes poorly on a GPU. Judging by test results, it seems to be working well. Out-of-order execution comes from CPUs, and AMD has been making CPUs for a long while now, so it would be surprising if they didn't also bring more CPU techniques into their GPUs (the ones that make sense, like the one mentioned above).
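To picture what that reordering buys, here's a minimal sketch of the general idea (grouping divergent ray hits before shading) - it is only the concept, not NVIDIA's actual implementation:

```python
# Minimal sketch of the reordering concept: rays that hit different materials want to
# run different shader code, so shading them in arrival order keeps mixing code paths
# inside each SIMD group. Sorting (reordering) by hit type first makes groups coherent.
# This is only an illustration of the idea, not NVIDIA's actual implementation.
import random

random.seed(0)
WARP = 8  # pretend SIMD width

rays = [random.choice(["metal", "glass", "skin", "foliage"]) for _ in range(64)]

def divergent_groups(hits, width):
    """Count groups of `width` rays that mix more than one material (i.e. diverge)."""
    groups = [hits[i:i + width] for i in range(0, len(hits), width)]
    return sum(1 for group in groups if len(set(group)) > 1)

print("Divergent groups, arrival order:", divergent_groups(rays, WARP))
print("Divergent groups, after reorder:", divergent_groups(sorted(rays), WARP))
# After sorting, almost every group runs a single material's shader, so far fewer
# lanes sit idle waiting for neighbours on a different code path.
```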
 
I am not convinced that AMD can beat a 76 billion transistor monster of a GPU, and any victory in rasterization would be pyrrhic at best because of the huge feature-set disparity that gets factored into that transistor budget. At this stage the raster performance of the 4090 is looking good enough that any further gains there are reduced to a statistic for analytical discussions. What we need instead is a game changer to strengthen the relatively weak areas, like RT and inferential rendering that can save a lot of energy.
You seem not to realise that a looot of that transistor budget went into the cache they introduced - cache eats a lot of space and transistors, even though it's packed much more densely than the compute area of a GPU. If you cut out the cache, that number would be much smaller. The rest is just guessing - GPUs take many years to design and test, and there had been almost no rumours until very recently, so we simply do not know yet what tech AMD has introduced in their coming GPUs. Both NVIDIA and AMD are already deep into designing and testing the next generation while we're talking about "old" ones that are just about to be released, and we have NO clue what that might bring either.
 
That's why NVIDIA introduced Shader Execution Reordering in Ada, which they compare to out-of-order execution on CPUs, to combat this problem of RT slowing down the whole pipeline because it parallelizes poorly on a GPU. Judging by test results, it seems to be working well. Out-of-order execution comes from CPUs, and AMD has been making CPUs for a long while now, so it would be surprising if they didn't also bring more CPU techniques into their GPUs (the ones that make sense, like the one mentioned above).

Yeah, they have figured out an acceleration scheme that can recursively ray cast as if every ray were the first ray. But they still aren't making big bets on allocation and load balancing - you know, they could possibly have shaved off 30% of the FP32 units and allocated that to RT. At the same time, NVIDIA believes that RT + DLSS go hand in hand, so maybe this is a better approach than brute-force load balancing. It's a strategic bet eventually, but NVIDIA would have better clarity on workload profiles.

You seem not to realise that a looot of that transistor budget went into the cache they introduced - cache eats a lot of space and transistors, even though it's packed much more densely than the compute area of a GPU. If you cut out the cache, that number would be much smaller. The rest is just guessing - GPUs take many years to design and test, and there had been almost no rumours until very recently, so we simply do not know yet what tech AMD has introduced in their coming GPUs. Both NVIDIA and AMD are already deep into designing and testing the next generation while we're talking about "old" ones that are just about to be released, and we have NO clue what that might bring either.
The cache would be around 6 billion transistors; it seems they have invested more into tensor cores and fleshed out the acceleration structures for RT. AMD's approach is going to be interesting, though.
 
Yeah, they have figured out an acceleration scheme that can recursively ray cast as if every ray were the first ray. But they still aren't making big bets on allocation and load balancing - you know, they could possibly have shaved off 30% of the FP32 units and allocated that to RT. At the same time, NVIDIA believes that RT + DLSS go hand in hand, so maybe this is a better approach than brute-force load balancing. It's a strategic bet eventually, but NVIDIA would have better clarity on workload profiles.
What would be the goal of that? They add 30% more die area to RT, and then what? Raster speed drops, and RT speed most likely does NOT increase (more rays = better quality and less noise, but NOT more speed). Hence, they seem to have put in only as many RT units as they needed to generate enough rays for the denoiser to handle well while the image still looks good enough, and the rest was used for other important tasks (like raster, which also means geometry, which RT still needs as is).
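The diminishing returns there come from how Monte Carlo noise scales with sample count; a tiny illustration below, where the 30% figure is the hypothetical reallocation discussed above rather than anything real:

```python
# Why throwing more die area at ray generation hits diminishing returns: Monte Carlo
# noise only falls with the square root of the sample count, which is why "just enough
# rays + a denoiser" is the usual trade-off. Illustrative only; real renderers and
# denoisers are far more complicated, and the 30% figure is the hypothetical
# reallocation discussed above, not a real design.
import math

for extra_rt in (1.0, 1.3, 2.0, 4.0):            # relative RT throughput
    noise = 1 / math.sqrt(extra_rt)              # relative noise vs the baseline
    print(f"{extra_rt:.1f}x RT throughput -> ~{noise:.2f}x the baseline noise")
# Even 30% more RT hardware only trims noise by about 12%, which the denoiser was
# largely hiding anyway - while whatever you took that area from gets slower.
```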
 
The cache would be around 6 billion transistors; it seems they have invested more into tensor cores and fleshed out the acceleration structures for RT. AMD's approach is going to be interesting, though.
Where did you get that number from? In CPUs, cache uses the majority of all transistors. In Ada, the cache takes up almost a third of the whole chip, and it's packed more densely with transistors than the rest of the logic.
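For reference, a quick back-of-the-envelope on what the L2 alone would cost in transistors, assuming a standard 6-transistor SRAM cell and the full AD102's 96MB of L2 (tag arrays and control logic would add more on top):

```python
# Rough sanity check of the "~6 billion transistors of cache" figure being debated.
# Assumes a standard 6T SRAM cell and the full AD102 L2 size of 96MB; tag arrays,
# ECC and control logic would add some overhead on top of the cell arrays.

L2_BYTES = 96 * 1024 * 1024    # 96MB of L2 on the full AD102 die
TRANSISTORS_PER_BIT = 6        # classic 6-transistor SRAM cell

cell_transistors = L2_BYTES * 8 * TRANSISTORS_PER_BIT
print(f"SRAM cell arrays alone: ~{cell_transistors / 1e9:.1f} billion transistors")
print(f"Share of a 76.3B transistor die: ~{cell_transistors / 76.3e9:.0%}")
# ~4.8 billion for the cells themselves, so "around 6 billion" once you include the
# overhead looks like the right ballpark.
```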
 
I am not convinced that AMD can beat a 76 billion transistor monster of a GPU, and any victory in rasterization would be pyrrhic at best because of the huge feature-set disparity that gets factored into that transistor budget. At this stage the raster performance of the 4090 is looking good enough that any further gains there are reduced to a statistic for analytical discussions. What we need instead is a game changer to strengthen the relatively weak areas, like RT and inferential rendering that can save a lot of energy.

From what we can see of the 4080 leaks, the 12GB is on par with a 3090 and on average about 15% faster than a 3080 12GB, or 20% faster than the 3080 10GB. The 4080 16GB is ~50% faster than a 3080 10GB and 72% more expensive.

3080 10GB was $699 MSRP (availability aside)
4080 12GB is $899 MSRP

So an almost 30% increase in MSRP for 2 extra GB of VRAM and a ~20% increase in performance.
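Putting those MSRPs and the leaked uplifts side by side as a rough perf-per-dollar comparison (the ~20% and ~50% figures are the leaked/claimed numbers above, not independent benchmarks):

```python
# Rough price/perf comparison using the MSRPs above; performance is normalised to the
# 3080 10GB = 1.00 and the uplifts are the leaked/claimed figures quoted in the thread,
# not independent benchmarks.

cards = {
    # name                 (MSRP $, relative performance)
    "3080 10GB":           (699,  1.00),
    "4080 12GB (leaked)":  (899,  1.20),   # ~20% faster per the leaks
    "4080 16GB (leaked)":  (1199, 1.50),   # ~50% faster per the leaks
}

base_price, base_perf = cards["3080 10GB"]
base_value = base_perf / base_price        # performance per dollar of the baseline

for name, (price, perf) in cards.items():
    value = perf / price
    print(f"{name:20s} ${price:5d}  perf {perf:.2f}x  "
          f"perf/$ vs 3080 10GB: {value / base_value:.2f}x")
# The Ada SKUs land at roughly 0.93x and 0.87x the 3080 10GB's perf per dollar,
# i.e. a generation later you are paying slightly more per frame, not less.
```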

Frankly, at that utterly turgid price/perf, AMD don't need to beat the 76 billion transistor 4090, because the overpriced muck that Nvidia are shovelling as their lower tiers is all they need to beat. And from the leaks so far, that is a very low bar indeed.

If I wanted 3090 performance I could get a used one for a lot less than a new 4080 12 GB. I think I can live without DLSS 3.0 thanks.

So can we please dispense with the idea that the only thing that matters is RTX 4090 vs 7900 XT, because most of us are far more interested in the mid tier, and it is clear Nvidia are fleecing us with the price/perf on offer. I have a sneaking suspicion AMD will offer only marginally better price/perf.
 
20% increase in performance? According to Nvidia?

Look at Nvidia's own data from here. Look at the non-DLSS performance.


Poneros shared a table with the numbers for reference. Only 3 games but enough to see the 4080 12GB is ~ 20% faster than the 3080 10GB.

 
This is an AMD/RDNA 3 thread; can we stop posting about Nvidia GPUs/tech please.... :rolleyes:

Yes, yes, you win... DLSS is better than native. Can you move on so we can keep this about RDNA3 instead of DLSS, please? I'm sure you will still get paid by Jensen for all your hard work derailing and spamming the AMD thread with "Nvidia is teh awsum". :rolleyes:
If he's throwing the thread off topic just report him, this isn't an Nvidia thread.

Rules only apply to certain individuals when they can't debunk the debunker, I guess, eh.....

:cry:
 
From what we can see of the 4080 leaks, the 12GB is on par with a 3090 and on average about 15% faster than a 3080 12GB, or 20% faster than the 3080 10GB. The 4080 16GB is ~50% faster than a 3080 10GB and 72% more expensive.

Comically, before the Ada reveal, "only 10% faster" was mocked big time. Yet you could have had that 4080's performance two years ago with double the VRAM.. not sure why anyone who has a top-tier Ampere card would even consider the 4080s.
 
Look at Nvidia's own data from here. Look at the non-DLSS performance.


Poneros shared a table with the numbers for reference. Only 3 games but enough to see the 4080 12GB is ~ 20% faster than the 3080 10GB.


The same slide also says the 4090 is 70% faster than the 3090 Ti.

Most reviewers have it at 45% to 60%.

We need to stop taking vendor slides as gospel.
 