Fidelity Super Resolution in 2021

D.P. · 2 Jun 2021 at 14:34

nvidiamd said:
You know since MMX instructions they are all matrix operations. And you don't need separate hardware, you just use the FPU unit for all of them.

No, MMX instructions are not matrix operations - they are vector SIMD operations.

Why bother with any electronics, you can do matrix operations with steam valves. So what is your point?

The whole purpose of tensor cores is to accelerate various matrix operations to be hundreds of times faster than can be achieved through the use of regular compute units in GPUs. Not only that, the computation can be done in parallel, freeing up the CU to handle the rest of the rendering (or computation) pipeline.
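
To make the distinction concrete, here is a minimal sketch in plain Python. It only illustrates the shape of the two kinds of operation; it says nothing about how any real CPU or GPU implements them, and the function names `packed_add_u8` and `matrix_fma` are made up for this example.

```python
# Illustrative only: plain Python standing in for hardware behaviour.

def packed_add_u8(a, b):
    """Lane-wise add of two byte vectors with wraparound.

    Each output lane depends only on the matching input lanes,
    which is the pattern of a packed SIMD add.
    """
    return [(x + y) & 0xFF for x, y in zip(a, b)]


def matrix_fma(a, b, c):
    """Compute D = A x B + C for square matrices given as lists of rows.

    Each output element is a dot product across a whole row and column,
    so it mixes many inputs rather than working lane by lane.
    """
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) + c[i][j] for j in range(n)]
        for i in range(n)
    ]


lanes = packed_add_u8([250, 1, 2, 3, 4, 5, 6, 7], [10, 1, 1, 1, 1, 1, 1, 1])
tile = matrix_fma([[1, 0], [0, 1]], [[1, 2], [3, 4]], [[1, 1], [1, 1]])
```

The lane-wise add does one operation per lane, while the matrix form needs n multiply-adds per output element, which is the workload dedicated matrix hardware is aimed at.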

Harlequin · 2 Jun 2021 at 14:34

I remember Nv being forced to use FP16, as the hardware wasn't good enough for full-fat FP32. Seems the same narrative is being pushed.

D.P. · 2 Jun 2021 at 14:38

Harlequin said:
I remember Nv being forced to use FP16, as the hardware wasn't good enough for full-fat FP32. Seems the same narrative is being pushed.

Indeed, I still remember all the nonsensical rubbish that FP16 wasn't good enough blah blah blah, when it was obvious that the required precision was very dependent on the shader. 10 years later it was suddenly the big "in" thing and some miracle breakthrough because AMD happened to support it.

humbug · 2 Jun 2021 at 14:46

chris85oc said:
freesync monitors - tons of issues, flickering and crappy in general, also vrr working only above a certain fps

No.

Harlequin · 2 Jun 2021 at 14:47

FP24 > FP16 > INT12. FP16 has always been too much of a compromise no matter how it is spun.

edit:

FP16 is half mode, not mixed mode - that's FP24.

Wrinkly · 2 Jun 2021 at 14:48

nvidiamd said:
Enjoy your "dedicated hardware"?

Very much so

nvidiamd · 2 Jun 2021 at 14:55

D.P. said:
No, MMX instructions are not matrix operations - they are vector SIMD operations.

Why bother with any electronics, you can do matrix operations with steam valves. So what is your point?

The whole purpose of tensor cores is to accelerate various matrix operations to be hundreds of times faster than can be achieved through the use of regular compute units in GPUs. Not only that, the computation can be done in parallel, freeing up the CU to handle the rest of the rendering (or computation) pipeline.

The tensor cores are doing them faster because the regular CU does not have these instructions, so it has to run them in software. But the RDNA2 CU is already able to do tensor matrix math. The same goes for the consoles. So a CU will behave exactly like a tensor core when/if needed.

nvidiamd · 2 Jun 2021 at 14:59

Wrinkly said:
Very much so

Maybe next gen Nvidia will sell two cards like we had in the 90s, since the dedicated hardware is so useful. Voodoo was also the most performant 3D solution but very inefficient.

Chuk_Chuk · 2 Jun 2021 at 15:13

nvidiamd said:
Maybe next gen Nvidia will sell two cards like we had in the 90s, since the dedicated hardware is so useful. Voodoo was also the most performant 3D solution but very inefficient.

I know you jest, but I kind of wish that the dedicated RT chip that Coreteks speculated on was real. I could see so many advantages for offline RT rendering.

D.P. · 2 Jun 2021 at 15:15

nvidiamd said:
The tensor cores are doing them faster because the regular CU does not have these instructions, so it has to run them in software. But the RDNA2 CU is already able to do tensor matrix math. The same goes for the consoles. So a CU will behave exactly like a tensor core when/if needed.

Suffice to say, this is simply not true at all.

nvidiamd · 2 Jun 2021 at 15:20

D.P. said:
Suffice to say, this is simply not true at all.

So it is able to do tensor math? What is so special about the tensor core, how is it different?

D.P. · 2 Jun 2021 at 15:20

Harlequin said:
FP24 > FP16 > INT12. FP16 has always been too much of a compromise no matter how it is spun.

edit:

FP16 is half mode, not mixed mode - that's FP24.

It entirely depends on the shader. INT8 is perfectly sufficient for many computations. It has nothing to do with compromises, merely the actual required precision. This isn't some kind of quality mode. Where utilised properly, FP16 will provide mathematically identical results to FP32 or FP64.
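
A minimal sketch of the precision point, using Python's standard `struct` module (format code `e` is IEEE half precision) as a stand-in for GPU arithmetic. The helper name `to_fp16` and the sample values are assumptions chosen for illustration; it shows only that some values pass through FP16 unchanged while others do not, so whether FP16 is "enough" depends on the values a given computation actually produces.

```python
import struct


def to_fp16(x):
    """Round a Python float (FP64) to the nearest FP16 value, returned as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Values that fit in FP16's 11-bit significand survive unchanged.
exact_half = to_fp16(0.5)
exact_small = to_fp16(0.375)
exact_int = to_fp16(1024.0)

# Values that need more bits are rounded, so FP16 and FP64 disagree.
rounded_tenth = to_fp16(0.1)
rounded_int = to_fp16(4097.0)  # neighbouring FP16 values here are 4 apart

print(exact_half == 0.5, exact_small == 0.375, exact_int == 1024.0)
print(rounded_tenth == 0.1, rounded_int == 4097.0)
```

If every input, intermediate and output of a shader falls in the first group, lower precision gives the same answer; if not, the rounding error is what has to be weighed.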

Wrinkly · 2 Jun 2021 at 15:21

Why is there no facepalm smiley?

D.P. · 2 Jun 2021 at 15:23

nvidiamd said:
So it is able to do tensor math? What is so special about the tensor core, how is it different?

I think you are getting very confused with AMD's marketing slides.

But specifically, this statement "CU does not have these instructions so it has to run them in software" is nonsense.

nvidiamd · 2 Jun 2021 at 15:39

D.P. said:
I think you are getting very confused with AMD's marketing slides.

But specifically, this statement "CU does not have these instructions so it has to run them in software" is nonsense.

I don't think that this part of the marketing is wrong (that it is able to accelerate tensor math). If it was just AMD, maybe it would not be true, but this also has MS and Sony behind it. Microsoft said the same thing, that they added a lot of hardware for ML.

I am not saying that the way AMD did it is better; it would have been better if they had added 20% more CUs to the card. I already said in the past that Nvidia packs more hardware inside the chip than AMD does, so the tensor cores are something extra. If the native performance is the same, it means the tensor cores are extra, so AMD is a worse solution.
But I think it is better to have more CUs that can also do the tensor cores' work when needed than to have tensor cores that do nothing when you don't use upscaling.

oguzsoso · 2 Jun 2021 at 15:47

Tbh I would like some actual "proof" that tensor cores are actually doing work in current DLSS 2.1 games. It feels like... they're just there. Do they not consume power? If so, how much? Don't they have any kind of "utilization" metric that we can follow? If they do, why is it not exposed to hardware monitor tools? Why do Nvidia's GPUs feel like a black box?

Zefan · 2 Jun 2021 at 15:49

oguzsoso said:
Tbh I would like some actual "proof" that tensor cores are actually doing work in current DLSS 2.1 games. It feels like... they're just there. Do they not consume power? If so, how much? Don't they have any kind of "utilization" metric that we can follow? If they do, why is it not exposed to hardware monitor tools? Why do Nvidia's GPUs feel like a black box?

Yeah, I also have this funny feeling that DLSS might not be quite as hardware driven as it's been made out to be. I wouldn't be surprised either way, time will tell I suppose.

Harlequin · 2 Jun 2021 at 15:52

D.P. said:
It entirely depends on the shader. INT8 is perfectly sufficient for many computations. It has nothing to do with compromises, merely the actual required precision. This isn't some kind of quality mode. Where utilised properly, FP16 will provide mathematically identical results to FP32 or FP64.

8-bit gaming - really? There's a reason why FP16 is the very bottom end for math operations for gaming. This isn't 2004, you know.

fs123 · 2 Jun 2021 at 16:57

chris85oc said:
How did it sail? Gsync still better than freesync. Not sure what you mean

freesync monitors - tons of issues, flickering and crappy in general, also vrr working only above a certain fps

gsync - premium monitors, barely any issues, no flickering, vrr working at super low fps compared to freesync

I've used my Dell S2721DGFA 1440P 165Hz monitor with a 5700XT and now a 3080FE and have had no problems at all. Gameplay is smooth on both cards and no flickering whatsoever. It was only £260 too which is a bargain.

Game · 2 Jun 2021 at 16:58

fs123 said:
You'd have to be a complete numbskull to believe FSR is the same as TrixxBoost. If it was then AMD would implement it in the driver without having to work with devs. Try harder mate.

The only time I tried TrixxBoost I got a Windows error message. Couldn't be arsed to figure out what I needed to do to sort it; maybe it was because I was running in HDR or something.