• Competitor rules

    Please remember that any mention of competitors, hinting at competitors or offering to provide details of competitors will result in an account suspension. The full rules can be found under the 'Terms and Rules' link in the bottom right corner of your screen. Just don't mention competitors in any way, shape or form and you'll be OK.

** The Official Nvidia GeForce 'Pascal' Thread - for general gossip and discussions **

Associate
Joined
15 Feb 2014
Posts
753
Location
Peterboghorror
More like 'who'... Don Pascal, godfather of all GPUs. He's the boss, don't upset him, or you'll wake up with a reference AMD card next to you in bed.

Well you certainly wouldn't be able to sleep through that racket!

The compute figures look astonishing; my choices for the last two cards I've bought have pretty much come down to superior OpenCL performance. It will be good to be able to choose from both sides again.
 
Associate
Joined
23 Oct 2013
Posts
184
I don't know a great deal, but from all the ranting and raving this is what I've been able to deduce: Pascal cards should be six months (or more, due to an earthquake) behind Polaris cards. The power savings Nvidia were able to squeeze out of 28nm should be negated when they put back the things stripped out since Fermi. The "14" and "16" processes are closer to 20nm; however, the "14nm" one is slightly denser, so going off that, Polaris should be slightly ahead of Pascal in terms of performance.
 
Soldato
Joined
7 Feb 2015
Posts
2,864
Location
South West
Apparently Nvidia have fixed their driver to accept async commands, but it doesn't run the code asynchronously; it just queues it consecutively to be run in the graphics queue. That is no different from the game not having async at all.

Which still leaves AMD as the only ones who can do async properly, with compute work running concurrently with the graphics.
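
For anyone unfamiliar with what 'async' actually means at the API level, here's a minimal sketch (Windows only, MSVC; purely illustrative). The application creates a separate compute queue alongside the graphics queue, but DX12 only guarantees ordering via fences, not that the two queues run concurrently: whether they overlap is entirely down to the driver and hardware, which is exactly what's being argued about here.

```cpp
// Minimal sketch of the two-queue setup behind "async compute" in D3D12.
// Windows only; d3d12.lib is linked via the pragma below.
#include <d3d12.h>
#include <wrl/client.h>
#include <cstdio>
#pragma comment(lib, "d3d12.lib")

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<ID3D12Device> device;
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device)))) {
        std::printf("No DX12-capable device found\n");
        return 1;
    }

    // The application asks for two independent queues...
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;       // graphics (+ compute + copy)
    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute (+ copy) only

    ComPtr<ID3D12CommandQueue> gfxQueue, computeQueue;
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));

    // ...but the API only guarantees correct ordering via fences, not concurrency.
    // Whether work on computeQueue actually overlaps work on gfxQueue is entirely
    // down to the driver and hardware - which is the point being argued.
    std::printf("Created DIRECT and COMPUTE queues\n");
    return 0;
}
```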
 
Caporegime
Joined
18 Oct 2002
Posts
33,188
I don't know a great deal, but from all the ranting and raving this is what I've been able to deduce: Pascal cards should be six months (or more, due to an earthquake) behind Polaris cards. The power savings Nvidia were able to squeeze out of 28nm should be negated when they put back the things stripped out since Fermi. The "14" and "16" processes are closer to 20nm; however, the "14nm" one is slightly denser, so going off that, Polaris should be slightly ahead of Pascal in terms of performance.

Ish. I'd say Polaris has a lot more room for improvement, as Nvidia brought a lot of their next-gen efficiency improvements forward into Maxwell and, as you say, removed some stuff, compute and hardware scheduling, which should end up back in for Pascal.

Quantifying it isn't exactly possible, but let's say from one gen to the next on a new node you would expect maybe a 70-90% performance-per-watt gain from the process alone. Architecture can then improve or worsen that. With Polaris you have a fairly old, not hugely tweaked architecture getting a huge update for the new gen, so they should see (and appear to be seeing, from all current info) a pretty hefty increase above that.

Pascal, by adding less power-efficient parts back in like compute and hardware scheduling (it will be very hard to do good, fast async compute, a major DX12 feature, without this), will find it harder to make any gains beyond those from the process. Depending on what hardware efficiency they can find, maybe they can improve on it; if Pascal isn't a huge update beyond adding compute back in, you might see something worse than the 70-90% range because you're effectively comparing two different products. They will still gain a HUGE amount of performance per watt simply because 16nm is a huge step from 28nm.
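
To put rough numbers on that, here's a back-of-envelope sketch; the 70-90% process figure is from above, but the architecture factors are assumptions purely for illustration, not figures from either vendor.

```cpp
// Back-of-envelope only: architecture factors below are assumed for illustration.
#include <cstdio>

int main() {
    const double processLow  = 1.70;  // ~70% perf/W from the new node alone
    const double processHigh = 1.90;  // ~90% perf/W from the new node alone

    const double polarisArch = 1.15;  // assumed: big architectural update helps
    const double pascalArch  = 0.90;  // assumed: re-adding compute/scheduling hardware costs power

    std::printf("Polaris perf/W vs 28nm: %.2fx to %.2fx\n",
                processLow * polarisArch, processHigh * polarisArch);
    std::printf("Pascal  perf/W vs 28nm: %.2fx to %.2fx\n",
                processLow * pascalArch, processHigh * pascalArch);
    return 0;
}
```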

In terms of process, 14/16nm are essentially 20nm FinFET chips, but performance-wise they improve power usage so much that they compare well to a full, real 14nm non-FinFET node. There are some density improvements: FinFETs are taller than they are wide, where planar transistors are wider than they are tall, so flipping them on their side lets you fit them a tiny bit closer together. But the naming is down to the power improvements and how the node compares to planar, plus marketing BS about competing with Intel and not looking miles behind.

Pascal and Polaris should both offer huge performance improvements when you compare similar die sizes. The reality of process nodes is that yields improve over time and bigger chips get made later rather than early, so unfortunately the first cores will usually be in the 250-400mm^2 range rather than the ~600mm^2 of Fury/Titan. They'll struggle to double performance until chips of that size arrive, which doesn't look like happening until early 2017 at the earliest.
 
Last edited:
Caporegime
Joined
18 Oct 2002
Posts
30,341
Apparently Nvidia have fixed their driver to accept async commands, but it doesn't run the code asynchronously; it just queues it consecutively to be run in the graphics queue. That is no different from the game not having async at all.

Which still leaves AMD as the only ones who can do async properly, with compute work running concurrently with the graphics.

That'll be interesting to see tbh. I'm wondering if pure horsepower will help it to compete...
 
Associate
Joined
14 Jun 2008
Posts
2,363
Apparently Nvidia have fixed their driver to accept async commands, but it doesn't run the code asynchronously; it just queues it consecutively to be run in the graphics queue. That is no different from the game not having async at all.

Which still leaves AMD as the only ones who can do async properly, with compute work running concurrently with the graphics.

I wish the Oxide dev had expanded on this a bit more: http://www.overclock.net/t/1590939/...-async-compute-yet-says-amd/370#post_24898074 It would indicate that AS has only very recently been added to Nvidia's drivers, so recently that current public DX12 code (from Oxide at least) isn't taking any advantage of it.
 
Last edited:
Soldato
Joined
7 Feb 2015
Posts
2,864
Location
South West
I wish the Oxide dev had expanded on this a bit more: http://www.overclock.net/t/1590939/...-async-compute-yet-says-amd/370#post_24898074 It would indicate that AS has only very recently been added to Nvidia's drivers, so recently that current public DX12 code (from Oxide at least) isn't taking any advantage of it.

Nvidia can't do async in hardware at all on their current cards. Their driver just queues the async commands into a single stream as though no async were being used. So in the end async provides no performance benefit on Nvidia hardware, as the commands run exactly as they would if the application had async disabled.

I think in the older driver they tried to get hardware async working using context switching, but recent findings from a few users around the web have shown that async on the newer driver runs as though it were disabled in the application. When they used GPU diagnostic tools, it showed that only the main graphics queue was in use and that all of the compute commands were being fed into the graphics queue along with the graphics commands.

Like yeah, the driver exposes async, but it gets in the way and repackages the commands as though async were disabled on current Nvidia hardware.
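
As a rough CPU-side analogy (this isn't driver code, and the millisecond figures are made up), this is why serialising the compute work into the graphics queue buys you nothing, while genuinely overlapping it does:

```cpp
// CPU-side analogy only: stand-ins for a frame's graphics work and an async compute job.
#include <chrono>
#include <cstdio>
#include <thread>

void graphics_work() { std::this_thread::sleep_for(std::chrono::milliseconds(10)); }
void compute_work()  { std::this_thread::sleep_for(std::chrono::milliseconds(4)); }

int main() {
    using Clock = std::chrono::steady_clock;

    // "Repackaged" case: compute is simply appended to the same queue as graphics.
    auto t0 = Clock::now();
    graphics_work();
    compute_work();
    auto serial = Clock::now() - t0;

    // Genuinely concurrent case: compute overlaps the graphics work.
    t0 = Clock::now();
    std::thread compute(compute_work);
    graphics_work();
    compute.join();
    auto overlapped = Clock::now() - t0;

    auto ms = [](auto d) {
        return static_cast<long long>(
            std::chrono::duration_cast<std::chrono::milliseconds>(d).count());
    };
    std::printf("serialised: ~%lld ms, overlapped: ~%lld ms\n", ms(serial), ms(overlapped));
    return 0;
}
```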
 
Last edited:
Associate
Joined
14 Jun 2008
Posts
2,363
Nvidia can't do async in hardware at all on their current cards. Their driver just queues the async commands into a single stream as though no async were being used. So in the end async provides no performance benefit on Nvidia hardware, as the commands run exactly as they would if the application had async disabled.

I think in the older driver they tried to get hardware async working using context switching, but recent findings from a few users around the web have shown that async on the newer driver runs as though it were disabled in the application. When they used GPU diagnostic tools, it showed that only the main graphics queue was in use and that all of the compute commands were being fed into the graphics queue along with the graphics commands.

Like yeah, the driver exposes async, but it gets in the way and repackages the commands as though async were disabled on current Nvidia hardware.

Is there proof of this? I mean actual proof, not regurgitating what Mahigan makes up. It seems he has no idea what Nvidia have actually implemented either, given this opening paragraph: http://www.overclock.net/t/1590939/...-async-compute-yet-says-amd/380#post_24898514

He sticks up some GPUView screenshots from Fable later on, from a five-month-old build, from back when NV had no AS support going on at all...

Let's wait and see how this all plays out before declaring a winner based on what one vendor's PR output has to say. A PR mouthpiece so invested in this that he tirelessly works the forums even while his wife is being kept out of his country of residence by immigration...
 
Soldato
Joined
7 Feb 2015
Posts
2,864
Location
South West
The sources I have are from a Disqus chat: people using Microsoft's DX12 async examples. The Nvidia 980 Ti tested at the same fps with async on and off, while the Fury X gained 20%+ performance.

Their own GPUView captures showed only one queue in use whether async was on or off on the latest Nvidia drivers, doing everything I mentioned above.

You can even run the examples and check for yourselves; can't post links atm as I'm at uni on my phone.
 
Associate
Joined
14 Jun 2008
Posts
2,363
The sources I have are from a Disqus chat: people using Microsoft's DX12 async examples. The Nvidia 980 Ti tested at the same fps with async on and off, while the Fury X gained 20%+ performance.

Their own GPUView captures showed only one queue in use whether async was on or off on the latest Nvidia drivers, doing everything I mentioned above.

You can even run the examples and check for yourselves; can't post links atm as I'm at uni on my phone.

You will forgive me if 'somebody on Disqus said so' doesn't fill me with confidence. It's also strange that, if this were the case, nobody credible has picked up on it, given what a hot topic it is right now.
 
Caporegime
Joined
18 Oct 2002
Posts
33,188
It's really pretty simple: if Nvidia could do proper async compute, there wouldn't be a string of broken promises about how it will be working in the next driver, their story wouldn't change month to month and game to game, and they wouldn't be paying to remove async compute from games that had it as standard.

If it were there as a basic hardware function, it would have worked from their first DX12 driver; instead they have been working around it with multiple attempts... The only plausible explanation is that they don't have proper hardware support for it. With the hardware scheduler moved off-die and done on the CPU, it's pretty ridiculous to presume they had async compute: the GPU isn't physically capable of scheduling work on the fly itself, which is one reason (among others) hardware scheduling will end up back on the GPU in Pascal.

Context switching is something that, with hardware support, would work incredibly easily and efficiently; without it, working around it takes time as they develop drivers and algorithms for predicting what order to jam the commands through in.

In the same way, Nvidia promised DX12 support for Fermi from the launch of Windows 10, which failed to appear; it was then promised for a few months later, as if the hardware would magically change and the job would become easier... that date slipped again, another promise, etc, etc.
 
Soldato
Joined
7 Feb 2015
Posts
2,864
Location
South West
You will forgive me if 'somebody on Disqus said so' doesn't fill me with confidence. It's also strange that, if this were the case, nobody credible has picked up on it, given what a hot topic it is right now.

Most of these 'credible' people don't want to lose their Nvidia partnerships by posting something negative.
 