The Official Nvidia GeForce 'Pascal' Thread - for general gossip and discussions

Simon Belmont · 26 Mar 2016 at 17:39

10 days hurry up would ya!

subbytna · 26 Mar 2016 at 21:07

easyrider said:
Dude you have 223 posts in 3 years ,so I'm guessing you contemplate each one...and you chose to post this ?

andybird123 · 26 Mar 2016 at 21:27

drunkenmaster said:
Async can't be overdone .

Async compute can not be overused just like HT can't.

according to IO Interactive, too much Async can even make it a penalty

: http://wccftech.com/async-compute-boosted-hitmans-performance-510-amd-cards-devs-super-hard-tune

Orangey · 26 Mar 2016 at 23:59

How quickly the tides turn.

CAT-THE-FIFTH · 27 Mar 2016 at 00:07

andybird123 said:
: http://wccftech.com/async-compute-boosted-hitmans-performance-510-amd-cards-devs-super-hard-tune

Actual article:
http://www.dualshockers.com/2016/03...-in-hitman-advanced-visual-effects-showcased/

Asynchronous compute granted a gain of 5-10% in performance on AMD cards, and unfortunately no gain on Nvidia cards, but the studio is working with the manufacturer to fix that. They’ll keep on trying.

That part made me LOL.

Boomstick777 · 27 Mar 2016 at 22:18

Hoping for a Pascal 'Nano' type card, more power efficient than the big guns but still packs a punch and better price. Would grab that to replace my Nano. If not will be AMD again if they do an new Nano type card. Really impressed with this card, power use / noise / performance / price sweet spot. I think this is AMD's best ever card I really do lol.

drunkenmaster · 27 Mar 2016 at 22:31

andybird123 said:
: http://wccftech.com/async-compute-boosted-hitmans-performance-510-amd-cards-devs-super-hard-tune

You can overuse async in the same way you can overuse cpu cores, or overuse shaders, or overuse rops, or memory bandwidth. But async isn't 'an effect', you can't overuse it like you can overuse tessellation. Overusing tessellation would be using a higher level than you can actually see any difference visually, that is what we're using "over use" in reference to. You can also overuse effects and bring the framerate down to 2fps on a Titan X by using too many effects(which individually all bring an IQ benefit). You can't just randomly decide to apply the context of overuse from one to the other for the sake of trying to score points... well I mean you can, but you'd not have a valid argument.

The biggest issue here isn't if Maxwell support async or not, nor how good it is or not... why if async doesn't bring anything to Maxwell has Nvidia spent the past 18 months promising async support on an architecture that quite clearly doesn't support it at all. They felt it was important enough to lie about it for the past 18 months rather than say 18 months ago "haha, our utilisation doesn't have the same problems AMD do so we don't require it"... instead they lied about it for a LONG time and now have an excuse when finally some people are calling Nvidia out on it.

Then again, where is that Fermi DX12 driver... I don't care that Fermi doesn't support DX12. I said at the time I sincerely doubt it does support it in any meaningful way and people at the time asked what the deal was. My answer back then is if Nvidia lie about Fermi supporting DX12 for a couple of years then people give them credit for having such wide support. They made claims about having more DX12 support, then two years later people just forgive them for it. Same deal as always with Nvidia, lie to the users, users use it as a pro Nvidia point for a couple of years then don't use it as a negative Nvidia point when it turned out to be BS.

andybird123 · 27 Mar 2016 at 22:36

Hahaha, you said you cant use too much async, like you cant use too much hyper threading... cue developer saying its not like threading for cpu, you have to carefully tune it per gpu and that using it too much causes a penalty... and you come back with the usual big long post trying to claim you never said that thing you said

Thanks for the laughs yet again charlie

I never even mentioned tesselation

bru · 27 Mar 2016 at 23:18

andybird123 said:
Hahaha, you said you cant use too much async, like you cant use too much hyper threading... cue developer saying its not like threading for cpu, you have to carefully tune it per gpu and that using it too much causes a penalty... and you come back with the usual big long post trying to claim you never said that thing you said

Thanks for the laughs yet again charlie

I never even mentioned tesselation

+1

drunkenmaster said:
Async can't be overdone.

Async compute can not be overused just like HT can't.

drunkenmaster said:
You can overuse async in the same way you can overuse cpu cores

Thank you DM that made me chuckle

Simon Belmont · 28 Mar 2016 at 07:47

andybird123 said:
Hahaha, you said you cant use too much async, like you cant use too much hyper threading... cue developer saying its not like threading for cpu, you have to carefully tune it per gpu and that using it too much causes a penalty... and you come back with the usual big long post trying to claim you never said that thing you said

Thanks for the laughs yet again charlie

I never even mentioned tesselation

Just ignore DM dude. He's a typing contradiction.

Calin Banc · 28 Mar 2016 at 08:12

It could be depending per engine what's going on, what cards you use (amd vs. nvidia), how skilled you are to actually know what's going on and what's worth doing with async. and what do to it etc. Kollock from Oxide said they're doing general stuff, nothing really specific for it and so far it seems to work ok:

http://www.overclock.net/t/1592431/...ctx-12-asynchronous-shading/350#post_24939794

I'm not sure you understand the nature of this feature. There are 2 main types of tasks for a GPU, graphics and compute. D3D12 exposes main 2 queue types, a universal queue (compute and graphics), and a compute queue. For Ashes, use of this feature involves taking compute jobs which are already part of the frame and marking them up in a way such that hardware is free to coexecute it with other work. Hopefully, this is a relatively straightfoward tasks. No additional compute tasks were created to exploit async compute. It is merely moving work that already exists so that it can run more optimally. That is, if async compute was not present, the work would be added to the universal queue rather than the compute queue. The work still has to be done, however.

The best way to think about it is that the scene that is rendered remains (virtually) unchanged. In D3D12 the work items are simply arranged and marked in a manner that allows parallel execution. Thus, not using it when you could is seems very close to intentionally sandbagging performance.

http://www.overclock.net/t/1592431/...ctx-12-asynchronous-shading/670#post_24962556

Last year you posted on here that you spent maybe 5 days optimizing for async. Have you spent more time optimizing it for AMD hardware between then and now?

We don't 'optimize' for it per say, we detangled dependencies in our scene so it can execute in parallel. Thus, I wouldn't say we optimized or built around it, we just moved some of the rendering work to compute and scheduled it to co-execute. Since we aren't a console title, we're not really tuning it like someone might on an XboxOne or PS4. However, consoles guys I've talked to think that 20% increase in perf is about the range that is expected for good use on a console anyway.

Anyway, considering this, this and this post, could it be that a proper implementation of async would allow a card to work more efficiently even on a code that doesn't suit its architecture in such a way that doesn't have "idle bubbles"? For instance, Keppler to perform better, because the issue it has with newer games may be from a code that not perfect for how the uarch was thought - as per those 3 posts.

drunkenmaster · 29 Mar 2016 at 14:40

andybird123 said:
Hahaha, you said you cant use too much async, like you cant use too much hyper threading... cue developer saying its not like threading for cpu, you have to carefully tune it per gpu and that using it too much causes a penalty... and you come back with the usual big long post trying to claim you never said that thing you said

Thanks for the laughs yet again charlie

I never even mentioned tesselation

Yeah, you walked in on a conversation comparing tessellation to async... so it doesn't matter if YOU never mentioned it, my post was directly responding to bru talking about tessellation... the context of my post has everything to do with what I said.

AS for the other part... the developer said it's not like threading for the CPU and you have to carefully tune it? First I'd say where in that link you posted it said it's not like CPU threading and second I want you to point out why you think devs don't tune for threads on the CPU... because they do.

So when it's pointed out how wrong you are, again you literally made up things the dev didn't say, made up an accusation of what I claimed to not say and... no that is it.

Oh, and then as with all things Nvidia on here, when you start talking crap to back up your posts suddenly the Nvidia bat signal goes out and you get the same suspects +1'ing your completely BS post.

I also never at all tried to claim I never said it... at all.

You took entirely the wrong end of the stick by butting in on a completely different conversation. bru decided to directly compare tessellation to async, not me, and that was the post I was responding to. If you want to continue THAT conversation by quoting and replying to my post then the context with which I posted is entirely relevant.

Rroff · 29 Mar 2016 at 14:42

You have to be careful comparing/not comparing to CPU threading/hyper-threading because there are some ways it is very much not like that but in other ways there are parallels.

drunkenmaster · 29 Mar 2016 at 14:52

Rroff said:
You have to be careful comparing/not comparing to CPU threading/hyper-threading because there are some ways it is very much not like that but in other ways there are parallels.

It was mostly an example of how it works but in reality it's far more like it than not.

Ultimately you don't need HT, a cpu core is perfectly capable of switching threads but switching threads without HT front end means a big latency hit in doing so, so loads of threads constantly switching is bad. HT being enabled from a higher level enables the front end logic of the core to deal with more threads, to enable the front end to push more threads through the the core at the same time to better utilise the pipelines available on each clock. This is by and large what async compute does, additional front end hardware to push more threads through to get better utilisation. It's not identical because gpus are more parallel in the first place and already dealt with more than one thread, conceptually they are very similar.

Gregster · 29 Mar 2016 at 14:53

It's a shame you can't just hold your hands up and admit you was wrong. We all get things wrong at times bud and that's what makes us human

Mauller · 29 Mar 2016 at 16:38

There is a slight parallel between Async compute and Hyperthreading. The whole merit of Hyperthreading is to allow a superscalar cpu core to fill spaces in its pipelines where a single thread may not be fully utilising it.

Async compute is similar as it allows Shaders to perform work that may otherwise be idle or under utilised.

A lot of the time there will be pipeline bubbles where a processing task is waiting on something, which hyperthreading and async allows better utilisation within this gap.

For a superscalar processor with multiple pipelines the improvement is typically around 15% for SMT aware applications. But it can be more for Async as it is dependent on how loaded the shaders are.

You cannot perform too much Async as DM said due to the way the work is slipped in alongside another task. But the amounts you can perform will often be dependent on what is being rendered at that moment. it comes down to a balance between the different rendering stages and how they load the shaders with work.

Also the performance boost from Async can be very dependant on the scene and even the type of game. Ashes gets a very good boost from async due to the sheer number of unique light sources within the game. So in larger battles Async will provide a larger boost overall compared to smaller ones. Even if the amount of async is fairly minimal in the game as the developers have stated, in certain situations the benefit can be very noticeable.

bru · 29 Mar 2016 at 17:37

Calm down people, we do not need to resort to those levels.

Lets just discuss the upcoming Pascal chips sensibly.

Mauller · 29 Mar 2016 at 17:39

bru said:
Calm down people, we do not need to resort to those levels.

Lets just discuss the upcoming Pascal chips sensibly.

On that note which i was going to get to, apparently a few new cards have been spotted on Zauba. People speculating that they are for the upcoming GPU hardware event next week.

CAT-THE-FIFTH · 29 Mar 2016 at 17:40

GTX1080 to be demoed next month:

http://wccftech.com/nvidia-flagship-gtx-1080-gtc-2016/

Mauller · 29 Mar 2016 at 17:41

CAT-THE-FIFTH said:
GTX1080 to demoed next month:

http://wccftech.com/nvidia-flagship-gtx-1080-gtc-2016/

Next month is next week.

Competitor rules

** The Official Nvidia GeForce 'Pascal' Thread - for general gossip and discussions **

The Official Nvidia GeForce 'Pascal' Thread - for general gossip and discussions