
Doom to use async compute

They'd be stupid to use it; only about 20 people are able to use it on the PC. :p

I'd expect it to be stripped out of just about every port if Nvidia haven't got it on Pascal.

Possibly GP104 isn't going to have async because it's a die shrink of the current architecture.


We should be looking forward and utilising the tech available.
However, I remember many here boasting that the TX and 980Ti are fully DX12 compatible and the AMD 290/290X aren't, because the boxes never had it printed on them.

Here we are: the 290X supports DX12 much better than NV, and you're asking to cripple the AMD cards' hardware abilities because NV cannot keep up?

Let Nvidia burn...
 
Possibly GP104 isn't going to have async because it's a die shrink of the current architecture.


We should be looking forward and utilising the tech available.
However, I remember many here boasting that the TX and 980Ti are fully DX12 compatible and the AMD 290/290X aren't, because the boxes never had it printed on them.

Here we are: the 290X supports DX12 much better than NV, and you're asking to cripple the AMD cards' hardware abilities because NV cannot keep up?

Let Nvidia burn...

NV do actually support more features in hardware. Whether they are the right features is for another thread...
 
In the GPU space, AMD (and ATI before them) have always developed very forward-leaning products, but they've also struggled immensely to get the balance right and capitalise on it, often jumping the gun to their own detriment. This time around it's something they could really work to their advantage, as the architecture has great synergy with the nature of nodes below 20nm planar, and with DX12 and similar APIs.

Regarding Kepler though, outside of maybe The Witcher 3 (which I've never played), I'm not really seeing Kepler falling behind. That isn't to say it couldn't be running better than it is if they were spending more time on optimising it, but that isn't something I can quantify. If you dig into a lot of benchmarks, you'll find they are either using old Kepler numbers without actually retesting the older cards when they say they have, using numbers given to them by AMD or nVidia, or testing Kepler cards limited to the reference on-paper clocks. The numbers you see are often anything up to 20% (though normally more like 10-15%) lower than what users will be experiencing in the real world, before you add in any end-user overclocking.
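
To put a number on that clock gap, here's a trivial sketch. The clock figures are my own illustrative assumptions (roughly an on-paper GTX780 boost versus what retail cards typically sustain), not measured values:

    #include <cstdio>

    int main() {
        const double paper_boost_mhz   = 900.0;  // on-paper boost clock (assumed, GTX780-ish)
        const double typical_boost_mhz = 1006.0; // boost many retail cards actually sustain (assumed)

        // To a first approximation, GPU-limited frame rates scale with core clock.
        double delta = (typical_boost_mhz - paper_boost_mhz) / paper_boost_mhz;
        std::printf("Testing at on-paper clocks under-reports by ~%.0f%%\n", delta * 100.0);
        return 0;
    }

With those assumed clocks the gap comes out around 12%, which sits right in the 10-15% band described above.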

Kepler is falling behind though, since everyone only looks at the GTX780 series cards and not any of the other ones. My GTX660 is now consistently much slower than an HD7870, when they were quite closely matched at launch, and the rest of the range really didn't overclock anywhere near as well in percentage terms as the GTX780. One of the reasons the review sites had to change their testing methodology (and I was talking with someone who knew a reviewer) was because Boost V1 tended to have issues over longer time periods, i.e. throttling, which a few German and French websites discovered, and it made the results look higher than they actually were.

Then you add in certain review sites, like pcgameshardware, which use pre-overclocked cards in their benchmarks, and even a highly overclocked GTX770 is not doing as well. Even a pre-overclocked GTX770 running at a 1.2GHz clockspeed is losing to a stock-clockspeed R9 280X in The Division:

http://www.pcgameshardware.de/The-Division-Spiel-37399/Specials/PC-Benchmarks-1188384/

The R9 280 and R9 280X had very minimal boost, i.e. around 50MHz, and these cards could hit 1.2GHz, which meant they could easily bypass an overclocked GTX770 in the game.

An overclocked GTX670 is slower than an overclocked HD7870 in the game too.
 
Possibly GP104 isn't going to have async because it's a die shrink of the current architecture.


Wrong on 2 counts:
1) GP104 isn't a die shrink, it's a modified architecture.
2) Maxwell supports async compute anyway.


Async compute on Maxwell just works differently to GCN; it works best with larger compute tasks, where it actually ends up very fast. It has a higher switching time, so it doesn't perform well with numerous small tasks.
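
For anyone wondering what async compute actually looks like at the API level, here's a minimal D3D12-style sketch (the helper name is hypothetical, and it assumes you already have a valid device and a recorded compute command list). The point is just that compute work is submitted on its own queue; everything discussed above about concurrency is the hardware scheduler's business, not the API's:

    #include <d3d12.h>

    // Hypothetical helper: submit already-recorded compute work on its own queue.
    // Assumes 'device' and 'computeWork' are valid; real code would also keep the
    // queue alive and synchronise with a fence before releasing anything.
    void SubmitAsyncCompute(ID3D12Device* device, ID3D12CommandList* computeWork)
    {
        D3D12_COMMAND_QUEUE_DESC desc = {};
        desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE; // separate from the DIRECT (graphics) queue

        ID3D12CommandQueue* computeQueue = nullptr;
        device->CreateCommandQueue(&desc, __uuidof(ID3D12CommandQueue),
                                   reinterpret_cast<void**>(&computeQueue));

        // The API only marks this work as independent; whether it truly runs
        // concurrently with graphics (GCN) or is context-switched in coarser
        // chunks (Maxwell) is decided by the hardware scheduler.
        ID3D12CommandList* lists[] = { computeWork };
        computeQueue->ExecuteCommandLists(1, lists);
    }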
 
Kepler is falling behind though, since everyone only looks at the GTX780 series cards and not any of the other ones. My GTX660 is now consistently much slower than an HD7870, when they were quite closely matched at launch, and the rest of the range really didn't overclock anywhere near as well in percentage terms as the GTX780. One of the reasons the review sites had to change their testing methodology (and I was talking with someone who knew a reviewer) was because Boost V1 tended to have issues over longer time periods, i.e. throttling, which a few German and French websites discovered, and it made the results look higher than they actually were.

Then you add in certain review sites, like pcgameshardware, which use pre-overclocked cards in their benchmarks, and even a highly overclocked GTX770 is not doing as well. Even a pre-overclocked GTX770 running at a 1.2GHz clockspeed is losing to a stock-clockspeed R9 280X in The Division:

http://www.pcgameshardware.de/The-Division-Spiel-37399/Specials/PC-Benchmarks-1188384/

The R9 280 and R9 280X had very minimal boost, i.e. around 50MHz, and these cards could hit 1.2GHz, which meant they could easily bypass an overclocked GTX770 in the game.

An overclocked GTX670 is slower than an overclocked HD7870 in the game too.
Unfortunately that's how it is.

Nvidia is more interested in winning the drag race (benchmarks) than the Le Mans 24 Hours (gaming performance over time) :p
 
One of the reasons the review sites had to change their testing methodology (and I was talking with someone who knew a reviewer) was because Boost V1 tended to have issues over longer time periods, i.e. throttling, which a few German and French websites discovered, and it made the results look higher than they actually were.

You might have hit on something there that I was overlooking. I do remember the drama with Boost V1. It could be that (some) sites are still testing Kepler Boost 2.0 cards (which in 99% of cases can sustain their boosts indefinitely) in the same way they do Kepler Boost 1.0 cards, hence why the results generally match up with the on-paper boost clock instead of the actual boost clocks that the cards will be running in the real world.
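
A toy simulation of that methodology problem, with entirely made-up numbers, just to show why run length matters once a Boost 1.0 card throttles after warming up:

    #include <cstdio>
    #include <initializer_list>

    int main() {
        const double boost_fps     = 60.0;  // fps while at full boost (assumed)
        const double throttled_fps = 52.0;  // fps after thermal throttling (assumed)
        const int throttle_after_s = 120;   // card throttles after ~2 minutes (assumed)

        for (int run_s : {30, 600, 3600}) {
            int hot_s = run_s > throttle_after_s ? run_s - throttle_after_s : 0;
            double avg = (boost_fps * (run_s - hot_s) + throttled_fps * hot_s) / run_s;
            std::printf("%5d s run: average %.1f FPS\n", run_s, avg);
        }
        return 0;
    }

With these assumed figures a 30-second run reports the full 60FPS, while a one-hour session averages about 52FPS, which is exactly the kind of flattering short-run result those German and French sites caught.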

In fact, the other day I had someone argue until they were blue in the face that Boost 2.0 came out with the first Maxwell cards, and they were a professional tech site writer, lol.
 
The fact that Maxwell had any proper compute tech sliced out of it to focus on pure DX11 gaming performance is probably going to be the main reason it falls behind, even more so than Kepler did.

Pascal is a full GPU architecture with plenty of compute shoved back in, and with DX12 and the way it likes to handle things, I'm sure Maxwell is going to look mighty bad once even more games switch to DX12 and Vulkan.

They were good cards, for the short term it seems.

Fiji also lost a lot of compute:
Fiji went from 1/2 DP down to 1/16th.
Maxwell went from 1/24 to 1/32 DP for the consumer part (the Tesla GPUs had 1/3 DP).
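
A back-of-envelope check of what those ratios mean in practice. Shader counts and clocks are approximate public figures, so treat the output as rough (FLOPS = shaders × 2 FMA ops × clock):

    #include <cstdio>

    int main() {
        struct Gpu { const char* name; int shaders; double mhz; double dp_ratio; };
        const Gpu gpus[] = {
            { "Hawaii (FirePro, 1/2 DP)",   2816, 1000.0, 1.0 / 2.0  },
            { "Fiji (Fury X, 1/16 DP)",     4096, 1050.0, 1.0 / 16.0 },
            { "GM200 (Maxwell, 1/32 DP)",   3072, 1000.0, 1.0 / 32.0 },
            { "GK110 (Tesla K20X, 1/3 DP)", 2688,  732.0, 1.0 / 3.0  },
        };
        for (const Gpu& g : gpus) {
            double sp_tflops = g.shaders * 2.0 * g.mhz * 1e6 / 1e12; // FMA = 2 ops/clock
            std::printf("%-28s SP %5.2f TFLOPS | DP %4.2f TFLOPS\n",
                        g.name, sp_tflops, sp_tflops * g.dp_ratio);
        }
        return 0;
    }

So despite Fiji's much bigger SP throughput, at 1/16 it manages only around 0.54 DP TFLOPS, versus roughly 2.8 for a 1/2-rate Hawaii and about 1.3 for the old GK110 Tesla.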
 
Kepler is falling behind though, since everyone only looks at the GTX780 series cards and not any of the other ones. My GTX660 is now consistently much slower than an HD7870, when they were quite closely matched at launch, and the rest of the range really didn't overclock anywhere near as well in percentage terms as the GTX780. One of the reasons the review sites had to change their testing methodology (and I was talking with someone who knew a reviewer) was because Boost V1 tended to have issues over longer time periods, i.e. throttling, which a few German and French websites discovered, and it made the results look higher than they actually were.

Then you add in certain review sites, like pcgameshardware, which use pre-overclocked cards in their benchmarks, and even a highly overclocked GTX770 is not doing as well. Even a pre-overclocked GTX770 running at a 1.2GHz clockspeed is losing to a stock-clockspeed R9 280X in The Division:

http://www.pcgameshardware.de/The-Division-Spiel-37399/Specials/PC-Benchmarks-1188384/

The R9 280 and R9 280X had very minimal boost, i.e. around 50MHz, and these cards could hit 1.2GHz, which meant they could easily bypass an overclocked GTX770 in the game.

An overclocked GTX670 is slower than an overclocked HD7870 in the game too.

This one has a much bigger gap between the 770 and 280X.

http://www.techspot.com/review/1148-tom-clancys-the-division-benchmarks/page2.html

Maybe they used the 16.3 drivers, but it doesn't actually say.
 
Regarding Kepler though, outside of maybe The Witcher 3 (which I've never played), I'm not really seeing Kepler falling behind. That isn't to say it couldn't be running better than it is if they were spending more time on optimising it, but that isn't something I can quantify. If you dig into a lot of benchmarks, you'll find they are either using old Kepler numbers without actually retesting the older cards when they say they have, using numbers given to them by AMD or nVidia, or testing Kepler cards limited to the reference on-paper clocks. The numbers you see are often anything up to 20% (though normally more like 10-15%) lower than what users will be experiencing in the real world, before you add in any end-user overclocking.

What's happening here in The Division, then? Kepler performance is dire at best compared to when it was released.

http://www.techspot.com/review/1148-tom-clancys-the-division-benchmarks/page3.html

The Kepler architecture just doesn't seem to cut it in this day and age, for whatever reason. As you said, it looks like GCN was a very forward-looking architecture, and it's proving it by still competing well now. Here a 7970 GHz Edition card is above the GTX780.
 
I'm sure it'll be a huge boon for the Xbox and PS4, but otherwise it seems like a hollow victory. The game is going to be locked to 60FPS yet again.

This probably only matters for low-end AMD cards, or high-end ones at 4K, and this year I'm hoping to be able to run 4K or 3440x1440 at well over 60FPS on titles like this.

NVIDIA obviously won't benefit at all, as their hardware can only emulate the function.

Were the game's frame rate not locked, I'd be very happy about this.
 
What's happening here in The Division, then? Kepler performance is dire at best compared to when it was released.

http://www.techspot.com/review/1148-tom-clancys-the-division-benchmarks/page3.html

The Kepler architecture just doesn't seem to cut it in this day and age, for whatever reason. As you said, it looks like GCN was a very forward-looking architecture, and it's proving it by still competing well now. Here a 7970 GHz Edition card is above the GTX780.

Don't you worry yourself about it; on Ultra and High settings my 780 is holding 60FPS, with occasional drops to 52.

It runs like a dream, along with Fallout 4.
 
1080p: I see a 7FPS gain, which is 9%; the Fury X gains only 5% (4FPS).

1440p: I see a 4FPS gain on your 390, which is 7%; the Fury X gains 2FPS, which is 3%.

4K: the Fury X and the rest of the AMD cards gain absolutely nothing.

And I remember (I might be wrong) that AMD said Hitman would use async shaders quite a lot more than Ashes.

http://www.computerbase.de/2016-03/...2/2/#diagramm-hitman-mit-directx-12-3840-2160

I don't look at benchmarks but run it on my own computer, and get 76-77FPS in DX11 and 87FPS in DX12 on max settings.
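
For reference, the percentages above are just (DX12 − DX11) / DX11. A quick sanity check with the quoted figures; note the 78FPS baseline in the first row is inferred from the "7FPS = 9%" claim, so treat it as an assumption:

    #include <cstdio>

    int main() {
        struct Result { const char* label; double dx11_fps; double dx12_fps; };
        const Result results[] = {
            { "quoted +7FPS / 9% case (baseline inferred)", 78.0, 85.0 },
            { "my own run, 76-77FPS -> 87FPS",              76.5, 87.0 },
        };
        for (const Result& r : results) {
            double gain = (r.dx12_fps - r.dx11_fps) / r.dx11_fps * 100.0;
            std::printf("%-44s +%.1f%%\n", r.label, gain);
        }
        return 0;
    }

That puts my own DX11-to-DX12 uplift at roughly 14%, noticeably bigger than the single-digit gains quoted from the computerbase numbers.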
 
Fiji also lost a lot of compute:
Fiji went from 1/2 DP down to 1/16th.
Maxwell went from 1/24 to 1/32 DP for the consumer part (the Tesla GPUs had 1/3 DP).

Fiji never lost any compute; it is still capable of 1/2 DP in hardware due to how GCN performs DP. They just limited the DP in software, like the desktop Hawaii parts are limited.

NVIDIA literally removed silicon to reduce the DP of Maxwell, which is why Maxwell never replaced Kepler in the Tesla parts, and why they don't list the DP specs of their Maxwell Quadro parts on their site.

Also, due to memory constraints, we will only see Hawaii replaced in the FirePro parts once Vega drops with HBM. No FirePro Fiji due to memory constraints.
 
Fiji never lost any compute; it is still capable of 1/2 DP in hardware due to how GCN performs DP. They just limited the DP in software, like the desktop Hawaii parts are limited.

NVIDIA literally removed silicon to reduce the DP of Maxwell, which is why Maxwell never replaced Kepler in the Tesla parts, and why they don't list the DP specs of their Maxwell Quadro parts on their site.

Also, due to memory constraints, we will only see Hawaii replaced in the FirePro parts once Vega drops with HBM. No FirePro Fiji due to memory constraints.

You need to provide some evidence for that, as everything I have read suggests AMD stripped transistors in Fiji, reducing the DP performance. Why on earth would they limit it in software? Nvidia did with Kepler for specific business reasons, to separate Tesla hardware sales. AMD doesn't have that reason; there is no compute-orientated Fiji.
 
You need to provide some evidence for that, as everything I have read suggests AMD stripped transistors in Fiji, reducing the DP performance. Why on earth would they limit it in software? Nvidia did with Kepler for specific business reasons, to separate Tesla hardware sales. AMD doesn't have that reason; there is no compute-orientated Fiji.

Read some white papers on how GCN performs DP. Also, desktop Hawaii is limited to 1/8 DP but is unlocked to 1/2 in the FirePro parts.
 