Workload is the likely cause - my GTX 980 is starting to tank hard above 6,000 unique units.
Please remember that any mention of competitors, hinting at competitors or offering to provide details of competitors will result in an account suspension. The full rules can be found under the 'Terms and Rules' link in the bottom right corner of your screen. Just don't mention competitors in any way, shape or form and you'll be OK.
But the dev already said they disabled async on Ashes for NVIDIA hardware, so what else is causing the drop in performance from DX11 to DX12, when Star Swarm, using the same engine, shows a boost from DX11 to DX12?
I think posting the Star Swarm DX12 AnandTech article is a massive own goal as far as trying to prove Ashes isn't being deliberately optimised for AMD hardware, because Star Swarm shows NVIDIA can and should get a performance improvement going from DX11 to DX12.
Whatever it is about Ashes that causes a drop in performance from DX11 to DX12 should be optional for NVIDIA hardware, which is probably what NVIDIA asked for: options - not having the feature disabled entirely or removed, but the option to turn it down, like turning down tessellation in other games.
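To make the "option, not removal" idea concrete, here's a purely hypothetical sketch (none of these names come from Oxide's code) of the kind of engine-side toggle being described: if async compute hurts on the detected hardware, the same work can simply be submitted on the graphics queue instead.

```cpp
// Hypothetical illustration only - not Oxide's actual code.
// If the async-compute path is turned off (by the user or a per-vendor
// default), the "compute" work is just submitted on the graphics (DIRECT)
// queue, so nothing is removed, only scaled back.
#include <d3d12.h>

struct RendererOptions {
    bool useAsyncCompute = true;  // exposed like any other graphics setting
};

// Pick which queue the compute command lists get submitted to.
ID3D12CommandQueue* SelectComputeSubmitQueue(const RendererOptions& opts,
                                             ID3D12CommandQueue* graphicsQueue,
                                             ID3D12CommandQueue* computeQueue)
{
    return opts.useAsyncCompute ? computeQueue : graphicsQueue;
}
```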
That still doesn't explain why they get a performance drop going from DX11 to DX12 when Star Swarm gets a performance boost... I have no problem with AMD's performance being better with async, but NVIDIA getting a performance drop when other DX12 demos get a boost is still odd.
I edited my last post, you might want to check it.
That sounds like a completely made up excuse, well done.
What? You do fully understand that Star Swarm is the engine tech demo from 2013, last updated in mid-2014, and here we have the game using the latest version - the game being Ashes of the Singularity, which will use the latest 2015 version of the engine.
Well done on not understanding how technology evolves. Star Swarm was the tech demo - Ashes is the game.
LOL, what's he squawking about, I wonder.
No doubt panicking and double-checking his next lot of copy-pastes haven't already been used elsewhere.
http://hardforum.com/showpost.php?p=1041825513&postcount=125
Well that's not good for Nvidia if true. Looks like AMD were actually onto something.
If you'll excuse me, I'm off to go eat my hat.
I'm not an expert, but I think AMD's GPUs basically run shader and compute throughput through the same memory pool; it's a lot like AMD's Heterogeneous System Architecture on the APU side, where parallel and serial workloads work as one. So, much like HSA can compute floating point operations in parallel through the high-speed engine while shader streaming is computed in parallel instead of serially, it has the potential to get a huge leg up in performance.

A GTX 980 Ti can handle both compute and graphics commands in parallel. What it cannot handle is asynchronous compute - that's to say, the ability for independent units (ACEs in GCN and AWSs in Maxwell/2) to function out of order while handling error correction (see the API-level sketch after this post).
It's quite simple if you look at the block diagrams of both architectures. The ACEs reside outside of the Shader Engines. They have access to the Global Data Share cache, the L2 R/W cache pools in front of each quad of CUs, as well as the HBM/GDDR5 memory, in order to fetch commands, send commands, perform error checking or synchronize for dependencies.
The AWSs, in Maxwell/2, reside within their respective SMMs. They may have the ability to issue commands to the CUDA cores residing within their respective SMMs, but communicating or issuing commands outside of their respective SMMs would mean sharing a single L2 cache pool. That cache pool has neither the space (sizing) nor the bandwidth to function in this manner.
Therefore, enabling async shading results in a noticeable drop in performance - so noticeable that Oxide disabled the feature and worked with NVIDIA to get the most out of Maxwell/2 through shader optimizations.
It's architectural. Maxwell/2 will NEVER have this capability.
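For anyone wondering what "async compute" actually looks like from the application side, here's a minimal D3D12 sketch (assuming an already-created ID3D12Device; error handling omitted). The point is simply that DX12 lets an engine submit work on a separate COMPUTE queue alongside the DIRECT (graphics) queue - whether the GPU genuinely overlaps the two queues is exactly the hardware question being argued about above.

```cpp
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Sketch, assuming `device` is a valid ID3D12Device* created elsewhere.
// DX12 "async compute" = a second, compute-only queue running alongside
// the graphics queue. GCN's ACEs can service it concurrently; the claim
// above is that Maxwell/2 cannot without costly context switching.
void CreateAsyncComputeQueues(ID3D12Device* device,
                              ComPtr<ID3D12CommandQueue>& graphicsQueue,
                              ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;    // graphics + compute + copy
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&graphicsQueue));

    D3D12_COMMAND_QUEUE_DESC compDesc = {};
    compDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute + copy only
    device->CreateCommandQueue(&compDesc, IID_PPV_ARGS(&computeQueue));

    // Command lists recorded for each queue type are then submitted with
    // ExecuteCommandLists() on their respective queue; ID3D12Fence objects
    // handle any cross-queue dependencies (e.g. graphics waiting on compute).
}
```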
Oxide effectively summarized my thoughts on the matter. NVIDIA claims "full support" for DX12, but conveniently ignores that Maxwell is utterly incapable of performing asynchronous compute without heavy reliance on slow context switching.
GCN has supported async shading since its inception, and it did so because we hoped and expected that gaming would lean into these workloads heavily. Mantle, Vulkan and DX12 all do. The consoles do (with gusto). PC games are chock full of compute-driven effects.
If memory serves, GCN has higher FLOPS/mm² than any other architecture, and GCN is once again showing its prowess when utilized with common-sense workloads that are appropriate for the design of the architecture.
Kinda makes me wonder whether Nvidia were lying again, or if there's something else going on... I guess we'll find out as more DX12 games come out. The fact that Nvidia asked for Async Compute to be disabled rings alarm bells.