Informative post. Makes a lot of sense with regard to the benchmark software.
I was looking at the same price range but decided on a Fury at £299 on here which I thought was a great price (I have a FreeSync monitor…)
All things being equal…
If you were coding for a game, as opposed to a benchmark, to be released on Xbox One, PS4 and Windows (DX12 or Vulkan), what would you likely be doing with the compute commands with regard to async compute, given the underlying hardware in the consoles is mostly identical, and how would that reflect on the Windows PC release?
Cheers. The Fury sounds like it was a good choice if you have a FreeSync monitor (and G-Sync really is comparatively a bit of a rip-off imo), especially at that price. I just ordered a 1060 myself after sleeping on it and seeing one in stock at my price range; I really can't go over 250 or she'll kill me.
The async compute thing is interesting. I will confess I don't work on any console code, but you pick up some info from various presentations.
Regardless of platform though, async isn't a guaranteed performance boost. It really depends on the workloads. For example, trying to async a bandwidth-heavy, low-instruction-count compute shader could actually end up slower than running it sequentially on the direct queue. It could easily cause the graphics workload to suffer bandwidth problems, and/or thrash the cache so that both queues' shaders take excessive cache misses, which uses up further bandwidth to pull the missed data from memory again.
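To make that concrete, here's a rough host-side sketch (my own illustration, not from any particular engine) of what putting compute on its own queue looks like in D3D12. The names are made up for the example; the point is just that the compute queue and the direct queue run independently until you make one wait on a fence.

```cpp
// Sketch only, assuming 'device' is an existing ID3D12Device and the compute
// command lists are recorded elsewhere. Error handling omitted for brevity.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

ComPtr<ID3D12CommandQueue> CreateAsyncComputeQueue(ID3D12Device* device)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE; // separate queue from the direct/graphics one
    desc.Priority = D3D12_COMMAND_QUEUE_PRIORITY_NORMAL;

    ComPtr<ID3D12CommandQueue> queue;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queue));
    return queue;
}

// Per frame, roughly: kick the compute work on its own queue, and only make
// the direct queue wait on a fence at the point where graphics actually
// consumes the compute results, so the two overlap up to there.
//
//   computeQueue->ExecuteCommandLists(1, &computeList);
//   computeQueue->Signal(fence.Get(), ++fenceValue);
//   directQueue->Wait(fence.Get(), fenceValue);
```

Whether that overlap helps or hurts is exactly the bandwidth/cache question above, which is why you profile it rather than assume.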
Anything designed for consoles is going to ensure its async work plays nice with GCN's architecture. You'd just profile it and go from there when deciding whether to async those compute commands in that part of the frame, or perhaps break a large compute kernel into separate kernels and partly async it, etc. I would imagine the PC version would also consider the same issues on NVidia, but Pascal has only been available for a couple of months, so some of these early titles may not have considered that, or even been able to profile the impact of async on NVidia until very recently (there's no real support for it on Maxwell). Fundamentally, most compute passes designed initially for GCN will likely run ok on NVidia architectures, though perhaps not ideally. Still, you'd write for 64-thread wavefronts on GCN, and those divide nicely into NVidia's 32-thread warps.
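As a toy example of that profile-and-decide step (invented names, just to show the shape of it): the per-pass choice between async and sequential has to be made when you record the pass, because a D3D12 command list can only be submitted to a queue of the matching type.

```cpp
// Illustrative only; the config would be filled in from profiling captures
// on each GPU family, not hard-coded like this.
#include <d3d12.h>

struct ComputePassConfig {
    bool runAsync = false; // e.g. true on GCN, false where the overlap hurt in captures
};

// The shader itself would typically be written for 64-wide groups, e.g.
// [numthreads(64, 1, 1)] in HLSL, which maps to one GCN wavefront or two
// NVidia 32-thread warps.
void RecordComputePass(const ComputePassConfig& cfg,
                       ID3D12GraphicsCommandList* directList,       // main direct/graphics list
                       ID3D12GraphicsCommandList* asyncComputeList, // list headed for the compute queue
                       UINT threadGroupsX)
{
    ID3D12GraphicsCommandList* target = cfg.runAsync ? asyncComputeList : directList;
    // ... SetPipelineState / SetComputeRootSignature / root arguments go here ...
    target->Dispatch(threadGroupsX, 1, 1);
}
```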
People might think devs are lazy, but honestly time is usually very limited, and game studios are infamous for going into 'crunch' to meet publisher dates (60-80 hour weeks). If the HLSL ports of your PS4 shaders don't cripple NVidia GPUs, some studios might not invest the time to optimise them and get that last 10% out of NVidia hardware. In the case of Vulkan, where you can use GCN intrinsics, why not port those too if you already have GCN-optimised shaders available from the PS4 version?
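On the Vulkan side, the kind of check you'd gate a GCN-specific shader variant behind looks roughly like this (my own sketch; VK_AMD_shader_ballot is just one of the AMD extensions that expose those intrinsics):

```cpp
#include <vulkan/vulkan.h>
#include <cstring>
#include <vector>

// Returns true if the physical device exposes the named device extension.
bool HasDeviceExtension(VkPhysicalDevice gpu, const char* name)
{
    uint32_t count = 0;
    vkEnumerateDeviceExtensionProperties(gpu, nullptr, &count, nullptr);
    std::vector<VkExtensionProperties> props(count);
    vkEnumerateDeviceExtensionProperties(gpu, nullptr, &count, props.data());
    for (const auto& p : props)
        if (std::strcmp(p.extensionName, name) == 0)
            return true;
    return false;
}

// e.g. only pick the GCN-optimised shader variant when the intrinsics exist:
//   const bool useGcnPath = HasDeviceExtension(gpu, "VK_AMD_shader_ballot");
```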
Ultimately, for any title where consoles are a first-class platform (not ported to after the PC release), it's going to be easier to optimise for GCN on PC, because that work is already partly done. I don't want to mislead, however: any studio that cares about PC as a platform will put serious effort into NVidia performance, simply because that's 70-80% of the PC market.