Kepler was actually a pretty big jump over Fermi, especially in the newer games at the time. The 660Ti was often 50% faster than the 560Ti, and the 670 often 50% faster than the 570. The 680 was less so over the 580, but when you remember that the 680 was only the 'midrange' Kepler on a 300mm² die compared to the 570mm² die of the 570/580, it's still quite an impressive leap.
All that changed there was that they renamed the x70/80 series to be midrange for a generation in order to sell the bigger die top end cards later as yields improved. But the leap was there and it was big.
Not according to TPU:
https://tpucdn.com/reviews/NVIDIA/GeForce_GTX_670/images/perfrel.gif
The GTX670 was around 30% faster than a GTX570.
This is the thing - expecting a 232mm² chip to be 50% faster than a GTX980TI is a lot.
If a 300mm² GK104 could barely get 10%, maybe 20% at most, over a 565mm² GF110, then Polaris 10 thrashing a GTX980TI would make it the biggest performance jump in like a decade.
If you want to compare similar sized dies, i.e. the GTX560TI at around 330mm² against the GK104 based GTX680 at 300mm², then it was 60% to 70% or thereabouts.
But that is the problem here. Polaris 10 is around 232mm². The closest GCN GPU in size is the one in the R9 270X and R9 370, which is 212mm².
Let's look at the latest TPU GPU review:
https://tpucdn.com/reviews/Gigabyte/GTX_980_Ti_XtremeGaming/images/perfrel_2560_1440.png
If a 232mm² Polaris 10 with a 256 bit memory controller was at Fury level performance, it would be nearly 2.25 times faster than a similar sized GCN1.0 die. If it were only at R9 390X level, it would be twice the performance.
So, if AMD want to get 50% over a GTX980TI, i.e. double an R9 390, they would need to get close to 3.77 times more performance out of a similar sized Polaris die when compared to GCN1.0.
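For anyone who wants to check the arithmetic, here is a rough Python sketch using only the multipliers quoted above. The numbers are my readings of the TPU chart relative to a 212mm² GCN1.0 card, not anything measured here, and the helper name `implied_baseline` is just made up for illustration.

```python
# Multipliers quoted in the post, all relative to an R9 270X class card (= 1.0).
# These are chart readings taken from the post, not benchmark results.
FURY_VS_270X = 2.25
R9_390X_VS_270X = 2.00
TARGET_VS_270X = 3.77  # "50% over a GTX980TI"


def implied_baseline(target: float, uplift: float) -> float:
    """Return the index of the card that `target` is `uplift` faster than."""
    return target / (1.0 + uplift)


# Index the GTX980TI must sit at for the 3.77x target to be "50% over" it.
gtx_980ti_vs_270x = implied_baseline(TARGET_VS_270X, 0.50)

# How much further the target is beyond Fury level and beyond R9 390X level.
target_over_fury = TARGET_VS_270X / FURY_VS_270X
target_over_390x = TARGET_VS_270X / R9_390X_VS_270X

print(f"Implied GTX980TI index: {gtx_980ti_vs_270x:.2f}x")
print(f"Target over Fury level: {target_over_fury:.2f}x")
print(f"Target over R9 390X level: {target_over_390x:.2f}x")
```

The point is that even Fury level performance from a chip this size leaves a large gap to the "50% over a GTX980TI" target.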
Now compare that to GCN1.2, i.e. the R9 380X.
The R9 380X has a 359mm² GPU. If P10 was at R9 390X level performance, it would be 50% faster, and if it was at Fury level performance it would be 68% faster, from a 35% smaller die.
At this point, if it needed to be 50% faster than a GTX980TI, it would need to be 2.82 times faster than Tonga whilst having 35% less surface area.
Hence, if we tried to compare GCN1.2 and P10 at the same die size, you would need Polaris to get around 4 times more performance out of a similar sized GCN1.2 die.
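The die size normalisation can be sketched the same way. This assumes performance scales linearly with die area, which is a simplification on my part, and `per_area_gain` is a hypothetical helper, not something from any library.

```python
# Die sizes quoted in the post, in mm^2.
POLARIS_10_MM2 = 232.0
TONGA_MM2 = 359.0  # R9 380X


def per_area_gain(perf_ratio: float, new_area: float, old_area: float) -> float:
    """Performance ratio rescaled to equal die area (assumes linear scaling)."""
    return perf_ratio * old_area / new_area


# Fraction by which Polaris 10 is smaller than Tonga.
area_reduction = 1.0 - POLARIS_10_MM2 / TONGA_MM2

# Performance ratios over the R9 380X quoted in the post for each scenario.
scenarios = {
    "R9 390X level": 1.50,
    "Fury level": 1.68,
    "50% over GTX980TI": 2.82,
}

gains = {
    name: per_area_gain(ratio, POLARIS_10_MM2, TONGA_MM2)
    for name, ratio in scenarios.items()
}

print(f"Polaris 10 is {area_reduction:.0%} smaller than Tonga")
for name, gain in gains.items():
    print(f"{name}: {gain:.2f}x per unit of die area")
```

On these inputs the last scenario comes out at roughly 4.4 times per unit of die area, so "around 4 times" is, if anything, on the low side.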
At that point, if AMD matches Fury with such a small chip, it is not doing that badly.
If Vega is 300mm² to 400mm², then with HBM2 it might be the one which beats a GTX980TI convincingly on the AMD side.