As I understand it, and I may be wrong:
Nvidia already use preemption in Maxwell. Preemption and scheduling are the same thing: instruction streams are organised in a way that avoids idle threads by pre-timing call instructions so that they interleave.
AMD are adding instruction scheduling to Polaris; asynchronous shaders will be carried forward from GCN 1.1 and 1.2.
Preemption, or scheduling, is used to reduce draw call overhead in DX11 and can also be used in DX12.
AMD did not use scheduling because it can introduce latency: the instructions have to be held in a stack, which is necessary in order to organise them.
In AMD's GPUs the instructions flow straight through without stacking. The downside is a reduced flow of calls compared with scheduling, so it is less efficient; the upside is improved latency.
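To put rough numbers on that trade-off, here is a toy Python model. It is only a sketch of the idea as I described it, not how any real GPU works: the function names, the one-tick execution cost, the one-tick cost for switching between instruction kinds and the window of 4 are all assumptions made up for illustration.

```python
def direct_issue(stream, switch_cost=1):
    """Issue instructions in arrival order (instruction i arrives at tick i).

    Returns (tick the first instruction starts, tick the last one finishes).
    """
    clock, prev, first_start = 0, None, None
    for arrival, kind in enumerate(stream):
        clock = max(clock, arrival)          # cannot start before it arrives
        if prev is not None and kind != prev:
            clock += switch_cost             # hypothetical cost of changing kind
        if first_start is None:
            first_start = clock
        clock += 1                           # one tick to execute
        prev = kind
    return first_start, clock


def scheduled_issue(stream, window=4, switch_cost=1):
    """Hold `window` instructions, group them by kind, then issue the batch."""
    clock, prev, first_start = 0, None, None
    for base in range(0, len(stream), window):
        batch = stream[base:base + window]
        ready = base + len(batch) - 1        # tick the last one in the batch arrives
        clock = max(clock, ready)            # the whole batch waits for it
        for kind in sorted(batch):           # grouping by kind cuts down switches
            if prev is not None and kind != prev:
                clock += switch_cost
            if first_start is None:
                first_start = clock
            clock += 1
            prev = kind
    return first_start, clock


stream = "AB" * 8                            # 16 instructions alternating in kind
print("direct:   ", direct_issue(stream))
print("scheduled:", scheduled_issue(stream, window=4))
```

With these made-up costs the direct path starts its first instruction at tick 0 but finishes at tick 31, while the scheduled path cannot start until tick 3 and finishes at tick 26. That is the shape of the argument: holding instructions back costs latency up front and buys throughput overall.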
CryEngine actually has the same scheduling system in its abstraction layer. This is why it is able to run DX11 so beautifully balanced across up to 16 threads, and why AMD's 8-core FX CPUs, unusually compared with almost all other games, run Crysis 3 ever so slightly better than a 3770K.
Nothing to do with AMD's partnership with Crytek in making the engine for Crysis 3.
Anyway...
AMD's solution was to have multiple schedulers in the hardware: 8 in GCN 1.1 and 1.2, the ACE units.
The problem is that it's a bit like an 8-core CPU in GPU form, and DX11 cannot handle this.
Mantle was the first API capable of running multiple shader threads.
DX12 has the same architecture, but of course it's not at all borrowed from Mantle, it's just a coincidence.
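The "8-core CPU in GPU form" point can be sketched the same way. This is a deliberately crude Python model, assuming independent jobs that never compete for shared resources (real GPU work does compete); `makespan`, the job costs and the round-robin dealing are hypothetical and stand in for no real API.

```python
def makespan(job_costs, usable_queues):
    """Finish time when jobs are dealt round-robin onto the usable queues."""
    queues = [0] * usable_queues
    for i, cost in enumerate(job_costs):
        queues[i % usable_queues] += cost    # each queue runs its own jobs in series
    return max(queues)                       # done when the busiest queue is done


jobs = [3, 1, 2, 2, 1, 3, 2, 2]              # made-up job costs in ticks

# An API that can only feed one queue leaves the other seven idle.
print("one queue fed:   ", makespan(jobs, 1))
# An API that can feed all eight lets the jobs overlap.
print("eight queues fed:", makespan(jobs, 8))
```

In this toy case the single-queue run takes 16 ticks and the eight-queue run takes 3, which is the whole appeal of having the extra schedulers, provided the API can actually keep them fed.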
Nvidia have proven how useful scheduling is in DX11, and AMD will now do the same for DX11, while for DX12 the asynchronous hardware remains the same.
Pascal is the same as Maxwell; the difference is that the asynchronous capabilities derived from scheduling have been switched on for Pascal.
AlamoX posted the video that explains this nicely.