It was not tailored for any specific architecture. With asynchronous compute, it overlaps different rendering passes and runs them in parallel when possible. The engine fills multiple parallel queues, and the drivers determine how those queues are processed.
The reason Maxwell doesn't take a hit is that NVIDIA has explicitly disabled async compute in the Maxwell drivers. So no matter how much work we pile onto the queues, it cannot be set to run asynchronously because the driver says "no, I can't do that". Basically, the NV driver tells Time Spy to go "async off" for the run on that card. If NVIDIA enables async compute in the drivers, Time Spy will start using it. Whether that turns out to be a performance gain or a loss depends on the hardware and drivers.
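
To make the split of responsibilities concrete, here is a rough toy model in Python. It is not Time Spy code and not how any real driver works: `Queue`, `ToyDriver`, and the millisecond costs are all made up for illustration, and the "perfect overlap" rule is an idealized assumption. The only point is that the engine submits the same work to the same queues either way, and the driver setting alone decides whether that work overlaps.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    """Hypothetical work queue filled by the engine (costs in milliseconds)."""
    name: str
    work_ms: list = field(default_factory=list)

    def submit(self, cost_ms: float) -> None:
        self.work_ms.append(cost_ms)

    def total(self) -> float:
        return sum(self.work_ms)


@dataclass
class ToyDriver:
    """Hypothetical driver: it alone decides whether queues may overlap."""
    async_enabled: bool

    def frame_time(self, graphics: Queue, compute: Queue) -> float:
        if self.async_enabled:
            # Idealized assumption: perfect overlap, so the longer queue
            # sets the frame time. Real hardware rarely overlaps this well.
            return max(graphics.total(), compute.total())
        # "Async off": the queues are processed one after the other.
        return graphics.total() + compute.total()


def build_frame():
    """The engine fills both queues the same way, whatever the driver does."""
    graphics = Queue("graphics")
    compute = Queue("compute")
    graphics.submit(6.0)  # made-up cost of one rendering pass
    graphics.submit(4.0)  # made-up cost of another rendering pass
    compute.submit(3.0)   # made-up cost of a compute pass
    return graphics, compute


graphics, compute = build_frame()
serial_ms = ToyDriver(async_enabled=False).frame_time(graphics, compute)
overlap_ms = ToyDriver(async_enabled=True).frame_time(graphics, compute)
print(serial_ms, overlap_ms)
```

In this sketch the submitted work is identical in both runs; only the driver flag changes the result. On real cards the overlap is never this clean, and depending on the hardware and drivers it can end up as a gain or a loss.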