AMD provided us with an explanation of their approach that's worth reading in its entirety, so here it is:
With a traditional graphics API, multi-GPU arrays like AMD CrossFire™ are typically utilized with a rendering method called "alternate-frame rendering" (AFR). AFR renders odd frames on the first GPU, and even frames on the second GPU. Parallelizing a game's workload across two GPUs working in tandem has obvious performance benefits.
As AFR requires frames to be rendered in advance, this approach can occasionally suffer from some issues:
· Large queue depths can reduce the responsiveness of the user's mouse input
· The game's design might not accommodate a queue sufficient for good mGPU scaling
· Predicted frames in the queue may not be useful to the current state of the user’s movement or camera
Thankfully, AFR is not the only approach to multi-GPU. Mantle empowers game developers with full control of a multi-GPU array and the ability to create or implement unique mGPU solutions that fit the needs of the game engine. In Civilization: Beyond Earth, Firaxis designed a "split-frame rendering" (SFR) subsystem. SFR divides each frame of a scene into proportional sections, and assigns a rendering slice to each GPU in AMD CrossFire™ configuration. The "master" GPU quickly receives the work of each GPU and composites the final scene for the user to see on his or her monitor.
If you don’t see 70-100% GPU scaling, that is working as intended, according to Firaxis. Civilization: Beyond Earth’s GPU-oriented workloads are not as demanding as other recent PC titles. However, Beyond Earth’s design generates a considerable amount of work in the producer thread. The producer thread tracks API calls from the game and lines them up, through the CPU, for the GPU's consumer thread to do graphics work. This producer thread vs. consumer thread workload balance is what establishes Civilization as a CPU-sensitive title (vs. a GPU-sensitive one).
Because the game emphasizes CPU performance, the rendering workloads may not fully utilize the capacity of a high-end GPU. In essence, there is no work leftover for the second GPU. However, in cases where the GPU workload is high and a frame might take a while to render (affecting user input latency), the decision to use SFR cuts input latency in half, because there is no long AFR queue to work through. The queue is essentially one frame, each GPU handling a half. This will keep the game smooth and responsive, emphasizing playability, vs. raw frame rates.
Let me provide an example. Let's say a frame takes 60 milliseconds to render, and you have an AFR queue depth of two frames. That means the user will experience 120ms of lag between the time they move the map and that movement is reflected on-screen. Firaxis' decision to use SFR halves the queue down to one frame, reducing the input latency to 60ms. And because each GPU is working on half the frame, the queue is reduced by half again to just 30ms.
In this way the game will feel very smooth and responsive, because raw frame-rate scaling was not the goal of this title. Smooth, playable performance was the goal. This is one of the unique approaches to mGPU that AMD has been extolling in the era of Mantle and other similar APIs.
All I can say is: thank goodness. Let's hope we see more of this kind of thing from AMD and major game studios in the coming months and years. Multi-GPU solutions don't have to double their FPS averages in order to achieve smoother animations or improved responsiveness. I'd much rather see a multi-GPU team producing more modest increases that the user can actually feel and experience.