That's some good info NickK, I'd always assumed it was a driver issue, but on the grounds of locking people out rather than, as you explained, a radically different software process.
I fully understand about the CPU being the bottleneck for the GPU, but would two cores feeding the GPU be better than one in this case? Or is it all delegated to one core for simplicity's sake?
The GPU itself is fed with DMA transfers from system memory into GPU memory - the same process used for texture loading in games/graphics, because the data is stored as textures.
Data is read back by effectively copying what is rendered (the transformed textures) into system memory, again using DMA.
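If it helps to picture it, here's a rough sketch of that round trip in modern CUDA terms (not the texture-based path the old clients actually used, and the buffer size is just made up): pinned system memory so the DMA engine can pull from it directly, one copy up, one copy back.

// Minimal CUDA sketch of the upload/readback round trip.
// Pinned (page-locked) host memory lets the GPU's DMA engine
// transfer directly without an extra staging copy.
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    const size_t n = 1 << 20;              // 1M floats - arbitrary size
    float *host_buf, *gpu_buf;

    cudaMallocHost(&host_buf, n * sizeof(float));   // pinned system memory
    cudaMalloc(&gpu_buf, n * sizeof(float));        // GPU memory

    for (size_t i = 0; i < n; ++i) host_buf[i] = (float)i;  // "packaged" data

    // DMA up to the GPU...
    cudaMemcpy(gpu_buf, host_buf, n * sizeof(float), cudaMemcpyHostToDevice);

    // ...the GPU programs would run here...

    // ...then DMA the results back into system memory.
    cudaMemcpy(host_buf, gpu_buf, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("first result: %f\n", host_buf[0]);
    cudaFree(gpu_buf);
    cudaFreeHost(host_buf);
    return 0;
}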
It's this packaging of data into textures that needs the CPU's help, so this is the part that could be done multi-threaded.
DC applications such as folding require this packaging & unpackaging to be done continuously (although a common optimisation is to keep the data in GPU memory between GPU programs when it doesn't need transferring).
So, yes, a multi-threaded system could make use of multiple cores to feed a GPU, although the bottleneck then becomes PCI-E and memory bus bandwidth - the CPU packaging, the GPU DMA transfers and everything else all compete for system memory bandwidth.
The downside is that the chunks of data would have to be quite large, otherwise the cost of keeping multiple threads synchronised (and the data in the CPU caches etc.) would undo the benefit.
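As a purely illustrative sketch of the multi-core idea (my own example, not how any real client is written): two host threads each package their own chunk into pinned buffers, and the uploads are queued on separate streams. The chunk size and thread count are arbitrary, and note that both copies still share the same PCI-E link and system memory bus.

// Rough sketch: two CPU threads "package" chunks while the transfers
// are queued on separate streams so packaging and copying can overlap.
#include <cuda_runtime.h>
#include <thread>
#include <vector>

static void package_chunk(float *dst, size_t n, float seed)
{
    // Stand-in for the real packaging work (format conversion etc.)
    for (size_t i = 0; i < n; ++i) dst[i] = seed + (float)i;
}

int main()
{
    const size_t chunk = 1 << 22;            // arbitrary: 4M floats per chunk
    float *host[2], *dev[2];
    cudaStream_t stream[2];

    for (int c = 0; c < 2; ++c) {
        cudaMallocHost(&host[c], chunk * sizeof(float));  // pinned, DMA-friendly
        cudaMalloc(&dev[c], chunk * sizeof(float));
        cudaStreamCreate(&stream[c]);
    }

    // One packaging thread per chunk - this is the part that scales with cores.
    std::vector<std::thread> workers;
    for (int c = 0; c < 2; ++c)
        workers.emplace_back(package_chunk, host[c], chunk, (float)c);
    for (auto &t : workers) t.join();

    // Queue both DMA uploads; past a point, more threads just fight
    // over the same PCI-E and memory bandwidth.
    for (int c = 0; c < 2; ++c)
        cudaMemcpyAsync(dev[c], host[c], chunk * sizeof(float),
                        cudaMemcpyHostToDevice, stream[c]);

    cudaDeviceSynchronize();

    for (int c = 0; c < 2; ++c) {
        cudaFree(dev[c]);
        cudaFreeHost(host[c]);
        cudaStreamDestroy(stream[c]);
    }
    return 0;
}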
The CPU also has to organise which GPU programs execute, as the GPU is actually quite dumb. The GPU programs are loaded by the same DMA process (sometimes a few are pre-loaded) and the CPU triggers the start of each program's execution.
So the CPU becomes the administrator, and the GPU does the actual data processing.
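Pulling that together, here's a small sketch of the administrator role (again in CUDA terms rather than the texture pipeline, with made-up kernels): the CPU uploads once, triggers two GPU programs back to back on data that stays resident in GPU memory, and only reads the result back at the end.

// Sketch of the CPU as administrator: it queues GPU programs and
// transfers, while the data stays resident in GPU memory between them.
#include <cuda_runtime.h>
#include <stdio.h>

__global__ void step_one(float *d, int n)        // first "GPU program"
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] = d[i] * 2.0f;
}

__global__ void step_two(float *d, int n)        // second "GPU program"
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] = d[i] + 1.0f;
}

int main(void)
{
    const int n = 1 << 20;
    float *host, *dev;
    cudaMallocHost(&host, n * sizeof(float));
    cudaMalloc(&dev, n * sizeof(float));
    for (int i = 0; i < n; ++i) host[i] = (float)i;

    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);  // one upload

    // The CPU merely triggers each program; no copy back in between.
    step_one<<<(n + 255) / 256, 256>>>(dev, n);
    step_two<<<(n + 255) / 256, 256>>>(dev, n);

    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);  // one readback
    printf("host[1] = %f (expect 3.0)\n", host[1]);

    cudaFree(dev);
    cudaFreeHost(host);
    return 0;
}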