Next gen GPU client in open beta

uncle_fungus · 10 Apr 2008 at 15:18

http://foldingforum.org/viewtopic.php?f=10&t=2020

If you've got an HD 2xx0 or 3xx0 series card then you can start folding on it right away.

verbal · 10 Apr 2008 at 15:30

Cool, been waiting for this. I don't have a card yet but I'll probably get one and use this client just cos I like to be different.

theheyes · 10 Apr 2008 at 15:50

Nice. Although I probably won't be using it myself, it will be interesting to see what happens with it.

edit - it would also be nice to see some numbers: PPD, FLOPS, etc. Just for numbers' sake, of course. I wonder if it will make some waves in the client statistics?

uncle_fungus · 10 Apr 2008 at 16:12

At the moment it's possible to make around 1800 PPD off an HD3870. No FLOP counts as of yet.

theheyes · 10 Apr 2008 at 16:32

Thanks. Just been reading the FAQ in the link you gave, it's a good bit of info. Also, one of the forum posts suggests it seems to be CPU limited? Looks like there's a lot of mileage in this new GPU client!

uncle_fungus · 10 Apr 2008 at 16:55

theheyes said:
Thanks. Just been reading the FAQ in the link you gave, it's a good bit of info. Also, one of the forum posts suggests it seems to be CPU limited? Looks like there's a lot of mileage in this new GPU client!

Yes, CPU speed is limiting the client at the moment. The GPUs are so fast that they can perform all the parallel parts of the code faster than the CPU can do the serial parts.
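The serial-versus-parallel point above is essentially Amdahl's law. As a rough sketch only (the fractions and speedups below are made-up illustration values, not measurements from the GPU client), the effect can be put into numbers like this:

```python
def amdahl_speedup(parallel_fraction: float, parallel_speedup: float) -> float:
    """Overall speedup when only the parallel part of the work is accelerated.

    parallel_fraction: share of the original run time that can run in parallel (0..1).
    parallel_speedup:  how much faster the accelerator runs that share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if parallel_speedup <= 0:
        raise ValueError("parallel_speedup must be positive")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / parallel_speedup)


def speedup_ceiling(parallel_fraction: float) -> float:
    """Best possible speedup if the parallel part took no time at all."""
    if parallel_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)


if __name__ == "__main__":
    # Hypothetical: 90% of the work is parallel and the GPU runs it 100x faster.
    print(amdahl_speedup(0.9, 100.0))
    # However fast the GPU gets, the serial 10% on the CPU caps the gain.
    print(speedup_ceiling(0.9))
```

With those assumed numbers the whole job gets a bit over 9x faster, and no GPU could push it past roughly 10x, which is why a faster CPU still helps.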

Duke · 10 Apr 2008 at 17:25

Anything for Nvidia around?

theheyes · 10 Apr 2008 at 18:35

Duke said:
Anything for Nvidia around?

No, and there won't be for the foreseeable future unfortunately. I don't know the ins and outs but it's not as programmable as the ATI cards or something, afaik.

NickK · 11 Apr 2008 at 13:01

The nVidia issue is that they cannot guarantee what is actually executed.

CTM is effectively assembler, so what they code is not touched by the AMD drivers on the host machine.

CUDA output is basically recompiled by the drivers on the host machine, so if nV mess up the drivers etc. then they can't guarantee valid results.

I hope that helps.

NickK · 11 Apr 2008 at 13:04

uncle_fungus said:
Yes, CPU speed is limiting the client at the moment. The GPUs are so fast that they can perform all the parallel parts of the code faster than the CPU can do the serial parts.

Yup - think of it like this:
a) the GPU is able to do hundreds of maths operations in parallel, but on a specific data format.
b) the CPU has to organise the data into that format.
c) programs require the CPU to prepare the data before passing it to the GPU, including arranging the data transfers.

The more maths you can load into each GPU pass the better; otherwise the overhead of organising the data, plus the data transfer times, prevents it being viable.
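To put that last point into numbers, here is a minimal sketch of a cost model for one GPU pass. Every parameter name and timing value is a hypothetical placeholder, and it assumes packing, transfer and compute happen one after another with no overlap, which a real client may well avoid.

```python
def gpu_pass_efficiency(
    items: int,
    ops_per_item: int,
    gpu_s_per_op: float,
    cpu_pack_s_per_item: float,
    transfer_s_per_item: float,
    fixed_overhead_s: float,
) -> float:
    """Fraction of a pass's wall time spent on useful GPU maths.

    Assumes (for illustration only) that CPU packing, the transfer and the
    GPU compute run strictly back to back.
    """
    overhead = fixed_overhead_s + items * (cpu_pack_s_per_item + transfer_s_per_item)
    compute = items * ops_per_item * gpu_s_per_op
    return compute / (compute + overhead)


if __name__ == "__main__":
    # Made-up timings: same data, same overhead, more maths per item each time.
    for ops in (10, 1_000, 100_000, 1_000_000):
        eff = gpu_pass_efficiency(
            items=1000,
            ops_per_item=ops,
            gpu_s_per_op=1e-9,
            cpu_pack_s_per_item=1e-6,
            transfer_s_per_item=1e-6,
            fixed_overhead_s=1e-3,
        )
        print(f"{ops:>9} ops/item -> {eff:.1%} of the pass is GPU maths")
```

The overhead is the same on every line of output; only the amount of maths per item changes, and the efficiency climbs with it.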

theheyes · 11 Apr 2008 at 15:31

That's some good info NickK, I'd always assumed it was a driver issue but on the grounds of locking people out rather than, as you explained, a radically different software process.

I fully understand about the CPU being the bottleneck for the GPU, but would two cores feeding the GPU be better than one in this case? Or is it all designated to one CPU for simplicity's sake?

Dougiebabe2003 · 11 Apr 2008 at 19:00

Anyone getting good results on this client?

I'm still going to be running S@H, but when I get my quad I could let S@H have 3 cores and have one core feeding the GPU client...

Just a thought...

P.S. I have a 2900XT atm but it won't clock up to 3D gfx speeds when I run it (tried it for 5 mins). Someone is offering me £110 for it; would it be worth me doing this and then getting a 3870 to fold on, or an 8800GT?

Cheers, Doug.

NickK · 11 Apr 2008 at 19:28

theheyes said:
That's some good info NickK, I'd always assumed it was a driver issue but on the grounds of locking people out rather than, as you explained, a radically different software process.

I fully understand about the CPU being the bottleneck for the GPU, but would two cores feeding the GPU be better than one in this case? Or is it all designated to one CPU for simplicity's sake?

The GPU itself is fed with DMA transfers from system memory into GPU memory - the same process is used for texture loading for games/graphics. This is because the data is stored as textures.
Data is read back by effectively copying what is rendered (or the transformed textures) into system memory, using DMAs again.

It's this packaging of data into textures that needs the CPU's help. So this could be done multi-threaded.
DC applications such as folding require this packaging & unpackaging to be done continuously (although optimisation often means trying to keep the data in GPU memory between GPU programs when it doesn't need transferring).

So, yes, a multi-threaded system could make use of multiple cores to feed a GPU, although the bottleneck then becomes the PCI-E and memory bus bandwidth for all these operations (CPU packaging, GPU DMA transfers and everything else all use the system memory bandwidth).
The downside is that the size of the data would have to be quite large, otherwise keeping multiple threads synchronised (and the data in the CPU caches etc.) would undo the benefit.

The CPU also has to organise the GPU programs to execute too as the GPU is actually quite dumb. The GPU programs are loaded by the same DMA process (sometimes a few are pre-loaded) and the CPU triggers the start of each program's execution.

So the CPU becomes the administrator, the GPU does the actual data processing.
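The administrator/worker split described above can be sketched as a small producer-consumer pipeline. This is only a structural illustration: `pack_batch` and `fake_gpu_kernel` are hypothetical stand-ins (no real GPU, DMA or texture API is involved), and Python threads will not give a true multi-core speedup for CPU-bound packing.

```python
import queue
import struct
import threading


def pack_batch(values):
    """CPU side: lay floats out as a flat float32 buffer (stand-in for a texture)."""
    return struct.pack(f"{len(values)}f", *values)


def fake_gpu_kernel(buffer):
    """Stand-in for one GPU pass: unpack the buffer and sum the squares."""
    count = len(buffer) // 4  # 4 bytes per float32
    values = struct.unpack(f"{count}f", buffer)
    return sum(v * v for v in values)


def run_pipeline(batches, n_packers=2, queue_depth=4):
    """Several CPU packer threads feed one consumer that plays the GPU."""
    work = queue.Queue()
    for batch in batches:
        work.put(batch)
    packed = queue.Queue(maxsize=queue_depth)  # bounded, like limited GPU memory
    results = []

    def packer():
        while True:
            try:
                batch = work.get_nowait()
            except queue.Empty:
                break
            packed.put(pack_batch(batch))
        packed.put(None)  # sentinel: this packer has no more work

    def gpu():
        finished = 0
        while finished < n_packers:
            item = packed.get()
            if item is None:
                finished += 1
            else:
                results.append(fake_gpu_kernel(item))

    threads = [threading.Thread(target=packer) for _ in range(n_packers)]
    threads.append(threading.Thread(target=gpu))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)


if __name__ == "__main__":
    print(run_pipeline([[1.0, 2.0], [3.0], [4.0, 0.5]]))
```

The bounded queue is the interesting design choice: if the packers outrun the consumer they block, and if the consumer outruns the packers it sits idle, which is the CPU-limited case discussed in this thread.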

theheyes · 11 Apr 2008 at 20:32

Very good info, NickK. Thanks for taking the time to answer. If I understand what you have written correctly, then we should see some big improvement on newer Nehalem systems with the new QPI memory access (although the PCIe interface will remain the same).

IRT Doug, £110 sounds like a good price for that card if you're looking to upgrade. Also, the 8800GT, or any Nvidia card, will not work with Folding@home's GPU client - only AMD cards.

Dougiebabe2003 · 12 Apr 2008 at 15:30

Anyone got any PPD on this client yet?

theheyes · 12 Apr 2008 at 16:08

Roughly 800-900 PPD depending on the card and the CPU. Test work units are out at the moment and points are being reviewed, so everything is a bit up in the air. If I were considering the client I would hang fire on buying new hardware until other people have done all the donkey work.

breman1972 · 12 Apr 2008 at 16:27

Dougiebabe2003 said:
Anyone got any PPD on this client yet?

I got a PPD of 650 with an Asus EAH3650 and a quad core @ 3GHz with 2GB of RAM.

Going to stick with the SMP clients, at least I get 3520 PPD with those.

uncle_fungus · 12 Apr 2008 at 19:32

theheyes said:
Roughly 800-900 PPD depending on the card and the CPU. Test work units are out at the moment and points are being reviewed, so everything is a bit up in the air. If I were considering the client I would hang fire on buying new hardware until other people have done all the donkey work.

You can get 1800+ PPD with a 3870 and a fast CPU.

theheyes · 12 Apr 2008 at 19:55

Is that running 2x GPU2 clients on it? The modal figure I was seeing bandied about was 800-900 PPD, but there are obviously massive variations between setups. Regardless, you know more about it than I do.

muon · 12 Apr 2008 at 23:54

Is it a sign of CPU limitation that the GPU core doesn't get fully loaded? I only get around 74% load on the GPU while one of my CPU cores is fully loaded.

edit: overclocking the CPU did the trick. GPU usage is now ~95%. I get 2095 PPD with the 2799 WU, doing each frame in 40s. HD3850 (256MB) clocked at 864/999.
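For anyone wanting to compare their own frame times against figures like these, here is a rough sketch of the PPD arithmetic. It assumes a work unit is 100 frames (one frame per 1% of progress); the points value of the 2799 WU is not given in this thread, so the code only works out what the quoted 2095 PPD would imply under that assumption.

```python
SECONDS_PER_DAY = 86_400


def ppd(points_per_wu: float, seconds_per_frame: float, frames_per_wu: int = 100) -> float:
    """Points per day for a client finishing work units back to back.

    frames_per_wu=100 is an assumption (one frame per 1% of a work unit).
    """
    seconds_per_wu = seconds_per_frame * frames_per_wu
    return points_per_wu * SECONDS_PER_DAY / seconds_per_wu


def implied_points_per_wu(ppd_value: float, seconds_per_frame: float,
                          frames_per_wu: int = 100) -> float:
    """Invert the formula: what a WU must be worth to give the quoted PPD."""
    return ppd_value * seconds_per_frame * frames_per_wu / SECONDS_PER_DAY


if __name__ == "__main__":
    # Figures quoted above: 2095 PPD at 40 s per frame.
    print(round(implied_points_per_wu(2095, 40), 2))
    # Same hypothetical WU value at a slower 55 s per frame.
    print(round(ppd(97, 55)))
```

Under the 100-frame assumption, 40s per frame is 4000s per work unit, or 21.6 work units a day.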