F@H - Linux + SMP + NV-GPU2

I've been rooting through many threads on various forums & wikis, and it seems there's a viable installation path for Linux 64-Bit SMP + WinDoze GPU2 using wine, CUDA 2.0 and a custom Driver Wrapper.

Pros:
--> Linux SMP Goodness
--> Nvidia Multi-GPU Goodness
--> ATI GPU Goodness (TBA, it's in the works from what I've read)
--> Headless Operation (No Monitors, Mouse, KB)
--> No WinDoze license or the usual woes of WinDoze XP / Vista.

PPD Nominal Estimates:
--> Q6600 @ 3.0GHz + 2x 8800 GTS (G92) Stock => 15K
--> I've seen data on a quad with 4x GPU cards; the PPD is shocking.

Cons / Concerns:
--> Requires a Custom Driver Wrapper = Done.
--> Native Linux install, so you lose your WinDoze, unless it's a VM.
--> Haven't seen a VMware install version, not to say it's not happening. If it were possible, it would make OC'ing the GPUs a breeze.
--> Takes some configuring to avoid needing X11 installed (see the sketch after this list). Not a major issue.
--> Not Fully Supported by the Pande Group, but it's not being rejected.
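
From what I've gathered, the "no X11" part boils down to a small boot script (it's in the NVIDIA CUDA release notes) that loads the nvidia module and creates the /dev/nvidia* device nodes the CUDA runtime expects, since there's no X server around to create them. Something along these lines:

#!/bin/bash
# Load the nvidia kernel module and create the device nodes CUDA needs
# when no X server is running (adapted from the CUDA release notes).
/sbin/modprobe nvidia
if [ "$?" -eq 0 ]; then
  # Count the NVIDIA controllers found on the PCI bus.
  N3D=$(lspci | grep -i NVIDIA | grep -c "3D controller")
  NVGA=$(lspci | grep -i NVIDIA | grep -c "VGA compatible controller")
  N=$((N3D + NVGA - 1))
  for i in $(seq 0 $N); do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done
  mknod -m 666 /dev/nvidiactl c 195 255
fi

Drop that into /etc/rc.local (or an init script) and the cards show up for CUDA without X ever being installed.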

So, where does that leave us? We need a rock-solid method / guide for installing this beast (maybe add it to SirusB's F@H SMP Client Guide?). I'm not a Linux guru, but I can follow directions well, test, and provide feedback.

Any Takers ??
 
There's unlikely to be a VMware solution for that, as VMware blocks direct access to the hardware (that's also why 3D doesn't work in VMware).
 
It's a reality. Through the dedicated work of many people (too many to credit here), I, along with many other folks, am running Linux SMP + GPUs, and so can you. It's dead easy (once you've cracked the first one).

Here's my current Setup:
CPU: Q6600. My grandmother's wheelchair rolls faster than that thing will clock (duff cores)
MOB: GA-P35-DS3L, was in the garage collecting spider webs
RAM: OCZ PC-6400, old junk I found in the parts bin
PSU: ToughPower 650, with 2x bad 12V rails .. :eek:
HDD: 20GB Seagate, bottom-of-the-barrel crud
CAS: Piece of wood, WxDxH 24" x 12" x 1/2". Looks good though. I used Velcro to bolt things down ... LOL ...
GPU: New EVGA 9800 GT (only decent part in the thing)
OSS: Ubuntu 8.10 Server Edition, CLI only, no GUI
NIC: Netgear 311v3 .. Used ndiswrapper with just the .sys and .inf files; up and running in seconds (see the sketch below).
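
For reference, the ndiswrapper dance is roughly this (the exact .inf name depends on the Windows driver package, and the package name below is from memory for Hardy/Intrepid):

sudo apt-get install ndiswrapper-utils-1.9   # ndiswrapper userspace tools
sudo ndiswrapper -i <driver>.inf             # install the Windows driver (.sys must sit next to the .inf)
ndiswrapper -l                               # should report "driver installed, hardware present"
sudo modprobe ndiswrapper                    # load the module; the card comes up
sudo ndiswrapper -m                          # write the modprobe alias so it loads at boot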

Here's the Best Part: (1x SMP + 1x GPU, from mostly trash bin material):

-- Q6600-3-1 --
Min. Time / Frame : 10mn 57s - 2524.93 ppd
Avg. Time / Frame : 10mn 57s - 2524.93 ppd
Cur. Time / Frame : 10mn 59s - 2517.27 ppd
R3F. Time / Frame : 10mn 59s - 2517.27 ppd
Eff. Time / Frame : 11mn 04s - 2498.31 ppd

-- Q6600-3-GP1 --
Min. Time / Frame : 1mn 24s - 4937.14 ppd
Avg. Time / Frame : 1mn 24s - 4937.14 ppd
Cur. Time / Frame : 1mn 24s - 4937.14 ppd
R3F. Time / Frame : 1mn 24s - 4937.14 ppd
Eff. Time / Frame : 19mn 11s - 4937.14 ppd

Total PPD: => 7500 PPD .. I can live with that.

I've not even come close to maxing out the cores. Load averages are 3.36 / 3.42 / 2.58 ... the 2.58 will go up to the mid 3's in a day or so.

So, with 2x GPUs and 1x SMP, you're looking at a minimum of 12K PPD.
 
Awesome work mate :cool: I certainly couldn't have been arsed to faff about with something like that to get it working, but I'm sure glad someone can! :)

Unfortunately, the way I have things set up, my CPUs are BOINCing and my GPUs are Folding. So until I get bored living in three camps (which is bound to happen sooner or later), or I get to the point where I have to decide which project I really want to focus on (like a challenge or something) :), all I can do is sit back and applaud!
 
After I figured out I had HW problems, things went smoothly. The install, assuming you already have the Linux server CLI installed, takes about 10 minutes.
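
Roughly, those 10 minutes break down like this (the file names below are placeholders - pull the exact driver, client and CUDA-wrapper versions from the guides on foldingforum.org):

sudo apt-get install wine unzip                  # the GPU2 client is a Win32 binary
sudo sh NVIDIA-Linux-x86_64-<version>-pkg2.run   # NVIDIA driver with CUDA 2.0 support
mkdir ~/fah-gpu && cd ~/fah-gpu
unzip <Folding@home-Win32-GPU>.zip               # Stanford's Windows GPU2 console client
cp <wrapper-package>/cudart.dll .                # custom CUDA wrapper - exact file/location per the forum guide
wine <GPU2-client>.exe -forcegpu nvidia_g80      # first run walks you through the usual client config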

I suspect this will be automated in the not too distant future, as the benefits for Stanford are pretty healthy from a production standpoint.

There are 2.25 million Windows CPUs, 345,000 Linux CPUs, and 60,000 GPU clients (ATI + NV). If they pull just 10% of the Linux clients, they would double the number of high-output GPU clients.

And that's not even the best part. Even with 2.25 million Windows CPUs, their TFLOPS pale in comparison to NV at 1,754 TFLOPS, roughly tied with the PS3 at ~1,753 TFLOPS.

The TFLOP numbers will be staggering if GPU folding adoption rolls into the mainstream.

I have a side project I've been sketching up using mini-ITX boards that have an x16 PCIe slot. Won't give away the details yet, but it's going to be cool.
 
Looking forward to it KE1HA! I always thought it would be cool if there was an (Intel) Atom-based motherboard that had a PCI-e x16 slot, but I couldn't find one... It would really cut the power consumption and keep the points up from the GPU.
 
I don't know about an Atom-based / VIA board, but there are mini-ITX boards that support C2D, and they definitely have x4 or above PCIe slots; that's all for now :D
 
That's cool stuff, but 2500ppd is pretty low for 4-core SMP on Linux. Is it running an a1 core WU or a2?

I tried to get this working a little while ago. The Q6600 at 3GHz was doing nearly 6000ppd from one SMP instance on a2 core, but I gave up because I couldn't even get the nV driver to install without killing the X server, let alone the GPU client. Good job on getting it running! Automation would be nice as the process for setting up the custom CUDA wrapper is quite involved... then again, most people who are running Linux don't mind a bit of tinkering.
 
SMP has actually gone up a good bit, mid-3500 range. Will post the data later this evening, as that will be a full 24 hours of crunching on both.
 
Rig-1
1x 9800 GT Stock
1x SMP CPU at 2.7 ish
Total PPD => 8,390

-- Q6600-3-1-SMP --
Min. Time / Frame : 8mn 43s - 3171.85 ppd
Avg. Time / Frame : 8mn 48s - 3141.82 ppd
Cur. Time / Frame : 9mn 01s - 3066.32 ppd
R3F. Time / Frame : 8mn 53s - 3112.35 ppd
Eff. Time / Frame : 8mn 57s - 3089.16 ppd

-- Q6600-3-GP1 --
Min. Time / Frame : 1mn 19s - 5249.62 ppd
Avg. Time / Frame : 1mn 25s - 4879.06 ppd
Cur. Time / Frame : 1mn 19s - 5249.62 ppd
R3F. Time / Frame : 1mn 19s - 5249.62 ppd
Eff. Time / Frame : 1mn 19s - 5249.62 ppd

---------------------------------------------

Rig-2
2x 9800 GT Stock
1x SMP CPU at Stock 2.4Ghz
Total PPD => 12,045

-- Q6600-1-1 --
Min. Time / Frame : 10mn 22s - 2667.01 ppd
Avg. Time / Frame : 10mn 27s - 2645.74 ppd
Cur. Time / Frame : 10mn 22s - 2667.01 ppd
R3F. Time / Frame : 10mn 22s - 2667.01 ppd
Eff. Time / Frame : 10mn 21s - 2667.01 ppd

-- Q6600-1-GP1 --
Min. Time / Frame : 1mn 23s - 4996.63 ppd
Avg. Time / Frame : 1mn 23s - 4996.63 ppd
Cur. Time / Frame : 1mn 23s - 4996.63 ppd
R3F. Time / Frame : 1mn 23s - 4996.63 ppd
Eff. Time / Frame : 1mn 24s - 4937.14 ppd


-- Q6600-1-GP2 --
Min. Time / Frame : 1mn 23s - 4996.63 ppd
Avg. Time / Frame : 1mn 23s - 4996.63 ppd
Cur. Time / Frame : 1mn 23s - 4996.63 ppd
R3F. Time / Frame : 1mn 23s - 4996.63 ppd
Eff. Time / Frame : 1mn 23s - 4996.63 ppd
---------------------------------------------

I'm just happy I don't have to mess with dummy loads or monitors at all with these things now :D
 
Intrepid + GPU folding?

Hey. I'm trying to set up folding on an 8800GT under Ubuntu 8.10, 64 bit. This seems as good a place as any to ask.

Is there a version of this 'Custom Driver Wrapper' available which will help? I got as far as a (Windows) command line under wine which told me that my graphics card was unsupported. I'm using X11, which presumably makes things simpler.

I basically can't find a guide applicable to Intrepid, and following one for Hardy led to the above result. Any advice please?
 
These threads will give you all the info you could ever need on the matter:

1. http://foldingforum.org/viewtopic.php?f=52&t=6793

2. http://foldingforum.org/viewtopic.php?f=52&t=3744&sid=747f23b5c3214fc274b34f0771930ce1

Personally I followed the first guide - I did not run the client headless, but it's pretty much the same. The second link is the 'development' thread - lots of info, but it is quite hard to wade through. There is a link to the custom wrapper in the thread. It works fine - you do need to use the -forcegpu nvidia_g80 flag though, as it will pick up ATI projects (bad!). I did have a few problems getting it to quit properly - using Ctrl+C to exit as you would with the Windows client didn't work and gave guarded run errors every time. I had to kill the process manually by typing 'top' into the command line, finding the PID of the FahCore_11 process and using the 'kill <PID>' command to prevent errors (sketch below) - other than that it worked fine for me.
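
For anyone following along, the launch / shutdown bit looks roughly like this (the client .exe name depends on which GPU2 console package you grabbed):

wine <GPU2-client>.exe -forcegpu nvidia_g80   # force NV work units so it doesn't grab ATI projects
# Ctrl+C doesn't always shut it down cleanly; if you get guarded-run errors,
# find the core's PID and kill it by hand:
ps ax | grep -i fahcore_11                    # or find the PID in 'top', as above
kill <PID>                                    # <PID> = the FahCore_11 process number from the line above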

Hope all goes well - you could join the folding forum (the official Stanford forum) and ask some of the very clever Linux-y people who developed the wrapper if you need any help, but the guide in the first link is fairly comprehensive.
 
Ah, those were good links. Now folding on my graphics card as well.

Thank you very much, the first one worked following it step by step :)
 
In my experience, no. Comparing Vista to Linux, Vista was getting roughly 200 ppd more from both cards of my GX2. I have not tried it with the GTX 280. This could be a limitation of the wrapper/Wine - it is not an officially supported combination.

One of the biggest problems I found with Linux was that the GPU client really disrupts the SMP client whilst running. On Ubuntu 8.04, the SMP client would use ~95% of the CPU (across 4 cores), which should have allowed the GPU clients to run in the remaining 5% of 'free' CPU. I found that no matter what I tried, it would not use more than 95% in total. This is a scheduler issue. The result was the SMP client got around 5400 ppd, down from 6200 when not running the GPU clients. Other programs also really slowed the SMP client (each core needs to be in sync for max ppd), and in the end it was more worthwhile points-wise to use Windows and VMware, where I can have more control over the scheduler.
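
For reference, the usual knobs on the Linux side are process priority and CPU affinity - something like the sketch below (and no, it never got me past that ~95% cap either):

sudo renice -5 -p $(pgrep -f FahCore_11)   # bump the GPU2 core so its feeder thread isn't starved
sudo renice 19 -p $(pgrep -f FahCore_a)    # push the SMP a1/a2 cores to the lowest priority
taskset -cp 0-3 $(pgrep -f mpiexec)        # pin the SMP MPI job to all four cores (assumes one SMP instance)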
 