Updates to Windows and Linux SMP

Marine Iguana · 24 Jan 2010 at 08:15

We have just posted updates for all our SMP-capable clients except for the OS/X-Installer version. These updates do two things:
1. fix a small but important bug
2. enable the clients for the upcoming release of our first SMP2 cores (threads-based SMP)

The new client version is 6.29. Please update your clients at your earliest convenience if you are running SMP, particularly if you are running -bigadv work units.
We anticipate swapping over -bigadv work units to use only the new client in the coming weeks.

Client downloads are available at:
http://folding.stanford.edu/English/Download

and windows SMP at:
http://folding.stanford.edu/English/DownloadWinOther

Senture · 24 Jan 2010 at 11:17

Thanks for the update. Put the new build on my Linux VMs and it bricked the WU at 90%.

Marine Iguana · 24 Jan 2010 at 11:33

Senture said:
Thanks for the update. Put the new build on my Linux VMs and it bricked the WU at 90%.

Hope mine goes better

SiriusB · 24 Jan 2010 at 15:55

Will the new thread-based SMP improve Windows performance? I imagine it should as it wouldn't have to use that god-awful MPI crap.

JonJ678 · 25 Jan 2010 at 00:44

Moved to the new client which promptly nuked the bigadv unit it was previously working on. Waiting to see what it does now, it's downloading something.

Starting to think consistent normal units are a better idea than occasional massive ones.

No longer need to track down a different fah6 file for bigadv folding, the 6.29 one does it quite happily.

Marine Iguana · 25 Jan 2010 at 00:48

Also

upcoming release of SMP2 cores

After a long development process, we are excited to announce the upcoming release of SMP2 (threads-based SMP) cores to public testing. The first SMP2-based core is the A3 core, and it will soon become available on advanced methods for OS/X (Intel), 64-bit Linux, and Windows. We are still doing development work to refine the A3 core, but it is at the point where we are ready for public testing.

We are excited about the SMP2 cores because the threads-based parallelization allows us to dispense with the MPI-based parallelization that added an extra layer of complexity and was particularly troublesome for Windows users. We anticipate phasing out the earlier SMP cores and work units in favor of these new ones; at this point in the changeover process, our Windows SMP client will still require MPI to be installed so that the client can handle an A1 work unit if no A3 work units are available. In the near future we will release an updated Windows client that does not require MPI.

The SMP2 cores require a client update; please upgrade your SMP-capable clients to at least version 6.29. We will gradually discontinue SMP projects for earlier clients.

Important: the SMP2 cores use the early-completion bonus system that we piloted with the bigadv work units. We have revamped the benchmarking system to work with this bonus system. The base point values for SMP2 work units will appear low; the benchmarked points values **include bonuses.** Some third-party utilities have been updated to include these bonuses in their calculations.

Please see an accompanying post regarding the bonus system.

One important part of the bonus system is that users:
1. Must use a passkey to receive bonus points
2. Must successfully return >=10 A2 or SMP2 work units with their passkey to receive bonus points
3. Must successfully return >80% of A2 or SMP2 work units to receive bonus points
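As a rough illustration, the three conditions above can be written as a single check. This is only a sketch: the function and parameter names are hypothetical, and it assumes the return rate is successful returns divided by assigned work units, which the announcement does not spell out.

```python
def qualifies_for_bonus(has_passkey, returned_ok, assigned):
    """Return True if a donor meets the three listed bonus conditions.

    has_passkey  -- whether the donor folds with a passkey
    returned_ok  -- A2/SMP2 work units successfully returned with that passkey
    assigned     -- A2/SMP2 work units assigned (assumed denominator)
    """
    if not has_passkey:
        return False
    if returned_ok < 10:          # condition 2: at least 10 successful returns
        return False
    if assigned <= 0:
        return False
    # condition 3: strictly more than 80% returned, as the list states
    return returned_ok / assigned > 0.80


if __name__ == "__main__":
    print(qualifies_for_bonus(True, 10, 12))   # about 83% returned
    print(qualifies_for_bonus(True, 9, 9))     # too few work units so far
```

A donor who has only just entered a passkey starts from zero returned work units, so the second condition is the one that delays the first bonus.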

We will shortly perform a limited "reset" of the bonus-qualifying work unit history. Important: users who have qualified for a bonus will remain bonus-qualified. We will also maintain the % returned for users but will reduce the overall counts to 10. As we do not have an automated "timeout" for bonus qualification history, we may occasionally perform such resets in the future.

Thanks! We are excited to release these new cores to the public.

And

Points system for SMP2 work units

The SMP2 Core A3 work units mark the debut of a new points system. We have been testing the key element of this system--early completion bonuses--in the bigadv work unit program. Please refer to this document for a more detailed explanation of the points system. We are also changing our benchmark system over to a Core i5. Points have been calibrated against previous benchmarking setups, as described below.

Introduction
Points are a key aspect of distributed computing projects such as Folding@home (FAH): they both indicate to donors how much they have contributed and foster the friendly competition between donors that has always been an essential part of distributed computing. Folding@home’s point system is based on the concept of a benchmark machine, i.e. a particular class of hardware which we use as a standard to define how many points a given calculation should get. The choice of this benchmark machine can have implications for points for donors. Moreover, how we use this benchmark is important.

Our benchmarking philosophy tries to balance two elements: keeping the system reasonably simple (both for donors and for the FAH team to calculate) and keeping points in alignment with the scientific value of a given calculation. Indeed, donors will optimize their machines (e.g. choice of hardware, choice of clients, etc.) based on points, so it is important that the points awarded reflect the scientific gain.

While our basic benchmark idea is pretty simple, this document is fairly long in order to give donors full details about how we have chosen the benchmark machine, as well as detailed information about this machine and how it could impact points for donors.

Benchmark philosophy
Our philosophy is pretty simple: we would like to standardize benchmarks to a single machine and standardize and simplify the bonus schemes now employed. Bonuses have played a key role in aligning points with science and we will continue to use them. For example, returning work units (WUs) promptly can be very important for the science we’re doing, so we provide bonuses for this, especially with the high performance clients.

Machines used in comparison
We chose a 2.2 GHz E6600 as the prototype dual-core machine and a Q6600 at either 2.4 GHz or 3.2 GHz as the prototype quad-core.
The new benchmark machine is a Core i5-750 with Turbo Mode off. We compare single-core performance to the old benchmark machine, a 2.8 GHz Pentium 4.

FAH Projects used in the comparison
We base comparisons to the single-core benchmark machine on projects 4442 and 6315, comparing single-core speed on the 2.8 GHz Pentium 4 to ideal quad-core speed on the 2.6 GHz Core i5 machine.
We base comparisons to quad-core machines on project 2671.
We base comparisons to dual-core machines on project 6012.

Results
Machine      Performance relative to Core i5
P4 2.8       0.098 (on project 4442)
P4 2.8       0.12 (on project 6315)
E6600        0.30
Q6600-3.2    1.1
Q6600-2.4    0.82

Based on these multiplicative speed factors, we can project ppd output based on either the A1 or the A2 benchmarking standards.
Machine      A1 ppd    A2 ppd
E6600        521       1663
Q6600-3.2    1933      6172
Q6600-2.4    1450      4629

Bonus point formula
Briefly summarizing our bonus formula: the bonus applies to users who have a passkey, have successfully returned at least 10 bonus-eligible WUs, successfully return >=80% of assigned WUs, and return the WU before the preferred deadline. Bonus points do not apply to partial returns.

Our bonus formula calculates final points as follows:
final_points = base_points * max(1,sqrt(k*deadline_length/elapsed_time))
Note that the max(1,...) ensures that final_points are never lower than base_points.
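The formula above can be sketched in a few lines of Python. The function and argument names are made up for illustration, and the only requirement on units is that deadline_length and elapsed_time use the same one (for example, days); this is not the code the assignment servers run.

```python
import math


def final_points(base_points, k, deadline_length, elapsed_time):
    """Apply the early-completion bonus to a work unit's base points.

    deadline_length and elapsed_time must be in the same time unit.
    """
    if elapsed_time <= 0:
        raise ValueError("elapsed_time must be positive")
    bonus_factor = math.sqrt(k * deadline_length / elapsed_time)
    # max(1, ...) keeps the result from dropping below base_points
    return base_points * max(1.0, bonus_factor)


if __name__ == "__main__":
    # Hypothetical work unit: 100 base points, k = 2, 4-day deadline.
    print(final_points(100, 2, 4, 2))    # returned after 2 days
    print(final_points(100, 2, 4, 16))   # returned after 16 days, no bonus
```

Because of the square root, halving the return time multiplies the bonus factor by about 1.41 rather than doubling it.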

We can convert this formula to points per day as follows:
ppd = base_ppd * speed_ratio * max(1,sqrt(x*speed_ratio)),
where speed_ratio is the machine speed relative to the Core i5, and x = k * deadline_length.
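For anyone who wants to try numbers, here is a small sketch of the points-per-day form of the formula. The function name is hypothetical. The example values (base ppd of 1024, x of 30) are the ones quoted in the parameter determination section, and the speed ratios are the rounded ones from the results table, so the output should not be expected to reproduce the projected table exactly.

```python
import math


def projected_ppd(base_ppd, speed_ratio, x):
    """Points per day for a machine running at speed_ratio times the Core i5."""
    bonus_factor = max(1.0, math.sqrt(x * speed_ratio))
    return base_ppd * speed_ratio * bonus_factor


if __name__ == "__main__":
    # Rounded speed ratios taken from the results table in the post.
    ratios = {"E6600": 0.30, "Q6600-3.2": 1.1, "Q6600-2.4": 0.82}
    for machine, ratio in ratios.items():
        print(machine, round(projected_ppd(1024, ratio, 30)))
```

Above the point where the bonus kicks in, ppd grows with speed_ratio to the power 1.5, which is why faster machines gain disproportionately.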

Parameter determination
If we set the new quad-core base ppd to 1024 and the parameter x to 30, we get the following results:

Machine      Projected ppd
E6600        903 (greater than A1, less than A2)
Q6600-3.2    6456 (greater than A2)
Q6600-2.4    4628 (approximately equal to A2)
P4           171 (on project 4442)
P4           228 (on project 6315)

Explanation of x parameter
We may vary the deadline length between projects (some projects require fast completion and thus have short deadlines). Each project has an associated k parameter that controls the bonus points yield. We standardize k as follows:
x * speed_ratio = k * deadline_length / elapsed_time
since we can express speed_ratio as Core_i5_time / elapsed_time:
x * Core_i5_time / elapsed_time = k * deadline_length / elapsed_time
therefore:
x * Core_i5_time = k * deadline_length
solving for k, we obtain:
k = x * Core_i5_time / deadline_length
and since x is set to 30,
k = 30 * Core_i5_time / deadline_length, where Core_i5_time is the time to complete a work unit on our Core i5 benchmark machine.
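The derivation above boils down to one line of arithmetic. The sketch below uses hypothetical names and an invented work unit (half a day on the Core i5, three-day deadline) purely to show that the resulting k makes the per-work-unit bonus term equal to x * speed_ratio.

```python
def k_factor(core_i5_time, deadline_length, x=30.0):
    """k for a project, given the Core i5 completion time and the deadline.

    Both times must be in the same unit; x defaults to the value of 30
    quoted in the post.
    """
    if deadline_length <= 0:
        raise ValueError("deadline_length must be positive")
    return x * core_i5_time / deadline_length


if __name__ == "__main__":
    core_i5_time = 0.5      # days on the benchmark machine (made-up value)
    deadline_length = 3.0   # days (made-up value)
    k = k_factor(core_i5_time, deadline_length)

    elapsed_time = 1.0                         # a donor machine half as fast
    speed_ratio = core_i5_time / elapsed_time
    # Both sides of the first equation in the derivation:
    print(30.0 * speed_ratio, k * deadline_length / elapsed_time)
```

Since k absorbs the deadline, a project with a shorter deadline gets a larger k, and the bonus a given machine earns depends only on its speed relative to the Core i5.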

Summary
According to our projections, this new benchmarking standard will result in points yield for a 2.8 GHz P4 that is slightly above the typical uniprocessor values, points yield for a 2.2 GHz E6600 that is greater than typical A1 core yields but less than typical A2 core yields, points yield for a 3.2 GHz Q6600 that is greater than typical A2 yields, and additional points yield rewards for faster systems. The crossover point between A3 and A2 ppd in speed falls approximately at a 2.4 GHz Q6600.

Can't wait

Marine Iguana · 25 Jan 2010 at 07:17

Just picked up the A3 core with a 470-point WU; we'll see how it goes.

Bonus calculator HERE for those eligible. I will need to wait until I have done the 10 WUs, as I only just put my passkey in.

KE1HA · 25 Jan 2010 at 09:51

Marine Iguana said:
Just picked up the A3 core with a 470-point WU; we'll see how it goes.

Bonus calculator HERE for those eligible. I will need to wait until I have done the 10 WUs, as I only just put my passkey in.

Oh, man, we need to add passkeys to get the bonus PPD for these things? I don't have passkeys on anything. I guess I need to do that ASAP.

vertica · 25 Jan 2010 at 10:02

I'm 50% through a 6012 (core _a3), FAHmon obviously shows the project running at a very low ppd (sub 1k), but then the bonus calc shows a reduction too - about 4.5k rather than 5k or 6k ppd from my quad.

Marine Iguana · 25 Jan 2010 at 10:09

My i7 keeps throttling back even though temps are in the low 70 °C range. I now have it at 3.8 GHz on stock volts. I don't know why this is happening, and it's annoying me.

Also got 1k PPD, but I won't get a bonus till I have done ten.

KE1HA · 25 Jan 2010 at 10:24

I've got an A3 underway now. FAHMon is saying about 11 hours to run it, and I had a couple of hours' break in there this evening when I was re-configuring things. I don't have the -bigadv flag set on my box, so I don't know what's up with the A3 stuff for non-8-core boxes, but it is running on UB-64bit. It's a pretty stout processor, an Extreme Edition running with a mild OC at about 3.8 and 4GB of Reaper RAM. I suspect the CPU cache has a lot to do with a particular CPU's performance, but that's only a guess.

Marine Iguana · 25 Jan 2010 at 17:10

Right now I have it more or less not downclocking itself, and I'm getting 1500 PPD. If I could get the bonus, that would mean a PPD of 9.5K.

sir-les-mp · 25 Jan 2010 at 17:15

How do I get a passkey for the SMP Windows client?
I've updated the client as advised.

Marine Iguana · 25 Jan 2010 at 17:19

Go here http://fah-web.stanford.edu/cgi-bin/getpasskey.py

.walls · 25 Jan 2010 at 17:47

SiriusB said:
Will the new thread-based SMP improve Windows performance? I imagine it should as it wouldn't have to use that god-awful MPI crap.

oooh... is this thread based?

If it is... I might build one of these

Marine Iguana · 25 Jan 2010 at 19:27

.walls said:
oooh... is this thread based?

If it is... I might build one of these

Now that would be ludicrous..... so when are you going to build it?

Biffa · 26 Jan 2010 at 17:24

So on a Q9650 do I need the -bigadv switch to get the bonus points with this client?

I've got an A3 WU but FAHMon is only saying 623 PPD for this vs the 3-4K PPD on notfred in a VM with an A2 WU

JonJ678 · 26 Jan 2010 at 18:03

Don't believe so. bigadv is a ridiculous work unit which a very fast quad core can just about get through fast enough to get bonus points. It was used to trial the bonus points, which are now available with the A3 work units. FAHMon might not be able to work out the bonuses yet.

That's my understanding anyway, every time I try to force it to get a new work unit I'm getting some terrible A1 unit that refuses to use more than one core so I'm not sure I believe A3 units exist.

I'd quite like to know how long it takes your q9650 to get through a bigadv unit if your quad is on 24/7 anyway, my i7 is taking about 60 hours for each one it actually manages to finish (intolerant of reboots).

Biffa · 26 Jan 2010 at 18:14

OK, switched two machines (Q9650 and Q6600) over to the new client; we'll see how they go. Will switch one more tonight and maybe turn my server back onto folding (have to watch the power meter).

Using the funky HFM.NET instead of fahmon now.

Oh also running the SMP2 client as a service on XP

miniyazz · 26 Jan 2010 at 18:54

JonJ678 said:
That's my understanding anyway, every time I try to force it to get a new work unit I'm getting some terrible A1 unit that refuses to use more than one core so I'm not sure I believe A3 units exist.

I'd quite like to know how long it takes your q9650 to get through a bigadv unit if your quad is on 24/7 anyway, my i7 is taking about 60 hours for each one it actually manages to finish (intolerant of reboots).

Do we need -advmethods, do you know? I seem to be getting a similar godawful A1 unit as you, although coincidentally it happened at the same time as upgrading the client. Using two cores (ish) of my CPU only. I've had to reopen two of my Windows Gromacs (no SMP) cores that give me a quarter the PPD of my Linux SMP client

Due to finish this core some time in the morning.. about 24 hours after starting it :mad:

Edit: Yep it appears to have been the A1 unit. Back on A2 now and going properly, but still no sign of A3.. -advmethods needed?