Linux SMP EUE's

Anyone on the beta team want to have a look at this please?

Both quotes have been snipped:)

The first EUE...

[16:23:04]
[16:23:04] *------------------------------*
[16:23:04] Folding@Home Gromacs SMP Core
[16:23:04] Version 1.73 (November 27, 2006)
[16:23:04]
[16:23:04] Preparing to commence simulation
[16:23:04] - Ensuring status. Please wait.
[16:23:04] - Starting from initial work packet
[16:23:04]
[16:23:04] Project: 3025 (Run 8, Clone 178, Gen 12)
[16:23:04]
[16:23:04] Assembly optimizations on if available.
[16:23:04] Entering M.D.
[16:23:22] al work packet
[16:23:22]
[16:23:22] Project: 3025 (Run 8, Clone 178, Gen 12)
[16:23:22]
[16:23:22] Entering M.D.
[16:23:22] ne 178, Gen 12)
[16:23:22]
[16:23:22] Entering M.D.
[16:23:28] ompleted 0 out of 5000000 steps (0 percent)
[16:23:28] a SSE boost OK.
[16:29:13] les
[16:29:13] Completed 50000 out of 5000000 steps (1 percent)
[16:34:58] Writing local files
[16:34:58] Completed 100000 out of 5000000 steps (2 percent)


[01:19:17] Writing local files
[01:19:17] Completed 4700000 out of 5000000 steps (94 percent)
[01:24:23] Quit 101 - NaN detected: (ener[20])
[01:24:23]
[01:24:23] Simulation instability has been encountered. The run has entered a
[01:24:23] state from which no further progress can be made.
[01:24:23] This may be the correct result of the simulation, however if you
[01:24:23] often see other project units terminating early like this
[01:24:23] too, you may wish to check the stability of your computer (issues
[01:24:23] such as high temperature, overclocking, etc.).
[01:24:23] Going to send back what have done.
[01:24:23] logfile size: 110476
[01:24:23] - Writing 111026 bytes of core data to disk...
[01:24:23] ... Done.
[01:24:24]
[01:24:24] Folding@home Core Shutdown: EARLY_UNIT_END
[01:24:28] CoreStatus = 72 (114)
[01:24:28] Sending work to server

Now the next unit....

[02:51:42][02:51:42] Folding@Home Gromacs SMP Core
[02:51:42] Version 1.73 (November 27, 2006)
[02:51:42]
[02:51:42] Preparing to commence simulation
[02:51:42] - Ensuring status. Please wait.
[02:51:42] - Starting from initial work packet
[02:51:42]
[02:51:42] Project: 3025 (Run 8, Clone 178, Gen 12)

[02:51:59] - Starting from initial work packet
[02:51:59]
[02:51:59] Project: 3025 (Run 8, Clone 178, Gen 12)
[02:51:59]
[02:51:59] Entering M.D.
[02:52:05] ompleted 0 out of 5000000 steps (0 percent)
[02:52:05] a SSE boost OK.
[02:58:10] les
[02:58:10] Completed 50000 out of 5000000 steps (1 percent)
[03:04:25] Writing local files
[03:04:25] Completed 100000 out of 5000000 steps (2 percent)
[03:10:43] Writing local files

[11:42:56] Completed 4650000 out of 5000000 steps (93 percent)

[11:48:36] Writing local files
[11:48:36] Completed 4700000 out of 5000000 steps (94 percent)
[11:53:46] Quit 101 - NaN detected: (ener[20])

[11:53:46] Folding@home Core Shutdown: EARLY_UNIT_END
[11:53:52] CoreStatus = 72 (114)
[11:53:52] Sending work to server

It looks like it had another go at the same unit and failed at the same place. Is my rig OK and the WU faulty?
Cheers, Matty.
 
It's failed at exactly the same point, so it's either a bad WU or it isn't happy getting past that particular point on your machine. Either way, I wouldn't worry unless you start seeing more EUEs on other WUs.
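
If you'd rather check that than eyeball it, here's a rough Python sketch that pulls each "NaN detected" quit out of a log and reports the unit and the last completed step before it. The regexes are only guesses based on the lines quoted in this thread, and the function names (find_nan_failures, failed_at_same_point) are made up for the example; other core versions may log things differently.

```python
import re

# Patterns guessed from the log lines quoted in this thread (assumption:
# other core versions may word these lines differently).
UNIT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
# "ompleted" rather than "Completed" so truncated lines still match.
STEP_RE = re.compile(r"ompleted (\d+) out of \d+ steps")
FAIL_RE = re.compile(r"Quit 101 - .*NaN detected")


def find_nan_failures(log_text):
    """Return [(unit, last_completed_step), ...], one entry per NaN quit.

    unit is a (project, run, clone, gen) tuple, or None if no Project
    line was seen before the failure.
    """
    failures = []
    unit = None
    last_step = None
    for line in log_text.splitlines():
        m = UNIT_RE.search(line)
        if m:
            # A Project line marks the start of an attempt; reset progress.
            unit = tuple(int(g) for g in m.groups())
            last_step = None
            continue
        m = STEP_RE.search(line)
        if m:
            last_step = int(m.group(1))
            continue
        if FAIL_RE.search(line):
            failures.append((unit, last_step))
    return failures


def failed_at_same_point(failures):
    """True if there are at least two failures and all are identical."""
    return len(failures) >= 2 and len(set(failures)) == 1
```

Feed it the text of the log file. If failed_at_same_point() comes back True for the same Project/Run/Clone/Gen, that fits the "same WU, same place" reading, though it can't tell you on its own whether the WU or the machine is at fault.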
 
From the wiki:

Quit 101 - Fatal error: NaN detected: (ener[xx])

Quit 101 - Fatal error: NaN detected: (ener[xx])
...snip...
Folding@home Core Shutdown: EARLY_UNIT_END


xx can be 0, 11, 12, 13, 18
Note: NaN is short for "Not a Number".

Edit: don't know if that helps or not :o
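
For anyone wondering why a NaN stops the run, here's a tiny Python illustration of how NaN behaves in ordinary floating-point arithmetic. It is not the core's code; the values list and first_nan_index are made up for the example. The point is that once one value goes NaN, anything computed from it is NaN too, so there's nothing sensible left to carry on with.

```python
import math


def first_nan_index(values):
    """Return the index of the first NaN in values, or None if there is none."""
    for i, v in enumerate(values):
        if math.isnan(v):
            return i
    return None


# An undefined operation such as inf - inf produces NaN.
bad = float("inf") - float("inf")

# NaN propagates: any arithmetic involving it gives NaN again.
total = bad + 1.0

# Hypothetical list of values where one entry has gone bad.
values = [-1523.4, 87.2, total, 0.5]

print(math.isnan(bad), math.isnan(total))  # both are NaN
print(first_nan_index(values))             # position of the bad entry
print(bad == bad)                          # NaN never compares equal, even to itself
```

That last line is why code has to test with something like math.isnan() rather than ==.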
 
Yeah, it's the exact same WU and it failed at exactly the same place, so it looks like it's just the WU. Your hardware should be fine.
 
diogenese said:
I've stuck a similar thread on FCF too, I'll check back after I've been to the pub.
You won't get better than what UF has just posted, as he has the ability to check the results. Well, unless someone manages to complete it in the coming days.
 
I've had two recent SMP EUEs in very quick succession, but none before or after.

Looks like I've had a non-SMP EUE today though...

marvin.markvgray.com-foldingathome-day.png


I'd better go check that out.
 
I thought that all of the SMP projects at the moment are in Beta: open Beta, but still Beta. My interpretation of Beta programmes is that they are still testing the code, so if you find a potential bug, report it.

If no one has done so already, I might suggest that this problem is reported on the FAH forums. You never know, it may go some way to removing bugs from the program in the future.
 
Well, I've done four 2605s since with no problems, so I'm assuming it was a duff WU.
Nobody official has commented on my thread in FCF's Linux SMP beta forum :(
 