Lost a Bigadv WU yesterday

Stan_Lite · 10 Apr 2010 at 09:29

.... at 84%

All I did was restart the client with the -oneunit flag as I wanted to do something with that rig once the WU had finished. It had been running fine for nearly 2 days and only had 6 hours to go. When I restarted the client, all seemed to be well until it tried to start from the checkpoint:

Code:

# Linux SMP Console Edition ################################################### 
############################################################################### 

Folding@Home Client Version 6.29 

http://folding.stanford.edu 

############################################################################### 
############################################################################### 

Launch directory: /usr/local/fah 
Executable: ./fah6 
Arguments: -bigadv -oneunit -verbosity 7 -smp 8 

[07:01:36] - Ask before connecting: No 
[07:01:36] - User name: Bigstan (Team 10) 
[07:01:36] - User ID: 729C5FBF3B4B8B62 
[07:01:36] - Machine ID: 1 
[07:01:36] 
[07:01:36] Loaded queue successfully. 
[07:01:36] 
[07:01:36] + Processing work unit 
[07:01:36] Core required: FahCore_a2.exe 
[07:01:36] - Autosending finished units... [07:01:36] 
[07:01:36] Trying to send all finished work units 
[07:01:36] + No unsent completed units remaining. 
[07:01:36] - Autosend completed 
[07:01:36] Core found. 
[07:01:37] Working on queue slot 09 [April 9 07:01:37 UTC] 
[07:01:37] + Working ... 
[07:01:38] 
[07:01:38] *------------------------------* 
[07:01:38] Folding@Home Gromacs SMP Core 
[07:01:38] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009) 
[07:01:38] 
[07:01:38] Preparing to commence simulation 
[07:01:38] - Ensuring status. Please wait- Files status OK 
[07:01:47] - Expanded 30235850 -> 159270593 (decompressed 100.6 percent) 
[07:01:47] 
[07:01:47] - Files status OK 
[07:01:59] teArray: compressed_data_size=30235850 data_size=159270593, decompressed_data_size=159270593 diff=0 
[07:02:00] - Digital signature verified 
[07:02:00] 
[07:02:00] Project: 2683 (Run 8, Clone 6, Gen 50) 
[07:02:00] 
[07:02:17] Assembly optimizations on if available. 
[07:02:17] Entering M.D. 
[07:02:21] (Run 8, Clone 6, Gen 50) 
[07:02:21] 
[07:02:21] Entering M.D. 
[07:02:27] Using Gromacs checkpoints 
[07:02:50] Resuming from checkpoint 
[07:02:53] 
[07:02:53] Folding@home Core Shutdown: INTERRUPTED 
[07:02:53] e=20 
[07:02:53] fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash. 
[07:03:01] CoreStatus = FF (255) 
[07:03:01] Sending work to server 
[07:03:01] Project: 2683 (Run 8, Clone 6, Gen 50) 
[07:03:01] - Error: Could not get length of results file work/wuresults_09.dat 
[07:03:01] - Error: Could not read unit 09 file. Removing from queue. 
[07:03:01] Trying to send all finished work units 
[07:03:01] + No unsent completed units remaining. 
[07:03:01] + -oneunit flag given and have now finished a unit. Exiting.- Preparing to get new work unit... 
[07:03:01] Cleaning up work directory 
[07:03:01] + Attempting to get work packet 
[07:03:01] Passkey found 
[07:03:01] ***** Got a SIGTERM signal (15) 

Folding@Home Client Shutdown.

That'll teach me. In future, if I intend to restart the client for whatever reason, I'll make a copy of the directory first, so I can hopefully resume from where I left off in case it all goes breasts up.
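For anyone wanting to script that copy-first step, here is a minimal sketch in Python. It assumes the client has been stopped before the copy is taken, so the queue and checkpoint files are not mid-write. The function name `backup_client_dir` and the backup location are made up for illustration; only `/usr/local/fah` comes from the log above.

```python
import shutil
import time
from pathlib import Path


def backup_client_dir(client_dir, backup_root):
    """Copy the whole client directory to a timestamped folder and return its path."""
    src = Path(client_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"client directory not found: {src}")

    # A timestamp in the name keeps successive backups from colliding.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-backup-{stamp}"

    # copytree copies the queue, the work/ folder and the checkpoint files
    # together, and creates any missing parent directories of dest.
    shutil.copytree(src, dest)
    return dest
```

Something like `backup_client_dir("/usr/local/fah", "/usr/local/fah-backups")` before the restart would do it. Whether the client will actually accept a restored copy after a failed resume is not something this thread confirms, so treat it as a hopeful safety net rather than a guarantee.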

Wouldn't have been so bad if it was near the start, but it's really galling to lose the WU after 43 hours of work.

Bit sad, but that's the risk with beta clients I suppose (I'm trying to be stoic and brave about it when all I really want to do is sit in the corner and blub like a wee lassie).

miniyazz · 10 Apr 2010 at 09:52

gutted!

Mattus · 10 Apr 2010 at 11:26

I used to get that SaveRestoreState error every now and then with WUs on the a2 core. I think it's an obscure bug in the core.

Stanford really need to get all SMP stuff moved over to a3, pronto.
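If you want to check how often this has bitten you, a rough sketch like the one below can pick the failure out of a saved client log. It assumes the log lines look like the ones pasted in the first post, with a `[HH:MM:SS]` prefix, and it only searches for the two strings seen there. The function name and the marker list are my own, not anything from the client.

```python
import re

# Strings taken from the log in the first post; other failures may look different.
FAILURE_MARKERS = (
    "fcSaveRestoreState",
    "CoreStatus = FF",
)

# Matches a "[HH:MM:SS] message" line as printed by the client.
LOG_LINE = re.compile(r"\[(\d{2}:\d{2}:\d{2})\]\s*(.*)")


def find_checkpoint_failures(log_text):
    """Return (timestamp, message) pairs for log lines containing a failure marker."""
    hits = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match is None:
            continue  # banner lines and blank lines carry no timestamp
        timestamp, message = match.group(1), match.group(2).strip()
        if any(marker in message for marker in FAILURE_MARKERS):
            hits.append((timestamp, message))
    return hits
```

Feed it the text of the log file and it returns the timestamped lines where the restore failed, which at least tells you whether it is a one-off or a pattern on that rig.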

Biffa · 10 Apr 2010 at 13:50

I've given up. Back to A3s on my main rig; bigadv is too much trouble on non-dedicated kit.
