Something up with my FAH?

I haven't been using my PC the last few days. I got it running just right, so I left it crunching away while I browsed some E3 stuff and studied.

Just noticed I haven't submitted a WU in almost a week, so I opened my logs.

Seems something went very awry a few days ago, starting with it abandoning a WU mid-way through.

[00:15:33] Gromacs error.
[00:15:33]
[00:15:33] Folding@home Core Shutdown: UNKNOWN_ERROR
[00:15:36] CoreStatus = 79 (121)
[00:15:36] Client-core communications error: ERROR 0x79
[00:15:36] Deleting current work unit & continuing...
[00:15:38] - Preparing to get new work unit...
[00:15:38] + Attempting to get work packet
[00:15:38] - Connecting to assignment server
[00:15:51] - Successful: assigned to (171.64.122.136).
[00:15:51] + News From Folding@Home: Welcome to Folding@Home
[00:15:52] Loaded queue successfully.
[00:15:53] - Error: Attempt #1 to get work failed, and no other work to do.
Waiting before retry.

OK... so I can take one WU messing up and not getting the server on the first try... only...

[20:05:45] Couldn't send HTTP request to server (wininet)
[20:05:45] + Could not connect to Assignment Server
[20:05:45] Couldn't send HTTP request to server (wininet)
[20:05:45] + Could not connect to Assignment Server 2
[20:05:45] + Couldn't get work instructions.
[20:05:45] - Error: Attempt #109 to get work failed, and no other work to do.
Waiting before retry.
[20:53:57] + Attempting to get work packet
[20:53:57] - Connecting to assignment server
[20:53:57] Couldn't send HTTP request to server (wininet)
[20:53:57] + Could not connect to Assignment Server
[20:54:01] - Successful: assigned to (171.64.122.136).
[20:54:01] + News From Folding@Home: Welcome to Folding@Home
[20:54:01] Loaded queue successfully.
[20:54:06] + Closed connections

OK... completely dead for almost 5 days, then suddenly it just plain works. All through that time my PC was working fine and I never noticed anything off going on. Just a very bad turn of luck for me there, or what?
 
Well, according to the FahWiki the error code means this:

Gromacs error.
Folding@home Core Shutdown: UNKNOWN_ERROR
CoreStatus = 79 (121)
Client-core communications error: ERROR 0x79
Deleting current work unit & continuing...

This error can occur with these lines preceding the error message above:

- Couldn't open work/wudata_xx.chk
- Couldn't open work/wudata_xx.chk
Couldn't open for writing
Writing local files

In this case the error is caused by the core being unable to open, and therefore write to, its checkpoint file. Check that the permissions on the files are correct and that you didn't run out of space on the disk.

This error also can be caused by memory errors which may be related to overclocking or wrong voltages or simply by bad RAM. If this error occurs when the core is just starting, there's a reasonable chance that it was an "unable to allocate" issue such as running out of space in the paging file or a memory fragmentation issue.

But the large number of failed attempts to get new work would seem to be a problem at your end. The wininet message usually appears when people install new firewalls or when they are having problems with a proxy (maybe something your ISP was doing).

Sorry it's a bit vague. Best to just keep an eye on it for the next few WUs and check whether it gets new work straight away or not.
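
For the checkpoint-file cause quoted from the FahWiki above, here is a rough Python sketch (not part of FAH itself) that checks whether a client's work folder can be written to and how much disk space is left. The folder path and the 100 MB threshold are assumptions for illustration only, so adjust them for your own install.

```python
import os
import shutil
import tempfile


def check_work_dir(path, min_free_mb=100):
    """Return a list of problems found with a FAH work folder (empty = OK).

    The 100 MB default is an arbitrary illustrative threshold,
    not a documented FAH requirement.
    """
    if not os.path.isdir(path):
        return [f"{path} does not exist or is not a folder"]

    problems = []

    # Create and delete a scratch file, since the core has to be able
    # to do the same with its checkpoint file.
    try:
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        problems.append(f"cannot write to {path}: {exc}")

    free_mb = shutil.disk_usage(path).free / (1024 * 1024)
    if free_mb < min_free_mb:
        problems.append(f"only {free_mb:.0f} MB free on the disk holding {path}")
    return problems


if __name__ == "__main__":
    # Hypothetical location; point this at your own client's work folder.
    for problem in check_work_dir(r"C:\FAH\work"):
        print("WARNING:", problem)
```

An empty result only rules out the disk-space and permissions causes. It says nothing about the memory or overclocking causes the wiki also lists.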
 
Sorry I cannot help, other than to say that the
Error: Attempt #1 to get work failed, and no other work to do.
Waiting before retry.
problem is the main reason I've removed FAH from all my systems. I was having PCs sat idle for days without noticing, and that's no good. I never had connection issues with Seti or Boinc, so I will be switching over to a different project instead (don't know which yet).
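
One way to avoid a PC sitting idle for days unnoticed is to scan the client log for the retry counter. The sketch below is only an illustration: it relies on the "Attempt #N to get work failed" and "+ Processing work unit" lines shown in the logs in this thread, and the log file name and the threshold of 5 attempts are assumptions.

```python
import re

# Matches lines such as:
# [20:05:45] - Error: Attempt #109 to get work failed, and no other work to do.
ATTEMPT_RE = re.compile(r"Attempt #(\d+) to get work failed")


def failed_attempts(log_lines):
    """Return the current run of failed work requests (0 if the client has work)."""
    attempts = 0
    for line in log_lines:
        match = ATTEMPT_RE.search(line)
        if match:
            attempts = int(match.group(1))
        elif "Processing work unit" in line:
            # The client started a work unit, so it is no longer stuck.
            attempts = 0
    return attempts


def check_log(path, threshold=5):
    """Print a warning if the log at `path` shows too many failed attempts."""
    with open(path, errors="replace") as handle:
        attempts = failed_attempts(handle)
    if attempts >= threshold:
        print(f"{path}: {attempts} failed attempts to get work, client may be idle")
    return attempts
```

Calling `check_log("FAHlog.txt")` from a scheduled task (file name assumed) would flag a stuck client long before attempt #109.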
 
Been folding for quite a while here now and can only say that any errors like that were because of my setup (firewall/router) problems and not FAH.

Maybe I'm just lucky! ;)
 
Steevo38 said:
Been folding for quite a while here now and can only say that any errors like that were because of my setup (firewall/router) problems and not FAH.

Maybe I'm just lucky! ;)
Yep, same here. Folding is by far the most reliable project of those I've tried. It has to be, given that there's no system for queueing decent work.

the two most common reasons for FAH being unable to connect are:
1) SQUID Proxy used by ISP
2) Norton Internet Security 2005 (think it's just that year)

:(
 
Berserker said:
Does that mean that my efforts didn't kill off the squid proxy issues? It's been happily connecting through mine for months now. :confused:
Well, people still seem to be getting problems with SQUID proxies, so I assume it didn't solve all the problems. I'm a little out of touch with the folding forums I'm afraid, so couldn't say for sure :o
It may be a different problem :confused:
 
I'm also having some problems getting a new work unit on this machine:
FAHLog said:
[15:48:37] Trying to unzip core FahCore_7a.exe
[15:48:37] Decompressed FahCore_7a.exe (2371584 bytes) successfully
[15:48:37] + Core successfully engaged
[15:48:37] Deleting current work unit & continuing...
[15:48:41] - Preparing to get new work unit...
[15:48:41] + Attempting to get work packet
[15:48:41] - Connecting to assignment server
[15:48:42] Couldn't send HTTP request to server (wininet)
[15:48:42] + Could not connect to Assignment Server
[15:48:42] Couldn't send HTTP request to server (wininet)
[15:48:42] + Could not connect to Assignment Server 2
[15:48:42] + Couldn't get work instructions.
[15:48:42] - Error: Attempt #1 to get work failed, and no other work to do.
Waiting before retry.
Is this a problem at my end?
 
Joe42 said:
I'm also having some problems getting a new work unit on this machine:

Is this a problem at my end?
I would be more worried that it seems to have deleted a WU for no apparent reason :eek:


As for the wininet thing: if it's not a firewall/proxy issue, people have said that changing the "use IE connection settings" option to the opposite of its current value often solves the problem.
Some also say that it's just the restarting of the client that gets it out of the cycle it's stuck in.
And of course it never hurts to restart the computer/modem/router (or all that apply) if you are having problems connecting.


edit: you can also check the status of the servers - http://fah-web.stanford.edu/serverstat.html
 
Plasmoid, forget what I said about it being a problem at your end.
I just found a thread by someone with the exact same problem connecting to the same server (171.64.122.136) :o

linky

If you go to the bottom of that page you'll see efishy has identified that it's most likely a bug in the client/server. It does manage to connect the first time, but when it then fails to download work it gets stuck somewhere between the work server and the assignment servers and just sits there.

The odd thing is that in your case it did actually sort itself out, albeit after 5 days :confused:
All I can say is I really hope that the v6 client is very, very close now. Until then I guess we'll all have to keep a closer eye on our rigs.


Sorry for the misinformation earlier on, looks like it's time I went back to foldy school for re-heducation :p
 
I think my problem is I've got it set to use IE settings, and it shouldn't be set to that. I'll give it a try. No idea what it's doing deleting units...
 
Well a lot of times you can mess with settings and get it to work again. By changing your configuration you can get it to connect to a different server that isn't kicking you out. Then, later, when the server comes back you'll be able to connect. It's sort of backwards but it works. :o
 
Well, core 2 seems to be working again now, it's downloaded a new unit.

Core 1 however seems to be having some weird problems:
FAH Log said:
[12:33:10] Folding@Home GB Gromacs Core
[12:33:10] Version 1.86 (August 28, 2005)
[12:33:10]
[12:33:10] Preparing to commence simulation
[12:33:10] - Looking at optimizations...
[12:33:10] - Created dyn
[12:33:10] - Files status OK
[12:33:10] Error: Work unit read from disk is invalid
[12:33:10]
[12:33:10] Folding@home Core Shutdown: CORE_OUTDATED
[12:33:14] CoreStatus = 6E (110)
[12:33:14] + Core out of date. Auto updating...
[12:33:14] New core downloaded for this work unit, but still out of date.
[12:33:14]
Folding@Home will go to sleep for 1 day as there have been 5 consecutive Cores executed which failed to complete a work unit.
[12:33:14] (To wake it up early, quit the application and restart it.)
[12:33:14] If problems persist, please visit our website at http://folding.stanford.edu for help.
[12:33:14] + Sleeping...

I'm also a bit confused about the config files. I tried creating 2 separate shortcuts to the configuration, one for each core, but they both seem to act as if they are changing the same configuration. I used the first shortcut to change the IE settings option to no, then used the shortcut for the 2nd core to do the same, but it had already been changed. They are changing the same configuration, and I'm not sure if it's universal for both cores or if it's only changed for one core and the shortcut for the other one isn't working... :confused:
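
A side note on reading these logs: the CoreStatus line prints the same code twice, first in hex and then in decimal, so "79 (121)" and "6E (110)" are each a single value. The small Python sketch below pulls the code out of such a line. The name table only covers the two codes that appear in this thread and is not a complete list of FAH status codes.

```python
import re

# Lines look like: "[00:15:36] CoreStatus = 79 (121)"
CORE_STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")

# Only the two codes seen in this thread, with the names taken from the
# "Folding@home Core Shutdown:" lines that precede them in the logs.
KNOWN_CODES = {0x79: "UNKNOWN_ERROR", 0x6E: "CORE_OUTDATED"}


def parse_core_status(line):
    """Return (code, name) for a CoreStatus log line, or None if it is not one."""
    match = CORE_STATUS_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    if code != int(match.group(2)):
        raise ValueError(f"hex and decimal values disagree in: {line!r}")
    return code, KNOWN_CODES.get(code, "not seen in this thread")
```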
 
Joe42 said:
Bump.
Got a core here doing nothing... see above.
Any ideas?
The two clients are in separate folders, right?

Seems like a very odd problem, I've not heard of shortcuts not working before. It's possible that the second client sees the first one running with the same machine ID (since the config is the same) and closes down.

You could try using the -local switch to keep the config files separate. Did you have the graphical version running before or something? That seems to make running dual clients later a bit odd.

I would stop the client which is running, then use the -configonly and -local switches to configure both clients.

For the inactive core you may even be best off deleting everything and starting again, since it sounds like it's been assigned a duff copy of the core or something.


hope this helps :)
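
To check the machine ID theory without digging through the config by hand, something like the Python sketch below could compare the "Machine ID:" line that each client prints in its log at startup. The folder names in the usage note are hypothetical, and reading the ID from the log rather than from the config file is just a convenience chosen for this example.

```python
import re

# Matches the startup line, e.g. "[22:26:53] - Machine ID: 3"
MACHINE_ID_RE = re.compile(r"Machine ID: (\d+)")


def machine_id(log_lines):
    """Return the most recent Machine ID reported in a client log, or None."""
    found = None
    for line in log_lines:
        match = MACHINE_ID_RE.search(line)
        if match:
            found = int(match.group(1))
    return found


def clashing_ids(logs_by_folder):
    """Map each machine ID used by more than one folder to those folders."""
    folders_by_id = {}
    for folder, lines in logs_by_folder.items():
        found = machine_id(lines)
        if found is not None:
            folders_by_id.setdefault(found, []).append(folder)
    return {
        found: folders
        for found, folders in folders_by_id.items()
        if len(folders) > 1
    }
```

For example, passing `{"core1": lines1, "core2": lines2}` (the lines read from each folder's log) returns an empty dict when the two clients use different IDs.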
 
It's starting to even out a little... now it seems 50:50 that it will get stuck in a cycle of death.

Trying to change the "Use IE settings" and some others to see if it helps.

Setting up a 2nd client with deadlineless WUs for now.

Edit: So much for that...
[22:26:53] - Ask before connecting: No
[22:26:53] - User name: Plasmoid (Team 10)
[22:26:53] - User ID: 46A4E69717F3E43B
[22:26:53] - Machine ID: 3
[22:26:53]
[22:26:53] Loaded queue successfully.
[22:26:53] + Benchmarking ...
[22:27:00] - Preparing to get new work units...
[22:27:00] + Attempting to get work packet
[22:27:00] - Connecting to assignment server
[22:27:00] - Successful: assigned to (171.64.122.112).
[22:27:00] + News From Folding@Home: Welcome to Folding@Home
[22:27:00] Loaded queue successfully.
[22:27:01] - Deadline time not received.
[22:27:02] + Attempting to get work packet
[22:27:02] - Connecting to assignment server
[22:27:02] - Successful: assigned to (171.64.122.112).
[22:27:02] + News From Folding@Home: Welcome to Folding@Home
[22:27:02] Loaded queue successfully.
[22:27:03] - Deadline time not received.
[22:27:04] + Attempting to get work packet
[22:27:04] - Connecting to assignment server
[22:27:04] - Successful: assigned to (171.64.122.112).
[22:27:04] + News From Folding@Home: Welcome to Folding@Home
[22:27:04] Loaded queue successfully.
[22:27:05] - Deadline time not received.
[22:27:06] + Attempting to get work packet
[22:27:06] - Connecting to assignment server
[22:27:06] - Successful: assigned to (171.64.122.112).
[22:27:06] + News From Folding@Home: Welcome to Folding@Home
[22:27:06] Loaded queue successfully.
[22:27:07] - Deadline time not received.
[22:27:08] + Attempting to get work packet
[22:27:08] - Connecting to assignment server
[22:27:08] - Successful: assigned to (171.64.122.112).
[22:27:08] + News From Folding@Home: Welcome to Folding@Home
[22:27:08] Loaded queue successfully.
[22:27:09] - Deadline time not received.
[22:27:10] + Attempting to get work packet
[22:27:10] - Connecting to assignment server
[22:27:10] - Successful: assigned to (171.64.122.112).
[22:27:10] + News From Folding@Home: Welcome to Folding@Home
[22:27:10] Loaded queue successfully.
[22:27:11] - Deadline time not received.
[22:27:12] + Attempting to get work packet
[22:27:12] - Connecting to assignment server
[22:27:12] - Successful: assigned to (171.64.122.112).
[22:27:12] + News From Folding@Home: Welcome to Folding@Home
[22:27:12] Loaded queue successfully.
[22:27:13] - Deadline time not received.
[22:27:14] + Attempting to get work packet
[22:27:14] - Connecting to assignment server
[22:27:14] - Successful: assigned to (171.64.122.112).
[22:27:14] + News From Folding@Home: Welcome to Folding@Home
[22:27:14] Loaded queue successfully.
[22:27:15] - Deadline time not received.
[22:27:16] + Attempting to get work packet
[22:27:16] - Connecting to assignment server
[22:27:16] - Successful: assigned to (171.64.122.112).
[22:27:16] + News From Folding@Home: Welcome to Folding@Home
[22:27:16] Loaded queue successfully.
[22:27:17] - Deadline time not received.
[22:27:18] + Closed connections
[22:27:18]
[22:27:18] + Processing work unit
[22:27:18] Core required: FahCore_7a.exe
[22:27:18] Core found.
[22:27:18] Working on Unit 01 [May 20 22:27:18]
[22:27:18] + Working ...
[22:27:18]
[22:27:18] *------------------------------*
[22:27:18] Folding@Home GB Gromacs Core
[22:27:18] Version 1.90 (March 8, 2006)
[22:27:18]
[22:27:18] Preparing to commence simulation
[22:27:18] - Ensuring status. Please wait.
[22:27:35] - Looking at optimizations...
[22:27:35] - Working with standard loops on this execution.
[22:27:35] - Previous termination of core was improper.
[22:27:35] - Files status OK
[22:27:35] - Expanded 16832 -> 142191 (decompressed 844.7 percent)
[22:27:35]
[22:27:35] Project: 2097 (Run 69, Clone 88, Gen 2)
[22:27:35]
[22:27:35] Entering M.D.
[22:27:55] (Starting from checkpoint)
[22:27:55] Protein: p2097_A21_agbnp_amber99
[22:27:55]
[22:27:55] Writing local files
[22:27:55] GB activated
[22:27:55] Completed 240046 out of 25000000 steps (1)
[22:28:48] Writing local files
[22:28:48] Completed 250000 out of 25000000 steps (1)

Well... at least it's sorting itself out.
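
For a rough idea of how long a unit like this will take, the two "Completed ... steps" lines in the log above are enough for a back-of-envelope estimate. The Python sketch below assumes the step rate between those two lines holds for the rest of the unit, which may well not be true this soon after resuming from a checkpoint, so treat the result as a ballpark only.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# [22:27:55] Completed 240046 out of 25000000 steps (1)
PROGRESS_RE = re.compile(r"\[(\d\d:\d\d:\d\d)\] Completed (\d+) out of (\d+) steps")


def parse_progress(line):
    """Return (timestamp, steps_done, steps_total) for a progress line."""
    match = PROGRESS_RE.search(line)
    if match is None:
        raise ValueError(f"not a progress line: {line!r}")
    stamp = datetime.strptime(match.group(1), "%H:%M:%S")
    return stamp, int(match.group(2)), int(match.group(3))


def estimate_remaining(earlier_line, later_line):
    """Estimate the time left, assuming a constant step rate between the lines."""
    t1, done1, _ = parse_progress(earlier_line)
    t2, done2, total = parse_progress(later_line)
    elapsed = (t2 - t1).total_seconds()
    if elapsed < 0:
        elapsed += 24 * 3600  # the log timestamps wrapped past midnight
    if elapsed == 0 or done2 <= done1:
        raise ValueError("no measurable progress between the two lines")
    steps_per_second = (done2 - done1) / elapsed
    return timedelta(seconds=(total - done2) / steps_per_second)


remaining = estimate_remaining(
    "[22:27:55] Completed 240046 out of 25000000 steps (1)",
    "[22:28:48] Completed 250000 out of 25000000 steps (1)",
)
print(f"about {remaining.total_seconds() / 3600:.0f} hours left")
```

With the two lines above that is 9954 steps in 53 seconds, which works out to somewhere between 36 and 37 hours for the remaining 24,750,000 steps.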
 
Plasmoid said:
Edit: So much for that...


Well... at least it's sorting itself out.
It's meant to do that. It's filling up the cache of 10 WUs :)

In a stroke of luck (makes a change eh? ;)) it would appear that all the deadlineless WUs on server 112 are 240ish point Tinkers, which aren't bad ppd. That's a lot better than the newer Gromacs deadlineless work, which is nothing like as good for points.
 
Gah... made a fool of myself, was looking at the wrong log and everything... yet pasted the right one.

Is there a way to have both my main FAH and backup FAH clients running as services, or must one be running in a window?
 
Plasmoid said:
Is there a way to have both my main FAH and backup FAH clients running as services, or must one be running in a window?
Yes, you can run both as services (you can have up to 8 console clients running at the same time).

Run the main FAH at "low" priority and the backup FAH at "idle" priority and the two will run happily together, with the main one getting 100% CPU time when it has work and the backup taking over when the main isn't doing anything.
The priority setting is in Advanced settings when you run the config.
 