fermi completion time weirdness

Anyone else seeing longer completion times for SETI Fermi workunits?
I'm getting some very long completion times. I have no VLARs on the GPUs, only normal Fermi WUs and some VHARs.

fermi.jpg


Something doesn't seem right. I have dropped down to 2 WUs per card for the moment to see if that helps. What is everyone else seeing?
 
A few GPU tasks starting 28my10ad have taken 36 minutes, but not many. Am sat trying to figure out why every CPU task times out bang on 42 minutes :(
 
I've had another mixed bag for my GPUs. I got through 150 tasks in about 3 hours just after the outage began - a whole bunch of 2-minute jobs. Got some that took 36 minutes each, and am now back to normal. I think it's just variation in task size.
 

That sounds like the -177 error. Here's a fix to avoid them:


The even simpler alternative is to shut BOINC down completely and do a global replace in client_state.xml of every <rsc_fpops_bound> with <rsc_fpops_bound>3. Prepending a 3 to each value boosts the bound by a factor of at least 4, but it affects all tasks for all projects. If you can wait until the beginning of the outage, doing that just twice gives a boost of at least 34. That should be sufficient protection against -177 errors.
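To see where the "at least 4" and "at least 34" figures come from, here is a minimal Python sketch of the arithmetic. It assumes the bound is stored as a plain decimal or scientific-notation number; the sample values are made up for illustration and are not taken from a real client_state.xml. The worst case is a value whose leading digits are close to 9.99: prepending a 3 turns 9.99 into 39.99 (about 4.003 times larger), and prepending it twice gives 339.99 (about 34.03 times larger). Values with a smaller leading digit get a bigger boost.

```python
from decimal import Decimal


def boost_factor(value: str, times: int = 1) -> Decimal:
    """Return how many times larger `value` becomes after a '3' is
    prepended to its text `times` times (the effect of the global replace)."""
    boosted = "3" * times + value
    return Decimal(boosted) / Decimal(value)


# Hypothetical bound values, chosen only to show the range of outcomes.
for sample in ("9.990000e+14", "5.400000e+15", "1.000000e+14"):
    once = boost_factor(sample)
    twice = boost_factor(sample, times=2)
    print(f"{sample}: x{once:.2f} after one pass, x{twice:.2f} after two")
```

The boost is never below 4 for one pass or 34 for two, because the number always has a non-zero leading digit below 10.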

Back up the ProgramData folder before applying this, in case it all goes wrong.
 
Appreciate the input, Area51, but you might just have given me a recipe for borscht in Russian. I have never been into client_state.xml, and for a start there's more than one :eek:
I understand what you're saying about replacing a with b, but where are a and b residing? Just take a deep breath and pretend you're trying to tell a five-year-old. Seriously, I am that PC illiterate :(
 

OK - deep breath - first do this now:

Stop your client processing - NOW! Once your tasks abort, I can't retrieve them (until my script is complete), so we need to stop you producing them first. Let me know when you have done this.
 
Back up the C:\ProgramData directory first.

1) Navigate to client_state.xml. You should find it in C:\ProgramData\BOINC. There can only be one inside this folder - Windows does not allow two files with the same name in one folder.
2) Right click client_state.xml, and left click the Edit option (make sure the file loads into Notepad).
3) In the Edit menu, click Replace.
4) In the Find what box, type the text: <rsc_fpops_bound>
5) In the Replace with box, type: <rsc_fpops_bound>3

Note, for both of the above (4 & 5) do not include any spaces. The first character you type should be <.

6) Click Replace All
7) Once the Search/Replace is complete, select Save from the File menu.
8) Re-start your client.
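For anyone who would rather not do this by hand in Notepad, steps 1 to 7 can be sketched in Python as below. This is only an illustration of the same search and replace, not the recovery script mentioned in this thread. The function names `boost_bounds` and `patch_file` are made up for this example, and it assumes the client has already been shut down before it is run.

```python
import shutil
from pathlib import Path

# The exact text typed into the Find what box in step 4.
TAG = b"<rsc_fpops_bound>"


def boost_bounds(data: bytes) -> bytes:
    """Prepend a '3' to every rsc_fpops_bound value (steps 4 to 6)."""
    return data.replace(TAG, TAG + b"3")


def patch_file(state_file: Path) -> int:
    """Back up the file, apply the replace, save it, and return the
    number of bounds that were changed."""
    # Keep a copy next to the original in case it all goes wrong.
    backup = state_file.with_name(state_file.name + ".bak")
    shutil.copy2(state_file, backup)

    # Work on raw bytes so line endings and encoding are left untouched.
    data = state_file.read_bytes()
    changed = data.count(TAG)
    state_file.write_bytes(boost_bounds(data))
    return changed
```

Calling `patch_file` with the path to client_state.xml from step 1 applies one pass of the fix. The closing tag is never touched, because it starts with `</` and so does not match the search text.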

You should now not get any -177 aborts. Unfortunately, until I have finished my script, I cannot recover the aborted tasks.
 
Followed your steps, saved and restarted the client (I discovered snap screen, oooo). Some tasks are at 39 mins now, so if they get past 42 they should be OK... fingers crossed :)
 
First CPU task for 24 hours has passed 42 mins... Houston, Area51 may have made a breakthrough. Yup, 6 now past 42 mins. On a side note, I got my first Astropulse tasks just before the project outage; that would not have affected anything, would it?
 

Forget anything to do with task estimates. I imagine your DCF (duration correction factor) is screwed. It will recover by itself. This seems to happen when tasks take longer than expected. I'm afraid this is all to do with the CreditNew system, and the fact that the original concept of BOINC has been surpassed by GPUs and optimised apps.

Sorry about your aborted tasks, but my script will not be available for public consumption until next week. I'm working an extra shift again tomorrow, so I won't get a shot at getting a basic version running until the weekend, or maybe Monday/Tuesday. Interested in becoming a beta tester?
 

Good god, how many aborted?
 
I'd be happy to test it out for you, but remember "5 year old". Just let me know exactly what to do and, more importantly, what not to do, and I'll try it.
Got nothing to lose as long as I back up my data folder on an hourly basis. Don't rush at it; whenever it's ready, it's ready.
Thank you very much for that, duly filed away in the "do not delete you prat" folder. :D:D

**Easily a few hundred**

**2nd machine is now running cpu tasks correctly too**
 

Sounds like you're back and running.....

Great. The script is being written in a way that requires no input from the user (and will require no updates unless the fundamental structure of client_state.xml changes). It won't even run if your client is still running! I won't send it out until it has run properly many times on my setup. You will need to install a Perl interpreter, but I will talk you through that - it really is no big deal. Can you e-mail me a copy of your client_state.xml file as it currently is (you can copy it to your desktop whilst you are running your client)? I'll PM my email address to you via the S@H forum. I'd like to use it as a test case - I've not seen a -177 abort in my client_state.xml file for quite a while, and I need to make sure I have some live examples.
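If you want to check whether a copied client_state.xml still contains any -177 aborts before sending it, a rough Python sketch follows. It is not the Perl script described above. It assumes each aborted result records its code in an `<exit_status>` element, which is an assumption about the file layout rather than something confirmed in this thread, and the function name `count_177_aborts` is made up for this example.

```python
import re

# Assumed layout: an aborted result carries <exit_status>-177</exit_status>.
ABORT_PATTERN = re.compile(r"<exit_status>\s*-177\s*</exit_status>")


def count_177_aborts(state_text: str) -> int:
    """Count the entries in the file text that look like -177 aborts."""
    return len(ABORT_PATTERN.findall(state_text))
```

Run it on the text of the desktop copy rather than the live file, so the client's own copy is never opened by anything else while it is working.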
 