GTX295 F@H - Help Please

It is becoming a PITA!

Second WU failed, same error:

[20:20:24] Completed 68%
[20:21:23] Completed 69%
[20:22:12] Run: exception thrown during GuardedRun
[20:22:12] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[20:22:12] Going to send back what have done -- stepsTotalG=10000000
[20:22:12] Work fraction=0.6980 steps=10000000.
[20:22:17] logfile size=0 infoLength=0 edr=0 trr=23
[20:22:17] + Opened results file
[20:22:17] - Writing 642 bytes of core data to disk...
[20:22:17] Done: 130 -> 127 (compressed to 97.6 percent)
[20:22:17] ... Done.
[20:22:17] DeleteFrameFiles: successfully deleted file=work/wudata_03.ckp
[20:22:17]
[20:22:17] Folding@home Core Shutdown: UNSTABLE_MACHINE
[20:22:19] CoreStatus = 7A (122)
[20:22:19] Sending work to server
[20:22:19] Project: 6606 (Run 5, Clone 506, Gen 191)

It did go on to start folding a new WU before I stopped it.

The loft where it is running is hot, but the card seems fine at 77°C. The logs keep getting overwritten, so the only project numbers I have at the moment are:

Project: 5765 (Run 13, Clone 381, Gen 1082) Completed
Project: 6606 (Run 5, Clone 506, Gen 191) Failed @ 69%
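
Since the logs keep getting overwritten, a small script could pull the failure details out of each FAHlog.txt before they are lost. This is only a minimal sketch in Python, assuming the log lines look exactly like the excerpts in this thread. The function name `summarise_failures` and the regex patterns are made up for illustration and are not part of the F@H client.

```python
import re

# Patterns are assumptions based only on the log excerpts quoted in this thread.
PROGRESS = re.compile(r"^\[(\d\d:\d\d:\d\d)\] Completed (\d+)%")
SHUTDOWN = re.compile(r"^\[(\d\d:\d\d:\d\d)\] Folding@home Core Shutdown: (\w+)")
STATUS = re.compile(r"^\[\d\d:\d\d:\d\d\] CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")
PROJECT = re.compile(
    r"^\[\d\d:\d\d:\d\d\] Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)


def summarise_failures(lines):
    """Return one dict per UNSTABLE_MACHINE shutdown found in the log lines."""
    failures = []
    last_percent = None  # most recent "Completed N%" value seen
    pending = None       # failure record still waiting for its Project line
    for raw in lines:
        line = raw.strip()
        m = PROGRESS.match(line)
        if m:
            last_percent = int(m.group(2))
            continue
        m = SHUTDOWN.match(line)
        if m:
            if m.group(2) == "UNSTABLE_MACHINE":
                pending = {"time": m.group(1), "last_percent": last_percent}
            else:
                pending = None
            continue
        m = STATUS.match(line)
        if m and pending is not None:
            pending["core_status"] = int(m.group(1), 16)  # hex 7A -> 122
            continue
        m = PROJECT.match(line)
        if m and pending is not None:
            pending["project"] = int(m.group(1))
            pending["run"] = int(m.group(2))
            pending["clone"] = int(m.group(3))
            pending["gen"] = int(m.group(4))
            failures.append(pending)
            pending = None
    return failures


# Lines copied from the first failure posted above.
SAMPLE = """\
[20:20:24] Completed 68%
[20:21:23] Completed 69%
[20:22:12] Run: exception thrown during GuardedRun
[20:22:17] Folding@home Core Shutdown: UNSTABLE_MACHINE
[20:22:19] CoreStatus = 7A (122)
[20:22:19] Sending work to server
[20:22:19] Project: 6606 (Run 5, Clone 506, Gen 191)
"""

for failure in summarise_failures(SAMPLE.splitlines()):
    print(failure)
```

To use it on a real log, pass the open file instead of the sample, e.g. `summarise_failures(open("FAHlog.txt"))`, and append the results to a separate file so they survive the next overwrite.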

I'm currently running it with a different PSU. It might be an issue with Project 66xx WUs, but I've not read anything about it.
 
Another EUE pause on a 6606; this is getting annoying now. Either both cards are faulty (unlikely but possible), or something is wrong elsewhere. Aside from the cards, I guess that leaves the installation and the motherboard.

[00:57:37] Completed 71%
[00:58:36] Completed 72%
[00:59:13] Run: exception thrown during GuardedRun
[00:59:13] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[00:59:13] Going to send back what have done -- stepsTotalG=10000000
[00:59:13] Work fraction=0.7260 steps=10000000.
[00:59:17] logfile size=0 infoLength=0 edr=0 trr=23
[00:59:17] + Opened results file
[00:59:17] - Writing 642 bytes of core data to disk...
[00:59:17] Done: 130 -> 127 (compressed to 97.6 percent)
[00:59:17] ... Done.
[00:59:17] DeleteFrameFiles: successfully deleted file=work/wudata_06.ckp
[00:59:17]
[00:59:17] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:59:20] CoreStatus = 7A (122)
[00:59:20] Sending work to server
[00:59:20] Project: 6606 (Run 0, Clone 206, Gen 207)


From what I've read, "CoreStatus = 7A (122)" means either a graphics hardware problem or a WU issue. Either way, I've deleted the whole folding GPU folder and started afresh. I should have some more GTX260s arriving in the next few days, so if I can't solve it, maybe new GPU hardware will.
 
It has happened again, and I have to say I'm getting annoyed. Either I'm doing something seriously stupid or there is something seriously wrong. Here is the very latest full log, from a fresh install of the GPU3 client. If someone could take a look, I'd appreciate it:

http://www.markljlewis.com/FAHlog.txt
 
Have you tried running the Stanford GPU memory tester? That helped me identify some bad 295s when I was using them. Not sure of the URL at the moment, but it's on the foldingforum site. Worth a try, and if the card is dodgy for folding but fine for games/benchmarks, flog it on the bay and get another one or a 460/470.
 
I have, yes. Unfortunately I can't get it to test all the memory. It does about 750 for 25 tests and 700 for 100 tests. Either way, they both passed.

I've improved the cooling in the case and changed the fan curve with MSI Afterburner so the fan runs at 65%+ regardless of the temps. Hopefully that will sort it, but only time will tell. I've had it do 10 or 11 WUs and then fail, so until it does 30+ I won't be happy that it's stable.
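
As a rough sanity check on the 30+ figure, here is a back-of-envelope sketch in Python. It assumes each WU fails independently at the same rate, and it takes that rate to be 1 in 11 purely from the run mentioned above; both are assumptions, not measurements. The function name `chance_of_clean_run` is made up for illustration.

```python
def chance_of_clean_run(failure_rate, work_units):
    """Probability of finishing `work_units` WUs in a row with no failure,
    assuming every WU fails independently with probability `failure_rate`."""
    return (1.0 - failure_rate) ** work_units


# Assumed failure rate of 1 in 11, taken from the "10 or 11 WUs then fail" run.
still_broken = chance_of_clean_run(1 / 11, 30)
print(f"Chance of 30 clean WUs if nothing was fixed: {still_broken:.3f}")  # about 0.057
```

So if the card were still failing at that rate, 30 clean WUs in a row would happen by luck only about 6% of the time, which makes 30+ a reasonable bar before calling it stable.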
 