Undervolting and GPU recovery

I've just had a rather scary situation where my MBA 6800XT suddenly died and the machine stopped booting.

Like a few people, I use my card for a bit of "on the side" mining, mostly whilst I am at work: the computer has to be on anyway, so why not capitalize on that? But I digress. In AMD's control panel I had set the card to undervolt whenever the mining software was in use, to try and limit power usage.

This setup had worked for weeks without problem, until today. Both my screens went off and the PC restarted itself. When it rebooted, the motherboard was showing its VGA failure warning light. A cold reboot didn't sort it, and it seemed that the system, or at least the GFX card, had had it. Panicking that my card had died, I left the machine off to let it cool. I then reseated the GFX card and, for good measure, cleared the mobo CMOS.

This appears to have resolved the issue, as the machine is now working again. When it booted into Windows, there was a pop-up from AMD's software advising that there had been an issue with "Wattman" (I assume this is a subsystem that handles the GFX card's power management) and that it had recovered from this error.

Now, this may have been a one-off, but for the moment, I will be giving the mining a rest.

After this long-winded story, my question is: does the GFX card handle voltage regulation purely via the software, or are the settings written to the card itself via flash/volatile memory somehow? I only ask because, if it were purely software, surely a reboot of the machine would remove all evidence of any changes to the hardware, and the card should start operating at its "default" levels again? Can anyone shed any further light on this for me?

Undervolting is something I potentially want to look into more going forward, but I know little about it, so the more I know about it, the better.
 
Leave it running without undervolting for a bit to check whether that's what is causing the issue. Then tweak it manually if all is stable at the normal setting. Each chip is different, and maybe the auto setting doesn't quite work for yours, or your PSU's output is not stable/smooth enough, or it's something else in the build.
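
If it helps to picture the "check stock first, then tweak manually" approach, here's a rough sketch in Python. It doesn't talk to the card at all: `is_stable` is a hypothetical stand-in for whatever you do by hand (set the voltage in AMD's software, run your usual load for a while, see if it survives), and the function name, the millivolt numbers and the 25 mV margin are all made up for illustration, not recommended settings.

```python
from typing import Callable, Optional


def find_stable_undervolt(
    stock_mv: int,
    floor_mv: int,
    step_mv: int,
    is_stable: Callable[[int], bool],
    margin_mv: int = 25,
) -> Optional[int]:
    """Step the voltage down from stock until the stability check fails.

    Returns the lowest voltage that passed plus a safety margin
    (never above stock), or None if stock itself is unstable.
    """
    if step_mv <= 0:
        raise ValueError("step_mv must be positive")

    # Step 1: confirm the card is fine at stock before blaming the undervolt.
    if not is_stable(stock_mv):
        return None

    # Step 2: lower the voltage one small step at a time.
    lowest_ok = stock_mv
    candidate = stock_mv - step_mv
    while candidate >= floor_mv and is_stable(candidate):
        lowest_ok = candidate
        candidate -= step_mv

    # Step 3: back off a little from the edge rather than running right on it.
    return min(stock_mv, lowest_ok + margin_mv)


# Hypothetical chip that happens to be stable at 1050 mV and above.
def fake_check(mv: int) -> bool:
    return mv >= 1050


print(find_stable_undervolt(1150, 900, 25, fake_check))  # prints 1075
```

The point of the `None` case is the same as the advice above: if the card isn't stable at stock, the undervolt isn't the culprit and it's time to look at the PSU or the rest of the build.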
 