Good day all. I've been checking this thread here and there and had to create an account for this issue, which I believe Thiago is also experiencing.
So I have been pulling my hair out dealing with the Vega 64 instabilities, and I think it has to be AMD tinkering with the drivers that is causing it. I have a DIY AIO water-cooled reference Vega 64. I basically won the dollar scratch-off ticket of the silicon lottery with this one; it is not a great undervolter. I thought I had a stable profile with the last drivers, "19.3.1": P6 1532 MHz @ 1125 mV, P7 1652 MHz @ 1150 mV, Mem 1080 MHz @ 1100 mV, PL 18%. It would run 1630+ MHz all day with the GPU around 55C and the hot spot in the 80s (fan RPM low and silent).
So I test stability with Valley Benchmark, windowed at 2560x1440 with max settings. I set the camera to free, crank the environment settings (cloud, wind, rain) to max, and move around a bit to find the spot in the environment with the lowest FPS, which puts a 99%+ load on the GPU. It will usually sit there running fine if stationary, but if it's unstable, it will soon crash with "Default Radeon WattMan settings have been restored due to unexpected system failure" when I move around the environment a bit.
Sometimes it will crash and Valley will keep running, but back at stock Vega settings. What I have found is that after a crash, if I reload my profile, the clocks do not go as high as before, so doing the same thing in Valley no longer crashes. Same thing with Witcher 3, except it always quits running after a crash. I believe this "no longer crashing when reloading the profile" is due to the card not clocking as high as it does after a fresh reboot with no previous crash. Glitch with the drivers / software? If I reboot first, the clocks are back to their normal, higher values and it will crash again (if unstable).
So after recently updating to "19.3.2", I started crashing in Witcher 3 again. I rebooted and ran Valley to check stability there and noticed I was hitting 1650+ MHz! It has never hit that at a 1652 P7; it's usually in the 1630s. Of course it crashed a short time later when moving around the environment, that time a hard crash with a black screen. I rebooted and tried again, and again got the same higher 1650+ clocks and a crash after moving around a bit, with the benchmark still running this time. For ***** and giggles I reloaded my profile (after killing and restarting the Radeon Settings process, as it always craps out on a crash), and this time the clocks were in the 1630s and there was no Valley crash.
So there seems to be some software / driver glitch: after a crash, if you reload the same profile without rebooting the PC first, it does not run correctly and tops out at lower max clock speeds.
Also, these 19.3.2 drivers, for me at least, are bumping the MHz somehow; the same profile is no longer stable as it was with 19.3.1.
About my (default) BIOS, 016.001.001.000.008892: it seems to clock closer to the actual P-state that is set. For instance, with P7 at 1652, I'll get close to 1640 MHz. This does not seem to be the case with other BIOSes, correct? With the 19.3.2 drivers, P7 @ 1652 was giving 1650+ MHz.
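If anyone wants to compare their own BIOS, the gap between the set P-state and the clock you actually see is simple arithmetic. Here is a rough Python sketch; `pstate_gap` is just a name made up for illustration, and the observed clocks are the round numbers quoted above, not logged values.

```python
def pstate_gap(set_mhz, observed_mhz):
    """Return (gap in MHz, gap as a percentage of the set clock)."""
    gap = set_mhz - observed_mhz
    return gap, 100.0 * gap / set_mhz


# Round numbers from this post, not logged measurements.
for driver, observed in (("19.3.1", 1640), ("19.3.2", 1650)):
    gap, pct = pstate_gap(1652, observed)
    print(f"{driver}: P7 1652 -> ~{observed} MHz, gap {gap} MHz ({pct:.2f}%)")
```

A smaller gap at the same P7 setting means the card is running closer to the set clock, which is what seems to have changed between the two driver versions.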
**Update:**
So why is this affecting me (and possibly Thiago)? I think it comes down to having a profile that runs the card right on the edge of stability. The profile was fine with 19.3.1; now, with 19.3.2 and 19.3.3, for some reason my card is clocking around 15 - 20 MHz higher, just enough to push it into instability.
I now have to back P7 off to 1642 to get around 1630 MHz and keep it stable.
I also discovered that when I put the PC in standby, which I do often, the damn clocks are lower by around 15 MHz upon waking than they are after a fresh reboot. WTF?
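To put a number on a drop like this, you could log the core clock at the same Valley spot in each state (fresh reboot, after standby, after a crash plus profile reload) with whatever monitoring tool you use, then compare the sustained clocks. A minimal Python sketch, assuming you already have the samples as plain lists of MHz values; the function names, the 10 MHz threshold, and the sample numbers are all made up for illustration and are NOT real logs.

```python
from statistics import median


def sustained_clock(samples_mhz, top_fraction=0.25):
    """Median of the highest `top_fraction` of samples, as a rough 'sustained max'."""
    if not samples_mhz:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mhz, reverse=True)
    keep = max(1, int(len(ordered) * top_fraction))
    return median(ordered[:keep])


def clock_drop(baseline_mhz, other_mhz, threshold_mhz=10):
    """Return (drop in MHz, True if the drop is at least threshold_mhz)."""
    drop = sustained_clock(baseline_mhz) - sustained_clock(other_mhz)
    return drop, drop >= threshold_mhz


# Placeholder samples for illustration only, NOT real logs.
fresh_reboot = [1628, 1631, 1634, 1633, 1630, 1632, 1629, 1634]
after_standby = [1613, 1616, 1619, 1618, 1615, 1617, 1614, 1619]

drop, flagged = clock_drop(fresh_reboot, after_standby)
print(f"drop: {drop} MHz, flagged: {flagged}")
```

Taking the median of the top samples rather than the single highest reading keeps one stray spike from skewing the comparison.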
So I also found out I can use this tool, restart-1.3-src.zip, to restart the display driver instead of rebooting, which then bumps the clocks back to "normal" after a crash or after coming out of standby. It's something I've been using along with another tool (CRU) for monitor overclocking and FreeSync min/max rate tweaks. All found here -
https://www.monitortests.com/forum/Thread-Custom-Resolution-Utility-CRU
Damn fun and PITA card this is...
Also posted this in the drivers section, since it pertains to both.