BSOD Troubleshooting since Vega 64 upgrade

I was posting in the Graphics Card subforum initially as I thought it would be entirely GPU related, but I'm not sure now so I've moved it here. I'll copy/paste my initial post:

Got a NITRO+ a few months ago, and I've been getting BSODs, particularly on Black Ops 4 but it can happen on the desktop too. What I've tried:
  • Reverting to older drivers
  • Switching VBIOS
  • Updating/downgrading motherboard BIOS (Some seem more stable than others...?)
  • Monitoring temperatures (Hot spot gets to 91C, rest of the core 73C)
  • Chkdsk
Here are my specs. I think it's worth noting my previous GPU was an R9 Fury, which I aggressively undervolted but I had no BSODs with.

I get all sorts of stop codes:
Code:
CRITICAL_PROCESS_DIED
PAGE_FAULT_IN_NONPAGED_AREA
SYSTEM_THREAD_EXCEPTION_NOT_HANDLED


@shankly1985 was helping me and thought it was a RAM-related issue, but I ran memtest86 for 3 hours until it completed and there were no errors. No OC, XMP enabled. No OC on the GPU either. As I said though, some BIOS versions are more stable than others: on F24d I'd get a BSOD every day, on F23 I've had a couple in a week. Some days I'll play BO4 for hours before getting a BSOD, other days I won't play any games and I'll crash on the desktop. I haven't seen any evidence that it's my PSU lacking wattage, which some have suggested to me elsewhere, and everything except the GPU seems stable, so I'm clueless.
 
Hey dude. It's gonna be hard to diagnose this because it does seem to be somewhat hardware related, but we can start by analysing the minidump files. Can you go to C:\Windows\Minidump, grab all the files you see there, zip them up and host them somewhere so I can download and take a look at them?
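If doing that by hand is a pain, here is a minimal Python sketch of the zipping step. The function name `zip_minidumps` and the output file name are made up for illustration, and it assumes the dumps use the usual `.dmp` extension. Reading C:\Windows\Minidump may need an elevated prompt.

```python
import zipfile
from pathlib import Path


def zip_minidumps(dump_dir, archive_path):
    """Pack every .dmp file in dump_dir into one zip; return the names added."""
    dump_dir = Path(dump_dir)
    added = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Sorted so the archive order is predictable.
        for dump in sorted(dump_dir.glob("*.dmp")):
            # Store only the file name, not the full Windows path.
            zf.write(dump, arcname=dump.name)
            added.append(dump.name)
    return added
```

Something like `zip_minidumps(r"C:\Windows\Minidump", "minidumps.zip")` would then leave a single archive in the current folder, ready to upload.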
 

Here's a couple minidumps.

Thank you dude, I'm really clueless at this point, besides putting my R9 Fury back in.
 
Yep, had Bluescreen View installed to help the user mentioned in the OP. https://pbs.twimg.com/media/DxUJRX1WwAAV5Rc?format=jpg&name=small
 
Sorry for the late response, I've only just managed to take a look. The bluescreens seem to be hardware related. At this point it could be virtually anything, but the GPU seems most likely. Could you get a temporary card (that isn't your R9 Fury) and try that out? My guess is that if you don't see any more BSODs with the third GPU, you might as well RMA the Vega 64. Sorry I couldn't be of more help. I had this issue recently too and the problem ended up being my CPU.
 
Might be worth pulling all storage except your OS drive and retesting. Whenever I've come across this error, it's been storage related, but it's going to be a process of elimination. The easiest way is to start with the bare minimum hardware installed, test, then reintroduce hardware piece by piece and keep going until it crashes. Have you tried a Windows reinstall?
 
Might be worth pulling all storage except your OS drive and retesting. Whenever I've come across this error, it's been storage related, but it's going to be a process of elimination. The easiest way is to start with the bare minimum hardware installed, test, then reintroduce hardware piece by piece and keep going until it crashes.
I could try this, however I can go days without crashing and then some nights get 4 in an hour. Is there something I can do other than chkdsk that could test my drives?

Have you tried a Windows reinstall?
I reset Windows after 1809 first launched.

Just to be extra sure I guess.
My Fury was stable, so I don't see any reason (just now) why I shouldn't test with that. My other option is to steal my sister's HD 7770 from her PC, and I don't think she'd be content with that. ;)
 
I'd say the hotspot temp of 91C is too high and likely the cause, as the card is probably getting unstable. Create a custom aggressive fan curve to keep it in check and see if that makes any difference mate.

I have the same card and my hotspot temp does not exceed 75 with the card undervolted+overclocked and custom aggressive fan curve. I did have stability issues myself when the hotspot temp was around the figure you mentioned.
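For anyone unsure what a fan curve actually does, here is a rough Python sketch: fan speed is interpolated linearly between (temperature, fan %) points. The points in `CURVE` are made-up illustrative values, not my Wattman settings, and `fan_percent` is just a name picked for the example.

```python
from bisect import bisect_right

# Hypothetical (temperature in C, fan %) points, for illustration only.
CURVE = [(40, 20), (60, 40), (75, 70), (85, 100)]


def fan_percent(temp_c, curve=CURVE):
    """Linearly interpolate fan duty between curve points, clamped at both ends."""
    temps = [t for t, _ in curve]
    if temp_c <= temps[0]:
        return float(curve[0][1])
    if temp_c >= temps[-1]:
        return float(curve[-1][1])
    # Index of the first curve point strictly above temp_c.
    i = bisect_right(temps, temp_c)
    (t0, f0), (t1, f1) = curve[i - 1], curve[i]
    return f0 + (f1 - f0) * (temp_c - t0) / (t1 - t0)
```

A more aggressive curve just means the fan percentages ramp up at lower temperatures, so with these example points the fans would already be pinned at 100% well before a 91C hotspot.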

Happy to share my Wattman settings just drop me a PM.
 
I've been getting BSODs

Here are my specs (Corsair Force GS 128GB)

I get all sorts of stop codes:
Code:
CRITICAL_PROCESS_DIED
PAGE_FAULT_IN_NONPAGED_AREA
SYSTEM_THREAD_EXCEPTION_NOT_HANDLED

Have you updated your SSD firmware? Your Corsair has the SandForce SF-2200 controller, which has a serious bug that causes BSODs with at least 2 of those exact stop codes. I've seen this issue a few times, mostly on Intel SSDs which also use the SandForce controller. It starts off with the blue screens, then finally INACCESSIBLE_BOOT_DEVICE. The issue is fixed in later versions of the firmware, which you can download from the Corsair website. This needs to be done ASAP.

If it's not that, try removing one of your memory modules. If the problem still happens, swap it for the other module and test again. In rare cases, I've seen MemTest86 pass a faulty module.
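Dump viewers often show the hex bug check value rather than the name, so here is a small Python lookup for the stop codes mentioned in this thread. The hex values are the ones Microsoft documents for these names. The helper `stop_code_name` is just an illustrative name, and the table is deliberately limited to these four codes.

```python
# Bug check values for the stop codes named in this thread only.
BUG_CHECKS = {
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x7B: "INACCESSIBLE_BOOT_DEVICE",
    0x7E: "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED",
    0xEF: "CRITICAL_PROCESS_DIED",
}


def stop_code_name(code):
    """Accept an int or a hex string such as '0x000000ef'; return the name or None."""
    if isinstance(code, str):
        code = int(code, 16)
    return BUG_CHECKS.get(code)
```

Anything not in the table comes back as `None`, which is a cue to look it up in Microsoft's bug check reference rather than guess.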
 
Have you updated your SSD firmware? Your Corsair has the SandForce SF-2200 controller which has a serious bug that causes BSODs with at least 2 of those exact stop codes. I've seen this issue a few times, mostly on Intel SSDs which also use the SandForce controller. It starts off with the blue screens, then finally INACCESSIBLE_BOOT_DEVICE.

I updated my SSD firmware before factory resetting Windows, but I just checked Toolbox and it's the latest version (5.24).

I'd say the hotspot temp of 91C is too high and likely the cause, as the card is probably getting unstable. Create a custom aggressive fan curve to keep it in check and see if that makes any difference mate.

I have undervolted and applied a custom fan curve which is slightly more aggressive in case that's the cause, however I've since uninstalled TriXX and haven't been monitoring.
 
Thought I'd give a little update: I flashed my BIOS a week ago and I haven't experienced a BSOD since. Could that be indicative of anything, or is it just a coincidence? I did have a brief moment of stability when I flashed an older BIOS (F23 over F24d), but it lasted just a few days before the crashes came back. F25 has been really stable.
 

Sherlock Holmes: Whatever remains, however improbable, must be the answer... computers, they can drive you mad sometimes :)
 