Random crash/reboot

  • Thread starter ne0

ne0

Associate
Joined
2 Feb 2018
Posts
1,108
My system's become unstable in recent weeks, to the point where I'm now slightly concerned I have a hardware issue.

I was convinced it all started when the new 50 series drivers came out, but we've seen 3 or 4 new drivers since then and, despite using DDU to do everything properly, I'm now seeing crashes/BSODs/restarts in various games.

I think it's important to mention that so far, it has only ever happened under load and in most cases (not all) it has been when I have tried to alt tab out of a game.

Either the system will freeze up and hang, forcing a hard restart, or it will BSOD and then restart itself. But it's very intermittent - sometimes I can alt tab out of a game multiple times and nothing happens. I can go for hours with nothing happening and then it can just happen out of the blue.

Event Viewer shows the really unhelpful Kernel-Power error, which apparently can mean a lot of different things. I can't see anything else in there that would give any clues.

I have noticed recently that my 3090 seems to be running maybe a touch warmer than usual - still well within safe temps (max 78°C) but usually it would be in the low 70s.

I have tried running MemTest86 on each stick of RAM separately - no errors - admittedly I only did 1 pass on each.
Ran sfc /scannow and it found errors and fixed them.

My PSU is a good one - Corsair Platinum 1000W - and it's still under warranty, so I may pursue that.

BIOS and all drivers are up to date.
Windows is up to date.

General system performance seems normal otherwise.

Any advice on what I should try next?
 
You could try an older Nvidia driver. I am still on 561.09 (Studio) and am not having any problems.

If you have Windows 24H2, that could be causing problems as well.
 
As mentioned above, it could be a Windows update, or 24H2 may have been installed.

I would check the Windows update history and see if any updates or 24H2 were installed around the time the issues started.

If 24H2 was installed around the time the issues started, I would see if it's possible to roll back to 23H2 and then test again.
 
Thanks, I have been waiting for another incident to occur before updating. System restarted last night when exiting KCD2. This has to be a driver or OS issue, surely? I'm trying to think of a reason it would happen when exiting an application but not really sure.
 
I'm trying to think of a reason it would happen when exiting an application but not really sure.
The main thing that can happen when you exit an application is that you're switching from full screen 3D to 2D. I know Windows 24H2 made some changes in the background there, but it could also be driver related, since the card switches power states.

This has to be a driver or OS issue, surely?
It would be good to try and rule it out, though do note that if your GPU is running hotter, it might also be dumping more heat into the case, which could make overclocks unstable.

Ran sfc /scannow and it found errors and fixed them.
Is the SMART status of your drives all good?
 
No overclocks apart from XMP, so I think I'm OK there.

Can you elaborate on SMART status of drives? Not sure what you mean or what to check.
 
Have a look at something like CrystalDiskInfo or Samsung Magician and make sure the drives aren't using reserved blocks, developing bad blocks or displaying other errors.
 
Thanks. I checked the drives for errors and all OK.

Was looking at my cooling in iCUE yesterday and noticed an alert set to restart my system when the pump coolant reaches 70°C. I have disabled that and will see if it happens again. I'm skeptical it's the cause, as there's no reason for my CPU to have reached that temp and I monitor the temps on an LCD screen, but there's always a possibility the software is buggy. Also, as nothing was reported in iCUE after it rebooted, I'm not sure it's the cause.
 