Help with PC game freezes and random part-shutdowns

Hi all,

I have been experiencing issues for a while now with a PC I built myself, and was wondering if you could help, as I am out of ideas on how to troubleshoot the problem. The specs:
Nothing is overclocked AFAIK. CPU settings are on AUTO in the BIOS, I reset the RAM to its stock speed of 2133 MHz, and I even undervolted the GPU (automatic -2%, and manual to 2000 MHz/1100 mV in Radeon Software). I have run the disk checker and memtest86 without errors. When stress testing, the temps stay around ~65C CPU, ~80C GPU.

The problem(s):
  • Games keep freezing or crashing. This seems GPU related, as sometimes Elite Dangerous locks up but I can still hear sounds in the background, and if I am in the menus the on-mouse-hover sounds keep playing, etc. The video stays frozen, though, and eventually the game either crashes or I close it.
  • Fullscreen or windowed-borderless games cause black screens from which the only way to recover is a hard reset (may be related to the 5700 XT black screen issue). Happens in Red Dead Redemption 2, Battlefield 5, Elite Dangerous, World of Warships (sometimes), Generation Zero, etc. I have submitted AMD support tickets for this one.
  • And the one that has me most worried and the reason for this post - last night the system partly shut down. All the peripherals shut down - monitor, audio, keyboard, mouse - but the fans and LEDs in the case stayed on. Pressing the reset or power buttons did nothing. I had to use the PSU power switch to reset the machine.
Due to the last issue I am worried that it may be my old PSU finally starting to act up and I don't want it to damage my system. Maybe you can help me track down the problem(s)?

Thanks in advance,
Nahdir
 
It could be your old PSU; however, we could try to narrow it down by trying a few things... Have you got all your drivers up to date?
 
Welcome aboard.

Quite a few things could be the cause of that.
Or there may actually be multiple problems.
Have you checked Event Viewer and the reliability history for errors time-stamped around those problems?
Those could give hints.

If you use the PC a lot daily, the PSU could be considered well used.
Currently PSU availability might be just bad because of coronavirus-related things. (really bad at OcUK)

How long did you run Memtest?
It should be run for many passes/a longer time.

The PC not powering down could also be something in the motherboard.
(which controls the signal keeping the PSU powered)
 
No harm in cleaning out your fans. You'd be surprised how many faults a hot system can produce.

Other than that, if all other software fixes fail, reinstall the OS or install on a new HD or partition, and try other drivers.

Then it's a case of swapping out components, unless you're handy with a multimeter or oscilloscope.
 
Yep, all drivers up to date (motherboard, BIOS, GPU, Windows).
There are a LOT of errors in the Event Viewer and Reliability Monitor, which can be summarised as:
Reliability Monitor:
  • Windows was not properly shut down.
  • Hardware error. Problem Event Name: LiveKernelEvent. Code: 141.
  • Some Application Stopped Responding errors.
Event Viewer:
  • Perflib - 1020 - The required buffer size is greater than the buffer size passed to the Collect function of the "C:\Windows\System32\perfts.dll" Extensible Counter DLL for the "LSM" service. The given buffer size was 26936 and the required size was 31384.
  • Service Control Manager - 7034 - The AMD User Experience Program Launcher service terminated unexpectedly. It has done this 1 time(s).
  • Kernel-Boot - 29 - Windows failed fast startup with error status 0xC00000D4.
  • Application Error - 1000 - Faulting application name: Radeonsoftware.exe, version: 10.1.2.1798, time stamp: 0x5ecc0ba5. Faulting module name: ntdll.dll, version: 10.0.18362.815, time stamp: 0xb29ecf52. Exception code: 0xc0000374.
And others relating to individual application crashes. Looks like a good place to continue investigating.
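
For anyone digging into the hex codes above, here is a rough Python sketch, assuming the two status values follow the standard NTSTATUS bit layout (severity in the top two bits, a customer flag in bit 29, facility in bits 16-27, code in the low 16 bits). The name `decode_ntstatus` is made up for illustration. It only splits a value into its fields; it does not look up what the code means.

```python
def decode_ntstatus(value: int) -> dict:
    """Split a 32-bit NTSTATUS value into its bit fields (hypothetical helper)."""
    severities = {0: "success", 1: "informational", 2: "warning", 3: "error"}
    return {
        "severity": severities[(value >> 30) & 0x3],    # bits 31-30
        "customer_defined": bool((value >> 29) & 0x1),  # bit 29
        "facility": (value >> 16) & 0xFFF,              # bits 27-16
        "code": value & 0xFFFF,                         # bits 15-0
    }


# The two status values quoted in the Event Viewer entries above.
for raw in ("0xC00000D4", "0xc0000374"):
    print(raw, decode_ntstatus(int(raw, 16)))
```

Both values decode to severity "error" with facility 0.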

In the last 10 years the PC probably has been on almost daily. :eek:

I ran the standard memtest tests. It ran for about 4 hours, something like 13 tests, 4 cycles each.
The PC was moved to a new case a few months ago, so the fans & meshes are pretty clean. I also cleaned the PSU fan when upgrading. I reinstalled Windows a few months ago to try to solve these problems, but they still persist. I haven't tried a new partition though, just a clean format. I have tried a bunch of different GPU drivers that purported to solve the black screen problem. None did for me: 19.x.x, a bunch of the 19.12.x & 20.x.x drivers. I'm now on the latest 20.5.1, where the issues are still happening.

It might still be a heat problem as I'm not happy with my CPU temps. I also forgot to mention that I RMA'd the GPU once for another of the same kind and the errors persisted.

Thanks for your help everyone. I am going to look into the errors highlighted above, and will run some 3DMark tests. Someone mentioned those tests helped them reproduce similar errors, while FurMark did not - which is what I ran.
 
Try resetting the BIOS to its default settings.

It could be an HD error; a quick Google search associates one of the errors above with an SSD problem.
 
LiveKernelEvent 141 is hardware/hardware driver related.

The Perflib 1020 error is rather vague, but shouldn't be critical.
http://kb.eventtracker.com/evtpass/evtPages/EventId_1020_Microsoft-Windows-Perflib_67501.asp

You can disable the AMD User Experience thing without losing anything.
https://www.amd.com/en/corporate/amd-user-experience
Useless junk running in the background is never a good thing.

The Kernel-Boot error might be caused by Fast Startup, which seems like a typical MS screwup and is as likely to cause problems as to help, so you could try disabling it.
https://www.howtogeek.com/243901/the-pros-and-cons-of-windows-10s-fast-startup-mode/

ntdll.dll is a Windows component, so that's not exactly a promising error.
Though did you try a clean install of the graphics card drivers?
 
So I have:
  • Reset BIOS
  • Checked Windows image
  • Disabled AMD User Experience
  • Disabled fast boot
  • Tried a different PSU
I haven't experienced any black screens or system hangs since. ED is still crashing in the starport menus, but I suspect it's a bug in the game, as the stack trace is pointing to the Scaleform API. I have submitted a bug report with Frontier and will see where it takes me. I tried running Warframe and did not experience a system hang either, but I will have to keep checking to be sure. Unfortunately I can't reproduce it with the 3DMark stress test.

I did a clean install of the drivers before, but not for the ones I am using now. It did not solve the issue before.
I am also not sure how to troubleshoot whether the system hang is an HD problem. Move the affected games to a different HDD?

For now I will wait and see what Frontier support says, and if the issue exists in other games.
 
A malfunctioning drive can cause whole-system issues just by being connected.
Even when it isn't used for actual data traffic, there's frequent communication between the host and client devices on any bus, and a client can cause problems by sending non-standard data that the host end can't understand.

But because most issues went away, chances are good that the drive isn't the cause.
Though if that drive is an 840 EVO and not an 840 Pro, then it's made of really shaky tech.
 
Well, I switched the PSU back and had a system lock-up straight away. It was even in League of Legends, where it had never happened before. As such, I'm gonna play it safe and get a new PSU, SSD, and more cooling for the CPU. I should get them on Tuesday and will see if the problems go away. Thanks for your help so far.
 