Computer just shuts down unexpectedly during gaming

Associate
Joined
21 Nov 2012
Posts
106
Location
Glasgow
Hi,
Bit of an odd one: after running pretty much flawlessly, my main computer has developed an issue in the last two days where it shuts down unexpectedly during gaming after a few seconds of stuttering. No blue screen, no blank screen, just a straight switch-off.

Spec is
i9-13900K w/ Corsair H170i Elite
Asus Z790 Extreme
2x 16GB Corsair Vengeance Black 7200MHz DDR5, 32GB total (no XMP, as it was originally listed as a supported/approved set but was pulled from the listings after release :mad:)
RTX 4090 FE
Corsair HX1200
2TB Firecuda 530 nvme
Windows 10 Pro 64-bit (was 11 but had stability issues, so went back to 10 for now.)

I've removed the header in case the switch was faulty: no change.
Ran HW monitor to keep an eye on temps and voltages: nothing too excessive during stress testing.
Re-applied thermal paste (Kryonaut); the old application looked good. No change.
Updated to the BIOS that Asus just released on the 12th to address a microcode issue, and ran into the news about 13th and 14th gen CPUs failing. I really hope this isn't it. Now running in the Intel performance spec.
Also updated every driver on it, including the Intel Management Engine etc.
It actually seems to be getting more frequent.
Plenty of errors in Event Viewer, but not seeing anything critical?
Everything else on the computer runs exceptionally cool, as it's inside a Corsair Obsidian 900D w/ all the fans.

I'm not sure what to do given I get no error log or blue screen to work from.

Any suggestions/ help would be very much appreciated!
 
Man of Honour
Joined
22 Jun 2006
Posts
12,509
The most obvious reason for a PC to just shut down out of the blue under load with no error messages is the PSU, but admittedly it is hard to trust these CPUs right now.

You haven't moved the PC lately and might have unsettled the power cables?

How is your 4090 connected to the power supply?

plenty of errors in event viewer but not seeing anything critical?
The majority of event viewer errors are meaningless, but WHEA errors would be more interesting.

What stability issues did you have with Windows 11?
 
Soldato
Joined
28 Aug 2017
Posts
2,879
Location
United Kingdom
Straight power off would suggest a power issue / overtemp problem. Though having a 13900k does also ring alarm bells given recent news.

I would ask what kind of age your components are first. If everything checks out, I would be looking at the CPU in all honesty. Try different power states to see if the problems lessen or stop; if they do, you may have a bad CPU.

As Tetras said, check your event log for WHEA errors. Those basically point to the CPU having issues; if you have loads of them then your CPU is the problem.
 
Associate
OP
Joined
21 Nov 2012
Posts
106
Location
Glasgow
The most obvious reason for a PC to just shut down out of the blue under load with no error messages is the PSU, but admittedly it is hard to trust these CPUs right now.

You haven't moved the PC lately and might have unsettled the power cables?

How is your 4090 connected to the power supply?


The majority of event viewer errors are meaningless, but WHEA errors would be more interesting.

What stability issues did you have with Windows 11?
Lots of WHEA errors :( There's not even a second between them at points.

A corrected hardware error has occurred.

Component: PCI Express Root Port
Error Source: Advanced Error Reporting (PCI Express)

Primary Bus:Device:Function: 0x0:0x1C:0x4
Secondary Bus:Device:Function: 0x0:0x0:0x0
Primary Device Name: PCI\VEN_8086&DEV_7A3C&SUBSYS_88821043&REV_11
Secondary Device Name:


All I get on the critical error list is an unexpected shutdown.

This one is repeated most of the way through.
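
For anyone wanting to pick the record above apart, here is a minimal Python sketch that decodes the Bus:Device:Function triple and the hardware ID string. The function names and the tiny vendor table are made up for illustration; the table only holds the two vendor IDs that appear in this record, and the event text is assumed to have been copied out of Event Viewer by hand.

```python
import re

# Only the two PCI vendor IDs that appear in the record above.
PCI_VENDORS = {0x8086: "Intel", 0x1043: "ASUSTeK"}

def parse_bdf(text):
    """Turn a string like '0x0:0x1C:0x4' into (bus, device, function) integers."""
    bus, dev, fn = (int(part, 16) for part in text.strip().split(":"))
    return bus, dev, fn

def parse_hardware_id(hwid):
    """Split a Windows PCI hardware ID string into its numeric fields."""
    pattern = (r"VEN_([0-9A-Fa-f]{4})&DEV_([0-9A-Fa-f]{4})"
               r"&SUBSYS_([0-9A-Fa-f]{8})&REV_([0-9A-Fa-f]{2})")
    match = re.search(pattern, hwid)
    if match is None:
        raise ValueError("not a recognisable PCI hardware ID: " + hwid)
    ven, dev, subsys, rev = match.groups()
    return {
        "vendor_id": int(ven, 16),
        "device_id": int(dev, 16),
        # SUBSYS packs the subsystem ID first, then the subsystem vendor ID.
        "subsystem_id": int(subsys[:4], 16),
        "subsystem_vendor_id": int(subsys[4:], 16),
        "revision": int(rev, 16),
    }

bdf = parse_bdf("0x0:0x1C:0x4")
ids = parse_hardware_id(r"PCI\VEN_8086&DEV_7A3C&SUBSYS_88821043&REV_11")
print("bus/device/function:", bdf)
print("vendor:", PCI_VENDORS.get(ids["vendor_id"], "unknown"))
print("board vendor:", PCI_VENDORS.get(ids["subsystem_vendor_id"], "unknown"))
```

Run against this record, it reports an Intel root port at bus 0, device 28, function 4, on an ASUSTeK board. It says nothing about what is plugged into that port.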

The 4090 was bought at launch in May 2023 iirc, and the full 13900K rig was put together from new in December 2022. The power supply is a little older.

Power to the 4090 is using all individual PCIe cables, 4 into the 12-pin FE adaptor (no daisy-chained connectors). The cable is as straight as can be and fully seated (lots of room in a 900D).

It just crashed while typing this, not even gaming. :rolleyes:

I was having memory stability issues that seemed to be significantly worse under Windows 11 than in 10, though this may have been more attributable to the dodgy BIOS issues the Extreme seemed to suffer on release, which is probably why it was discontinued so quickly. But at the time it seemed to make a difference. I haven't re-tried it since, to be honest.

Yes after seeing the reports coming out I was dreading it being the same issue.


 
Man of Honour
Joined
22 Jun 2006
Posts
12,509
Lots of WHEA errors :( There's not even a second between them at points.

A corrected hardware error has occurred.

Component: PCI Express Root Port
Error Source: Advanced Error Reporting (PCI Express)
Are you using a riser?

You could try setting it to PCI-E gen 3.0 in the BIOS.

A few of these PCIE errors are not a problem (especially if they're when the PC boots), but if they're very frequent then that's more alarming.
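
One way to put a number on "very frequent" is to take the timestamps of the WHEA events and look at how tightly they cluster. A rough Python sketch follows; `summarize_events` is a hypothetical helper, and the sample timestamps are made up, not taken from this PC's log. It deliberately applies no threshold, since where "alarming" starts is a judgment call.

```python
from datetime import datetime

def summarize_events(timestamps):
    """Summarise how densely a list of ISO-format event timestamps is packed."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    # Seconds between each event and the next one.
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    span = (times[-1] - times[0]).total_seconds() if len(times) > 1 else 0.0
    return {
        "count": len(times),
        "span_seconds": span,
        "shortest_gap_seconds": min(gaps) if gaps else None,
        "events_per_minute": len(times) / (span / 60) if span > 0 else None,
    }

# Made-up sample data for illustration only.
sample = [
    "2024-07-14T20:15:01.000000",
    "2024-07-14T20:15:01.500000",
    "2024-07-14T20:15:03.000000",
    "2024-07-14T20:16:01.000000",
]
print(summarize_events(sample))
```

A handful of events clustered around boot would show a short span and then nothing, whereas sub-second gaps spread across a whole session are the pattern described above.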

I was having memory stability issues that seemed to be significantly worse under Windows 11 than in 10, though this may have been more attributable to the dodgy BIOS issues the Extreme seemed to suffer on release, which is probably why it was discontinued so quickly. But at the time it seemed to make a difference. I haven't re-tried it since, to be honest.
Presumably your memory is actually running at 4800 or 5200, if XMP is disabled?
 
Associate
OP
Joined
21 Nov 2012
Posts
106
Location
Glasgow
Are you using a riser?

You could try setting it to PCI-E gen 3.0 in the BIOS.

A few of these PCIE errors are not a problem (especially if they're when the PC boots), but if they're very frequent then that's more alarming.


Presumably your memory is actually running at 4800 or 5200, if XMP is disabled?
No riser. Just straight into the board, and supported as well.

Memory is 4800 CAS 40 at the moment.
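
For context on what 4800 CAS 40 means in real terms, here is a small worked calculation in Python. It assumes a dual-channel setup with 64 bits (8 bytes) per channel, which is the usual arrangement for two DIMMs on a desktop board; the function names are made up for illustration, and the figures are theoretical peaks, not measured throughput.

```python
def cas_latency_ns(cas_cycles, transfer_rate_mts):
    """First-word CAS latency in nanoseconds.

    DDR transfers twice per clock, so the clock in MHz is half the MT/s
    figure and one cycle lasts 2000 / MT/s nanoseconds.
    """
    return cas_cycles * 2000 / transfer_rate_mts

def peak_bandwidth_gb_s(transfer_rate_mts, channels=2, bytes_per_channel=8):
    """Theoretical peak bandwidth in GB/s (assumes 64-bit channels)."""
    return transfer_rate_mts * bytes_per_channel * channels / 1000

print("DDR5-4800 CL40 latency (ns):", round(cas_latency_ns(40, 4800), 2))
print("DDR5-4800 peak (GB/s):", peak_bandwidth_gb_s(4800))
print("DDR5-7200 peak (GB/s):", peak_bandwidth_gb_s(7200))
```

On paper that is 76.8 GB/s at 4800 against 115.2 GB/s at the kit's rated 7200. The CAS figure at 7200 is not given here, so no latency comparison is attempted.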

Would that not strangle the performance of my 4090 going to 3.0?
 
Man of Honour
Joined
22 Jun 2006
Posts
12,509
Would that not strangle the performance of my 4090 going to 3.0?
No, you only lose a few %, but regardless, we're just trying to rule out the Intel issues because I'm afraid to say CPU connected devices producing errors is part of the symptoms.
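
To show what the gen 3 setting changes on paper, here is a short Python calculation of the theoretical x16 link rate. It uses the published per-lane rates (8 GT/s for PCIe 3.0, 16 GT/s for 4.0) and the 128b/130b encoding both generations share. The function name is made up, and this is raw link bandwidth per direction only; it does not model game performance, so the "few %" figure above is not something the code can confirm.

```python
def pcie_bandwidth_gb_s(gigatransfers_per_s, lanes=16):
    """Theoretical one-direction bandwidth in GB/s for a PCIe 3.0/4.0 link.

    Each transfer carries one bit per lane, and 128b/130b encoding means
    128 of every 130 bits are payload. Divide by 8 to go from bits to bytes.
    """
    return gigatransfers_per_s * (128 / 130) / 8 * lanes

gen3 = pcie_bandwidth_gb_s(8)
gen4 = pcie_bandwidth_gb_s(16)
print("PCIe 3.0 x16 (GB/s):", round(gen3, 2))
print("PCIe 4.0 x16 (GB/s):", round(gen4, 2))
```

The link rate halves, from roughly 31.5 GB/s to roughly 15.75 GB/s. The drop in games being far smaller than that would suggest the card seldom fills the link.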

 
Associate
OP
Joined
21 Nov 2012
Posts
106
Location
Glasgow
No, you only lose a few %, but regardless, we're just trying to rule out the Intel issues because I'm afraid to say CPU connected devices producing errors is part of the symptoms.

Yeah, I'm getting pretty worried. I'll give it a go. I've just checked the power connector on the 4090 in case the old melting power pins issue was in progress (I've been trying my best not to disturb them since it was fitted in case I provoked the problem), but it's fully intact, no sign of any issue there at all.
 
Man of Honour
Joined
22 Jun 2006
Posts
12,509
I've just checked the power connector on the 4090 in case the old melting power pins was in progress (I've been trying my best not to disturb them since it was fitted in case I provoked the problem) but it's fully intact, no sign of any issue there at all.
Glad to hear that, are you using a support bracket to prop up the 4090?
 
Associate
OP
Joined
21 Nov 2012
Posts
106
Location
Glasgow
Glad to hear that, are you using a support bracket to prop up the 4090?
Because the bottom of the case is so far away, it's held up with a fine piece of black nylon attached to the overhead AIO. Not taking any chances. :cry: OK, going to give it a try with a few loops on 3DMark. I suspect Intel's new microcode is going to have it running a lot slower. (And I thought I had left the performance losses of hardware vulnerabilities behind :( )
 
Man of Honour
Joined
22 Jun 2006
Posts
12,509
I suspect Intel's new microcode is going to have it running a lot slower. (And I thought I had left the performance losses of hardware vulnerabilities behind :( )
It obviously isn't ideal to lose any performance, but it is not a big deal for gaming, even with the slowest profile Intel offer.

The top-end performance in benchmarks or long-run workloads can be impacted a lot more because they're more likely to exceed the power limits, or use the max single-core boost.
 
Associate
OP
Joined
21 Nov 2012
Posts
106
Location
Glasgow
Some unusual behaviour so far: the cores boosting to the highest clock speeds appear to be cores 4 and 5, hitting 5.8GHz, with the rest capping at 5.5GHz and the E-cores at 4.3GHz as usual. Temperatures are significantly down since the BIOS update, at 70 max; previously it was about 82-85.
 
Associate
OP
Joined
21 Nov 2012
Posts
106
Location
Glasgow
So after 20 loops of the 3DMark Steel Nomad stress test and a good chunk of time on Prime95, it's not crashed since changing to PCI-E gen 3, and there are no more WHEA errors, though there are some others I've not seen before:

  1. Unable to open the job object \BaseNamedObjects\WmiProviderSubSystemHostJob for query access. The calling process may not have permission to open this job. The first four bytes (DWORD) of the Data section contains the status code.
  2. Metadata staging failed, result=0x80070490 for container
 
Man of Honour
Joined
22 Jun 2006
Posts
12,509
So after 20 loops of 3dmark steel nomad stress test and a good chunk of time on prime95 it's not crashed since changing to pci-e gen 3 and no more WHEA errors
Sounds good!

  1. Unable to open the job object \BaseNamedObjects\WmiProviderSubSystemHostJob for query access. The calling process may not have permission to open this job. The first four bytes (DWORD) of the Data section contains the status code.
  2. Metadata staging failed, result=0x80070490 for container
I haven't done any research or anything, but I don't think that is likely to be an error to worry about.
 
Associate
OP
Joined
21 Nov 2012
Posts
106
Location
Glasgow
It obviously isn't ideal to lose any performance, but it is not a big deal for gaming, even with the slowest profile Intel offer.

The top-end performance in benchmarks or long-run workloads can be impacted a lot more because they're more likely to exceed the power limits, or use the max single-core boost.
I think the frustration for me is mainly that I chose Intel over AMD at the time off the back of the increased memory bandwidth, with 7200 initially being touted as a perfectly stable speed during the pre-launch review cycle, and now I'm stuck at 4800, rolling back to PCIE gen 3 and running the biggest AIO I could get my hands on, and it's potentially still **** its pants. :mad:
 
Man of Honour
Joined
22 Jun 2006
Posts
12,509
and now I'm stuck at 4800
You might be able to sort that, I'm not sure what fiddling/troubleshooting you did at the time. That said, with the rumours/news about these CPUs, it may not be a good idea to be running your IMC at high clocks/volts at the moment.

rolling back to PCIE gen 3
If this fixes your crashing, I don't know what it actually signifies; usually this only fixes the problem if someone is using a riser that wasn't intended for PCI-E 4.0.

You have a high-end motherboard, so I can't see it being a motherboard problem.

I know that Wendell mentioned issues with the PCI-E bus and NVMe drives crashing/producing errors, so it could unfortunately be part of that. I can't say for certain either way, and it'll be a while before we can verify 100% that PCI-E gen 3 has fixed your issue.

and it's potentially still **** its pants. :mad:
13900K/14900K users do seem to have got a raw deal this generation :o

I will say though, in the GN video their thinking about contamination was that the affected CPUs were manufactured in 2023-2024, so if you have an early CPU, I suspect it is less likely to be affected. GN haven't gathered the data on the batches yet, and there could be multiple issues at play with the CPUs, with some level of degradation not being connected to the possible contamination during manufacturing.
 