Computer just shuts down unexpectedly during gaming

Associate
Joined
21 Nov 2012
Posts
106
Location
Glasgow
Hi,
Bit of an odd one. After running pretty much flawlessly, my main computer has developed an issue in the last two days: it shuts down unexpectedly during gaming after a few seconds of stuttering. No blue screen, no blank screen, just a straight switch-off.

Spec is
I9 13900K w/ Corsair H170I elite
Asus Z790 extreme
2x 16GB Corsair Vengeance Black (32GB total) 7200MHz DDR5 (no XMP, as it was originally listed as a supported/approved set but was pulled from the listings after release :mad:)
RTX 4090 FE
Corsair HX1200
2TB Firecuda 530 nvme
Windows 10 64 bit pro (was 11 but had stability issues so went back to 10 for now.)

I've removed the header in case the switch was faulty. No change.
Ran HWMonitor to keep an eye on temps and voltages; nothing too excessive during stress testing.
Re-applied thermal paste (Kryonaut); the old application looked good. No change.
Updated to the BIOS that Asus released on the 12th to address a microcode issue, and ran into the news about 13th and 14th gen CPUs failing. I really hope this isn't it. Now running on the Intel performance spec.
Also updated every driver on it, including the Intel Management Engine etc.
It actually seems to be getting more frequent.
Plenty of errors in Event Viewer, but I'm not seeing anything critical.
Everything else on the computer runs exceptionally cool, as it's inside a Corsair Obsidian 900D w/ all the fans.

I'm not sure what to do given I get no error log or blue screen to work from.

Any suggestions/ help would be very much appreciated!
 
The most obvious reason for a PC to just shut down out of the blue under load with no error messages is the PSU, but admittedly it is hard to trust these CPUs right now.

You haven't moved the PC lately, which might have unsettled the power cables?

How is your 4090 connected to the power supply?


The majority of event viewer errors are meaningless, but WHEA errors would be more interesting.
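For anyone wanting to pull just the WHEA entries out of the noise, here is a minimal sketch that builds a `wevtutil` query for the System log, filtered to the `Microsoft-Windows-WHEA-Logger` provider. The function name `build_whea_query` and the default of 20 events are illustrative choices, not anything from this thread, and the command only does something useful on Windows.

```python
import platform
import subprocess

# Provider name used by Windows for hardware error (WHEA) events.
WHEA_PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_whea_query(max_events: int = 20) -> list[str]:
    """Build a wevtutil command listing the newest WHEA-Logger events.

    build_whea_query is a hypothetical helper name; the flags are standard
    wevtutil options (/q = XPath query, /c = count, /rd = newest first,
    /f = output format).
    """
    xpath = f"*[System[Provider[@Name='{WHEA_PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",
        f"/c:{max_events}",
        "/rd:true",
        "/f:text",
    ]


if __name__ == "__main__":
    cmd = build_whea_query()
    if platform.system() == "Windows":
        # Run the query and show whatever wevtutil returns.
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        print(result.stdout or result.stderr)
    else:
        print("Not on Windows; the command would be:", " ".join(cmd))
```

Running it from an elevated prompt is probably the safest bet, since reading some logs can require administrator rights.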

What stability issues did you have with Windows 11?
Lots of WHEA errors :( There's not even a second between them at points.

A corrected hardware error has occurred.

Component: PCI Express Root Port
Error Source: Advanced Error Reporting (PCI Express)

Primary Bus:Device:Function: 0x0:0x1C:0x4
Secondary Bus:Device:Function: 0x0:0x0:0x0
Primary Device Name: PCI\VEN_8086&DEV_7A3C&SUBSYS_88821043&REV_11
Secondary Device Name:
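
The fields in that WHEA entry can be unpacked a little. Below is a rough sketch that splits the hardware ID string and the Bus:Device:Function value into their parts. The helper names `parse_hardware_id` and `parse_bdf` are made up for illustration, and the vendor table only contains the two IDs that appear here (8086 is Intel, 1043 is ASUSTeK); it does not attempt to name the specific device behind `DEV_7A3C`.

```python
import re

# Only the two vendor IDs seen in this log entry are listed (assumption:
# anything else is simply reported as unknown).
KNOWN_VENDORS = {"8086": "Intel", "1043": "ASUSTeK"}


def parse_hardware_id(hwid: str) -> dict:
    """Split a Windows PCI hardware ID into its hex fields."""
    fields = dict(re.findall(r"(VEN|DEV|SUBSYS|REV)_([0-9A-Fa-f]+)", hwid))
    subsys = fields.get("SUBSYS", "")
    return {
        "vendor_id": fields.get("VEN"),
        "device_id": fields.get("DEV"),
        # SUBSYS packs the subsystem ID (first 4 hex digits) followed by
        # the subsystem vendor ID (last 4 hex digits).
        "subsystem_id": subsys[:4] or None,
        "subsystem_vendor_id": subsys[4:] or None,
        "revision": fields.get("REV"),
    }


def parse_bdf(text: str) -> tuple[int, int, int]:
    """Convert a '0x0:0x1C:0x4' style string to (bus, device, function)."""
    bus, device, function = (int(part, 16) for part in text.split(":"))
    return bus, device, function


info = parse_hardware_id(r"PCI\VEN_8086&DEV_7A3C&SUBSYS_88821043&REV_11")
print(KNOWN_VENDORS.get(info["vendor_id"], "unknown"), info["device_id"])
print(KNOWN_VENDORS.get(info["subsystem_vendor_id"], "unknown"), info["subsystem_id"])
print(parse_bdf("0x0:0x1C:0x4"))
```

So the reporting component is an Intel root port at bus 0, device 28, function 4, on an ASUS-branded board, which matches the "PCI Express Root Port" component line.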


All I get on the critical error list is an unexpected shutdown.

This one is repeated most of the way through.

The 4090 was bought at launch in May 2023 iirc, and the full 13900K rig was put together from new in December 2022. The power supply is a little older.

Power to the 4090 is using all individual PCIe cables, 4 into the 12-pin FE adaptor (no daisy-chained connectors). The cable is as straight as can be and fully seated (lots of room in a 900D).

It just crashed while typing this not even gaming. :rolleyes:

I was having memory stability issues that seemed to be significantly worse under Windows 11 than under 10, though this may have been more down to the dodgy BIOS issues the Extreme seemed to suffer on release, which is probably why it was discontinued so quickly. But at the time it seemed to make a difference. I haven't re-tried it since, to be honest.

Yes after seeing the reports coming out I was dreading it being the same issue.


 
Are you using a riser?

You could try setting it to PCI-E gen 3.0 in the BIOS.

A few of these PCIE errors are not a problem (especially if they're when the PC boots), but if they're very frequent then that's more alarming.


Presumably your memory is actually running at 4800 or 5200, if XMP is disabled?
No riser, just straight into the board, and supported as well.

Memory is 4800 CAS 40 at the moment.
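
For a sense of what the 4800 vs 7200 difference means on paper, here is a small back-of-the-envelope sketch. It assumes a dual-channel setup with 64 bits (8 bytes) of data per channel, and the function names are just illustrative. These are theoretical peak figures, not measured throughput.

```python
def peak_bandwidth_gb_s(mt_per_s: int, channels: int = 2, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (assumes 64-bit channels, dual channel)."""
    return mt_per_s * 1e6 * bytes_per_transfer * channels / 1e9


def cas_latency_ns(cas_cycles: int, mt_per_s: int) -> float:
    """First-word CAS latency in nanoseconds.

    DDR transfers twice per clock, so one clock cycle lasts 2000 / (MT/s) ns.
    """
    return cas_cycles * 2000 / mt_per_s


for speed in (4800, 7200):
    print(f"DDR5-{speed}: {peak_bandwidth_gb_s(speed):.1f} GB/s peak")

print(f"DDR5-4800 CL40: {cas_latency_ns(40, 4800):.2f} ns CAS latency")
```

On these assumptions the gap is 76.8 GB/s versus 115.2 GB/s of peak bandwidth, so the kit is running at two thirds of its rated figure.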

Would that not strangle the performance of my 4090 going to 3.0?
 
No, you only lose a few %, but regardless, we're just trying to rule out the Intel issues because I'm afraid to say CPU connected devices producing errors is part of the symptoms.
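
To put numbers on the gen 3 vs gen 4 question, here is a quick sketch of the raw link bandwidth for an x16 slot. It uses the published per-lane rates (8 GT/s for gen 3, 16 GT/s for gen 4) and the 128b/130b encoding both generations share; the function name is hypothetical. This is only the link ceiling and says nothing directly about frame rates, which depend on how much a given game actually pushes over the bus.

```python
# Per-lane signalling rate in GT/s; gen 3 and gen 4 both use 128b/130b encoding.
GT_PER_S = {3: 8.0, 4: 16.0}
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gb_s(gen: int, lanes: int = 16) -> float:
    """Theoretical one-direction link bandwidth in GB/s."""
    return GT_PER_S[gen] * ENCODING_EFFICIENCY / 8 * lanes


for gen in (3, 4):
    print(f"PCIe {gen}.0 x16: {link_bandwidth_gb_s(gen):.2f} GB/s per direction")
# PCIe 3.0 x16: 15.75 GB/s per direction
# PCIe 4.0 x16: 31.51 GB/s per direction
```

The link ceiling halves, but that only costs performance when the card is actually saturating the bus, which is consistent with the small loss described above.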

Yeah, I'm getting pretty worried. I'll give it a go. I've just checked the power connector on the 4090 in case the old melting power pins problem was in progress (I've been trying my best not to disturb it since it was fitted, in case I provoked the problem), but it's fully intact, no sign of any issue there at all.
 
Glad to hear that, are you using a support bracket to prop up the 4090?
Because the bottom of the case is so far away, it's held up with a fine piece of black nylon attached to the overhead AIO. Not taking any chances. :cry: OK, going to give it a try with a few loops on 3DMark. I suspect Intel's new microcode is going to have it running a lot slower. (And I thought I had left the performance losses of hardware vulnerabilities behind :( )
 
Some unusual behaviour so far: the cores boosting to the highest clock speeds appear to be cores 4 and 5, hitting 5.8GHz, with the rest capping at 5.5GHz and the E-cores at 4.3GHz as usual. Temperatures are significantly down since the BIOS update, at 70 max; previously it was about 82-85.
 
So after 20 loops of the 3DMark Steel Nomad stress test and a good chunk of time on Prime95, it's not crashed since changing to PCIe gen 3, and there are no more WHEA errors, though there are some other ones I've not seen before.

  1. Unable to open the job object \BaseNamedObjects\WmiProviderSubSystemHostJob for query access. The calling process may not have permission to open this job. The first four bytes (DWORD) of the Data section contains the status code.
  2. Metadata staging failed, result=0x80070490 for container
 
It obviously isn't ideal to lose any performance, but it is not a big deal for gaming, even with the slowest profile Intel offer.

The top-end performance in benchmarks or long-run workloads can be impacted a lot more because they're more likely to exceed the power limits, or use the max single-core boost.
I think the frustration for me is mainly that I chose Intel over AMD at the time off the back of the increased memory bandwidth, with 7200 initially being touted as a perfectly stable speed during the pre-launch review cycle. Now I'm stuck at 4800, rolling back to PCIe gen 3 and running the biggest AIO I could get my hands on, and it's potentially still **** its pants. :mad:
 