Computer just shuts down unexpectedly during gaming

You could try laying the PC flat, in case the problem is GPU sag affecting the seating in the PCI-E slot, though I think the FE 3090 has a vapor chamber, and those may not be designed to operate in a different orientation.

Did you set the graphics PCI-E gen only, or the M.2 PCI-E gen too?
Tried it outside the case on the motherboard box, but no change. I knocked both back to 3.0, which seemed to work for a while, but it came back. No actual crashes though, just the WHEA errors. I've swapped out my 4090 FE for my old 3090 FE to see if it made any difference, and so far it hasn't reported any more errors other than the metadata stuff. The same thing happened yesterday though, and then the errors came back, so I'm not convinced at the moment. Temperatures and voltages are still within spec, though the 3090's memory runs a good bit hotter, so I've got its fans on full bore to keep it cool (I should probably replace the thermal pads on it).

No sign of any issues on the 4090 as far as sag goes; it's still straight along the edge, with no signs of distortion or damage to the pins etc.
 
Crashed whilst typing and not gaming… CPU?

I would just downclock it and see if it's stable.
Yep. Just a web browser open and hwmonitor logging in the background, and because it doesn't get time to write a dump file I've got nothing concrete to work with.
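With no dump file, the background sensor log is the only record of the moments before a crash. As a rough sketch only: assuming the log can be exported as CSV with a `Time` column (the column name, the timestamp format and the 10-second gap are all hypothetical, not hwmonitor's actual layout), something like this would pull out the last reading written before each break in logging:

```python
import csv
import io
from datetime import datetime, timedelta


def rows_before_gaps(csv_text, max_gap=timedelta(seconds=10),
                     time_format="%Y-%m-%d %H:%M:%S"):
    """Return the last logged row before each gap longer than max_gap.

    Assumes a hypothetical CSV export with a 'Time' column; adjust the
    column name and time_format to match whatever the logger produces.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    last_before_gap = []
    # Compare each row's timestamp with the next row's timestamp.
    for prev, cur in zip(rows, rows[1:]):
        t_prev = datetime.strptime(prev["Time"], time_format)
        t_cur = datetime.strptime(cur["Time"], time_format)
        if t_cur - t_prev > max_gap:
            last_before_gap.append(prev)
    return last_before_gap


if __name__ == "__main__":
    sample = (
        "Time,VID\n"
        "2024-01-01 12:00:00,1.350\n"
        "2024-01-01 12:00:02,1.717\n"
        "2024-01-01 12:05:00,1.340\n"
    )
    for row in rows_before_gaps(sample):
        print(row)
```

A gap only shows that logging stopped, not why, so the rows it returns are a starting point for comparison rather than proof of a cause.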

I'd clipped its max boost back to 5.5 already. No change. Though I've not had an actual crash for a few days, the WHEA errors were persisting.
Today I've had it back on the regular performance setting while running the 3090 to try and provoke the issue again, but no dice :(
 
I've tried a good few things now. Temperature is OK, but even with the new microcode-adjusted BIOS and Intel's performance setting, with max boost clocks of 5.3, the CPU is being reported as hitting 1.717 V VID on a stress test. Surely that's not OK? In the BIOS, Vcore is set at 1.350 V.
 
OK, so the retailer has had my processor for RMA for the last month. They couldn't fault it, but are being told to use the latest microcode updates, which seem to be masking the issue. I also sent them my motherboard, as I did not accept that there was no issue, and some definitely faulty RAM sticks. All are testing OK with the newest BIOS and microcode fixes.

Any idea what to do? Suggestions? They've had it for more than a month and I don't know what to do, as there doesn't seem to be a definitive way to test the processor to replicate the fault. AFAIK all they are doing is running AIDA64 stress/burn tests.
Any tests I can ask them to try to replicate the fault? It seems like the microcode updates have effectively applied a band-aid to my processor, but I've no confidence in it at all.

Intel did get back to me after it had been sent to the retailer and offered to warranty the processor, so I guess that's my next port of call.

Any help would be greatly appreciated!
 
For intermittent/hard-to-diagnose issues, I would always send the CPU back to the manufacturer where possible, because their testing is much more extensive, and I'm not sure they even possess the capability to return the same CPU to you once the RMA has been accepted and the return is being processed.
 
Thanks Tetras, I think I'll do that then. Given VID was reporting 1.717 - 1.719 V at idle, I really don't fancy keeping hold of it.

Not sure what to make of the RAM being tested without fault. Memtest was pretty definitive when testing those sticks, though all the errors were on even-numbered cores.

[Attached images: 20221219-183453.jpg, 20221219-194617.jpg, 20221219-202205.jpg, 20221219-204358.jpg, 20221219-204428.jpg, 20221219-212458.jpg, 20221219-221331.jpg]
 
Above 1.55 V is the danger zone, so if you're getting over 1.7 V, replace it under Intel's warranty.
Your chip is degrading super quick.
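To put numbers on how often the chip goes past that line, a logged sensor export can be scanned for readings above it. This is only a sketch: the `Time` and `VID` column names are hypothetical, and the 1.55 V limit is the figure quoted in this post, not an official Intel specification.

```python
import csv
import io

# Threshold taken from the post above; an assumption, not an Intel spec value.
VID_LIMIT_V = 1.55


def readings_over_limit(csv_text, column="VID", limit=VID_LIMIT_V):
    """Return (time, volts) pairs where the logged value exceeds limit.

    Assumes a hypothetical CSV export with 'Time' and 'VID' columns.
    Rows with a missing or non-numeric value are skipped.
    """
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            value = float(row[column])
        except (KeyError, ValueError, TypeError):
            continue
        if value > limit:
            flagged.append((row.get("Time", ""), value))
    return flagged


if __name__ == "__main__":
    sample = (
        "Time,VID\n"
        "10:00:00,1.350\n"
        "10:00:01,1.717\n"
        "10:00:02,n/a\n"
        "10:00:03,1.719\n"
    )
    for when, volts in readings_over_limit(sample):
        print(f"{when}: {volts:.3f} V")
```

Bear in mind that VID is the voltage the CPU requests, so a list like this is supporting evidence for a warranty claim rather than a measurement of what the cores actually received.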

RAM (speed-wise) is dependent on your CPU, so if the IMC (integrated memory controller) isn't working, a memory test will throw up errors no matter what.
 
What was your testing process with the memory?

Usually you would test:
- 1 individual stick at a time, to determine if only 1 stick is the problem.
- With XMP disabled and XMP enabled, to check if the RAM is stable at stock settings and at the rated speed.
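The two points above amount to a small test matrix: every stick, tested once at stock and once with XMP. A minimal sketch of that plan and a rough way to read the results follows; the stick labels, function names and error counts are made up for illustration, and the summary wording is a rule of thumb rather than a diagnosis.

```python
from itertools import product

PROFILES = ("XMP disabled", "XMP enabled")


def memory_test_plan(sticks):
    """List every (stick, profile) run: one stick at a time, both profiles."""
    return list(product(sticks, PROFILES))


def summarize(results):
    """Summarize per-stick results.

    results maps (stick, profile) -> error count reported by the memory
    tester. Missing runs are treated as zero errors.
    """
    summary = {}
    for stick in sorted({stick for stick, _ in results}):
        stock_errors = results.get((stick, "XMP disabled"), 0)
        rated_errors = results.get((stick, "XMP enabled"), 0)
        if stock_errors > 0:
            summary[stick] = "errors at stock settings"
        elif rated_errors > 0:
            summary[stick] = "errors only at rated speed"
        else:
            summary[stick] = "no errors"
    return summary


if __name__ == "__main__":
    for run in memory_test_plan(["stick A", "stick B"]):
        print(run)
    example = {
        ("stick A", "XMP disabled"): 0,
        ("stick A", "XMP enabled"): 12,
        ("stick B", "XMP disabled"): 3,
        ("stick B", "XMP enabled"): 5,
    }
    print(summarize(example))
```

If every stick shows errors only at the rated speed, that pattern points away from one bad stick and towards the platform, which is where the CPU's memory controller comes into the picture.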

As said above, 7200 is a fairly high speed so there's a possibility your CPU IMC is the issue, especially if the CPU is degraded due to high voltage or is from the batch of 13th gen CPUs with the manufacturing fault.

Thread is too long to easily keep track of, but I think the number of issues you've had (WHEA errors, PCI-E and memory instability, high stock voltages and idle crashes) is suggestive of a faulty/failing CPU.
 