New 5900x crashing

Associate
Joined
28 Mar 2021
Posts
30
Hi. I have upgraded my old 5600X to a brand new 5900X to fix the bottleneck I had. The 5600X never gave me any problems in its almost 2 years of service: no crashes, nothing. But the new one....

The current specs are:
CPU: Ryzen 9 5900X
Mobo: ROG Strix B550-F Gaming
GPU: Gigabyte RTX 3080
RAM: 2x 8GB Corsair Vengeance Pro RGB 3200 CL16 (dual channel, slots 2 and 4)
Storage: 2TB NVMe Western Digital Black
PSU: Corsair RM750x
CPU cooling: Corsair H100i Pro
OS: Windows 11 Home

Now the history of events:

1. I replaced the old CPU with the new one and reset the BIOS settings to defaults. I left everything untouched and booted into Windows.

From here I got a random crash while doing non-demanding tasks.

In Event Viewer I found 2 errors logged for this crash:

WHEA-Logger:
Description:
A fatal hardware error has occurred.
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Bus/Interconnect Error
Processor APIC ID: 0

And WHEA-Logger:
Description:
A fatal hardware error has occurred.
Component: Memory
Error Source: Machine Check Exception
The details view of this entry contains further information.


Bear in mind that at this stage my OC settings were at defaults and my RAM was at 2133MHz.

The same crash with the same errors happened 40 minutes later when exiting a Warzone 2 match.

2. I decided to check for a BIOS update; my current version was the one released in Oct 2022. I found a newer one and installed it using a USB flash drive.
Enabled D.O.C.P, which brought my memory to 3200MHz, and set the FCLK frequency to 1600MHz. Checked for driver updates on the ASUS website, and even used a tool which scans for outdated drivers. Checked for Windows updates. Everything is up to date.

Opened Warzone 2 to test again. Within 10 minutes of play it crashed again with the same errors.

3. Decided to fire up a stress test and check the temps. Fired up Prime95 and chose the full blend test. Within 3 minutes worker 6 failed the test. Stopped the test and tried once more: same result, under 3 mins, same worker.

4. Went back to the BIOS and reset everything to stock settings. Fired up Prime95 once more and got the same result: within 3 minutes worker 3 failed the test. Stopped the test and ran it again on only the worker that failed. I let it run for 15 mins with no errors.

The temps were reaching 90C on some cores at some points according to HWMonitor.

5. Decided to test some more. Opened up iCUE and set the pump and the fans from quiet mode to extreme, then fired up Prime95 once more on all cores. Same result: in under 3 mins one of the workers failed the test. This time the temps never went above 85.7C (well, the test didn't last more than 10 mins).


6. Going back to the crashes. Since both errors appear at almost every crash, and one says CPU and the other memory, I decided to go back into the PC case and do some tinkering. I loosened the AIO pump, which sits on top of the CPU, a bit (maybe it was too tight) and moved the RAM sticks from slots 2-4 to 1-3.

Went back to Windows and this time I hit another problem. Past the login screen, Windows would crash with the error SYSTEM SERVICE EXCEPTION.
Went to safe mode and checked Event Viewer, and after some Google searching found out that some app was trying to access forbidden data from the system.
Tried several methods to fix the startup, such as a restore point, sfc /scannow and DISM.exe. Nothing helped.

7. Used the "Reset this PC" option with a USB stick holding a fresh install kit. Booted into Windows, updated it and installed all drivers.
Now that I had a clean Windows, I went back to the BIOS and once more enabled D.O.C.P (RAM OC to 3200MHz) and set the FCLK to 1600MHz, but this time I also manually set the SoC VDD voltage from 1.0000 to 1.1000 (I found this advice in a Reddit post). Moved the RAM sticks back to slots 2-4.

Went into Windows and tested once more with Prime95. Same result: a worker fails within 3 mins. But no crashes. Did a 3 hour session of Warzone last evening with no crashes.
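
For anyone wondering where the 1600MHz FCLK figure in steps 2 and 7 comes from, here is a minimal sketch of the arithmetic, assuming the goal is to run the fabric clock 1:1 with the memory clock. The function name is made up for illustration.

```python
def matched_fclk_mhz(ddr_rating_mts: int) -> float:
    """Return the FCLK (in MHz) that runs 1:1 with a DDR memory kit.

    DDR memory transfers data twice per clock cycle, so the real
    memory clock is half the advertised rating (DDR4-3200 -> 1600 MHz).
    """
    return ddr_rating_mts / 2


# The two memory speeds mentioned in this post: auto (2133) and D.O.C.P (3200).
for rating in (2133, 3200):
    print(f"DDR4-{rating}: memory clock and 1:1 FCLK = {matched_fclk_mhz(rating):.1f} MHz")
```

So a "3200MHz" kit is really clocked at 1600MHz, which is why the matching FCLK value is 1600 and not 3200.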

Now the questions I need help with.

Are there any other tools to test the cpu?

Is it a faulty CPU? Should I RMA it?
I am planning to let it run for at least 2 more days, and if it crashes again I will RMA it. But I need to make sure it is faulty.
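
While watching it over those days, it might help to keep a tally of which component each WHEA entry blames. Below is a rough Python sketch that does this for text copied out of Event Viewer by hand. The helper names are hypothetical, and the sample text is just the two entries quoted in step 1; it does not read the Windows event log itself.

```python
from collections import Counter


def parse_whea_entry(text: str) -> dict[str, str]:
    """Turn the 'Key: value' lines of a pasted WHEA-Logger entry into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines with no colon or with nothing after it.
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def component_of(entry: dict[str, str]) -> str:
    # The two entries in this thread label the component differently.
    return entry.get("Reported by component") or entry.get("Component", "unknown")


# Text copied from the two entries quoted in step 1 of this post.
ENTRIES = [
    """A fatal hardware error has occurred.
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Bus/Interconnect Error
Processor APIC ID: 0""",
    """A fatal hardware error has occurred.
Component: Memory
Error Source: Machine Check Exception""",
]

parsed = [parse_whea_entry(e) for e in ENTRIES]
tally = Counter(component_of(p) for p in parsed)
print(tally)
```

If new crashes keep adding to both the "Processor Core" and "Memory" counts, that at least shows the pattern has not changed since the SoC voltage tweak.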
 
Associate
Joined
25 Sep 2015
Posts
153
It is indeed pointing to a faulty CPU, especially if it is doing it with RAM at below rated speeds. Are we 100% sure the CPU pump is running? Have you tried CO and setting PBO limits on the chip, just to try and cool it a little more? (At worst, though, heat should just make it throttle, not crash.)
 
Associate
OP
Joined
28 Mar 2021
Posts
30
The pump is definitely running. I can hear it if I get close to the CPU socket. Since it is a Corsair one, it is quite noisy if you put it on extreme.

I made a naming mistake in the original post. The voltage I modified is called SoC VDD voltage, which I increased from auto (1.0000) to manual 1.1000.

My current temps and voltages under low load are here: link
 
Associate
OP
Joined
28 Mar 2021
Posts
30
If it crashes at stock doing mundane task I'd RMA it.
This is the problem. I don't know whether the issue was that the CPU is faulty, or that the block was overtightened, or even that Windows was causing it, because I reinstalled Windows and loosened the block at the same time. Since then it is still failing on one worker in Prime95, but no crashes so far.
I need to perform some more tests on it, which is why I am asking what you would suggest.
 
Pet Northerner
Don
Joined
29 Jul 2006
Posts
8,072
Location
Newcastle, UK
It's highly unlikely to be Windows.

Try loosening the CPU block a little and see if it helps or not. If it doesn't, just RMA.

On a fully working CPU you shouldn't have to set manual curve optimisation on single cores just for stability out of the box.
 
Associate
Joined
25 Sep 2015
Posts
153
You could try a little bit more voltage on the DRAM to see if it stabilises? But as you have said above, it's still failing in Prime at stock (although that's an extreme test). What AIO have you got on it, 240 or 360mm? I'd be tempted to use Curve Optimiser with a blanket -10 on all cores to see if that drops temps and whether you still get crashing (just to rule that out; at stock it shouldn't be doing it). Although if it was me, and it was a new chip, I'd 100% be RMAing it for not being able to run at stock (providing the cooler is sufficient and working OK).
 
Associate
OP
Joined
28 Mar 2021
Posts
30
That's what I did (I loosened the block), but then Windows would no longer boot, so I had to reset (reinstall) it. Since then, no crashes.

I will now run an AIDA64 Extreme test to see how it behaves.
 
Associate
Joined
25 Sep 2015
Posts
153
Not sure why you wouldn't just RMA a new CPU if it's still failing at stock levels (again, providing the cooler is beefy enough to tame it).
 
Associate
OP
Joined
28 Mar 2021
Posts
30
Did a 15-minute AIDA64 test: no sign of issues.

Went back to Prime95 and did a 13-minute smallest FFT test: no errors.

Result print

Did the small FFT test for 15 mins: no errors.

Did a large FFT test and one worker failed within 15 seconds. Did the test again and after 13 minutes another worker failed.

Result print

Going to run MemTest86 from the BIOS to see if there are any errors.


I'd get the latest BIOS for the B550 board first, then RMA if that doesn't work.

I updated to the latest version yesterday.
 
Associate
Joined
31 Dec 2011
Posts
816
I have the same cooler as you with a 5900X: never above 80 degrees, although I never run Prime. Steady as a rock though. Sounds as though you need to RMA.
 
Soldato
Joined
28 Oct 2009
Posts
5,294
Location
Earth
Mine never used to go above 63C in a CB23 multi run with a Z73 360 AIO. With PBO per core set and motherboard limits disabled, it used to pull around 133 watts CPU package power in a CB23 run. I wonder what the CPU is pulling for that 80C temp? Try a CB23 run.
My goal was to beat stock performance for no increase in temps or power.

Probably have it set with no limits. Yeah, you can gain higher scores, but I didn't see the gain as worth it for the power and temp increase.

But saying all that, if you have everything at stock you shouldn't be having those issues; you should only expect them once you start playing around.
 
Associate
OP
Joined
28 Mar 2021
Posts
30
Finished 2 out of 4 passes of MemTest86 v10 (roughly 30 mins each), then I stopped the test. No errors.

Did an OCCT test with my RAM at 3200MHz. In under 1 min core 6/thread 12 started failing every second.

Rebooted and ran the test again, this time with the RAM on auto (2133MHz). It has been 15 minutes since and no errors.

current status print

I'm starting to believe that the SoC voltage fixed the actual system halts. No crashes since, and I have been stressing this CPU all day.

I am still not convinced, though, given the failures in Prime95 and OCCT.
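
Since several settings have changed along the way, one way to keep the results straight is to log each stress run against the settings that were active at the time. This is only a sketch: the `Run` record and `group_by_settings` helper are invented for illustration, and the entries are just the runs reported in this thread so far (auto SoC voltage is taken as 1.0V, as the BIOS displayed it).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    tool: str
    ram_mhz: int   # 2133 = auto, 3200 = D.O.C.P
    soc_v: float   # 1.0 = auto, 1.1 = manual
    passed: bool


# Stress runs reported in this thread so far.
RUNS = [
    Run("Prime95 blend", 3200, 1.0, False),  # step 3 of the first post
    Run("Prime95 blend", 2133, 1.0, False),  # step 4
    Run("Prime95", 3200, 1.1, False),        # step 7
    Run("OCCT", 3200, 1.1, False),           # this post
    Run("OCCT", 2133, 1.1, True),            # this post, 15 mins so far
]


def group_by_settings(runs):
    """Group run outcomes by the (RAM speed, SoC voltage) pair in force."""
    groups = defaultdict(list)
    for run in runs:
        outcome = "pass" if run.passed else "FAIL"
        groups[(run.ram_mhz, run.soc_v)].append(f"{run.tool}: {outcome}")
    return dict(groups)


summary = group_by_settings(RUNS)
for (ram, soc), results in sorted(summary.items()):
    print(f"RAM {ram} MHz, SoC {soc:.1f} V -> {', '.join(results)}")
```

Laid out like this, the only combination without a failure is RAM at 2133MHz with SoC at 1.1V, and that rests on a single 15-minute OCCT run, so it would need longer runs with only one setting changed at a time before drawing any conclusion.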
 
Soldato
Joined
28 Oct 2009
Posts
5,294
Location
Earth
I used to get 63C in a CB23 multi run, with the chip using around 133W, on a Z73 360 AIO. Have you tried a CB23 multi run?
 
Associate
OP
Joined
28 Mar 2021
Posts
30
Right now I am letting it run with stock values. Everything is on default except the SoC voltage at 1.1V.
It has been 32 mins with no errors.

I have a Corsair H100i Platinum, which has a 240mm rad.

Which settings would you suggest changing in the BIOS?
 