Is it possible my new 5950x system hardware has already degraded?

I've been running fully stable at 3600 CL16 / 1800 FCLK for the past week and a half on my 5950X build, which is about 1.5 weeks old. Not a single WHEA error in the event log, and no crashes.

Last night I left the system idle overnight, and when I woke up I noticed WHEA errors in the event log. I thought this was related to the idle state of the CPU, since in the past 1.5 weeks of use the computer has really not been idle at all.

The problem is that the WHEA errors seem to be "creeping up". Now I am suddenly getting WHEA errors DURING LOAD, something that never happened in the past 1.5 weeks, and I have not changed any settings or drivers. Right now I am getting a literal flood of WHEA errors, and it seems to be increasing by the hour. It's like my hardware isn't the same. WTF happened? Temps were never bad and I never touched any voltages.
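
One way to check whether the errors really are increasing by the hour is to count them per hour. This is only a rough sketch: it assumes the WHEA event timestamps have already been exported from the event log as ISO-style strings, and the function name and sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def whea_events_per_hour(timestamps):
    """Count events per clock hour from ISO-8601 timestamp strings."""
    counts = Counter()
    for ts in timestamps:
        # Truncate each timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        counts[hour] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample timestamps, NOT taken from a real log.
sample = [
    "2020-12-10T07:05:12",
    "2020-12-10T07:48:30",
    "2020-12-10T08:01:02",
    "2020-12-10T08:15:44",
    "2020-12-10T08:59:59",
]

for hour, n in whea_events_per_hour(sample).items():
    print(hour.strftime("%Y-%m-%d %H:00"), n)
```

If the hourly counts keep climbing with identical settings and load, that supports the "creeping up" impression. If they follow load or idle periods, it points somewhere else.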

Any ideas?
 
Yeah, to my surprise, I decided to try everything at stock, so I am running the system at 2666 MHz / 1333 FCLK, and guess what? My event log is being spammed with WHEA errors at stock settings too. This means something broke suddenly, while idle overnight, and now my hardware is most likely screwed. I can't even run stock without errors. Wow.

On top of that, my CPU never passed 70 °C, and I never touched any voltages. I only set FCLK to 1800 and memory to XMP 3600, that's it, no PBO, nothing. I also haven't changed any drivers between before and after the errors started occurring. Is my hardware now suddenly faulty, or could it be something else? It just seems strange that everything was running absolutely stable for over a week, and now I can't even run stock settings without WHEA errors, and I didn't change anything. Is it more likely that the hardware has broken?
 
Can we have a spec list for your system and which components are new?

Do you have any hardware you can swap out to test with?

In the past I've had RAM work flawlessly for a week then a stick will just start throwing out errors.
 
5950X
Aorus X570 Extreme Rev 1.1, BIOS F30 (tested BIOS F31, no change)
Ballistix MAX 4x16GB 4000 CL19
3090 FE
Seasonic 850 Platinum
1TB Samsung 980 Pro Gen4 NVMe

But I solved the issue, in the sense that I locked the clock speed at 3000 MHz, removing the boosting algorithm completely. Now my new computer is slower than my old one, but it works without any WHEA errors. This means it must be the CPU, right? The chance of the motherboard breaking when it's not running anywhere near its limits seems small. If it were a RAM issue, I would still have the WHEA errors now with the CPU at 3000 MHz, because the RAM is still running at 2666 MHz.

Or am i mistaken here?
 
Could be the Gigabyte BIOS, as mine has been a bit flaky today.
 
But it was stable for over a week, and now I can't seem to get any stability whatsoever? I did try to re-flash the BIOS, but nothing seems to help.
To me the CPU has changed. I've initiated an RMA, and it looks like there won't be a new CPU for me until early January, so it will be a while.

I am more than certain that more people will experience what I did, where the CPU works fine in the beginning and your overclocks are rocking, then suddenly all hell breaks loose. That 1.5 V core boost at idle is probably going to screw up a whole lot of systems until AMD patches it and lowers the boosts, but then you've essentially been scammed: you purchased something that was advertised to do something it couldn't do in the long term.

Hopefully I am wrong.
 
So the only overclocking you did was change RAM speed and FCLK?

Yeah, my goal was a new, fast, stable system. I've read that just setting your RAM/FCLK to something like 3600/1800 1:1 will yield a massive performance improvement without any risks, so this is the only thing I did, plus tightening the timings, because my RAM is specced at 4000 MHz.
 
It might be worth testing the memory. I'd put more money on that being an issue than cpu degradation.

Also about the 1.5v thing, it's totally normal. Voltages like that are only dangerous if the cpu is actually doing work. The voltage algorithm on these cpus is designed like this. This in no way would cause degradation.
 
I will try setting my memory back to 3600 CL16 / 1800 FCLK with core boost disabled. I read another thread, and it appears the WHEA errors are directly related to the core boost algorithm. I am working right now so I can't reboot, but once I am done with today's grind I will try it out and see if I have any errors. I personally don't think it's the memory, but I do agree with you that boosting to 1.5 V is by design, so it seems very odd that it would cause degradation in so little time. I am surprised at how this issue crept up on me, from nothing for a week straight to not even being able to run at stock.

I found this thread (https://www.overclock.net/threads/replaced-3950x-with-5950x-whea-and-reboots.1774627/) where the user has the same issue as me. The only difference seems to be that for him it started from day one rather than after day 8, which is the part that baffles me the most. It just came out of nowhere, which is why I had the theory that my hardware may have degraded in one way or another.
 
I’m now playing with a good old-fashioned all-core overclock until a more stable BIOS starts filtering through.
 
OK, I've done some tests. Here are the results:

Core Performance Boost -> Disabled, CPU Multiplier x34 (auto), FCLK 1800, MEM 3600 = WHEA errors
Core Performance Boost -> Disabled, CPU Multiplier x34 (auto), FCLK 1333, MEM 3600 = WHEA errors
Core Performance Boost -> Disabled, CPU Multiplier x34 (auto), FCLK 1333, MEM 2666 = WHEA errors
Core Performance Boost -> Disabled, CPU Multiplier x34 (auto), FCLK 1333, MEM 1333 = WHEA errors
Core Performance Boost -> Auto, CPU Multiplier x32, FCLK 1800, MEM 3600 = No errors!
Core Performance Boost -> Auto, CPU Multiplier x36, FCLK 1800, MEM 3600 = No errors!
Core Performance Boost -> Auto, CPU Multiplier x38, FCLK 1800, MEM 3600 = No errors!

So what is wrong here? Clearly, disabling "Core Performance Boost" does not have the same effect as setting a static multiplier for the CPU, yet the results in HWMonitor are the same. If you set a multiplier, the CPU no longer boosts, and if you disable Core Performance Boost, the CPU no longer boosts, so why does one generate WHEA errors and the other not? Clearly some **** code in those Gigabyte BIOSes.
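
To see which setting actually lines up with the errors, the seven runs can be put in a small table and grouped per setting. This is just a sketch: the values are transcribed from the list above, and the helper names are made up for illustration.

```python
# Results transcribed from the test list in this post.
# Fields: (core_performance_boost, cpu_multiplier, fclk, mem, whea_errors)
results = [
    ("Disabled", "x34 (auto)", 1800, 3600, True),
    ("Disabled", "x34 (auto)", 1333, 3600, True),
    ("Disabled", "x34 (auto)", 1333, 2666, True),
    ("Disabled", "x34 (auto)", 1333, 1333, True),
    ("Auto", "x32", 1800, 3600, False),
    ("Auto", "x36", 1800, 3600, False),
    ("Auto", "x38", 1800, 3600, False),
]


def outcomes_by(field_index):
    """Map each value of one setting to the set of outcomes seen with it."""
    groups = {}
    for row in results:
        groups.setdefault(row[field_index], set()).add(row[-1])
    return groups


def separates(field_index):
    """True if every value of the setting always gave the same outcome."""
    return all(len(seen) == 1 for seen in outcomes_by(field_index).values())


settings = ["Core Performance Boost", "CPU multiplier", "FCLK", "memory speed"]
for index, name in enumerate(settings):
    print(name, "separates error/no-error runs:", separates(index))
```

In these runs, FCLK and memory speed do not separate the failing runs from the clean ones, since 1800/3600 appears on both sides. Core Performance Boost and the multiplier setting always changed together, so this data alone cannot say which of the two matters. A run with boost disabled and a manual multiplier (or boost on auto with the multiplier on auto) would be needed to split them.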

I'm starting to think that when you update a BIOS, there is actual microcode that gets updated in the CPU, and that even if you flash back to an older BIOS, that microcode remains. This is my new theory for why my problems started yesterday. I did update to BIOS F31, but I flashed back to F30 after I found that F31 didn't give any better results, and I knew I was stable on F30, so why bother with the newer one? Anyway, not long after that I went to bed, and in the morning I had the WHEA errors. So this may be the cause: the F31 BIOS updated some microcode in the processor, and the CPU won't accept the older microcode in the older BIOS because it is older.

I am only guessing, I am far from an expert in this field.

Anyway, everything is stable with a static multiplier overclock and the memory runs just fine, so it's something FUBAR with the BIOS/CPU. Thanks, Gigabyte, I guess?
 
What’s the setting on the CPU PLL and NB PLL?

Have you checked the voltages using ZenTimings and Ryzen Master?

Best to dial in the DRAM, VSOC, IOD and CCD voltages manually.
 
I have not. I expected the motherboard to do this automatically? Can you give me some "good values"? I don't know what they should be, or whether mine are off. Cheers!
 
Best to set your own volts.

Set VDRAM to 1.4 V
VSOC to 1.05 V
V IOD and V CCD to 1.0 V
VDDP to 0.95 V.

Then leave the CPU on auto and PBO on Motherboard under AMD Overclocking.

Have you memtested, 2 sticks at a time? Also, I guess these are not 64GB kits. You have 2 lots of 32GB?
 
That's correct, the memory was bought in 2x16 packs.

I did a massive amount of memtest and AIDA64 runs in the first 4 days, and everything was stable. This issue came out of nowhere. I tried those settings just now: no change.

The only thing that fixes the issue is setting a static multiplier for the CPU. Even disabling core boost doesn't help, but setting a static multiplier, which pretty much does the same thing as disabling core boost, does help. I think it's all Gigabyte trash code at this point. I am, however, still wondering why it occurred seemingly out of nowhere after the system had been stable for over a week.

EDIT: You may be on to something anyway. I realize now that even with the multiplier set to static, if I hammer my memory using the AIDA64 cache benchmark I still get those WHEA errors (though rarely).

I will run another set of memtests over the day to see whether my memory kit has suddenly gone bad. Is it common for memory to work fine and then go bad out of nowhere? Temps never went above 38 °C on the memory according to HWMonitor (the sticks have built-in sensors).
 
If the volts are not set properly there will be errors. Two sets of dual-rank RAM will push the memory controller on the CPU harder, so you are likely to need more volts to get them working at 3600 MHz. I am not sure whether this RAM will go past 3600 MHz.

When a motherboard vendor says a kit is approved for a speed, they have only tested it at the kit size. They won’t be doing extensive testing with every DIMM slot populated.

In your case I would manually set the RAM frequency to 3200 MHz and FCLK to 1600, with those volts and all the RAM timings on auto. Just make sure Gear Down Mode is enabled, BGS is disabled, and Alt BGS is enabled. Then run memtest86 and the AIDA64 system stability test. If that passes, go to 3400 with FCLK 1700 and do the same, then 3600 with FCLK 1800 and do the same. If 3600 also passes with all timings on auto, then start tweaking timings. Start with the XMP timings.
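
The step-up plan above keeps FCLK at half the RAM speed (the 1:1 setting) at every step. A small sketch of that arithmetic, with a made-up function name and the 200 MHz step size taken from the 3200 / 3400 / 3600 sequence:

```python
def step_plan(start=3200, stop=3600, step=200):
    """Return (ram_speed, fclk) pairs with FCLK at half the RAM speed (1:1)."""
    return [(speed, speed // 2) for speed in range(start, stop + 1, step)]


for speed, fclk in step_plan():
    print(f"RAM {speed} / FCLK {fclk}: run memtest86, then the AIDA64 stability test")
```

Only move to the next pair once both tests pass at the current one. That way, if errors show up, you know which step introduced them.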

All of this sounds like some kind of memory error: either the RAM is not behaving due to lack of volts, or the controller is struggling due to insufficient volts.

Also make sure your PLL voltages are set to the correct levels. What motherboard is it? On ASUS and MSI boards you can set them to level 2/3.

I am not sure about what Gigabyte or ASRock boards do.

I always say this: “manually dial in volts for RAM overclocking”. I just don’t trust motherboards to manage things by themselves. This eliminates any issues, and you are in control of those settings, so you know it won’t go wrong.
 