Constant Re-Boots

Hi All!

I have a Threadripper 1920X system with an MSI X399 mobo, 32GB RAM and two RTX 2080 Ti GPUs. I know it is getting on a bit, but there is still plenty of life in it and it renders pretty quickly using Redshift.

The problem started last week. I would be working on my PC, then it would make a faint blip noise and re-boot. I updated the GPU driver, but it still does it. Then I left it with no apps running and it still did the same thing. So I thought I would test it using MemTest86 from a bootable USB drive. It booted up fine and started the test, and after about 10 mins it rebooted itself. So from this I know the problem is not Windows 10, since it never booted into Windows - right?

So I booted into BIOS and left it to run and, lo and behold, it blipped and re-booted.

I cannot really pinpoint what is causing this. All I can think is that it is one of: the power supply, a Corsair AX1600i which is only 4 years old; the RAM, Corsair Vengeance; the GPUs, which are identical (I have tried running the machine on each one and it still crashes, so I don't think it is either of these); or a flaky CPU or mobo. I don't think the CPU is overheating, since I monitored the temperature in BIOS.

If anyone has any ideas PLEASE let me know. The machine is reasonably useable until it decides to re-boot!
 
@Tetras I think I did a Windows update, but other than that nothing. I don't think it is Windows though: when I boot into BIOS and leave it, it does its little blip and re-boots, and MemTest86 does the same thing, and neither of those tests touches the Windows OS. I have just checked all the temperatures and they seem to be fine. I am wondering if it might be a malfunctioning temperature sensor that makes the system think it is running too hot and then reboot? Would that make sense? Or would a temperature problem cause the machine to shut down rather than reboot?

All the fans are running fine from what I can see, and I spent half a day cleaning everything. I think I might unplug the USB devices, one of the GPUs and some of the memory and see if it behaves then. It is REALLY annoying!
 
Well, I would be really annoyed if it was the PSU, since it is the Rolls Royce of PSUs: a Corsair AX1600i, which is 1600 watts and cost me over £400! And it's only 4 years old…
 
OK, the latest update is that I disconnected all the hard drives, took out the memory and tested each stick individually. I also removed one of the identical GPUs. It still re-booted. So I switched to the other GPU - it still rebooted! Then I looked at my PSU, since the AX1600i has a test button. The light turned green and the fan spun, but I tested it several times and occasionally it went red… It's only four years old, so I went back to the supplier, who said to plug in the iCUE USB cable and switch it from multi-rail to single-rail. I cannot do that at the moment, since both of my internal USB headers are taken: one by the H100i cooler and the other by the front USB ports, which I use for my Wacom tablet and mouse.

Funnily enough, since doing the PSU test my system seems to have stabilised somewhat and only reboots occasionally rather than consistently. I am wondering what I have done? I know that I cannot add any more RAM in the vacant RAM slots, since the USB cable coming out of the H100i cooler is in the way... I am also not convinced that going from multi-rail to single-rail will make a difference, since the PC has worked fine for four years. Using the iCUE software I can also see that my system does not seem to suffer from overheating problems. Can I temporarily unplug the iCUE USB cable from the H100i cooler and plug in my PSU instead, just to check that out?
 
The system seems to be running fine now, and I haven't done anything other than install the iCUE software. That doesn't make sense, since MemTest86, which doesn't boot into Windows, still had the re-boot problem. Anyway, happy bunny at the moment since it has remained stable for the past two days. So I am mystified: some sort of low-level fault that has corrected itself?
 
OK, one thing that I was considering was updating the BIOS on my MSI X399 Gaming Pro Carbon AC mobo. It's weird, since it currently says it's version 1.B0, and yet on the MSI website there is no mention of this version, only a newer version, 7B09v1C. I am very tempted just to install this anyway, but I am not sure if it will fix anything.

My suspicion is that there is a component in my PC that is on its way out, but finding out which one is the problem! RAM, mobo, CPU, H100i cooler, PSU, M.2 sticks? Since the machine crashes even in BIOS, it rules out Windows.
 
Thanks for all the advice, peeps. I have recently run OCCT on my PC and everything checks out fine. Indeed, since I ran the tests my PC has behaved! I remember doing a Windows update just before the problems started and am wondering if the update upset something low-level on my PC?
 
Well, my PSU, which was under warranty, went back to the shop, where it proved faulty, and they sent me a new one. Hurrah, I thought, the problem is fixed! So I installed it with all the connections and guess what? Yup, it went back to spontaneous re-booting... I can only think that this problem is due to the CPU or mobo now. Although I am tempted to unplug the USB cable running from the CPU cooler to the mobo, which is used for the iCUE software. Or update the BIOS.
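The process of elimination in this thread so far can be written down as a small Python sketch. This is only an illustration: the part labels (`ram_a`, `gpu_1`, `psu_old` and so on) are made-up names, the run list is a simplified summary of the tests described above rather than a full log, and it assumes a single faulty part that was present in every run that rebooted.

```python
# Each entry: (set of parts installed for that test, did the machine reboot?)
# Labels are hypothetical; the runs loosely mirror the tests described in the thread.
RUNS = [
    ({"cpu", "mobo", "cooler", "psu_old", "ram_a", "gpu_1"}, True),
    ({"cpu", "mobo", "cooler", "psu_old", "ram_b", "gpu_1"}, True),
    ({"cpu", "mobo", "cooler", "psu_old", "ram_a", "gpu_2"}, True),
    ({"cpu", "mobo", "cooler", "psu_new", "ram_a", "gpu_1"}, True),
]


def common_to_all_failures(runs):
    """Return the parts that were installed in every run that rebooted."""
    failing = [parts for parts, rebooted in runs if rebooted]
    if not failing:
        return set()
    # A single faulty part must appear in every failing configuration.
    return set.intersection(*failing)


suspects = common_to_all_failures(RUNS)
print(sorted(suspects))  # ['cooler', 'cpu', 'mobo']
```

Under those assumptions the only parts never swapped out of a failing run are the CPU, the mobo and the cooler. It says nothing about intermittent faults or two parts failing at once.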
 
OK, the latest is that I updated the BIOS to 1C and it still re-boots every now and then. The interesting thing is that one time when it tried to re-boot it got stuck, and the CPU LED on the mobo was on… Then I pushed the reset button and it behaved. Another thing I have noticed is that the USB 2 ports on the front of the case have stopped working, although the USB 3 ones still work. It didn't do this before??? I then tried booting from my old Windows 10 M.2 drive (I kept it as a back-up) and it still re-boots. I am now wondering if the thermal paste needs changing or the mobo is just on its way out. Very annoying, since it's tricky finding a replacement board that takes two GPUs, both in x16 PCIe slots - I was looking at the ASUS B650 one that seems to fit the bill.

This kind of reminds me of my previous PC, which I built back in 2012. It started having problems where it just wouldn't boot. Eventually I had the RAM replaced and it was fine. I gave it to my sister and after a few years she had problems booting and had to keep pushing F8. So I took it back and re-installed Windows from scratch, and it has behaved fine since then, and that was nearly a year ago. I am wondering whether I should scrap my Windows installation and re-install everything from scratch to see if that removes the problem? Sound like a good idea?
 
OK, I thought I would test the RAM again, but more thoroughly and using slot B2 as the manual says. The first two RAM sticks on their own still gave re-boots, but the third has given me no re-boots after 5 hours, so maybe the RAM is faulty…
 
OK, I am still holding off replacing any hardware on this PC, since I am convinced that the problem is something simple. I am pretty sure that the problem is down to the mobo/CPU, and I read that sometimes the CPU might not be getting enough current and re-boots? The problem does seem to behave like something shorting out. Is this something to do with 'AMD Cool n' Quiet' being enabled and 'Global C-states Control' set to Auto? This is slightly beyond my knowledge, but is it worth a try? Also setting the CPU voltage to 1.35V from the default - I definitely don't know what I am doing here…
 
OK, I know I might be flogging a dead horse here, but I have found that when I am running AIDA64 Extreme on my PC it never re-boots… So I thought I would play Skyrim, on the sneaking suspicion that when my PC is under a certain load it will behave. Well, I played the game for a solid hour and it never re-booted. Could this be to do with voltages? I have noticed that when my system does re-boot, sometimes the onboard EZ Debug LED CPU light stays on, indicating a fail - dodgy or knackered Threadripper 1920X CPU perhaps? AIDA64 gives the CPU Core as 1.440V and the CPU VID as 1.225V. I haven't got a clue what these should be???
 
@Tetras Sorry, I didn't check back on this thread. I will give your suggestion a go, but at the moment the PC sometimes takes like half an hour to boot up - deffo something wrong with either the CPU or mobo. The onboard CPU LED stays on until the machine boots, and the manual suggests that this means 'CPU is not detected or fail'. So I am not sure even voltages would do anything, since this happens before it reaches the BIOS?
 