GPU crashes

Hi guys, I was wondering if you could give me some advice. My wife's PC is struggling to play games: after about an hour or so it crashes to a black screen, dies outright and reboots.

I'm not sure whether the issue is the GPU itself or the power supply, which is a rather frugal 450 W SFX PSU. The specs are (as close as I can get them):

CPU: Intel Core i5-7500 3.4 GHz Quad-Core Processor
CPU Cooler: Noctua NH-L9i 33.84 CFM CPU Cooler
Motherboard: Asus Z170I PRO GAMING Mini ITX LGA1151 Motherboard
Memory: Crucial Ballistix 32 GB (2 x 16 GB) DDR4-3600 CL16 Memory
Storage: Samsung 970 Evo 1 TB M.2-2280 NVME Solid State Drive
Video Card: EVGA GeForce GTX Titan X 12 GB Video Card
Case: Fractal Design Node 202 HTPC Case
Power Supply: Silverstone SFX 450 W 80+ Bronze Certified SFX Power Supply

After it dies, I check thermals and they look acceptable. As I said, my only thinking is that it's either power related or the sheer age of the card... but surely a top-tier (in its day) card should still be chugging along after all this time?

In the system log I do notice an error event pertaining to nvidia. I've investigated this and it appears to be a generic GPU crash event, which isn't helpful.

Either way, I was wondering if you guys had any insight into what it could be?
 
After it dies, I check thermals and they look acceptable. As I said, my only thinking is that it's either power related or the sheer age of the card... but surely a top-tier (in its day) card should still be chugging along after all this time?
I think the PSU is a bit low.
TPU has the Titan X (Maxwell) maxing at 275W and peaking at 243W. That doesn't leave much headroom. (Also, I don't think top tier for a Titan necessarily means better components in terms of VRMs etc.)
But what I came in here to respond to was the "After it dies" part. Surely it makes much more sense to run a logging monitor the whole time, so that when it crashes you don't have to guess?
That way you can see what the temps were, how high the utilisation was and how much power the GPU, CPU, etc. were drawing before it crashed.
Plus if you add a bunch of motherboard sensors you should be able to see any v-drop etc.
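
If you'd rather not install anything, here is a rough sketch of a GPU-only logger built on `nvidia-smi`, which ships with the NVIDIA driver. The file name `gpu_log.csv` and the function names are just placeholders, and I'm assuming `nvidia-smi` is on the PATH. Some cards report a field as not supported, in which case that column will simply contain that text. It forces every sample to disk so the last line before a hard reboot is more likely to survive.

```python
import csv
import os
import subprocess

# Fields requested from nvidia-smi via --query-gpu.
FIELDS = ["timestamp", "temperature.gpu", "utilization.gpu", "power.draw", "clocks.sm"]


def build_command(interval_s=1):
    """Build the nvidia-smi call that prints one CSV line every interval_s seconds."""
    return [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
        "-l", str(interval_s),
    ]


def parse_line(line):
    """Split one CSV line of nvidia-smi output into a dict keyed by field name."""
    values = [v.strip() for v in line.strip().split(",")]
    return dict(zip(FIELDS, values))


def log_forever(path="gpu_log.csv", interval_s=1):
    """Append samples to path until the process (or the PC) stops."""
    need_header = not os.path.exists(path) or os.path.getsize(path) == 0
    proc = subprocess.Popen(build_command(interval_s), stdout=subprocess.PIPE, text=True)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if need_header:
            writer.writeheader()
        for line in proc.stdout:
            writer.writerow(parse_line(line))
            f.flush()
            os.fsync(f.fileno())  # push the sample to disk before the next one


# To start logging, call: log_forever("gpu_log.csv")
```

Start it before launching the game and leave it running. After the reboot, the tail of the file shows what the card was doing in its last few seconds. It only covers the GPU, so motherboard voltages still need a separate tool.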
 
According to PCPartPicker, the entire PC runs at ~360 W.

With regards to GPU/temp/motherboard logging/monitoring, do you have any suggestions?
 
How many thermals are you able to check? That's a hefty graphics card for such a small case; practically anything could be getting cooked, whether on the motherboard, the graphics card or even the RAM.

How long was it running without any problem?

I'd agree with the above that 450 W with a Titan X is rather optimistic. It will force the PSU to run a lot hotter and closer to capacity than I'd like.
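
To put rough numbers on that, here is a back-of-envelope sketch using only figures already quoted in this thread (450 W PSU, the ~360 W whole-system estimate, and the 275 W maximum for the card). The function names are made up for illustration, and the figures are estimates rather than measurements.

```python
PSU_RATED_W = 450        # PSU rating from the spec list
SYSTEM_ESTIMATE_W = 360  # whole-system estimate quoted in the thread
GPU_MAX_W = 275          # maximum GPU draw quoted in the thread


def load_fraction(draw_w, rated_w=PSU_RATED_W):
    """Fraction of the PSU rating used at a given draw."""
    return draw_w / rated_w


def headroom(draw_w, rated_w=PSU_RATED_W):
    """Watts left under the PSU rating at a given draw."""
    return rated_w - draw_w


print(f"Load at estimate: {load_fraction(SYSTEM_ESTIMATE_W):.0%}")        # 80%
print(f"Headroom at estimate: {headroom(SYSTEM_ESTIMATE_W)} W")           # 90 W
print(f"Left for everything else at GPU max: {headroom(GPU_MAX_W)} W")    # 175 W
```

So even on the average figures the PSU sits at about 80% of its rating, and any short spike above the estimate eats into a 90 W margin.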

You could probably rule out the memory overheating by knocking it back to stock (2400 for an i5-7500). Similarly, you could try a big underclock on the graphics card (core and memory), to lower power consumption and heat output.
 
According to PCPartPicker, the entire PC runs at ~360 W.

With regards to GPU/temp/motherboard logging/monitoring, do you have any suggestions?
I have often run borderline PSUs, like a 330 W Seasonic with an HD 7950, and have been fine.
But since all those issues popped up with modern (especially Ampere) GPUs running so peaky and boosting all the time, I am somewhat suspicious of spiking bursts which draw more than the average would imply.
As for logging: either MSI Afterburner or OpenHardwareMonitor.
The latter is 100% portable, which is nice, and MSI Afterburner can also be run portable.
There are so many loggers out there; anything which can log to a file (CSV or plain text) will do.
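
Once you have a CSV log, something like the following sketch pulls out the last few samples before the crash and the peak of a given column. The column names in the usage example (`gpu_temp_c`, `gpu_power_w`) are hypothetical; every logger names its columns differently, so swap in whatever headers your file actually has.

```python
import csv
from collections import deque


def last_samples(path, n=10):
    """Return the last n rows of a CSV log as dicts keyed by the header names."""
    with open(path, newline="") as f:
        return list(deque(csv.DictReader(f), maxlen=n))


def peak(rows, column):
    """Return the highest numeric value in column, skipping blank or non-numeric cells."""
    values = []
    for row in rows:
        try:
            values.append(float(row[column]))
        except (KeyError, TypeError, ValueError):
            continue
    return max(values) if values else None
```

For example, `rows = last_samples("log.csv", n=30)` followed by `peak(rows, "gpu_temp_c")` and `peak(rows, "gpu_power_w")` gives a quick read on whether temperature or power was climbing right before the black screen.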
I'm sure someone will be along with a better suggestion soon.
Also, don't forget that monitoring is a bit like quantum superposition, in that observing is not free: it has a small overhead which may affect the results.
 