Associate
- Joined
- 3 Oct 2020
- Posts
- 29
Preamble: My 3090 FE had been running perfectly up until Easter Sunday. I have done no OC on it and have left everything at stock settings since I bought it on release day. Average temps under heavy gaming load used to be 75°C, maybe 82°C on really hot summer days (UK summer averages around 25-30°C). I played Dragon's Dogma 2 that Sunday with DLSS 'balanced' at 4K and all other settings cranked to max, no issues. Solid 80+fps for those who care.
Actual Issue: On Easter Monday the fans on the GPU started to go nuts as soon as I started up a game. A quick look at the hardware monitor showed that the temps were hitting 85°C during gaming. Right now the UK is in springtime, with ambient temps averaging 10-15°C. This is happening with pretty much all the games I tried (Dragon's Dogma 2, Jedi Survivor, BG3, Fortnite). The PC was cleaned back in December, but I gave it another thorough cleaning with the compressor. I tried turning all the settings down in games and even running them in windowed mode at 1080p resolution. Still the same thing. In fact the temps climbed up to 88°C at one point. All the while the fans, at stock curve settings, are ramping up to 2200+RPM and sound like jet engines. I downloaded MSI Afterburner for the first time and undervolted the card. Still no change.
EDIT: I rolled back the Windows updates and Nvidia driver versions before the next couple of steps. Needless to say, those steps didn't work.
Since it's been over three and a half years and Nvidia's warranty has expired, I repadded and repasted the GPU. There was the thinnest layer of dust on the board, so I cleaned that and the fins and fans out (since I had it all apart anyway, it made sense). There was no change. Temps didn't get worse, but they didn't get any better. Now, the PC case itself didn't feel that much hotter under load, so I borrowed a thermal imaging camera to get an idea of what was happening. I've linked the Google Drive folder that has screencaps of the hardware monitor alongside the thermal images I took.
https://drive.google.com/drive/folders/1ooHTvUY-GTGxWdBb9xXxxgD2F1n2M1xF?usp=sharing
NOTE 1: On the thermal images, the green crosshair reading is shown at the top left. The red crosshair marks the maximum temp reading (wherever it is on the image), shown at the bottom left.
NOTE 2: My case is inverted, just in case the photos throw people off. And yes, I did wonder if the upside-down layout could have damaged the fan bearings, but surely that's a moot point since even the right way up one of the fans is facing the bottom anyway?
Basically, what I'm seeing is a drastic difference under load between what the hardware monitor is reading and what the thermal camera is showing me. For example, the GPU temp reads 86°C but the hottest point on the thermal image is 61°C.
When idling the two readings are fairly similar.
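For anyone who wants to line up sensor readings against thermal images more precisely than screencaps allow, here is a minimal sketch of a timestamped logger. It assumes `nvidia-smi` (installed with the Nvidia driver) is on the PATH; the helper names `parse_line`, `sample`, and `main`, and the 2-second interval, are just illustrative choices and not part of any tool I actually used.

```python
import csv
import subprocess
import sys
import time

# Query fields understood by `nvidia-smi --query-gpu=...`
FIELDS = ["timestamp", "temperature.gpu", "fan.speed", "power.draw", "clocks.sm"]


def parse_line(line):
    """Split one CSV line of nvidia-smi output into a dict keyed by field name."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(FIELDS, values))


def sample():
    """Take one reading from the first GPU (assumes nvidia-smi is on the PATH)."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_line(out.splitlines()[0])


def main(interval_s=2.0):
    """Print one CSV row per interval until interrupted with Ctrl+C."""
    writer = csv.DictWriter(sys.stdout, fieldnames=FIELDS)
    writer.writeheader()
    while True:
        writer.writerow(sample())
        sys.stdout.flush()
        time.sleep(interval_s)


if __name__ == "__main__":
    main()
```

Redirecting the output to a file while a game is running gives a log whose timestamps can be matched to the time each thermal image was taken.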
I'm hoping someone more knowledgeable can shed some light. Is it a faulty temp sensor? Did the fan bearings get damaged overnight? Did a solder joint fail somewhere on the board?
The suddenness with which this has happened is worrying. If these are the early signs of an inevitable GPU death, then I need to start making plans for what to do if it fails and for an unplanned system upgrade to the 50 series (first world problem, I know).
Thank you to anyone who's read up to here.