Hi guys!
I thought I would write about this to see if you have any suggestions for me, because I am a bit desperate at the moment.
I currently have a Threadripper 3960X in a TRX40 Aorus Xtreme. The PSU is a Corsair AX1600i and the computer has two GPUs: a Gigabyte RTX 3090 Gaming OC and a Gigabyte Eagle 3070 OC. Before the 3000 series I had a 2080 Ti and a 980 Ti that worked flawlessly. Until a few weeks ago, when I installed the Corsair, my PSU was an old Seasonic X-1250.
I use the computer to run simulations (which keep it under heavy load 24/7 for quite a few days at a time), to play games when I finish work (I am working from home at the moment) and to develop my own games in Game Maker. All in all, it is working most of the time. The operating system is Windows 10 Pro and I have 3 screens: one is 4K, connected to the 3090, and the other two are 2K and normal HD, connected to the 3070.
I received both cards at the same time, on the 10th of November. I installed them and they ran fine for around 15 days. Then I started to notice that when the 3090 was under heavy load I could hear sound artifacts (crackling) on the motherboard audio out, and from time to time some USB devices (a hard drive in particular) were disconnecting. It started to smell like a power-related problem (the Seasonic was already 9 years old). Luckily, I had a sound card around, so I installed it, checked that the sound was fine, and kept working with the machine.
Everything was fine for another 15 days, but then the 3090 suddenly started misbehaving every time it was under heavy load: black screen and fans spinning at 100%. This became very frequent, with the card lasting no more than a few minutes at full load. In the Event Viewer you could see that these were proper crashes most of the time.
I started checking what was happening in GPU-Z and found that the 3090 was drawing most of its power through just one of the 8-pin connectors (150W), while the other 8-pin connector and the PCIe slot were not providing much juice at all. So I thought: "Ah! The card is going crazy because it is demanding lots of power and not getting it".
At this point, I rearranged the cables on my old Seasonic so that each 8-pin connector on the cards had its own dedicated output from the PSU, and I also decided to order the Corsair AX1600i. With the new cable distribution, the computer became stable again and I could see in GPU-Z that the right amount of power was being provided all the time.
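For anyone who wants to log the power draw over time instead of watching GPU-Z, here is a minimal sketch of how it could be done with `nvidia-smi`, which is installed with the Nvidia driver. It assumes `nvidia-smi` is on the PATH, and the function names (`parse_power_csv`, `read_power`, `log_power`) are made up for this example. Note that it only shows the total board power per card, not the per-connector split that GPU-Z displays, so it would not have revealed the unbalanced 8-pin connectors by itself.

```python
import csv
import io
import subprocess
import time

# Fields requested from nvidia-smi; with "nounits" the power values are plain watts.
QUERY_FIELDS = ["index", "name", "power.draw", "power.limit"]


def parse_power_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into a list of dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(record) != len(QUERY_FIELDS):
            continue  # skip blank or malformed lines
        index, name, draw, limit = (field.strip() for field in record)
        try:
            rows.append({
                "index": int(index),
                "name": name,
                "draw_w": float(draw),
                "limit_w": float(limit),
            })
        except ValueError:
            continue  # e.g. "[N/A]" when a value is not reported
    return rows


def read_power():
    """Run nvidia-smi once and return the current power readings for every GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_power_csv(result.stdout)


def log_power(interval_s=1.0, samples=10):
    """Print a timestamped power reading for each GPU, `samples` times."""
    for _ in range(samples):
        stamp = time.strftime("%H:%M:%S")
        for gpu in read_power():
            print(f"{stamp} GPU{gpu['index']} {gpu['name']}: "
                  f"{gpu['draw_w']:.1f} W of {gpu['limit_w']:.1f} W")
        time.sleep(interval_s)
```

Calling `log_power(interval_s=1.0, samples=600)` while a simulation is running would give about ten minutes of readings to compare against the moment a crash happens.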
Three weeks later the Corsair arrived. I installed it and everything was suddenly great: no more crackling sound under heavy load, no USB disconnections, great stability... I was very happy because I assumed I had found the problem. Then, two days ago, I installed the new Nvidia drivers (461.09) and every time I tried to launch Game Maker the problem came back: black screen, fans at 100%... This happened 2 seconds after clicking on the icon, with no GPU load at all. I made a post with the details on the Nvidia forums, in case you want to have a look ( https://www.nvidia.com/en-us/geforc...screen-and-fans-spinning-at-100-with-my-3090/ ), but the long story short is that this failure was very consistent and easy to replicate. I could also see in the Event Viewer that, in this case, the computer was still alive after the failure, with DWM.exe trying to recover the desktop time after time, to no avail. Funnily enough, Game Maker seemed to be the only problematic app; the rest, including games, were fine. Given all this, I decided to roll back to my previous drivers (460.89) and this fixed the problem. The computer has been running a simulation for 24 hours under heavy load and it seems to be completely stable.
So... after all this rant... any idea of what is going on? It seems quite clear to me that the old PSU was struggling, but it is also obvious that even if the new one solved many issues, it did not make the computer bulletproof either. I would be tempted to suspect that the GPU is at fault, but the fact that yesterday it crashed under no load in such a reproducible way, together with the fact that rolling back to the previous drivers solved the issue and it has been running a simulation for 24 hours without problems (with 320W of GPU power draw), makes me think that this may well be software related.
In any case, this is starting to get old. I am quite fed up and I don't really know what to do. If GPU availability were normal, I would try to RMA this one, buy another from a different brand and sell the RMAed unit for cheap, just to get rid of it. That is, sadly, not really possible at the moment. Moreover, I am not sure about starting the RMA process, because it may well be that they test the GPU for a few days and everything is fine in their hands, or that it is a software problem. Gigabyte is also reported to have a dismal RMA service, so I am afraid I would be without my card for a long time (I have not experienced it myself though, so I would like to give them a chance).
What do you think? Any help will be greatly appreciated. Sorry for the long read!
P.S.: the 3070 has worked totally fine all this time, by the way.