My many problems with a Gigabyte RTX 3090 Gaming OC (long read).

Hi guys!


I thought I would write about this to see if you have any suggestions for me, because I am a bit desperate at the moment :).


I currently have a Threadripper 3960X in a TRX40 Aorus Xtreme. The PSU is a Corsair AX1600i and the computer has two GPUs: a Gigabyte RTX 3090 Gaming OC and a Gigabyte Eagle 3070 OC. Before the 3000 series I used to have a 2080Ti and a 980Ti that worked flawlessly. Until a few weeks ago, when the Corsair was installed, my PSU was an old Seasonic X-1250.


The computer is used to run simulations (which keep it under heavy load 24/7 for quite a few days at a time) and for gaming when I finish work (I am working from home at the moment), and I also use it to develop my own games in Game Maker. All in all, it is working most of the time. The operating system is Windows 10 Pro and I have 3 screens: one is 4K, connected to the 3090, and the other two are 2K and normal HD, connected to the 3070.


I received the cards at the same time, on the 10th of November. I installed both and they ran fine for around 15 days. Then I started to notice that when the 3090 was under heavy load I could hear sound artifacts on the motherboard audio out (crackling sound), and from time to time some USB devices (a hard drive in particular) were disconnecting. It started to smell like a power-related problem (the Seasonic was already 9 years old). Luckily, I had a sound card around, so I installed it, checked that the sound was fine, and kept working with the machine.


Everything was fine for another 15 days, but suddenly the 3090 started misbehaving every time it was under heavy load: black screen and fans spinning at 100%. This started to become very frequent, with the card lasting no more than a few minutes working at full steam. In the event viewer you could see that these were proper crashes most of the time.


I started checking what was happening in GPU-Z and I found that the 3090 was receiving power almost entirely through one of the 8-pin connectors (150W), while the other one and the PCIe slot were not providing much juice at all. Therefore, I thought: "ah! the card is going crazy because it is demanding lots of power and is not getting it".


At this point, I re-arranged the cables on my old Seasonic so that each 8-pin connector on the cards had its own output from the PSU, and I also decided to order the Corsair AX1600i. With the new cable distribution, the computer became stable again and I could see in GPU-Z that the right amount of power was being provided all the time.
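
For anyone wanting to sanity-check the numbers, here is a minimal sketch of the power budget arithmetic. It assumes the nominal PCIe limits (75W from the slot, 150W per 8-pin connector). The rail names and the example readings are made up for illustration; they are not values taken from GPU-Z.

```python
# Nominal per-rail limits in watts (PCIe slot and 8-pin connectors).
# The rail names are hypothetical labels, not GPU-Z sensor names.
LIMITS_W = {"pcie_slot": 75, "8pin_1": 150, "8pin_2": 150}


def total_budget(limits=LIMITS_W):
    """Total in-spec power available to a card with these rails."""
    return sum(limits.values())


def overloaded_rails(readings_w, limits=LIMITS_W):
    """Return the rails whose reading exceeds the nominal limit."""
    return [rail for rail, watts in readings_w.items() if watts > limits[rail]]


# Made-up readings for a card drawing 320W almost entirely through one connector.
example_readings = {"pcie_slot": 20, "8pin_1": 280, "8pin_2": 20}

print(total_budget())                      # 375
print(total_budget() - 320)                # 55 (headroom at 320W board power)
print(overloaded_rails(example_readings))  # ['8pin_1']
```

The point is that a two-connector card has 375W available on paper, so 320W fits comfortably, but only if the load is actually spread across the rails rather than pulled through a single connector.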


Three weeks later the Corsair arrived. I installed it and everything was suddenly great: no more crackling sound under heavy load, no USB disconnections, great stability... I was very happy because I assumed that I had found the problem. Then, two days ago, I installed the new Nvidia drivers (461.09) and every time I tried to launch Game Maker the problem came back: black screen, fans at 100%... This happened 2 seconds after clicking on the icon, with no GPU load at all. I made a post with the details on the Nvidia forums, in case you want to have a look ( https://www.nvidia.com/en-us/geforc...screen-and-fans-spinning-at-100-with-my-3090/ ), but the long story short is that this failure was very consistent and easy to replicate. I could also see in the Event Viewer that, in this case, the computer was still alive after the failure, with DWM.exe trying to recover the desktop time after time, to no avail. Funnily enough, Game Maker seemed to be the only problematic app; the rest, including games, were fine. Given all this, I decided to roll back to my previous drivers (460.89) and this fixed the problem. The computer has been running a simulation for 24 hours under heavy load and it seems to be completely stable.


So... after all this rant... any idea of what is going on? It seems quite clear to me that the old PSU was struggling, but it is also obvious that even if the new one solved many issues, it did not make the computer bulletproof either. I would be tempted to suspect that the GPU is at fault, but the fact that yesterday it crashed under no load in such a reproducible way, and also the fact that rolling back to the previous drivers solved the issue and it has been running a simulation for 24 hours without problems (with 320W of GPU power draw), makes me think that this may well be software related.


In any case, this is starting to get old, I am quite fed up and I don't really know what to do. If GPU availability were normal, I would try to RMA this one, buy another from a different brand and sell the RMAed unit for cheap, just to get rid of it. That is, sadly, not really possible at the moment. Moreover, I am not sure about starting the RMA process because it may well be that they test the GPU for a few days and everything is fine in their hands, or that it is a software problem. Gigabyte is also reported to have a dismal RMA service, so I am afraid I would be without my card for a long time (I have not experienced it myself though, so I would like to give them a chance).


What do you think? Any help will be greatly appreciated. Sorry for the long read!


P.S.: the 3070 has worked totally fine all this time, by the way.
 
Clearly the nVidia driver? Rolling back solved the issue it seems.

Thanks! It could be, but I have reached the point where I don't trust the GPU anymore, which is why I explained the whole situation to you guys, in order to hear your opinions :).

Hi,

I notice you seem to have two 8-pin connectors.
Gigabyte have quietly added a 3rd connector to one of the 3080s, which raises the question of how different the 3090 board design is, given they are both GA102.

https://videocardz.com/newz/gigabyte-adds-a-third-8-pin-power-connector-to-geforce-rtx-3080-master

Saw this first in my YouTube feed, where someone was having throttling/stability issues with their Gigabyte 3080 and stated there was now a Rev 2 with the third connector. Unfortunately I can't find it now, but I did find the VC link above.

Yes, I just have two 8-pin power connectors on this card, but the fact that it was crashing under no load makes me think that power is not the problem this time. I am pretty sure that there was an issue with it before, though, but since the stability issues under heavy load disappeared after installing the new PSU and the sound problems went away, I think it is fine now in that regard.
 
Thanks for your answers!

Gigabyte has one of the best RMA services in the UK, partly thanks to UK rep @GIGA-Man , think a lot of mobo and GPU owners on here can vouch for that one. If it wasn't for covid, you could have popped your GPU in personally if you lived close to Milton Keynes.

Pop open an RMA request and see how it fares.

*** did you remember to register for 4th year warranty as well?

That is great :). Yes, I registered for the 4th year warranty, so I am covered in that regard!

Are you using separate power cables and not daisy chaining them? I would also do the following: update the motherboard BIOS and update the chipset drivers. Reverting to known good drivers is a good idea.

Yeps, separate power cables. The BIOS is up to date (I will check the chipset drivers) and since I reverted to the previous Nvidia driver I have not had any more problems. All this is a bit strange, since the card was able to pass all the stability tests after I went back to the old driver, and it had been running totally fine for weeks after changing the PSU. Maybe it was a driver issue, but I would prefer to have it checked. Another thing I did was to change the Nvidia driver to High Performance mode, since apparently lots of users are having TDR problems related to this.
 
Thank you so much all! I really really appreciate all your feedback and info :)

I have not had another problem since I rolled back to the previous driver... yesterday, after playing for a couple of hours, I also left AIDA64 stressing the whole system overnight and everything was fine. However, it is great to have the RMA contact details and read your suggestions, just in case this happens again. As Sasahara mentioned, performance here is important, but stability and reliability are more important than anything else :).
 
So... in the end I contacted the RMA service and they were as efficient as you mentioned :). They sent me all the info to return the card and I told them that, since it could be a software issue, I would keep the GPU for the moment but, if it happens again, I will leave it with them :).
 
I have to start a simulation with this guy today. If it glitches, the amount of profanity in this living room will be difficult to quantify :p Good that I am playing Ion Fury these days and it does not need more than a simple VGA card if you want.
 
Sounds to me like you had a faulty or inadequate PSU which you then switched for a proper one, and then just ran into a poor driver right after. I could be wrong, but since the older driver seems to work and is still relatively new I would just use that, and if no issues arise then go on with my life. No point in trying to dissect why a driver is misbehaving if the previous one is fine, as it will drive you up the wall at some point... been there, done that, don't recommend it.

It could actually be, yes... :) The guy has been totally fine since I rolled back to the previous driver, and it is running a simulation now. Hopefully, it will manage to finish it. I will keep this driver for the moment, as you mention. I usually don't play super modern games, so it will be fine (I recently finished Prey, but a lot of the time I am just playing WADs in GZDoom...).
 
Sooo... I am resurrecting this post to keep the tale up to date :). The GPU was fine for around 15 days, and today, as soon as I booted the computer, boom, it started failing again, this time right after the login screen (the usual: black screen, fans spinning at 100%...). It did this three times in a row, and the system configuration had not changed in the last ten days, so I decided to fill in the RMA form and I will be sending it back for them to check. It must be a power issue, since when it works it works fine for days, but when it fails it tends to do it many times in a row, and it is independent of the load now. Luckily, the 3070 is fine and I also kept my old 980Ti, so I should be quite OK while I wait (I am using the 3070 for the main screen and the 980 for the auxiliary ones). Fingers crossed it won't take forever!
 
I agree... it totally looks like a power issue, although the card also managed to exhibit the problem all of a sudden when the computer had been running for days under constant load. There must be something on the PCB that is a bit on the edge and fails from time to time. It may be something stupid, like a problem in the power connectors, but I cannot open it to have a look because I am afraid of voiding the warranty. I cannot easily do the underclocking test either, for two reasons: the first is that this is a production machine at the moment and it needs to be working most of the time; the second is that the card can be totally fine for days, so such a test could take a very long time. I hope they are able to find the issue, otherwise I will be forced to sell it for cheap, disclosing the problem, and buy another one. I was supposed to receive a workstation one of these days (when it finally arrives I won't need my regular desktop for work), but everything has gone down the drain lately in terms of availability and suppliers, so I need to be patient :(
 
Thanks! My card is actually one of those that should have the new connectors, although I did not open it to check. I am waiting to receive the RMA number, and then I will send it to them. I will keep you updated on what is going on and how this ends :).
 
Thanks! In the worst case, if they say that they can't find anything wrong, I will buy another one and sell this unit, disclosing the issue. Problem is that I would lose some money and, also, they are so scarce nowadays that sourcing another looks like a nightmare :(.
 
Soo... today I drove to Milton and left the card with them at 16:30. A guy attached the RMA form and a few papers showing the error codes from my reliability monitor to the box, then he said that everything was OK and I left. There was another 3090 Gaming OC in there too... and a person testing equipment in the background, surrounded by oscilloscopes. Let's see what happens now :).
 
Hello! As promised, I am updating the post with the results of the RMA service. I just received this e-mail from them:

Hello

In regards to this returns number and reports left.
Our repair team advise no issues found with this item as reported/described but the item has been swapped for you anyway and is ready for self collection. Although your reports indicate hardware error, it does not specifically state the error is caused by the VGA so could be another hardware error.
Please see their comments below and pictures attached.
Therefore is issue persists it is very unlikely to be a VGA issue. Please advise date and ETA of self collection Monday to Friday 9am till 5pm. Item is not available for collection today but from next Monday onwards.
Regards
Returns Admin Dept


// Original card test No Fault Found, cannot find where user say "Fan start spin 100%, crash PC, sunddenly it dies 5 times in 5 min", but swap new to user, if same faulty not graphica card, may be other hardware problem. //

In summary, they could not reproduce the crash, which is not entirely surprising because the card was able to run fine for days before entering one of its crazy crashing loops, but they replaced the GPU with a new one and I can collect it on Monday. So, in principle, I am very happy with their service :). Now, if this happens again, I guess I will need to RMA the mobo! :p I doubt it, though, as the computer has been totally fine with my old 980Ti and the 3070 Eagle :).
 
Thanks!

Tomorrow I will pick up the new unit and see how it goes :). If the problem persists, it may be an incompatibility with the motherboard, although at the moment I think it was GPU related.
 
Well, in my case it seems that the problems were totally GPU related. I have been using the new one for three weeks and I have not experienced any problems. However, the first unit was definitely not stable :eek:
 