Overclock.net - An Overclocking Community - Reply to Topic

Thread: 980ti memory or vrm overheating

  Topic Review (Newest First)
12-04-2018 10:01 AM
Sugita2Junko
Quote: Originally Posted by 8051 View Post
Here's hoping they don't have any 980Ti's left!
I got a 1070ti.


Update:

I put the temperature probe on the same spot on the 1070ti, on the front plate right above the VRAM, and got 34C under 100% load while gaming, so much cooler. The 980ti definitely had something overheating; the probe read 48-50C+ before it auto shut down. Not sure if it was the VRAM, MOSFETs, or VRM since the heat plate was on top of them all.
11-30-2018 11:25 AM
8051
Quote: Originally Posted by Sugita2Junko View Post
No clue what I am getting, but people on reddit reported getting either a refurb 980ti, 1070ti or 1080. Guess it is whatever they have available and fixed.
Here's hoping they don't have any 980Ti's left!
11-29-2018 12:03 PM
Sugita2Junko
Quote: Originally Posted by 8051 View Post
That's great. How are they sending you a replacement 980Ti though? They can't have any of those in stock anymore can they?
No clue what I am getting, but people on reddit reported getting either a refurb 980ti, 1070ti or 1080. Guess it is whatever they have available and fixed.
11-29-2018 11:44 AM
8051
Quote: Originally Posted by Sugita2Junko View Post
It has been fine for 2 years with dual NF-F12 @ 1500rpm, which is pretty strong, much stronger than the stock fans. Not sure if a GPU component just degraded and is generating more heat than before, or if something is faulty. EVGA is actually sending me an RMA replacement even though the card is about 2 weeks out of warranty.
That's great. How are they sending you a replacement 980Ti though? They can't have any of those in stock anymore can they?
11-29-2018 07:44 AM
Sugita2Junko
Quote: Originally Posted by 8051 View Post
It sounds like you definitely need more cooling. Maybe you could try attaching an even higher powered fan or fans to the heatsink. Personally, I have a noctua NFA14-ippc3000 and a SanAce 127x38mm attached to my GPU's heatsink and both are shrouded.
It has been fine for 2 years with dual NF-F12 @ 1500rpm, which is pretty strong, much stronger than the stock fans. Not sure if a GPU component just degraded and is generating more heat than before, or if something is faulty. EVGA is actually sending me an RMA replacement even though the card is about 2 weeks out of warranty.
11-28-2018 12:51 PM
8051
Quote: Originally Posted by Sugita2Junko View Post
I play Overwatch and cap the FPS to 162, so the GPU utilization is usually 60-80%, rarely maxing out. At first I also thought it was a PSU issue. Swapped in a brand new one and it still reboots. While trying to take the GPU out to re-seat it I discovered it was freaking hot. Too hot to hold.

Installed a thermal probe on the front/backplate which covers the VRAM etc. and found out it usually reboots when the temp reaches 50C. I tried stress testing again, but this time fanning it hard to cool it a little, and it didn't reboot. Lowered the power limit to 60% and my games stopped rebooting.
It sounds like you definitely need more cooling. Maybe you could try attaching an even higher powered fan or fans to the heatsink. Personally, I have a noctua NFA14-ippc3000 and a SanAce 127x38mm attached to my GPU's heatsink and both are shrouded.
11-28-2018 12:26 AM
Sugita2Junko
Quote: Originally Posted by 8051 View Post
Do you ever see any perfcaps before it reboots? I wonder if it could be your PSU and some sort of overcurrent protection kicking in? PSUs heat up too, and if they heat up enough that can affect their output.
I play Overwatch and cap the FPS to 162, so the GPU utilization is usually 60-80%, rarely maxing out. At first I also thought it was a PSU issue. Swapped in a brand new one and it still reboots. While trying to take the GPU out to re-seat it I discovered it was freaking hot. Too hot to hold.

Installed a thermal probe on the front/backplate which covers the VRAM etc. and found out it usually reboots when the temp reaches 50C. I tried stress testing again, but this time fanning it hard to cool it a little, and it didn't reboot. Lowered the power limit to 60% and my games stopped rebooting.
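
For anyone chasing the same kind of hard reboot, here is a rough Python sketch of a logger that polls `nvidia-smi` once a second and forces every row to disk, so the last readings before a reboot survive. It is only an illustration, not what was used in this thread: the file name, interval, and field list are assumptions, `nvidia-smi` has to be on the PATH, and some fields may come back as `[Not Supported]` on GeForce cards. It also only sees the core sensor, so a plate-mounted probe like the one above still has to be read separately.

```python
import csv
import os
import subprocess
import time

# Query fields passed to nvidia-smi; which ones to log is an assumption.
FIELDS = ["timestamp", "temperature.gpu", "power.draw", "power.limit", "utilization.gpu"]


def parse_row(line):
    """Turn one CSV line of nvidia-smi output into a dict keyed by field name."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(FIELDS, values))


def sample():
    """Ask nvidia-smi for one reading from the first GPU."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_row(out.splitlines()[0])


def log_forever(path="gpu_log.csv", interval_s=1.0):
    """Append one row per interval and fsync so a hard reboot loses at most one row."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        while True:
            writer.writerow(sample())
            f.flush()
            os.fsync(f.fileno())
            time.sleep(interval_s)
```

Calling `log_forever()` before starting the game or stress test and reading the tail of `gpu_log.csv` after the reboot shows what core temperature and power draw looked like right before the machine went down.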
11-27-2018 08:21 PM
8051
Do you ever see any perfcaps before it reboots? I wonder if it could be your PSU and some sort of overcurrent protection kicking in? PSUs heat up too, and if they heat up enough that can affect their output.
11-27-2018 07:35 PM
Sugita2Junko
Quote: Originally Posted by Desolutional View Post
Furmark will max out power consumption at P0 on most modern GPUs without dropping down a state. This is the best way to test if there is an issue with power delivery, e.g. an underpowered PSU. It's also very useful for VRAM artifact testing. The fact that the OP's card failed 5 seconds into the test on a warm boot implies something is wrong, and by reducing core offset and VRAM offset, they can eliminate those from the equation.

Caveat: it is less useful for general stability testing, however. Synthetic demos and games will be better for that.

@Sugita2Junko , those green sections are definitely the VRAM modules. Disassemble the backplate and card and ensure that the thermal pads are mating with the VRAM modules; when you remove them they should have rectangular indentations if they have been making contact. Unstable VRAM can cause kernel panic and driver lockup on the 980 Ti, especially with Hynix memory. The fact that it occurs after a few hours means something is heating up to steady state. An unstable core clock would result in a watchdog timeout, not a hard reboot.
I can run furmark a long time before it reboots. But using realbench it reboots within 15 minutes from a cold boot. Once it has rebooted, running realbench again right away makes it reboot in a few seconds, but furmark can last a few minutes.

Don't see any artifacts or issues while gaming. No bluescreen or kernel panic or driver lockup. It just hard reboots once it gets hot enough or something. Not sure if it is VRAM or VRM or mosfet or what, but something is overheating. All I know is the GPU core is well below its thermal limit, usually 50-60C while gaming.

Going to see if EVGA will do an RMA on my 980ti. It is 2 weeks past the 3yr warranty cutoff. If not, I'll order some thermal pads and maybe stick an extra aluminum heatsink on the plate to help cool it.
11-27-2018 02:49 PM
The Pook
Quote: Originally Posted by Desolutional View Post
It's also very useful for VRAM artifact testing.

You can find your vRAM OC in less than a minute with Unigine Heaven, that's not a reason to use Furmark. Start the test in windowed mode at 720p/1080p, pause somewhere during the test on a scene of your choice, increase your vRAM OC until you either get artifacts or performance drops (the scene will still render @ xxx FPS while it's paused) and then back off until you don't have artifacts or performance goes back up. You don't need Furmark to OC quickly, it's a useless program that does nothing outside making a ton of heat for no reason - pretty much why it's nicknamed "power virus."
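
The raise-until-it-breaks, then-back-off routine described above can be written down as a tiny Python sketch. This is just one reading of the procedure, not a tool: `is_stable` is a hypothetical callback standing in for the human step (apply the offset in your OC utility, watch the paused scene, answer whether it is artifact-free with no FPS drop), and the step size and upper bound are made-up defaults.

```python
def find_vram_offset(is_stable, step_mhz=25, max_mhz=1000):
    """Return the highest tested memory offset (MHz) that still looked clean.

    is_stable(offset) is a hypothetical callback: True if the paused scene
    shows no artifacts and no FPS drop at that offset.
    """
    offset = 0
    # Keep stepping up while the next offset is within bounds and still clean.
    while offset + step_mhz <= max_mhz and is_stable(offset + step_mhz):
        offset += step_mhz
    # The first failing step is never applied, so this is the last clean offset.
    return offset
```

With a manual check, `is_stable` could simply prompt with `input()` after each change and return whether the answer was "y".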