
(GTX 970/GTX 980): Why Bios modding is mandatory for most cards if you want a stable overclock

#1 ·
* I keep this post technical and assume that whoever reads this is familiar with Maxwell BIOS Tweaker (MBT) and the terms I use, such as "voltage table" etc. This is really a post for modders.

I looked at LOTS of GTX 970 BIOSes, and with a good number of cards you cannot add more voltage externally, say with Afterburner.
One indication of this is the THIRD slider in MBT's "voltage tab": when it shows one fixed value and no range, you most likely cannot add more voltage.

This means that with many cards (from EVGA, ASUS, MSI etc.) you find yourself in a situation where you overclock, say in Afterburner, but CAN NOT add more voltage.

This is bad for several reasons (which I will explain), but there is also a fix for it:

You will know that cards, depending on many factors like ASIC quality, power target, load etc. will boost to a certain BOOST CLK in the boost table, depending on their default given boost clock.
(A lot of cards in the 70%-ish ASIC quality range for example boost to CLK 63 or CLK 64 in the boost table.)

Each CLK entry in the table has its assigned voltage range.

So for example, your card may have a default boost clock of 1316, so it boosts to CLK 63. At lower load it may drop to CLK 60 or CLK 58.

You may find your max stable clock is 1500, so in Afterburner you add +184 to the core clock.

But once you have added core clock in Afterburner, the BIOS voltage table no longer matches the clocks the card actually runs.

You added clocks equal to, say, 10 or more boost bins, but the card/BIOS STILL USES THE SAME CLK BINS as if you were not overclocking.

Example: Normally, when your card goes to CLK 53 at stock, it might give you (for example) 1.000V. But since you added a lot more clock on top (you're overclocking), you would need a lot more voltage than what is specified for CLK 53. (Again: the BIOS does NOT jump higher in the voltage table when you add clocks externally!)
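To make the bin problem concrete, here is a minimal Python sketch of a toy model. The table values are made up for illustration and are not read from any real BIOS; the only point is that an external offset moves the clock of a bin while the voltage assigned to that bin stays put.

```python
# Toy model of the behaviour described above. Each CLK bin maps to a
# stock clock (MHz) and a voltage (V). All values are hypothetical.
STOCK_TABLE = {
    53: (1190.0, 1.000),
    58: (1253.0, 1.075),
    63: (1316.0, 1.200),
}


def effective_point(bin_index, offset_mhz=0.0):
    """Return (clock, voltage) for a bin when an external offset is applied.

    The offset is added to the clock only; the voltage is still looked up
    from the same bin, which is the mismatch the post is about.
    """
    clock, voltage = STOCK_TABLE[bin_index]
    return clock + offset_mhz, voltage


stock = effective_point(53)
overclocked = effective_point(53, offset_mhz=184.0)
print("stock:      ", stock)
print("with offset:", overclocked)
```

In this model the card at CLK 53 runs 184 MHz faster after the offset but is still fed the 1.000V that was meant for the stock clock of that bin.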

What you need to do: specify your max. stable clock as the "default boost clock" in the BIOS, and also make sure that the last entry in the boost table is this clock. (Say, 1506, or 1480, whatever you found stable.)

By specifying your max. stable clock as "default boost clock" you force your card to the max. boost bin, all the way at the end of the voltage table. (And not, as would be the default, somewhere in the middle, like CLK 63 or CLK 64.)

This is why you would not be stable with MANY cards where you cannot add voltage externally: you MUST modify the BIOS to get the correct voltages all through the voltage table.
(The instability becomes most apparent when the card CLOCKS DOWN. It may be OK at the stock default boost clock, say CLK 63 and 1.200V, but then it clocks down and the individual voltages are all messed up.)

Of course, the best case is a card which simply allows adding a bunch of volts on top of your OC (with Afterburner)...but as said, MANY don't allow that.
 
#2 ·
More findings:

MOST of the time, in your BIOS, the "boost table" (as seen in MBT) might be sufficient to cover all your overclocks.
(For example, my card makes 15xx "in games" but to get it really stable in extremely demanding benchmarks I choose 1481.0, the next one down from 1506).

The default bios boost table goes already up to 1506.

----> If your boost table covers your maximum attainable overclock (i.e. the last value in the table, 1506, is equal to or higher than your actual possible overclock)
IT'S BEST NOT TO TOUCH THE BOOST TABLE. So do NOT shift the values in the boost table around.

---> Just put your maximum stable overclock as "boost clock" in MBT. (In my case, stock boost clock is 1316. So I changed this value to 1481.0)

If you KEEP the default boost table it will automatically find the right voltages and will (depending on power limit, target etc.) use the same, correct voltages and clocks as a stock bios.

I did extensive testing on this just now and I confirmed that overclocking with Afterburner, WITHOUT the ability to add voltages is the worst thing you can do!
*** It gets worse: Overclocking with Afterburner actually DROPS voltages compared to stock ***

Short: If you have a card where you cannot manipulate voltages externally (those are many cards!) do NOT use Afterburner to overclock...but use a BIOS where you have your max. stable OC as "boost clock".

*** Here is an example what I mean

80% Power Target Test
1.150V at 1329 MHz <- STOCK BIOS, no overclock
1.125V at 1406 MHz <- card overclocked to 1480 with +140 in AB; now at 1406 MHz the card gives only 1.125V (which is not enough!)
1.150V at 1329 MHz <- card with 1480 set as "boost clock" in MBT instead of using AB to add clock: the same, sufficient voltage as the stock BIOS gives for this clock

This is also the explanation why many folks are complaining about instability at lower clocks! They use Afterburner to OC, but their card (the voltage regulator, that is) doesn't allow giving more volts with Afterburner. (Or they think they can, but in reality their card doesn't take the voltage set in Afterburner. I too can set AB to +50mV but it doesn't mean squat, it doesn't do a thing to my voltages.)
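As a rough illustration of the three readings from the 80% power target test, the Python sketch below just restates them and computes the difference to the stock reading. The labels and variable names are made up for this example; the numbers are the ones reported above.

```python
# Readings from the 80% power target test: (label, clock in MHz, voltage in mV).
# Voltages are kept as integer millivolts to avoid floating-point noise.
measurements = [
    ("stock BIOS, no overclock", 1329, 1150),
    ("+140 offset in Afterburner", 1406, 1125),
    ("1480 set as boost clock in MBT", 1329, 1150),
]

baseline_clock, baseline_mv = measurements[0][1], measurements[0][2]

# Difference of each reading to the stock reading: (label, MHz delta, mV delta).
deltas = [
    (label, clock - baseline_clock, mv - baseline_mv)
    for label, clock, mv in measurements
]

for label, d_clock, d_mv in deltas:
    print(f"{label}: {d_clock:+d} MHz, {d_mv:+d} mV vs stock")
```

The Afterburner case is the only one where the card runs faster while being given less voltage than stock.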
 
#4 ·
Thanks for this post. I have noticed this a great deal - specifically in WoW with 200% render scale - card gets nuked and likes to pull ~30-50w more than some other games. I power limit it to 250w (980) and cap to 200fps in-engine because i sometimes run that game 50 hours a week and there's no need for the noise, but in that game and several others, i've seen 2 choices:

1; max OC all the time, not hitting power limit
or
2; instability

this should fix it.
Quote:
I've been looking for a low utilisation bugfix for months now...
You're probably playing CPU limited games, etc. This shouldn't improve GPU utilization. What games and situations are you seeing GPU below 100% load at and what's the rest of your hardware and screen resolution/game settings?
 
#5 ·
Quote:
Originally Posted by Cyro999 View Post

Thanks for this post. I have noticed this a great deal - specifically in WoW with 200% render scale - card gets nuked and likes to pull ~30-50w more than some other games. I power limit it to 250w (980) and cap to 200fps in-engine because i sometimes run that game 50 hours a week and there's no need for the noise, but in that game and several others, i've seen 2 choices:

1; max OC all the time, not hitting power limit
or
2; instability

this should fix it.
You're probably playing CPU limited games, etc. This shouldn't improve GPU utilization. What games and situations are you seeing GPU below 100% load at and what's the rest of your hardware and screen resolution/game settings?
No, the utilisation itself is not the problem. When you OC the card with AB, the voltage tables shift, and this may lead to instability at lower clock speeds even though you can run Unigine Heaven or Crysis for hours without problems.

I had this problem specifically with LoL for example.
The card didn't boost at all because of medium load and because the voltage tables were screwed up it would crash within a few minutes even with a mild OC while I could play BFH for hours!

I really hope this will fix the problem!
 
#6 ·
Quote:
Originally Posted by Geicher View Post

You are my hero!

Thank you so much for explaining!

So basically I just have to edit 1 value to fix this mess in MBT?

I've been looking for a low utilisation bugfix for months now...
Yes, with one caveat which I forgot to mention!

Change the "Max Boost Clock" in Default Tab to your max stable clock, say 1481.0

Now, it can happen you changed your Boost Clock to 1481.0, but your card still only boosts to, say, 1250.

Now, look over the voltage tab: at the bottom, the last few entries might have voltage values which are a lot higher than your card can actually supply.
(In ALL the BIOSes I looked at, the maximum voltage in the voltage table is 1281.3, i.e. about 1.281V. That is WAY too high, so the card will never boost there, but only up to the clock entry in the table which has the max voltage the card can really supply.)

*** --> Edit all the values in the voltage table which are HIGHER than your actual card's max voltage down to your actual card's max voltage. Those are a bunch on the right and a few on the left. This allows the card to use the entire voltage table.

I give you an example of my own card:



My card's absolute max voltage is 1.212V (many cards are limited to 1.212V), however my card is a special case since at 1.212V it may black screen in Heaven Benchmark. So I "artificially" limited my card to a max. voltage of 1.200V (It is stable at 1481 at 1.200V)

You see in my voltage table that I changed all the values which had been HIGHER than 1.200V to 1.200V. This means the card will now boost to CLK 72 at most, WHICH IS ALSO THE ENTRY IN THE DEFAULT BOOST TABLE for 1481.0 <--- so they match up.

For the last two clocks, CLK 73 and CLK 74, I put in "crazy high values" so the card never uses CLK 73 and CLK 74, and CLK 72 is the maximum.
(Also, don't forget to adjust GPC MAX for P00 and P02 in the "Boost States" tab to whatever your max stable OC is, although I don't think it's really needed. But it's a good thing to do anyway.)

This voltage table keeps essentially the same clocks/voltages as the default BIOS, so the voltages are correct, also when the card clocks down.

TLDR: Adjust the voltage table so your card uses the entire table, but otherwise leave the original curve and its voltage steps alone. This is how I did it.
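The clamping step can be sketched in a few lines of Python. This is only a model of the edit you would make by hand in MBT, not something that reads or writes a BIOS file; the function name, the bin numbers and the millivolt values are hypothetical.

```python
def clamp_voltage_table(table_mv, card_max_mv, parked_bins=frozenset()):
    """Return a copy of the table with entries above card_max_mv lowered to it.

    table_mv    -- dict mapping CLK bin number to its voltage in mV
    card_max_mv -- the highest voltage the card should ever be given
    parked_bins -- bins deliberately left at a too-high value so the card
                   never boosts into them
    """
    clamped = {}
    for clk, mv in table_mv.items():
        if clk in parked_bins:
            clamped[clk] = mv  # leave "crazy high" on purpose
        else:
            clamped[clk] = min(mv, card_max_mv)
    return clamped


# Hypothetical tail end of a voltage table (values in mV).
example = {70: 1175.0, 71: 1187.5, 72: 1225.0, 73: 1281.3, 74: 1281.3}
result = clamp_voltage_table(example, card_max_mv=1200.0, parked_bins={73, 74})
print(result)
```

Entries at or below the limit are untouched, so the shape of the original curve is kept, which matches the TLDR above.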
 
#7 ·
Quote:
I had this problem specifically with LoL for example.
The card didn't boost at all because of medium load and because the voltage tables were screwed up it would crash within a few minutes even with a mild OC while I could play BFH for hours!

I really hope this will fix the problem!
Yea, that should be due to this problem. Your GPU utilization itself wouldn't go higher (LoL is a graphically light and quite highly CPU bound game) but it should run at for example 1200mhz @1.05v instead of 1300mhz@1.05v (if you had a +100mhz offset) when it's not boosting up to your max OC.

Thanks a lot for this thread!
 
#8 ·
Just some additional notes:

Not all people who mod BIOSes actually know what they're doing. So be careful if you use a BIOS that someone else modded for you.

There can also be a problem that the card might not boost to its maximum because people modded their TDP and power targets to some insanely high values.

Say, if I were to mod my GTX 970 to a 500W TDP/Power limit or something.

Then there would be the problem that the card runs a game and of course never gets anywhere near those way too high limits, and then THINKS it doesn't need to boost because of "low utilization". In other words, upping power limit and TDP to crazy high levels does exactly the opposite of what you want.

What you want to do, if you mod a card and want to increase power limits, keep it REASONABLE, I always say 15%-20% more but not more.

As a rule, I run Heaven Benchmark, and during the benchmark the card should not throttle and should show maybe 95%-98% usage. You should NOT mod your card so that it runs Heaven Benchmark and thinks it's only at 50% usage...of course that would be wrong too.

My GTX 970's stock power limit is 170W, with 187W at 110% on the slider. I modded it to 196W as default (100%) and 206W with the slider at 105%. 196W is a good value for THIS card so I keep it there. It already gets very hot (77C) in Heaven BM since the ACX2.0 cooler is junk.
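The power limit numbers can be checked with a little arithmetic. The Python sketch below assumes the slider is simply a percentage of the default limit (that is my reading of the figures, not something taken from MBT or Afterburner documentation), and the function name is made up.

```python
def slider_watts(base_watts, slider_percent):
    """Power limit in watts for a given slider position, assuming the
    slider is a plain percentage of the default (100%) limit."""
    return base_watts * slider_percent / 100.0


stock_max = slider_watts(170, 110)   # stock limit with the slider maxed
modded_max = slider_watts(196, 105)  # modded limit with the slider maxed

# How much the modded default is above the stock default, in percent.
default_increase = round((196 / 170 - 1) * 100, 1)

print(stock_max, round(modded_max), default_increase)
```

Under that assumption the modded default sits about 15% above stock, which is inside the 15%-20% range recommended above.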
 
#9 ·
Yea almost.

An example for my specific problem:

The card would normally run at 1250 MHz at 1.0V for example. (without boosting!)

Now, after I apply any OC to the card, it would still run at 1250 MHz (because the card won't boost in this application or game), but because of the shifted voltage tables the applied voltage would be lower (depending on the offset clock you applied in Afterburner).

So after OC I would have only 0.95V at 1250 MHz for example. If I increase the clock speed even higher the voltage will drop at the same time.
 
#10 ·
Dear flexy123, can you post your modded BIOS for your 970 ACX 2.0 that reflects your considerations in previous messages?
I think it will be easy for everyone to follow you with it opened in MBT.
 
#11 ·
Check this bios out:
1304.0 tdp base clock
1544.5 3d base clock(in-game and benchmark frequencies)
1544.5 boost
272.8W TDP and max power: 95400 mW x 2 (6-pin connectors) + 82000 mW (PCI-E slot) = 272800 mW (100% power limit, constant)
1.2750v
idle temps 22-37C
load temps 48-55C
no throttling or perf caps
I ran Heaven and Valley, both for an hour, plus 10 rounds of Fire Strike.
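For anyone checking the power figure in the list above, here is the sum as a tiny Python sketch. The variable names are made up; the milliwatt values are the ones quoted in the post.

```python
# Per-rail limits quoted in the post, in milliwatts.
six_pin_mw = 95400      # each of the two 6-pin connectors
pcie_slot_mw = 82000    # PCI-E slot

total_mw = 2 * six_pin_mw + pcie_slot_mw
total_w = total_mw / 1000

print(total_mw, "mW =", total_w, "W")
```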

EVGAGTX970SSCACX2.0_1.275v_1545Mhz.zip 136k .zip file
Fire Strike 1620/4001, 1.275v, 49C
http://www.3dmark.com/3dm/6851504

This is the graph data from AB 4.1.0:

 


#12 ·
Quote:
Originally Posted by Brama View Post

Dear flexy123, can you post your modded BIOS for your 970 ACX 2.0 that reflects your considerations in previous messages?
I think it will be easy for everyone to follow you with it opened in MBT.
Sure!

1481lol.zip 136k .zip file


Be aware:

This is my personal BIOS for my SC ACX2.0 GTX 970 with an ASIC of 70%. On stock, it only boosted to 1.200V.

After many tests I concluded that the card hits its limit at 1.212V, therefore I don't use the 1.212V.

It uses my tested 1481.0 at 1.200V, where it's stable. Power limit is also slightly increased to 196W default and 206W max. Those cards are IMHO not designed to get pushed higher, not with that junk of a cooler. 196W is a good value to keep temps still reasonable. I haven't seen the card throttling in games (this was the main reason I modded in the first place), EXCEPT in the Fire Strike demo and some special cases, which I think is where most cards throttle.

* Be also aware this is for the standard SC ACX2.0 cards which have a crappy voltage controller which does not go higher than 1.212V and which also doesn't allow setting voltage externally.

A SSC+ etc. uses entirely different components, it also has 1x6pin + 1x8pin...so use my BIOS only if you have a normal SC ACX2.0 card and think that it would do 1480..and you want it stable. (Obviously, SOME cards might get higher....but for me I found 1480 the best, 1510 etc. might black screen me out after an hour or so Heaven Benchmark, even at 1.212V)

Edit: The fact that I don't use the maximum possible voltage of 1.212V in my BIOS has another advantage: it never reaches the voltage limit!
 


#14 ·
I have a PNY 970 (reference design) and the voltage perfcap activates any time AB reads 1.212 V or more.
Does that mean my card is hardware limited to 1.212 V?
Can someone confirm whether I am right?
 
#15 ·
Quote:
Originally Posted by flexy123 View Post

Edit: The fact that I dont use the maximum possible Voltage of 1.212V in my bios has another advantage: It never reaches the voltage limit!
Is there a real reason to avoid reaching the voltage perfcap?
I want to squeeze every MHz out of my card and +12 mV could help.


With full water cooling, what normal and max power levels in W do you suggest?
 
#16 ·
Quote:
Originally Posted by Brama View Post

Is there something real suggesting not to reach the voltage perfcap?
I wish to squeeze any MHz my VGA can achieve and +12 mV could help.


Using a full water dissipation, what level of normal and max power in W do you suggest?
I'd say for any "normal" card I would always use the max. possible voltage, as long as temps are not a problem.

My SC ACX2.0 (AND the one I had before!) both have a problem with max voltage. Yesterday I found my card black screens even at 1.200V in Heaven BM, so now I changed my BIOS to use a maximum of 1.187V; it's still stable at 1480. I don't know what's wrong with those *****y EVGA cards, but lower voltage is the only solution I found to prevent the black screens. That being said, on air, and with the s****Y EVGA cards, if a card is stable at 1500 at 1.187V and doesn't really *require* 1.200 or 1.212, I would also prefer to run it at the lowest stable voltage. It reduces power draw (meaning: it keeps the card further below the threshold where it might throttle, AND of course it reduces temps).

I am telling you, it's the ****y components, mainly voltage regulators etc. on those cards which cause problems, if they get too hot with high voltage the card(s) black screen. I am not the only one with this problem as it looks.
 
#17 ·
Quote:
Originally Posted by flexy123 View Post

I am telling you, it's the ****y components, mainly voltage regulators etc. on those cards which cause problems, if they get too hot with high voltage the card(s) black screen. I am not the only one with this problem as it looks.
I suspect that cards with the NCP81174 voltage regulator hit some overcurrent protection that cuts the power to the card. I have similar black screens, and in my experience they don't look like an unstable overclock but like some hardware protection kicking in.
 
#18 ·
Quote:
Originally Posted by Brama View Post

I suspect that cards with the NCP81174 voltage regulator hit some overcurrent protection that cuts the power to the card. I have similar black screens, and in my experience they don't look like an unstable overclock but like some hardware protection kicking in.
Yes, because with an unstable OC what usually happens is that the driver crashes but the system recovers. The funny thing is, "in principle" those cards go up to 1550 or even higher; it's just that all of a sudden the card seems to "shut off", but otherwise it is stable even at very high clocks. I wonder whether a good aftermarket cooler would actually help there.
 
#19 ·
Quote:
Originally Posted by flexy123 View Post

I wonder whether a good aftermarket cooler would actually help there.
I don't think so.
I have an EK waterblock on my PNY 970 and the card is able to hit and maintain about 1580 MHz for about 30 minutes without any issue.
Depending on load (in my PC FarCry 4 is perfect to check this kind of stability issue), after this time there is the infamous black screen that is engaged more by power load than gpu core frequency.
With light games, I can stay at 1620 MHz for hours.

I think that in many 970 and 980 they used very cheap VRM components and the OCP (Over Current Protection) of NCP81174 is kept very low, to protect the poor VRM components.
 
#20 ·
>>
that is engaged more by power load than gpu core frequency.
>>

Yes, this is exactly the conclusion I came to as well! In fact, it looks like anything beyond 195W draw becomes critical at some point. The actual core freq or voltage doesn't matter by itself (of course there is ultimately a relationship), but it's the overall power draw those cards cannot handle at some point. That's why I also say: IF YOU CAN, and you have tested your card stable at a certain freq, it might be a good idea to use the lowest possible voltage to keep the draw within tolerance. (I am simply assuming that lower voltage doesn't stress the VRMs as hard.)
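To put a rough number on the "lowest possible voltage" advice: the sketch below uses the textbook approximation that dynamic power scales with frequency times voltage squared. That rule of thumb is an assumption brought in for illustration, not something measured in this thread; it ignores leakage and everything else on the board, and the function name is made up.

```python
def relative_dynamic_power(v_new, v_old, f_new=1.0, f_old=1.0):
    """Ratio of dynamic power after/before a change, using the rough
    approximation P ~ f * V^2 (leakage and other losses are ignored)."""
    return (f_new / f_old) * (v_new / v_old) ** 2


# Same clock, voltage lowered from 1.212V to 1.187V.
ratio = relative_dynamic_power(v_new=1.187, v_old=1.212)
print(f"estimated dynamic power: {ratio:.1%} of the 1.212V value")
```

By this estimate, dropping from 1.212V to 1.187V at the same clock saves on the order of 4% of the core's dynamic power, which is small but goes in the direction described above.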
 
#21 ·
I agree.
There is another way: disable OCP of NCP81174 and risk the health of VRM active components.
I studied the data sheet of the above VRM controller and it seems very easy to bypass the OCP hardware protection.
If someone has the time and is brave enough to try, I can explain how and, if it works, he can take the honor of the mod.
 
#22 ·
I am sure there is a hardmod and I already read about this somewhere, but TBH...those cards would be the last on my list I would want to push by any means


Right now I am happy being able to run Heaven Benchmark for an hour at 1.187V at 1481 without Black Screen, knock on wood. Was pretty *issed yesterday since I could've sworn I tested for a long time at 1.200V and thought the problem was gone.

SOME of those cards, like the one I RMAed cannot even handle 110% power target at STOCK (!) clocks...imagine that. As I said somewhere else, that's what you get when a company wants to save money and uses cheap components..... EVGA is really good and super-fast with service and RMA....but they definitely don't impress "enthusiasts" with those SC cards...
 
#24 ·
Quote:
Originally Posted by flexy123 View Post

Sure!

1481lol.zip 136k .zip file


Be aware:

This my personal BIOS for my SC ACX2.0 GTX 970 with an ASIC of 70%. On stock, it only boosted to 1.200V

After many tests I concluded that the card hits its limit at 1.212V, therefore I don't use the 1.212V.

It uses my tested 1481.0 at 1.200V where it's stable. Power limit is also slightly increased to 196W default and 206W max. Those cards are IMHO not designed to get pushed higher, not with that junk of a cooler. 196W is a good value to keep temps still reasonable. I haven't seen the card throttling in games (this was the main reason I modded in the first place)..:EXCEPT firestrike demo and some special cases which I think is where most cards throttle.

* Be also aware this is for the standard SC ACX2.0 cards which have a crappy voltage controller which does not go higher than 1.212V and which also doesn't allow setting voltage externally.

A SSC+ etc. uses entirely different components, it also has 1x6pin + 1x8pin...so use my BIOS only if you have a normal SC ACX2.0 card and think that it would do 1480..and you want it stable. (Obviously, SOME cards might get higher....but for me I found 1480 the best, 1510 etc. might black screen me out after an hour or so Heaven Benchmark, even at 1.212V)

Edit: The fact that I dont use the maximum possible Voltage of 1.212V in my bios has another advantage: It never reaches the voltage limit!
I copied your BIOS values to my G1 Gaming 970 BIOS and it turns out to be really nice. My FPS in Heaven went down from 66 to 64 compared to my old custom BIOS, but the voltage and temperatures are way, way better: it went down to 1481 at 1.2 vs 1531 at 1.275.
2 fps is nothing, so I might keep it, depending on whether it is stable. Thanks for sharing!
 
#25 ·
Quote:
Originally Posted by ValValdesky View Post

I copied your bios values to my G1 Gaming 970 bios and it turns out to be really nice, my FPS went down from 66 to 64 in Heaven from my old custom bios but the voltage and temperatures are way way better, it went down to 1481 at 1.2 vs 1531 at 1.275.
2 fps is nothing so I might keep depending if is stable. Thanks for sharing!
Try this one too:

1506-1.200V.zip 136k .zip file

(1506 at 1.200V)

1506.zip 136k .zip file

(1506 at 1.1870V)

The last one uses the max from my boost table, 1506, at 1.187V. To my amazement my card is actually stable at that clock and voltage. The first one uses 1.200V as max voltage at 1506.
 


#26 ·
Quote:
Originally Posted by flexy123 View Post

I am curious nevertheless, it will likely involve some pencil mod, right?
Yes, or better, you have to use conductive paint on a resistor to short two points, so that the feedback signal the VRM controller uses to trigger the OCP is always held at 0 V.