21,261 - 21,280 of 21,988 Posts
GB6 will catch issues with transient loads that no other benchmark will catch. These issues can be the reason other stress tests pass, but people get crashes during something like gaming, where the load is varying.
 
That's usually what GB does when it fails. It just stops running. I have had GB3 hard lock, though. But most of the time the progress window just disappears with no messages. For me it's always been too little vCore.
One odd behaviour I'm noticing: math-wise, my current VF setup should work out to something like "1.474+0.050 = 1.524 max voltage, ish" for 1-4 core workloads with 61x short TVB boosting (down a bin at 65c and 80c).

But sometimes I see it go up to numbers like 1.560 for just a moment while idling. During, say, Pi 1M, the voltages are correct I guess, bouncing between 1.4 and 1.53 as the load jumps around 6100-5800 between cores. If I lower the offset by 10, the max becomes like 1.52 as expected.

But if I let it sit at idle, or it returns to idle state, that happens. LLC 5 @ ACLL 0.02 w/+0.10v it would be 1.56~ idling. LLC 4 @ 0.19 w/+0.050 it's basically the same.

If I limit IA VR Voltage to 1550, scores drop a little, 61x boosts don't happen much at all, and max vid idling is like 1.53 instead. There's some buffer of sorts it needs.

(Edit)
I want to lower the offset/max vid since I feel that needing 1.5+ for 61x is too much, but alas I errored in GB6 at +0.040 last time. But perhaps it's something to do with the lower clocks/voltages.
 
A reddit post spoke of the same thing happening - their GB6 would sometimes stop. No error popup just stop. After they raised VF Curve Voltage some amount, they said they could keep running it just fine.

It just becomes difficult to know where it's failing on the VF curve/VCore. My assumption is that, for example, 58x6 at the factory VID of 1.414 (mine) may be OK, but the 57x, 56x and 55x points interpolated from the previous VF point (5400) aren't getting enough (though this shouldn't be the case if a full load at 56x8, e.g. R23, is passing). The 13900K OCTVB guide says to just add offset to VF 10 (for me, 1.474 native @ 6000).

And I guess VF 11 is reserved for pushing further when the inherited vid from 10's offset isn't enough. I'm currently getting the error popup sometimes too. My VF 11 is for 61x, so it's inheriting the same 1.474+0.050 current offset, but it might need more. Or maybe it's 59/60 VF 10 that does.
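A rough way to picture the interpolation worry above, as a minimal sketch only. It assumes the voltage between two VF points follows a straight line, which may not match what the firmware actually does. The 58x and 60x values (1.414 and 1.474) are the ones quoted in the post; the 54x value of 1.300 is a made-up placeholder, since the post never gives it.

```python
# Sketch only: straight-line interpolation between VF points is an assumption.
VF_POINTS = [
    (54, 1.300),  # HYPOTHETICAL placeholder, not a value from the thread
    (58, 1.414),  # factory VID quoted for 58x
    (60, 1.474),  # native VID quoted for VF 10 (6000 MHz)
]


def interpolated_vid(ratio, points=VF_POINTS):
    """Linearly interpolate a VID for a ratio lying between two known VF points."""
    pts = sorted(points)
    for (r0, v0), (r1, v1) in zip(pts, pts[1:]):
        if r0 <= ratio <= r1:
            return v0 + (v1 - v0) * (ratio - r0) / (r1 - r0)
    raise ValueError(f"ratio {ratio} is outside the known VF points")


for ratio in (55, 56, 57, 59):
    print(f"{ratio}x -> {interpolated_vid(ratio):.4f} V")
```

If the in-between ratios really are derived like this, a 55-57x point can end up short on voltage even when the 58x and 60x points themselves are fine, which is the scenario the post is guessing at.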
Yeah this sounds tricky. I don't use adjusted VF curves but I've had good results with waiting for GB6 to log a WHEA with an APIC ID (re-running GB6 as necessary until I get a WHEA), grabbing the ID from the event log and translating it to the physical core (using CPU-Z report), then increasing vcore or decreasing that core's multi. Guess that's not particularly helpful in your case, but perhaps you could limit every core's multi to something stable, then increase the multi until it crashes and determine the problem points that way.

FWIW all 3 of my chips (13900KS, 14900K, 14900KS) have had this instability at stock/oob bios settings and unlocked power limits with air cooling. At least one of them also had it when limited to 253W (I didn't test the other two), it'd just take hours to trigger instead of 5-10 runs of 100 iterations.

Thanks @CarSalesman for originally pointing me towards GB6 realtime btw. :)

Edit: Clarified need to translate APIC ID to physical core.
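The WHEA approach above could be sketched roughly like this. It assumes the WHEA-Logger event messages have been exported as plain text and that each message contains a line like `Processor APIC ID: 32`; check your own event log for the exact wording. The sample messages are made up for illustration, and the ID still has to be translated to a physical core afterwards.

```python
import re
from collections import Counter

# Assumed wording of the field inside the WHEA-Logger message text.
APIC_RE = re.compile(r"Processor APIC ID:\s*(\d+)")


def tally_apic_ids(event_text):
    """Count how many logged events mention each APIC ID."""
    return Counter(int(m) for m in APIC_RE.findall(event_text))


# Illustrative sample, not a real log.
sample = """
A corrected hardware error has occurred.
Error Type: Translation Lookaside Buffer Error
Processor APIC ID: 32

A corrected hardware error has occurred.
Error Type: Cache Hierarchy Error
Processor APIC ID: 40

A corrected hardware error has occurred.
Error Type: Translation Lookaside Buffer Error
Processor APIC ID: 32
"""

for apic_id, count in tally_apic_ids(sample).most_common():
    print(f"APIC ID {apic_id}: {count} event(s)")
```

The IDs that show up most often are the first candidates for more vcore or a lower multiplier.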
 
APIC ID isn't the physical core.
To get the physical core, you need to link the APIC ID to the core by using "CPU-Z" "report -->save" (either text or html).

Both of your comments help greatly here - I can do this. All of my corrected WHEAs have primarily been APIC ID 32, 40 or 48, but I didn't know how that corresponded.

For @manyhats, I think on the Asus equivalent I might be able to manually increase an offset on that core's adaptive voltage. But lowering the multiplier is easier. I don't really know how to bring the max voltages down, re: my last post going over the weird max VID behaviour.
 
Okay so according to CPU-Z my APIC ID 32 is "Core 4 (ID 12) Thread (8) / Thread 9".

APIC ID 40 is Core 5 Thread (10) / Thread 11, and 48 is Core 6 Thread (12) / 13.
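Written down as a lookup, the mapping above looks something like this. A small sketch: the table holds only the three APIC IDs quoted from this one CPU-Z report, and other chips or BIOS settings may enumerate differently, so build your own table from your own report. The "P-core N" label is just the 0-based core number plus one.

```python
# APIC ID -> (core index, thread numbers), copied from the CPU-Z values quoted
# above. Only valid for this particular system; do not reuse blindly.
APIC_TO_CORE = {
    32: (4, (8, 9)),
    40: (5, (10, 11)),
    48: (6, (12, 13)),
}


def describe_core(apic_id):
    """Return the 0-based core label plus its 1-based position and threads."""
    core, threads = APIC_TO_CORE[apic_id]
    return f"P{core} (P-core {core + 1}), threads {threads[0]}/{threads[1]}"


for apic_id in (32, 40, 48):
    print(f"APIC ID {apic_id} -> {describe_core(apic_id)}")
```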

That's interesting. Since these start at 0, same as HWiNFO, it means the core HWiNFO reports as P7 (P-core 8) isn't the one throwing errors. P7 is far and away my HOTTEST core (I locked its ratio at 58x max just to keep it from causing problems while doing all this) and maxes out faster than the others, for example hitting red at 96c while P0 would be at 81c or similar in that same time frame. The errors are actually coming from Cores 4 (5), 5 (6) and 6 (7), which are more in the middle ground. These will be around the 90c mark, maybe 92~, if P7 is at 96c.

This is referring to corrected WHEAs (Parity and TLB) during the first few seconds of R23 when the VCore is too high and the CPU rockets to the high 90s instantly, specifically P7.

If I keep it out of the red, in the high 80s/low 90s, I pass R23. I've never tried for long, but 3-5 passes back to back is to be expected, unless the VCore is too low (for me that's VCore 1.208, VID 1.212~1.215), in which case it will WHEA or throw a program error. * NO FULL BSOD YET *.

----

I found last night that my vmin to pass R23 (like 3-4 runs) would be 1.217 showing, with VID reporting 1.222~. That was out of the red early in the night when it was cool, and I'd get like 41.6k points with High Prio, 56x/44x/46x. Later, those same values were redlining the heat and throwing corrected WHEAs, so it must have been much hotter in my room. I need to re-paste my CPU today and check at the same settings.

I have a good understanding of my R23 full-load voltages; it's just the VF/light-load voltages I'm trying to figure out for GB6 infinite stability. Especially WHY 1.474+0.050 idles with peaks of 1.578, when it should be around 1.524 maximum...
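To put a number on that gap, here is the arithmetic as a tiny sketch. All three inputs are the figures from the post; nothing here explains the cause, it only shows how far the idle peak sits above the simple "native VID + offset" estimate.

```python
# All values in volts, taken from the post.
native_vid = 1.474          # VF 10 native VID at 6000 MHz
offset = 0.050              # offset added on that VF point
observed_idle_peak = 1.578  # highest idle reading reported

expected_max = native_vid + offset
excess_mv = (observed_idle_peak - expected_max) * 1000

print(f"expected max: {expected_max:.3f} V")
print(f"idle peak is {excess_mv:.0f} mV above that")
```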
 
That's why I mainly use GB for stability testing. I always need 20-30 mV more on all V/F points compared to just testing CB23 vmin.
It's easy to test specific frequencies by setting up different power plans with different max frequencies in Windows. Normally 5800 and higher fail. Using TVB without controlling the voltages especially leads to GB failures.
 
Yup gb6 has been great... it's really low load too.

Ah, like if you aim for some 61x profiles at the end but it's kinda mixed with 58x6 (e.g. mine), you can start off by just capping all cores to 58x and seeing if there are errors there; then you know you only need to up the VF points related to 58x and below? Actually a really simple way of approaching it..

I never know what's failing for me - is it 60/61x? Or maybe 57 or 58? I might go ahead and do this..
 
Work your way up, not down. Far easier to come to a conclusive result instead of an ambiguous one.
And don't just rely on GB6. Shader caching and rendering/encoding are incredibly accurate tests as well that don't push the CPU too hard.
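The "work your way up" idea can be sketched as a simple loop. This is only an outline: `is_stable` is a hypothetical stand-in for the manual step (set the cap in BIOS, loop GB6 and your other tests, check for WHEAs or crashes), and the fake version below with its 59x threshold is invented purely so the example runs.

```python
def first_unstable_cap(caps, is_stable):
    """Walk ratio caps from low to high; return the first that fails, or None."""
    for cap in sorted(caps):
        if not is_stable(cap):
            return cap
    return None


# Stand-in for real testing: pretend everything up to 59x passes.
def fake_is_stable(cap):
    return cap <= 59


print(first_unstable_cap(range(58, 62), fake_is_stable))
```

Going upward means every cap below the failing one has already passed, so the first failure points at one specific ratio instead of leaving you guessing between several.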
 
Yeah I remember you telling me about Premiere and Hogwarts. Last time I tried those they all worked fine xD. I used Premiere to see if my IMC was the problem back during the IRQL BSODs, to rule it out and determine it was just VCore.
 
Repasted my CPU! I did not change any settings since before doing so. The R23 runs that would hit the 96c red thermal limit just 2 seconds into the test and instantly throw corrected WHEA parity & TLB errors (1.217-1.225 vcore reading, 1.222-1.229v on actual VID during) at 56x/44x/46x now hit 90c on my worst/hottest core (P7) and go down to 79c on the coolest core. I still have a pretty significant core-to-core discrepancy though - P0 = 79c, P1 = 81c, P2 = 79c, P3 = 89c, P4 = 82c, P5 = 88c, P6 = 82c, P7 = 90c (the E-Cores are all around 73-79c, much more even).
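For a quick look at how uneven those P-core temperatures are, a small sketch using the eight readings listed above (nothing else is assumed):

```python
# Per-core peak temperatures (degrees C) from the R23 run after repasting.
temps = {"P0": 79, "P1": 81, "P2": 79, "P3": 89,
         "P4": 82, "P5": 88, "P6": 82, "P7": 90}

hottest = max(temps, key=temps.get)
coolest = min(temps, key=temps.get)  # first of the tied 79c cores
spread = temps[hottest] - temps[coolest]
average = sum(temps.values()) / len(temps)

print(f"hottest {hottest} at {temps[hottest]}c, coolest {coolest} at {temps[coolest]}c")
print(f"spread {spread}c, average {average:.2f}c")
```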

But yes, I did 3 passes just now after repasting the CPU on those exact same load voltage settings and instead of max temp instantly & corrected wheas, each one passed with less wattage/amps (about 285w, 231A) than before without error (yet).

So that seems to fix the load voltage wheas, they were indeed thermally induced. No degradation on load either, because 1.217v is basically just about what I needed back in the day too, after including ram oc (it could go down as low as 1.204 or something before including XMP).

Now it is time to.. I guess limit my OCTVB core oc to "58X" maximum on all cores for now, and run GB6 over and over to see if the "Auto" value @ current acll/VF curve is fine for 56->58.
 
There you go. Case closed.
 
Yup!! I'm really surprised though, because don't people run things like R23 on these at stock (or similar) at 95-100c constantly and not WHEA?

Maybe the cores have become more sensitive to temperature as time has gone on? Either way it is a great relief. I ONLY have to worry about the transient OCTVB stuff again. And it is a minor instability: I've gone whole days playing games without a WHEA or crash so far during all of this, and it is only GB6 that triggers it.

I really still cannot find the cause of my idle vcore being SO high though! I even read like 10 pages of the 13900k guide, and saw some people mention that enabling manual Ring made idle voltage high (1.5 - but mine is going as high as 1.58!), so I left Ring Down Bin enabled and left minimum Ring on Auto. Now my idle voltage bounces from 0.7X to 1.58, rather than 1.19-1.58, which is better I guess. But it SHOULD only be 1.524~ tops! (1.474+0.050)
 
Thermal instability is exponential. I already hinted towards this when talking about how the rate of degradation gets worse the higher the multiplier you go.

Your idle voltage will spike high because your chip needs a lot of Vcore to run 61x for low/idle loads. You can safely ignore those spikes as the current is low.
If you pull your highest multiplier down to 58x or less, you'll find the Vcore spikes to be lower as well.
 
I see, I did read it before. I guess I just didn't feel or recognize the impact of it. Outside of these benchmarking times my CPU is never pushed to, or anywhere close to, more than 80-85c in normal usage. So the 8 months or so (a bit more probably) of running things like 61x and 62x at only 1-4 core voltage loads has made the chip more sensitive to high temperature than before. It makes sense, and now I know it too, plus I also know that all-core hasn't degraded.

I have set a 1600 IA VR voltage limit just to control the spikes and ensure nothing ever goes beyond that, even at light load. Oddly enough, though, I notice that if I drop it to something like 1550, the max voltage becomes 1.53~ instead and scores drop. Everything kind of shifts down, but there remains this buffer of voltage between the max and the true IA VR limit I set.
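The "buffer" can be put into numbers with a quick sketch. It uses the approximate readings from the posts and assumes the ~1.58 peaks were seen with the 1600 limit in place; both readings are rounded, so treat the result as a ballpark.

```python
# (IA VR voltage limit in mV, approximate max VID seen in volts), from the posts.
observations = [
    (1600, 1.58),
    (1550, 1.53),
]

for limit_mv, max_vid in observations:
    buffer_mv = limit_mv - max_vid * 1000
    print(f"limit {limit_mv} mV -> max {max_vid:.2f} V, buffer ~{buffer_mv:.0f} mV")
```

Both cases leave roughly the same gap under the limit, which fits the "everything kind of shifts down" observation.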

ASUS has a final VF Curve #11 for the "OC Ratio" aka maximum set multiplier. Right now it's on Auto, but it has the same factory VID as VF #10/9, which is assigned to 6000 MHz at a native 1.474. I figured that 11 just "inherits" what I set on #10 if I leave 11 at Auto. But I guess it's deciding some things on its own when at idle.

When I run "Pi 1M" or any similar single-core based workload, 90% of the VID reports "accurate" offset numbers - 1.53~ max with rare spikes beyond to 1.55.

Will give it a rest tonight and then tomorrow work from bottom to top like you suggested, and lock my OCTVB to 58x max first, and see if "auto" vf voltages cause issue in GB6.
 
Honestly, unless you absolutely need 60x+ performance for some reason, stick with safe longevity at 57/47/52x and call it a day until the next generation you decide to upgrade to.

That's a multiplier you can daily on all-core all-day without needing to make compromises with OCTVB or whatever. And it doesn't need a whole lot of voltage.

Voltage scaling is awful once you break past those multipliers. Exponential increase. Incredibly inefficient use of power unless you're on LN2.
 
My E-Cores suck. When I was first dialing in the lowest stock vcore with ACLL tuning (before OCTVB at all) I could do 45x E-Cores in R23 (with only a 5-10w power increase), but 46x crashed just getting into Windows a few times and would insta-crash R23. My Ring.. well, I never took my ring past 48. Right now it's 46, but I intend to bring it back up to at least 48x after finally kicking out the intermittent/infrequent GB6 program error or WHEA correction. L2 = 1.325v, SA = 1.275v (up from 1.315/1.250 when you suggested SA would stabilize the high-heat core situation; now that I've repasted, I can bring those back down again).

As for the former... it's just an adhd thing. I don't like those numbers. All core has to be either 56 or 58x.. Cache 46 48 50 etc... Numbers like 61, 73, 77, etc don't bother me tho haha. But we sure as hell aren't getting to 73x ratios! More than likely, gaming performance - stutters, frame time dips, etc - would be better with an all core static situation. But... I dunno. It's cool to look at the end result and see the tuning.. That instead of stock 60x on 2 cores I can do 61x on 4. That kind of thing. The only thing that concerns me is as you said even on "light load ie. gaming" 61x set to 1.474+0.050~ offset on the VF curve can degrade over time even if Amperage and Wattage are not high.
 
I think you mean OCD, because I have ADHD as well and odd numbers don't bother me, nor should they bother you.

These chips are not meant to be run above 58/47/52x. It's just that Intel wants to keep one-upping AMD, so they set these chips to use ridiculous amounts of Vcore and let their warranty cover people.
 
Yes that's the one XD. Always trying to arrange or tidy things goes all the way back to my Runescape days, where I hated being on certain levels because they looked bad. As for the latter part, mmm, that's why I never entertained the idea of running more than 56x in all-core workloads. Not only would my cooling not handle it in tests, it is just too much. This light-load tinkering is the only thing I think about. Even with my Ring, I know you say it is free up to 50 or similar, but the performance difference is not that big, so I figure a safe 46 or 48 is just fine without infringing on stability.

I was reading your and CarSalesman's posts from many pages ago regarding the PLLs as a means of reducing VCore requirements (at least in a full-load situation; I have no idea how it would work for all the per-core adaptive shenanigans). I'm a bit puzzled about how the E-Core PLL interacts, but I have a good understanding of the Core PLL & Ring PLL from your discussion with them. I haven't touched the SA or IMC PLLs and I probably won't, considering everything RAM related is working fine and I don't want to upset that at all. But to tune the Core PLL load voltage, I'd simply need to back my vcore down to something 'almost BSOD unstable' (in my case, a 1.208v reading in R23) and then... start at 1.10 Core PLL? Or work up from the stock value a little bit at a time? And then in Ring's case, since they share VID, whenever Ring becomes unstable, i.e. won't play Hogwarts or compile, try raising the Ring PLL instead, right?

Right now my Core & Ring PLLs are 0.99. I remember reading something very early on (it might have been CarSalesman themselves, or Dainluke) saying to just start there, as it was a 'pretty much always better' situation.
 
Ring clock up to 50x is virtually free if your Vcore is in the 1.30V+ range. So test and take advantage of it. No reason not to capitalize on it.
Could probably even test up to 52x if your chip's got good ring binning.

Just think of PLL as like, decimal voltages of Vcore. You can try increasing them to see if they help with stability.
In my experience, only increasing the Core PLL actually showed an effect. The rest provided nothing.
 