
CoreCycler - tool for testing single core stability (e.g. Curve Optimizer settings)

#1 ·
Over the last couple of days and weeks I've been working with the Curve Optimizer for Ryzen processors a bit more, but I hadn't found a good way to test the settings for stability. Cinebench single-threaded almost always passed, and getting Prime95 stable with load on all cores was also relatively quick. Waiting for crashes while idling or while playing a game wasn't so appealing either, and on Reddit someone even suggested using the Windows Repair as some kind of stability test, which didn't seem like a good idea to me.

So this sparked the idea for this tool. It's a PowerShell script which starts up an instance of Prime95 with only a single worker thread, stressing only a single physical CPU core. It cycles through all the available cores after an adjustable time, so that you can run this tool e.g. overnight, and the next day you can check which cores have run fine and which ones have thrown an error in Prime95.

By now it looks polished enough for a release, however so far I'm the only one who has tested it, so additional reports are welcome.

You can find it here:


To execute it, simply double click the "Run CoreCycler.bat".
And be sure to read the included readme.txt as well as the config.ini (or config.default.ini) to get a grasp of what settings you can change.
(Note: the config.ini will be auto-generated on the first start from the config.default.ini)


Screenshots of the script in action:


And here's an example of a summary, showing how the testing went for me during development. As you can see, it still takes quite some time to get stable. (And no, the summary will not be generated automatically, you'll still have to do this yourself ;)):



Here's an excerpt from the readme.txt

This little script will run Prime95 with only one worker thread and sets the affinity of the Prime95 process alternating to each physical core, cycling through all of them. This way you can test the stability of your Curve Optimizer setting for each core individually, much more thoroughly than e.g. with Cinebench or the Windows Repair, and much easier than manually setting the affinity of the process via the Task Manager.
It will still need a lot of time though. If for example you're after a 12h "prime-stable" setup, which is common for regular overclocks, you'd need to run this script for 12x12 = 144 hours on a 5900X with 12 physical cores, because each core is tested individually, and so each core also needs to complete this 12 hour test individually. Likewise, on a 5600X with its 6 physical cores this would be "only" 6x12 = 72 hours.
Unfortunately such an all-core stress test with Prime95 is not effective for testing Curve Optimizer settings, because the cores cannot boost as high if all of them are stress tested, and therefore you won't be able to detect instabilities that occur at a higher clock speed. For example, with my CPU I was able to run a Prime95 all-core stress test for 24 hours with an additional Boost Override of +75 MHz and a Curve Optimizer setting of -30 on all cores. However, when using this script, and with +0 MHz Boost Override, I needed to go down to -9 on one core to have it run stable (on the other hand, another core was still happy with a -30 setting even in this case).
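
The runtime arithmetic from the readme excerpt can be sketched in a few lines of Python. This is only an illustration of the multiplication described above; the function names and the 10 minutes per core per iteration in the usage example are assumptions, not part of CoreCycler.

```python
import math


def total_test_hours(physical_cores, hours_per_core):
    """Wall-clock hours needed when every core is tested one after another."""
    return physical_cores * hours_per_core


def iterations_needed(hours_per_core, minutes_per_core_per_iteration):
    """How many full passes over all cores it takes until each core
    has accumulated the desired test time (rounded up)."""
    return math.ceil(hours_per_core * 60 / minutes_per_core_per_iteration)


if __name__ == "__main__":
    print(total_test_hours(12, 12))   # 5900X, 12 h per core
    print(total_test_hours(6, 12))    # 5600X, 12 h per core
    print(iterations_needed(12, 10))  # assumed 10 minutes per core per pass
```

With the numbers from the readme, a 5900X needs 144 hours in total and a 5600X needs 72 hours.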

When you start the script for the first time, it will copy the included config.default.ini to config.ini, in which you then can change various settings, e.g. which mode Prime95 should run in (SSE, AVX, AVX2, CUSTOM, where SSE causes the highest boost clock, because it's the lightest load on the processor of all the settings), how long an individual core should be stressed for before it cycles to the next one, if certain cores should be ignored, etc. For each setting there's also a description in the config.ini file.

As a starting point you could set the Curve Optimizer to e.g. -15 or -20 for each core and then wait and see which core runs through fine and which throws an error. Then you could increase the setting for those that have thrown an error by e.g. 2 or 3 points (e.g. from -15 to -13) and decrease those that were fine by 2 or 3 further into the negative (-15 to -17). Once you've crossed a certain point however there is no way around modifying the value by a single point up/down and letting the script run for a long time to find the very last instabilities.
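
The adjustment step described above could be sketched like this. It is a rough illustration and not something the script does for you: the function name, the dictionary layout and the -30..+30 clamp (assumed to be the usual Curve Optimizer range) are assumptions.

```python
def next_curve_offsets(offsets, failed_cores, step=2, lowest=-30, highest=30):
    """Return new per-core Curve Optimizer values for the next test run.

    offsets:      dict of core index -> current offset (e.g. -15)
    failed_cores: set of core indexes that threw an error in the last run
    step:         how many points to move per round (2 or 3 at first, 1 near the limit)
    lowest/highest: assumed valid range of the Curve Optimizer setting
    """
    result = {}
    for core, value in offsets.items():
        if core in failed_cores:
            # Unstable: back off towards zero / positive values
            result[core] = min(value + step, highest)
        else:
            # Ran fine: try going further into the negative
            result[core] = max(value - step, lowest)
    return result


if __name__ == "__main__":
    current = {0: -15, 1: -15, 2: -15}
    print(next_curve_offsets(current, failed_cores={1}))
```

In the example, core 1 failed and moves from -15 to -13, while cores 0 and 2 move from -15 to -17.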

By the way, it is intended that only one thread is stressed for each core if Hyperthreading / SMT is enabled, as the boost clock is higher this way than if both (virtual) threads were stressed. However, there is a setting in the config.ini to enable two threads as well.
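
To illustrate what "one thread per core" means for the process affinity, here is a minimal sketch. It assumes that with SMT enabled the two logical CPUs of core n are numbered 2n and 2n+1, which matches log lines posted later in the thread such as "Core 6 (CPU 12)" and an affinity of 2 for CPU 1. The function names are made up for this example and are not taken from the script.

```python
def logical_cpu(core, smt_enabled=True, second_thread=False):
    """Map a physical core index to a logical CPU index.

    Assumption: SMT siblings are adjacent, i.e. core n owns CPU 2n and 2n+1.
    """
    if not smt_enabled:
        return core
    return core * 2 + (1 if second_thread else 0)


def affinity_mask(cpus):
    """Build an affinity bitmask with one bit set per logical CPU."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    cpu = logical_cpu(6)                  # first thread of core 6
    print(cpu, affinity_mask([cpu]))      # CPU index and its bitmask
    both = [logical_cpu(0), logical_cpu(0, second_thread=True)]
    print(affinity_mask(both))            # both threads of core 0
```

Stressing only the first thread of a core corresponds to a mask with a single bit set; enabling two threads would set two adjacent bits.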
 
#2 ·
After hours of testing with y-cruncher, the TM5 memory test and even Prime95 AVX multi-threaded without issues, I tried your script without any modifications... To my surprise I got rounding errors at stock, after a clear CMOS + load defaults!

I was thinking my CPU was stable, but it seems it's actually not!

Regarding the script, it works well. Thanks for sharing and keep up the good work!
 
#52 ·
Same here, I found out that one of my best cores needed +9 to run without errors (5800X). I really don't know if I should send it in for RMA, because the core that was having the problem was my preferred core in both software and hardware, and I noticed that most of the time it was receiving way less voltage than what I think the "main core" should receive.
 
#3 ·
Already been done.
Single core Prime95 test script for Zen 3 curve offset tuning
But that one focuses on a single FFT size, with some people reporting that their CPU errors more quickly with different FFT sizes, which could be one advantage of your version of the test, even if it takes a bit longer to run.

I find that after passing that single-threaded Prime95 test and y-cruncher, my 5800X will still quickly blue screen with much lower curves when using the AIDA64 cache benchmark, so I will give your test a try and see what it finds. Thanks.
 
#5 ·
This doesn't help you detect core problems at non-peak frequencies/voltages though, so at best it only covers a small part of the range where undervolting can cause crashes. It seems like we'd really need AMD to provide us a way to "lock" a core at an arbitrary point on its boost curve so we could test the whole range.
 
#6 · (Edited)
I gave it a run for two passes of your Prime95 test, but then today I gave the AIDA64 cache benchmark another go and the system rebooted, so I dropped the offset for all cores 1 notch below these values.
This 5800X seems to allow decent curves up to 4875MHz, but at 4900MHz they plummet, and that is what is required to hit the max WHEA-free IF on this CPU.

101x48.5=4898.5MHz max and distance below max during per core prime95 single thread.
3x scaler LLC off
CORE #0 13 CPPC=150, Curve-11 =1304mV max 15.510W
CORE #1 9 CPPC=146, Curve+1 =1384mV -50MHz 18.136W
CORE #2 6 CPPC=139, Curve-5 =1385mV -10MHz 18.814W
CORE #3 0 CPPC=127, Curve-5 =1374mV -40MHz 18.276W
CORE #4 3 CPPC=135, Curve-6 =1386mV -20MHz 17.573W
CORE #5 5 CPPC=143, Curve-3 =1384mV -20MHz 17.499W
CORE #6 5 CPPC=131, Curve-2 =1364mV -100MHz 16.470W
CORE #7 12 CPPC=150, Curve-17 =1271mV max 15.179W

For comparison, this is what was stable at just 50MHz less, where Prime95 stability testing was good enough, as AIDA64 doesn't crash from tweaking curves until it goes over 4850MHz.
Core #1 seems to be the only one that errors in Prime95 before crashing AIDA64 at the higher frequency.
CORE #0 13 CPPC 150 -14 =1262mV
CORE #1 9 CPPC 146 -9 =1323mV
CORE #2 6 CPPC 139 -22 =1245mV
CORE #3 0 CPPC 127 -22 =1263mV
CORE #4 3 CPPC 135 -17 =1286mV
CORE #5 5 CPPC 143 -23 =1238mV
CORE #6 5 CPPC 131 -22 =1275mV
CORE #7 12 CPPC 150 -23 =1220mV
With these curves all cores were able to maintain 4850MHz during the single threaded prime95 test unlike at 4900MHz
 
#11 ·
My 5800X with the curve optimized/tested with CoreCycler is also able to maintain 4850 MHz during the single-core tests with CoreCycler/CB R20.

Single core at 4850 MHz seems to be a pattern with decent 5800X CPUs when not using the PBO Boost Offset.

I can't squeeze out any more MHz with the PBO Boost Offset (not even 25 MHz) without having to drop all curve values, which leads to a loss in multi-core performance.

In multi-core running CB R20 I'm getting 4750 MHz all-core.

I prefer to leave it this way (4750 MHz all-core with zero PBO offset) rather than having to lower all cores to get 25/50 MHz more in single core.
It's not worth it.
Multi-core scenarios are 90% of the use cases these days.
 
#9 ·
I ran CoreCycler several times in the last week. What I found is that after one night of passing all the tests, I cancelled the script, so the system was idle, and then I ran CB20, and then it rebooted. No BSOD and nothing else. I don't know, but I guess CoreCycler is fine for finding conflicts at extreme values, but it doesn't help to fix the reboot-on-idle issue. Has something similar happened to anyone?


 
#12 ·
With version 0.8 CoreCycler will support Aida64 and Y-Cruncher as additional stress tests. Although for Aida64 you'll need to manually download and extract the Portable Engineer version due to its license (I don't think I can include it).
There's also a new config switch which if set (which it is by default) will periodically suspend and restart the stress test, with which I hope to emulate the load change scenarios and therefore catch more instabilities. Also the test order can be changed, by default it's now alternating between core 1 on CCD1, core 1 on CCD2, core 2 on CCD1, etc for processors with more than 8 cores, or random for anything up to 8 cores. With that I hope to squeeze out some more MHz because the hot spots are distributed more evenly, which could lead to slightly higher boost clock as well.

It should be ready soon and I'm looking for beta testers for this version, so if you're interested, let me know. 🙃
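
A minimal sketch of the alternating test order described in this post, assuming 0-based core indexes and equally sized CCDs. The function names and the fixed "more than 8 cores" switch are only an illustration of the description above, not the script's actual code.

```python
import random


def alternate_order(total_cores, cores_per_ccd):
    """Interleave the cores of all CCDs: first core of CCD1, first core of CCD2, ..."""
    ccds = [
        list(range(start, min(start + cores_per_ccd, total_cores)))
        for start in range(0, total_cores, cores_per_ccd)
    ]
    order = []
    for position in range(cores_per_ccd):
        for ccd in ccds:
            if position < len(ccd):
                order.append(ccd[position])
    return order


def default_order(total_cores, cores_per_ccd, seed=None):
    """Alternate for more than 8 cores, otherwise a random permutation."""
    if total_cores > 8:
        return alternate_order(total_cores, cores_per_ccd)
    return random.Random(seed).sample(range(total_cores), total_cores)


if __name__ == "__main__":
    print(alternate_order(12, 6))   # e.g. a 12-core CPU with two 6-core CCDs
    print(default_order(8, 8))      # single CCD: random order
```

For a 12-core CPU with two 6-core CCDs this yields 0, 6, 1, 7, 2, 8 and so on, so consecutive tests land on different CCDs.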
 
#13 ·
Yes! I definitely want to try the new version. Again, y-cruncher helps me find errors in the curve very easily. I think it is far better than P95, because I could run hours of P95 alone or even with CoreCycler without any errors, but the errors are there: after cancelling the test I had a reboot on idle. So y-cruncher is very good, but I think it also takes other factors like memory into account. So if the new CoreCycler version stops and then starts the test again, that would be great. The other thing would be some kind of schedule. You know, running a test for 144 hours straight is almost impossible; maybe if it could run for 12 hours, stop and continue later, that would be fine, or perhaps only test one core and then stop.


 
#18 ·
Thanks for sharing!
y-cruncher seems to be the best and fastest option for error detection.
After 2 iterations at 10 minutes per core for each stress test, y-cruncher failed on core 6 both times, while Aida64 and Prime95 showed no errors.
After changing the core 6 curve from -23 to -22 and testing with y-cruncher again for 2 iterations at 10 minutes per core, there were no errors.

+ 22:00:10 - ...checking CPU usage: 4.17%
+ ...current CPU frequency: ~4815 MHz (130.1%)
+ Suspending the stress test process
+ Suspended: True
+ Resuming the stress test process
+ Resumed: True
+ 22:00:22 - ...checking CPU usage: 0%
+ 22:00:22 - ...the CPU usage was too low, waiting 2000ms for another check...
+ Process Id: 1368
+ 22:00:22 - ...checking CPU usage again: 0%
+ ...still not enough usage, throw an error
+ There has been an error with the stress test program!
ERROR: 22:00:28
ERROR: Y-Cruncher seems to have stopped with an error!
ERROR: At Core 6 (CPU 12)
ERROR MESSAGE: The Y-Cruncher process doesn't use enough CPU power anymore (only 0% instead of the expected 4.17%)
+ The stress test program is Y-Cruncher, no detailed error detection available
+ There has been some error in Test-ProcessUsage, checking
+ Trying to close the stress test program to re-start it
+ Trying to close the stress test program
+ Trying to close Y-Cruncher
+ Trying to gracefully close Y-Cruncher
+ Y-Cruncher closed
+ restartTestProgramForEachCore is not set, restarting the test program right away
22:00:28 - Trying to restart Y-Cruncher
+ Starting the stress test program
+ Starting Y-Cruncher
+ Trying to get the stress test program window handler
+ Looking for these window names:
+ ^.*00-x86\.exe$
+ 22:00:28 - Window found
+ Found the following window(s) with these names:
+ - WinTitle: Y-Cruncher - 00-x86.exe
+ ProcessId: 3820
+ Process Path: C:\Users\User\Desktop\CoreCycler-v0.8.0.0-RC3\test_programs\y-cruncher\Binaries\00-x86.exe
+ Filtering the windows for ".*00-x86.exe$":
+ - WinTitle: Y-Cruncher - 00-x86.exe
+ ProcessId: 3820
+ Process Path: C:\Users\User\Desktop\CoreCycler-v0.8.0.0-RC3\test_programs\y-cruncher\Binaries\00-x86.exe
+ Stress test window handler: 459910
+ Stress test window process ID: 3820
+ Stress test process: 00-x86
+ Stress test process ID: 3820
+ The Performance Process Counter Path for the ID:
+ \\desktop-nuhesut\process(00-x86)\id process
+ The Performance Process Counter Path for the Time:
+ \\desktop-nuhesut\process(00-x86)\% Processor Time
+ Y-Cruncher seems to have stopped with an error at Core 6 (CPU 12)
+ Alternating test order selected, getting the core to test...
+ Previous core: 6
+ The selected core to test: 1
22:00:30 - Set to Core 1 (CPU 2)
23:47:12 - Iteration 1
----------------------------------
+ Alternating test order selected, getting the core to test...
+ Previous core:
+ The selected core to test: 0
Notice!
Apparently Aida64 doesn't like running the stress test on the first thread of Core 0.
Setting it to thread 2 of Core 0 instead (Core 0 CPU 1).
23:47:12 - Set to Core 0 (CPU 1)
+ Setting the affinity to 2
+ Successfully set the affinity to 2
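
The "expected 4.17%" in the log above is consistent with one fully loaded thread on a CPU with 24 logical processors (100 / 24), which would fit a 5900X with SMT enabled. A small sketch of such a check follows; the 50% tolerance is an assumption for illustration, the script's real threshold isn't shown in this log.

```python
def expected_usage_percent(logical_cpus, threads_loaded=1):
    """Share of total CPU time used when only some logical CPUs are fully loaded."""
    return 100.0 * threads_loaded / logical_cpus


def usage_too_low(measured_percent, expected_percent, min_fraction=0.5):
    """True if the stress test uses clearly less CPU than expected.

    min_fraction is a hypothetical tolerance, not CoreCycler's actual value.
    """
    return measured_percent < expected_percent * min_fraction


if __name__ == "__main__":
    expected = expected_usage_percent(24)          # 12 cores with SMT
    print(round(expected, 2))
    print(usage_too_low(0.0, expected))            # stress test has stopped
    print(usage_too_low(4.1, expected))            # still running normally
```

A reading of 0% after the process was resumed therefore indicates that the stress test has stopped working, which is what the script reports as an error.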
Additional stress tests for higher frequencies may improve error detection. The y-cruncher 00-x86 test is one of the best options available compared to Aida64 and Prime95, but it can't reach peak frequencies.

e.g.
5900x +150mhz

[Attachment 2484266]

[Attachment 2484267]
 
#20 ·
Which exact tests did you use? Can you please share a screenshot of your y-cruncher settings?
Thanks :)
 
#21 ·
Here's v0.8.0.0 RC4, only use it if you're willing to beta test. 👆


@Theo164
It's interesting, I've never been able to have Y-Cruncher actually fail a stress test. It's either passing for me or rebooting the whole system. 🤷‍♂️
And regarding BoostTester, I've looked at it and its source code as well, and unfortunately I don't see a way I could use it for error testing. There's no error checking involved; all it seems to do is allocate a large array and then randomly access its entries. Or to quote the developer: "The goal of this function is to create a "100%" load at extremely low IPC. The best way I can think of to do this is by constantly stalling waiting for data from RAM."

I'm open to suggestions for other programs that I could use as a stress test. They'd need to be controllable from the command line though and give some sort of feedback if an error occurs (either just stop or preferably also generate a log file I could parse).
 
#22 ·
Now this is odd, my Core 5 managed to fail in Safe Mode within 4 iterations of Prime95 at 10 minutes per core.
While with y-cruncher, it managed to survive an hour in both regular Windows and Safe Mode. Could it be that core 5 crashed because other cores were being tested in the Prime95 test as well? Even though the total stress time of the core is the same in both cases?
Edit: Before stress testing it in Prime95 along with the other cores, I had also run it stable for around 1.5 hours in the same test with the other cores added to the ignored list.
 
#23 ·
Once you've passed a certain stability threshold, I've found that errors happen only infrequently. This is the whole reason why for all-core overclocks you'd generally do a 12 hour stress test overnight (or even longer), to catch such infrequent errors. The problem with single core boost behaviour and therefore single core stress tests is that for the same level of confidence you would need to run the stress test for 12 hours for every single core, so it's much more time consuming than an all-core overclock. Basically, 1.5 hours of total test time per core is nothing if you're looking for a rock-stable overclock.

I've had cores fail after 14 nights of running fine without an error.
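
One way to put numbers on "infrequent errors" is a simple model in which errors on a marginal core occur as independent random events at a constant average rate. That model is an assumption for illustration only, not a measured property of any CPU, and the rate of one error per 12 hours in the example is made up.

```python
import math


def detection_probability(errors_per_hour, hours):
    """Chance of seeing at least one error within the given test time,
    assuming errors arrive at a constant average rate (Poisson model)."""
    return 1.0 - math.exp(-errors_per_hour * hours)


def hours_for_confidence(errors_per_hour, confidence):
    """Test time per core needed to catch an error with the given probability."""
    return -math.log(1.0 - confidence) / errors_per_hour


if __name__ == "__main__":
    rate = 1 / 12  # hypothetical: on average one error every 12 hours
    print(detection_probability(rate, 1.5))   # 1.5 hours of test time
    print(detection_probability(rate, 12))    # 12 hours of test time
    print(hours_for_confidence(rate, 0.95))   # hours for a 95% chance
```

Under this assumption, 1.5 hours on a core that errors about once every 12 hours catches the problem in only around one run out of nine, while 12 hours catches it in roughly two out of three.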

[Attachment 2484410]
 
#24 ·
RC5 is now out, this is hopefully the last RC before I can push out the final version 0.8.
Again, only use this if you're willing to beta test.

 
#25 ·
Happy Easter!

Version 0.8.0.0 is now available.
It includes support for Aida64 and Y-Cruncher!

Download:
Releases · sp00n/corecycler


Changelog:
  • Updated Prime95 to 30.5 build 2
  • Support for Aida64 and Y-Cruncher! You can now define which stress test program to use (Note: see the readme.txt for more info on Aida64)
  • Restructured the config file to support Aida64 and Y-Cruncher. Old config files will not work!
  • Added a new "suspendPeriodically" setting, which is activated by default. This setting tries to simulate load changes by periodically suspending and resuming the stress test program
  • Added a new "coreTestOrder" setting, which defaults to "Alternate" (for more than 8 cores/2 CCDs) or "Random" (max 8 cores/1 CCD). You can also define a custom order for the sequence in which the cores should be tested (e.g. "5, 7, 5, 1, 0, 7, 4")
  • Added more presets for Prime95: "Moderate", "Heavy" & "HeavyShort". See the config file for additional explanation about these presets
  • The priority of the stress test process is now set to "High" to prevent other processes from "stealing" processing power and produce false alarms due to low CPU usage
  • Starting a stress test program will not steal the focus anymore (or give it back immediately to the last opened window)
  • The script will now try to close the stress test program after pressing CTRL+C, if it is still running and using CPU processing power
  • The title of the terminal window is now set to "CoreCycler"
  • The approximate CPU frequency is now visible in the verbose output (but unfortunately it's not as accurate as e.g. HWInfo64 or Ryzen Master, which is why it's not in the main output)
  • All logs are now in the /logs directory
  • Added an "analyze_prime95_logfile.ps1" script to the /tools directory, which can be used to determine the time it took for all FFT sizes to appear at least once for each iteration within a Prime95 log file
  • Added "CoreTunerX" to the /tools directory. See CXWorld/CoreTunerX for what it can do
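
As an illustration of the custom sequence mentioned for the "coreTestOrder" setting, here is a sketch of how such a value could be interpreted. Only the keywords "Alternate" and "Random" and the comma-separated example are taken from the changelog; the function name, the range check and the 0-based core indexes are assumptions, and the script's real parsing may differ.

```python
def parse_core_test_order(value, total_cores):
    """Return 'alternate', 'random' or an explicit list of core indexes."""
    text = value.strip()
    if text.lower() in ("alternate", "random"):
        return text.lower()

    order = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma
        core = int(part)
        if not 0 <= core < total_cores:
            raise ValueError("core %d does not exist on this CPU" % core)
        order.append(core)
    if not order:
        raise ValueError("no cores given")
    return order


if __name__ == "__main__":
    print(parse_core_test_order("5, 7, 5, 1, 0, 7, 4", total_cores=8))
    print(parse_core_test_order("Alternate", total_cores=12))
```

Note that a custom sequence may list a core more than once, for example to give a suspect core extra test time within each iteration.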
 
#26 ·
Hi, I'm trying this latest version and it's really good at finding core instabilities. I see that the new presets for P95 are described as being for Ryzen 5000, why is that? What changed compared to the previous presets SSE, AVX, ...?
I found that I pass the test with SSE, but with the same settings I couldn't pass with the Moderate or Heavy presets. I'm using these last two presets to stabilize my curve.

Thanks!



 
#28 ·
Hi, I read the German site with the overclocking guide. I could pass several Heavy and HeavyShort cycles, however in the Moderate test I found instabilities after several iterations. At first I thought that I hadn't estimated the time per core iteration correctly. Then I set 20 minutes per core, and after reading the German thread, where the author of the guide clarifies that this test can be affected by overclocked memory, I think that would be my case. I already have a memory overclock, and even though it is stable, maybe it conflicts with the test.

Regarding y-cruncher, I read here on OCN that the best tests for finding instabilities quickly are 15 and 16. Maybe they are for multi-core; do you think you can add them to CoreCycler?

Is AIDA good for this kind of testing? I have the Extreme version and I could buy the Engineer one, but I don't know if it's worth it for this specific test. What is the benefit of AIDA vs. P95 and y-cruncher?

Thanks!


 
#29 ·
Yes, every memory overclock can also have an effect on CPU stress testing. Therefore the best approach is to test memory and CPU overclocks separately, and only combine them once both have proved to be stable. Otherwise you cannot really be sure if an error happened due to an unstable CPU or an unstable memory overclock. Reduce the number of variables.
And even a thoroughly tested memory overclock could have an impact on CPU overclocking, as the IMC gets stressed more when the memory is overclocked, which could in turn reduce the OC capabilities of the CPU, as more heat & more voltage for the IMC could potentially affect the cores themselves as well.

Tests 15 and 16 on y-Cruncher are N32 and N64, both are already part of the preconfigured tests there.

And regarding Aida64, its main advantages are that a) it's simply another stress test with different instructions and b) that with e.g. the CACHE stress test, you can achieve higher boost clocks on your cores compared to Prime95 and y-Cruncher, which could expose even more of the edge-case instabilities.
Also, you don't necessarily have to buy the Engineer version for stress testing; you can use the portable trial version for free for 30 days, which should be enough to determine if your overclock is stable or not.
 
#30 ·
I'm probably being stupid.
But when I run the BAT I see loads of errors.
What am I doing wrong?

Add-Content : Could not find a part of the path 'C:\corecycler-0.8.0.0\corecycler-0.8.0.0\logs\CoreCycler_2021-04-18_12-24-05_PRIME95_SSE.log'.
At C:\corecycler-0.8.0.0\corecycler-0.8.0.0\script-corecycler.ps1:455 char:9
  • Add-Content $logFileFullPath (''.PadLeft(11, ' ') + ' + ...
  • ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (C:\corecycler-0...PRIME95_SSE.log:String) [Add-Content], DirectoryNotFoundException
+ FullyQualifiedErrorId : GetContentWriterDirectoryNotFoundError,Microsoft.PowerShell.Commands.AddContentCommand
 
#34 · (Edited)
@Veii

As an Austrian you should be fluent in German, so have a read of the originals. (Maybe the translation isn't correct.)

When I write about TDP class limits, I mean the limits that AMD specifies, which are 142 / 95 / 140 for e.g. a 105W TDP CPU (default values confirmed by AMD).
The reason I don't recommend anything higher is that it should be left to people themselves what they enter there.

I just don't understand how you can claim that the limits are wrong, since all German reviewers have stated the same limits (at 105W TDP), so are all of them wrong?

What the motherboard manufacturers do on "Auto" is a different story. MSI has currently even added a new power table with the 1202 release, which sometimes applies and sometimes doesn't, depending on whether you change settings in the CBS or the OC menu (I couldn't find out exactly which one it was, but it could even be triggered by changing different settings in both menus at the same time).

I would still like to know which limits are correct in your opinion.
 
#35 · (Edited)
The Dilemma on this whole Question, starts with AMD's limited information sharing. Where each of the Sources get different values.
And in-case you where interested about the actual hard-limits & get permission to ask, it would require you to sign an NDA.
As for myself, i don't stand under any NDA ~ simply as NDA's and OCer don't fit together.
* But there is stuff i should not talk about

What i certainly can read out, are the stock values you requested and talked about.
Although these are indeed Arbitrary values, which do not follow any limits whatsoever :)
About the actual FUSE Limits (closer to hardlimits) nobody wants to talk about and denies their existance.
Neither Mr. Hallock nor other Engineers

On the question about the new changes on >1.2.0.1, including the 4 new active sensors & rewritten dLDO + FIT Q-Table, i do have some information *
(CTR's functionality does take them into account)

About the 5800X mentioned earlier,
As mentioned above, AMD positions it as a "Gaming CPU", and it bypasses its limit on SSE and AVX1 loads
It orients itself on tDIE and tJunction ~ aside from the FIT-Q range within the +/- dLDO_Injector range/headroom
So in short ~ it doesn't have stock limits like the rest of the lineup and defines them dynamically up to FIT
But I can go by the silicon health limits under harsh AVX2 loads, which should max out FIT-Q and leave no dLDO_Injection range (voltage injection)
(for example with y-cruncher or p95, but I find y-cruncher to be more aggressive)
* more information on this topic when I get a sample in front of me (1-2 days) ~ got a sample
** as mentioned in my earlier post, it bypasses its set PBO limits and only focuses on a thermal limit of 90°C

Everything is tested under PBO DISABLED @ 1800 FCLK / 3600MT/s
~ with 900-900-900-940-980-1100 (using realistic stock voltages)
(CPU VDDP, cLDO_VDDP, VDDG CCD, VDDG IOD, VSOC)

STOCK
5600X
PPT: 76W
TDC: 60A (calculated as my sample doesn't reach max TDC on stock without boost overrides)
EDC: 90A

5800X
PPT: 142W
TDC: 95A
EDC: 140A

5900X
PPT:
TDC:
EDC:

5950X
PPT:
TDC:
EDC:

EDC FUSE LIMITS:
5600X - 122A (120A with peaks to 122A)
5800X - 166A (160A with peaks to 166A)
5900X - 200A (210A with telemetry faking on MSI)
5950X - 200A (same thing as above)

SILICON HARDLIMITS @ Q1 2021
5600X
PPT: 160W
TDC: 140A
EDC: 185A

5800X
PPT: 170W
TDC: 140A
EDC: 195A

5900X
PPT:
TDC:
EDC:

5950X
PPT: 200W
TDC: 160A
EDC: 250A

Sadly there is a big difference between the silicon limits and the EDC FUSE limits defined by FIT (especially for 5600X users).
We consumers are still artificially limited. Before our units can ever reach the silicon FIT limits (dynamic), they get stuck at the EDC FUSE limit.
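To put numbers on that gap, here is a minimal Python sketch that only subtracts the EDC values already listed in this post. The dictionary layout and the function names `headroom` and `edc_gaps` are made up for illustration, and the blank 5900X/5950X entries stay blank (`None`) rather than being guessed.

```python
# EDC values in amps, copied from the STOCK, EDC FUSE LIMITS and
# SILICON HARDLIMITS lists in this post. None = left blank in the post.
EDC_LIMITS = {
    "5600X": {"stock": 90, "fuse": 122, "silicon": 185},
    "5800X": {"stock": 140, "fuse": 166, "silicon": 195},
    "5900X": {"stock": None, "fuse": 200, "silicon": None},
    "5950X": {"stock": None, "fuse": 200, "silicon": 250},
}


def headroom(low, high):
    """Return high - low in amps, or None if either value is missing."""
    if low is None or high is None:
        return None
    return high - low


def edc_gaps(limits=EDC_LIMITS):
    """Per CPU: distance from stock EDC to the fuse limit, and from fuse to silicon."""
    return {
        cpu: {
            "stock_to_fuse": headroom(v["stock"], v["fuse"]),
            "fuse_to_silicon": headroom(v["fuse"], v["silicon"]),
        }
        for cpu, v in limits.items()
    }


if __name__ == "__main__":
    for cpu, gap in edc_gaps().items():
        print(cpu, gap)
```

With the listed values, the 5600X has the largest distance between its fuse limit and its silicon limit, which matches the "especially 5600X users" remark.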

Please don't cross-compare AMD's specified TDP values with the Precision Boost values actually configured at stock.
I'm aware that MSI, Gigabyte & ASUS do telemetry faking, and have different predefined Auto values for the Scalar & "Motherboard Limits"
ASRock in comparison doesn't do telemetry faking & the BIOS is quite barebones. But because PBO by its nature modifies the V/F curve, it has to be explicitly DISABLED
* Motherboard limits do not bypass EDC FUSE limits

AMD's paper TDP limits should in theory trigger on PPT,
But in reality they focus on EDC (A)
where memOC and V-SOC have a big influence on our "boosting power budget", simply because they count towards this EDC limit.
So my advice,
~ Stay within the EDC FUSE limit, or manually extend the stock boosting limit ~ if you care about memOC without eating away the CPU boosting budget
* As for more accurate (higher) hard limits, I'm sorry but I should not speak about them ~ also because the dynamic FIT limits depend on the silicon quality :)
** When it comes to lab values, these remain a company secret. For us, only the EDC values matter when it comes to silicon health and integrity.
PPT you can see as a simple PSU hard limiter, or cooler capability limit. It is also a limit on notebooks ~ just that STAPM is also included in the limiting.
There are still things I should not talk about
Be it out of respect for the supportive people who helped me, or because the information is not 100% clear
I don't have any active NDA right now, but out of respect for the people who have one and still helped me ~ I prefer not to share everything :coffee:

But overall, I want to thank you @Verangry - for all the work you do and your support of the German community.
The powerplan for Matisse that YOU, L3tum & masterleros designed was one of the most valuable resources and inspirations for an upcoming powerplan for Vermeer :)
I did end up rewriting it fully and its current state is nearly 100% compatible with CTR and dLDO modifications, but it needs a bit more work and finally a non-broken 1202 update ~ to confirm stability on it

I've seen you helped on many other projects on ComputerBase and I think HWLuxx too?
Just want to say Thank You! Now that I could catch you~

But to stay on this thread's topic:
I personally don't think any user should be worried about their stock limits and voltage overrides when using PBO + CO
They should be worried about the boost override, which modifies the V/F curve once more
But they are free to run 175-145-400A, like I do, with lifted cTDP and Package Power Limits of 400W (CBS, SMU)
You won't be able to bypass the FUSE limit anyway, except with OC_Mode and CTR ~ move inside these limits or just pour LN2 on it :p

PBO adds a slight bit of all-core voltage, and it's good to keep a balance between higher input VID and lower VID through negative CO
I also prefer to keep AMD's CO limits (which can be figured out with CTR) and just change the positive and negative magnitude on all of them together (all-core CO does not wipe the CO values predefined by AMD)
But that's just "a different method of handling"
Redoing and improving AMD's CO values can still lead to better results ~ so I do support this project here fully :)

I will be finishing up this post in the next 24h & edit it
My supporters need more time to run these tests for me
Sadly I only have a 5600X here
Don't want to make you wait that long @Verangry :p but give me at least 24h more.
I would like to double-check some values before the rest is public
 
#37 ·
Thanks for this great tool @sp00n82, it helps a lot in finding a stable curve setting.

"The listed limits are wrong" Well, this is funny. You are only mentioning EDC, Veii. What about PPT and TDC? Verangry mentions TDP limits, which are not only about EDC. And of course, 142/95/140 ARE the stock limits of the 5900X/5950X. How you handle these limits is your choice of course. Maybe you should add a disclaimer "for high risk users only" to your statements...
 
#38 · (Edited)
I don't understand what you mean
The first approach was "they looked wrong"
The 2nd approach after manual testing, where I mentioned "I can retest it if needed" ~ they were surprisingly similar, but still slightly off

The 5900X and 5950X values are work in progress, and the post will be updated with the correct values
There is no space for personal flavor in the post above
These are the preconfigured arbitrary values as the BIOSes have them in their current state ~ where the real silicon limits are much higher, but at the time of writing "limited" for us enthusiasts

Although limited, the stock arbitrary values end up not mattering at all (if you read the whole post ~ which you could in two languages)
Simply "as everything will fall back to the EDC-FUSE limit, which you can not bypass by any method, unless going to OC-MODE = CTR or Allcore"

But I'll give you the point that it might have been too confusing for less trusting people, so I want to attach this little post here too:
However you turn it,
The end result is ~ "it doesn't matter what you set" :)
  • Stock values are arbitrary and do not follow AMD's paper limits either (TDP on paper ≠ PPT, but should be)
  • You can not exceed the EDC FUSE limit under Precision Boost, and that's where everything ends up / whether we speak of degradation, danger or anything range-wise
  • If you do want to know the near-highest limits a user "should orient on", then follow the hard limits posted in their current state. * They are in reality a bit higher, but that's a trade secret.
* Follow them in case ASUS users want to utilize dynamic OC mode switching, which should have at least some TDC limit in place.

That's about it,
I don't know what you misunderstood, but "limits don't matter"
Not on Vermeer at least.
"It doesn't matter", as there is a huge list of protectors and auto adjusters + cache boosters in place
Limiting them lower only gives away performance and hurts cache access time + L3 performance. A limit without reason, purely because of FUD (Fear under denial) :)

Please at least let me finish the remaining samples before you judge it
and maybe re-read it, to get my point
It really doesn't matter, and pre-1202, post-1.1.0.0D a higher EDC limit of 400A for example is needed in order to lift one of the limits and let cache boost up ~ to the limit of what FIT allows
Also NO, the CPU won't overvolt itself that way and degrade. It's "impossible" in Precision Boost mode to exceed the FUSE limits with FIT in place. Damaging overvolting is not possible

EDIT:
I really don't know how to put it in other words
  • They are slightly off for "stock arbitrary values" ~ but nearly correct
  • They have no weight as "limits to follow" and will change with future BIOS updates. Amperes pushed into silicon = the FUSE limits are the ones to follow, or the hard limits if you want to go madman mode and bypass FIT
In the end, the "limits" listed by Verangry are slightly off, and are no "limits"
They are a good orientation point every user can test for themselves, yes ~ but they are still a tiny bit off & that was the main point of my post

The "HardLimits" to the extent I can share & the explanation are just a little bonus :)
It all ends up at the FUSE limit as of the current date, and the hard limit at the silicon living-date.
They are a bit different from the listed ones, but it is a changing optimization topic ~ on which I shall not say more
 
#39 ·
Again, it's ok to draw your personal conclusion based on your findings, Veii. But I would never recommend "limits don't matter" to normal users as a general rule. Even AMD does not recommend exceeding these limits: you lose your warranty if you do, right? For me it's also pointless to ignore these limits when it comes to finding the best balance between performance and efficiency. You are playing around with these limits yourself, right? So it doesn't matter? Again: I understand your approach and it's absolutely ok to play around with these limits to find whatever you like, but you cannot claim this as a general rule.
 
#40 · (Edited)
Ehm, no
What you quote is also not what I wrote, and not the conclusion I drew
There is no space for personal flavor in the post above
No personal flavor or personal sense of safety

Thought I should add a bit more information,
The arbitrary limits set in place at stock are predefined to meet:
  • 1800 FCLK with 1.1v SOC as limit, as tested above
  • the CPU's capability within maximum rated boost (4.65 GHz for a 5600X, 4.95 GHz for a 5900X)
  • the predefined V/F curve, with the intention to hit 100% in specific workloads, so supplied voltage lowers and score increases
  • all of it, as in the first point, within SOC power draw

They remain arbitrary, however you want to turn it
Like on Matisse, on Vermeer too the performance increases if you hit the limits and let it lower voltage (optimally peaking at 98% TDC and 100% EDC in Cinebench R20)
I think we know at this point that undervolting not only helps to stay within the power budget but also helps to increase the remaining boosting power reserves
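To make the "98% TDC and 100% EDC" target concrete, here is a small Python sketch that turns peak current readings into a percentage of the configured limit. The limits are the 5800X stock values listed earlier in this thread; the peak readings and the function name `utilization_pct` are hypothetical and only there to show the arithmetic.

```python
def utilization_pct(measured_amps, limit_amps):
    """Return the measured current as a percentage of the configured limit."""
    if limit_amps <= 0:
        raise ValueError("limit must be positive")
    return 100.0 * measured_amps / limit_amps


# 5800X stock limits as listed earlier in this thread.
TDC_LIMIT_A = 95
EDC_LIMIT_A = 140

# Hypothetical peak readings from a monitoring tool, NOT measurements from the thread.
tdc_peak_a = 93.1
edc_peak_a = 140.0

if __name__ == "__main__":
    print(f"TDC: {utilization_pct(tdc_peak_a, TDC_LIMIT_A):.0f}%")
    print(f"EDC: {utilization_pct(edc_peak_a, EDC_LIMIT_A):.0f}%")
```

In practice you would feed in the peak TDC/EDC values your monitoring software reports during a Cinebench R20 run.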

If you push FCLK higher, your reserves get smaller and smaller
If you increase V-SOC, your reserves within the predefined arbitrary values (not limits) diminish further and further
These arbitrary limits are there for normal consumers at 1800 FCLK to hit the best efficiency-to-boost result
They are not there as any type of safety limit, but as a bonus for users ~ to get better results. And a bit as a safety net within the range of +30 CO.
If it wasn't designed that way, the RMA rate would be higher. As provided voltage chokes, provided frequency throttles if it doesn't fake it. User sees a chip not meeting boosting targets. User returns the unit.

If you work with CO, the whole story changes.
First of all, every CPU has its own predefined CO values, which are kept up to date by FIT. They, along with FIT-Q + the CPPC rating, will change according to user-influenced vcore, user-influenced scalar, and user-influenced all-core CO
Just by changing and playing with CO, you are technically already doing telemetry faking.
The arbitrary limits that are set at stock do not matter here, as you "fake" them and go around them
... so much for "safety for the user" ... :)

They end up arbitrary and not optimal even if you as a consumer run 1900 FCLK. But the FUSE limit, for example, is designed more loosely to meet the 1900 FCLK target.
I can remind you of the 1900 FCLK lock attempt on AGESA 1.1.0.0 C, before the Patch D ABL patch
(which doubled cache bandwidth and allowed dynamic boosting to function upon EDC and other limits)
I should also remind you of the overboost spike issues which hit 50 GHz and higher, because of the "not so optimal", "too early" dynamic boosting which was unlocked for users.
It's too big a book to try and read from, but again, the values that you try to follow were never meant for anyone who uses CO (VID faking) or any user who tries to run 1900 FCLK and higher
They have a reason to exist, but are "not limits"
Please understand that.
Users appear to have control over their CPUs, but their control is also nearly "only arbitrary" :)
Which I personally find very sad, and I continue with #FreeRyzen, demanding higher FUSE limits and no FCLK locks
Someday you'll understand what I mean. I can't force you to accept what I try to tell you, and can't push a viewpoint on you when your reality is different
Ty for the talk & read, tho. I'm happy that you took your time to read my little lesson :giggle:
 
#49 ·
With Prime95 I run this. Temps stay under 80°C and it's pretty good at finding unstable cores. :)

On my 5950x.

Code:
Log Level set to: ......... 2 [Writing debug messages to log file]
Stress test program: ...... PRIME95
Selected test mode: ....... SSE
Logical/Physical cores: ... 32 logical / 16 physical cores
Hyperthreading / SMT is: .. ON
Selected number of threads: 2
Runtime per core: ......... 6 MINUTES
Suspend periodically: ..... ENABLED
Restart for each core: .... OFF
Test order of cores: ...... DEFAULT (ALTERNATE)
Number of iterations: ..... 10000
Selected FFT size: ........ 128-128 (128K - 128K)