Overclock.net
1 - 20 of 40 Posts

Lgn · Registered · 18 Posts
Discussion starter · #1 · (Edited)
Hello guys!
I've recently (August) bought a new pc:

5800x
MSI MAG X570 Tomahawk WiFi (BIOS Version 7C84v17)
Corsair Vengeance RGB PRO Black DDR4-RAM 3600 MHz 32gb (4x8)
Corsair PSU RM850x
980 Pro nvme


I ran everything at default with only the XMP profile active, and everything was perfect. Then I decided to try PBO and do a bit of overclocking.
I set +50 MHz with EDC 130, PPT 130, TDC 90, and Curve Optimizer offsets of -10 on the best 2 cores and -20 on the others. Everything else was default.
I got some decent results and a small temperature decrease, and I was pretty happy with it since I didn't want a heavy OC. I ran some benchmarks (CB R20, R23, 3DMark, Prime95) without a single problem.
A couple of weeks ago my PC started to reboot pretty randomly (a couple of times while I was playing games, watching Netflix, opening some programs, or using Content-Aware in Photoshop) but never under heavy load or stress tests.
This is the infamous error:

"A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 12 (sometimes 13)


The details view of this entry contains further information."
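For anyone collecting several of these events, here is a minimal sketch of pulling the fields out of the message text so they can be compared across reboots. It only parses text pasted from Event Viewer; the function name `parse_whea_event` is made up for illustration, and the sample text is the message quoted above with the first reported APIC ID.

```python
import re

# Text of the WHEA event as quoted in the post above.
EVENT_TEXT = """A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 12
"""

def parse_whea_event(text):
    """Return the 'Key: value' lines of an event message as a dict."""
    fields = {}
    for line in text.splitlines():
        # Lines without a colon (such as the first sentence) are skipped.
        match = re.match(r"^\s*([^:]+):\s*(.+?)\s*$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2)
    return fields

event = parse_whea_event(EVENT_TEXT)
print(event["Error Type"], "on APIC ID", int(event["Processor APIC ID"]))
```

Running this on each logged event makes it easy to see whether the same APIC ID keeps showing up.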


At this point I was pretty sure my PBO settings were wrong, so I tried different ones to ease the pressure on core 6 (the one with all the WHEA errors). I set the negative offset on core 6 to 0 and reduced the other offsets as well.
Same problem: random reboots, with no reliable way to reproduce them.
I decided to try default BIOS settings with only XMP on. It didn't even last a couple of days before the error was back, this time while I was watching Netflix... Still no errors in stress tests like CoreCycler + y-cruncher for 4-5 iterations.
I also tried to disable Global C-states and set the Power Supply Idle Control to Typical.

N O P E. Same stuff.

Also did a TestMem5 with 1usmus cfg. No errors detected.

Right now I'm thinking of RMAing the CPU and getting a new one, but I wanted to try one last thing: setting a manual OC and changing the voltages manually.
I'm not an expert on this and I'd like some help from you.

These are the settings I'm using now:

    • CPU multiplier 45.5
    • Vcore OVERRIDE 1.35 volts
    • Vsoc OVERRIDE 1.1 volts
    • Load-line calibration MODE 6
    • CPB and PBO DISABLE
    • XMP ON profile 2 (3600 MHz / IF 1800 MHz)
    • DRAM voltage manual 1.35 volts
    • Global c states DISABLE
    • Windows power options on control panel balanced, cpu 10%-100%, hdd sleep 0, PCI-E energy saver OFF. Win 10 Power & sleep panel, Balanced.


I did a TestMem5 run and 2 iterations of CoreCycler, no problems so far. Max CPU temperature dropped by about 6-7 degrees, though temperature wasn't a problem before. Max temps were around 75 degrees during Cinebench and CoreCycler.
Here's a screenshot of my ZenTimings:

[Image: ZenTimings screenshot]



Any help will be REALLY appreciated. Thanks! ♥
 
Discussion starter · #2 · (Edited)
Never mind, instant reboot on Cinebench R20 with those settings... Yesterday it was fine, today instant reboot. W T F is going on.

I also noticed that since the day my PC started rebooting like crazy, I've been getting another Event Viewer error: ID 28

Error setting traits on Provider {8444a4fb-d8d3-4f38-84f8-89960a1ef12f}. Error: 0xC0000001
Microsoft-Windows-Kernel-EventTracing/Admin

I can't find out what it is.
 
Discussion starter · #4 ·
You are getting crashes. Your settings were never stable. Two iterations of corecycler is not a lot. APIC ID is the thread that failed ranging from 0-15, that corresponds with cores 0-7 where each core gets two threads.
Yes, I think so too, but I don't really understand how I went 2 months without a single problem and then all of a sudden it reboots randomly everywhere.
It's also absurd that I'm having problems at default settings too.

Yeah, the core getting all the errors is core 6. First I tried lowering its offset to 0 while I was still using PBO, with no results at all. Then I removed PBO completely: no success, same errors and reboots.

I'm really frustrated by this because I can't even work without thinking that it's going to reboot at any second.
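The APIC ID explanation quoted in this post (IDs 0-15 covering cores 0-7, two threads per core) can be sketched as a one-line mapping. This assumes SMT is enabled and that the IDs are numbered contiguously, as described in the thread; other CPUs may number them differently, and `apic_id_to_core` is just an illustrative name.

```python
THREADS_PER_CORE = 2  # assumption: SMT enabled, two threads per core

def apic_id_to_core(apic_id, threads_per_core=THREADS_PER_CORE):
    """Map a logical-processor APIC ID to a physical core index."""
    if apic_id < 0:
        raise ValueError("APIC ID must be non-negative")
    return apic_id // threads_per_core

# The two APIC IDs reported in the WHEA errors in this thread.
for apic_id in (12, 13):
    print(f"APIC ID {apic_id} -> core {apic_id_to_core(apic_id)}")
```

Both reported IDs land on core 6, which matches the core blamed in the posts above.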
 
These are my settings:
-6 -15 -20 -23 -11 -11 -20 -13
These are per core, from 0 to 7. This is with no PBO2, tested under CoreCycler over ~2 weeks. The LLC I used was "default". You are going to need to start from scratch, as that'll help you more than just guessing and hoping. I also recommend you write down the changes you make in a spreadsheet.
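As a rough illustration of the "write it down" advice, here is a small sketch that turns the offset string posted above into a per-core table and writes it out as CSV, which any spreadsheet can open. The column names and the note text are made up for the example; the offsets are the poster's values for their own chip, not a recommendation for any other CPU.

```python
import csv
import io

# Offsets exactly as posted above, one per core, cores 0-7.
POSTED_OFFSETS = "-6 -15 -20 -23 -11 -11 -20 -13"

def parse_offsets(text):
    """Turn a space-separated offset string into {core_index: offset}."""
    return {core: int(value) for core, value in enumerate(text.split())}

def write_log(offsets, note, stream):
    """Write one CSV row per core so each change can be tracked over time."""
    writer = csv.writer(stream)
    writer.writerow(["core", "curve_optimizer_offset", "note"])
    for core, offset in offsets.items():
        writer.writerow([core, offset, note])

offsets = parse_offsets(POSTED_OFFSETS)
buffer = io.StringIO()  # swap for open("co_log.csv", "w", newline="") to save to disk
write_log(offsets, "example values from another chip", buffer)
print(buffer.getvalue())
```

Appending a new set of rows after every BIOS change gives a history to roll back to when a setting turns out to be unstable.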
 
Discussion starter · #6 · (Edited)
What's the point of using a negative PBO offset if the CPU is not stable at default? Also, the last time I tried to change some settings in PBO it couldn't even boot into Windows.
I don't really understand; shouldn't I try some different vcore settings or something more "stable"?

Sorry, I'm probably missing something, and I'm kind of panicking because I'm losing a ton of work.
 
The point is to use a negative offset per core in Curve Optimizer. The numbers for my specific chip show that there can be a massive range between cores and their offsets. My best cores are 1 and 5. Core 0, being the absolute weakest, can only do -6. Going -10 and -20 like you did can easily cause stability issues.
 
Discussion starter · #8 ·
The point is to use a negative offset per core in Curve Optimizer. The numbers for my specific chip show that there can be a massive range between cores and their offsets. My best cores are 1 and 5. Core 0, being the absolute weakest, can only do -6. Going -10 and -20 like you did can easily cause stability issues.
Yes, I totally understand that, BUT my CPU is NOT stable even at default, without PBO. Why should a negative offset help me in this situation? If you lower the voltage curve even more, won't the system be even more unstable? Also, as I mentioned before, I literally cannot boot with a setting like -5 on that particular core; it reboots before even entering Windows.

Again, I have no idea why it worked with much more aggressive settings before. That's the main problem right now. Is my CPU suddenly trash?
 
One of my cores in my 5900X, the 3rd best one actually requires +10 to be stable. Also wanted to confirm that with PBO off, your overclock offset is back to 0?

If yes, look in your bios for DF C States. Disable that and turn global C States back on.
 
Discussion starter · #11 ·
One of my cores in my 5900X, the 3rd best one actually requires +10 to be stable. Also wanted to confirm that with PBO off, your overclock offset is back to 0?

If yes, look in your bios for DF C States. Disable that and turn global C States back on.
My motherboard on default has PBO "auto". I tried both Disabled and auto. Same results. Reboot at random times.
I'll try to search for that setting but i honestly didn't see it, is DF C States for IF?

Have you “set” your settings back to stock or did you load optimized default and start over with xmp? If the first try the second.
I tried to set everything at default (f6 in the bios) with xmp. Again, it did the same **** when i was watching netflix.


Right now i'm trying this:

Go into BIOS. Disable CPB, save, reboot. Go back into BIOS.
Then go to the PBO settings and
set PBO to enabled
Open the advanced tab and open AMD overclocking.
Select PBO here
set PBO to advanced and the limit to motherboard.
(The main thing) EDC - Set Spike VRM-out current limit to 200A.
(Just in case) PPT - Set the socket power limit to 130W. (depends on mobo)
(Just in case) TDC - Set the vrm thermal limit to 85. (depends on mobo and proc)
EDC is a temporary massive increase, PPT here a decrease. TDC depends on your cooling of VRMS and such.

Leave at zeros all the rest in that menu
Set Idle Voltage to Typical
Set Global C-states control to Disable
MAKE SURE THAT ECO MODE IS OFF.
NOW you can reboot, go into BIOS and set Core Performance Boost (CPB) back to On, everything should work.
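To put the three limits from these steps in context, here is a small sketch comparing them with stock values. The stock numbers (PPT 142 W, TDC 95 A, EDC 140 A) are an assumption: they are the commonly cited defaults for a 105 W TDP AM4 part like the 5800X and are not stated anywhere in this thread. `compare_limits` is an illustrative helper, not part of any tool.

```python
# Assumption: commonly cited stock limits for a 105 W TDP AM4 CPU such as the 5800X.
STOCK_LIMITS = {"PPT_W": 142, "TDC_A": 95, "EDC_A": 140}

# Values taken from the steps above.
GUIDE_LIMITS = {"PPT_W": 130, "TDC_A": 85, "EDC_A": 200}

def compare_limits(stock, custom):
    """Return {limit: percent change relative to stock}, rounded to 1 decimal."""
    return {
        name: round((custom[name] - stock[name]) / stock[name] * 100, 1)
        for name in stock
    }

changes = compare_limits(STOCK_LIMITS, GUIDE_LIMITS)
for name, change in changes.items():
    print(f"{name}: {GUIDE_LIMITS[name]} vs stock {STOCK_LIMITS[name]} ({change:+.1f}%)")
```

Under that assumption the output agrees with the description in the steps: EDC goes up sharply while PPT and TDC come down.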


I've read about a lot of people with a similar problem who fixed it this way.
What do you think? Does it make any sense at all?
 
My setup would do this with anything less than 1.12 vsoc

Yours is reporting at 1.087

ALSO if you override cpu core voltage (like me), you cannot run PBO or this will happen. Looks like you set yours to 1.35. I run 1.275
 
Discussion starter · #13 ·
My setup would do this with anything less than 1.12 vcore soc

Yours is reporting at 1.087

ALSO if you override core voltage (like me), you cannot run PBO or this will happen
By default it's 1.1v on my mobo, droop to 1.087. Do you think i should increase it?
Thanks so much for all the help guys, i really appreciate it!
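For reference, the droop between the 1.1 V set on the board and the 1.087 V reading can be worked out with a couple of lines. This is only arithmetic on the two numbers from this thread; `droop` is an illustrative helper name.

```python
def droop(set_voltage, measured_voltage):
    """Return (droop in millivolts, droop as a percentage of the set value)."""
    delta = set_voltage - measured_voltage
    return round(delta * 1000, 1), round(delta / set_voltage * 100, 2)

# 1.1 V set in BIOS, 1.087 V reported, as quoted above.
millivolts, percent = droop(1.100, 1.087)
print(f"SOC droop: {millivolts} mV ({percent}%)")
```

That works out to 13 mV, a bit over 1% of the set value.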
 
Wouldn't hurt. Looks like I am using 1.156 vsoc at 3800 CL14.

That's the happiest voltage for my board/RAM/CPU combo. If I were to run a lower speed, it would be happier at a lower setting like the 1.12 I mentioned, but that was at CL16 2400. Out of the box, it wasn't stable with all default settings on everything without increasing vsoc to 1.12.

I have also run up to 1.25 vcore at 3933 MHz CL19 and wouldn't recommend it, but that was the best voltage setting for that RAM speed.


 
By default it's 1.1v on my mobo, droop to 1.087. Do you think i should increase it?
Go to load-line calibration settings and set Mode 2 for NB/SOC and overcurrent protection to enhanced. That'll give you 1.1v on SOC.

As for WHEA errors, I was getting a rare (once every 3-4 days) WHEA 19 Bus/Interconnect error after I tightened timings on my 3600 CJR kit.
Setting VDDP 0.9v, VDDG IOD 1.06v and VDDG CCD 0.94v fixed it. Although it might change since errors were pretty rare in the first place.
 
Try this. Increase RAM voltage to 1.37v.
 
My motherboard on default has PBO "auto". I tried both Disabled and auto. Same results. Reboot at random times.
I'll try to search for that setting but i honestly didn't see it, is DF C States for IF?
DF C States should be in the Advanced menu, under CPU options. It is the lowest voltage idle state which does trigger idle reboots for some Zen3 CPUs, when the PC is not in use. The symptom would be coming back to your computer and seeing it at the Windows login screen. To me, it doesn't save much power for a desktop, probably more useful for laptops.
 
Discussion starter · #19 ·
"Go into BIOS. Disable CBP, save, reboot. Go back into BIOS.
Then go to the PBO settings and
set PBO to enabled
Open the advanced tab and open AMD overclocking.
Select PBO here
set PBO to advanced and the limit to motherboard.
(The main thing) EDC - Set Spike VRM-out current limit to 200A.
(Just in case) PPT - Set the socket power limit to 130W. (depends on mobo)
(Just in case) TDC - Set the vrm thermal limit to 85. (depends on mobo and proc)
EDC is a temporary massive increase, PPT here a decrease. TDC depends on your cooling of VRMS and such.
Leave at zeros all the rest in that menu
Set Idle Voltage to Typical
Set Global C-states control to Disable
MAKE SURE THAT ECO MODE IS OFF.
NOW you can reboot, go into BIOS and set Core Precision Boost back to On, everything should work."

OK, as I thought, this is completely useless; it still rebooted today.

Does the same happen with XMP disabled and stock bios settings?
I'm going to try disabling XMP next, but even if I get no errors that way I'm going to RMA the CPU anyway. I honestly don't want a 400 euro CPU that can't run XMP. It's ridiculous.
And the weird thing is that I had XMP on for 2 months without any problem...


DF C States should be in the Advanced menu, under CPU options. It is the lowest voltage idle state which does trigger idle reboots for some Zen3 CPUs, when the PC is not in use. The symptom would be coming back to your computer and seeing it at the Windows login screen. To me, it doesn't save much power for a desktop, probably more useful for laptops.
I don't see that particular setting anywhere; the only things I have are Global C-states and Idle Voltage. Maybe it has a different name on my motherboard? I dunno.
 