My system with a Ryzen 5000 CPU reboots with BIOS defaults

1,581 - 1,600 of 1,658 Posts
My 5950x NEVER so much as hiccuped under load, PBO on or off.

Every time it has happened, the CPU was idling or, at most, playing a YouTube video.

I've used it in many different combinations of PBO on/off and RAM at JEDEC/XMP; I even tried +50 mV on DRAM and +60 mV on the CPU. I have found no configuration that is stable.
I have an Asus B550-M, 2x32GB of 3600 RAM, Predator something-something, and a Seasonic 550W Platinum.

I am not even sure it is the CPU, but I cannot think what else it could be.

As I said, it has never, not once, failed when stress testing it to hell and back.
When you are stress testing and all the cores are working, they produce a lot of heat, and that thermally limits how high they can clock. When you are doing something like watching YouTube, the heat is low because most cores are idling, so the one or two cores that are active try to reach a much higher frequency.
 
What people?
Me. I got my 5950x about six weeks ago and I am having these same issues. Black screen crashes in idle (though my motherboard seems to think it is a RAM issue). I got it on Amazon, so maybe it is an old returned one that was being resold. I will check the number on the box when I get home from work this evening.
 
"Six weeks ago" and registered today? Tell me more. As you said, it may be old stock or a return they resold.

Keep us informed and good luck.

Fortunately you have this entire thread of info to help you. :)
 
The best option would be to check the batch info, but that is printed on the CPU itself, so you will need to remove the cooler. It is probably from 2020 or early 2021.
 
I bought a 5900x and Asus Dark Hero last weekend from Micro Center and have had a horrible time with it.

I upgraded from a 9900k. Full water cooling loop, great temps, B-Die ram, 3090 Kingpin, 1000W Seasonic PSU.

I've been getting occasional random restarts, especially at idle or doing something simple like watching a YouTube video.

Tried all of the common advice, loosened ram timings, adjusted voltage, disabled C-states, etc.

I'm about two random restarts from returning everything and going back to the 9900k even though I can use the additional cores.
 
Random restarts are probably down to PBO Curve Optimizer boost issues. Have you made sure it is set to default?
 
Yep, tried default, tried adding voltage instead of subtracting, tried Dynamic OC Switcher on and off... all sorts of things.

The random restarts are very hard to reproduce or predict, but I've also been getting instability with various settings in things like Prime95 or IntelBurnTest.

It's been a very frustrating slog. I loved AMD back in the day (early-mid 2000s) but jumped ship when they couldn't keep up with Intel. I was excited that AMD finally came back with this generation, but it's just been a heap of disappointment.

It's been more than a decade since I've had to do this much manual tweaking just to try to achieve stability. XMP settings for my ram seem to be impossible to achieve (G.Skill DDR4 3600 CL15), while it was incredibly easy with the Intel platform.

Ugh.

I'm frustrated, and unfortunately no advice seems to be worth applying at this point. I have to decide to either try to exchange for another 5900x or 5950x or return the entire thing. Micro Center does 15 days for CPUs and motherboards, so I'd have about a week to figure out if a replacement is fully stable should I go that route.

I'd heard horror stories of AMD drivers and stability, but it seemed that was mostly a thing of the past as well as primarily with the GPUs. I figured I'd have minimal trouble given that I've been building my own PCs since the 90s. No such luck.
 
I too have the Asus ROG Crosshair VIII Dark Hero motherboard for my 5950x, with new Crucial 2x32 3600 RAM (I think it is Samsung E-die) in slots 2 and 4. I don't get blue screens or anything like a normal Windows crash. The monitor just goes dark, some fans spin up and some fans stop, the amber light comes on indicating a RAM error, and all the motherboard LEDs stay on. The only thing that still works on the computer is the power supply shut-off switch. This usually happens when I leave the computer on for a long period of time or when I am using it but not really doing anything. In the beginning, it was a few times a day. Then I loaded XMP (DOCP or something like that) and the crashes happened more frequently until I raised the RAM voltage from 1.35 to 1.37. I tried higher, but seemed to get more of these crashes. I called Crucial and they directed me to reset CMOS. I did that and set everything in the Asus firmware to default. I got one crash not long after, but none in the last day or two.

EDIT: My problem seems to have been solved by one of two things (or both) which I did following advice in this thread and others: I disabled power down mode in the memory timings, and I set Minimum Processor State to 10% in Windows 10 under Power Options/Advanced Settings/Processor Power Management. The computer was on for a day with no black screen crash. I then turned on DOCP/XMP and LEFT THE RAM VOLTAGE AT THE RATED 1.35 V. The computer has been on in this state for about 24 hours with no crash. Fingers crossed.
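For anyone who would rather script the Minimum Processor State change than click through Power Options, here is a minimal Python sketch. It only builds the `powercfg` command lines and prints them by default. The function names are made up for illustration, and the aliases (`SCHEME_CURRENT`, `SUB_PROCESSOR`, `PROCTHROTTLEMIN`) are what I understand `powercfg /aliases` to list on Windows 10, so verify them on your own machine before running anything for real.

```python
import subprocess


def min_processor_state_commands(percent: int) -> list[list[str]]:
    """Build the powercfg calls that set Minimum Processor State on the active plan.

    Hypothetical helper for illustration; it does not run anything itself.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    value = str(percent)
    return [
        # AC (plugged in) and DC (battery) values are stored separately.
        ["powercfg", "/setacvalueindex", "SCHEME_CURRENT",
         "SUB_PROCESSOR", "PROCTHROTTLEMIN", value],
        ["powercfg", "/setdcvalueindex", "SCHEME_CURRENT",
         "SUB_PROCESSOR", "PROCTHROTTLEMIN", value],
        # Re-activate the current plan so the new value is applied.
        ["powercfg", "/setactive", "SCHEME_CURRENT"],
    ]


def apply(commands, dry_run=True):
    """Print the commands (dry run) or execute them on a Windows machine."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # 10% is the value mentioned in the post above.
    apply(min_processor_state_commands(10))
```

Leaving `dry_run=True` lets you check the exact commands first; actually applying them needs an elevated prompt on most setups.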
 
As @LuchoU said, check the manufacturing date reported on the CPU.

If it is an early sample, like late 2020 or early 2021, ask your reseller for a replacement.

Otherwise send it back to AMD. It is a bit annoying, but I wouldn't go back to the 9900K if you really need the core count.
 
You can check Event Viewer for warning and error logs; hopefully they show which cores may be crashing.

But if you are crashing at BIOS defaults, it is better to send it in for an exchange or RMA.
 
I've been living with this issue since February and it's driving me insane.

The screens go black and I can still hear audio for a bit if I'm playing music. Then the fans spin up and the system reboots itself.
I checked the event logs, and nothing is recorded automatically to tell me what went wrong.
How can I fix this? ELI5, please.
 
Did you try disabling power down mode in memory timings?
 
First, you need to be at full stock settings in the BIOS; just set your RAM to the XMP profile.

Search for CoreCycler on Google and test all your cores.

In Windows there is an option to create a full system dump when there is a critical error. That should give Windows some time to write the event to the log, and you will be able to see the APIC ID to identify the failing core. If I remember correctly, the option is located in System settings.

I believe the only possible workaround for this issue is adding voltage to the failing cores. You can try adding +10 or +5 in the BIOS (AMD PBO) to all cores and see if there is any gain in stability. With CoreCycler it will be easier to identify the problematic cores.

The definitive solution is to RMA your CPU. If you can look up the batch on your CPU (you will need to remove your cooling solution) and it's from 2020 or early 2021, then most probably you got a very badly binned CPU.
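
To make the APIC ID step above a bit more concrete, here is a small Python sketch that pulls APIC IDs out of event text you have copied from Event Viewer and tallies them per physical core. Two things are assumptions on my part, not something confirmed in this thread: that the event message contains a line of the form `Processor APIC ID: N`, and that with SMT enabled each core owns two consecutive APIC IDs (so core = APIC ID // 2). That mapping may not hold on every model, so cross-check the result against CoreCycler. The sample text is made up for illustration, not a real log.

```python
import re
from collections import Counter

# Assumed message format; adjust the pattern if your events look different.
APIC_RE = re.compile(r"Processor APIC ID:\s*(\d+)")


def apic_ids(event_text: str) -> list[int]:
    """Return every APIC ID mentioned in the pasted event text, in order."""
    return [int(m) for m in APIC_RE.findall(event_text)]


def apic_to_core(apic_id: int, threads_per_core: int = 2) -> int:
    """Map an APIC ID to a core index, assuming contiguous numbering with SMT on."""
    return apic_id // threads_per_core


def failing_cores(event_text: str) -> dict[int, int]:
    """Count how many logged errors point at each core."""
    return dict(Counter(apic_to_core(a) for a in apic_ids(event_text)))


if __name__ == "__main__":
    # Illustrative sample only.
    sample = (
        "Processor APIC ID: 10\n"
        "Processor APIC ID: 11\n"
        "Processor APIC ID: 4\n"
    )
    for core, hits in sorted(failing_cores(sample).items()):
        print(f"core {core}: {hits} error(s)")
```

A core that keeps showing up here is the one to focus on when testing, or to mention in the RMA request.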

 
Thanks for the help. I'm going to RMA the CPU.
I was thinking that this was a GPU issue; thanks for pointing me in the right direction.


I checked the setting in System settings and it's already set to automatically create a dump file. When I opened the file in WinDbg I didn't see any APIC mentions; the bottom of the file talks about nvlddmkm.sys.


***
* Bugcheck Analysis *
***

VIDEO_TDR_FAILURE (116)
Attempt to reset the display driver and recover from timeout failed.
Arguments:
Arg1: ffffc401104be460, Optional pointer to internal TDR recovery context (TDR_RECOVERY_CONTEXT).
Arg2: fffff80367347b14, The pointer into responsible device driver module (e.g. owner tag).
Arg3: ffffffffc000009a, Optional error code (NTSTATUS) of the last failed operation.
Arg4: 0000000000000004, Optional internal context dependent data.
BUGCHECK_CODE: 116
BUGCHECK_P1: ffffc401104be460
BUGCHECK_P2: fffff80367347b14
BUGCHECK_P3: ffffffffc000009a
BUGCHECK_P4: 4
VIDEO_TDR_CONTEXT: dt dxgkrnl!_TDR_RECOVERY_CONTEXT ffffc401104be460
Symbol dxgkrnl!_TDR_RECOVERY_CONTEXT not found.
PROCESS_OBJECT: 0000000000000004
BLACKBOXBSD: 1 (!blackboxbsd)
BLACKBOXNTFS: 1 (!blackboxntfs)
BLACKBOXPNP: 1 (!blackboxpnp)
BLACKBOXWINLOGON: 1
PROCESS_NAME: System

STACK_TEXT:
ffffa18d`063ef9d8 fffff803`61c91cae : 00000000`00000116 ffffc401`104be460 fffff803`67347b14 ffffffff`c000009a : nt!KeBugCheckEx
ffffa18d`063ef9e0 fffff803`61c424d4 : fffff803`67347b14 ffffc40f`deff68a0 00000000`00002000 ffffc40f`deff6960 : dxgkrnl!TdrBugcheckOnTimeout+0xfe
ffffa18d`063efa20 fffff803`61c3b00f : ffffc40f`df04e000 00000000`01000000 00000000`00000002 00000000`00000002 : dxgkrnl!ADAPTER_RENDER::Reset+0x174
ffffa18d`063efa50 fffff803`61c913d5 : 00000000`00000100 ffffc40f`df04ea58 00000000`00000000 00000000`00000000 : dxgkrnl!DXGADAPTER::Reset+0x4df
ffffa18d`063efad0 fffff803`61c91547 : fffff803`51f24440 00000000`00000000 00000000`00000000 00000000`00000000 : dxgkrnl!TdrResetFromTimeout+0x15
ffffa18d`063efb00 fffff803`514b8505 : ffffc40f`ebdc7040 fffff803`61c91520 ffffc40f`bb487750 ffffc40f`00000000 : dxgkrnl!TdrResetFromTimeoutWorkItem+0x27
ffffa18d`063efb30 fffff803`51555845 : ffffc40f`ebdc7040 00000000`00000080 ffffc40f`bb4ac100 00000000`00000001 : nt!ExpWorkerThread+0x105
ffffa18d`063efbd0 fffff803`515fe828 : ffff9180`8d3a3180 ffffc40f`ebdc7040 fffff803`515557f0 00000000`00010000 : nt!PspSystemThreadStartup+0x55
ffffa18d`063efc20 00000000`00000000 : ffffa18d`063f0000 ffffa18d`063e9000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x28


SYMBOL_NAME: nvlddmkm+dc7b14
MODULE_NAME: nvlddmkm
IMAGE_NAME: nvlddmkm.sys
STACK_COMMAND: .thread ; .cxr ; kb
FAILURE_BUCKET_ID: 0x116_IMAGE_nvlddmkm.sys
OS_VERSION: 10.0.19041.1
BUILDLAB_STR: vb_release
OSPLATFORM_TYPE: x64
OSNAME: Windows 10

Am I looking at this wrong?
 
From how you describe it, it does seem to be related to the GPU rather than the CPU.

It seems that the Nvidia driver crashed with a TDR error, which means that the GPU was busy for too long and didn't report back to the OS in time.

I would advise a clean driver install. Get yourself the DDU tool and, in the tool's options, enable driver cleaning in safe mode.

Download the latest AMD chipset drivers and Nvidia GPU drivers.

Unplug the Ethernet cable and disable Wi-Fi, then launch the tool and clean both the AMD and GPU drivers, rebooting into safe mode.

Reboot once finished, then install the chipset drivers and the GPU drivers.
Plug your Ethernet cable back in and re-enable Wi-Fi.

Test again with default settings, as @LuchoU pointed out: clear your CMOS, boot into the BIOS, load optimized settings, and set up the XMP/DOCP memory profile.
Check the GPU temperature. If everything seems fine on the GPU side, you can try dropping PCIe Gen4 to PCIe Gen3 and check whether the TDR crash still occurs.

The point is to understand whether the issues come from the GPU or from an unstable CPU. CoreCycler is a good way to check your CPU cores.
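
If you end up with several dumps to sort through, a rough first pass can be scripted. The Python sketch below reads the text of a WinDbg bugcheck analysis and flags whether it points at the GPU driver. The sample fields are copied from the dump posted earlier in this thread; the function names and the "gpu-driver" / "unknown" labels are my own invention, and the driver list only contains the one image named in that dump, so treat it as a starting point rather than a real classifier.

```python
import re

# Fields of interest in the pasted analysis text.
FIELD_RE = re.compile(
    r"^(BUGCHECK_CODE|MODULE_NAME|IMAGE_NAME|FAILURE_BUCKET_ID):\s*(\S+)",
    re.MULTILINE,
)

# Only the driver image named in this thread; extend as needed.
GPU_DRIVER_IMAGES = {"nvlddmkm.sys"}

# Fields copied from the dump posted earlier in the thread.
SAMPLE = """\
BUGCHECK_CODE: 116
MODULE_NAME: nvlddmkm
IMAGE_NAME: nvlddmkm.sys
FAILURE_BUCKET_ID: 0x116_IMAGE_nvlddmkm.sys
"""


def parse_analysis(text: str) -> dict[str, str]:
    """Extract the key/value fields we care about from the analysis text."""
    return {key: value for key, value in FIELD_RE.findall(text)}


def triage(fields: dict[str, str]) -> str:
    """Return 'gpu-driver' for a VIDEO_TDR_FAILURE or a known GPU driver image."""
    # BUGCHECK_CODE is printed in hex without a 0x prefix; 116 is VIDEO_TDR_FAILURE.
    code = fields.get("BUGCHECK_CODE", "").lower()
    image = fields.get("IMAGE_NAME", "").lower()
    if code == "116" or image in GPU_DRIVER_IMAGES:
        return "gpu-driver"
    return "unknown"


if __name__ == "__main__":
    print(triage(parse_analysis(SAMPLE)))
```

A "gpu-driver" result only says where the bugcheck was raised. An unstable CPU or RAM can still be the underlying cause, which is why the CoreCycler run is worth doing either way.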
 
@zerodisbelief

You could also try taking out the video card, cleaning the PCIe contacts on the video card with a pencil eraser and then re-seating the GPU ...
 
I'll definitely do all these tests, because I just looked in my Event Viewer and spotted this. I might be having both CPU and GPU issues *** T-T

How can a 5950x and a 3080 OC both do me so wrong -.- ....
I also just did a BIOS flashback to the newest BIOS version.

View attachment 2521521
 
This kind of ACPI error is annoying indeed, but in theory it should not affect performance or stability.
Depending on the drivers, the installed software, and the BIOS settings, the OS will complain that it could not access various parts of the ACPI hardware devices listed in the BIOS.
In this case, it seems that something is blocking access to the EC, or is accessing it without the OS being aware, maybe some monitoring tool, Ryzen Master, or the motherboard software.

What you should look for in Event Viewer are the common WHEA errors, which report a CPU or system crash, if any.
 