
Analyzing 0x124: WHEA_UNCORRECTABLE_ERROR bugchecks

post #1 of 4
Thread Starter 

Part 1: Cache Errors



Disclaimer: A single 0x124 dump gives you very little to go on, so you need multiple dumps to troubleshoot with any real confidence. For example, one 0x124 dump can report one error, and the next can report something completely different (still hardware related, of course, but not CPU related). Multiple dumps are what let you figure out whether or not the CPU itself is at fault.

Please also note that this is not intended for beginners or for those who have only just jumped into analysis, although I may end up adding keys and such eventually. If you are brand new to analyzing and debugging crash dumps, please check my beginner's tutorial to learn the basics and the terminology.


Code:
WHEA_UNCORRECTABLE_ERROR (124)
    A fatal hardware error has occurred. Parameter 1 identifies the type of error
    source that reported the error. Parameter 2 holds the address of the
    WHEA_ERROR_RECORD structure that describes the error conditon.
    Arguments:
    Arg1: 0000000000000000, Machine Check Exception
    Arg2: fffffa800ddde028, Address of the WHEA_ERROR_RECORD structure.
    Arg3: 00000000b6004000, High order 32-bits of the MCi_STATUS value.
    Arg4: 00000000e6000175, Low order 32-bits of the MCi_STATUS value.

    Debugging Details:
    ------------------


    BUGCHECK_STR:  0x124_AuthenticAMD

    CUSTOMER_CRASH_COUNT:  1

    DEFAULT_BUCKET_ID:  WIN7_DRIVER_FAULT

    PROCESS_NAME:  WebKit2WebProc

    CURRENT_IRQL:  f

    STACK_TEXT: 
    fffff880`03297b08 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KeBugCheckEx


    STACK_COMMAND:  kb

    FOLLOWUP_NAME:  MachineOwner

    MODULE_NAME: AuthenticAMD

    IMAGE_NAME:  AuthenticAMD

    DEBUG_FLR_IMAGE_TIMESTAMP:  0

    FAILURE_BUCKET_ID:  X64_0x124_AuthenticAMD_PROCESSOR_CACHE

    BUCKET_ID:  X64_0x124_AuthenticAMD_PROCESSOR_CACHE

    Followup: MachineOwner
    ---------

Alright, cool. Right off the bat we are greeted with fairly respectable instructions. The output tells us that parameter 1 identifies the type of error source that reported the error. In this dump, that is 'Machine Check Exception'.


What is a Machine Check Exception (otherwise known as an MCE), you may ask? Well, it's not as hard to describe as the name makes it sound. It simply means that the CPU detected a hardware problem and reported it to the operating system.


Moving on, the output says that parameter 2 holds the address of the WHEA_ERROR_RECORD structure that describes the error condition. In this dump, the WHEA_ERROR_RECORD structure address is fffffa800ddde028.
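The bugcheck output also lists Arg3 and Arg4 as the high and low 32 bits of the MCi_STATUS value. As a minimal sketch (the function name is my own, not part of any tool), this is how the two halves join into the single 64-bit value that appears on the Status line of the !errrec report:

```python
# Values copied from Arg3 and Arg4 of the bugcheck output above.
ARG3_HIGH = 0x00000000B6004000  # high-order 32 bits of MCi_STATUS
ARG4_LOW = 0x00000000E6000175   # low-order 32 bits of MCi_STATUS


def combine_mci_status(high: int, low: int) -> int:
    """Join the two 32-bit halves into one 64-bit MCi_STATUS value."""
    return ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)


status = combine_mci_status(ARG3_HIGH, ARG4_LOW)
print(f"MCi_STATUS = 0x{status:016x}")
```

If the combined value does not match the Status line in Section 2 of the !errrec report, it is worth double-checking that you ran !errrec on the right address.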


So, with these handy instructions that we now understand, let's go ahead and run an !errrec (dumps a specific WHEA error record) on the WHEA_ERROR_RECORD structure address, which in our case is fffffa800ddde028!

!errrec fffffa800ddde028

We are then presented with:
Code:
5: kd> !errrec fffffa800ddde028
    ===============================================================================
    Common Platform Error Record @ fffffa800ddde028
    -------------------------------------------------------------------------------
    Record Id     : 01ce686c947ffec6
    Severity      : Fatal (1)
    Length        : 928
    Creator       : Microsoft
    Notify Type   : Machine Check Exception
    Timestamp     : 6/13/2013 19:40:22 (UTC)
    Flags         : 0x00000000

    ===============================================================================
    Section 0     : Processor Generic
    -------------------------------------------------------------------------------
    Descriptor    @ fffffa800ddde0a8
    Section       @ fffffa800ddde180
    Offset        : 344
    Length        : 192
    Flags         : 0x00000001 Primary
    Severity      : Fatal

    Proc. Type    : x86/x64
    Instr. Set    : x64
    Error Type    : Cache error
    Operation     : Generic
    Flags         : 0x00
    Level         : 1
    CPU Version   : 0x0000000000100fa0
    Processor ID  : 0x0000000000000005

    ===============================================================================
    Section 1     : x86/x64 Processor Specific
    -------------------------------------------------------------------------------
    Descriptor    @ fffffa800ddde0f0
    Section       @ fffffa800ddde240
    Offset        : 536
    Length        : 128
    Flags         : 0x00000000
    Severity      : Fatal

    Local APIC Id : 0x0000000000000005
    CPU Id        : a0 0f 10 00 00 08 06 05 - 09 20 80 00 ff fb 8b 17
                    00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00
                    00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00

    Proc. Info 0  @ fffffa800ddde240

    ===============================================================================
    Section 2     : x86/x64 MCA
    -------------------------------------------------------------------------------
    Descriptor    @ fffffa800ddde138
    Section       @ fffffa800ddde2c0
    Offset        : 664
    Length        : 264
    Flags         : 0x00000000
    Severity      : Fatal

    Error         : DCACHEL1_EVICT_ERR (Proc 5 Bank 0)
      Status      : 0xb6004000e6000175
      Address     : 0x0000000000000700
      Misc.       : 0x0000000000000000

As you can see, we have a cache error in this specific dump. In Section 2 of the !errrec report, the error specifically is 'DCACHEL1_EVICT_ERR (Proc 5 Bank 0)'. Simply put, this means:

DCACHEL1_EVICT_ERR (Proc 5 Bank 0)

- This means an error occurred while a line was being evicted from the L1 data cache on processor 5. The error was logged in MCA bank 0.

What does that mean? L1 cache = Level 1 cache, otherwise known as the primary cache. It is the small, fast cache closest to each core, used for temporary storage of instructions and data, organized in fixed-size lines (64 bytes on most modern x86 processors).
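The error name is not magic; it comes from the low 16 bits of the Status value. As far as I understand the machine-check documentation from Intel and AMD, cache (memory hierarchy) errors use the bit layout 0000 0001 RRRR TTLL, where RRRR is the request type, TT the transaction type and LL the cache level. Here is a minimal sketch that decodes it; the function and table names are my own, and it only handles this one error class:

```python
# Field tables for the memory-hierarchy error code (0000 0001 RRRR TTLL).
REQUEST = {0: "GEN", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
           5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
TRANSACTION = {0: "Instruction", 1: "Data", 2: "Generic"}
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "Generic"}


def decode_cache_error(status: int):
    """Decode a cache error code from MCi_STATUS, or return None."""
    code = status & 0xFFFF                 # MCA error code = low 16 bits
    if (code & 0xFF00) != 0x0100:          # not a memory-hierarchy error
        return None
    return {
        "request": REQUEST.get((code >> 4) & 0xF, "reserved"),
        "transaction": TRANSACTION.get((code >> 2) & 0x3, "reserved"),
        "level": LEVEL[code & 0x3],
    }


# Status value taken from the !errrec output above.
print(decode_cache_error(0xB6004000E6000175))
```

For this dump the fields come out as EVICT, Data and L1, which lines up with the DCACHEL1_EVICT_ERR name that !errrec prints.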


Now that we have this info, let's take a look at another 0x124 dump from the same system:


**Rather than pasting the entire dump, I am just going to show the output of running !errrec on the WHEA_ERROR_RECORD structure address**
Code:
4: kd> !errrec fffffa800ec8e838
    ===============================================================================
    Common Platform Error Record @ fffffa800ec8e838
    -------------------------------------------------------------------------------
    Record Id     : 01ce686c947ffec5
    Severity      : Fatal (1)
    Length        : 928
    Creator       : Microsoft
    Notify Type   : Machine Check Exception
    Timestamp     : 6/13/2013 19:31:21 (UTC)
    Flags         : 0x00000002 PreviousError

    ===============================================================================
    Section 0     : Processor Generic
    -------------------------------------------------------------------------------
    Descriptor    @ fffffa800ec8e8b8
    Section       @ fffffa800ec8e990
    Offset        : 344
    Length        : 192
    Flags         : 0x00000001 Primary
    Severity      : Fatal

    Proc. Type    : x86/x64
    Instr. Set    : x64
    Error Type    : Cache error
    Operation     : Data Write
    Flags         : 0x00
    Level         : 1
    CPU Version   : 0x0000000000100fa0
    Processor ID  : 0x0000000000000003

    ===============================================================================
    Section 1     : x86/x64 Processor Specific
    -------------------------------------------------------------------------------
    Descriptor    @ fffffa800ec8e900
    Section       @ fffffa800ec8ea50
    Offset        : 536
    Length        : 128
    Flags         : 0x00000000
    Severity      : Fatal

    Local APIC Id : 0x0000000000000003
    CPU Id        : a0 0f 10 00 00 08 06 03 - 09 20 80 00 ff fb 8b 17
                    00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00
                    00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00

    Proc. Info 0  @ fffffa800ec8ea50

    ===============================================================================
    Section 2     : x86/x64 MCA
    -------------------------------------------------------------------------------
    Descriptor    @ fffffa800ec8e948
    Section       @ fffffa800ec8ead0
    Offset        : 664
    Length        : 264
    Flags         : 0x00000000
    Severity      : Fatal

    Error         : DCACHEL1_DWR_ERR (Proc 3 Bank 0)
      Status      : 0xf614c00000000145
      Address     : 0x000000043679f000
      Misc.       : 0x0000000000000000

Now, as we can see, this one is also reporting a cache error. In Section 2 of the !errrec report, the error specifically is 'DCACHEL1_DWR_ERR (Proc 3 Bank 0)'. Simply put, this means:

DCACHEL1_DWR_ERR (Proc 3 Bank 0)

- This means a data write to the L1 data cache failed on processor 3.
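The top bits of each Status value are flags that are worth a glance too. The sketch below assumes the standard x86 MCi_STATUS layout (bit 63 VAL, 62 OVER, 61 UC, 60 EN, 59 MISCV, 58 ADDRV, 57 PCC); the helper name is my own, and the lower, model-specific bits are ignored:

```python
# Architectural flag bits in the upper part of MCi_STATUS.
FLAGS = [
    (63, "VAL"),    # the register holds a valid error
    (62, "OVER"),   # an earlier error in this bank was overwritten
    (61, "UC"),     # the error was not corrected by hardware
    (60, "EN"),     # reporting of this error was enabled
    (59, "MISCV"),  # the Misc. field is valid
    (58, "ADDRV"),  # the Address field is valid
    (57, "PCC"),    # processor context may be corrupt
]


def status_flags(status: int) -> list:
    """Return the names of the flag bits that are set in MCi_STATUS."""
    return [name for bit, name in FLAGS if (status >> bit) & 1]


# Status values taken from the two !errrec outputs above.
for label, status in [("dump 1", 0xB6004000E6000175),
                      ("dump 2", 0xF614C00000000145)]:
    print(label, status_flags(status))
```

Both dumps have UC and PCC set, which fits a fatal, uncorrected error, and MISCV clear, which fits the all-zero Misc. field. The second dump also has OVER set, so that bank had logged another error before this one was recorded.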

Now we have two dumps showing eviction and write errors in the L1 data cache. Are two dumps enough to go on? I would say no; however, an error like this is a big red flag for a faulty CPU. In this specific situation, the rest of the user's dumps also showed L1 cache errors, so it was more than likely a faulty CPU.
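When you have a pile of dumps, it helps to tally what each one reports rather than eyeballing them. This is only a rough sketch: it assumes you have saved the text of each !errrec run yourself, and the regular expression and function name are my own, based on the Error lines shown above:

```python
import re
from collections import Counter

# Matches lines such as "Error : DCACHEL1_EVICT_ERR (Proc 5 Bank 0)".
ERROR_LINE = re.compile(r"Error\s*:\s*(\w+)\s*\(Proc (\d+) Bank (\d+)\)")


def tally_errors(outputs):
    """Count error names and processor numbers across !errrec outputs."""
    by_error = Counter()
    by_proc = Counter()
    for text in outputs:
        for name, proc, _bank in ERROR_LINE.findall(text):
            by_error[name] += 1
            by_proc[int(proc)] += 1
    return by_error, by_proc


# Error lines copied from the two dumps in this post.
saved_outputs = [
    "    Error         : DCACHEL1_EVICT_ERR (Proc 5 Bank 0)",
    "    Error         : DCACHEL1_DWR_ERR (Proc 3 Bank 0)",
]
by_error, by_proc = tally_errors(saved_outputs)
print(by_error.most_common())
print(sorted(by_proc.items()))
```

If nearly every dump lands in the same family of errors (here, L1 data cache), that consistency is what makes the CPU the prime suspect. A mix of unrelated errors points elsewhere.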
post #2 of 4
Faulty CPU? Pfft. Maybe 1 in a million :P More likely to be an unstable overclock if they are appearing on this forum ;)
post #3 of 4
Thread Starter 
Quote:
Originally Posted by tompsonn View Post

Faulty CPU? Pfft. Maybe 1 in a million :P More likely to be an unstable overclock if they are appearing on this forum ;)

:D

On OCN, it's usually always the overclock :rolleyes:
post #4 of 4
Quote:
Originally Posted by pjBSOD View Post

:D

On OCN, it's usually always the overclock :rolleyes:

:P