Machine Check Exceptions

post #1 of 12
Thread Starter 
OK, my sig rig is throwing up MCEs under Linux and I am not sure what to do. I just built it a couple of weeks ago. I've never had to deal with a hardware issue before.

At any rate, here's an example of some of the errors:

Code:
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
MCE 0
CPU 1 0 data cache 
ADDR 11a771480 
TIME 1297058401 Mon Feb  7 00:00:01 2011
  Data cache ECC error (syndrome 0)
       bit46 = corrected ecc error
  memory/cache error 'data read mem transaction, data transaction, level 2'
STATUS 94004000f1000136 MCGSTATUS 0
MCGCAP 106 APICID 1 SOCKETID 0 
CPUID Vendor AMD Family 16 Model 4
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
MCE 1
CPU 1 0 data cache 
ADDR 12a649480 
TIME 1297058401 Mon Feb  7 00:00:01 2011
  Data cache ECC error (syndrome 0)
       bit46 = corrected ecc error
       bit62 = error overflow (multiple errors)
  memory/cache error 'data read mem transaction, data transaction, level 2'
STATUS d4004000f1000136 MCGSTATUS 0
MCGCAP 106 APICID 1 SOCKETID 0 
CPUID Vendor AMD Family 16 Model 4

It looks like the CPU L2 cache is throwing errors. Has anyone had to deal with this before? If so, should I contact AMD, or should I check the RAM and mobo? I don't have any other AM3 CPUs to check in this mobo. I do have some DDR2 RAM lying around somewhere, though.

NOTE: I am not running this CPU overclocked. Everything is at stock speeds and voltages.
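
For anyone who wants to check the STATUS values by hand, here is a minimal Python sketch that decodes them. It assumes the MCi_STATUS bit layout used on AMD Family 10h parts (bit 46 = corrected ECC, bit 62 = overflow, as the log itself says) and the usual "0000 0001 RRRR TTLL" memory/cache error code format. The names decode_status, STATUS_FLAGS, REQUEST, TRANSACTION and LEVEL are made up for this example and are not part of mcelog.

```python
# Sketch only: bit positions assumed from the AMD Family 10h MCi_STATUS layout.
STATUS_FLAGS = {
    63: "valid",
    62: "error overflow (multiple errors)",
    61: "uncorrected error",
    60: "error reporting enabled",
    59: "MISC register valid",
    58: "ADDR register valid",
    57: "processor context corrupt",
    46: "corrected ecc error",
    45: "uncorrected ecc error",
}

# Memory/cache error code layout in the low 16 bits: 0000 0001 RRRR TTLL
REQUEST = ["generic", "read", "write", "data read", "data write",
           "instruction fetch", "prefetch", "evict", "snoop"]
TRANSACTION = ["instruction", "data", "generic", "reserved"]
LEVEL = ["level 0", "level 1", "level 2", "generic level"]


def decode_status(status):
    """Return (list of set flag names, memory-error detail dict or None)."""
    flags = [name for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
             if (status >> bit) & 1]
    code = status & 0xFFFF
    detail = None
    if code >> 8 == 0x01:  # high byte 0000 0001 marks a memory/cache error
        rrrr = (code >> 4) & 0xF
        detail = {
            "request": REQUEST[rrrr] if rrrr < len(REQUEST) else "reserved",
            "transaction": TRANSACTION[(code >> 2) & 0x3],
            "level": LEVEL[code & 0x3],
        }
    return flags, detail


if __name__ == "__main__":
    # The two STATUS values from the log above.
    for value in (0x94004000F1000136, 0xD4004000F1000136):
        print(hex(value), decode_status(value))
```

Under those assumptions both values decode to a data read, data transaction, level 2, which agrees with what mcelog printed, and the second one additionally has the overflow bit set.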
post #2 of 12
See that overclock?
Lower it.
post #3 of 12
Thread Starter 
Quote:
Originally Posted by beers
See that overclock?
Lower it.
No, I am running at stock speeds and voltages. Still getting the errors.
post #4 of 12
This happened when I switched from the 2.6.32 kernel to anything above that kernel. It made me think that they might have implemented something in the new kernels that doesn't quite like my AMD CPU. I get machine check errors all the time, nothing like that, but I get them nonetheless. I have an unlocked CPU at stock speeds; I think if I don't unlock it they might go away? I'm unsure, but I've tested my unlock against everything and never had a problem. So I ignore them, and you probably can too.
post #5 of 12
Thread Starter 
Quote:
Originally Posted by mushroomboy
This happened when I switched from the 2.6.32 kernel to anything above that kernel. It made me think that they might have implemented something in the new kernels that doesn't quite like my AMD CPU. I get machine check errors all the time, nothing like that, but I get them nonetheless. I have an unlocked CPU at stock speeds; I think if I don't unlock it they might go away? I'm unsure, but I've tested my unlock against everything and never had a problem. So I ignore them, and you probably can too.

I would ignore them, as most of them have no effect. However, I get a hard lockup every few days that forces a reboot. At first I ignored the lockups, thinking there was some buggy software somewhere; then I looked at my logs and found all the MCEs.
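
If the log is long, a rough way to see whether the errors cluster on one core is to count records per CPU and bank. The Python sketch below assumes each record starts with an "MCE n" line and that the following "CPU 1 0 data cache" line means CPU number, then bank number, then bank name, which is how the output above appears to be laid out. The function name count_by_cpu_bank and the SAMPLE text are just for illustration.

```python
import re
from collections import Counter

# Assumed record layout, based on the mcelog output quoted in this thread.
RECORD_START = re.compile(r"^MCE \d+\s*$")
CPU_LINE = re.compile(r"^CPU (\d+) (\d+)\b")


def count_by_cpu_bank(log_text):
    """Count MCE records per (cpu, bank) pair in mcelog-style text."""
    counts = Counter()
    in_record = False
    for raw in log_text.splitlines():
        line = raw.strip()
        if RECORD_START.match(line):
            in_record = True
        elif in_record:
            match = CPU_LINE.match(line)
            if match:
                counts[(int(match.group(1)), int(match.group(2)))] += 1
                in_record = False  # only the first CPU line per record counts
    return counts


# Two shortened records in the same shape as the ones in the first post.
SAMPLE = """\
MCE 0
CPU 1 0 data cache
ADDR 11a771480
STATUS 94004000f1000136 MCGSTATUS 0
CPUID Vendor AMD Family 16 Model 4
MCE 1
CPU 1 0 data cache
ADDR 12a649480
STATUS d4004000f1000136 MCGSTATUS 0
CPUID Vendor AMD Family 16 Model 4
"""

if __name__ == "__main__":
    for (cpu, bank), n in sorted(count_by_cpu_bank(SAMPLE).items()):
        print(f"CPU {cpu} bank {bank}: {n} record(s)")
```

If nearly everything lands on the same CPU and bank, that would at least be consistent with a problem local to one core's cache, though it does not prove it.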
post #6 of 12
Quote:
Originally Posted by thiussat
I would ignore them, as most of them have no effect. However, I get a hard lockup every few days that forces a reboot. At first I ignored the lockups, thinking there was some buggy software somewhere; then I looked at my logs and found all the MCEs.
Don't know what to tell ya. Did you start getting lockups after a kernel upgrade? I'm guessing they could have changed something in the kernel. Everything works fine for me; it just outputs to the console and really pisses me off. I mean, really really really pisses me off. I'll get my entire console filled up with them, and then I'll go to type something and it'll erase it (still there but I can't see it), so then I'm typing blind. I figured if 2.6.32 doesn't throw errors at me, ever, I shouldn't worry. =(

Sorry to hear you're getting lockups. I'd try to remember if any software updates hit.
post #7 of 12
Quote:
Originally Posted by mushroomboy
Don't know what to tell ya. Did you start getting lockups after a kernel upgrade? I'm guessing they could have changed something in the kernel. Everything works fine for me; it just outputs to the console and really pisses me off. I mean, really really really pisses me off. I'll get my entire console filled up with them, and then I'll go to type something and it'll erase it (still there but I can't see it), so then I'm typing blind.
You know you can disable console messages, right? Blindly type:

# dmesg -n 1 <enter>

and it should stop.
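
What dmesg -n changes is the console log level, which Linux exposes as the first of four numbers in /proc/sys/kernel/printk (console, default message, minimum console, default console). A message is printed on the console only if its level is numerically lower than the console level. Here is a small Python sketch for inspecting that; parse_printk, reaches_console and read_printk are names invented for this example, and the "1 4 1 7" string in the usage part is only a sample value.

```python
from typing import NamedTuple


class PrintkLevels(NamedTuple):
    console: int          # messages with a lower level number reach the console
    default_message: int  # level given to messages that carry none
    minimum_console: int  # lowest value the console level can be set to
    default_console: int  # console level used by default


def parse_printk(text):
    """Parse the four whitespace-separated numbers from kernel.printk."""
    fields = [int(part) for part in text.split()]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    return PrintkLevels(*fields)


def reaches_console(levels, message_level):
    """True if a message at message_level would be printed on the console."""
    return message_level < levels.console


def read_printk(path="/proc/sys/kernel/printk"):
    """Read the live values; only works on a Linux system with /proc mounted."""
    with open(path) as handle:
        return parse_printk(handle.read())


if __name__ == "__main__":
    sample = parse_printk("1\t4\t1\t7\n")  # sample value, not read from a machine
    print(sample)
    print("level 0 shown:", reaches_console(sample, 0))
    print("level 3 shown:", reaches_console(sample, 3))
```

With the console level at 1, only level 0 (emergency) messages would still show up. To make it stick across reboots you would have to set kernel.printk through sysctl or similar, which this sketch does not do.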

Quote:
Originally Posted by thiussat
OK, my sig rig is throwing up MCE's under Linux and I am not sure what to do. I just built it a couple of weeks ago. I've never had to deal with a hardware issue before.

It's hard to say what the problem is at this point. Have you tried running some hardware diagnostics like memtest86+ or something? Have you tried re-seating the memory modules? Does the problem go away when you boot a different kernel or OS?
post #8 of 12
Quote:
Originally Posted by BLinux
You know you can disable console messages, right? Blindly type:

# dmesg -n 1 <enter>

and it should stop.

It's hard to say what the problem is at this point. Have you tried running some hardware diagnostics like memtest86+ or something? Have you tried re-seating the memory modules? Does the problem go away when you boot a different kernel or OS?
I can disable them in the desktop/xterm situation, but in a tty terminal I haven't bothered with a long term solution as I'm not in those that often.
post #9 of 12
Sorry, but the most likely explanation is a faulty CPU. You should replace it.
post #10 of 12
Thread Starter 
Quote:
Originally Posted by error10
Sorry, but the most likely explanation is a faulty CPU. You should replace it.
Yeah, I think so too. I went ahead and swapped my PSU out with another one I had lying around and I am still getting the errors. This leaves only the CPU or motherboard -- but it's most likely the CPU.

Luckily Newegg approved my RMA even after my 30 days, so I am good to go. It just sucks being without a computer for a week or so.