TL;DR Summary
The Linux kernel is a powerful tool for detecting instabilities in your overclock settings; in my experience it is both more accurate and more sensitive than either Prime95 or IBT/LinX.

More Details
The Linux kernel gives users a dead simple method for detecting hardware instabilities -- like those caused by an 'unstable' overclock. There is nothing special to install, as this functionality seems to be natively included in the kernel itself. To use it, simply run a standard stress test such as Prime95 or Linpack and watch the output from dmesg. If the system is unstable due to insufficient voltage or excessive heat, the kernel will report:
Code:
[Hardware Error]: Machine check events logged

I have seen the kernel throw these errors during a Prime95 run before Prime95 itself reported an error in the math. Further, I have seen these errors appear when linpack did not detect that the settings were unstable, as evidenced by the residual number not changing during the run in which the error occurred.
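As a rough sketch of checking a dmesg snapshot for that message, something like the following should work. The function name count_mce is just a name I made up for illustration; it only counts lines containing the text shown above and does nothing else.

```shell
# count_mce: hypothetical helper, not part of any package.
# Reads kernel log text on stdin and prints how many lines contain
# the machine check message quoted above.
count_mce() {
    # grep -c prints the count; it exits non-zero when the count is 0,
    # so "|| true" keeps the function from reporting a failure then.
    grep -c -F 'Machine check events logged' || true
}

# Usage on a live system (some distributions require sudo for dmesg):
#   dmesg | count_mce
```

A result of 0 after a long stress run only means no event was logged in the current ring buffer; it is not proof of stability on its own.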

How to Stress Test Under Linux
Probably the most newb-friendly flavor of Linux is Ubuntu. Users can run it live off a CD or a USB drive without installing it to their systems. Further, it is pre-configured to boot into a GUI with network and hardware autodetected. Download an image from the Ubuntu home page. I recommend the 64-bit version, as 32-bit Linux suffers from the same <4 GB memory limitation that 32-bit Windows does.

Note: don't feel like Ubuntu is your only option. There are many other Linux distributions out there from which to choose.

Download the ISO, burn it to a CD/DVD or write it to a USB drive, and boot from it. Ubuntu prompts users to either "Try Ubuntu" or "Install Ubuntu." Just hit the "Try Ubuntu" button and you will be dumped into the live Linux environment.

Here are a few suggestions for stress testing:
1) mprime ---> Linux version of Prime95. Help to download and run mprime.
2) linpack ---> back end to both LinX and IBT. Help to download and run linpack.
3) x264 video encoding.
4) Compiling something large like the Linux kernel.
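For suggestion #4, one compile is usually not enough, so here is a minimal sketch of a loop that repeats any command a fixed number of times and stops at the first failure. The name repeat_until_fail is my own invention, and the make invocation in the usage comment assumes you are sitting in an already-configured kernel source tree.

```shell
# repeat_until_fail N CMD [ARGS...]
# Hypothetical helper: runs CMD up to N times and stops at the first
# run that exits with a non-zero status.
repeat_until_fail() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        echo "run $i of $runs"
        if ! "$@"; then
            echo "run $i failed"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $runs runs passed"
}

# Example (assumes a configured kernel tree in the current directory):
#   repeat_until_fail 5 sh -c 'make clean && make -j"$(nproc)"'
```

A failed run here can also come from an ordinary build problem, so check the compiler output and the kernel log before blaming the overclock.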

On my own machine, I have seen settings pass tests #1 and #2 yet fail to get more than 10 min into an x264 encode or to compile something 4-5 times without errors. It is important to test using several orthogonal stresses. While stressing, print the output of the kernel ring buffer. You can do this in one of two ways:

1) Open a terminal and type dmesg to see a snapshot.
2) Perhaps more useful is to be informed when something happens rather than typing dmesg over and over again! You can do this with the following command:
Code:
sudo cat /proc/kmsg

It looks like nothing is happening, but the command has more or less opened a connection to the ring buffer; it will print new messages as they arrive. To test it, plug in a USB thumb drive.

Example on my box:
Code:
<5>[13393.025582] scsi 10:0:0:0: Direct-Access     Kingston DataTraveler 112 1.00 PQ: 0 ANSI: 2
<5>[13393.026103] sd 10:0:0:0: [sdc] 7831552 512-byte logical blocks: (4.00 GB/3.73 GiB)
<5>[13393.026449] sd 10:0:0:0: [sdc] Write Protect is off

Anyway, you will want to watch for that message I posted above:
Code:
[Hardware Error]: Machine check events logged
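If you would rather not eyeball the whole stream, a small filter can show only the hardware error lines. This is just a sketch: watch_mce is a made-up name, and it simply matches the "[Hardware Error]" tag seen in the message above. The dmesg --follow alternative in the comments assumes a reasonably recent util-linux.

```shell
# watch_mce: hypothetical filter, not part of any package.
# Reads kernel log lines on stdin and prints only those tagged
# "[Hardware Error]", prefixed with "MCE: " so they stand out.
watch_mce() {
    while IFS= read -r line; do
        case $line in
            *'[Hardware Error]'*) printf 'MCE: %s\n' "$line" ;;
        esac
    done
}

# Usage while the stress test runs in another terminal:
#   sudo cat /proc/kmsg | watch_mce
# or, assuming your dmesg supports --follow:
#   dmesg --follow | watch_mce
```

Leave it running for the whole stress test; any line it prints is worth treating as a sign the settings are not stable.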

Edited by graysky - 5/20/13 at 8:24am
computer
CPU: 3370K
Motherboard: Asus P8Z77X-V Pro
RAM: GSKILL Ripjaw
Hard Drive: Vertex 4
Cooling: NH-D14
OS: Linux
Power: SEASONIC SS-560KM
Case: P183