
1 - 20 of 62 Posts

·
Registered
Joined
·
51 Posts
Discussion Starter #1
I've read a couple of threads about this error, but I'm not sure what kind of problem I have: RAM or a dying CPU.
So I removed all my overclocks and ran the Prime95 Blend test, and every single time I got this:

Code:
[Mon May 02 02:03:33 2016]
Self-test 24K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
So after that I raised VDIMM to 1.66 V and checked my RAM with memtest for 9 hours (9 passes): no errors.
After that I dropped the CPU multiplier and got this error again.
Raised VCORE to 1.35 V: error again.
The error almost always appears in Self-test 1120K, so I ran a custom 1120K test and had no errors.
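
To check whether the error really clusters around one FFT size, the results log can be scanned automatically. This is a rough sketch that only assumes the line formats shown in the log above; the function name and sample text are made up for illustration. The size it reports is the last one that passed before the error, which is a hint rather than proof of which test failed.

```python
import re

# Sample text copied from the log format above (shortened).
SAMPLE_LOG = """\
[Mon May 02 02:03:33 2016]
Self-test 24K passed!
Self-test 1120K passed!
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
"""

PASS_RE = re.compile(r"Self-test (\d+)K passed!")
ERROR_RE = re.compile(
    r"FATAL ERROR: Rounding was ([\d.]+), expected less than ([\d.]+)"
)


def find_rounding_errors(log_text):
    """Return (last_passed_fft_k, observed, limit) for each rounding error."""
    last_passed = None
    errors = []
    for line in log_text.splitlines():
        match = PASS_RE.search(line)
        if match:
            last_passed = int(match.group(1))
            continue
        match = ERROR_RE.search(line)
        if match:
            errors.append(
                (last_passed, float(match.group(1)), float(match.group(2)))
            )
    return errors


print(find_rounding_errors(SAMPLE_LOG))
```

With several workers writing to the same log, the "last passed" size may belong to a different worker than the one that failed, so treat the result as a starting point only.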
 

·
Iconoclast
Joined
·
30,430 Posts
Need to know your system specs and the version of Prime95 you are using. Also, is any particular core failing first?

Rounding errors in Blend are commonly cache or memory. If you ran HCI correctly and passed, you can probably look at one of the cache levels.
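
For anyone wondering what the "Rounding was 0.5, expected less than 0.4" message measures: Prime95 does its big multiplications with floating-point FFTs, and every output value should land very close to a whole number. The round-off error is the distance from the nearest integer, so 0.5 is the worst possible value. The snippet below is only an illustration of that check with made-up numbers and hypothetical function names; it is not Prime95's actual code.

```python
def max_roundoff(values):
    """Largest distance between any value and its nearest integer."""
    return max(abs(v - round(v)) for v in values)


def roundoff_ok(values, limit=0.4):
    """Return (passed, error); the 0.4 limit is taken from the log message."""
    err = max_roundoff(values)
    return err < limit, err


# Hypothetical FFT outputs: a healthy run sits almost exactly on integers.
healthy = [3.0000001, 17.9999998, 42.0]
# One value corrupted on its way through cache or memory.
corrupted = [3.0000001, 17.5, 42.0]

print(roundoff_ok(healthy))
print(roundoff_ok(corrupted))
```

A single bad bit anywhere between RAM, cache, and the FPU can push one value far from an integer, which is why this error points at hardware rather than software.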
 

·
Registered
Joined
·
51 Posts
Discussion Starter #3
My rig is in my profile. All settings were loaded from the BIOS safe defaults, so everything was on auto except VDIMM (1.66 V).
It's the latest version of Prime95.
And I'm not sure how to find out which core is failing first. Is it a matter of which workers stopped?
HCI?
 

·
Iconoclast
Joined
·
30,430 Posts
Quote:
Originally Posted by lokigarson View Post

My rig is in my profile. All settings were loaded from the BIOS safe defaults, so everything was on auto except VDIMM (1.66 V).
You should not need anywhere near 1.35 vcore for a stock i7-920.
Quote:
Originally Posted by lokigarson View Post

And I'm not sure how to find out which core is failing first. Is it a matter of which workers stopped?
Yes. They are assigned to specific logical cores.
Quote:
Originally Posted by lokigarson View Post

HCI?
Sorry, I assumed HCI memtest for windows. If you were using Memtest86 or 86+ it would be useful to know the version of that.
 

·
Premium Member
Joined
·
10,417 Posts
Prime is a Proof and a Plague. It simply means that something in your system has overheated or does not have enough voltage, so it creates an error. RAM timings could be off for the voltage given...heat dissipation is the most common culprit for Prime...if it can't cool quick enough Prime will fail.
 

·
Registered
Joined
·
51 Posts
Discussion Starter #6
Quote:
Originally Posted by Blameless View Post

You should not need anywhere near 1.35 vcore for a stock i7-920..
Yes, I know, but it was a process of elimination; I tried boosting VCORE just to rule that out for sure.
Quote:
Originally Posted by Blameless View Post

Sorry, I assumed HCI memtest for windows. If you were using Memtest86 or 86+ it would be useful to know the version of that.
no worries it's my mistake too. Memtest86 4.3.7 and Prime v289.win64.
Quote:
Originally Posted by Blameless View Post

Yes. They are assigned to specific logical cores.
So I tested again, and this time worker 1 failed (1 worker = 1 CPU), but sometimes it's 6 or 8.


PROBN4LYFE
Yes, I'm quite familiar with the meaning of this error, but the problem is that I have none of this. Overheating?

Everything is on the BIOS safe-load settings, so it's all on auto.


 

·
Registered
Joined
·
1,210 Posts

·
Registered
Joined
·
51 Posts
Discussion Starter #10
right now i'm getting

Code:
Database error 
The ASUS Republic of Gamers [ROG] | The Choice of Champions - Overclocking, PC Gaming, PC Modding, Support, Guides, Advice database has encountered a problem.
 

·
Iconoclast
Joined
·
30,430 Posts
Quote:
Originally Posted by lokigarson View Post

no worries it's my mistake too. Memtest86 4.3.7 and Prime v289.win64.
So I tested again, and this time worker 1 failed (1 worker = 1 CPU), but sometimes it's 6 or 8.
If it was a weak core, you'd see errors on that core, but logical 1, 6, and 8 are on three different physical cores.
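
A quick way to see the mapping, as a sketch: this assumes workers are numbered from 1 and that the two Hyper-Threading siblings of each physical core are enumerated next to each other (workers 1 and 2 on core 0, 3 and 4 on core 1, and so on). That is the usual layout on Windows but not guaranteed, and the function name is just for illustration.

```python
def physical_core(worker, threads_per_core=2):
    """Map a 1-based worker/logical CPU number to a 0-based physical core.

    Assumes sibling hyper-threads are numbered consecutively.
    """
    return (worker - 1) // threads_per_core


failing_workers = [1, 6, 8]
cores = {w: physical_core(w) for w in failing_workers}
print(cores)  # {1: 0, 6: 2, 8: 3}
```

Three failing workers landing on three different physical cores is what makes a single weak core unlikely.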

Memtest86 4.3.7, especially if you aren't forcing it to run multiple threads all the time, will miss many potential problems with the memory subsystem.

Your temperatures are fine. CPU is probably fine. What happens if you run Small FFTs?

Assuming P95 small FFTs doesn't show any issues, memory or memory timings are almost certainly the culprit, though could possibly be the seating of the CPU or DIMMs as well. I'd back off to 1.5 volts on VDIMM and try running a more memory specific stress test. If that fails, manually input memory timings and set the uncore ratio to one multiplier higher than double the DDR rate you are using.

If that still fails, revert to auto, and start pulling DIMMs until it passes. Find the weak DIMM or DIMMs and test them individually to confirm a physical defect with the memory.
Quote:
Originally Posted by Cakewalk_S View Post

Increase your memory timings... those seem really really low.
All those timings are standard JEDEC spec for DDR3-1066, which is what his memory is running at on auto.
 

·
Registered
Joined
·
51 Posts
Discussion Starter #12
Quote:
Originally Posted by Blameless View Post

Assuming P95 small FFTs doesn't show any issues
small FFTs for 2 hours - no errors or bsod.
Quote:
Originally Posted by Blameless View Post

though could possibly be the seating of the CPU or DIMMs as well
Oh well, I forgot to mention that about a week ago I was cleaning my motherboard and applying new thermal compound to the CPU. I checked everything twice, but maybe I messed things up. Will check it again.
Quote:
Originally Posted by Blameless View Post

I'd back off to 1.5 volts on VDIMM and try running a more memory specific stress test.
I've done that (VDIMM on AUTO) and got the rounding error again.

What kind of more specific stress test?
 

·
Iconoclast
Joined
·
30,430 Posts
Quote:
Originally Posted by lokigarson View Post

small FFTs for 2 hours - no errors or bsod.
CPU itself is almost certainly fine.
Quote:
Originally Posted by lokigarson View Post

What kind of more specific stress test?
In order of worst (least stressful) to best (most stressful), for your platform:

- Memtest86+ 5.01 forced multi-threaded (F2 on start I think, I don't recall, but there is a prompt) looping tests 4-7.

- HCI Memtest for Windows, one instance per logical core (eight in this case), each with about 10% of your total memory allocated to it.

- Linux stressapptest. Easiest way I've found to do this is to boot from the install media of my Linux distro of choice (Lubuntu in this case, but almost anything will do; Mint and Ubuntu are pretty popular), select try distro without installing (which works as a live CD), open the package manager, search for "stressapptest", and install it. Then you'd open a terminal and type: sudo stressapptest -M 5120 -s 7200 -m 4 -i 8 -W -v 20 -p 2097152

That will run for two hours; any errors on a part that is Prime95 small FFT stable are almost guaranteed to be memory.
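
For reference, here is how the stressapptest command above breaks down, written as a small sketch that rebuilds the same argument list. The helper name and its parameters are made up for illustration; the flag meanings in the comments follow stressapptest's own help text as I understand it.

```python
def build_stressapptest_cmd(mem_mb, seconds, copy_threads=None,
                            invert_threads=None, stressful_copy=True,
                            verbosity=None, page_bytes=None):
    """Assemble a stressapptest argument list (hypothetical helper)."""
    cmd = ["sudo", "stressapptest",
           "-M", str(mem_mb),    # megabytes of RAM to test
           "-s", str(seconds)]   # run time in seconds
    if copy_threads is not None:
        cmd += ["-m", str(copy_threads)]    # memory copy threads
    if invert_threads is not None:
        cmd += ["-i", str(invert_threads)]  # memory invert threads
    if stressful_copy:
        cmd.append("-W")                    # more CPU-stressful memory copy
    if verbosity is not None:
        cmd += ["-v", str(verbosity)]       # log verbosity, 20 is the maximum
    if page_bytes is not None:
        cmd += ["-p", str(page_bytes)]      # size of each memory chunk in bytes
    return cmd


cmd = build_stressapptest_cmd(5120, 7200, copy_threads=4, invert_threads=8,
                              verbosity=20, page_bytes=2 * 1024 * 1024)
print(" ".join(cmd))
print("hours:", 7200 / 3600)
```

So the command tests 5120 MB for 7200 seconds (two hours) in 2 MiB chunks. Adjust -M to leave some memory free for the live session itself.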
Quote:
Originally Posted by lokigarson View Post

Oh well, I forgot to mention that about a week ago I was cleaning my motherboard and applying new thermal compound to the CPU. I checked everything twice, but maybe I messed things up. Will check it again.
It's possible for a badly seated processor or uneven heatsink mounting pressure to cause memory errors because the memory data pins connect directly to the CPU.
 

·
Guru
Joined
·
1,079 Posts
Quote:
Originally Posted by Blameless View Post

That will run for two hours; any errors on a part that is Prime95 small FFT stable are almost guaranteed to be memory.
It's possible for a badly seated processor or uneven heatsink mounting pressure to cause memory errors because the memory data pins connect directly to the CPU.
Now THAT is what I call a stretch. Memory errors due to uneven heatsink pressure?? Get outta here. :P

Plus I can't imagine how it could even be possible for an Intel CPU to be badly seated. It's either in or not at all, it can't be "half pregnant" unless the socket pins are all bent. ;)
 

·
Registered
Joined
·
51 Posts
Discussion Starter #15
Quote:
Originally Posted by Blameless View Post

- Memtest86+ 5.01 forced multi-threaded (F2 on start I think, I don't recall, but there is a prompt) looping tests 4-7..
As I understand it, 5.01 is obsolete and 6.30 is the newest, but in UEFI mode it can't run on all CPUs with my rig. With the 4.3.7 BIOS version I ran all CPUs for 5 hours with no errors.
Quote:
Originally Posted by Blameless
- HCI Memtest for Windows, one instance per logical core (eight in this case), each with about 10% of your total memory allocated to it.
I did as you said ~2 hours test - no errors
Quote:
Originally Posted by Blameless
- Linux stressapptest. Easiest way I've found to do this is to boot from the install media of my Linux distro of choice (Lubuntu in this case, but almost anything will do; Mint and Ubuntu are pretty popular), select try distro without installing (which works as a live CD), open the package manager, search for "stressapptest", and install it. Then you'd open a terminal and type: sudo stressapptest -M 5120 -s 7200 -m 4 -i 8 -W -v 20 -p 2097152
and done that too - no errors, but there were some power spikes....

stressapptest1.txt 31k .txt file
 


·
Iconoclast
Joined
·
30,430 Posts
Quote:
Originally Posted by SmOgER View Post

Now THAT is what I call a stretch. Memory errors due to uneven heatsink pressure?? Get outta here. :P
It's not a stretch. Compress an LGA pin too far and the way they are bent the tip will start to lift or slide off the pad, or into other pins. Poor contact on one of the data pins can cause intermittent functionality. Memory channels can disappear, or memory errors can happen.
Quote:
Originally Posted by lokigarson View Post

and done that too - no errors, but there were some power spikes....

stressapptest1.txt 31k .txt file
The power spikes are part of the stress test. Some errors won't reveal themselves at constant load.

What happens if you run just: sudo stressapptest -M 5120 -s 7200 -W ?

If that still passes stressapptest, try setting the uncore multiplier to 21x (default with everything on auto is probably 20) and set qpi/vtt to 1.25v then running P95 Blend again.
 

·
Guru
Joined
·
1,079 Posts
Quote:
Originally Posted by Blameless View Post

It's not a stretch. Compress an LGA pin too far and the way they are bent the tip will start to lift or slide off the pad, or into other pins. Poor contact on one of the data pins can cause intermittent functionality. Memory channels can disappear, or memory errors can happen.
You obviously don't understand how this works.
When you lock the CPU in place the pins DO move and lock into their slots (small plastic cut-outs) in the socket. If they didn't move they would bend/break.
And once they are locked in place, there is literally nowhere to go for them and any additional pressure will put stress on the socket itself (plastic) rather than the pins which sit flush with it.
 

·
Iconoclast
Joined
·
30,430 Posts
Quote:
Originally Posted by SmOgER View Post

When you lock the CPU in place the pins DO move and lock into their slots (small plastic cut-outs) in the socket.
They don't lock into any slots. Everything past the first bend on the pin protrudes from the hole in the underlying socket, even with the CPU locked in place.
Quote:
Originally Posted by SmOgER View Post

If they didn't move they would bend/break.
A spring that doesn't move isn't much of a spring.
Quote:
Originally Posted by SmOgER View Post

And once they are locked in place, there is literally nowhere to go for them and any additional pressure will put stress on the socket itself (plastic) rather than the pins which sit flush with it.
The pins are flush with the bottom of the CPU, which is itself flush with small areas of the socket. The socket plastic is not infinitely stiff (nor is the substrate of the CPU itself), and it's quite possible to compress the CPU in such a way that it or the socket warps enough to push pins further than they were intended. This can push the pins against the lower part of the socket, below the stand-offs/shim that is actually supporting the CPU, and dislodge them from the CPU pads, permanently bend them, or crush them.

Here is an LGA-1366 socket (though this applies to most of intel's other LGA sockets as well): http://assets.vr-zone.net/14439/Intel_Socket_1155.jpeg

As you can clearly see, the CPU is supported above the point where the pins protrude by the corners, several other points around the edge, and the rectangular structure in the middle. There are no slots for the pins to lock into; the pins bend down at a spring hinge that sticks above the opening the pin protrudes from, and they are not pushed back into that hole.

I've seen excessive/uneven mounting pressure cause socket issues more than a few times.
 

·
Guru
Joined
·
1,079 Posts
They don't exactly lock into place per se, but they spring down and sit flush with the socket (those small black separators) once the CPU is installed. Again, any additional pressure transfers directly to the frame of the socket and not the pins which simply can't move at all by this point.

TpWzWru.jpg


The issues you had were most likely related to either the motherboard warping or the cooler/backplate shorting the mobo (happens more often than you think).
 