1 - 20 of 41 Posts

HeroofTime

Discussion starter · #1 ·
Hey guys! I don't have many posts here, but believe me, I love this forum and know where to go for help. Right now I don't know what to do. I will keep this cut and dried. The desktop is not overclocked whatsoever.

Motherboard: EVGA X58 SLI3
CPU: Xeon X5690
RAM: 3 x 2GB of Patriot Sector 7 @ 1333MHz (rated for 1600MHz)
GFX: EVGA GeForce GTX 780 Ti
Storage: 512GB Crucial MX100, 2x250GB WDs in RAID0
PSU: Rosewill HIVE-650W
Sound Card: ASUS Xonar D1
Case: DIYPC Alpha-GT3

Desktop used to be equipped with an i7-950 at stock clocks. I found a batch of Xeon X5690s online for a good deal, so I bought one. Took out i7-950, and placed Xeon X5690 in socket carefully. Applied AS5 thermal paste which isn't the best, but certainly isn't the worst. Finished the installation carefully, and made sure the heat sink assembly is tightly secured. Next, I cleared CMOS and configured BIOS for more compatibility. I left everything stock in regards to clock rates and voltages. Desktop reboots and identifies the new processor successfully, and Windows 7 64-bit likes the new processor too.

I downloaded Prime95 to ensure that the processor is okay. I started with a regular "Blend" test. The very first time I ran this test on this desktop, I got...

[Mon Sep 05 19:25:25 2016]
Self-test 24K passed!
Self-test 960K passed!
Self-test 24K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
[Mon Sep 05 19:30:41 2016]
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
Self-test 24K passed!
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
Self-test 32K passed!
Self-test 32K passed!
Self-test 32K passed!
Self-test 32K passed!
Self-test 32K passed!
Self-test 32K passed!
Self-test 32K passed!
Self-test 32K passed!
Self-test 32K passed!
Self-test 32K passed!

Afterwards, I ran a series of tests including the regular large FFTs test, and the regular small FFTs test for two hours each. This was an attempt to try and isolate the faulty hardware triggering this issue. No errors at all. Next, I burned an .iso of MemTest86 v4.3.7 and ran it for 21 hours. The following picture is the final result...



At this point I believed my RAM was not the point of failure. I turned back to Prime95 and decided to let it run for about as long as MemTest86 did, to see what I would come up with. I configured Prime95 with a custom setup (8K-4096K, "Run FFTs in-place" Off, 4816MB RAM use, and 30 minutes for each FFT size). I came back to my desktop to see the following...



I clicked "Close program" and the computer froze. I waited a good 5 minutes, and it was still frozen. So, I hit the physical reset button on my desktop. Afterwards, I went into Prime95's logs and saw the following...

[Wed Sep 07 08:08:57 2016]
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
Self-test 1120K passed!
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.

There is no more text recorded. That's all there is to it. Take into consideration the maximum recorded temperatures in the photo I took. They're not that high. I'd love to provide more information, but I can't think straight any longer and would greatly appreciate somebody's helping hand! Please feel free to ask any relevant questions, as I may have missed an important detail.
 
I can't read your temps in the picture. What are your temps?
 
Discussion starter · #3 ·
Apologies.

Cores 1, 2, 3, 4, 5, 6... 69°C, 68°C, 66°C, 64°C, 69°C, 68°C.
 
Discussion starter · #4 ·
Anyone got any ideas? I ran another test earlier today, and Prime95 crashed again. Temperatures weren't even that hot this time around, and were lower than in my previous post. Prime95 settings were 24K-1120K, 4096MB RAM use, "Run FFTs in-place" Off, and 20 minutes for each FFT size.

[Thu Sep 08 13:39:52 2016]
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
[Thu Sep 08 13:47:15 2016]
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
[Thu Sep 08 14:00:47 2016]
Self-test 32K passed!
Self-test 32K passed!

That's all. No more text was recorded. Again, everything is still stock.
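For anyone who wants to compare runs like the ones above, here is a minimal sketch that tallies passes per FFT size and lists which timestamp block each fatal error falls under. It assumes the log follows the same line format as the excerpts in this thread; the sample text is a shortened copy of the log above, and the names `summarize`, `PASS_RE`, and `STAMP_RE` are my own, not anything from Prime95 itself.

```python
import re
from collections import Counter

# Shortened excerpt of the log posted above (repeated lines trimmed).
SAMPLE_LOG = """\
[Thu Sep 08 13:39:52 2016]
Self-test 336K passed!
Self-test 336K passed!
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
[Thu Sep 08 13:47:15 2016]
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
[Thu Sep 08 14:00:47 2016]
Self-test 32K passed!
Self-test 32K passed!
"""

# Assumed line formats, based only on the excerpts in this thread.
PASS_RE = re.compile(r"^Self-test (\d+)K passed!$")
STAMP_RE = re.compile(r"^\[(.+)\]$")


def summarize(log_text):
    """Return (passes per FFT size in K, timestamps preceding each fatal error)."""
    passes = Counter()
    errors = []
    current_stamp = None
    for raw in log_text.splitlines():
        line = raw.strip()
        stamp = STAMP_RE.match(line)
        if stamp:
            # Remember the most recent timestamp header.
            current_stamp = stamp.group(1)
            continue
        passed = PASS_RE.match(line)
        if passed:
            passes[int(passed.group(1))] += 1
        elif line.startswith("FATAL ERROR"):
            # Attribute the error to the last timestamp seen (None if there was none).
            errors.append(current_stamp)
    return passes, errors


passes, errors = summarize(SAMPLE_LOG)
print("Passes per FFT size (K):", dict(passes))
print("Fatal errors logged under:", errors)
```

The timestamps only mark when Prime95 wrote a header, so they give a rough window for each error rather than an exact time.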
 
Discussion starter · #5 ·
Got some more information. Ran Prime95 on the regular small FFTs test for 6 hours and 10 minutes. No errors whatsoever, but temperatures reached higher than ever.

Cores 1, 2, 3, 4, 5, 6... 73°C, 72°C, 72°C, 69°C, 74°C, 73°C.
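To put the two sets of readings side by side, here is a quick sketch comparing the per-core maximums from post #3 (the custom run that crashed) with the ones from this small FFTs run (which passed). The numbers are copied from the posts; the variable names are just for illustration.

```python
# Per-core maximum temperatures in °C, copied from the posts in this thread.
crashed_custom_run = [69, 68, 66, 64, 69, 68]   # post #3, run that crashed
passed_small_ffts = [73, 72, 72, 69, 74, 73]    # post #5, run with no errors

# Difference per core: how much hotter the passing run was.
deltas = [hot - cool for cool, hot in zip(crashed_custom_run, passed_small_ffts)]

print("Per-core increase (°C):", deltas)
print("Hottest core, crashed run:", max(crashed_custom_run))
print("Hottest core, passing run:", max(passed_small_ffts))
```

Every core ran hotter in the run that passed, which suggests that heat alone is unlikely to explain the errors.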
 
If it runs perfect on everything but p95 I wouldn't worry about it. It's such a useless test for a CPU that will never have to see those kinds of stresses in the real world. But I'll bet if you bump the Vcore just a tad it will probably go away.
 
Discussion starter · #7 ·
Thank you for the advice Moparman. I went ahead and bumped VCore from Auto to 1.25V, and VTT from Auto to +25mV.

Do you think this is sufficient? I booted the desktop back up and ran Prime95 before I headed off to work. I'm at work now and afraid to return to a crashed Prime95 again. I would greatly appreciate some more instruction.
 
Did you test the i7 950 with Prime95? I suggest stress testing with HCI instead of Memtest; I find it more reliable. If your memory passes HCI, try re-seating the CPU. If that doesn't work, you probably have a defective CPU on hand.
 
Could well be a degraded CPU or a board that is reaching the end of its life.
Quote:
Originally Posted by Moparman View Post

If it runs perfect on everything but p95 I wouldn't worry about it.
If it's a stable build of Prime95 and it's finding problems, something is wrong.

It may not have caused problems in real use, yet, but that's not to say that issues cannot or will not crop up.
 
Discussion starter · #10 ·
Well, I got some more information.

I stress tested my i7-950. Prime95 crashed, and behaves like it did when the X5690 was in place. Stock clocks and voltages.

Cores 1, 2, 3, 4... 82°C, 79°C, 77°C, 78°C.

I am lost now. I guess I'll try HCI and see how it goes. I'm doubtful I'll see a difference though.
 
Are there any clock speed/memory settings it does not crash with?

If it's stable in other tasks at stock, but is failing P95 while heavily underclocked, it's possible that you've found a bug in Prime95, or unpatched errata in that Westmere stepping.
 
Try RealBench. Run it for a one- or two-hour stress test. If that passes, you're pretty much golden. Prime95 has some error with Skylake too, and will error out even when I know my chip is completely stable at its stock boost clock of 4.2GHz...
 
Westmere is not Skylake and has no known uncorrected errors that would impact Prime95, nor do current stable builds of Prime95 have known issues with Westmere. Prime95 failing is significant, one way or another, and should not be dismissed.

Prime95 28.9 does crash on my X5670 and I haven't had time to thoroughly investigate the problem. I was going to put my error down to memory configuration issues, but this thread and comments from other Westmere users have piqued my interest and I'm starting to think there may be a legitimate bug in the newest build.
 
Discussion starter · #14 ·
I'm thoroughly confused as well. My Bloomfield and Westmere-EP are behaving the same way though. Again, everything is at stock clocks with stock voltages. I disabled Intel SpeedStep on both CPUs before doing anything, too. I do not like Intel SpeedStep or the Turbo functionality, where your multiplier gets a +1 boost. Both have been turned off. On both CPUs I've also run my 1600MHz RAM at 1333MHz speeds. Upping voltage (VCore and VTT) on the X5690 didn't help. I doubt it will help, because Prime95 is the only thing crashing, and it looks like the picture above where that half of the screen is black (that half is where Prime95 sits when I run it).

I have brand new 2133MHz Corsair RAM at home (6x4GB) that I've ordered. Should I throw these in there, run them at 1333MHz, and see what I come up with in Prime95?
 
Find version 26.6 and it will save you a lot of hassle. If passed 26.6 then 28.9 is bugged. It errors in my skylake build no matter what settings I use, but I can pass hours of OCCT with Linpack and all virtual cores enabled, as well as 8 hours of custom x264 with 16 threads.
 
Quote:
Originally Posted by HeroofTime View Post

I have brand new 2133MHz Corsair RAM at home (6x4GB) that I've ordered. Should I throw these in there, run them at 1333MHz, and see what I come up with in Prime95?
Please do.

Also, if possible, run the CPU and uncore at the lowest usable multipliers (12x and 8x, I believe) at stock voltage. Not being able to pass at any settings on a part that otherwise seems fine will be very strong evidence for unpatched CPU errata, or a bug in Prime95, either or both of which should be brought to someone's attention.
Quote:
Originally Posted by HOODedDutchman View Post

If passed 26.6 then 28.9 is bugged.
There have been a lot of changes since 26.x and even without AVX instructions, newer versions are faster and more stressful. Passing 26 doesn't automatically mean 28.9 is bugged.
Quote:
Originally Posted by HOODedDutchman View Post

It errors in my skylake build no matter what settings
That's because Skylake itself is bugged.

http://arstechnica.com/gadgets/2016/01/intel-skylake-bug-causes-pcs-to-freeze-during-complex-workloads/

Now the question is if Westmere is bugged or if Prime95 28.9 is bugged on Westmere, or whether the new version is simply so demanding that many extant X58 setups just can't handle it.
 
Quote:
Originally Posted by Blameless View Post

Please do.

Also, if possible, run the CPU and uncore at the lowest usable multipliers (12x and 8x, I believe) at stock voltage. Not being able to pass at any settings on a part that otherwise seems fine will be very strong evidence for unpatched CPU errata, or a bug in Prime95, either or both of which should be brought to someone's attention.
There have been a lot of changes since 26.x and even without AVX instructions, newer versions are faster and more stressful. Passing 26 doesn't automatically mean 28.9 is bugged.
That's because Skylake itself is bugged.

http://arstechnica.com/gadgets/2016/01/intel-skylake-bug-causes-pcs-to-freeze-during-complex-workloads/

Now the question is if Westmere is bugged or if Prime95 28.9 is bugged on Westmere, or whether the new version is simply so demanding that many extant X58 setups just can't handle it.
That's what I'm saying. If he can pass a minimum of 4 hours of 26.6 without error, I would say he's stable and there's absolutely no issue worth worrying about. Or 8 hours if you want to be OCD.
 
Quote:
Originally Posted by HOODedDutchman View Post

That's what I'm saying. If he can pass a minimum of 4 hours of 26.6 without error, I would say he's stable and there's absolutely no issue worth worrying about. Or 8 hours if you want to be OCD.
That's not what I'm saying.

If there isn't a bug with Westmere and there isn't a bug with Prime95 28.9, then failing Prime95 28.9 means there is a correctable problem somewhere. To ignore correctable problems is asking for trouble if stability is an important consideration. Even if best possible stability is not the OP's goal, it's still useful to know where the source of the instability currently experienced is.

And you can most certainly pass 8 hours of P95 26.6 and still have instabilities that can crop up in real use.
 
Quote:
Originally Posted by Blameless View Post

That's not what I'm saying.

If there isn't a bug with Westmere and there isn't a bug with Prime95 28.9, then failing Prime95 28.9 means there is a correctable problem somewhere. To ignore correctable problems is asking for trouble if stability is an important consideration. Even if best possible stability is not the OP's goal, it's still useful to know where the source of the instability currently experienced is.

And you can most certainly pass 8 hours of P95 26.6 and still have instabilities that can crop up in real use.
I'm sure that's true but I've never encountered it. 4 hours of Prime95 26.6 back in the day on X58, P67, Z68 was rock solid every time. Always made sure to run 2 hours of IntelBurnTest on maximum first, though. I've seen failures even 3 hours into Prime95 on setups that passed IBT and still needed another increment or two of voltage. I don't see the reason to worry if you pass a version of P95 that was designed to stress the platform he is on. 8 hours is overkill even on 26.6. If he's really worried about the version, use 27.9. It's said to be nearly as stressful as 28.9 but less buggy. 28.9 is something to stay away from, IMO. It's done nothing but cause headaches for people on forums all over the place.
 
Discussion starter · #20 ·
I've gotten around to installing the 6x4GB set of RAM. They're rated for 2133MHz, 11-11-11-27, and 1.5V. They're running at 1866MHz and 11-11-11-27. I ran the desktop for a day, and it was fine with no issues. All clocks and voltages are stock.

I decided to run Prime95 on my new RAM set with the same settings mentioned earlier. It keeps failing with no recorded results. My desktop reboots right away when this occurs, and one of the times a BSOD appeared for a split second. So, I am running MemTest86 now on the desktop. I will report back with results from MemTest86.

PS: When running Prime95, my new RAM voltage was approximately 1.84V. Why? Prime95 doesn't even have to be running actually. My RAM still runs at 1.84V. Nothing else seemed to be out of the ordinary. i7-950 is installed currently.
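To put that reading in perspective, here is a small sketch of how far 1.84V sits above the 1.5V the kit is rated for. Both figures come from this post; the function name `overvolt_percent` is made up for illustration, and the result says nothing about whether the sensor reading itself is accurate.

```python
def overvolt_percent(rated_v, measured_v):
    """Return how far the measured voltage is above the rated voltage, in percent."""
    return (measured_v - rated_v) / rated_v * 100


RATED_V = 1.5      # rated DRAM voltage from the kit's spec, as quoted above
MEASURED_V = 1.84  # approximate reading reported above

excess_v = MEASURED_V - RATED_V
print(f"{excess_v:.2f} V above rated ({overvolt_percent(RATED_V, MEASURED_V):.1f}% high)")
```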
 