21 - 40 of 50 Posts
Discussion starter · #22 ·
Been up and running for 28 hours with the same stick and still not a single error. I had it compiling under Gentoo for the first 26 hours, then an hour of memtest, and now it is running Prime95 blend.
 
Quote:


Originally Posted by jpz

Been up and running for 28 hours with the same stick and still not a single error. I had it compiling under Gentoo for the first 26 hours, then an hour of memtest, and now it is running Prime95 blend.

Keep pushing for a couple more hours; you said errors occurred anywhere between 24 and 48 hours. Once you're through 48 hours you can call that stable.
 
Discussion starter · #24 ·
Just giving a quick update.


I don't intend to turn my computer off until/unless I get an error or some other indication of a problem. 2GB is more than plenty for me to work comfortably in Gentoo. Gaming on Vista might be unpleasant, but at least I'd be able to get some work done. Having the peace of mind that my computer is not going to crash or lock up at any moment due to a hardware error seems like such a luxury at the present time.
 
Discussion starter · #25 ·
Went out for a few hours... just got back. The fourth core of Prime95 failed with a rounding error about an hour ago. The other three cores were still running. Rebooting now to see what memtest86 has to say.

This time it took 30 hours from power-up for the first error to appear.
 
Discussion starter · #26 ·
Memtest has been running for over an hour now and no errors detected. I'll let it run through the morning and then if it still hasn't found any errors I'll probably boot back into Vista and run Prime blend for another day or so.

Really wish I was back at home so I could test my other stick in another computer at the same time... and test them both together in the other computer.
 
You're in just about the same boat as me. My solution was to do a complete rebuild. I still have the other build and I can't figure out what's wrong with it. Small FFTs are fine... large ones fail in minutes. The longer my rig is running the worse it gets until it either freezes or comes up with a BSOD. Reinstalling Windows did nothing. The only thing I've been doing lately is playing WoW and that crashes constantly. I have to shut down because a reboot just gives me "BIOS Checksum Error". And, just like you, it seems to fix things for a few hours and then it's ****ting all over itself again. I've tried just about everything you can think of. I blame the motherboard. It's an ASUS Striker II Formula w/ a 780i chipset.

Problem Build:

Intel Q9650
ASUS Striker II Formula
2x2GB Corsair Dominator DDR2 1066

BIGGEST WASTE OF MONEY EVAR!!! Only had it 4 months. Three of which have been nothing but headaches.

Wish I could be more help but my problems are still unsolved. Nice knowing I'm not alone though!
 
Discussion starter · #28 ·
Quote:


Originally Posted by Cobra2468

Problem Build:

Intel Q9650
ASUS Striker II Formula
2x2GB Corsair Dominator DDR2 1066

BIGGEST WASTE OF MONEY EVAR!!! Only had it 4 months. Three of which have been nothing but headaches.

Just out of curiosity, around what time did you buy your RAM, and where from? I bought my pair from Newegg around February '09 IIRC. Come to think of it, I recall hearing something about a bad batch of these kits not too long after I bought mine.
 
Discussion starter · #30 ·
That recall was for the Dominator GT (DDR3).

Last night I accidentally ended the memtest run early. I went to turn off my case lights but I must have been really tired because I flipped the switch on the back of my power supply instead of the light switch.


That was about 6 hours after my previous post, which means the first stick made it through 7 hours of memtest and 37 hours straight testing with only that one Prime95 error.

I popped in the second stick before going to bed and let it run memtest overnight. It has been about 10 hours since then and memtest is now finding errors. Hopefully I've just got one bad stick of RAM. What I'll do is reboot, run memtest again to see if it finds errors instantly like it did when I had two sticks in. Then I'll power off the machine and run memtest again and see how long it takes to get more errors on the second stick. Finally I'll do one last run of memtest on my first (good?) stick to verify that it is working and that the 37 hour run was not a fluke.
 
Discussion starter · #33 ·
Quote:

Originally Posted by nolonger View Post
On the first or second stick?
Second stick. The reason I was confused by that is that I tested that stick (by itself and with the first) for much longer than half an hour without getting any errors in the past.

Under normal circumstances I'd say the second stick is faulty, but I can't be certain about that since I am still questioning the state of my motherboard and processor.

I have the first stick back in. It has just passed 6 hours of memtest86. I have never received any errors with just the first stick in except for that one error detected by Prime95 the other day, which I am going to call a fluke. My motherboard and CPU are still overclocked; it is entirely possible that my CPU makes a faulty calculation once in a blue moon and I just happened to catch it during that test.

I am going to leave the first stick running memtest until I go to sleep. If memtest still hasn't detected any errors, I'm going to boot up Gentoo and (hopefully) finish the WUs I downloaded a few days ago before their deadlines. In the morning I'll swap the second stick back in and see how long it takes memtest to find errors. If the first stick passes tonight's testing and the second stick fails again tomorrow, I think it will be safe to conclude that the second stick is faulty and everything else is OK.
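
For anyone following along, the swap-test reasoning can be written down as a tiny decision table. This is only a sketch in Python: the function name `diagnose` and the wording of the conclusions are made up for illustration, the error count in the example call is hypothetical, and it assumes each stick is tested alone on the same board with the same settings.

```python
def diagnose(stick1_errors: int, stick2_errors: int) -> str:
    """Interpret two single-stick memtest runs (error counts) on one board."""
    if stick1_errors == 0 and stick2_errors == 0:
        # Neither stick fails alone, so the sticks themselves look fine.
        return "both sticks pass alone; suspect board, CPU, or settings"
    if stick1_errors == 0:
        return "stick 2 likely faulty"
    if stick2_errors == 0:
        return "stick 1 likely faulty"
    # Both failing alone is more likely a shared cause than two bad sticks.
    return "both fail; suspect board, CPU, or settings before blaming RAM"

# Hypothetical outcome: first stick clean, second stick shows a few errors.
print(diagnose(0, 12))  # stick 2 likely faulty
```

A clean run only shows that no errors appeared during that run, so the longer each single-stick test goes, the more the conclusion can be trusted.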

I wonder if it is possible that a few errors generated by the second stick could corrupt something in the northbridge/memory controller over time, so that after a day or two the memory controller would be completely non-functional and could not even access the working sections of memory properly.
 
Quote:

Originally Posted by jpz View Post
Just out of curiosity, around what time did you buy your RAM, and where from? I bought my pair from Newegg around February '09 IIRC. Come to think of it, I recall hearing something about a bad batch of these kits not too long after I bought mine.
I can't remember exactly but it was close to/early summer. I had them ordered in through a local store that I've been using for years. The only thing I've yet to do is try them in another rig. Another thing is the WoW errors. Don't know how many of you play it but when it crashes it's always an error about not being able to read so and so file such as a certain texture or w/e it may be but it's always different. Rebooting solves the problem but again, I have to shut down or else it won't boot back up. Makes me think hard drive problem which leads me back to the motherboard being at fault. I can't be so unlucky as to have brand new memory and brand new hard drives fail all at once.
 
Quote:

Originally Posted by jpz View Post
Second stick. The reason I was confused by that is that I tested that stick (by itself and with the first) for much longer than half an hour without getting any errors in the past.

Under normal circumstances I'd say the second stick is faulty, but I can't be certain about that since I am still questioning the state of my motherboard and processor.

I have the first stick back in. It has just passed 6 hours of memtest86. I have never received any errors with just the first stick in except for that one error detected by Prime95 the other day, which I am going to call a fluke. My motherboard and CPU are still overclocked; it is entirely possible that my CPU makes a faulty calculation once in a blue moon and I just happened to catch it during that test.

I am going to leave the first stick running memtest until I go to sleep. If memtest still hasn't detected any errors, I'm going to boot up Gentoo and (hopefully) finish the WUs I downloaded a few days ago before their deadlines. In the morning I'll swap the second stick back in and see how long it takes memtest to find errors. If the first stick passes tonight's testing and the second stick fails again tomorrow, I think it will be safe to conclude that the second stick is faulty and everything else is OK.

I wonder if it is possible that a few errors generated by the second stick could corrupt something in the northbridge/memory controller over time, so that after a day or two the memory controller would be completely non-functional and could not even access the working sections of memory properly.
It would take longer to find errors with both sticks because there's double the memory to test. I'm pretty sure you can't corrupt your northbridge from faulty RAM. Seems like we found the problem!
 
Discussion starter · #36 ·
After testing the second stick for 10 hours, I only had about 70 errors. With both sticks in (after the computer had been running for close to 48 hours) I had over 1,500,000 errors within the first 30 seconds of running memtest. The failing addresses ranged from 400MB to 4200MB and both sticks were generating an equal number of errors.

You don't think a corrupt piece of code retrieved from faulty RAM could cause the CPU to write nonsense to the northbridge, causing the northbridge to wreak havoc on the entire system? What about writing nonsense to the base memory where the BIOS is stored in the RAM?
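
To put the two results side by side, here is the arithmetic as a quick Python sketch. It takes the two-stick figure as 1,500,000 errors in 30 seconds (a lower bound, since the count was "over" that) and the single-stick figure as 70 errors in 10 hours. The variable names are just for illustration.

```python
# Figures from the two memtest runs; the rates are plain arithmetic on them.
single_stick_errors = 70
single_stick_seconds = 10 * 3600   # 10 hours

both_sticks_errors = 1_500_000     # lower bound
both_sticks_seconds = 30

single_rate = single_stick_errors / single_stick_seconds
both_rate = both_sticks_errors / both_sticks_seconds

print(f"second stick alone: {single_rate:.4f} errors/s")    # 0.0019 errors/s
print(f"both sticks:        {both_rate:.0f} errors/s")       # 50000 errors/s
print(f"ratio:              {both_rate / single_rate:.2e}")  # 2.57e+07
```

A gap of more than seven orders of magnitude is hard to pin on a handful of bad cells in one stick, which is why I keep looking for something that affects the whole memory subsystem.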
 
Quote:

Originally Posted by jpz View Post
After testing the second stick for 10 hours, I only had about 70 errors. With both sticks in (after the computer had been running for close to 48 hours) I had over 1,500,000 errors within the first 30 seconds of running memtest. The failing addresses ranged from 400MB to 4200MB and both sticks were generating an equal number of errors.

You don't think a corrupt piece of code retrieved from faulty RAM could cause the CPU to write nonsense to the northbridge, causing the northbridge to wreak havoc on the entire system? What about writing nonsense to the base memory where the BIOS is stored in the RAM?
One explanation I can see for why you'd find more errors with two sticks of RAM is that the system had to be running for a long time before any errors showed up at all, so if you left it on for 48 hours and then ran Memtest, you'd see close to that many errors. Another thing is that with two sticks of RAM you stress the northbridge more, which means it's more prone to instabilities. These would most likely never show, but due to the faulty RAM and the load on the northbridge they appeared. This could have generated errors in all of the memory sections.

I do think a memory write/read error could wreak havoc on the entire system, but not to the point of making it permanent; it'd cause a BSOD in Windows. The BIOS isn't stored in the RAM, it's stored in a BIOS chip, so it's safe.
 
Discussion starter · #39 ·
Quote:

Originally Posted by nolonger View Post
I do think a memory write/read error could wreak havoc on the entire system, but not to the point to make it permanent, it'd cause a BSOD in Windows. The BIOS isn't stored in the RAM, it's stored in a BIOS chip so it's safe.
The BIOS is stored on a flash chip on the motherboard, but it gets shadow-copied to the RAM when you turn your computer on. The flash chip can be written to, but it is rare for that to happen as the result of a hardware failure since it is protected from accidental writes. Once the BIOS has been shadowed in the RAM, the flash chip is no longer necessary and you can actually remove it from the motherboard. This is what makes the old BIOS recovery trick possible. If you have two identical motherboards and the BIOS is corrupted on one of them, you can use the good board to reflash the bad chip. First you power up the good machine and prepare to flash the BIOS. Then you remove the good chip from the good board and pop in the corrupt chip while the machine is still running. You can then continue with the flash as usual, which will rewrite the BIOS to the corrupt chip leaving you with two good chips.

Anyway, my point was that powering the computer off and back on forces it to reload the BIOS. If the shadow copy in the RAM had been corrupted, the corrupt copy would have been lost when the machine was powered off, and replaced with a good copy upon the next boot. This would explain why I could get 3,000,000 memory errors per minute and then 20 seconds later (after a full power cycle) be perfectly stable and error free for a few hours.
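
Here is a toy Python model of that idea. It is only a sketch under assumptions: real BIOSes use vendor-specific integrity checks, and the 8-bit additive checksum below (all bytes sum to 0 modulo 256, the scheme legacy option ROMs use) is assumed purely for illustration. The names `checksum8`, `seal`, `flash_image`, and `shadow` are made up.

```python
def checksum8(image: bytes) -> int:
    """Sum of all bytes modulo 256; 0 means the image is considered intact."""
    return sum(image) % 256

def seal(payload: bytes) -> bytes:
    """Append one byte so the whole image sums to 0 modulo 256."""
    return payload + bytes([(-sum(payload)) % 256])

flash_image = seal(b"pretend this is BIOS code")  # stand-in for the flash chip
shadow = bytearray(flash_image)                   # copy made in RAM at power-on

shadow[5] ^= 0x04                                 # one bit flips in the RAM copy
print(checksum8(shadow) == 0)                     # False: corruption detected

shadow = bytearray(flash_image)                   # power cycle: fresh copy from flash
print(checksum8(shadow) == 0)                     # True
```

The flash image never changes in this model; only the RAM copy goes bad, and a power cycle replaces it with a fresh copy.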

Memtest ran for just over 12 hours on the first stick without any errors last night. It has been folding under Gentoo ever since. I'll swap the second stick in sometime today and see how long it takes for memtest to detect an error.
 
Quote:

Originally Posted by jpz View Post
The BIOS is stored on a flash chip on the motherboard, but it gets shadow-copied to the RAM when you turn your computer on. The flash chip can be written to, but it is rare for that to happen as the result of a hardware failure since it is protected from accidental writes. Once the BIOS has been shadowed in the RAM, the flash chip is no longer necessary and you can actually remove it from the motherboard. This is what makes the old BIOS recovery trick possible. If you have two identical motherboards and the BIOS is corrupted on one of them, you can use the good board to reflash the bad chip. First you power up the good machine and prepare to flash the BIOS. Then you remove the good chip from the good board and pop in the corrupt chip while the machine is still running. You can then continue with the flash as usual, which will rewrite the BIOS to the corrupt chip leaving you with two good chips.

Anyway, my point was that powering the computer off and back on forces it to reload the BIOS. If the shadow copy in the RAM had been corrupted, the corrupt copy would have been lost when the machine was powered off, and replaced with a good copy upon the next boot. This would explain why I could get 3,000,000 memory errors per minute and then 20 seconds later (after a full power cycle) be perfectly stable and error free for a few hours.

Memtest ran for just over 12 hours on the first stick without any errors last night. It has been folding under Gentoo ever since. I'll swap the second stick in sometime today and see how long it takes for memtest to detect an error.
Thanks for the lesson! That seems to explain all your problems, to be honest.
 