
SR-2 Random Lockup / Monitoring Tools To Use?

post #1 of 3
Thread Starter 
Greetings everyone,

I've been working with my Melchior system for a while now, generally without much trouble, but I have a question I wanted to run by you guys.

Basically, I've gotten the system pretty stable at 200 BCLK. The CPUs are at 1.325 V for both boot-up and final, VTT is the same, and IOH is 1.375 V (that high mostly because of audio issues under full load). I'm running an SR-2 with two Xeon E5620s at 3.8 GHz. The power supply is an SR-2 unit, so it's got enough power. I've done RAM tests and none fail.

I can run BOINC for days and game with no problem, yet whenever my brother plays Starcraft II and listens to Pandora via IE, the system works fine for anywhere from 30 minutes to many hours, but at some point it does a hard lock and forces a reboot (which then allows for fine playing for the rest of that day).

What I'm wondering is this: while Windows 7's Admin Tools > Event Viewer is good for system events, it's not in-depth enough to show exactly what is causing the lock. It shows a power event and that the shutdown was not proper, but it doesn't show the faulting exe or driver. Is there anything in particular that can be logged, or a way to monitor the system and services, so I can pinpoint exactly what is causing this?
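
One low-tech option for a hard lock, where nothing gets a chance to log the failure, is a heartbeat log: a small script that appends a timestamped row every few seconds and forces it to disk, so the last row after the reboot shows roughly when the system froze and what the readings were at the time. A minimal Python sketch of the idea follows. `read_sensors`, the file name, and the interval are hypothetical placeholders, not part of any real monitoring tool, and the sensor function would need to be swapped for whatever readings are actually available.

```python
import csv
import os
import time
from datetime import datetime


def append_sample(path, readings):
    """Append one timestamped row and push it to disk right away."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        stamp = datetime.now().isoformat(timespec="seconds")
        writer.writerow([stamp] + list(readings))
        f.flush()
        os.fsync(f.fileno())  # ask the OS to write it out before a possible lock


def last_sample(path):
    """Return the final row that made it to disk, or None if the log is empty."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    return rows[-1] if rows else None


def read_sensors():
    # Hypothetical placeholder: replace with real temperature/load readings.
    return ["cpu_temp=unknown", "gpu_temp=unknown"]


def run(path="heartbeat.csv", interval_s=5, max_samples=None):
    """Log a sample every interval_s seconds (forever if max_samples is None)."""
    count = 0
    while max_samples is None or count < max_samples:
        append_sample(path, read_sensors())
        count += 1
        time.sleep(interval_s)
```

After a lockup and reboot, `last_sample("heartbeat.csv")` gives the last row written, which narrows down the time of the hang to within one interval.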

Temps are a little warm: the hot cores peak in the low 70s under load and the cool cores are in the mid-50s. But again, they can fold and crunch all day at those temps without any issue whatsoever, yet for whatever reason my bro's combination of SCII and playing audio locks the system.

I'm guessing the northbridge is either running too hot or is unstable, so my only thought was to move an extra cooling fan over it to see if that helps shift the heat somewhere else.

Otherwise, I was also stable at 197 BCLK and below, so I might just be a tad too hot...
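
For reference, the core clock is just BCLK times the CPU multiplier. A quick sketch of the arithmetic in Python, assuming a 19x multiplier (inferred from 3.8 GHz at 200 BCLK, not read from the BIOS):

```python
def core_clock_mhz(bclk_mhz, multiplier):
    """Core clock in MHz is the base clock times the CPU multiplier."""
    return bclk_mhz * multiplier


MULTIPLIER = 19  # assumption: 3800 MHz / 200 MHz BCLK

for bclk in (197, 200):
    print(f"{bclk} BCLK x {MULTIPLIER} = {core_clock_mhz(bclk, MULTIPLIER)} MHz")
```

Under that assumption, the gap between the 197 and 200 BCLK settings is only about 57 MHz per core, roughly 1.5%.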

Any insight is appreciated. It's a specific enough problem that I think I can troubleshoot it, and I suspect it's either a driver or temperature issue, yet it's so sporadic that I'm not sure what to think. As a final note, I've tried a few different graphics drivers and sound drivers and they all cause the same problem eventually.

Would getting a separate PCIe sound card help? I'm just using the onboard sound at the moment.

Edit - Forgot to mention that even if the temps are generally a bit on the warm side, just gaming obviously isn't the same as folding with both CPUs and GPUs cranking. Temps are barely in the 40s then, so I don't think that's it.

Thanks.
Edited by The-Real-Link - 2/26/11 at 2:24pm
Melchior SR-2 (15 items)
CPU: Intel i7 4930K
Motherboard: EVGA X79 Dark
Graphics: Nvidia GTX Titan X (Pascal)
RAM: 64GB GSkill Sniper
Hard Drive: Intel 750 Series 1.2 TB PCIe x4 NVMe SSD / 960G...
Optical Drive: LG Blu-Ray Burner / 2x LG Millenniata DVD+/-RW
Cooling: Corsair H80 Pump + 3 YL Medium Fans
OS: Windows 10 Pro 64-bit
Monitor: Dell 27" 4K P2715Q
Keyboard: Corsair Strafe Cherry Red MX
Power: EVGA SR-2 PSU
Case: Lian-Li PC P80
Mouse: Corsair Steelseries Rival 100
Mouse Pad: None
post #2 of 3
Thread Starter 
Ok did a bit of investigating again...

1. Moved an extra fan over the space between the CPUs to cool the northbridge. Tested SCII again and reproduced the problem fast, so I don't think that's the issue.

2. Heard on other forums that maybe the GPUs are overheating. While using a second card does increase the top one's temp by ~20C, they're still within reasonable limits. Didn't move the fan but...

3. Tried swapping PhysX in the Nvidia control panel from Card #2 (where the monitor is attached) to Card #1 (bottom one). So far, I've re-run a good 40 minutes of replays at 8x speed without any issue. Then I turned on BOINC so the whole system would be stressed and played another 40+ minutes of replays at maximum load, all while listening to music as I mentioned before. Not a single problem.

I wonder if it is as simple as a PhysX issue after all? Oh well, seems OK for now. It'd still be nice to know of any monitoring tools I can reference though.
post #3 of 3
Thread Starter 
Hey everyone,

Dug through EVGA's threads here. To sum up, it apparently was a graphics / audio issue that a lot of people are having with the 400 (and maybe 500) series cards.

Thread copy / dump below for anyone to reference. Hope this helps someone out!

--------------

Alrighty, while I know I threw a lot out there, I wanted to go for a couple of days of stability before trying to call this a surefire "fix". I already knew I was stable at 3.7 GHz before adding my 2nd card, so I figured it wasn't a voltage issue or such causing my hangs, which consistently showed up within two days at most.

So I found this gem of a thread here:
http://www.evga.com/forums/tm.aspx?&m=586183&mpage=1

While I don't know if it's required for the 500-series cards, or even if it's only necessary when running SLI, here is what I did (copied, pasted, and edited from the linked thread and cleaned up a little):

0. Yes, I'm adding a 0. First, remove PhysX, 3D Surround, and so on. In my case, I only saw PhysX there, so I removed it from Control Panel > Programs.

1. Go into Device Manager, go under Display Adapters, open your GTX 400 / 500 series card and see what it says for Location, e.g. "PCI bus 18".

2. Go under System Devices.

3. You will see two HD audio devices. Check them both to see which one matches your GTX 500 series PCI bus... I saw three HD audio devices: two for the two cards and a third for the Realtek onboard audio. I tried disabling the two which matched the PCIe cards, but that did not remove them from the Sound, Video, and Game Controllers menu.

4. Check under sound devices and it should no longer be there (this step might or might not require a restart).
FOR SLI SYSTEMS THERE WILL BE AN HD SOUND DEVICE FOR EACH CARD.
I ALSO RECOMMEND UNINSTALLING THE SURROUND DRIVERS UNLESS OTHERWISE NEEDED AND CHANGING THE NVIDIA CONTROL PANEL TO "SINGLE DISPLAY PERFORMANCE MODE" (this will help with flickering) <--Had this set for a while already. AND SET TO MAXIMUM PERFORMANCE POWER MODE <--Had this always set before too.

5. Download Driver Sweeper and Driver Cleaner and install them. You mainly need Driver Sweeper. Also download the latest audio drivers from EVGA and the latest Nvidia drivers straight from Nvidia. The Nvidia drivers may or may not be necessary depending on your config though.

6. Boot up and GO INTO PROGRAMS AND FEATURES IN CONTROL PANEL. Uninstall Nvidia PhysX and the Nvidia surround driver. DO NOT UNINSTALL NVIDIA DRIVERS ON NVIDIA CONTROL PANEL YET.

7. Reboot. Go into Safe Mode (press F8 repeatedly upon power up).

8. While in Safe Mode, go into Programs and Features, select the Nvidia drivers, then uninstall them. It will then ask you to reboot the computer. DO NOT REBOOT YET - SELECT NO.

9. While still in Safe Mode, open Driver Sweeper, select Nvidia display and click Analyze; when it is done, click Clean. Then open Driver Cleaner, click on the box that says "use multiple filters", then select Nvidia, Nvidia wdm and Nvidia stereo.

10. Once you have finished with Driver Sweeper and Driver Cleaner, reboot back into regular mode. Open Device Manager and check that your display adapter reads "Generic video adapter" or something along those lines and that it is a Microsoft driver. Then check your sound drivers to make sure that all 4 Nvidia sound drivers are uninstalled. Once you have verified that all the Nvidia drivers are uninstalled, REBOOT INTO SAFE MODE.
-----The major difference here for me was that as soon as I booted into Windows, it would automatically install the default graphics driver it had stored, despite turning off Windows Update and other settings. Ran a dxdiag to discover this was in fact Nvidia ver 263.09, an IMO fairly stable driver, so that was fine by me. Due to running two cards, I would have eight (8) HD Audio icons in the Sound, Video, and Game Controllers tab. Even if I removed them completely via Safe Mode, a laundry list of audio devices would be reinstalled automatically as soon as I booted into Windows. Ignoring that for now...

11. WHILE IN SAFE MODE, go into Programs and see if there is anything labeled Realtek audio drivers. If there is, uninstall that. I had the Realtek onboard sound and a Realtek Ethernet driver, so I removed both of those in my case. DO NOT REBOOT YET. Open Device Manager, go to sound devices, select any remaining audio devices and uninstall them. Then open Driver Sweeper and clean out the leftover files. After using Driver Sweeper you can now reboot into regular mode.

12. After the reboot, open up Device Manager and make sure that all the drivers you uninstalled and cleaned out were replaced by generic Microsoft drivers. Then install the Nvidia display drivers and disable Windows automatic updates. Now reboot. While I had turned off Windows Automatic Updates a while ago, I chose not to install any additional Nvidia driver package after Windows had installed its own.

13. After you reboot, go into Device Manager and DISABLE the 4 Nvidia sound drivers, then REBOOT BACK INTO SAFE MODE. Once you are back in Safe Mode, open up Device Manager and uninstall the Nvidia sound drivers. If they won't uninstall, then make sure they are disabled in Safe Mode. This was in fact the key, and it might have saved me a little time and hassle had I known that the sound drivers could not be fully uninstalled without reinstalling themselves. What I could do was go into Safe Mode's Device Manager and simply disable all 8 of the HD audio entries under Sound, Video, and Game Controllers, and then disable two of the three HD Audio entries (the two that matched the GPUs, of course) in System Devices. Rebooted.

14. Now install or reinstall your onboard audio drivers, then reboot. I reinstalled Realtek's onboard audio Win7 package from my EVGA folder and had sound again. Rebooted and all has been fine.
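
Steps 1 to 3 above boil down to matching each HD audio device to a graphics card by PCI bus number. A rough Python sketch of that matching follows, in case anyone wants to script the comparison instead of eyeballing it. The device names and location strings are made-up sample data typed in by hand in the style of Device Manager's Location field; nothing here queries Windows.

```python
import re


def pci_bus(location):
    """Pull the bus number out of a location string like 'PCI bus 18, ...'."""
    match = re.search(r"PCI bus (\d+)", location, re.IGNORECASE)
    return int(match.group(1)) if match else None


def audio_devices_on_gpu_buses(display_adapters, audio_devices):
    """Return names of audio devices that share a PCI bus with a display adapter."""
    gpu_buses = {pci_bus(loc) for _, loc in display_adapters} - {None}
    return [name for name, loc in audio_devices if pci_bus(loc) in gpu_buses]


# Hypothetical sample data, copied by hand from each device's Properties page.
display_adapters = [
    ("Graphics card #1", "PCI bus 18, device 0, function 0"),
    ("Graphics card #2", "PCI bus 3, device 0, function 0"),
]
audio_devices = [
    ("HD Audio Controller A", "PCI bus 18, device 0, function 1"),
    ("HD Audio Controller B", "PCI bus 3, device 0, function 1"),
    ("HD Audio Controller C", "PCI bus 0, device 27, function 0"),
]

# Candidates to disable: the audio devices riding on the graphics cards.
print(audio_devices_on_gpu_buses(display_adapters, audio_devices))
```

With that sample data, controllers A and B are flagged as belonging to the graphics cards, and C (standing in for the onboard audio) is left alone.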

===============

Basically, I will wager that all this came down to some kind of conflict between the onboard audio and the HD audio of the graphics cards. While it may be apparent with one GPU, it seemed to be exacerbated with two and, as I said, despite using stable OC settings, I simply could not go more than two days without a hang, nearly always while playing a game or listening to music.

While I'll still run the system normally for now (and see how long I can go for in uptime), I can verify that this seemed to be my issue by testing the following:

1. Ran Starcraft II with Pandora playing for several hours of replays and games.
2. Ran Starcraft II with 100% CPU / GPU usage via Rosetta while listening to Pandora for a couple hours and the system was happy as could be.
3. Did the dreaded test of running Rosetta / crunching entirely overnight with music playing while the system was loaded. Previously, this would have led to a hang within two hours at most due to the full load being created. It ran for the entire overnight period and then some (an uptime of around 12 hours) and when I came back, the system was still cranking on Rosetta and playing music.
4. Played a good few hours of WoW with Pandora or my Winamp playlist cranking. Played without a hitch.

Thus, while we shall see how far I go beyond a couple days, this sound issue here seems to be fully resolved. Hopefully this helps someone!

---------------

Edit: after a couple of reboots, the problem still remains. At least this variable is taken out of the possibilities though. Oh well.
Edited by The-Real-Link - 2/26/11 at 2:25pm