Overclock.net › Forums › Intel › Intel Motherboards › ASRock X79 Extreme 11 Owners Issues & Hopefully Fixes

ASRock X79 Extreme 11 Owners Issues & Hopefully Fixes - Page 89

post #881 of 1199
Quote:
Originally Posted by AdamK47 View Post

For those using SLI... Do any of you experience sudden reduction in GPU usage? I've noticed it in the past after doing heavy stress testing or running a game with vsync off for a long time. All of a sudden the GPU usage drops and I can't get it back until I reboot the PC.

Since I became aware of the PCI-E bandwidth test, I decided to run it after encountering the problem. This is what I saw:



It looks like one of the PLX chips is going into some sort of reduced performance mode. Any idea what's going on? This happens even when not overclocked. I can only fix it by restarting the system.

Wow, this is such a coincidence! My motherboard is not an ASRock Extreme 11; it's an ASUS P9X79-E WS (very similar to the Extreme 11, with two PLX chips). I just posted the same issue on geforce.com and devtalk.nvidia.com, and I also submitted a bug report to NVIDIA. NVIDIA is not responding to any of them.

https://devtalk.nvidia.com/default/topic/650085/cuda-programming-and-performance/host-to-device-bandwidth-degradation-/

In my case only the host-to-device bandwidth is reduced, and it requires a reboot to restore.
The problem doesn't occur when I'm not overclocking or when using the XMP profile. I undervolt in my OC, which could very well cause reduced bandwidth.

Here are my hunches as to what causes this issue:
1. Overclocking
2. Undervolting
3. A bug in the NVIDIA driver not releasing a GPU context/resource
4. A PLX bug/defect
5. PLX temperature

I'm going to try two things tonight:
1. OC without undervolting (bump up vcore)
2. Place a Delta fan on top of the PLX chips
Edited by Ardi - 12/4/13 at 7:38am
post #882 of 1199
Quote:
Originally Posted by Ardi View Post

Wow, this is such a coincidence! My motherboard is not an ASRock Extreme 11; it's an ASUS P9X79-E WS (very similar to the Extreme 11, with two PLX chips). I just posted the same issue on geforce.com and devtalk.nvidia.com, and I also submitted a bug report to NVIDIA. NVIDIA is not responding to any of them.

https://devtalk.nvidia.com/default/topic/650085/cuda-programming-and-performance/host-to-device-bandwidth-degradation-/

In my case only the host-to-device bandwidth is reduced, and it requires a reboot to restore.
The problem doesn't occur when I'm not overclocking or when using the XMP profile. I undervolt in my OC, which could very well cause reduced bandwidth.

Here are my hunches as to what causes this issue:
1. Overclocking
2. Undervolting
3. A bug in the NVIDIA driver not releasing a GPU context/resource
4. A PLX bug/defect
5. PLX temperature

I'm going to try two things tonight:
1. OC without undervolting (bump up vcore)
2. Place a Delta fan on top of the PLX chips

I don't think it's NVIDIA's fault. I have four cards, and when the problem happens it always causes two of them to show low GPU usage. This PCI-E bandwidth utility points to the problem being one of the PLX chips: the bandwidth is reduced so much that the cards cannot be fed fast enough. I had the problem when running three cards as well; all of a sudden two of the cards would show low GPU usage while the third stayed at 90%+. Setting PCI-E 2.0 or PCI-E 3.0 in the BIOS makes no difference. It only happens if I run the cards at 90%+ usage, and it can take anywhere from 15 minutes to 3 hours for the problem to appear.
The Computer (19 items)
CPU: Core i7 8700K
Motherboard: MSI Z370 Gaming Pro Carbon AC
Graphics: Two Titan Xp
RAM: 16GB G.SKILL Trident Z
Hard Drive: 512GB Samsung 960 Pro
Hard Drive: Four 2TB Samsung 850 EVOs in 8TB RAID-0
Hard Drive: 10TB Seagate Enterprise 7200RPM HDD
Optical Drive: LG 6X External Slim BD-RW
Cooling: Thermaltake Water 3.0 Ultimate
OS: Windows 10 Professional x64
Monitor: Samsung 70" KU6300
Keyboard: Corsair K70 LUX
Power: Corsair AX1500i
Case: Corsair Crystal 460X
Mouse: Logitech G900
Mouse Pad: Corsair Lapdog
Audio: Denon AVR-S920W receiver
Audio: Klipsch Reference 5.1 speakers
Audio: Beyerdynamic DT 770 Studio headphones
post #883 of 1199
In my case it's only the host-to-device bandwidth that is affected, on all three cards. I see in the image you posted that you ran the bandwidth test program on all four cards:
concBandwidthTest 0 1 2 3

Have you tried running the bandwidth test on each card individually?
concBandwidthTest 0 (or 1, 2, 3)

If you have, what numbers do you see?
I think you mentioned this already, but do you still see this problem at default clocks (no OC)?
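The per-card comparison above can be scripted once you have the numbers. Here is a minimal Python sketch; the ~11,000 MB/s healthy baseline and the sample per-card readings are made-up illustration values, not numbers from this thread:

```python
# Sketch: flag cards whose host-to-device bandwidth has collapsed.
# Baseline and sample numbers below are hypothetical; substitute
# your own concBandwidthTest results.

def degraded_devices(bandwidth_mb_s, baseline_mb_s, threshold=0.5):
    """Return device IDs whose measured bandwidth is below
    threshold * baseline (i.e. cards that look throttled)."""
    return [dev for dev, bw in sorted(bandwidth_mb_s.items())
            if bw < threshold * baseline_mb_s]

if __name__ == "__main__":
    # Hypothetical per-card host-to-device numbers (MB/s) after the fault:
    measured = {0: 11200.0, 1: 11150.0, 2: 1600.0, 3: 1550.0}
    print(degraded_devices(measured, baseline_mb_s=11000.0))  # [2, 3]
```

If both flagged cards sit behind the same PLX chip, that would line up with the switch (rather than the driver) being the culprit.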
Edited by Ardi - 12/4/13 at 8:27am
post #884 of 1199
I'll have to run some stress testing for the problem to happen again. If I run everything normally with vsync on, I don't get the problem.

The problem happens even at default clock speeds.
post #885 of 1199
Do you have access to the CUDA SDK? There's a CUDA application called NBody that you can use to stress the cards, and you can run it in multi-GPU mode. If you don't have access to the SDK, let me know and I'll see if I can make an executable tonight.

BTW, do you check your Event Viewer often? Do you see any "Event ID 14" errors there during or after stressing the cards?
post #886 of 1199
Actually, here is the link to the NBody application: http://developer.download.nvidia.com/compute/DevZone/C/Projects/x64/nbody.zip

Go to Bin\win64\Release and run the program from the command line:
nbody.exe -numdevices=4 (since you have 4 GPUs)


Also, one thing I forgot to add: the concBandwidthTest program uses CUDA, and CUDA and SLI don't play nicely together. CUDA programs sometimes behave strangely in the presence of SLI; in fact, to run CUDA applications NVIDIA suggests disabling SLI and even removing the SLI bridge. Try the stress test with and without SLI to see if the problem persists.
post #887 of 1199
There's nothing in the Event Viewer when this problem happens. There is no indication of a problem other than the low GPU usage and this PCI-E test showing horrible speeds.
post #888 of 1199
Could you also post the result of the bandwidth test before doing the stress test?
Edited by Ardi - 12/4/13 at 9:02am
post #889 of 1199
I disabled SLI but didn't bother taking the bridge off. I doubt leaving it on interferes with anything, and I don't want to open the PC up to remove it for a single test.

Here's my score with SLI disabled. It's the same as with SLI enabled: bidirectional scores are between 25K and 31K.
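For context when reading scores like these, the theoretical per-direction PCIe limit can be computed from the link rate and the encoding overhead. A small sketch (using 1 MB = 10^6 bytes, matching how these tools usually report):

```python
# Sketch: theoretical one-direction PCIe bandwidth, to sanity-check
# concBandwidthTest numbers. Rates and encodings per the PCIe specs:
# gen 2.0 uses 8b/10b at 5 GT/s, gen 3.0 uses 128b/130b at 8 GT/s.

GT_PER_S = {"2.0": 5.0, "3.0": 8.0}           # transfer rate per lane
ENCODING = {"2.0": 8 / 10, "3.0": 128 / 130}  # payload fraction

def pcie_mb_per_s(gen: str, lanes: int = 16) -> float:
    """Theoretical one-direction bandwidth in MB/s (1 MB = 1e6 bytes)."""
    bits_per_s = GT_PER_S[gen] * 1e9 * ENCODING[gen] * lanes
    return bits_per_s / 8 / 1e6

print(round(pcie_mb_per_s("2.0")))  # 8000
print(round(pcie_mb_per_s("3.0")))  # 15754
```

So 25K-31K MB/s bidirectional across four cards is plausible for links that share an upstream x16 connection through the PLX switches, while a per-card score far below these figures points at a degraded link.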



I'm running N-Body on all 4 GPUs and will let it run for about an hour. At first I used your command and noticed it was running in single precision; GPU usage was also pretty low, between 60% and 75%. I restarted it with the -fp64 argument after enabling double precision in the control panel, and now the GPUs are running between 80% and 95% utilization. This is actually the first double-precision program I've run on my Titans.

Getting around 7,000 GFLOPS in double-precision mode. Single precision was 10 times higher at around 70,000 GFLOPS. Pretty nifty app.
post #890 of 1199
Performed the torture test for over two hours. The PC (and my computer room) got quite toasty. I stopped when the MB and SB temperatures leveled out; green is the motherboard and purple is the south bridge. No sudden drop in GPU usage; it stayed steady the whole time. The problem seems to happen more when there are fluctuations in GPU usage, which is something Unigine Heaven causes.
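Since the fault shows up as a sudden sustained drop rather than noise, you could catch it automatically by polling utilization (e.g. from GPU-Z or nvidia-smi logs) and flagging the first long run of low samples. A sketch; the 30% floor and 10-sample run length are assumptions to tune against your own logging interval:

```python
# Sketch: detect the "sudden sustained drop" pattern in a GPU-usage log.
# The floor (30%) and run length (10 samples) are illustrative thresholds.

def sustained_drop(samples, floor=30, run=10):
    """Return the index where utilization first stays below `floor`
    for `run` consecutive samples, or None if it never does."""
    streak = 0
    for i, util in enumerate(samples):
        streak = streak + 1 if util < floor else 0
        if streak == run:
            return i - run + 1  # start of the low stretch
    return None

# 20 healthy samples, then the drop:
usage = [95, 92, 97, 90] * 5 + [12, 15, 10, 14, 11, 13, 9, 12, 15, 10]
print(sustained_drop(usage))  # 20
```

Pairing the detected timestamp with a concBandwidthTest run right afterwards would tell you whether the usage drop and the PLX bandwidth collapse always coincide.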


