
the good, the bad and the unstable

339 Views 6 Replies 3 Participants Last post by  Tufelhunden
The good: overnight I broke 100 WUs and broke into the top 1K, and I have only been folding for OCN since the 2nd, and on an ATI card.

The bad (and the unstable) is that my second GPU cannot stay stable. I have been having an issue with it for a few days now. It really started acting up last night, when I kept getting Windows reboots over and over. Now I am getting client crashes.

I woke up today and found gpu1 had been offline most of the night due to an EUE:

Code:
[12:28:11] Assembly optimizations on if available.
[12:28:11] Entering M.D.
[12:28:17] Working on p4742_lam5w_300K
[12:28:18] Client config found, loading data.
[12:28:18] Starting GUI Server
[12:28:20] mdrun_gpu returned 
[12:28:20] SHAKE violations on GPU
[12:28:20] 
[12:28:20] Folding@home Core Shutdown: UNSTABLE_MACHINE
[12:28:24] CoreStatus = 7A (122)
[12:28:24] Sending work to server
[12:28:24] Project: 4742 (Run 8, Clone 381, Gen 7)
[12:28:24] - Error: Could not get length of results file work/wuresults_00.dat
[12:28:24] - Error: Could not read unit 00 file. Removing from queue.
[12:28:24] EUE limit exceeded. Pausing 24 hours.
[12:33:13] + Working...
So I cleared the work queue and the work folder, restarted the client, let it run for a while, and got this:

Code:
[16:39:35] Assembly optimizations on if available.
[16:39:35] Entering M.D.
[16:39:41] Working on p4742_lam5w_300K
[16:39:41] Client config found, loading data.
[16:39:41] Starting GUI Server
[16:39:43] mdrun_gpu returned 
[16:39:43] SHAKE violations on GPU
[16:39:43] 
[16:39:43] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:39:48] CoreStatus = 7A (122)
[16:39:48] Sending work to server
[16:39:48] Project: 4742 (Run 8, Clone 381, Gen 7)
[16:39:48] - Error: Could not get length of results file work/wuresults_01.dat
[16:39:48] - Error: Could not read unit 01 file. Removing from queue.
[16:39:48] - Preparing to get new work unit...
[16:39:48] + Attempting to get work packet
[16:39:48] - Connecting to assignment server
[16:39:48] - Successful: assigned to (171.64.65.103).
[16:39:48] + News From Folding@Home: GPU folding beta
[16:39:48] Loaded queue successfully.
[16:39:48] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[16:39:57] + Attempting to get work packet
[16:39:57] - Connecting to assignment server
[16:39:57] - Successful: assigned to (171.64.65.103).
[16:39:57] + News From Folding@Home: GPU folding beta
[16:39:57] Loaded queue successfully.
[16:39:58] + Closed connections
[16:40:03] 
[16:40:03] + Processing work unit
[16:40:03] Core required: FahCore_11.exe
[16:40:03] Core found.
[16:40:03] Working on queue slot 02 [March 13 16:40:03 UTC]
[16:40:03] + Working ...
[16:40:03] 
[16:40:03] *------------------------------*
[16:40:03] Folding@Home GPU Core - Beta
[16:40:03] Version 1.22 (Mon Dec 8 12:57:56 PST 2008)
[16:40:03] 
[16:40:03] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[16:40:03] Build host: amoeba
[16:40:03] Board Type: AMD
[16:40:03] Core      : 
[16:40:03] Preparing to commence simulation
[16:40:03] - Looking at optimizations...
[16:40:03] - Created dyn
[16:40:03] - Files status OK
[16:40:03] - Expanded 88220 -> 447304 (decompressed 507.0 percent)
[16:40:03] Called DecompressByteArray: compressed_data_size=88220 data_size=447304, decompressed_data_size=447304 diff=0
[16:40:03] - Digital signature verified
[16:40:03] 
[16:40:03] Project: 4744 (Run 9, Clone 583, Gen 15)
[16:40:03] 
[16:40:03] Assembly optimizations on if available.
[16:40:03] Entering M.D.
[16:40:09] Working on p4744_lam5w_300K
[16:40:10] Client config found, loading data.
[16:40:10] Starting GUI Server
[16:40:13] mdrun_gpu returned 
[16:40:13] Nonzero force sum on GPU
[16:40:13] 
[16:40:13] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:40:15] CoreStatus = 7A (122)
[16:40:15] Sending work to server
[16:40:15] Project: 4744 (Run 9, Clone 583, Gen 15)
[16:40:15] - Error: Could not get length of results file work/wuresults_02.dat
[16:40:15] - Error: Could not read unit 02 file. Removing from queue.
[16:40:15] - Preparing to get new work unit...
[16:40:15] + Attempting to get work packet
[16:40:15] - Connecting to assignment server
[16:40:16] - Successful: assigned to (171.64.65.103).
[16:40:16] + News From Folding@Home: GPU folding beta
[16:40:16] Loaded queue successfully.
[16:40:16] + Closed connections
[16:40:21] 
[16:40:21] + Processing work unit
[16:40:21] Core required: FahCore_11.exe
[16:40:21] Core found.
[16:40:21] Working on queue slot 03 [March 13 16:40:21 UTC]
[16:40:21] + Working ...
[16:40:21] 
[16:40:21] *------------------------------*
[16:40:21] Folding@Home GPU Core - Beta
[16:40:21] Version 1.22 (Mon Dec 8 12:57:56 PST 2008)
[16:40:21] 
[16:40:21] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[16:40:21] Build host: amoeba
[16:40:21] Board Type: AMD
[16:40:21] Core      : 
[16:40:21] Preparing to commence simulation
[16:40:21] - Looking at optimizations...
[16:40:21] - Created dyn
[16:40:21] - Files status OK
[16:40:21] - Expanded 88220 -> 447304 (decompressed 507.0 percent)
[16:40:21] Called DecompressByteArray: compressed_data_size=88220 data_size=447304, decompressed_data_size=447304 diff=0
[16:40:21] - Digital signature verified
[16:40:21] 
[16:40:21] Project: 4744 (Run 9, Clone 583, Gen 15)
[16:40:21] 
[16:40:21] Assembly optimizations on if available.
[16:40:21] Entering M.D.
[16:40:28] Working on p4744_lam5w_300K
[16:40:28] Client config found, loading data.
[16:40:28] Starting GUI Server
[16:42:33] Completed 1%
[16:44:37] Completed 2%
[16:46:40] Completed 3%
[16:48:46] Completed 4%
[16:50:51] Completed 5%
//*-- lines skipped to save space --*\\\\
[18:26:28] Completed 51%
[18:28:29] Completed 52%
[18:30:33] Completed 53%
[B][18:30:33] mdrun_gpu returned 
[18:30:33] Nonzero force sum on GPU[/B]
[18:30:33] 
[18:30:33] Folding@home Core Shutdown: UNSTABLE_MACHINE
[18:30:37] CoreStatus = 7A (122)
[18:30:37] Sending work to server
[18:30:37] Project: 4744 (Run 9, Clone 583, Gen 15)
[18:30:37] - Error: Could not get length of results file work/wuresults_03.dat
[18:30:37] - Error: Could not read unit 03 file. Removing from queue.
[18:30:37] - Preparing to get new work unit...
[18:30:37] + Attempting to get work packet
[18:30:37] - Connecting to assignment server
[18:30:38] - Successful: assigned to (171.64.65.102).
[18:30:38] + News From Folding@Home: GPU folding beta
[18:30:38] Loaded queue successfully.
[18:30:38] + Closed connections
[18:30:43] 
[18:30:43] + Processing work unit
[18:30:43] Core required: FahCore_11.exe
[18:30:43] Core found.
[18:30:43] Working on queue slot 04 [March 13 18:30:43 UTC]
[18:30:43] + Working ...
[18:30:43] 
[18:30:43] *------------------------------*
[18:30:43] Folding@Home GPU Core - Beta
[18:30:43] Version 1.22 (Mon Dec 8 12:57:56 PST 2008)
[18:30:43] 
[18:30:43] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[18:30:43] Build host: amoeba
[18:30:43] Board Type: AMD
[18:30:43] Core      : 
[18:30:43] Preparing to commence simulation
[18:30:43] - Looking at optimizations...
[18:30:43] - Created dyn
[18:30:43] - Files status OK
[18:30:43] - Expanded 98506 -> 492188 (decompressed 499.6 percent)
[18:30:43] Called DecompressByteArray: compressed_data_size=98506 data_size=492188, decompressed_data_size=492188 diff=0
[18:30:43] - Digital signature verified
[18:30:43] 
[18:30:43] Project: 5732 (Run 2, Clone 43, Gen 98)
[18:30:43] 
[18:30:43] Assembly optimizations on if available.
[18:30:43] Entering M.D.
[18:30:50] Working on Protein
[18:30:50] Client config found, loading data.
[18:30:50] Starting GUI Server
[18:34:08] Completed 1%
[18:37:22] Completed 2%
At this point I am about to give up on the second GPU on this X2 card, go buy a couple of NVIDIA cards for this machine, and use this ATI card in my AMD rig just for gaming.
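
For anyone who wants to tally failures like these without scrolling through the whole log, here is a minimal Python sketch. It assumes the log lines look like the excerpts above (`[HH:MM:SS] message`), that the last non-empty message before each `Core Shutdown: UNSTABLE_MACHINE` entry names the cause, and that the `Project:` line printed after the shutdown identifies the work unit. The names `summarize_failures` and `print_report` and the dictionary keys are made up for illustration, and the log path is whatever file your client writes to.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpts in this post: "[HH:MM:SS] message"
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s*(.*)$")
PROJECT_RE = re.compile(r"Project: (\d+) \(Run \d+, Clone \d+, Gen \d+\)")


def summarize_failures(log_text):
    """Return one dict per UNSTABLE_MACHINE shutdown found in log_text."""
    failures = []
    last_message = ""          # most recent non-empty message seen so far
    awaiting_project = False   # True between a shutdown and its Project line
    for raw in log_text.splitlines():
        # Drop forum bold tags that may have been pasted into the log.
        line = raw.replace("[B]", "").replace("[/B]", "").strip()
        match = LINE_RE.match(line)
        if not match:
            continue
        timestamp, message = match.groups()
        if "Core Shutdown: UNSTABLE_MACHINE" in message:
            failures.append(
                {"time": timestamp, "reason": last_message, "project": None}
            )
            awaiting_project = True
        elif "Processing work unit" in message:
            # A new unit started; stop looking for the failed unit's project.
            awaiting_project = False
        else:
            project = PROJECT_RE.search(message)
            if project and awaiting_project:
                failures[-1]["project"] = int(project.group(1))
                awaiting_project = False
        if message:
            last_message = message
    return failures


def print_report(path):
    """Print each failure, then a count of failures per reported reason."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        failures = summarize_failures(handle.read())
    for item in failures:
        print(item["time"], item["project"], item["reason"])
    print(Counter(item["reason"] for item in failures))
```

Calling `print_report` with the path to the client log prints one line per failed unit, followed by a count per reason, which makes it easier to see whether one error (for example `SHAKE violations on GPU`) dominates.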
How much is it overclocked?
It's no longer overclocked as of last night, when the reboots started happening, since that was the issue there. It's been running at stock since about 9pm last night.
Well, after 15 failed WUs today, I shut everything down, removed the 9.2 drivers, and installed the 9.1 drivers. Shortly after that gpu1 died, and then 20 or so minutes later gpu0 died, so the card is now going back.
Quote:
Originally Posted by oulzac
Well, after 15 failed WUs today, I shut everything down, removed the 9.2 drivers, and installed the 9.1 drivers. Shortly after that gpu1 died, and then 20 or so minutes later gpu0 died, so the card is now going back.

That is so weird.
Have you stress tested the card? Just curious, really wish I knew more about the ATI cards.
Yeah, I had stress-tested it and run many benchmarking tools, and I was having no problems playing games like Mass Effect, Crysis and Fallout 3 while folding for almost two weeks. I think I just overworked the card in those two weeks. I had to install my old X1800 GTO2.


It's on its way back to Newegg now; we'll see how the next one goes. If I have the slightest issue with the second one, it will go back and I will get a GTX 295.
Quote:
Originally Posted by oulzac
Yeah, I had stress-tested it and run many benchmarking tools, and I was having no problems playing games like Mass Effect, Crysis and Fallout 3 while folding for almost two weeks. I think I just overworked the card in those two weeks. I had to install my old X1800 GTO2.

It's on its way back to Newegg now; we'll see how the next one goes. If I have the slightest issue with the second one, it will go back and I will get a GTX 295.

RGR that! New card woot!!