I've been experiencing a very odd problem with my SMP client. The client downloads an A3 WU, folds it, and then fails to upload the results. I know this is not an issue with my internet connection, because I have been downloading and uploading GPU2 WUs all day, and my other computer (a C2D running SMP2 under Linux) can also connect to the servers. I can also reach the server in Firefox.
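For what it's worth, a browser check only shows that the server answers somewhere. A rough way to test the exact address and port pairs the client reports is a plain TCP connect, sketched here in Python. The helper name `can_connect` and the 10-second default timeout are my own choices for illustration, and a successful connect only shows that the port is reachable, not that a large results upload will go through.

```python
import socket


def can_connect(host, port, timeout=10.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This checks reachability only; it sends no data and says nothing
    about whether an upload to the same server would complete.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `can_connect("171.64.65.56", 8080)` and `can_connect("171.64.65.56", 80)` would test the two ports the client tries on the first work server in my log.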
I'd obviously like to get this resolved ASAP, because I currently have 2 WUs that won't upload, and both their usefulness to Stanford and the bonus points keep going down.
Here's an excerpt from my log file; if you need more, I can post it:
Code:
--- Opening Log file [July 6 16:31:36 UTC]
# Windows SMP Console Edition #################################################
###############################################################################
Folding@Home Client Version 6.29
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: C:\FAH\SMP
Executable: C:\FAH\SMP\[email protected]
Arguments: -smp -verbosity 9
[16:31:36] - Ask before connecting: No
[16:31:36] - User name: iFX (Team 37726)
[16:31:36] - User ID: 7728BA4739E8CD00
[16:31:36] - Machine ID: 1
[16:31:36]
[16:31:36] Loaded queue successfully.
[16:31:36]
[16:31:36] - Autosending finished units... [July 6 16:31:36 UTC]
[16:31:36] + Processing work unit
[16:31:36] Trying to send all finished work units
[16:31:36] Core required: FahCore_a3.exe
[16:31:36] Project: 6701 (Run 7, Clone 24, Gen 13)
[16:31:36] Core found.
[16:31:36] + Attempting to send results [July 6 16:31:36 UTC]
[16:31:36] - Reading file work/wuresults_00.dat from core
[16:31:36] Working on queue slot 01 [July 6 16:31:36 UTC]
[16:31:36] (Read 43593369 bytes from disk)
[16:31:36] + Working ...
[16:31:36] Connecting to http://171.64.65.56:8080/
[16:31:36] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 01 -np 8 -checkpoint 3 -verbose -lifeline 780 -version 629'
[16:31:36]
[16:31:36] *------------------------------*
[16:31:36] Folding@Home Gromacs SMP Core
[16:31:36] Version 2.22 (Mar 12, 2010)
[16:31:36]
[16:31:36] Preparing to commence simulation
[16:31:36] - Ensuring status. Please wait.
[16:31:46] - Looking at optimizations...
[16:31:46] - Working with standard loops on this execution.
[16:31:46] - Previous termination of core was improper.
[16:31:46] - Going to use standard loops.
[16:31:46] - Files status OK
[16:31:46] - Expanded 1764978 -> 2250761 (decompressed 127.5 percent)
[16:31:46] Called DecompressByteArray: compressed_data_size=1764978 data_size=2250761, decompressed_data_size=2250761 diff=0
[16:31:46] - Digital signature verified
[16:31:46]
[16:31:46] Project: 6052 (Run 0, Clone 18, Gen 64)
[16:31:46]
[16:31:46] Entering M.D.
[16:31:52] Using Gromacs checkpoints
[16:31:53] Resuming from checkpoint
[16:31:53] Verified work/wudata_01.log
[16:31:53] Verified work/wudata_01.trr
[16:31:53] Verified work/wudata_01.edr
[16:31:53] Completed 210608 out of 500000 steps (42%)
[16:35:56] Completed 215000 out of 500000 steps (43%)
[16:39:35] Completed 220000 out of 500000 steps (44%)
[16:43:06] Completed 225000 out of 500000 steps (45%)
[16:46:22] Completed 230000 out of 500000 steps (46%)
[16:50:18] Completed 235000 out of 500000 steps (47%)
[16:53:18] - Couldn't send HTTP request to server
[16:53:18] + Could not connect to Work Server (results)
[16:53:18] (171.64.65.56:8080)
[16:53:18] + Retrying using alternative port
[16:53:18] Connecting to http://171.64.65.56:80/
[16:53:21] - Couldn't send HTTP request to server
[16:53:21] + Could not connect to Work Server (results)
[16:53:21] (171.64.65.56:80)
[16:53:21] - Error: Could not transmit unit 00 (completed July 6) to work server.
[16:53:21] - 7 failed uploads of this unit.
[16:53:21] + Attempting to send results [July 6 16:53:21 UTC]
[16:53:21] - Reading file work/wuresults_00.dat from core
[16:53:21] (Read 43593369 bytes from disk)
[16:53:21] Connecting to http://171.67.108.25:8080/
[16:54:12] Completed 240000 out of 500000 steps (48%)
[16:57:45] Posted data.
[16:57:45] Initial: 0000; + Could not connect to Work Server (results)
[16:57:45] (171.67.108.25:8080)
[16:57:45] + Retrying using alternative port
[16:57:45] Connecting to http://171.67.108.25:80/
[16:58:12] Completed 245000 out of 500000 steps (49%)
[17:01:43] Completed 250000 out of 500000 steps (50%)
[17:05:22] Completed 255000 out of 500000 steps (51%)
[17:08:39] Completed 260000 out of 500000 steps (52%)
[17:12:00] Completed 265000 out of 500000 steps (53%)
[17:15:56] Completed 270000 out of 500000 steps (54%)
[17:18:19] - Couldn't send HTTP request to server
[17:18:19] + Could not connect to Work Server (results)
[17:18:19] (171.67.108.25:80)
[17:18:19] Could not transmit unit 00 to Collection server; keeping in queue.
[17:18:19] Project: 6701 (Run 7, Clone 24, Gen 13)
[17:18:19] + Attempting to send results [July 6 17:18:19 UTC]
[17:18:19] - Reading file work/wuresults_00.dat from core
[17:18:19] (Read 43593369 bytes from disk)
[17:18:19] Connecting to http://171.64.65.56:8080/
[17:20:03] Completed 275000 out of 500000 steps (55%)
[17:23:18] Completed 280000 out of 500000 steps (56%)
[17:26:59] Completed 285000 out of 500000 steps (57%)
[17:30:33] Completed 290000 out of 500000 steps (58%)
--- Opening Log file [July 6 17:37:05 UTC]
# Windows SMP Console Edition #################################################
###############################################################################
Folding@Home Client Version 6.29
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: C:\FAH\SMP
Executable: C:\FAH\SMP\[email protected]
Arguments: -smp -verbosity 9
[17:37:05] - Ask before connecting: No
[17:37:05] - User name: iFX (Team 37726)
[17:37:05] - User ID: 7728BA4739E8CD00
[17:37:05] - Machine ID: 1
[17:37:05]
[17:37:05] Loaded queue successfully.
[17:37:05]
[17:37:05] - Autosending finished units... [July 6 17:37:05 UTC]
[17:37:05] + Processing work unit
[17:37:05] Trying to send all finished work units
[17:37:05] Core required: FahCore_a3.exe
[17:37:05] Project: 6701 (Run 7, Clone 24, Gen 13)
[17:37:05] Core found.
[17:37:05] + Attempting to send results [July 6 17:37:05 UTC]
[17:37:05] - Reading file work/wuresults_00.dat from core
[17:37:05] Working on queue slot 01 [July 6 17:37:05 UTC]
[17:37:06] (Read 43593369 bytes from disk)
[17:37:06] + Working ...
[17:37:06] Connecting to http://171.64.65.56:8080/
[17:37:06] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 01 -np 8 -checkpoint 3 -verbose -lifeline 2400 -version 629'
[17:37:06]
[17:37:06] *------------------------------*
[17:37:06] Folding@Home Gromacs SMP Core
[17:37:06] Version 2.22 (Mar 12, 2010)
[17:37:06]
[17:37:06] Preparing to commence simulation
[17:37:06] - Looking at optimizations...
[17:37:06] - Files status OK
[17:37:06] - Expanded 1764978 -> 2250761 (decompressed 127.5 percent)
[17:37:06] Called DecompressByteArray: compressed_data_size=1764978 data_size=2250761, decompressed_data_size=2250761 diff=0
[17:37:06] - Digital signature verified
[17:37:06]
[17:37:06] Project: 6052 (Run 0, Clone 18, Gen 64)
[17:37:06]
[17:37:06] Assembly optimizations on if available.
[17:37:06] Entering M.D.
[17:37:13] Using Gromacs checkpoints
[17:37:13] Resuming from checkpoint
[17:37:13] Verified work/wudata_01.log
[17:37:13] Verified work/wudata_01.trr
[17:37:13] Verified work/wudata_01.edr
[17:37:14] Completed 291908 out of 500000 steps (58%)
[17:39:33] Completed 295000 out of 500000 steps (59%)
[17:42:51] Completed 300000 out of 500000 steps (60%)
[17:46:26] Completed 305000 out of 500000 steps (61%)
[17:50:12] Completed 310000 out of 500000 steps (62%)
[17:53:44] Completed 315000 out of 500000 steps (63%)
[17:57:07] Completed 320000 out of 500000 steps (64%)
[18:00:39] Completed 325000 out of 500000 steps (65%)
[18:04:07] Completed 330000 out of 500000 steps (66%)
[18:07:54] Completed 335000 out of 500000 steps (67%)
[18:11:19] Completed 340000 out of 500000 steps (68%)
[18:14:41] Completed 345000 out of 500000 steps (69%)
[18:18:04] Completed 350000 out of 500000 steps (70%)
[18:21:40] Completed 355000 out of 500000 steps (71%)
[18:25:11] Completed 360000 out of 500000 steps (72%)
[18:28:47] Completed 365000 out of 500000 steps (73%)
[18:32:20] Completed 370000 out of 500000 steps (74%)
[18:36:13] Completed 375000 out of 500000 steps (75%)
[18:40:29] Completed 380000 out of 500000 steps (76%)
[18:44:05] Completed 385000 out of 500000 steps (77%)
[18:47:51] Completed 390000 out of 500000 steps (78%)
[18:51:29] Completed 395000 out of 500000 steps (79%)
[18:55:42] Completed 400000 out of 500000 steps (80%)
[18:59:43] Completed 405000 out of 500000 steps (81%)
[19:03:27] Completed 410000 out of 500000 steps (82%)
[19:06:45] Completed 415000 out of 500000 steps (83%)
[19:10:02] Completed 420000 out of 500000 steps (84%)
[19:13:31] Completed 425000 out of 500000 steps (85%)
[19:16:55] Completed 430000 out of 500000 steps (86%)
[19:20:28] Completed 435000 out of 500000 steps (87%)
[19:23:24] - Couldn't send HTTP request to server
[19:23:24] + Could not connect to Work Server (results)
[19:23:24] (171.64.65.56:8080)
[19:23:24] + Retrying using alternative port
[19:23:24] Connecting to http://171.64.65.56:80/
[19:23:46] Completed 440000 out of 500000 steps (88%)
[19:27:29] Completed 445000 out of 500000 steps (89%)
[19:31:44] Completed 450000 out of 500000 steps (90%)
[19:35:50] Completed 455000 out of 500000 steps (91%)
[19:39:02] Completed 460000 out of 500000 steps (92%)
[19:42:15] - Couldn't send HTTP request to server
[19:42:15] + Could not connect to Work Server (results)
[19:42:15] (171.64.65.56:80)
[19:42:15] - Error: Could not transmit unit 00 (completed July 6) to work server.
[19:42:15] - 8 failed uploads of this unit.
[19:42:15] + Attempting to send results [July 6 19:42:15 UTC]
[19:42:15] - Reading file work/wuresults_00.dat from core
[19:42:15] (Read 43593369 bytes from disk)
[19:42:15] Connecting to http://171.67.108.25:8080/
[19:42:22] Completed 465000 out of 500000 steps (93%)
[19:44:16] - Couldn't send HTTP request to server
[19:44:16] + Could not connect to Work Server (results)
[19:44:16] (171.67.108.25:8080)
[19:44:16] + Retrying using alternative port
[19:44:16] Connecting to http://171.67.108.25:80/
[19:46:21] Completed 470000 out of 500000 steps (94%)
[19:47:35] Posted data.
[19:47:35] Initial: 0000; + Could not connect to Work Server (results)
[19:47:35] (171.67.108.25:80)
[19:47:35] Could not transmit unit 00 to Collection server; keeping in queue.
[19:47:35] Project: 6701 (Run 7, Clone 24, Gen 13)
[19:47:35] + Attempting to send results [July 6 19:47:35 UTC]
[19:47:35] - Reading file work/wuresults_00.dat from core
[19:47:35] (Read 43593369 bytes from disk)
[19:47:35] Connecting to http://171.64.65.56:8080/
[19:50:14] Completed 475000 out of 500000 steps (95%)
[19:54:15] Completed 480000 out of 500000 steps (96%)
[19:57:53] Completed 485000 out of 500000 steps (97%)
[20:01:22] Completed 490000 out of 500000 steps (98%)
[20:05:27] Completed 495000 out of 500000 steps (99%)
[20:08:56] Completed 500000 out of 500000 steps (100%)
[20:08:57] DynamicWrapper: Finished Work Unit: sleep=10000
[20:09:07]
[20:09:07] Finished Work Unit:
[20:09:07] - Reading up to 3698496 from "work/wudata_01.trr": Read 3698496
[20:09:07] trr file hash check passed.
[20:09:07] edr file hash check passed.
[20:09:07] logfile size: 66520
[20:09:07] Leaving Run
[20:09:09] - Writing 3800568 bytes of core data to disk...
[20:09:09] ... Done.
[20:09:09] - Shutting down core
[20:09:09]
[20:09:09] Folding@home Core Shutdown: FINISHED_UNIT
[20:09:14] CoreStatus = 64 (100)
[20:09:14] Unit 1 finished with 95 percent of time to deadline remaining.
[20:09:14] Updated performance fraction: 0.930753
[20:09:14] Sending work to server
[20:09:14] - Already sending work
[20:09:14] Trying to send all finished work units
[20:09:14] - Already sending work
[20:09:14] - Already sending work
[20:09:14] + Sent 0 of 2 completed units to the server
[20:09:14] - Preparing to get new work unit...
[20:09:14] Cleaning up work directory
[20:09:14] + Attempting to get work packet
[20:09:14] Passkey found
[20:09:14] - Will indicate memory of 2046 MB
[20:09:14] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 14, Stepping: 5
[20:09:14] - Connecting to assignment server
[20:09:14] Connecting to http://assign.stanford.edu:8080/
[20:09:16] Posted data.
[20:09:16] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[20:09:16] + News From Folding@Home: Welcome to Folding@Home
[20:09:16] Loaded queue successfully.
[20:09:16] Connecting to http://171.64.65.56:8080/
[20:09:21] Posted data.
[20:09:21] Initial: 0000; - Receiving payload (expected size: 763807)
[20:09:31] - Downloaded at ~74 kB/s
[20:09:31] - Averaged speed for that direction ~144 kB/s
[20:09:31] + Received work.
[20:09:31] Trying to send all finished work units
[20:09:31] - Already sending work
[20:09:31] - Already sending work
[20:09:31] + Sent 0 of 2 completed units to the server
[20:09:31] + Closed connections
[20:09:31]
[20:09:31] + Processing work unit
[20:09:31] Core required: FahCore_a3.exe
[20:09:31] Core found.
[20:09:31] Working on queue slot 02 [July 6 20:09:31 UTC]
[20:09:31] + Working ...
[20:09:31] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 02 -np 8 -checkpoint 3 -verbose -lifeline 2400 -version 629'
[20:09:31]
[20:09:31] *------------------------------*
[20:09:31] Folding@Home Gromacs SMP Core
[20:09:31] Version 2.22 (Mar 12, 2010)
[20:09:31]
[20:09:31] Preparing to commence simulation
[20:09:31] - Looking at optimizations...
[20:09:31] - Created dyn
[20:09:31] - Files status OK
[20:09:31] - Expanded 763295 -> 1404481 (decompressed 184.0 percent)
[20:09:31] Called DecompressByteArray: compressed_data_size=763295 data_size=1404481, decompressed_data_size=1404481 diff=0
[20:09:31] - Digital signature verified
[20:09:31]
[20:09:31] Project: 6701 (Run 7, Clone 24, Gen 13)
[20:09:31]
[20:09:31] Assembly optimizations on if available.
[20:09:31] Entering M.D.
[20:09:38] Completed 0 out of 2000000 steps (0%)
[20:11:15] Killing all core threads
[20:11:15] Could not get process id information. Please kill core process manually
Folding@Home Client Shutdown at user request.
[20:11:15] ***** Got a SIGTERM signal (2)
[20:11:15] Killing all core threads
[20:11:15] Could not get process id information. Please kill core process manually
Folding@Home Client Shutdown.
--- Opening Log file [July 6 20:19:43 UTC]
# Windows SMP Console Edition #################################################
###############################################################################
Folding@Home Client Version 6.29
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: C:\FAH\SMP
Executable: C:\FAH\SMP\[email protected]
Arguments: -smp -verbosity 9
[20:19:43] - Ask before connecting: No
[20:19:43] - User name: iFX (Team 37726)
[20:19:43] - User ID: 7728BA4739E8CD00
[20:19:43] - Machine ID: 1
[20:19:43]
[20:19:43] Loaded queue successfully.
[20:19:43]
[20:19:43] - Autosending finished units... [July 6 20:19:43 UTC]
[20:19:43] + Processing work unit
[20:19:43] Trying to send all finished work units
[20:19:43] Core required: FahCore_a3.exe
[20:19:43] Project: 6701 (Run 7, Clone 24, Gen 13)
[20:19:43] Core found.
[20:19:43] + Attempting to send results [July 6 20:19:43 UTC]
[20:19:43] - Reading file work/wuresults_00.dat from core
[20:19:43] Working on queue slot 02 [July 6 20:19:43 UTC]
[20:19:43] (Read 43593369 bytes from disk)
[20:19:43] + Working ...
[20:19:43] Connecting to http://171.64.65.56:8080/
[20:19:43] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 02 -np 8 -checkpoint 3 -verbose -lifeline 2036 -version 629'
[20:19:43]
[20:19:43] *------------------------------*
[20:19:43] Folding@Home Gromacs SMP Core
[20:19:43] Version 2.22 (Mar 12, 2010)
[20:19:43]
[20:19:43] Preparing to commence simulation
[20:19:43] - Ensuring status. Please wait.
[20:19:53] - Looking at optimizations...
[20:19:53] - Working with standard loops on this execution.
[20:19:53] - Previous termination of core was improper.
[20:19:53] - Files status OK
[20:19:53] - Expanded 763295 -> 1404481 (decompressed 184.0 percent)
[20:19:53] Called DecompressByteArray: compressed_data_size=763295 data_size=1404481, decompressed_data_size=1404481 diff=0
[20:19:53] - Digital signature verified
[20:19:53]
[20:19:53] Project: 6701 (Run 7, Clone 24, Gen 13)
[20:19:53]
[20:19:53] Entering M.D.
[20:19:59] Completed 0 out of 2000000 steps (0%)
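
To put numbers on the failures in a log like the one above, this rough Python sketch pairs each "Connecting to" line with the matching "Could not connect to Work Server" report and works out how long each attempt ran before failing. The function name `failed_uploads` and the regular expressions are my own, written against the line format shown in this excerpt; other client versions may log differently.

```python
import re
from datetime import datetime, timedelta

# Patterns written against the log excerpt above (an assumption about the format).
CONNECT = re.compile(r"^\[(\d\d:\d\d:\d\d)\]\s*Connecting to http://([\d.]+:\d+)/")
ENDPOINT = re.compile(r"^\[(\d\d:\d\d:\d\d)\]\s*\(([\d.]+:\d+)\)")


def _clock(text):
    """Parse an HH:MM:SS log timestamp (the date part is a dummy)."""
    return datetime.strptime(text, "%H:%M:%S")


def failed_uploads(lines):
    """Return a list of (endpoint, seconds_waited) for each failed connection.

    seconds_waited is None when no earlier "Connecting to" line was seen
    for that endpoint.
    """
    started = {}             # endpoint -> time of the latest connection attempt
    expect_endpoint = False  # True right after a "Could not connect" line
    failures = []
    for raw in lines:
        line = raw.strip()
        if expect_endpoint:
            expect_endpoint = False
            hit = ENDPOINT.match(line)
            if hit:
                endpoint = hit.group(2)
                begin = started.pop(endpoint, None)
                waited = None
                if begin is not None:
                    delta = _clock(hit.group(1)) - begin
                    if delta < timedelta(0):  # attempt ran past midnight
                        delta += timedelta(days=1)
                    waited = int(delta.total_seconds())
                failures.append((endpoint, waited))
                continue
        hit = CONNECT.match(line)
        if hit:
            started[hit.group(2)] = _clock(hit.group(1))
        elif "Could not connect to Work Server" in line:
            expect_endpoint = True
    return failures
```

Passing the lines of the log file to `failed_uploads` gives one entry per failure. For the first attempt in my log (connect at 16:31:36, failure reported at 16:53:18) that works out to 1302 seconds, so the attempt seems to run for over 20 minutes before the client gives up, which looks more like a stalled transfer than a refused connection.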

One idea I had that might work, at least temporarily, is to run the client with -oneunit, copy the work folder and queue.dat over to another computer, and upload the results from there. Would that work?
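
The copy step of that idea could be sketched in Python roughly as follows. This only stages the files: the function name `stage_finished_work` and both directory arguments are hypothetical, the client should be stopped before copying, and I don't know whether a client on the second machine will accept and upload the copied queue.

```python
import shutil
from pathlib import Path


def stage_finished_work(client_dir, dest_dir):
    """Copy queue.dat and the work folder from client_dir into dest_dir.

    Both paths are placeholders chosen by the caller. Nothing is deleted
    from client_dir, so the originals stay in place if the copy turns out
    not to help. Requires Python 3.8+ for dirs_exist_ok.
    """
    client_dir = Path(client_dir)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # queue.dat records which slots hold finished results.
    shutil.copy2(client_dir / "queue.dat", dest_dir / "queue.dat")

    # The work folder holds the wuresults_*.dat files waiting to be sent.
    shutil.copytree(client_dir / "work", dest_dir / "work", dirs_exist_ok=True)
    return dest_dir
```

For example, `stage_finished_work(r"C:\FAH\SMP", r"E:\fah_copy")` would copy from my launch directory to a made-up folder on a removable drive.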