
Maximize your SMP folding! 2 easy steps!

1K views 38 replies 12 participants last post by  Lude 
#1 ·
Ok over the past week I have done some tinkering and found 2 easy things we can all do to maximize our SMP folding rigs.

Step #1) Get VMware and Arch. Easy to install and use.

My findings 1xWINSMP vs. 2xLinux SMP; 3.6GHz (400x9) DDR800 4-4-4-12 2T
WINSMP--->~7:45/% x1 client
LinuxSMP-->~10:45/% x2 clients. Net ppd gain ~50%

Step #2) Back off the DDR MHz and tighten your timings. Sounds crazy I know, but you can get huge gains by running your RAM 800-850MHz at tighter timings than you can get at DDR 1000+MHz.

My findings with same CPU:FSB, varied RAM settings:
DDR1066 5-5-5-18 2T--->~13:00/% x2 Linux SMP clients
DDR800 4-4-4-12 2T---->~10:45/% x2 Linux SMP clients
DDR800 3-3-3-8 1T---->~5:10/% x2 Linux SMP clients. Net ppd gain 100%

Ravin's estimated production gains from tweaks; WINSMP--->Linux & tighter timings: 280% (7:45/% x1 vs. 5:10/%x2)
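For anyone who wants to redo the gain arithmetic with their own frame times, here is a minimal Python sketch. The helper names are made up for illustration, and it assumes every client earns the same points per percent, which is not guaranteed because PPD depends on the work unit each client gets. It only compares how much work gets finished per minute across all clients.

```python
def to_minutes(frame_time: str) -> float:
    """Convert an 'm:ss' time-per-percent string into minutes."""
    minutes, seconds = frame_time.split(":")
    return int(minutes) + int(seconds) / 60


def throughput(frame_time: str, clients: int) -> float:
    """Percent of a work unit finished per minute, summed over all clients."""
    return clients / to_minutes(frame_time)


def gain_percent(before: float, after: float) -> float:
    """Relative throughput gain of 'after' over 'before', in percent."""
    return (after / before - 1) * 100


# Frame times taken from the post above.
win_smp = throughput("7:45", 1)          # 1x WINSMP
linux_loose = throughput("10:45", 2)     # 2x Linux SMP, DDR800 4-4-4-12 2T
linux_tight = throughput("5:10", 2)      # 2x Linux SMP, DDR800 3-3-3-8 1T

print(f"WINSMP -> 2x Linux SMP:      {gain_percent(win_smp, linux_loose):.0f}%")
print(f"4-4-4-12 2T -> 3-3-3-8 1T:   {gain_percent(linux_loose, linux_tight):.0f}%")
print(f"WINSMP -> Linux + tight RAM: {gain_percent(win_smp, linux_tight):.0f}%")
```

Under that equal-points assumption the three steps come out at roughly 44%, 108% and 200% (about 3x the original throughput), which is in the same ballpark as the rounded figures quoted above.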


And don't ask if I
, Just try it for yourself and see!

Edit: DDR800 3-3-3-8 2T---->10:10/%. 1T started to get flaky on me.
 
#4 ·
Quote:


Originally Posted by wire

Good post. You may also see an increase in times from 400 x 9 to 450 x 8.

Thanks. I've thought about running 450x8 instead of 400x9, but 400x9 was so easy to hit and so stable that I never tried dropping the multi and pushing for more FSB. I also doubt that I can run DDR-900 at CAS 3 without insane Vdimm, but again, I never tried.
 
#6 ·
Quote:


Originally Posted by Ravin

Edit: DDR800 3-3-3-8 2T---->10:10/%. 1T started to get flaky on me.


One problem... not everyone's RAM will run 3-3-3-8 2T.

5 mins per % x 2 is pretty extreme folding and probably double what most in the world are folding at.
 
#10 ·
Windows SMP for me is faster with higher RAM speed over tighter timings. I haven't tried it on Linux yet.

I am very interested in Linux folding, but I have a question.

Quote:
DDR800 3-3-3-8 1T---->~5:10/% x2 Linux SMP clients. Net ppd gain 100%
Are you running x2 SMP on VMware?
 
#12 ·
Quote:

Originally Posted by PhelanJKell
Yeah, he is running VMware 2.0 + 2x instances of the Arch Linux distro with SMP edited in by Bal3Wolf.
Thank you.

Do you know the average RAM usage? I want to switch to Linux folding for a while, but I never got to it.
 
#13 ·
Quote:


Originally Posted by cognoscenti

One problem... not everyone's RAM will run 3-3-3-8 2T.

5 mins per % x 2 is pretty extreme folding and probably double what most in the world are folding at.

Yeah, getting CAS 3 stable was not easy. The 1T command rate (~5:00/%) is stable in Memtest, but causes system hangs in XP for me. If I'm not mistaken, 3-3-3-8 timings are fairly tight even at 667MHz and just above average at 533MHz.
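To put "tight" in absolute terms, a CAS value can be converted into nanoseconds. This is a small sketch with a made-up function name; it relies only on the fact that DDR memory transfers twice per clock, so the memory clock is half the DDR rating. It covers the CAS component alone, not the full access latency that a benchmark reports.

```python
def cas_latency_ns(ddr_rate_mhz: float, cas: int) -> float:
    """Absolute CAS latency in nanoseconds for a given DDR rating and CAS value."""
    # DDR transfers data on both clock edges, so the real clock is half the rating.
    clock_mhz = ddr_rate_mhz / 2
    # cycles / MHz gives microseconds; multiply by 1000 for nanoseconds.
    return cas / clock_mhz * 1000


# Speed/CAS combinations mentioned in this thread.
for rate, cas in [(1066, 5), (800, 4), (800, 3), (667, 3), (533, 3)]:
    print(f"DDR{rate} CL{cas}: {cas_latency_ns(rate, cas):.2f} ns")
```

By this measure DDR800 at CAS 3 (7.5 ns) has a shorter CAS delay than DDR1066 at CAS 5 (about 9.4 ns), even though the clock is lower.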

Quote:


Originally Posted by TaiDinh

Do you know the average RAM usage? I want to switch to Linux folding for a while, but I never got to it.

Roughly 200-300MB per client for VMware/arch.....more if you use other distros.

Quote:


Originally Posted by TaiDinh

Windows SMP for me is faster with higher RAM speed over tighter timings.

Cyberdruid was running WINSMP with his 3.6GHz(400x9) quad with DDR 11xxMHz 5-5-5-18 and was getting around 13:00/%. After he adjusted to DDR 800MHz 4-4-4-12 his times went down to around 8-9 min/%.
 
#14 ·
Quote:


Originally Posted by TaiDinh

Thank you.

Do you know the average RAM usage? I want to switch to Linux folding for a while, but I never got to it.

I can't comment on VMware, but my RAM usage while folding in native Arch Linux is 550MB, and that is with a few other small things running as well. I set it to use 2 gigs. IIRC VMware used a lot more RAM.

Edit: Ravin beat me. I'm going to see what my usage is with nothing else running.

Good post Ravin. The only thing that would be better is native Arch Linux folding. I might try messing with my memory timings; I never really have much.
 
#15 ·
Quote:


Originally Posted by Lude

Good post Ravin. The only thing that would be better is native Arch Linux folding. I might try messing with my memory timings; I never really have much.

Thanks. My next box will be dual 8-core nehalems running Solaris for dedicated folding only. ETA to build.....2 years. I need windows, and I figure this quad will still be plenty of oomph for an e-machine/office apps for many, many years to come.
 
#16 ·
Quote:


Originally Posted by Ravin

Yeah, getting CAS 3 stable was not easy. The 1T command rate (~5:00/%) is stable in Memtest, but causes system hangs in XP for me. If I'm not mistaken, 3-3-3-8 timings are fairly tight even at 667MHz and just above average at 533MHz.

Roughly 200-300MB per client for VMware/arch.....more if you use other distros.

Cyberdruid was running WINSMP with his 3.6GHz(400x9) quad with DDR 11xxMHz 5-5-5-18 and was getting around 13:00/%. After he adjusted to DDR 800MHz 4-4-4-12 his times went down to around 8-9 min/%.

That is interesting.

Let me go try going from 938MHz 5-5-5-15 to 736MHz with 4-4-4-12, and maybe 938MHz with 4-5-5-15/12
 
#17 ·
Quote:


Originally Posted by TaiDinh

That is interesting.

Let me go try going from 938MHz 5-5-5-15 to 736MHz with 4-4-4-12, and maybe 938MHz with 4-5-5-15/12

Someone over in the P5N-E SLI thread noted that they got better performance at DDR 800-850 speeds, and took a huge hit on latencies from 851-1150MHz. Early (revision 1 and 2) 965 Express boards also experienced this phenomenon... speculation is that between 850-1150MHz the MCH straps new latencies that don't show increased performance even with 200MHz increases on the FSB.

MCH strap latencies are definitely an issue with p945/p965/650i/and 680i boards. I have not seen data on the P35/x38/780i/790i MCH latencies.
 
#19 ·
Yeah, you are right on it being due to the strap changes, or latency changes I should say.
The P35 doesn't necessarily have strap changes at a given FSB, but for a given RAM divider you'll get different sets of latencies. It's called performance level, I think.

On some boards this is adjustable and on others it is not. The IP35 Pro sets the lowest possible from what I understand, and it does a pretty good job.

I have tried keeping the same RAM speed and just changing the timings from 4-4-4-12 to 3-3-3-8, both at around DDR 700, and I didn't see much change, if any, but that was just me.

The greatest increase I saw was from using the affinity changer listed in another thread. I run two instances of SMP in Vista and am averaging 13 min/% times two, so basically 13 min per 2%, on the 2653 unit. This is with a quad at about 3.2GHz. So I am now getting roughly two of the 2653 WUs done every 21.6 hrs, which is really good I think.
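The step from minutes per percent to hours per work unit is just a multiply by 100. A quick sketch, with made-up function names, assuming the frame time stays constant over the whole WU:

```python
def hours_per_wu(minutes_per_percent: float) -> float:
    """Hours to finish one work unit at a steady time per percent."""
    return minutes_per_percent * 100 / 60


def wus_per_day(minutes_per_percent: float, clients: int) -> float:
    """Work units finished per 24 hours across all clients."""
    return clients * 24 / hours_per_wu(minutes_per_percent)


print(f"{hours_per_wu(13):.2f} hours per WU")      # 13 min/% as quoted above
print(f"{wus_per_day(13, 2):.2f} WUs per day")     # two clients side by side
```

At 13 min/% that is a little under 21.7 hours per WU, so two clients finish slightly more than two WUs per day.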
 
#20 ·
Quote:

Originally Posted by MADMAX22
Yeah, you are right on it being due to the strap changes, or latency changes I should say.
The P35 doesn't necessarily have strap changes at a given FSB, but for a given RAM divider you'll get different sets of latencies. It's called performance level, I think.

On some boards this is adjustable and on others it is not. The IP35 Pro sets the lowest possible from what I understand, and it does a pretty good job.

I have tried keeping the same RAM speed and just changing the timings from 4-4-4-12 to 3-3-3-8, both at around DDR 700, and I didn't see much change, if any, but that was just me.

The greatest increase I saw was from using the affinity changer listed in another thread. I run two instances of SMP in Vista and am averaging 13 min/% times two, so basically 13 min per 2%, on the 2653 unit. This is with a quad at about 3.2GHz. So I am now getting roughly two of the 2653 WUs done every 21.6 hrs, which is really good I think.

I just finished my 2 WUs and submitted, so I'm going to mess around a little more. I have some of those subtiming adjustments on my board... Clock Twister and Transaction Booster, I think. Gonna try those out and see what happens.
 
#21 ·
Well, I lost a WU messing around with the timings, so that sucks. But I didn't find any speed difference between 5-5-5-15/4-4-4-12/4-3-4-9. I'm running 4-4-4-12 right now @ 947MHz.

My DFI Blood Iron has more memory options than colors in a rainbow, but I don't even know where to start with them!
 
#22 ·
The joys of DFI. As soon as Intel puts the memory controller on die I'll switch to DFI just so I can play with the timings and actually get results from them.

Unfortunately, right now Intel doesn't care much what timings you use. These big results, I think, are from the different straps (latency sets) that get used when switching the RAM dividers.

I know on the Abit IP35 Pro the RAM dividers set the strap, so the first two are the 1333 strap, the second two are 1066, the third set is 800, and then it jumps back up to 1066, I think, for the other divider, which is kind of weird. Getting into these different straps changes what's called the performance level, which can play a major role in RAM performance.

At least that's what I've seen so far. Also, Memtest is your friend.
 
#23 ·
Quote:


Originally Posted by MADMAX22

Unfortunately, right now Intel doesn't care much what timings you use. These big results, I think, are from the different straps (latency sets) that get used when switching the RAM dividers.

I agree. The biggest difference I saw was going from the 1066 to the 800 MHz strap, and again going from 2T to 1T. Sure, I lost about 300MB/s of bandwidth according to Memtest and Sandra, but I also decreased latencies from ~80ns to ~40-50ns.
Just wish 1T was stable in Windows. Memtest86+ says it's fine, but programs hang and lag in Win.

Quote:


Originally Posted by PhelanJKell

Well, I lost a WU messing around with the timings, so that sucks. But I didn't find any speed difference between 5-5-5-15/4-4-4-12/4-3-4-9. I'm running 4-4-4-12 right now @ 947MHz.

My DFI Blood Iron has more memory options than colors in a rainbow, but I don't even know where to start with them!

I shoulda warned you about that. FAH likes to drop WU's when you change timings. I lost one yesterday too....Thought it was sent, rebooted and changed timings, restarted folding and lost a completed WU.....all cause that stupid "INTERRUPTED" when a WU tries to submit.

I have a lot of similar memory tweaks too. Clock Twister @ strong actually decreases performance for me, and I cannot even POST with the Transaction Booster (performance level) enabled.

Were you already running 1:1 with your CPU/FSB/RAM? If you were I bet that's why you are not seeing much change.
 
#24 ·
I forgot to update last night.

I went from 920MHz 5-5-5-15 to 920MHz 4-4-4-12 and went from 8:49 minutes per % on WU #2653 to 8:17 minutes per %!


I need to get a new motherboard since my current DS3 craps out at 368FSB or higher.
 
#25 ·
Quote:


Originally Posted by TaiDinh

I forgot to update last night.

I went from 920MHz 5-5-5-15 to 920MHz 4-4-4-12 and went from 8:49 minutes per % on WU #2653 to 8:17 minutes per %!


I need to get a new motherboard since my current DS3 craps out at 368FSB or higher.

Great job TaiDinh!

Preliminary analysis (guesstimate) of the data suggests that at the same MHz you gain around 0:30/% by lowering CAS by one. It's the pattern I saw going from 5-5-5-15--->4-4-4-12--->3-3-3-8 at 800MHz.
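The two same-speed comparisons posted in this thread can be checked against that ~0:30/% figure with a few lines of Python. The helper names are made up, and the two rigs ran different setups, so treat it as a rough sanity check rather than a controlled benchmark.

```python
def to_seconds(frame_time: str) -> int:
    """Convert an 'm:ss' time-per-percent string into seconds."""
    minutes, seconds = frame_time.split(":")
    return int(minutes) * 60 + int(seconds)


def saved_seconds(before: str, after: str) -> int:
    """Seconds saved per percent after a timing change."""
    return to_seconds(before) - to_seconds(after)


# (before, after) frame times reported in this thread, one CAS step each.
samples = {
    "Ravin   DDR800 2T, 4-4-4-12 -> 3-3-3-8": ("10:45", "10:10"),
    "TaiDinh 920MHz,    5-5-5-15 -> 4-4-4-12": ("8:49", "8:17"),
}

for label, (before, after) in samples.items():
    saved = saved_seconds(before, after)
    percent = saved / to_seconds(before) * 100
    print(f"{label}: {saved} s per % saved ({percent:.1f}% faster)")
```

Both samples land at a bit over 30 seconds per percent, or roughly 5-6% faster frames.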

is your ram/CPU/FSB 1:1?
 
#26 ·
Quote:


Originally Posted by Ravin

Great job TaiDinh!

Preliminary analysis (guesstimate) of the data suggests that at the same MHz you gain around 0:30/% by lowering CAS by one. It's the pattern I saw going from 5-5-5-15--->4-4-4-12--->3-3-3-8 at 800MHz.

is your ram/CPU/FSB 1:1?

It is not. Since I can only do 368 FSB max, the 2.0 divider would run my RAM at double my FSB, which is only 736MHz, but that would give me 1:1.

I am running at 2.5 divider with the RAM at 920MHz. I'm at school, but I think I'm at 5:6

Is it worth dropping nearly 200MHz in RAM speed for 1:1 ratio?
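The divider math can be laid out in a short sketch. It assumes the BIOS divider is simply the DDR rating divided by the FSB clock, which matches the 736MHz and 920MHz figures above; the function names are made up for illustration.

```python
from fractions import Fraction


def ram_ddr_mhz(fsb_mhz: float, divider: float) -> float:
    """DDR rating produced by a given FSB clock and memory divider."""
    return fsb_mhz * divider


def fsb_to_dram_ratio(divider: float) -> Fraction:
    """FSB:DRAM clock ratio for a divider (assumes divider = DDR rating / FSB)."""
    # The DDR rating is double the memory clock, so a 2.0 divider means 1:1.
    return Fraction(2) / Fraction(str(divider))


for divider in (2.0, 2.5):
    ratio = fsb_to_dram_ratio(divider)
    print(f"divider {divider}: DDR{ram_ddr_mhz(368, divider):.0f}, "
          f"FSB:DRAM = {ratio.numerator}:{ratio.denominator}")
```

Under that assumption the 2.5 divider works out to 4:5 rather than 5:6, so it is worth confirming the actual ratio on the board itself.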
 