Overclock.net › Forums › Software, Programming and Coding › Operating Systems › Linux, Unix › Question: Any guide (website/book) to setting up Linux computer cluster for scientific computation?

Question: Any guide (website/book) to setting up Linux computer cluster for scientific computation?

post #1 of 7
Thread Starter 
I'm a graduate student in economics, and I do a lot of simulations that require pure CPU power. Most of these simulations can be parallelized and hence can easily benefit from having more CPU cores. They can sometimes benefit from offloading to the GPU as well, but the way GPUs handle things is currently of limited use for my research, so for now I'm focusing only on CPU power.

Given how expensive Xeon processors are, and how cheap (and how OC-able!) Ivy Bridge/Sandy Bridge desktop CPUs are, it makes more sense financially to hook up a few desktop computers together than to buy an 8- or 16-core Xeon setup. That would probably give me more flexibility in upgrading as well. (I just found a Dell desktop with an i7 3770 and 8GB RAM for $585 including tax on their outlet.)

I don't really have the budget to build anything right now, but I can easily find a few old laptops in my or my co-workers' closets to test things out. This is not supposed to be my final build, but rather something I can use to test and learn how to set up such a thing. (I'm getting my PhD this year, and hopefully will have some research funding to back up my computational needs once I get a job at some academic department.)

Is there any good website or book you can recommend?

I imagine this tiny cluster would just be sitting in some corner without a monitor connected to it, and I'll mostly be sending jobs to it remotely from other computers. I primarily run MATLAB and some statistical packages, and I know, for example, that MATLAB has its own distributed computing toolbox, which I would have to license and learn how to set up. But for now, let's say I just want to hook up these computers without MATLAB, or test them with some open-source software that supports multi-threading.

Any suggestions about how I should go about learning and testing it? Which Linux distribution? What additional software (paid or open-source) do I need?

Thanks a lot.
My System (13 items)
CPU: i7 2600k @ 4.5GHz
Motherboard: ASUS P8P67 Deluxe
Graphics: EVGA 670 GTX 4GB
RAM: 4 x 8GB Corsair Vengeance
Hard Drive: Samsung 830 256GB SSD
OS: Windows 7 Professional 64-bit
Monitor: Dell 2407WFP-HC
Mouse: Logitech MX Revolution
post #2 of 7
What I would do is use AMD Opterons. They're much cheaper, and while you'll have to foot the power bill, the start-up cost would be a lot less. Do you need floating point performance or integer?

http://www.newegg.com/Product/Product.aspx?Item=N82E16819113036

You can fit four of those babies on a single board. If floating point is a huge concern, I would look at the 8-12 core Phenom-based Opterons; 8-cores have gone on eBay for around $100. In integer work, however, these Bulldozers wreck. Here's a rough Intel-to-AMD equivalence grid, assuming the same clock speed:
Code:
floating point performance                 integer performance

Intel 6 core (SB-E) ~ Bulldozer 16 core    Intel 6 core ~ Bulldozer 12 core
Intel 4 core ~ Bulldozer 12 core           Intel 8 core ~ Bulldozer 16 core


Now, since you could get four 16-core AMDs for about $2000 for the procs, plus $500 for the mobo, you could likely build a machine on that idea for ~$3000.

For the distro, you'll want something very, very stable. CentOS/RHEL, Debian, or FreeBSD.
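To put the numbers in this thread side by side, here is a rough per-core cost sketch. The prices are the ones quoted above (~$3000 for a machine with four 16-core Opterons, $585 for the Dell i7 3770 desktop, which is a 4-core part). The function names are made up for illustration, and the floating point adjustment simply takes the "Intel 4 core ~ Bulldozer 12 core" row of the grid at face value, so treat the result as a back-of-the-envelope estimate rather than a benchmark.

```python
def cost_per_core(total_cost, cores):
    """Dollars per physical core."""
    return total_cost / cores


def intel_equivalent_cores(bulldozer_cores, intel=4, bulldozer=12):
    """Convert Bulldozer cores to Intel-core equivalents for floating point,
    assuming the grid's 'Intel 4 core ~ Bulldozer 12 core' ratio holds."""
    return bulldozer_cores * intel / bulldozer


opteron_cores = 4 * 16  # four 16-core Opterons on one board
opteron_raw = cost_per_core(3000, opteron_cores)
opteron_fp = cost_per_core(3000, intel_equivalent_cores(opteron_cores))
dell_i7 = cost_per_core(585, 4)

if __name__ == "__main__":
    print(f"Opteron box, per core:                 ${opteron_raw:.2f}")
    print(f"Opteron box, per Intel-equivalent core: ${opteron_fp:.2f}")
    print(f"Dell i7 3770, per core:                 ${dell_i7:.2f}")
```

On raw core count the Opteron box looks far cheaper, but once the floating point ratio from the grid is applied the two options come out close to each other, so under these assumptions the choice would hinge on workload type and power draw more than on purchase price.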
post #3 of 7
Well, +1 for using Opterons; they can be had fairly cheaply (e.g. lots of 4P's folding for OCN!).

But, to try to help with your actual question, you might check out HPU4Science. It seems like they did what you're trying to do, though honestly I don't understand all of it. Here's the article in Ars that discusses their software implementation. Hopefully this helps.
post #4 of 7
Thread Starter 
For my applications, floating point performance is everything. So I will probably go with Intel, or whatever is available when I buy my real rig.

For now, I just want to learn about building and maintaining a cluster. I won't be building my real rig until, say, next summer. My plan right now is just to borrow a few old desktops/laptops and learn how to hook them together into a cluster. I'm a Linux noob, so there is a lot for me to learn, but this is a year-long experimental project for me to find out how to get these things to work. Hopefully this will also let me make a more informed purchase decision next year.
post #5 of 7
Thread Starter 
And... for now I won't have the cash to buy a license for the statistical package that I will be using.

So my goals currently are simply:
1 - Grab some old computers from people I know.
2 - Learn Linux basics, and how to put these computers together into a cluster that I can send jobs to remotely.
3 - Install and run a single piece of simple, free, open-source, multi-threaded software that can run across all computers in the cluster.

For No. 3, I don't know what that would be yet. Any suggestions? Like, using all available CPU cores on the cluster to compute Pi or something?
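As an illustration of the "compute Pi" idea, here is a minimal sketch that splits a numerical integration of 4/(1+x^2) over [0, 1] into chunks and sums the pieces in parallel. It is only an assumption about what such a test job could look like: it uses Python's standard multiprocessing module, so it spreads work over the cores of a single machine, not over a cluster. Spanning several machines would typically mean expressing the same split-and-sum pattern with a message-passing library such as MPI, which this sketch does not attempt. The function names are invented for the example.

```python
import math
from multiprocessing import Pool


def partial_pi(bounds):
    """Midpoint-rule contribution of slices start..stop-1 (out of n)
    to the integral of 4/(1+x^2) on [0, 1], which equals pi."""
    start, stop, n = bounds
    h = 1.0 / n
    return h * sum(4.0 / (1.0 + ((i + 0.5) * h) ** 2) for i in range(start, stop))


def split(n, workers):
    """Cut range(n) into `workers` contiguous chunks of near-equal size."""
    return [(n * k // workers, n * (k + 1) // workers, n) for k in range(workers)]


def parallel_pi(n=1_000_000, workers=4):
    """Hand each chunk to a separate worker process and add up the results."""
    with Pool(workers) as pool:
        return sum(pool.map(partial_pi, split(n, workers)))


if __name__ == "__main__":
    estimate = parallel_pi()
    print(f"pi ~ {estimate:.10f} (error {abs(estimate - math.pi):.2e})")
```

Each chunk is independent, so the only communication is the final sum. That makes this kind of job a convenient first test: if adding workers does not speed it up, the problem is in the setup rather than in the algorithm.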
post #6 of 7
Thread Starter 
Just found something interesting - Raspberry Pi Supercomputer!

http://www.raspberrypi.org/archives/1973

The Raspberry Pi is the new super-cheap ARM-based computer that has extremely limited computing power but is a great (and cheap) tool for learning programming. Since this test cluster won't be used to do any real work, I don't really care about x86 compatibility. All I really want to learn from this experience is how to link a number of computers together and how networking works in Linux, so this seems like a good and cheap start. What's more helpful is that this guy has put together a step-by-step tutorial:

http://www.southampton.ac.uk/~sjc/raspberrypi/
post #7 of 7
@OP

As always, Wikipedia is a useful first port of call. After that, have a look at Setting up a Beowulf Cluster Using Open MPI on Linux. Also, DebianBeowulf.


Older articles:

Setting Up Beowulf Cluster in FreeBSD.
Beowulf Cluster Setup.


You might also want to look at distcc. I used it personally a couple of years ago, and it was incredibly useful at the time, i.e. I needed the horsepower. It's really easy to set up too.
Edited by parityboy - 9/18/12 at 8:47am
Ryzen (12 items)
CPU: Ryzen 7 1700
Motherboard: Gigabyte GA-AB350M Gaming 3
Graphics: Palit GT-430
RAM: Corsair Vengeance LPX CMK16GX4M2B3000C15
Hard Drive: Samsung 850 EVO
Cooling: AMD Wraith Spire
OS: Linux Mint 18.x
Monitor: Dell UltraSharp U2414H
Keyboard: Dell SK-8185
Power: Thermaltake ToughPower 850W
Case: Lian-Li PC-A04B
Mouse: Logitech Trackman Wheel