All You Ever Wanted To Know About RAID
Most of the questions asked on this forum are related to RAID: "Is it worth it for me to get a RAID setup?", "Which RAID mode is best?", "What stripe size should I use?" It's time to get all those answers in one place.
What Is a RAID?
RAID stands for Redundant Array of Inexpensive Disks. It is a means by which your PC uses multiple disks as if they were one, either to increase performance, safeguard against disk failures, or both. There are four main factors of a RAID setup: striping, which spreads data across multiple drives; mirroring, which copies the data to more than one disk; space efficiency, which is how much of the total space is available to use; and fault tolerance, which is a measure of how well protected the RAID array is against disk failure.
What do the different RAID levels mean?
RAID 0 - Striping
RAID 0 offers the best performance per disk out of all RAID levels. Data is striped across at least 2 drives, theoretically increasing both performance and capacity linearly (each hard drive adds xMB/s to throughput and its full capacity to the array). The downside to RAID 0 is that if one hard drive fails, all data is lost. The chance of that grows with every drive you add, but the risk only becomes significant with 4 or more drives. This is also the cheapest RAID level to implement - most onboard controllers support it, and you get to use all the hard drive space that you paid for. For that reason, it's more cost-effective to buy two 250GB drives and put them in RAID 0 than it is to buy a single 500GB drive.
Read Performance - 5/5
Write Performance - 5/5
Degraded Array Performance - N/A
Space Efficiency - 5/5
Fault Tolerance - 0/5
Price - 5/5
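To make the striping idea a bit more concrete, here is a minimal Python sketch of how a logical byte offset could map onto the drives of a RAID 0 array. The function name and the 64kB stripe size are just illustrative assumptions; real controllers do this in firmware and may lay things out differently.

```python
def locate(offset, stripe_size, num_disks):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Illustrative model only: stripe units are handed out round-robin.
    """
    stripe_index = offset // stripe_size   # which stripe unit, counted across the whole array
    disk = stripe_index % num_disks        # stripe units rotate across the disks
    row = stripe_index // num_disks        # full rows of stripes that come before this one
    return disk, row * stripe_size + offset % stripe_size


if __name__ == "__main__":
    # Two drives, 64kB stripes: consecutive stripes alternate between the disks.
    for off in (0, 64 * 1024, 128 * 1024, 200 * 1024):
        print(off, locate(off, 64 * 1024, 2))
```

Because consecutive stripes land on different drives, a large sequential read keeps every drive busy at once, which is where the near-linear throughput comes from.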
RAID 1 - Mirroring
RAID 1 makes two identical copies of the same data on two different disks. With a good RAID controller it also increases read performance, since the two copies can be read simultaneously. However, write performance suffers slightly because two drives have to be written to. Even if one disk fails, the other continues to work as if it were a single drive all along. Just remember - since two copies of the data are always kept, you only get the space of a single drive, not two. If you have a low budget and don't need that much space for your important data, this is the best choice. Otherwise look at other RAID levels.
Read Performance - 3/5
Write Performance - 1/5
Degraded Array Performance - 5/5 (Little to no loss)
Space Efficiency - 0/5
Fault Tolerance - 3/5
RAID 0+1 - Mirror of Stripes
A combination of both RAID 0 and RAID 1. RAID 0+1 creates a second striped set that is a mirror of the first - in other words, take a RAID 0 array, and make a copy of it. In this setup one drive can fail, and even a second drive can fail if it was in the same striped set as the first. But if the second drive is in the other set, then the array goes down. Each pair of disks added supplies another mirror, providing tolerance for up to two more drives to fail.
Read Performance - 3/5
Write Performance - 2/5
Degraded Array Performance - 5/5
Space Efficiency - 1/5
Fault Tolerance - 4/5
RAID 1+0 (also called RAID 10) - Stripe of Mirrors
RAID 1+0 is the opposite of RAID 0+1 - it is like a RAID 0 array, but with a mirror of each disk in the array. Multiple drives can fail as long as they aren't part of the same mirror. The main difference between this and RAID 0+1 is that adding a pair of drives increases performance more, but only adds fault tolerance of up to 1 more drive. This is by far the most common nested RAID level available. This level (and RAID 0+1) is a good choice if you want more space and speed than RAID 1 can provide, without having to buy a dedicated controller. However, with more than 4 drives, an onboard controller will have trouble.
Read Performance - 4/5
Write Performance - 4/5
Degraded Array Performance - 4/5
Space Efficiency - 1/5
Fault Tolerance - 3/5
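The survival rules for the two nested levels are easy to mix up, so here is a small Python sketch that simply enumerates failure combinations. It assumes a 4-drive array where drives 0 and 1 form one group and drives 2 and 3 the other (a striped set for RAID 0+1, a mirror pair for RAID 1+0). The function names are made up for this example.

```python
from itertools import combinations


def survives_raid01(failed, stripe_sets):
    # RAID 0+1 stays up as long as at least one striped set has no failed drive.
    return any(not (set(s) & failed) for s in stripe_sets)


def survives_raid10(failed, mirror_pairs):
    # RAID 1+0 stays up as long as every mirror still has a working drive.
    return all(set(p) - failed for p in mirror_pairs)


def count_survivable(check, groups, num_disks, num_failures):
    """Return (survivable combinations, total combinations)."""
    combos = list(combinations(range(num_disks), num_failures))
    ok = sum(1 for c in combos if check(set(c), groups))
    return ok, len(combos)


if __name__ == "__main__":
    groups = [(0, 1), (2, 3)]  # assumed layout for both levels
    print("RAID 0+1:", count_survivable(survives_raid01, groups, 4, 2))
    print("RAID 1+0:", count_survivable(survives_raid10, groups, 4, 2))
```

This only counts which combinations of dead drives the array can ride out. It says nothing about how likely each combination is or how long a rebuild takes.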
RAID 3 - Striping with Dedicated Parity Disk
RAID 3 is offered by very few RAID controllers these days. It calculates the parity of the stripes across all the drives, and stores that parity on a dedicated disk. Read performance rises to that of a RAID 0 array with one fewer drive than is installed - so if you use 3 drives in RAID 3, you get the read speed of 2 drives in RAID 0. However, write speed is bottlenecked to that of a single drive, since every write also has to update the same parity disk. The main benefit of this type of array (as opposed to RAID 5) is that if a drive fails, the performance penalty is significantly less.
Read Performance - 4/5
Write Performance - 1/5
Degraded Array Performance - 4/5
Space Efficiency - 4/5
Fault Tolerance - 2/5
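For anyone wondering what "parity" actually is: the sketch below assumes the usual XOR parity (the guide doesn't spell out the math), and shows how the contents of one lost drive can be rebuilt from the remaining drives plus the parity. The tiny 4-byte "stripes" are made-up sample data.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Made-up stripe contents for three data drives.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity stripe is the XOR of all the data stripes.
parity = xor_blocks(data)

# Pretend drive 1 died: XOR the survivors with the parity to get its data back.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

The same trick is why a degraded array is slower: every read that would have hit the dead drive turns into a read of all the other drives plus an XOR.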
RAID 5 - Striping with Parity
RAID 5 is similar to RAID 3 except that instead of using a dedicated parity disk, the parity data is also striped across all the disks. This eliminates the bottleneck on write performance, allowing it to scale with the number of drives installed. However, if a drive fails, the performance of the array drops significantly depending on the number of disks and the speed of the controller. Also, the write speed you get heavily depends on the controller. Onboard and software-based controllers generally have terrible write speeds because the CPU is used to calculate the parity, while dedicated controllers can provide significant increases with each disk installed. One drive can fail and be replaced with no loss, but two failed drives will destroy the data. RAID 5 is best suited for situations where fault tolerance is needed, and the capacity of a single disk is not enough.
Read Performance - 4/5
Write Performance - 1/5 with onboard, 4/5 with dedicated controller
Degraded Array Performance - 3/5
Space Efficiency - 4/5
Fault Tolerance - 2/5
RAID 6 - Striping with Dual Parity
RAID 6 is the same as RAID 5, except that the equivalent of two disks is used for parity storage. This allows two drives to fail without any trouble. Read speed becomes roughly equivalent to RAID 0 minus two disks, and write speed to RAID 5 minus 1 disk. Also, degraded array performance, even on the best of controllers, is terrible. RAID 6 is best suited for arrays of 10 or more drives. With fewer than that, RAID 5 is a better choice in terms of cost and performance.
Read Performance - 3/5
Write Performance - 3/5
Degraded Array Performance - 2/5 with one failed disk, 1/5 with two failed disks
Space Efficiency - 3/5
Fault Tolerance - 5/5
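The space efficiency scores above boil down to simple arithmetic. Here is a rough Python sketch of it, assuming every drive in the array is the same size and that RAID 1 means one set of mirrored drives. The function names are just for illustration.

```python
def usable_capacity(level, num_disks, disk_gb):
    """Usable space in GB, assuming all disks are the same size."""
    if level == "0":
        data_disks = num_disks          # nothing is spent on redundancy
    elif level == "1":
        data_disks = 1                  # every disk holds the same copy
    elif level in ("0+1", "10"):
        data_disks = num_disks // 2     # half the disks are mirrors
    elif level in ("3", "5"):
        data_disks = num_disks - 1      # one disk's worth of parity
    elif level == "6":
        data_disks = num_disks - 2      # two disks' worth of parity
    else:
        raise ValueError("unknown RAID level: " + level)
    return data_disks * disk_gb


def space_efficiency(level, num_disks):
    """Fraction of the raw space you actually get to use."""
    return usable_capacity(level, num_disks, 1) / num_disks


if __name__ == "__main__":
    for level, n in (("0", 2), ("1", 2), ("10", 4), ("5", 4), ("6", 10)):
        print("RAID", level, "with", n, "x 500GB:", usable_capacity(level, n, 500), "GB")
```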
How To Pick a Stripe Size
This all depends on the average amount of data you are going to be working with. The whole idea is that if a file is smaller than the stripe size, then it won't get striped across the array, giving it the same performance as a single disk. But, at the same time, while small stripe sizes increase sequential throughput on small files, they also increase random access time. So the optimal size for you depends on your usage habits. If you work mainly with large files (like movies, renderings, or ISOs) then you should definitely go with a large stripe size, somewhere around 1MB. If you want faster application loading and boot times, look at the other end of the scale.
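Here is a quick Python sketch of the "file smaller than the stripe" point. It assumes the file starts exactly on a stripe boundary and isn't fragmented, which real filesystems won't guarantee, so treat the numbers as a best case.

```python
import math


def disks_touched(file_size, stripe_size, num_disks):
    """How many drives hold a piece of the file (best case, aligned start)."""
    stripes = max(1, math.ceil(file_size / stripe_size))
    return min(stripes, num_disks)


if __name__ == "__main__":
    kB, MB = 1024, 1024 * 1024
    # A 16kB file on 64kB stripes sits on a single drive: no striping benefit.
    print(disks_touched(16 * kB, 64 * kB, 2))
    # A 700MB ISO on 1MB stripes is spread over every drive in a 4-drive array.
    print(disks_touched(700 * MB, 1 * MB, 4))
```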
SSD Stripe Optimization
Traditional hard drives are divided into small 512B sectors - so all of today's common stripe sizes are actually multiples of 512B, and can be reached through the equation: 2^x * 512B (1kB stripes give x=1, 2kB stripes give x=2, 4kB stripes give x=3, and so on). And with every read or write operation, each sector can be accessed and modified individually. When you overwrite a file, it simply goes to the exact sectors that need to be overwritten and changes the data. And when you delete a file, it simply marks it as deleted in the filesystem but leaves the data still physically on the disk.
*Random Fact: The maximum stripe size your controller supports will depend on the size of the register that contains the value of 2^x. Instead of using standard binary counting, it simply adds 1s to the value. So it goes: 0001, 0011, 0111, 1111. And this represents the number of sectors in a stripe.
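The 2^x * 512B equation for stripe sizes can be checked with a few lines of Python. The helper names here are made up for the example.

```python
SECTOR = 512  # bytes per traditional hard drive sector


def stripe_size(x):
    """Stripe size in bytes for a given exponent x: 2^x * 512B."""
    return (2 ** x) * SECTOR


def exponent_for(stripe_bytes):
    """Inverse of stripe_size(); rejects sizes that don't fit the equation."""
    sectors, remainder = divmod(stripe_bytes, SECTOR)
    if remainder or sectors < 1 or sectors & (sectors - 1):
        raise ValueError("not a power-of-two multiple of 512B")
    return sectors.bit_length() - 1


if __name__ == "__main__":
    for x in range(1, 12):
        print("x =", x, "->", stripe_size(x) // 1024, "kB")
```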
SSDs are organized differently. They are divided into blocks (~512kB), each of which contains multiple pages (~4kB). These sizes are only approximate, as they can vary between manufacturers. So when you look at the equation for stripe sizes, you'll see that certain values (for example, 1kB or 2kB) no longer line up well with the 4kB or 512kB sizes that may prove most beneficial. If you use a 1kB stripe size, the controller will store 4 stripes within each page, meaning that it would still have to read 4kB per drive to get the same data. So a stripe size under 4kB provides no benefit.
Now, the reason that this is important is because of the way that SSDs write data. Even though an SSD can write to a single page, it can only erase entire blocks at once. When you first get the SSD this won't matter - wear leveling algorithms will make sure you keep on writing to unused blocks instead of having to erase old data all the time. However, this only lasts for so long. After enough data has been written the drive will think that all blocks have been used, even though the file system might mark them as empty, making every write command a full erase-write cycle. So to make full use of these cycles, stripes should be 512kB or bigger.
Now, what you need to decide is how you want to optimize read and write performance. If you want read performance go with 4kB stripes, or however large your SSD's pages are. Pages will always be able to be read one by one, and random access throughput is always equal (or almost equal) to sequential access throughput, so this gives you optimal read performance. If you want write performance, you are better off going with 512kB stripes or larger. Using less than 512kB stripes may cause the drive to use two separate erase-write cycles per block instead of one (but this occurs mostly when your RAID controller's speed doesn't scale exactly linearly with your SSDs' speed). Stripes larger than that will lessen the time your drive can run in its optimal state (where you are left with free blocks that can be written without erasing first), but once it degrades to full erase-write cycles the speed you get won't be as bad. Stripes less than 512kB will provide no added benefit to write speed.
So in summary:
Read Speed Optimal: stripe size = page size, assumed here to be 4kB
Write Speed Optimal: stripe size >= block size, assumed here to be 512kB
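To put rough numbers on that summary, here is a simplified Python model. It uses the same assumed 4kB page and 512kB block sizes, assumes stripes are aligned to page and block boundaries, and assumes the drive has already run out of pre-erased blocks. Real SSD firmware is a lot smarter than this, so take it as an illustration of the reasoning only.

```python
import math

PAGE = 4 * 1024      # assumed page size
BLOCK = 512 * 1024   # assumed erase block size


def bytes_read_for_stripe(stripe, page=PAGE):
    """Reads happen in whole pages, so small stripes still cost a full page."""
    return math.ceil(stripe / page) * page


def bytes_erased_for_write(stripe, block=BLOCK):
    """Once free blocks run out, writes erase and rewrite whole blocks."""
    return math.ceil(stripe / block) * block


def write_amplification(stripe, block=BLOCK):
    """Bytes cycled through erase-write per byte of stripe actually written."""
    return bytes_erased_for_write(stripe, block) / stripe


if __name__ == "__main__":
    for size_kb in (1, 4, 64, 512, 1024):
        s = size_kb * 1024
        print(size_kb, "kB stripe: reads", bytes_read_for_stripe(s),
              "B, write amplification", write_amplification(s))
```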
When To Get a Dedicated Controller
Onboard RAID will be enough for most people, as it can handle a 2 or 3 drive RAID 0 array quite well (and RAID 1 doesn't tax it at all). So what do you need a dedicated controller for? Well, you should get one if:
- Performance is important to you
- You want to ensure that the array can be moved to a new motherboard
- You need more space than you can get out of 2 drives in RAID 0 or 4 drives in RAID 10
- You want to eliminate the load on your CPU
- You already have everything else for your PC
- You want to use SAS, Fibre Channel, or SSDs
How To Pick a Controller
There are two types of RAID controllers available - software based and hardware based. Software based ones are no different than your onboard controller - your CPU has to do all the work. These are a complete waste of $15. Hardware based ones have their own processor that does all the work, and does it a lot better than your general-purpose CPU can. So how do you tell the difference between the two offhand? Easy: The hardware based ones have heatsinks, the software based ones don't. Plain and simple.
Now, once you've singled out the hardware controllers, look at which interface each one connects over - PCI, PCI-X, or PCIe (and how many lanes). You need to estimate how much bandwidth you will need for your drives (PCI will not be fast enough for 2 VelociRaptors), and make sure the card's interface is at least that fast. Keep in mind that you can install a PCI-X card in a regular PCI slot and vice versa - you'll just get limited to regular PCI speeds.
And the last thing to look at is the price. Thankfully, Dell has been selling way too many servers. Too many servers means too many RAID cards, and too many RAID cards means cheap prices. The current best-bang-for-your-buck card is the Dell Perc 5/i for around $100 on eBay. It is based on the Intel IOP333 processor, which is used by a lot of $300-$500 cards. And it even has an expandable cache and supports SAS as well! It's really a steal.
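The bandwidth estimate can be done on the back of an envelope, or with a few lines of Python like the sketch below. The bus figures are theoretical peaks (32-bit/33MHz PCI, 64-bit/133MHz PCI-X, and first-generation PCIe at 250MB/s per lane), and the 100MB/s per drive is only an assumed round number, so plug in your own drives' benchmark results.

```python
# Theoretical peak bandwidth in MB/s; real-world throughput will be lower.
BUS_MBPS = {
    "PCI": 133,        # 32-bit, 33MHz
    "PCI-X": 1066,     # 64-bit, 133MHz
    "PCIe x1": 250,    # first-generation PCIe, per lane
    "PCIe x4": 1000,
    "PCIe x8": 2000,
}


def fast_enough(bus, drive_mbps):
    """True if the bus can carry every drive running flat out at once."""
    return BUS_MBPS[bus] >= sum(drive_mbps)


def suitable_buses(drive_mbps):
    return [bus for bus in BUS_MBPS if fast_enough(bus, drive_mbps)]


if __name__ == "__main__":
    two_fast_drives = [100, 100]  # assumed sustained MB/s per drive
    print(suitable_buses(two_fast_drives))
```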
DuckieHo has written up a guide to flashing and using that card, and you can find it here.
Installing Drivers
Windows XP:
If you are doing a fresh install, then you will need to tell setup how to access your RAID array. You must put the drivers for your card onto a floppy disk. When setup starts, you should see a prompt at the bottom saying "Press F6 to install a third party driver...". So press F6. You won't get any sort of confirmation, but once it finishes loading the basic drivers it will prompt you to insert the floppy. Do so, and a list containing your controller will pop up. Select it, and you're good to go.
Windows Vista:
Most of the time with Vista you won't need to install any drivers during setup - Microsoft has included most of the common RAID drivers from Intel, AMD, and nVidia, and some other higher-end card manufacturers as well. But if setup doesn't find the hard disk, then you will need to manually install them. Thankfully with Vista, you no longer need a floppy disk like XP does. Now you can use any storage media you want, including USB drives. So to select your driver, in the 'Drive Options' portion of setup, there is a button that says 'Install Driver'. Click that, navigate to the folder with the driver and select it. And it should all work after that.
Creating Custom Driver Disks
If you've got many different types of RAID cards in your PCs at home, you may want to create a single disk with all the drivers on it, so you won't have to go digging through tons of floppies to get the right one. Unfortunately, it's not as simple as copying all the files from all the disks and putting them together, because XP will only read from the root folder of the drive, and the disks all contain one identically named file: TXTSETUP.OEM. This file tells Setup exactly which controllers are supported, and which files need to be loaded to use them. So what you'll need to do is open up these files in Notepad, take the data you need out of them, and merge it all into a single file.
For example, these are the important lines from an ICH9R driver floppy:
*Note: All lines starting with a ';' are comments, and are not parsed by Windows.
Code:
[Disks]
disk1 = "Intel Matrix Storage Manager driver", iaStor.sys, \
[Defaults]
scsi = iaStor_ICH8MEICH9ME
;#############################################################################
[scsi]
; iaAHCI.inf
iaAHCI_ICH9RDODH = "Intel(R) ICH9R/DO/DH SATA AHCI Controller"
; iaStor.inf
iaStor_ICH8RICH9RICH10RDO = "Intel(R) ICH8R/ICH9R/ICH10R/DO SATA RAID Controller"
;#############################################################################
; iaAHCI.inf
[Files.scsi.iaAHCI_ICH9RDODH]
driver = disk1, iaStor.sys, iaStor
inf = disk1, iaAHCI.inf
catalog = disk1, iaAHCI.cat
; iaStor.inf
[Files.scsi.iaStor_ICH8RICH9RICH10RDO]
driver = disk1, iaStor.sys, iaStor
inf = disk1, iaStor.inf
catalog = disk1, iaStor.cat
;#############################################################################
[Config.iaStor]
value = "", tag, REG_DWORD, 1b
value = "", ErrorControl, REG_DWORD, 1
value = "", Group, REG_SZ, "SCSI Miniport"
value = "", Start, REG_DWORD, 0
value = "", Type, REG_DWORD, 1
;#############################################################################
; iaAHCI.inf
[HardwareIds.scsi.iaAHCI_ICH9RDODH]
id = "PCI\VEN_8086&DEV_2922&CC_0106","iaStor"
; iaStor.inf
[HardwareIds.scsi.iaStor_ICH8RICH9RICH10RDO]
id = "PCI\VEN_8086&DEV_2822&CC_0104","iaStor"
Code:
[Disks]
disk1 = "Intel Matrix Storage Manager driver", iaStor.sys, \
[Defaults]
scsi = iaStor_ICH8MEICH9ME
Under [Defaults], you list the label of the one entry that you want to be selected by default when the screen comes up. On the ICH9R floppy this was a different driver than the ones I copied over to this guide, so to make the RAID driver the default the line should read:
scsi = iaStor_ICH8RICH9RICH10RDO
Code:
[scsi]
; iaAHCI.inf
iaAHCI_ICH9RDODH = "Intel(R) ICH9R/DO/DH SATA AHCI Controller"
; iaStor.inf
iaStor_ICH8RICH9RICH10RDO = "Intel(R) ICH8R/ICH9R/ICH10R/DO SATA RAID Controller"
The 'iaAHCI' listing is for the AHCI (not RAID) driver, which enables hot-swapping of SATA drives. Use this if you selected AHCI in the BIOS and are using the onboard controller.
The 'iaStor' listing is for the RAID driver, which (if you are reading this) is probably the one you want to use.
Code:
; iaAHCI.inf
[Files.scsi.iaAHCI_ICH9RDODH]
driver = disk1, iaStor.sys, iaStor
inf = disk1, iaAHCI.inf
catalog = disk1, iaAHCI.cat
; iaStor.inf
[Files.scsi.iaStor_ICH8RICH9RICH10RDO]
driver = disk1, iaStor.sys, iaStor
inf = disk1, iaStor.inf
catalog = disk1, iaStor.cat
Code:
[Config.iaStor]
value = "", tag, REG_DWORD, 1b
value = "", ErrorControl, REG_DWORD, 1
value = "", Group, REG_SZ, "SCSI Miniport"
value = "", Start, REG_DWORD, 0
value = "", Type, REG_DWORD, 1
These lines are vital to the RAID card working properly, and get entered into the registry. Unless you really, really know what you are doing, don't change them. Just copy them as they are.
Code:
; iaAHCI.inf
[HardwareIds.scsi.iaAHCI_ICH9RDODH]
id = "PCI\VEN_8086&DEV_2922&CC_0106","iaStor"
; iaStor.inf
[HardwareIds.scsi.iaStor_ICH8RICH9RICH10RDO]
id = "PCI\VEN_8086&DEV_2822&CC_0104","iaStor"
So these are the parts you would have to copy out of Intel's driver disk for the ICH9R RAID driver. You could also leave out the AHCI driver listings (and delete the files) if you know you'll never use them.
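If you have a lot of floppies to combine, the copy-and-paste job described above could be scripted. The following is only a rough Python sketch of the idea, not a tested tool: the function names are made up, it merges sections by name and drops comments, it keeps only the first definition of each label under [Disks] and [Defaults] (on the assumption that all driver files end up in the root of the same disk), and it doesn't check that the driver files themselves have unique names. Look over the result by hand before trusting it with an install.

```python
def parse_oem(text):
    """Split TXTSETUP.OEM text into {section name: [lines]}, skipping comments."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank lines and ';' comments are not needed
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


def merge_oem(texts, default=None):
    """Merge several TXTSETUP.OEM files; 'default' is the label to preselect."""
    merged = {}
    for text in texts:
        for name, lines in parse_oem(text).items():
            target = merged.setdefault(name, [])
            for line in lines:
                key = line.split("=", 1)[0].strip()
                if name in ("Disks", "Defaults") and any(
                    t.split("=", 1)[0].strip() == key for t in target
                ):
                    continue  # keep the first definition only (assumption, see above)
                if line not in target:
                    target.append(line)
    if default is not None:
        merged["Defaults"] = ["scsi = " + default]
    return merged


def render(merged):
    """Turn the merged sections back into file text."""
    out = []
    for name, lines in merged.items():
        out.append("[" + name + "]")
        out.extend(lines)
        out.append("")
    return "\n".join(out)
```

You would read each floppy's TXTSETUP.OEM into a string, pass the list to merge_oem() along with the label you want as the default (for example iaStor_ICH8RICH9RICH10RDO), and write the output of render() to the new disk next to all the driver files.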
If there's anything you guys feel should be added, just say so!

Added Questions:
Quote:
What you do is you create the 2x320GB array from your controller's setup (while booting), then in Windows you go to Administrative Tools->Computer Management->Disk Management. There, since Windows sees the 2x320GB array as a single drive, you can tell it to create a mirror with the 640GB drive.
Quote:
Although, if you're using anything other than RAID 0, it is theoretically better to use different model drives. Each model of a drive has a different average failure rate, so using different models decreases the chance that they fail at the same time. Just that with RAID 0 it doesn't matter if they do or not, you lose your data anyway.
Edited by Manyak - 4/13/09 at 7:51pm






