Tuning an HP Smart Array P400 with Linux - why tuning really matters...

 
Posted 05-30-2009, 12:06 AM by BLinux
Today I had to do some tuning on an HP Smart Array P400 controller with eight 300GB 10K RPM SAS drives. It had already been determined that this controller is *really* *really* bad at RAID5. This was a system that needed decent performance, so we decided to use RAID10 instead. We set the cache ratio to 50/50 and used 256k stripes. The controller already had its write cache enabled, backed by the battery.
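For the curious, that controller-side configuration can be done from within Linux with HP's hpacucli utility. A rough sketch, assuming the controller sits in slot 0 and all eight drives are unassigned (your slot and drive numbering will differ):

# create the RAID10 logical drive with a 256k stripe size
hpacucli ctrl slot=0 create type=ld drives=allunassigned raid=1+0 ss=256

# split the controller cache 50/50 between read and write
hpacucli ctrl slot=0 modify cacheratio=50/50

# verify cache and battery status
hpacucli ctrl slot=0 show detail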

This system was running Red Hat Enterprise Linux 5.3, with support for the XFS file system added from the CentOS "extras" repository. The XFS file system was set up to match the stripe geometry of the RAID10, and then we mounted it with the following options:

rw,noatime,nobarrier,logbufs=8,logbsize=256k
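For those who want the specifics, matching XFS to the array geometry looks something like this; the device name and mount point here are assumptions, and sw=4 reflects the 4 data spindles in an 8-disk RAID10 with a 256k stripe:

# stripe unit = 256k, stripe width = 4 data disks (8 disks in RAID10)
mkfs.xfs -d su=256k,sw=4 /dev/cciss/c0d1
mount -o rw,noatime,nobarrier,logbufs=8,logbsize=256k /dev/cciss/c0d1 /data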

Sometimes you have all the time in the world to test every scenario, but this wasn't one of those times and I had to turn this around fairly quickly. So, I decided to run just a select few test cases with the iozone benchmark (my preferred benchmark tool). Here is the specific command I ran:

iozone -b results.xls -r 4m -s 8g -t 6 -i 0 -i 1 -i 2

For those not familiar with iozone: "-r" sets the record size (4MB here), "-s" the file size per process (8GB), "-t" the number of concurrent processes (6), and "-b" writes the results to an Excel-compatible spreadsheet. The "-i" options select the tests, with the numbers meaning:

0 : sequential write/re-write test
1 : sequential read/re-read test
2 : random read/write test

So, the first order of business was to get a "baseline" run to see where we stood:

initial write: 299MB/s
re-write: 335MB/s
read: 123MB/s
re-read: 125MB/s
random read: 108MB/s
random write: 306MB/s

What immediately concerned me here were the read speeds (both sequential and random); those are pretty bad numbers for 8 drives in RAID10! And this system, in particular, needed both sequential and random reads to be fast.

Whenever working with Linux on RAID controllers, the first thing I like to try is changing the I/O scheduler to 'noop'. Linux has a modular I/O scheduler architecture, and in RHEL5 you can choose from four options: noop, anticipatory, deadline, and cfq. The default is 'cfq', which is actually pretty good in many cases. But if you're on a hardware RAID controller, it's sometimes better to let the hardware take care of the I/O intelligence, and that's where 'noop' comes into play. You can change the I/O scheduler via the /sys filesystem:

echo "noop" > /sys/block/cciss!c0d1/queue/scheduler

Re-running the iozone tests above with 'noop' yielded good improvements in both sequential and random read speeds. There was also a small gain in initial write speed, but the other write tests actually lost a little. Still, the loss in some write tests seemed like a trade-off worth making, considering the gains:

initial write: 338MB/s
re-write: 299MB/s
read: 201MB/s (from 123MB/s !!!)
re-read: 200MB/s (from 125MB/s !!!)
random read: 206MB/s (from 108MB/s !!!)
random write: 255MB/s

Looks like we're headed in the right direction and gaining back some read speeds, even at the slight cost of some write speed tests.

To focus on tuning for read speeds, the next thing to adjust is Linux's read-ahead. One caveat: blockdev sets and reports this value in 512-byte sectors, so the default of 256 sectors is only 128KB, which in my experience is really too little for every RAID controller I've worked with. Normally, you'd incrementally increase this value and re-run the benchmark to see the gains (or losses). But I've been doing quite a bit of Linux tuning lately, and my experience has been that a few megabytes up to a few tens of megabytes is where it's worth playing. Also, I wasn't given a lot of time to turn this around, so my first test was to raise the read-ahead to 8192 sectors (4MB):

/sbin/blockdev --setra 8192 /dev/cciss/c0d1
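You can read the current value back with --getra (also reported in 512-byte sectors):

/sbin/blockdev --getra /dev/cciss/c0d1
# prints: 8192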

So, with 'noop' and 4MB of read-ahead, here are the results:

initial write: 335MB/s
re-write: 305MB/s
read: 324MB/s (from 123MB/s !!! +163%)
re-read: 325MB/s (from 125MB/s !!! +160%)
random read: 417MB/s (from 108MB/s !!! +286%)
random write: 256MB/s

Pretty impressive gains on the read speeds. I also tried doubling the read-ahead again, but it didn't really yield significant gains, so I decided that 8192 sectors (4MB) was enough.
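Neither the scheduler nor the read-ahead setting persists across reboots, so to make them stick on RHEL5 you can add both lines to the end of /etc/rc.local (device name as above):

echo "noop" > /sys/block/cciss!c0d1/queue/scheduler
/sbin/blockdev --setra 8192 /dev/cciss/c0d1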

Just two simple tuning parameters can gain quite a bit! Here's a chart of the results:

[Chart: iozone throughput before and after tuning]
