The new kernel has a number of nifty new features and additions. One of these is the inclusion of a modern Software RAID implementation -- yay! Software RAID allows you to dramatically increase Linux disk IO performance and reliability without buying expensive hardware RAID controllers or enclosures. Because it's implemented in software, Linux RAID software is flexible, fast... and fun!
The concept behind Software RAID is simple -- it allows you to combine two or more block devices (usually disk partitions) into a single RAID device. So let's say you have three empty partitions, hda3, hdb3, and hdc3. Using Software RAID, you can combine these partitions and address them as a single RAID device, /dev/md0. md0 can then be formatted to contain a filesystem and used like any other partition. There are also a number of different ways to configure a RAID volume -- some maximize performance, others maximize availability, while others provide a mixture of both.
In this article, I'll cover two forms of RAID: linear mode and RAID-0. Neither is technically a form of RAID at all, since RAID stands for "redundant array of inexpensive disks", and neither RAID-0 nor linear mode provides any kind of data redundancy. However, both modes -- especially RAID-0 -- are very useful. After giving you a quick overview of these two forms of "AID", I'll step you through the process of getting Software RAID set up on your system.
Linear mode is one of the simplest methods of combining two or more block devices into a "RAID" volume -- the method of simple concatenation. If you have three partitions, hda3, hdb3, and hdc3, each about 2GB, combining them yields a 6GB linear volume. The first third of the linear volume will reside on hda3, the middle third on hdb3, and the last third on hdc3.
To configure a linear volume, you'll need at least two partitions that you'd like to join together. They can be different sizes, and they can even all reside on the same physical disk without negatively affecting performance.
Linear mode is the best way to combine two or more partitions on the same disk into a single volume. While doing this with any other RAID technique would cause a dramatic loss of performance, linear mode avoids this problem because it doesn't write to its constituent partitions in parallel (as all the other RAID modes do). For the same reason, though, linear mode's performance doesn't scale as you add disks, unlike RAID-0, RAID-4, RAID-5, and, to some extent, RAID-1.
In general, linear mode doesn't provide any kind of performance improvement over traditional non-RAID partitions. Worse, if you spread your linear volume over multiple disks, your volume is more likely to become unavailable due to a random hard drive failure. The probability of failure of a linear volume is approximately the sum of the probabilities of failure of its constituent physical disks and controllers, assuming those failures are independent and individually small -- for example, three disks that each have a 2% annual chance of failure give the volume roughly a 6% annual chance of failure. If one physical disk dies, the linear volume is generally unrecoverable. Linear mode does not offer any additional redundancy over using a single disk.
But linear mode is a great way to avoid repartitioning a single disk. For example, say your second IDE drive has two unused partitions, hdb1 and hdb3. And say you're unable to repartition the drive due to critical data hanging out at hdb2. You can still combine hdb1 and hdb3 into a single, cohesive whole using linear mode.
Linear mode is also a good way to combine partitions of different sizes on different disks when you just need a single big partition (and don't really need to increase performance). But for any other job there are better RAID technologies you can use.
RAID-0 is another one of those "RAID" modes that doesn't have any "R" (redundancy) at all. Nevertheless, RAID-0 is immensely useful. This is primarily because it offers the highest performance potential of any form of RAID.
To set up a RAID-0 volume you'll need two or more equally (or almost equally) sized partitions. The RAID-0 code will evenly distribute writes (and thus reads) between all constituent partitions. And by parallelizing reads and writes between all constituent devices, RAID-0 has the benefit of multiplying IO performance. Ignoring the complexities of controller and bus bandwidth, you can expect a RAID-0 volume composed of two partitions on two separate identical disks to offer nearly double the performance of a traditional partition. Crank your RAID-0 volume up to three disks, and performance will nearly triple. This is why a RAID-0 array of IDE disks can outperform the fastest SCSI or FC-AL drive on the market. For truly blistering performance, you can set up a bunch of SCSI or FC-AL drives in a RAID-0 array. That's the beauty of RAID-0.
To create a RAID-0 volume, you'll need two or more equally sized partitions located on separate disks. The capacity of the volume will be equal to the combined capacity of the constituent partitions. As with linear mode, you can combine block devices from various sources (such as IDE and SCSI drives) into a single volume with no problems.
If you're creating a RAID-0 volume using IDE disks, you should try to use UltraDMA compliant disks and controllers for maximum reliability. And you should use only one drive per IDE channel to avoid sluggish performance -- a slave device, especially if it's also part of the RAID-0 array, will slow things down so much as to nearly eliminate any RAID-0 performance benefit. You may also need to add an off-board IDE controller so that you have the extra IDE channels you require.
If you're creating a RAID-0 volume out of SCSI devices, be aware that the combined throughput of all the drives can potentially exceed the maximum throughput of the SCSI (and potentially PCI) bus. In such a case, the SCSI bus will be the performance-limiting factor. If, for example, you have four drives that have a maximum throughput of 15MB/sec set up on a 40MB/sec 68-pin Ultra Wide bus, there will be times when the drives saturate the bus, and performance will reach an upper maximum of close to 40MB/sec. This may be fine for your application (after all, 40MB/sec IO ain't bad!), but you'd probably get identical peak IO performance from a RAID-0 volume that used only three drives.
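As a quick sanity check before buying that fourth drive, the bus-saturation arithmetic from the example above can be sketched in a couple of lines of shell:

```shell
# Back-of-the-envelope bus check, using the figures from the example above
DRIVES=4              # drives in the RAID-0 array
DRIVE_MBSEC=15        # peak throughput per drive, MB/sec
BUS_MBSEC=40          # Ultra Wide SCSI bus limit, MB/sec

AGGREGATE=$((DRIVES * DRIVE_MBSEC))
if [ "$AGGREGATE" -gt "$BUS_MBSEC" ]; then
    # 60MB/sec of drive bandwidth behind a 40MB/sec bus: the bus wins
    echo "drives can saturate the bus: ${AGGREGATE}MB/sec > ${BUS_MBSEC}MB/sec"
fi
```

If the aggregate figure exceeds the bus limit, extra drives buy you capacity but no extra peak throughput.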
From a reliability standpoint, RAID-0 has the same characteristics as linear mode -- the more drives you add to the array, the higher the probability of volume failure. And, like linear mode, the death of a single drive will bring down the entire RAID-0 volume and make it unrecoverable. To estimate the probability of failure of your RAID-0 volume, simply add together the probabilities of failure of all constituent drives (a good approximation as long as the individual probabilities are small and independent).
RAID-0 is ideal for applications for which you need maximum IO performance, since it's the highest-performing RAID mode available. But remember that RAID-0 should only be used if you can tolerate a slightly higher risk of volume failure.
If you're putting together a compute farm or web cluster, RAID-0 is an excellent way to increase disk IO performance. Since in this case you already have an existing level of redundancy (lots of spare machines), your services remain available in the rare event that a machine with a failed hard drive needs to be brought down for a drive replacement and reload.
There are two steps involved in getting your 2.4 system ready for Software RAID. First, RAID support needs to be enabled in the kernel. This normally involves recompiling and installing a new kernel unless you're already using a 2.4 series kernel with RAID support compiled-in.
Then the raidtools package needs to be compiled and installed. The raidtools are the user-level tools that allow you to initialize, start, stop, and control your RAID volumes. Once these two steps are complete, you'll be able to create your own RAID volumes, create filesystems on the volumes, mount them, etc.
I'm using kernel 2.4.0-test10 for this series. I recommend that you use the most recent 2.4 kernel you can track down, which should be 2.4.0-test10 or later (but not 2.4.0-test11, which had serious filesystem corruption problems). You can find a recent kernel over at kernel.org, and a tutorial showing you how to recompile and install a new kernel from sources here at developerWorks (see the Resources section later in this article).
I recommend that you configure your kernel so that Software RAID support is compiled-in (rather than supported as modules). When you type "make menuconfig" or "make xconfig", you'll find the Software RAID settings under the "Multi-device support (RAID and LVM)" section. I also recommend that you enable everything RAID-related here, including "Boot support" and "Auto Detect support". This will allow the kernel to auto-start your RAID volume at boot-time, as well as allow you to create a root RAID filesystem if you so desire. Here's a snapshot of "make menuconfig". The last two options (LVM support) are not required, although I compiled them into the kernel anyway:
Configuring the kernel for RAID
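If you'd rather check your .config directly, the RAID-related settings in a 2.4.0-test kernel look something like the following (the exact option names may vary slightly between kernel revisions, and the LVM options mentioned above are optional):

```
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_LINEAR=y
CONFIG_MD_RAID0=y
CONFIG_MD_RAID1=y
CONFIG_MD_RAID5=y
CONFIG_MD_BOOT=y
CONFIG_AUTODETECT_RAID=y
```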
Once the kernel's properly configured, install it and reboot. Now let's track down the latest version of raidtools.
Before we can install raidtools we need to do a bit of searching to find the latest version. You can generally find the raidtools program at kernel.org. Now track down the most recent "raidtools-0.90" archive (not raid0145!). Currently it's "raidtools-19990824-0.90.tar.gz".
If you like living on the bleeding edge (and if you're using a 2.4.0-test kernel, then you do), you may want to head over to RedHat (see Resources) and snag the latest version of raidtools you can find. Currently it's "raidtools-dangerous-0.90-20000116.tar.gz".
# cd raidtools-0.90
# ./configure
# make
# make install
Once raidtools is installed, you can verify that RAID support made it into your kernel by taking a peek at /proc/mdstat:

# cat /proc/mdstat
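On a freshly booted system with no arrays configured yet, the output should look something like this (the exact personality list depends on which RAID modes you compiled in):

```
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 1024 sectors
unused devices: <none>
```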
OK, now it's time to prepare some disk partitions, of which you'll need at least two. If you're using RAID-0, make sure they're on separate disks and approximately the same size. It goes without saying that the data on these partitions will be destroyed.
One other important note -- when you create your partitions, give them the partition type "FD". This will allow the Linux kernel to recognize them as Linux RAID partitions, so they will be autodetected and started at every boot. If you don't mark your RAID partitions this way, you'll need to type "raidstart --all" after every boot before you can mount your RAID volumes. That can be annoying, so set the partition type correctly.
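For reference, a typical fdisk session for marking a partition goes something like this (assuming, hypothetically, that your first RAID partition is /dev/hde1 -- substitute your own device):

```
# fdisk /dev/hde
Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)
Command (m for help): w
```

Repeat the "t" step for each constituent partition before writing the table with "w".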
Next, describe your arrays in the /etc/raidtab configuration file. The raidtab syntax is fairly easy to figure out -- each block of directives begins with a "raiddev" entry specifying the RAID volume that will be created. When you installed raidtools, the Makefile created /dev/md0 through /dev/md15 for you, so they're available for use.
Next, "nr-raid-disks" should specify the number of disks in your array. Then you set the "persistent-superblock" to 1, telling the raid tools that when this volume is created, a special superblock should be written to each constituent device describing the configuration of the RAID array. The Linux kernel uses this information to auto-detect and start up RAID arrays at boot time, so you should make sure that every RAID volume you create is configured to do this.
"chunk-size" specifies the granularity of the chunks used for RAID-0 in kilobytes. In this example, our RAID-0 volume will write to its constituent partitions in 32K blocks; that is, the first 32K of the RAID volume maps to hde1, the second 32K maps to hdg1, etc. We also specify a chunk size for our /dev/md1 linear volume -- this is just a dummy entry and doesn't mean anything.
Finally, you specify the devices that make up the volume. First you specify the actual block device with a "device" line, and then you immediately follow it with a "raid-disk" entry that specifies its position in the array, starting with zero.
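Putting those directives together, an /etc/raidtab describing the two volumes discussed above -- a RAID-0 md0 striped across hde1 and hdg1, plus a linear md1 joining hdb1 and hdb3 (device names borrowed from the earlier examples; substitute your own) -- would look something like this:

```
raiddev /dev/md0
        raid-level              0
        nr-raid-disks           2
        persistent-superblock   1
        chunk-size              32
        device                  /dev/hde1
        raid-disk               0
        device                  /dev/hdg1
        raid-disk               1

raiddev /dev/md1
        raid-level              linear
        nr-raid-disks           2
        persistent-superblock   1
        chunk-size              32
        device                  /dev/hdb1
        raid-disk               0
        device                  /dev/hdb3
        raid-disk               1
```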
Once you've created your own /etc/raidtab file, you're ready to do a one-time initialization of the array.
OK. Our partitions are created, the raidtab file is in place -- now it's time to initialize our first partition by using the mkraid command:
# mkraid /dev/md0
After this command completes, /dev/md0 will be initialized and the md0 array will be started. If you type "cat /proc/mdstat", you should see something like this:
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md0 : active raid0 hdg1 hde1
      90069632 blocks 32k chunks
unused devices: <none>
Yay! Our RAID device is up and running. All we need to do now is create a filesystem on it. To do this, use the mke2fs command or the mkreiserfs command (RAID-0 and ReiserFS is a great combination!):
# mke2fs /dev/md0
# mkreiserfs /dev/md0
Now your new filesystem can be mounted:
# mkdir /mnt/raid
# mount /dev/md0 /mnt/raid
Feel free to add a /dev/md0 entry to your fstab. It goes something like this:
/dev/md0 /mnt/raid reiserfs defaults 0 0
If you set the partition type correctly to "FD", your RAID volume will be auto-started at boot time. Now all that's left to do is use and enjoy your new Software RAID volume. And (of course :) catch my second Software RAID article, in which we'll take a look at some more advanced Software RAID functionality and RAID-1.
- Read Part 2 in Daniel's series on RAID, where he explains what software RAID-1, 4, and 5 can and cannot do for you and how you should approach the implementation of these RAID levels in a production environment
- The Software-RAID HOWTO is another excellent resource for information related to Linux Software RAID
- You may want to check out the Boot+Root+RAID+Lilo Software RAID HOWTO if you'd like to learn how to create a root RAID filesystem
- For updated versions of raidtools-0.90, keep an eye on people.redhat.com
- Find a recent kernel in The Linux Kernel Archives
- Take an IBM developerWorks tutorial showing you how to recompile and install a new kernel from sources
- Find the raidtools program at kernel.org
- Snag the latest version of raidtools
- Check out more tips on Software RAID solutions for Linux
Daniel Robbins lives in Albuquerque, New Mexico. He is the President/CEO of Gentoo Technologies, Inc., the Chief Architect of the Gentoo Project and a contributing author to several books published by MacMillan: Caldera OpenLinux Unleashed, SuSE Linux Unleashed, and Samba Unleashed. Daniel has been involved with computers in some fashion since the second grade, when he was first exposed to the Logo programming language as well as a potentially dangerous dose of Pac-Man. This probably explains why he has since served as a Lead Graphic Artist at SONY Electronic Publishing/Psygnosis. Daniel enjoys spending time with his wife Mary, and his new baby daughter Hadassah. You can contact Daniel at email@example.com.