IBM Netezza Replication Services, Version 1.6

RAID implementation considerations

Determine your needs regarding storage scalability, performance, and mean time to data loss for your implementation. Your needs drive your decision regarding storage implementation, RAID volume count and size, and RAID format. You can set up your storage in a RAID 10, RAID 6, or RAID 60 configuration.

RAID 10 mirrors and stripes data across disks; you can combine multiples of four drives as a single volume. Storing a byte of data in RAID 10 requires two bytes of disk space, so a RAID 10 volume presents half the drive space as available. The loss of a single drive does not impair a RAID 10 volume, but if both drives of a mirrored pair fail, the array permanently loses data. RAID 10 is a somewhat less flexible alternative, because it requires a multiple of four drives. For example, in a 16- or 24-drive enclosure that allocates two hot-spare drives, a RAID 10 array allows at most 12 or 20 drives, respectively.
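The drive-count arithmetic in the preceding example can be sketched as follows. This is a minimal illustration, not part of the product: the function names are hypothetical, and the default of two hot spares is taken from the example in the text.

```python
def raid10_drive_count(enclosure_drives: int, hot_spares: int = 2, group: int = 4) -> int:
    """Largest RAID 10 array that fits in the enclosure.

    The drives left after reserving hot spares are rounded down to a
    multiple of `group` (four, per the text above).
    """
    available = enclosure_drives - hot_spares
    if available < group:
        return 0
    return (available // group) * group


def raid10_usable_drives(array_drives: int) -> int:
    """Mirroring stores every byte twice, so half the drives hold unique data."""
    return array_drives // 2


for enclosure in (16, 24):
    array = raid10_drive_count(enclosure)
    print(f"{enclosure}-drive enclosure: {array} drives in the array, "
          f"{raid10_usable_drives(array)} drives of usable space")
```

For the 16-drive and 24-drive enclosures, the sketch reports arrays of 12 and 20 drives, which matches the figures in the text.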

RAID 6 stores double parity bits that are striped across a minimum of five drives. Storing a byte with RAID 6 on a 10-drive array requires only 10 bits of space (8 data bits plus 2 parity bits), compared to 16 bits with RAID 10, resulting in greater capacity and higher performance. In addition, any two drives in a RAID 6 volume can fail without losing data.
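The space cost per byte can be worked out for other array sizes in the same way. The following sketch assumes that each stripe carries two parity blocks regardless of array width; the function name is hypothetical, and the five-drive minimum is the one stated in the text.

```python
from fractions import Fraction

MIN_RAID6_DRIVES = 5       # minimum stated in the text above
PARITY_DRIVES = 2          # RAID 6 keeps two parity blocks per stripe
RAID10_BITS_PER_BYTE = 16  # mirroring stores every byte twice


def raid6_bits_per_byte(drives: int) -> Fraction:
    """Bits of disk space needed to store one byte (8 bits) of data."""
    if drives < MIN_RAID6_DRIVES:
        raise ValueError(f"RAID 6 needs at least {MIN_RAID6_DRIVES} drives")
    # Each stripe spans `drives` blocks, of which `drives - 2` hold data.
    return Fraction(8 * drives, drives - PARITY_DRIVES)


for drives in (5, 10, 20):
    bits = raid6_bits_per_byte(drives)
    print(f"{drives} drives: {float(bits):.2f} bits per byte "
          f"(RAID 10: {RAID10_BITS_PER_BYTE})")
```

The overhead shrinks as the array widens, which is why a wider RAID 6 volume presents more usable space than a RAID 10 array built from the same drives.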

You might also consider using striped RAID 6, known as RAID 60 (RAID 6 + striping). In this implementation, two identical volumes are software-striped by using the Linux Logical Volume Manager (LVM); with parallel controllers, this can double the performance of the combined storage. There is a trade-off for space, however, because using multiple volumes increases the number of parity bits to process.
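The space trade-off can be illustrated by counting how many drives' worth of capacity remain for data when the same drives are arranged as one RAID 6 volume or as two striped volumes. This is a rough sketch under the assumption that every RAID 6 volume gives up two drives' worth of space to parity; the function name and the 20-drive example are illustrative only.

```python
PARITY_DRIVES_PER_VOLUME = 2  # each RAID 6 volume stores double parity


def data_drive_equivalents(total_drives: int, volumes: int) -> int:
    """Drives' worth of capacity left for data across identical RAID 6 volumes."""
    if volumes < 1 or total_drives % volumes != 0:
        raise ValueError("drives must divide evenly into identical volumes")
    per_volume = total_drives // volumes
    if per_volume <= PARITY_DRIVES_PER_VOLUME:
        raise ValueError("each volume needs more drives than it uses for parity")
    return volumes * (per_volume - PARITY_DRIVES_PER_VOLUME)


single_volume = data_drive_equivalents(20, volumes=1)  # one 20-drive RAID 6 volume
striped_pair = data_drive_equivalents(20, volumes=2)   # RAID 60: two 10-drive volumes
print(f"RAID 6: {single_volume} data drives, RAID 60: {striped_pair} data drives")
```

With 20 drives, the single volume keeps 18 drives' worth of data space and the striped pair keeps 16, because the second volume adds its own pair of parity drives.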

When a disk in a RAID volume fails, the data that it held must be rebuilt on a replacement drive. Allocate hot-spare drives so that the rebuild can start immediately, because the array is more vulnerable to data loss after drives fail. In general, RAID 10 rebuilds faster than RAID 6 or RAID 60: a single drive is read and written to recover the array instead of all the drives being read to recompute the missing data using parity. In practice, storage manufacturers might provide solutions that make this less of a trade-off. To identify the specific recovery performance when a drive fails, consult your storage supplier.

Consider the following statements and interpretations:

Table 1. RAID format considerations

Statement: Storage space is more important than either performance or mean time to data loss.
Interpretation: RAID 6 is preferable to RAID 60 or RAID 10. Minimizing the number of RAID volumes provides more space at the cost of lower performance (where parallel controllers are practical) and lower mean time to data loss (because the volumes contain more disks that can fail).

Statement: Performance is more important than storage space or mean time to data loss.
Interpretation: Volume striping (that is, RAID 60 using LVM) can increase throughput. Because in replication, most reads and writes are sequential (as opposed to random access), RAID 60 generally provides better performance than do RAID 6 or RAID 10. In addition, at the cost of storage space, RAID 60 generally provides longer mean time to data loss than does RAID 6.

Statement: Mean time to data loss is more important than storage space or performance.
Interpretation: Depending on your storage supplier, RAID 60 and RAID 10 are appropriate. Both are generally preferable to RAID 6.
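
The guidance in Table 1 can be captured as a small lookup, for example in a planning script. This is only a sketch: the priority labels and the function name are hypothetical, and the mapping simply restates the interpretations in the table.

```python
# Top priority -> RAID formats that Table 1 treats as preferable.
RECOMMENDED_FORMATS = {
    "storage space": ("RAID 6",),
    "performance": ("RAID 60",),
    "mean time to data loss": ("RAID 60", "RAID 10"),
}


def recommend(priority: str) -> tuple:
    """Return the RAID formats that Table 1 suggests for the given top priority."""
    key = priority.strip().lower()
    if key not in RECOMMENDED_FORMATS:
        raise ValueError(f"unknown priority: {priority!r}")
    return RECOMMENDED_FORMATS[key]


for priority in RECOMMENDED_FORMATS:
    print(f"{priority}: {', '.join(recommend(priority))}")
```

Your storage supplier's rebuild and controller characteristics can still shift the choice, so treat the lookup as a starting point for the discussion.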

The solution requires that you mount the storage as a block device on the replication server hosts; you cannot mount it by using NFS or a similar non-local solution.
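
One way to check this requirement on a Linux host is sketched below. The sketch is an illustration, not a product utility: the function names are hypothetical, the list of network file system types is an assumption and is not exhaustive, and the sample paths are placeholders.

```python
import os
import stat

# Assumption: a non-exhaustive set of network file system types.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smb3"}


def is_block_device(path: str) -> bool:
    """True if `path` exists and is a block device node."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False
    return stat.S_ISBLK(mode)


def mount_fs_type(mount_point: str, mounts_text: str):
    """File system type of `mount_point`, given text in /proc/mounts format.

    Each line reads: device mount_point fs_type options dump pass.
    Returns None if the mount point is not listed.
    """
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            fs_type = fields[2]  # a later entry overrides an earlier one
    return fs_type


# Hypothetical mount table; on a real host, read the text from /proc/mounts.
sample = (
    "/dev/sdb1 /replication ext4 rw 0 0\n"
    "filer:/export /mnt/shared nfs4 rw 0 0\n"
)
for mount_point in ("/replication", "/mnt/shared"):
    fs_type = mount_fs_type(mount_point, sample)
    print(mount_point, fs_type, "network" if fs_type in NETWORK_FS_TYPES else "local")
```

On a real host, pass the device column of the matching mount entry to is_block_device to confirm that the storage is presented as a block device.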


