Compared to conventional RAID, IBM Spectrum Scale™ RAID implements
a sophisticated data and spare space disk layout scheme that allows
for arbitrarily sized disk arrays while also reducing the overhead
to clients when recovering from disk failures. To accomplish this, IBM Spectrum
Scale RAID uniformly
spreads or declusters user data, redundancy information, and
spare space across all the disks of a declustered array. Figure 1 compares a conventional
RAID layout with an equivalent declustered array.
Figure 1. Conventional RAID versus declustered
RAID layouts.
This figure is an example of how IBM Spectrum
Scale RAID improves client
performance during rebuild operations by using the throughput of all
disks in the declustered array. This is illustrated here by comparing
a conventional RAID of three arrays versus a declustered array, both
using seven disks. A conventional 1-fault-tolerant 1 + 1 replicated
RAID array in the lower left is shown with three arrays of two disks
each (data and replica strips) and a spare disk for rebuilding. To
decluster this array, the disks are divided into seven tracks, two
strips per array, as shown in the upper left. The strips from each
group are then combinatorially spread across all seven disk positions,
for a total of 21 virtual tracks, per the upper right. The strips
of each disk position for every track are then arbitrarily allocated
onto the disks of the declustered array of the lower right (in this
case, by vertically sliding down and compacting the strips from above).
The spare strips are uniformly inserted, one per disk.
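The combinatorial spreading that Figure 1 describes can be sketched in a few lines of Python. This is a minimal illustration only, not the placement algorithm that IBM Spectrum Scale RAID actually uses: it assumes that each virtual track is one data strip and one replica strip placed on a distinct pair of the seven disk positions. The names NUM_DISKS, tracks, and strips_per_disk are hypothetical.

```python
from itertools import combinations
from collections import Counter

NUM_DISKS = 7  # disk count used in the Figure 1 example

# Assumption: one virtual track = one (data, replica) strip pair,
# placed on every distinct pair of disk positions.
tracks = list(combinations(range(NUM_DISKS), 2))

# Count how many data or replica strips land on each disk.
strips_per_disk = Counter(disk for track in tracks for disk in track)

print(len(tracks))  # 21 virtual tracks
for disk in range(NUM_DISKS):
    # Every disk also holds one spare strip, inserted uniformly.
    print(disk, strips_per_disk[disk], "data/replica strips + 1 spare strip")
```

Under this assumption, every disk holds six data or replica strips plus one spare strip, which matches the seven strips per disk of the conventional layout.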
As illustrated in Figure 2, a declustered
array can significantly shorten the time that is required to recover
from a disk failure, which lowers the rebuild overhead for client
applications. When a disk fails, erased data is rebuilt using all
the operational disks in the declustered array, the bandwidth of which
is greater than that of the fewer disks of a conventional RAID group.
Furthermore, if an additional disk fault occurs during a rebuild,
the number of impacted tracks requiring repair is markedly smaller than
for the previous failure and smaller than the constant rebuild overhead of
a conventional array.
Declustered rebuild impact and client overhead can be three to four
times lower than with a conventional RAID.
Because IBM Spectrum
Scale stripes
client data across all the storage nodes of a cluster, file system
performance becomes less dependent upon the speed of any single rebuilding
storage array.
Figure 2. Lower rebuild overhead in declustered RAID versus conventional RAID.
When a single disk fails in the 1-fault-tolerant 1 + 1 conventional
array on the left, the redundant disk is read and copied onto the
spare disk, which requires a throughput of seven strip I/O operations.
When a disk fails in the declustered array, all replica strips of
the six impacted tracks are read from the surviving six disks and
then written to six spare strips, for a throughput of two strip I/O
operations. The bar chart illustrates disk read and write I/O throughput
during the rebuild operations.
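The strip I/O counts in Figure 2 can be reproduced with a small model. The following sketch is illustrative only and rests on several assumptions: the declustered layout places one data and replica pair on every distinct pair of the seven disks, each surviving disk holds one spare strip, and each rebuilt strip is written to the spare strip of the next surviving disk so that it never lands on the disk that holds its surviving replica. The function names and the rotation rule are hypothetical and are not taken from the product.

```python
from itertools import combinations
from collections import Counter

NUM_DISKS = 7        # disks in both layouts of Figure 2
STRIPS_PER_DISK = 7  # strips on each disk of the conventional layout


def conventional_rebuild_io():
    """Strip I/Os per disk when one disk of a 1 + 1 mirror fails."""
    # The surviving mirror disk is read in full; the spare disk is written in full.
    return {"mirror": STRIPS_PER_DISK, "spare": STRIPS_PER_DISK}


def declustered_rebuild_io(failed_disk):
    """Strip I/Os per surviving disk in the assumed declustered layout."""
    tracks = list(combinations(range(NUM_DISKS), 2))
    survivors = [d for d in range(NUM_DISKS) if d != failed_disk]
    io = Counter()
    for track in tracks:
        if failed_disk not in track:
            continue  # track is not impacted by this failure
        partner = track[0] if track[1] == failed_disk else track[1]
        io[partner] += 1  # read the surviving replica strip
        # Assumed rule: write the rebuilt strip to the next survivor's spare strip.
        target = survivors[(survivors.index(partner) + 1) % len(survivors)]
        io[target] += 1
    return dict(io)


conventional = conventional_rebuild_io()
declustered = declustered_rebuild_io(failed_disk=0)
print(max(conventional.values()))  # busiest disk, conventional layout
print(max(declustered.values()))   # busiest disk, declustered layout
```

In this toy model the busiest disk performs seven strip I/O operations in the conventional layout and two in the declustered layout, a ratio of 3.5 that falls within the three to four times range stated earlier.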