Pdisk-group fault tolerance: an example

Every data stripe (including user data and system configuration data) within the IBM Storage Scale RAID system is protected through a distinct form of redundancy. The strips of each stripe are placed within a constrained set of disks. Each object consists of many stripes, and each stripe consists of individual strips that together form the redundancy code protecting the object's data. These strips are distributed across a set of pdisks residing within a set of drawers, and these drawers reside within a set of enclosures.

Figure 1 shows a sample stripe placement for a vdisk that uses a RAID redundancy code of 8+2p (that is, eight data strips and two parity strips) on a five-enclosure system. The pdisk-group fault-tolerant placement puts the 10 strips of the stripe across five enclosures (each having two drawers).
Figure 1. Sample stripe placement for a vdisk using 8+2p

By spreading the individual strips across as wide a set of disk groups as possible, IBM Storage Scale RAID ensures that the loss of any set of disk groups, up to the fault tolerance of the RAID redundancy code, is survivable. In the given example, each enclosure holds two of the 10 strips and the 8+2p code can survive the loss of two strips, so the pdisk-group fault tolerance is one enclosure.

Figure 2 shows the same configuration after the loss of a full enclosure. Eight strips of the stripe are still available, so the data remains fully available even after the failure of one enclosure.
Figure 2. Configuration for a vdisk using 8+2p after the loss of a full enclosure

After the failure, the GNR software tries to rebuild the declustered array (DA), subject to the availability of space, so that the maximum fault tolerance can be achieved.

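For the same 8+2p RAID code on a four-enclosure system, the pdisk-group fault tolerance is one drawer. With 10 strips spread over four enclosures, at least one enclosure must hold three strips, which is more than the two strips the code can lose, so the RAID code under this configuration cannot survive an enclosure failure.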
Figure 3. Four-enclosure system with RAID code 8+2p

To survive an enclosure failure on a four-enclosure system, you would need to set the RAID code to 8+3p.
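
The arithmetic behind these examples can be sketched in a few lines of Python. This is only an illustrative model, not the placement algorithm that IBM Storage Scale RAID uses: it assumes that strips are spread as evenly as possible over the disk groups and that the failing groups are the ones holding the most strips. The function names are hypothetical, and the eight-drawer case assumes that each of the four enclosures has two drawers, as in the five-enclosure example.

```python
def strips_per_group(total_strips, groups):
    """Spread strips as evenly as possible over the groups, fullest groups first."""
    base, extra = divmod(total_strips, groups)
    return [base + 1] * extra + [base] * (groups - extra)


def group_fault_tolerance(data_strips, parity_strips, groups):
    """Count how many whole groups can fail, in the worst case, without data loss.

    Assumption: an N+Kp code survives the loss of at most K strips per stripe.
    """
    lost = 0
    survivable = 0
    for count in strips_per_group(data_strips + parity_strips, groups):
        if lost + count > parity_strips:
            break  # losing this group as well would exceed the code's tolerance
        lost += count
        survivable += 1
    return survivable


print(group_fault_tolerance(8, 2, 5))  # 8+2p over five enclosures  -> 1
print(group_fault_tolerance(8, 2, 4))  # 8+2p over four enclosures  -> 0
print(group_fault_tolerance(8, 2, 8))  # 8+2p over eight drawers    -> 1
print(group_fault_tolerance(8, 3, 4))  # 8+3p over four enclosures  -> 1
```

Under these assumptions the model reproduces the values stated above: one enclosure for 8+2p on five enclosures, one drawer but no enclosure for 8+2p on four enclosures, and one enclosure for 8+3p on four enclosures. The value that a real system reports can be lower.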

Limiting factor of pdisk-group fault-tolerance

IBM Storage Scale RAID selects a minimum of five-way replication for its internal configuration data and requires some physical disks to be available to hold the data that describes multiple entities within a recovery group, internally called the recovery group (rg) descriptor. Similarly, pdisk-group fault tolerance also applies to system vdisks such as loghome, logtip, and logtip backup. The actual pdisk-group fault tolerance is determined by the combined distribution of the vdisk's strips and the internal recovery group descriptor across the available failure domains (nodes, enclosures, drawers, pdisks). The fault tolerance of the internal configuration data is the limiting factor for any system or user vdisk. Hence, in some cases the actual pdisk-group fault tolerance is lower than the theoretical pdisk-group fault tolerance.
Note: Always refer to the command output to find the actual pdisk-group fault tolerance, because it varies with the RAID code and the system configuration.
In newer versions, the pdisk-group fault tolerance can be displayed with the mmvdisk command:
mmvdisk recoverygroup list --recovery-group <RgName> --all
or
mmvdisk recoverygroup list --recovery-group <RgName> --fault-tolerance
In older versions, it can be displayed with the following command:
mmlsrecoverygroup <RgName> -L
The following example, from a six-enclosure system, shows the pdisk-group fault tolerance of the system vdisk RG001LOGHOME. Theoretically it should be three enclosures, but it is limited by the recovery group descriptor, so the actual fault tolerance is two enclosures.
configuration data  disk group fault tolerance         remarks
------------------  ---------------------------------  -------
rg descriptor       2 enclosure                        limiting fault tolerance
system index        2 enclosure                        limited by rg descriptor

vdisk               RAID code        disk group fault tolerance         remarks
------------------  ---------------  ---------------------------------  -------
RG001LOGHOME        4WayReplication  2 enclosure                        limited by rg descriptor
RG001LOGTIP         2WayReplication  1 pdisk
RG001LOGTIPBACKUP   Unreplicated     0 pdisk
RG001VS001          8+2p             1 enclosure
RG001VS002          8+2p             1 enclosure
The following example, from a two-enclosure system (without drawers), shows the pdisk-group fault tolerance of the system vdisk RG001LOGHOME and the user vdisk RG001VS004. Both are limited by the recovery group descriptor and are therefore lower than the theoretical maximum.
configuration data  disk group fault tolerance         remarks
------------------  ---------------------------------  -------
rg descriptor       4 pdisk                            limiting fault tolerance
system index        4 pdisk                            limited by rg descriptor

vdisk               RAID code        disk group fault tolerance         remarks
------------------  ---------------  ---------------------------------  -------
RG001LOGHOME        4WayReplication  3 pdisk                            limited by rg descriptor
RG001LOGTIP         2WayReplication  1 pdisk                            
RG001LOGTIPBACKUP   Unreplicated     0 pdisk                            
RG001VS001          8+2p             2 pdisk                            
RG001VS004          3WayReplication  2 pdisk                            limited by rg descriptor
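
When checking many recovery groups, it can help to pick out the limited vdisks programmatically. The following Python sketch parses the vdisk table from the two-enclosure example and lists the vdisks whose remarks say they are limited by the rg descriptor. It assumes that columns are separated by runs of two or more spaces, as in the sample shown here. The real command output may be laid out differently, and the helper name `parse_vdisk_rows` is hypothetical.

```python
import re

# Vdisk table copied from the two-enclosure example in this topic.
REPORT = """\
vdisk               RAID code        disk group fault tolerance         remarks
------------------  ---------------  ---------------------------------  -------
RG001LOGHOME        4WayReplication  3 pdisk                            limited by rg descriptor
RG001LOGTIP         2WayReplication  1 pdisk
RG001LOGTIPBACKUP   Unreplicated     0 pdisk
RG001VS001          8+2p             2 pdisk
RG001VS004          3WayReplication  2 pdisk                            limited by rg descriptor
"""


def parse_vdisk_rows(report):
    """Return one dict per vdisk row, skipping the header and separator lines."""
    rows = []
    for line in report.splitlines():
        # Assumption: columns are separated by at least two spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 3 or fields[0] == "vdisk" or set(fields[0]) == {"-"}:
            continue
        rows.append({
            "vdisk": fields[0],
            "raid_code": fields[1],
            "tolerance": fields[2],
            "remarks": fields[3] if len(fields) > 3 else "",
        })
    return rows


rows = parse_vdisk_rows(REPORT)
limited = [r["vdisk"] for r in rows if "limited by rg descriptor" in r["remarks"]]
print(limited)  # ['RG001LOGHOME', 'RG001VS004']
```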