Data spare space and VCD spares
While operating with a failed pdisk in a declustered array, IBM Storage Scale RAID continues to serve file system I/O requests by using redundancy information on other pdisks to reconstruct data that cannot be read, and by marking data that cannot be written to the failed pdisk as stale. Meanwhile, to restore full redundancy and fault tolerance, the data on the failed pdisk is rebuilt onto data spare space: reserved, unused portions of the declustered array that are spread across all of the member pdisks. The failed pdisk is thereby drained of its data, which is rebuilt in the data spare space.
The amount of data spare space in a declustered array is set at creation time and can be changed later. The data spare space is expressed in whole units equivalent to the capacity of a member pdisk of the declustered array, but is spread among all of the member pdisks. There are no dedicated spare pdisks. This means that as many pdisks as the specified number of data spares can fail, and a rebuild operation can still restore full redundancy for all of the data in the declustered array. If the user chooses not to fill the space in the declustered array with vdisks, and wants to use the unallocated space as extra data spare space, the user can increase the setting of the dataSpares parameter to the desired level of resilience against pdisk failures.
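The arithmetic behind this can be sketched as follows. This is only an illustration: it assumes that all member pdisks have the same capacity and that spare space is spread evenly across them, and the function name and example figures are hypothetical rather than part of IBM Storage Scale RAID.

```python
def spare_space_summary(num_pdisks, pdisk_capacity_tib, data_spares):
    """Return (total spare space, spare space reserved per pdisk) in TiB.

    Illustrative assumptions: identical pdisks, spare space spread evenly.
    """
    if num_pdisks <= 0:
        raise ValueError("num_pdisks must be positive")
    if not 0 <= data_spares < num_pdisks:
        raise ValueError("data_spares must be between 0 and num_pdisks - 1")
    # dataSpares is counted in whole pdisk capacities...
    total_spare = data_spares * pdisk_capacity_tib
    # ...but the reserved space is declustered over every member pdisk.
    per_pdisk = total_spare / num_pdisks
    return total_spare, per_pdisk


# Hypothetical array: 40 pdisks of 4 TiB each, dataSpares set to 2.
total, per_pdisk = spare_space_summary(40, 4.0, 2)
print(f"total spare: {total} TiB, reserved per pdisk: {per_pdisk} TiB")
```

In this hypothetical array, two pdisks' worth of capacity is held back, but no single pdisk is set aside as a spare; each member pdisk gives up a small slice instead.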
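At a minimum, each declustered array normally requires data spare space that is equivalent to the size of one member pdisk. The exceptions, which have zero data spares and zero VCD spares, are declustered arrays that consist of the following:
- Non-volatile RAM disks used for a log tip vdisk
- SSDs used for a log tip backup vdisk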
IBM Storage Scale RAID vdisk configuration data (VCD) is stored more redundantly than vdisk content, typically 5-way replicated. When a pdisk fails, this configuration data is rebuilt onto functioning pdisks at the highest priority. The redundancy of configuration data must always be maintained, and IBM Storage Scale RAID will not serve a declustered array that does not have sufficient pdisks to store all configuration data at full redundancy. The declustered array parameter vcdSpares determines how many pdisks can fail and still have full VCD redundancy restored, by reserving room on each pdisk for vdisk configuration data. When using pdisk-group fault tolerance, the value of vcdSpares should be set higher than the value of the dataSpares parameter to account for the expected failure of hardware failure domains.
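The relationship between the two parameters under pdisk-group fault tolerance can be expressed as a simple pre-flight check. The sketch below is a hypothetical helper, not an IBM Storage Scale RAID interface; it encodes only the guideline stated above and does not model how the product itself chooses or enforces these values.

```python
def check_spares(data_spares, vcd_spares, uses_pdisk_group_fault_tolerance):
    """Return a list of warnings for a proposed spare configuration.

    Hypothetical helper: it checks only the documented guideline that
    vcdSpares should exceed dataSpares under pdisk-group fault tolerance.
    """
    if data_spares < 0 or vcd_spares < 0:
        raise ValueError("spare counts cannot be negative")
    warnings = []
    if uses_pdisk_group_fault_tolerance and vcd_spares <= data_spares:
        warnings.append(
            "vcdSpares (%d) should be set higher than dataSpares (%d) "
            "when pdisk-group fault tolerance is in use"
            % (vcd_spares, data_spares)
        )
    return warnings


# Hypothetical values: equal settings trigger the warning.
for message in check_spares(2, 2, uses_pdisk_group_fault_tolerance=True):
    print(message)
```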