Pdisk states

IBM Storage Scale RAID maintains its view of a pdisk and its corresponding physical disk by using a pdisk state. The pdisk state consists of multiple keyword flags, which can be displayed by using the mmlsrecoverygroup or mmlspdisk commands. You can also use the IBM Storage Scale System 3200 GUI to display pdisk states. The state of pdisks is displayed in these views: Arrays > Physical, Monitoring > System, and Monitoring > System Details. In addition, information about pdisks with a negative state (disks that should be replaced, for example) is displayed in the Monitoring > Events view.

The pdisk state flags indicate in detail how IBM Storage Scale RAID is using or managing a disk.

In normal circumstances, the state of most of the pdisks is represented by the sole keyword ok. The ok keyword means that IBM Storage Scale RAID considers the pdisk to be healthy: the recovery group server is able to communicate with the disk, the disk is functioning normally, and the disk can be used to store data. The diagnosing flag is present in the pdisk state when the IBM Storage Scale RAID disk hospital suspects, or attempts to correct, a problem. If IBM Storage Scale RAID is unable to communicate with a disk, the pdisk state includes the missing keyword. If a missing disk becomes reconnected and functions properly, its state changes back to ok. The readonly flag means that a disk is indicating that it can no longer safely write data. A disk can also be marked by the disk hospital as failing, which might be due to an excessive number of media or checksum errors. When the disk hospital concludes that a disk is no longer operating effectively, it declares the disk dead. If the number of non-functioning (dead, missing, failing, or slow) pdisks reaches or exceeds the replacement threshold of their declustered array, the disk hospital adds the replace flag to the pdisk state, which indicates that physical disk replacement should be performed as soon as possible.
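The replacement rule described above can be sketched in a few lines of Python. This is only an illustration of the counting logic, not the disk hospital's actual implementation. The function name `replace_flag_due`, the `replace_threshold` parameter, and the assumption that a pdisk state is a slash-separated string of flags (as in `dead/drained`) are all hypothetical choices made for this example.

```python
# Flags the text above treats as non-functioning when counting toward replacement.
NON_FUNCTIONING_FLAGS = {"dead", "missing", "failing", "slow"}


def replace_flag_due(pdisk_states, replace_threshold):
    """Return True when enough pdisks in one declustered array are non-functioning.

    pdisk_states: iterable of state strings, one per pdisk (assumed slash-separated).
    replace_threshold: hypothetical integer threshold for the declustered array.
    """
    non_functioning = sum(
        1
        for state in pdisk_states
        if NON_FUNCTIONING_FLAGS & set(state.split("/"))
    )
    # "Reaches or exceeds" in the text maps to >= here.
    return non_functioning >= replace_threshold


array_states = ["ok", "ok", "dead/drained", "missing", "ok"]
print(replace_flag_due(array_states, 2))  # True: two non-functioning pdisks
print(replace_flag_due(array_states, 3))  # False: threshold not reached
```

Each pdisk is counted once, even if its state carries more than one non-functioning flag.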

When the state of a pdisk indicates that it can no longer behave reliably, IBM Storage Scale RAID rebuilds the pdisk's data onto spare space on the other pdisks in the same declustered array. This is called draining the pdisk. Flags indicate whether a pdisk is draining or was drained. The draining flag means that IBM Storage Scale RAID will rebuild the data from the pdisk. The drained flag means that all of the data was successfully rebuilt onto spare space on other pdisks. The deleting flag means that the IBM Storage Scale RAID administrator issued the mmdelpdisk command to delete the pdisk.

To summarize, most of the pdisks are in the ok state during normal operation. The ok state indicates that the disk is reachable, functioning, not draining, and that the disk contains user data as well as IBM Storage Scale RAID recovery group and vdisk configuration information. A more complex example of a pdisk state is dead/drained for a single pdisk that has failed. This set of pdisk state flags indicates that the pdisk was declared dead by the system, was marked to be drained, and that all of its data (recovery group, vdisk configuration, and user) was successfully rebuilt onto the spare space on other pdisks.
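As a rough illustration of how a composite state such as dead/drained might be read, the following Python sketch splits a state string into its flags and gives a coarse summary. It assumes that flags are joined with a slash, as in the example above. The function names and the summary labels are hypothetical and are not part of IBM Storage Scale RAID.

```python
def parse_pdisk_state(state):
    """Split a composite state string such as "dead/drained" into its flags."""
    return [flag.strip() for flag in state.split("/") if flag.strip()]


def summarize(state):
    """Return a coarse, illustrative summary label for a pdisk state string."""
    flags = parse_pdisk_state(state)
    if flags == ["ok"]:
        return "healthy"
    if "drained" in flags:
        return "data rebuilt onto spare space"
    if "draining" in flags:
        return "rebuild in progress"
    return "needs attention"


print(parse_pdisk_state("dead/drained"))  # ['dead', 'drained']
print(summarize("ok"))                    # healthy
print(summarize("dead/drained"))          # data rebuilt onto spare space
```

The summary deliberately checks for the sole keyword ok first, because a healthy pdisk carries no other flags.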

In addition to the states discussed here, there are some transient pdisk states that have little impact on normal operations. Table 1 lists the complete set of states.

Table 1. Pdisk states
State             Description
ok                The disk is available.
dead              The disk completely failed.
simulatedDead     The disk is being treated as if it were dead for error injection (see mmchpdisk --simulate-dead).
missing           The disk hospital determined that the system cannot connect to the drive.
readonly          The disk has failed; it can still be read but not written.
failing           The disk needs to be drained and replaced due to a SMART trip or high uncorrectable error rate.
simulatedFailing  The disk is being treated as if it were failing for error injection (see mmchpdisk --simulate-failing).
slow              The disk needs to be drained and replaced due to poor performance.
diagnosing        The disk hospital is checking the disk after an error.
PTOW              The disk is temporarily unavailable because of a pending timed-out write.
suspended         The disk is temporarily offline for service (see mmchpdisk and mmchcarrier).
serviceDrain      The disk is being drained of data for service (see mmchpdisk --begin-service-drain).
draining          The data is being drained from the disk and moved to distributed spare space on other disks.
deleting          The disk is being deleted from the system through the mmdelpdisk, mmaddpdisk --replace, or mmchcarrier command.
drained           All of the data was successfully drained from the disk and the disk is replaceable, but the replace threshold was not met.
undrainable       As much of the data as possible was drained from the disk and moved to distributed spare space.
replace           The disk is ready for replacement.
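For scripting or reporting, the table can be treated as a simple lookup. The Python sketch below is one possible way to expand a composite state into per-flag descriptions. The descriptions are abbreviated from Table 1. The dictionary name, the `describe` function, and the slash-separated state format are assumptions made for illustration.

```python
# Descriptions abbreviated from Table 1; keys are the state keywords.
PDISK_STATE_DESCRIPTIONS = {
    "ok": "The disk is available.",
    "dead": "The disk completely failed.",
    "simulatedDead": "Treated as dead for error injection.",
    "missing": "The system cannot connect to the drive.",
    "readonly": "The disk has failed; it can be read but not written.",
    "failing": "Needs to be drained and replaced (SMART trip or high error rate).",
    "simulatedFailing": "Treated as failing for error injection.",
    "slow": "Needs to be drained and replaced due to poor performance.",
    "diagnosing": "The disk hospital is checking the disk after an error.",
    "PTOW": "Temporarily unavailable because of a pending timed-out write.",
    "suspended": "Temporarily offline for service.",
    "serviceDrain": "Being drained of data for service.",
    "draining": "Data is being moved to distributed spare space.",
    "deleting": "The disk is being deleted from the system.",
    "drained": "All data was drained; the replace threshold was not met.",
    "undrainable": "As much data as possible was drained.",
    "replace": "The disk is ready for replacement.",
}


def describe(state):
    """Return (flag, description) pairs for each flag in a state string."""
    return [
        (flag, PDISK_STATE_DESCRIPTIONS.get(flag, "unknown flag (not in Table 1)"))
        for flag in state.split("/")
    ]


for flag, text in describe("dead/drained"):
    print(f"{flag}: {text}")
```

Flags that do not appear in the table are reported as unknown rather than raising an error, so the sketch tolerates states it was not written for.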
