mmchpdisk command

Changes IBM Storage Scale RAID pdisk states. This command is to be used only in extreme situations under the guidance of IBM® service personnel.

Synopsis

mmchpdisk RecoveryGroupName --pdisk PdiskName
     { --simulate-failing | --kill | --revive | --revive-failing |
       --revive-slow | --diagnose | --suspend | --resume |
       --begin-service-drain | --end-service-drain }
     [ --identify {on | off} ] [ --clear-error-counters ]

Availability

Available on all IBM Storage Scale editions.

Description

The mmchpdisk command changes the states of pdisks.

Attention: This command is to be used only in extreme situations under the guidance of IBM service personnel.

Parameters

RecoveryGroupName: Specifies the recovery group that contains the pdisk for which one or more states are to be changed.
--pdisk PdiskName: Specifies the target pdisk.
--simulate-failing: Specifies that the pdisk is to be treated as if it were failing.
Attention: This option must be used with caution; if the total number of failures in a declustered array exceeds the fault tolerance of any vdisk in that array, permanent data loss might result.
--kill: The --kill option is deprecated. If it is specified, it behaves like --simulate-failing.
--revive: Attempts to make a failed disk usable again by removing dead, failing, and readonly pdisk state flags. Data can become readable again if the disk was not rebuilt onto spare space; however, any data that was already reported as lost cannot be recovered.
--revive-failing: Clears the "failing" pdisk state flag and resets all of the disk hospital bit error rate history for the pdisk.
Note: It might take several months to rebuild this history. Do not use this option to clear simulatedFailing; use --revive instead.
--revive-slow: Clears the "slow" pdisk state flag and resets all of the disk hospital I/O performance history for the pdisk.
--diagnose: Runs basic tests on the pdisk. If no problems are found, the pdisk state automatically returns to ok.
--suspend: Suspends I/O to the pdisk until a subsequent mmchpdisk --resume command is issued. If a pdisk remains in the suspended state for longer than a predefined timeout period, IBM Storage Scale RAID begins rebuilding the data from that pdisk into spare space.
Attention: Use this option with caution and only when performing maintenance on disks manually, bypassing the automatic system provided by mmchcarrier.
If a pdisk is removed by using this option, vdisks that store data on it are temporarily degraded, requiring data that was stored on the removed pdisk to be rebuilt from redundant data. Also, if you try to remove more pdisks than the redundancy level of the least redundant vdisk in that declustered array, data becomes inaccessible.

Therefore, when preparing to remove a pdisk, use the --begin-service-drain and --end-service-drain options instead of this option.
--resume: Cancels a previously run mmchpdisk --suspend command and resumes use of the pdisk.
Use this option only when performing maintenance on disks manually and bypassing the automatic system provided by mmchcarrier.
--begin-service-drain: Starts draining the specified pdisk so that it can be temporarily removed. After issuing the command with this option, wait until the pdisk is drained before removing it.
Note: This process requires sufficient spare space in the declustered array for the data that is to be drained. If the available spare space is insufficient, it can be increased with the mmchrecoverygroup command.
--end-service-drain: Returns drained data to a pdisk after the pdisk is brought back online.
--identify {on | off}: Turns on or off the disk identify light, if available.
--clear-error-counters: Resets the IOErrors, IOTimeouts, mediaErrors, checksumErrors, and pathErrors counters.

Exit status

0: Successful completion.
nonzero: A failure occurred.
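
Because the command reports only success or failure through its exit status, a calling script can branch on that value. The following is a minimal sketch, not an IBM-supplied tool: run_checked is a hypothetical helper name, and the commented usage line reuses the recovery group and pdisk names from the Examples section.

```shell
# Run a command and report the outcome based only on its exit status.
# run_checked is a hypothetical helper, not part of IBM Storage Scale RAID.
run_checked() {
    if "$@"; then
        echo "ok: $*"
        return 0
    else
        rc=$?   # exit status of the command that just failed
        echo "failed (exit status $rc): $*" >&2
        return "$rc"
    fi
}

# Usage:
# run_checked mmchpdisk 000DE37BOT --pdisk c036d3 --revive
```

The helper passes the original exit status back to its caller, so it can be used in conditionals in the same way as the command itself.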

Security

You must have root authority to run the mmchpdisk command.

The node on which the command is issued must be able to execute remote shell commands on any other node in the cluster without the use of a password and without producing any extraneous messages. For additional details, see the topic Requirements for administering IBM Storage Scale RAID in IBM Storage Scale RAID: Administration.

Examples

The following command example shows how to instruct IBM Storage Scale RAID to try to revive the failed pdisk c036d3 in recovery group 000DE37BOT:

mmchpdisk 000DE37BOT --pdisk c036d3 --revive
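
The following sketch shows how the --begin-service-drain and --end-service-drain options might be paired when a pdisk must be temporarily removed. The names service_drain and DRY_RUN are hypothetical and are introduced here only for illustration. By default the helper prints the command instead of running it, which suits a command that is meant to be used under the guidance of IBM service personnel. The recovery group and pdisk names are reused from the previous example.

```shell
# Print (default) or run one half of a service drain for a single pdisk.
# service_drain and DRY_RUN are hypothetical names, not part of IBM Storage Scale RAID.
# Nothing is changed unless DRY_RUN=0 is set explicitly.
service_drain() {
    if [ "$#" -ne 3 ]; then
        echo "usage: service_drain {begin|end} RecoveryGroupName PdiskName" >&2
        return 2
    fi
    phase=$1
    rg=$2
    pdisk=$3
    case "$phase" in
        begin|end) ;;
        *)
            echo "usage: service_drain {begin|end} RecoveryGroupName PdiskName" >&2
            return 2
            ;;
    esac
    # Build the mmchpdisk invocation from the documented options only.
    set -- mmchpdisk "$rg" --pdisk "$pdisk" "--${phase}-service-drain"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Usage:
# service_drain begin 000DE37BOT c036d3   # then wait until the pdisk is drained
# service_drain end 000DE37BOT c036d3     # after the pdisk is back online
```

Between the two calls, wait until the pdisk is drained before removing it. The pdisk state can be inspected with the mmlspdisk command.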

Location

/usr/lpp/mmfs/bin