If a failed node cannot be recovered, auto recovery migrates
all data from the disks in that node to other disks in the cluster. If the
node does not recover, delete the disks in the node and then the node itself:
- Log in to another cluster node and run the mmlsdisk <fsName> -M
command to get a list of the disks that are attached to the failed
node. Save the disk names in the diskList file, one disk name per line.
- Run the mmdeldisk <fsName> -F <diskList> command
to delete the disks attached to the failed node.
- Run the mmdelnsd -F <diskList> command
to delete the NSDs attached to the failed node.
- Run the mmdelnode command to remove the node. Alternatively, if you are
replacing the node with new hardware, keep the same name and IP address
and continue with the replacement steps.
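
The disk list used in the steps above can be assembled with a small text filter. The following is a minimal sketch, not a supported tool. It assumes that the mmlsdisk -M output was captured as text, that the disk name is in the first column, and that the node that performs I/O is in the second column. The function name disks_on_node and the sample rows are hypothetical; check the column layout against the real output on your cluster before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the disks served by a given node.
# Assumes input rows of the form "<diskName> <ioNode> <device> <availability>".
disks_on_node() {
    node="$1"
    # Keep only rows whose second column equals the node name.
    awk -v node="$node" '$2 == node { print $1 }'
}

# Illustrative input only; this is not captured mmlsdisk output.
sample='Disk name    IO performed on node    Device     Availability
------------ ----------------------- ---------- ------------
nsd_a1       node1                   /dev/sdb   up
nsd_b1       node2                   /dev/sdb   up
nsd_b2       node2                   /dev/sdc   up'

# Prints one disk name per line; redirect to a file to create the disk list.
printf '%s\n' "$sample" | disks_on_node node2
```

Redirecting the output to a file gives the one-name-per-line format that the -F option of mmdeldisk and mmdelnsd reads.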
To replace the failed node with a new node, start the replacement node with the
hostname and the IP address of the failed node. Install the IBM Storage Scale
packages and configure SSH authorization with the other nodes in the cluster. Run
the following command to restore the IBM Storage Scale configuration
on the replacement node:
mmsdrrestore -p <cluster manager> -R <remoteFileCopyCommand> -N <replacement node>
Use the mmlsmgr command to identify the cluster manager node.
For <remoteFileCopyCommand>, specify the remote file copy command that is
configured for the cluster.
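
As an illustration of how the cluster manager name feeds into the restore command, the sketch below extracts it from saved mmlsmgr output and prints the resulting mmsdrrestore command without running it. It assumes that the output contains a line of the form "Cluster manager node: <ip> (<nodeName>)". The function name cluster_manager, the sample addresses and node names, and the /usr/bin/scp copy command are assumptions for the example only; use the values that apply to your cluster.

```shell
#!/bin/sh
# Hypothetical helper: extract the cluster manager node name from mmlsmgr output.
# Assumes a line such as:  Cluster manager node: <ip> (<nodeName>)
cluster_manager() {
    sed -n 's/^Cluster manager node: .*(\(.*\))[[:space:]]*$/\1/p'
}

# Illustrative input only; addresses and node names are made up.
sample='file system      manager node
---------------- ------------------
fs1              192.0.2.11 (node1)

Cluster manager node: 192.0.2.12 (node2)'

mgr="$(printf '%s\n' "$sample" | cluster_manager)"

# Dry run: print the command instead of executing it.
# node5 stands for the replacement node; /usr/bin/scp is an assumed copy command.
echo "mmsdrrestore -p $mgr -R /usr/bin/scp -N node5"
```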
- Start IBM Storage Scale on the replacement node by running the
mmstartup -N <replacement node> command. Confirm that the IBM Storage Scale
state is active by running the mmgetstate -N <replacement node> command.
- Prepare a stanza file for the disks in the replacement node, create the NSDs
by running the mmcrnsd command, and add these disks to the file system by
running the mmadddisk command.
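
The stanza file for the last step can be written by hand or generated. The following is a minimal sketch that emits one %nsd stanza per device for a single server node. The function name make_stanzas, the NSD naming scheme, the device paths, and the usage, failureGroup, and pool values are all placeholders chosen for illustration; adapt them to the layout of your file system.

```shell
#!/bin/sh
# Hypothetical helper: emit one %nsd stanza per device for one server node.
# Arguments: <nodeName> <failureGroup> <device>...
make_stanzas() {
    node="$1"; fg="$2"; shift 2
    for dev in "$@"; do
        printf '%%nsd:\n'
        printf '  device=%s\n' "$dev"
        # Assumed naming scheme: <nodeName>_<deviceBasename>
        printf '  nsd=%s_%s\n' "$node" "$(basename "$dev")"
        printf '  servers=%s\n' "$node"
        printf '  usage=dataAndMetadata\n'   # placeholder value
        printf '  failureGroup=%s\n' "$fg"
        printf '  pool=system\n\n'           # placeholder value
    done
}

# Example: two disks on a replacement node named node5, failure group 5.
make_stanzas node5 5 /dev/sdb /dev/sdc
```

Saved to a file, the output can be passed to mmcrnsd -F <stanzaFile> and then to mmadddisk <fsName> -F <stanzaFile>.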