Repair of cluster configuration information when no CCR backup information is available: mmsdrrestore command
When no CCR backup information is available, you can repair missing or corrupted CCR files with the mmsdrrestore command and the --ccr-repair parameter.
The mmsdrrestore command with the --ccr-repair parameter can repair the cluster configuration information when no intact CCR state can be found on any quorum node and no CCR backup is available from which to recover it. In this state, the CCR committed directory on every quorum node contains corrupted files or is missing files. For more information, see mmsdrrestore command.
This procedure does not guarantee recovery of the most recent state of every configuration file in the CCR. Instead, it brings the CCR back into a consistent state by using the most recent available version of each configuration file.
For an example of running the mmsdrrestore command with the --ccr-repair parameter, see mmsdrrestore command.
The CCR committed directory can end up with corrupted or lost files on all quorum nodes in situations such as the following:
- All the quorum nodes crash or lose power at the same time, and on each quorum node one or more files in the CCR committed directory are corrupted or lost after the node starts up again.
- The local disks of all the quorum nodes lose power, and on each quorum node one or more files in the CCR committed directory are corrupted or lost after the local disks come back.
Checking for corrupted or lost files in the CCR committed directory
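To check the CCR committed directory of the current node for corrupted or lost files, issue the following command on that node:
mmccr check -Y -e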
In the following example, the next-to-last line of the output indicates that one or more files are corrupted or lost in the CCR committed directory of the current node:
# mmccr check -Y -e
mmccr::HEADER:version:reserved:reserved:NodeId:CheckMnemonic:ErrorCode:ErrorMsg:
ListOfFailedEntities:ListOfSucceedEntities:Severity:
mmccr::0:1:::1:CCR_CLIENT_INIT:0:::/var/mmfs/ccr,/var/mmfs/ccr/committed,/var/mmfs/ccr/ccr.nodes,
Security,/var/mmfs/ccr/ccr.disks:OK:
mmccr::0:1:::1:FC_CCR_AUTH_KEYS:0:::/var/mmfs/ssl/authorized_ccr_keys:OK:
mmccr::0:1:::1:FC_CCR_PAXOS_CACHED:0:::/var/mmfs/ccr/cached,/var/mmfs/ccr/cached/ccr.paxos:OK:
mmccr::0:1:::1:FC_CCR_PAXOS_12:0:::/var/mmfs/ccr/ccr.paxos.1,/var/mmfs/ccr/ccr.paxos.2:OK:
mmccr::0:1:::1:PC_LOCAL_SERVER:0:::c80f5m5n01.gpfs.net:OK:
mmccr::0:1:::1:PC_IP_ADDR_LOOKUP:0:::c80f5m5n01.gpfs.net,0.000:OK:
mmccr::0:1:::1:PC_QUORUM_NODES:0:::192.168.80.181,192.168.80.182:OK:
mmccr::0:1:::1:FC_COMMITTED_DIR:5:Files in committed directory missing or corrupted:1:6:WARNING:
mmccr::0:1:::1:TC_TIEBREAKER_DISKS:0:::1:OK:
In the following example, the next-to-last line of the output indicates that none of the files in the CCR committed directory of the current node are corrupted or lost:
# mmccr check -Y -e
mmccr::HEADER:version:reserved:reserved:NodeId:CheckMnemonic:ErrorCode:ErrorMsg:
ListOfFailedEntities:ListOfSucceedEntities:Severity:
mmccr::0:1:::1:CCR_CLIENT_INIT:0:::/var/mmfs/ccr,/var/mmfs/ccr/committed,/var/mmfs/ccr/ccr.nodes,
Security,/var/mmfs/ccr/ccr.disks:OK:
mmccr::0:1:::1:FC_CCR_AUTH_KEYS:0:::/var/mmfs/ssl/authorized_ccr_keys:OK:
mmccr::0:1:::1:FC_CCR_PAXOS_CACHED:0:::/var/mmfs/ccr/cached,/var/mmfs/ccr/cached/ccr.paxos:OK:
mmccr::0:1:::1:FC_CCR_PAXOS_12:0:::/var/mmfs/ccr/ccr.paxos.1,/var/mmfs/ccr/ccr.paxos.2:OK:
mmccr::0:1:::1:PC_LOCAL_SERVER:0:::c80f5m5n01.gpfs.net:OK:
mmccr::0:1:::1:PC_IP_ADDR_LOOKUP:0:::c80f5m5n01.gpfs.net,0.000:OK:
mmccr::0:1:::1:PC_QUORUM_NODES:0:::192.168.80.181,192.168.80.182:OK:
mmccr::0:1:::1:FC_COMMITTED_DIR:0::0:7:OK:
mmccr::0:1:::1:TC_TIEBREAKER_DISKS:0:::1:OK:
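The two examples differ only in the FC_COMMITTED_DIR record, so the check can be scripted. The following is a minimal sketch, not an official tool. It assumes that each record arrives as a single colon-delimited line (the wrapped lines in the examples above are taken to be display wrapping) and that the field positions match the HEADER line shown: CheckMnemonic is field 8, ErrorCode is field 9, ListOfFailedEntities is field 11, and ListOfSucceedEntities is field 12. The function name committed_dir_status and the OK, DAMAGED, and UNKNOWN labels are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `mmccr check -Y -e` output on stdin and reports
# the result of the FC_COMMITTED_DIR check.
# Exit status: 0 = no problem reported, 1 = files missing or corrupted,
#              2 = no FC_COMMITTED_DIR record found in the input.
committed_dir_status() {
    awk -F: '
        # Field 8 holds the check mnemonic (assumption based on the HEADER line).
        $8 == "FC_COMMITTED_DIR" {
            found = 1
            if ($9 + 0 != 0) {
                # Nonzero ErrorCode: print the code and the failed/succeeded fields.
                printf "DAMAGED code=%s failed=%s ok=%s\n", $9, $11, $12
                bad = 1
            } else {
                printf "OK ok=%s\n", $12
            }
        }
        END {
            if (!found) {
                print "UNKNOWN"
                exit 2
            }
            exit bad + 0
        }
    '
}
```

On a quorum node, the function could be used as `mmccr check -Y -e | committed_dir_status`. With the first example above it would print `DAMAGED code=5 failed=1 ok=6`, and with the second `OK ok=7`. Because the condition described in this topic affects all quorum nodes, the check would need to be repeated on each of them.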