Disaster recovery scenarios for CCR
The procedure for recovering from a broken CCR depends on the failure scenario. The following scenarios are covered:
- Recovering from a single quorum or non-quorum node failure
- A node failure might occur when the node is completely corrupted or when the node has been rebuilt
from scratch. In this scenario, only one quorum node is broken, and enough quorum
nodes remain available on which CCR is running without any issue. This procedure also applies when a
single non-quorum node must be recovered.
Command to apply to restore the configuration information: mmsdrrestore -p <GOOD_QUORUM_NODE>
For more information, see Recovering from a single quorum or non-quorum node failure.
- Recovering from the loss of a majority of quorum nodes
- In this case, a majority of quorum nodes are broken but there is still at least one quorum node
available with an intact CCR state.
Command to apply to restore the configuration information: mmchnode --noquorum -N <LIST_OF_BROKEN_QUORUM_NODES> --force
For more information, see Recovering from the loss of a majority of quorum nodes.
- Recovering from damage or loss of the CCR on all quorum nodes
- In this case, the CCR is partially broken on all quorum nodes. This means that there are still
fragments of the CCR state available on various quorum nodes but no quorum node is available with a
complete, intact CCR state.
Command to apply to restore the configuration information: mmsdrrestore --ccr-repair
For more information, see Recovering from damage or loss of the CCR on all quorum nodes.
- Recovering from an existing CCR backup
- In this case, the CCR state on all quorum nodes is lost, that is, access to
/var/mmfs or /var/mmfs/ccr is lost on all quorum nodes. This procedure
assumes that a valid CCR backup is still available.
Command to apply to restore the configuration information: mmsdrrestore -F <PATH_TO_CCR_BACKUP_FILE> -a
For more information, see Recovering from an existing CCR backup.
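The mapping from scenario to command in the list above can be summarized as a small dry-run helper. This is only a sketch: the function name ccr_recovery_command and the scenario keywords (single-node, majority-lost, all-damaged, from-backup) are hypothetical labels invented for this example and are not part of the product. The helper only prints the command that the list above names for each scenario; it never runs it, and the node name, node list, or backup path is passed through exactly as given.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (never execute) the recovery command for a scenario.
# The function name and scenario keywords are hypothetical, chosen for this
# example only. The printed commands are the ones listed in this topic.
ccr_recovery_command() {
    local scenario="$1"
    local arg="${2:-}"

    case "$scenario" in
        single-node)
            # arg: a quorum node that still has an intact CCR state
            [ -n "$arg" ] || { echo "missing good quorum node" >&2; return 2; }
            echo "mmsdrrestore -p $arg"
            ;;
        majority-lost)
            # arg: the list of broken quorum nodes, passed through unchanged
            [ -n "$arg" ] || { echo "missing list of broken quorum nodes" >&2; return 2; }
            echo "mmchnode --noquorum -N $arg --force"
            ;;
        all-damaged)
            # CCR is partially broken on all quorum nodes; no argument needed
            echo "mmsdrrestore --ccr-repair"
            ;;
        from-backup)
            # arg: path to a valid CCR backup file
            [ -n "$arg" ] || { echo "missing path to CCR backup file" >&2; return 2; }
            echo "mmsdrrestore -F $arg -a"
            ;;
        *)
            echo "unknown scenario: $scenario" >&2
            return 1
            ;;
    esac
}
```

For example, calling ccr_recovery_command all-damaged prints mmsdrrestore --ccr-repair. Printing instead of executing is deliberate here: each of these commands changes cluster configuration, so the detailed procedure for the scenario should be reviewed before anything is run.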
The following sections describe the different recovery cases in more detail with examples.