[Linux][V9.0.4 Oct 2017]

Replacing a failed node

If one of the nodes in your HA group fails, you can replace it.

About this task

The steps for replacing a node depend on the scenario:
  • If the replacement node has an identical configuration to the failed node, you can replace the node without disrupting the HA group.
  • If the replacement node has a different configuration, you must delete the HA group and then rebuild it.

Procedure

  • If the replacement node is configured identically to the failed node (same host name, same IP addresses, and so on), complete the following steps on the new node:
    1. Create an rdqm.ini file that matches the files on the other nodes, and then run the rdqmadm -c command (see Defining the Pacemaker cluster (HA group)).
    2. Run the crtmqm -sxs qmanager command to recreate each replicated data queue manager (see Creating an HA RDQM).
  • If the replacement node has a different configuration from the failed node, complete the following steps:
    1. Delete the replicated data queue managers from the other nodes in the HA group by using the dltmqm command (see Deleting an HA RDQM).
    2. Unconfigure the Pacemaker cluster by using the rdqmadm -u command (see Deleting the Pacemaker cluster (HA group)).
    3. Reconfigure the Pacemaker cluster, including the information for the new node, by using the rdqmadm -c command (see Defining the Pacemaker cluster (HA group)).
    4. Run the crtmqm -sxs qmanager command to recreate each replicated data queue manager (see Creating an HA RDQM).
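Example

The two command sequences above can be sketched as a small shell script. This is a minimal illustration, not a supported tool. The helper names (run, replace_identical_node, rebuild_ha_group), the DRY_RUN variable, and the queue manager names QM1 and QM2 are hypothetical. The script only shows the order of the commands named in this topic: it assumes that the rdqm.ini file is already in place, it does not decide which node each command runs on, and it omits any additional options or preparatory steps (such as ending queue managers before deleting them) that the linked topics might require. By default it prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints each command; DRY_RUN=0 runs it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print or execute a command, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Scenario 1: the replacement node has the same configuration as the failed node.
# Run on the new node, after rdqm.ini has been created to match the other nodes.
# Arguments: the names of the replicated data queue managers to recreate.
replace_identical_node() {
    run rdqmadm -c                  # define the Pacemaker cluster on this node
    for qm in "$@"; do
        run crtmqm -sxs "$qm"       # recreate each replicated data queue manager
    done
}

# Scenario 2: the replacement node has a different configuration.
# Shows command order only; see the linked topics for which node runs each step.
# Arguments: the names of the replicated data queue managers in the HA group.
rebuild_ha_group() {
    for qm in "$@"; do
        run dltmqm "$qm"            # delete each replicated data queue manager
    done
    run rdqmadm -u                  # unconfigure the Pacemaker cluster
    run rdqmadm -c                  # reconfigure it, including the new node
    for qm in "$@"; do
        run crtmqm -sxs "$qm"       # recreate each replicated data queue manager
    done
}
```

For example, calling replace_identical_node QM1 QM2 with the default DRY_RUN=1 prints the rdqmadm -c command followed by one crtmqm -sxs command for each queue manager, so you can review the sequence before you run anything.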