Repairing the cluster manager domain

If a failure in a Db2® pureScale® instance requires the cluster manager domain to be re-created, use the db2cm command to re-create it.

Before you begin

The Db2 instance must be stopped before you perform this task. All nodes in the cluster must be reachable and have passwordless root SSH (or db2locssh) configured between them.

If a quorum device is configured for the instance, repairing the cluster manager domain requires passwordless root SSH and secure copy protocol (SCP) to be enabled between all cluster nodes and between all cluster nodes and the quorum device host. Ensure that you can use passwordless root SSH between all hosts, including the quorum device host, as well as locally on the current host. This access is required to reconfigure the quorum device.
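The SSH prerequisite can be checked before starting the repair. The following is a minimal sketch, not part of the Db2 tooling: the function name check_root_ssh and the SSH_CMD override are hypothetical, and the host names you pass in are your own. It only tests connections from the current host, so it does not prove that every pair of hosts can reach each other, and it does not test SCP.

```shell
#!/bin/sh
# Preflight sketch: verify passwordless root SSH from this host to each listed host.
# SSH_CMD is a hypothetical override (defaults to ssh) so the loop can be dry-run.
check_root_ssh() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" true; then
            echo "OK: ${host}"
        else
            echo "FAILED: ${host}"
            failed=1
        fi
    done
    # Non-zero if any host could not be reached without a password prompt.
    return "$failed"
}
```

Run it as root with every cluster node, the current host, and the quorum device host (if one is configured) as arguments, for example: check_root_ssh node1 node2 qdevhost. Repeat the check from each node to cover all host pairs.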

About this task

The db2cm command re-creates the domain with the same topology and configuration as the existing domain, including the quorum device if one exists.


Restrictions

The command used in this task can be run only by the System Administrator.

Procedure

  1. Use the DB2INSTANCE environment variable to specify the target instance.
    export DB2INSTANCE=<inst-name>
  2. Issue the db2cm command with the -repair -domain option from the install directory or the sqllib/bin directory.
    db2cm -repair -domain <domain-name>
    The cluster manager domain name parameter is optional. If no cluster manager domain name is specified, the current domain is repaired. If a cluster manager domain name is specified, it must match the name of the current domain. To obtain the cluster manager domain name, run the db2cm command: db2cm -list -domain. (You can also obtain the domain name with the db2greg -dump command.)

    If the cluster manager domain is in an unhealthy state, hosts are in maintenance mode, or resources are still online, the db2cm command might fail and indicate that it should be re-issued with the -force option. In these cases, re-issuing the command with the -force option re-creates the cluster manager domain.

    If the cluster manager domain contains a quorum device and passwordless root SSH (or db2locssh starting in Db2® 12.1.3) is not enabled between the cluster nodes and the quorum device host, the db2cm command fails. In Db2® 12.1.2 and later, the command can be re-issued with the -force option to bypass this requirement. Re-issuing the command with the -force option re-creates the cluster manager domain without reconfiguring the quorum device.

Results

After successful re-creation of the cluster manager domain, bring the instance back online using the db2start command. If the cluster manager domain cannot be successfully re-created, contact an IBM Service Representative for more information about how to recover from this problem.
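The repair-then-start sequence can be scripted so that db2start runs only after a successful repair. The following is a minimal sketch under stated assumptions: it assumes that db2cm and db2start return a non-zero exit status on failure (this topic does not state that), and the function name repair_and_start and the DB2CM and DB2START overrides are hypothetical. The domain name is omitted, so the current domain is repaired. The sketch deliberately does not retry with -force; that decision is left to the administrator after reviewing the db2cm output.

```shell
#!/bin/sh
# Sketch: repair the current domain, then start the instance only if the repair succeeded.
# Assumption: db2cm and db2start exit non-zero on failure.
# DB2CM and DB2START are hypothetical overrides for dry runs.
repair_and_start() {
    inst=$1
    # Select the target instance, as in step 1 of the procedure.
    DB2INSTANCE=$inst
    export DB2INSTANCE
    # No domain name is given, so the current domain is repaired.
    if ! ${DB2CM:-db2cm} -repair -domain; then
        echo "Domain repair failed; review the db2cm output before retrying." >&2
        return 1
    fi
    # Bring the instance back online.
    ${DB2START:-db2start}
}
```

Call it from the install directory or the sqllib/bin directory so that db2cm can be found, for example: repair_and_start myinst1.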

Example

A DBA with System Administrator authority needs to re-create a cluster manager domain, MYDOMAIN, in Db2 instance myinst1.
export DB2INSTANCE=myinst1
db2cm -repair -domain MYDOMAIN
As the domain is torn down and re-created, db2cm issues informational messages about the progress and the successful completion of the operation:
 Deleting the domain 'MYDOMAIN' from the cluster ...
 Deleting the domain 'MYDOMAIN' from the cluster was successful.
 Creating domain 'MYDOMAIN' in the cluster ...
 Creating domain 'MYDOMAIN' in the cluster was successful.
 The resource model for the instance 'myinst1' has been re-created.
 The cluster manager domain has been successfully repaired.