Restore from a saved Pacemaker cluster configuration

If the cluster needs to be recreated, you can restore a saved Pacemaker configuration that was taken on the current hardware.

Before you begin

Important: In Db2® 11.5.8 and later, Mutual Failover high availability is supported when using Pacemaker as the integrated cluster manager. In Db2 11.5.6 and later, the Pacemaker cluster manager for automated fail-over to HADR standby databases is packaged and installed with Db2. In Db2 11.5.5, Pacemaker is included and available for production environments. In Db2 11.5.4, Pacemaker is included as a technology preview only, for development, test, and proof-of-concept environments.
Note: Due to corosync requirements, you must configure passwordless root secure shell (SSH) access and enable secure copy protocol (SCP) for the primary, standby, and qdevice hosts when importing a cluster configuration file that includes a quorum device. Ensure that passwordless root SSH works between all hosts, as well as locally on the current host. Passwordless root SSH is required only for the initial configuration and for subsequent configuration changes; it is not needed for ongoing QDevice operation.
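Before you import, you can confirm the SSH prerequisite from the current host. The following is a minimal sketch, assuming placeholder host names (primary-host, standby-host, qdevice-host) that you must replace with your own; the check_ssh helper is illustrative and not part of Db2:

```shell
#!/bin/sh
# Sketch: verify passwordless root SSH to each cluster host before
# importing the configuration. Host names below are placeholders.
check_ssh() {
    # BatchMode forbids password prompts, so the command fails fast
    # whenever passwordless SSH is not actually configured.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true 2>/dev/null
}

for host in primary-host standby-host qdevice-host "$(hostname)"; do
    if check_ssh "$host"; then
        echo "$host: passwordless root SSH OK"
    else
        echo "$host: passwordless root SSH FAILED" >&2
    fi
done
```

Including the current host in the loop covers the "locally on the current host" requirement from the note above.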

About this task

You can redeploy a Pacemaker cluster configuration and resource model from a saved configuration by using the db2cm -import option.

Note: A backup configuration cannot be imported on a new set of hosts where any of the following details are different from the original cluster:
  • Host names
  • Domain name
  • Interface names
  • Instance names
  • Database names
  • Primary/Standby virtual IP addresses
  • Qdevice host
To import a configuration on a new set of hosts, follow the example in this technote.
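Because the backup is tied to the identity of the original hosts, it can help to capture the operating-system-level details on each candidate host and compare them with the hosts the backup was taken from. The following is a minimal sketch using standard Linux utilities; which fields you compare is up to you:

```shell
#!/bin/sh
# Sketch: print the host-level details that must match the original
# cluster before a saved configuration can be imported. Run on each
# host and diff the output against the original hosts.
echo "Host name:   $(hostname -s)"
echo "Domain name: $(hostname -d)"
# Interface names, excluding the loopback device.
echo "Interfaces:  $(ls /sys/class/net | grep -v '^lo$' | tr '\n' ' ')"
```

Instance names, database names, and the virtual IP addresses come from the Db2 and db2cm configuration and must be checked separately.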

Procedure

  1. As the root user, prepare the current cluster for the restoration:
    Warning: Any virtual IPs (VIPs) configured via the db2cm utility are removed when you run the db2cm -delete -cluster command, which can temporarily disconnect database clients.
    ./sqllib/bin/db2cm -delete -cluster
  2. As the root user, ensure that the cluster's resources and domain have been removed successfully:
    crm status
  3. As the root user, determine if each Db2 HADR database is in "PEER" state:
    ./sqllib/adm/db2pd -hadr -db <dbname> | grep HADR_STATE
    If any databases are not in PEER state, check the db2diag.log for any problems encountered by HADR.
  4. Import the previous cluster configuration:
    ./sqllib/bin/db2cm -import <path to backup file>
  5. As the root user, verify that both nodes and all resources are online:
    ./sqllib/bin/db2cm -list
    Both nodes and all resources should show as Online under Node information and Resource Information, respectively. If a Qdevice was configured when the configuration backup was taken, it appears under Quorum Information.
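The PEER check in step 3 can be scripted so the import does not proceed while HADR is out of sync. The following is a minimal sketch, assuming db2pd output of the form shown in the Examples section; the all_peer helper is illustrative and not part of Db2:

```shell
#!/bin/sh
# Sketch: return success only if every HADR_STATE line reports PEER.
# Feed it the output of: ./sqllib/adm/db2pd -hadr -db <dbname>
all_peer() {
    awk '
        /HADR_STATE/ {
            seen = 1
            if ($NF != "PEER") bad = 1
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Usage against a live database (instance path is a placeholder):
#   ./sqllib/adm/db2pd -hadr -db HADRDB | all_peer || echo "not in PEER state"
```

If the check fails, review db2diag.log for HADR problems before importing, as described in step 3.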

Examples

The following example shows the command syntax and output from running db2cm to prepare a cluster for restoration:
[root@jesting1]$ /home/<instance user>/sqllib/bin/db2cm -delete -cluster
Cluster deleted successfully.
The following example shows the command syntax and output from running crm to see if the cluster's resources and domain have been removed successfully:
[root@jesting1]$ crm status
ERROR: status: crm_mon (rc=102): Error: cluster is not available on this node
The following example shows the command syntax and output from running db2pd to see if each HADR database is in PEER state:
./sqllib/adm/db2pd -hadr -db HADRDB | grep HADR_STATE
                      HADR_STATE = PEER
                      HADR_STATE = PEER
The following example shows the command syntax and output from importing a saved cluster configuration:
[root@jesting1]$ /home/<instance user>/sqllib/bin/db2cm -import /tmp/backup.conf
Importing cluster configuration from /tmp/backup.conf...
Import completed successfully.
The following example shows the command syntax and output from verifying that both nodes are running:
/home/db2inst1/sqllib/bin/db2cm -list
      Cluster Status

Domain information:
Domain name               = pcmkdomain
Pacemaker version         = 2.0.2-1.db2pcmk.el8
Corosync version          = 3.0.3
Current domain leader     = jesting1
Number of nodes           = 2
Number of resources       = 6

Node information:
Node name           State
----------------    --------
jesting1            Online
inwards1            Online

Resource Information:

Resource Name             = db2_db2inst1_db2inst1_HADRDB
  Resource Type                 = HADR
    DB Name                     = HADRDB
    Managed                     = true
    HADR Primary Instance       = db2inst1
    HADR Primary Node           = jesting1
    HADR Primary State          = Online
    HADR Standby Instance       = db2inst1
    HADR Standby Node           = inwards1
    HADR Standby State          = Online

Resource Name             = db2_inwards1_db2inst1_0
  State                         = Online
  Managed                       = true
  Resource Type                 = Instance
    Node                        = inwards1
    Instance Name               = db2inst1

Resource Name             = db2_inwards1_eth1
  State                         = Online
  Managed                       = true
  Resource Type                 = Network Interface
    Node                        = inwards1
    Interface Name              = eth1

Resource Name             = db2_jesting1_db2inst1_0
  State                         = Online
  Managed                       = true
  Resource Type                 = Instance
    Node                        = jesting1
    Instance Name               = db2inst1

Resource Name             = db2_jesting1_eth1
  State                         = Online
  Managed                       = true
  Resource Type                 = Network Interface
    Node                        = jesting1
    Interface Name              = eth1

Fencing Information:
  Not Configured
Quorum Information:
  Qdevice

Qdevice information
-------------------
Model:                  Net
Node ID:                1
Configured node list:
    0   Node ID = 1
    1   Node ID = 2
Membership node list:   1, 2

Qdevice-net information
----------------------
Cluster name:           pcmkdomain
QNetd host:             frizzly1:5403
Algorithm:              LMS
Tie-breaker:            Node with lowest node ID
State:                  Connected