Creating an HADR Db2 instance on a Pacemaker-managed Linux cluster

You can create and administer a Pacemaker-managed Db2 Linux cluster that is configured for High Availability Disaster Recovery (HADR) by using the Db2® cluster manager (db2cm) utility.

Before you begin

Important: In Db2 11.5.8 and later, Mutual Failover high availability is supported when using Pacemaker as the integrated cluster manager. In Db2 11.5.6 and later, the Pacemaker cluster manager for automated failover to HADR standby databases is packaged and installed with Db2. In Db2 11.5.5, Pacemaker is included and available for production environments. In Db2 11.5.4, Pacemaker is included as a technology preview only, for development, test, and proof-of-concept environments.
Before proceeding with the configuration, ensure that the following system dependencies are in place:
  • The Pacemaker cluster software stack must be installed on all hosts in the cluster. For more information, refer to Installing the Pacemaker cluster software stack.
  • The Db2 instances and HADR database should be configured and online.
  • The HADR database must be configured with either SYNC or NEARSYNC as the HADR_SYNCMODE.
  • The HADR database HADR_PEER_WINDOW should be set to the recommended value of 120 seconds (60 seconds minimum).
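The two HADR settings in the list above can be checked before you continue. The following is a minimal sketch, not part of the Db2 tooling: check_hadr_cfg is a hypothetical helper that reads database configuration text on standard input, and it assumes each parameter appears as "(PARAMETER_NAME) = value", which is the usual layout of db2 get db cfg output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Db2): validates HADR_SYNCMODE and
# HADR_PEER_WINDOW from database configuration text supplied on stdin.
# Assumption: each parameter is printed as "(PARAM_NAME) = value".
check_hadr_cfg() {
    local cfg syncmode window
    cfg=$(cat)
    syncmode=$(printf '%s\n' "$cfg" |
        sed -n 's/.*(HADR_SYNCMODE) *= *\([A-Za-z]*\).*/\1/p' | head -n 1)
    window=$(printf '%s\n' "$cfg" |
        sed -n 's/.*(HADR_PEER_WINDOW) *= *\([0-9]*\).*/\1/p' | head -n 1)

    # Only SYNC and NEARSYNC are accepted for this procedure.
    case "$syncmode" in
        SYNC|NEARSYNC) ;;
        *) echo "HADR_SYNCMODE must be SYNC or NEARSYNC (found: ${syncmode:-none})"
           return 1 ;;
    esac

    # 60 seconds is the minimum; 120 seconds is the recommended value.
    if [ -z "$window" ] || [ "$window" -lt 60 ]; then
        echo "HADR_PEER_WINDOW must be at least 60 seconds (found: ${window:-none})"
        return 1
    fi
    if [ "$window" -lt 120 ]; then
        echo "OK, but HADR_PEER_WINDOW=$window is below the recommended 120 seconds"
    else
        echo "OK"
    fi
}
```

As the instance owner, the helper could be used as: db2 get db cfg for <database_name> | check_hadr_cfg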

About this task

The following placeholders are used in the command statements throughout this procedure:
  • <exportedFile> is the name of the file to which the instance owner backs up their existing SA MP cluster configuration.
  • <domain name> is the name of the Pacemaker cluster domain.
  • <hostname1> and <hostname2> are the host names of the primary and standby hosts in the cluster.
  • <network_interface_name> is the name of the public network interface (device) on each host.
  • <database_name> is the name of the Db2 database on the HADR database resource.
  • <instance_name> is the name of the Db2 instance on the cluster.

For this procedure, you must run all of the steps as the root user on one of the hosts. There is no need to run them on both hosts.

Procedure

  1. Create the Pacemaker cluster and the public network resources by running the following command.
    ./sqllib/bin/db2cm -create -cluster -domain <domain name> 
    -host <hostname1> -publicEthernet <network_interface_name>
    -host <hostname2> -publicEthernet <network_interface_name>
    Note: You need only to run this step once.
  2. Create the instance cluster domain by running the following commands:
    ./sqllib/bin/db2cm -create -instance <instance_name> -host <hostname1>
    
    ./sqllib/bin/db2cm -create -instance <instance_name> -host <hostname2>
  3. Verify the cluster by running the crm status command.
    Note:
    • The Online parameter needs to include both hosts.
    • db2_<hostname>_<network_interface_name> is the public Ethernet resource in the cluster domain.
  4. Create a new database and configure HADR on the new database. For more information on configuring HADR, refer to Initializing high availability disaster recovery (HADR). If your database already exists and HADR is configured, proceed to step 5.
  5. Create the HADR database resources.
    ./sqllib/bin/db2cm -create -db <database_name> -instance <instance_name>
  6. Optional: Create the VIP resources for the newly created database.
    ./sqllib/bin/db2cm -create -primaryVIP <IP_address> -db <database_name> -instance <instance_name>
  7. Verify the cluster again using crm status.
    Note: Ensure that the output contains the following information:
    • The Online parameter includes the host names of both hosts.
    • The db2_<hostname>_<network_interface_name> public Ethernet resource exists in the cluster domain. Ensure that one resource exists on each host, and that both resources are labeled with the Started state.
    • A db2_<hostname>_<instance_name>_0 instance resource exists for every Db2 instance.
    • The database resource has Master and Slaves started on the respective hosts.
  8. Verify that the associated constraints have been created by running the crm config show command.
    These constraints ensure that the following operations occur:
    • The Virtual IP that is associated with the database is started on the same host as the primary database.
    • The Pacemaker cluster manager is configured to start the Db2 database after the instance is up and the public network is available on the host. The Pacemaker cluster manager then starts the VIP on the host where the primary database resides.
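
Steps 1, 2, and 5 can be strung together in a small script once the HADR database from step 4 is in place. The following is a sketch only: setup_hadr_cluster, run, DB2CM, and DRY_RUN are hypothetical names introduced for illustration, and the db2cm options are exactly the ones shown in the procedure. By default the script prints the commands instead of running them, so that they can be reviewed first.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the db2cm calls from steps 1, 2, and 5.
# DB2CM and DRY_RUN are illustrative variables, not Db2 settings.
DB2CM="${DB2CM:-./sqllib/bin/db2cm}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, anything else = execute

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: domain, network interface, instance, database, host1, host2
setup_hadr_cluster() {
    local domain=$1 nic=$2 instance=$3 db=$4 host1=$5 host2=$6

    # Step 1: cluster and public network resources (run once).
    run "$DB2CM" -create -cluster -domain "$domain" \
        -host "$host1" -publicEthernet "$nic" \
        -host "$host2" -publicEthernet "$nic" || return 1

    # Step 2: one instance resource per host.
    run "$DB2CM" -create -instance "$instance" -host "$host1" || return 1
    run "$DB2CM" -create -instance "$instance" -host "$host2" || return 1

    # Step 5: HADR database resource (HADR must already be configured).
    run "$DB2CM" -create -db "$db" -instance "$instance" || return 1
}
```

For example, setup_hadr_cluster hadom eth0 db2inst1 SAMPLE ip-172-31-15-79 ip-172-31-10-145 prints the four commands. Setting DRY_RUN=0 as the root user would execute them; the verification in steps 3 and 7 is still done by hand with crm status.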

Examples

The following example shows the command syntax and output for creating the Pacemaker cluster and the public network resources, where ip-172-31-15-79 and ip-172-31-10-145 are the host names of the nodes in your cluster, hadom is the domain name and eth0 is the network interface (device) name of each host (see step 1):
[root@ip-172-31-15-79 Db2]# /home/db2inst1/sqllib/bin/db2cm -create -cluster -domain hadom -host ip-172-31-15-79 -publicEthernet eth0 -host ip-172-31-10-145 -publicEthernet eth0
Created db2_ip-172-31-15-79_eth0 resource.
Created db2_ip-172-31-10-145_eth0 resource.
Cluster created successfully.
The following example shows the command syntax and output for creating the instance cluster domain for the Db2 instance db2inst1 (see step 2):
[root@ip-172-31-15-79 Db2]# /home/db2inst1/sqllib/bin/db2cm -create -instance db2inst1 -host ip-172-31-15-79

INSTANCE-HOME/sqllib/bin/db2cm -create -instance db2inst1 -host ip-172-31-10-145
Sample output:
[root@ip-172-31-15-79 ~]# /home/db2inst1/sqllib/bin/db2cm -create -instance db2inst1 -host ip-172-31-15-79
Created db2_ip-172-31-15-79_db2inst1_0 resource.
Instance resource for db2inst1 on ip-172-31-15-79 created successfully.

[root@ip-172-31-15-79 ~]# /home/db2inst1/sqllib/bin/db2cm -create -instance db2inst1 -host ip-172-31-10-145
Created db2_ip-172-31-10-145_db2inst1_0 resource.
Instance resource for db2inst1 on ip-172-31-10-145 created successfully.
The following example shows sample output from running crm status to verify the cluster before creating the Db2 databases, HADR resources, and VIP resources (see step 3):
[root@ip-172-31-10-145 Db2agents]# crm status
Stack: corosync
Current DC: ip-172-31-10-145 (version 2.0.2-1.el8-744a30d655) - partition with quorum
Last updated: Tue Dec 24 21:49:57 2019
Last change: Tue Dec 24 21:39:45 2019 by root via cibadmin on ip-172-31-15-79
 
2 nodes configured
4 resources configured

Online: [ ip-172-31-10-145 ip-172-31-15-79 ]

Full list of resources:

db2_ip-172-31-15-79_eth0       (ocf::heartbeat:db2ethmon):     Started ip-172-31-15-79
db2_ip-172-31-10-145_eth0      (ocf::heartbeat:db2ethmon):     Started ip-172-31-10-145
db2_ip-172-31-15-79_db2inst1_0        (ocf::heartbeat:db2inst):       Started ip-172-31-15-79
db2_ip-172-31-10-145_db2inst1_0        (ocf::heartbeat:db2inst):       Started ip-172-31-10-145
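The check on the Online line in this output can be scripted. The following is a minimal sketch under the assumption that crm status prints the line in the "Online: [ host1 host2 ]" form shown above; hosts_online is a hypothetical helper name, not a Pacemaker or Db2 command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "crm status" text on stdin and succeeds only if
# every host name passed as an argument appears on the "Online:" line.
hosts_online() {
    local online host
    # Extract the text between the square brackets of the Online line.
    online=$(sed -n 's/^Online: *\[\(.*\)].*/\1/p' | head -n 1)
    for host in "$@"; do
        # Pad with spaces so that only whole host names match.
        case " $online " in
            *" $host "*) ;;
            *) echo "Host not online: $host"; return 1 ;;
        esac
    done
    echo "All hosts online"
}
```

A possible invocation is: crm status | hosts_online ip-172-31-15-79 ip-172-31-10-145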
The following example shows the command syntax for creating the HADR resources on a database named SAMPLE on the Db2 instance db2inst1 (see step 5):
[root@ip-172-31-15-79 db2inst1]# /home/db2inst1/sqllib/bin/db2cm -create -db SAMPLE -instance db2inst1
Database resource for SAMPLE created successfully
The following example shows the command syntax for creating the VIP resource on a database named SAMPLE on the Db2 instance db2inst1 (see step 6):
INSTANCE-HOME/sqllib/bin/db2cm -create -primaryVIP <IP_address> -db SAMPLE -instance db2inst1
The following example shows the output from running crm status to verify the cluster after creating the Db2 databases, HADR resources, and VIP resources (see step 7):
[root@ip-172-31-10-145 db2inst1]# crm status
Stack: corosync
Current DC: ip-172-31-10-145 (version 2.0.2-1.el8-744a30d655) - partition with quorum
Last updated: Tue Dec 24 23:09:53 2019
Last change: Tue Dec 24 23:04:19 2019 by root via cibadmin on ip-172-31-10-145

2 nodes configured
7 resources configured
    
Online: [ ip-172-31-10-145 ip-172-31-15-79 ]

Full list of resources:

db2_ip-172-31-15-79_eth0       (ocf::heartbeat:db2ethmon):     Started ip-172-31-15-79
db2_ip-172-31-10-145_eth0      (ocf::heartbeat:db2ethmon):     Started ip-172-31-10-145
db2_ip-172-31-15-79_db2inst1_0        (ocf::heartbeat:db2inst):       Started ip-172-31-15-79
db2_ip-172-31-10-145_db2inst1_0        (ocf::heartbeat:db2inst):       Started ip-172-31-10-145
Clone Set: db2_db2inst1_db2inst1_SAMPLE-clone [db2_db2inst1_db2inst1_SAMPLE] (promotable)
  Masters: [ ip-172-31-10-145 ]
   Slaves: [ ip-172-31-15-79 ]
db2_db2inst1_db2inst1_SAMPLE-primary-VIP       (ocf::heartbeat:IPaddr2):       Started ip-172-31-10-145
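The role and VIP placement in this output can also be checked with a short script. The sketch below assumes the "Masters:" and "Slaves:" labels and the "Started <host>" column printed by the Pacemaker version in this example; other versions might label the roles differently. hadr_roles is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "crm status" text on stdin, reports which host
# runs the primary (Masters) and standby (Slaves) database, and fails if a
# primary VIP resource is started on a host other than the primary.
hadr_roles() {
    local status primary standby vip_host
    status=$(cat)
    primary=$(printf '%s\n' "$status" |
        sed -n 's/^ *Masters: *\[ *\([^] ]*\).*/\1/p' | head -n 1)
    standby=$(printf '%s\n' "$status" |
        sed -n 's/^ *Slaves: *\[ *\([^] ]*\).*/\1/p' | head -n 1)
    # The VIP is optional (step 6), so vip_host may be empty.
    vip_host=$(printf '%s\n' "$status" |
        sed -n 's/.*-primary-VIP.*Started *\([^ ]*\).*/\1/p' | head -n 1)

    if [ -z "$primary" ] || [ -z "$standby" ]; then
        echo "HADR roles incomplete"
        return 1
    fi
    if [ -n "$vip_host" ] && [ "$vip_host" != "$primary" ]; then
        echo "Primary VIP is on $vip_host, but the primary database is on $primary"
        return 1
    fi
    echo "primary=$primary standby=$standby"
}
```

A possible invocation is: crm status | hadr_roles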
The following example shows the output from running crm config show to verify that the associated constraints for your Pacemaker cluster are created (see step 8):
[root@ip-172-31-10-145 ~]# crm config show
node 1: ip-172-31-15-79
node 2: ip-172-31-10-145
primitive db2_db2inst1_db2inst1_SAMPLE db2hadr \
        params instance="db2inst1,db2inst1" dbname=SAMPLE \
        op demote interval=0s timeout=120s \
        op monitor interval=20s timeout=60s \
        op monitor interval=22s role=Master timeout=60s \
        op monitor interval=24s role=Slave timeout=60s \
        op promote interval=0s timeout=120s \
        op start interval=0s timeout=120s \
        op stop interval=0s timeout=120s \
        meta resource-stickiness=5000 migration-threshold=0
primitive db2_db2inst1_db2inst1_SAMPLE-primary-VIP IPaddr2 \
        params ip=9.28.232.70 cidr_netmask=21 \
        op monitor interval=30s \
        op start interval=0s timeout=20s \
        op stop interval=0s timeout=20s \
        meta is-managed=true
primitive db2_ip-172-31-15-79_db2inst1_0 db2inst \
        params instance=db2inst1 hostname=ip-172-31-15-79 \
        op monitor timeout=120s interval=10s on-fail=restart \
        op start interval=0s timeout=120s \
        op stop interval=0s timeout=120s \
        meta migration-threshold=0 is-managed=true
primitive db2_ip-172-31-15-79_eth0 db2ethmon \
        params interface=eth0 hostname=ip-172-31-15-79 repeat_count=4 repeat_interval=4 \
        op monitor timeout=30s interval=4 \
        op start timeout=60s interval=0s \
        op stop interval=0s timeout=20s \
        meta is-managed=true
primitive db2_ip-172-31-10-145_db2inst1_0 db2inst \
        params instance=db2inst1 hostname=ip-172-31-10-145 \
        op monitor timeout=120s interval=10s on-fail=restart \
        op start interval=0s timeout=120s \
        op stop interval=0s timeout=120s \
        meta migration-threshold=0 is-managed=true
primitive db2_ip-172-31-10-145_eth0 db2ethmon \
        params interface=eth0 hostname=ip-172-31-10-145 repeat_count=4 repeat_interval=4 \
        op monitor timeout=30s interval=4 \
        op start timeout=60s interval=0s \
        op stop interval=0s timeout=20s \
        meta is-managed=true
ms db2_db2inst1_db2inst1_SAMPLE-clone db2_db2inst1_db2inst1_SAMPLE \
        meta resource-stickiness=5000 migration-threshold=0 ordered=true promotable=true is-managed=true
colocation db2_db2inst1_db2inst1_SAMPLE-primary-VIP-colocation inf: db2_db2inst1_db2inst1_SAMPLE-primary-VIP:Started db2_db2inst1_db2inst1_SAMPLE-clone:Master
location loc-rule-db2_db2inst1_db2inst1_SAMPLE-eth0-ip-172-31-15-79 db2_db2inst1_db2inst1_SAMPLE-clone \
        rule -inf: db2ethmon-eth0 eq 0
location loc-rule-db2_db2inst1_db2inst1_SAMPLE-eth0-ip-172-31-10-145 db2_db2inst1_db2inst1_SAMPLE-clone \
        rule -inf: db2ethmon-eth0 eq 0
location loc-rule-db2_db2inst1_db2inst1_SAMPLE-node-ip-172-31-15-79 db2_db2inst1_db2inst1_SAMPLE-clone \
        rule -inf: db2inst-db2inst1 eq 0
location loc-rule-db2_db2inst1_db2inst1_SAMPLE-node-ip-172-31-10-145 db2_db2inst1_db2inst1_SAMPLE-clone \
        rule -inf: db2inst-db2inst1 eq 0
order order-rule-db2_db2inst1_db2inst1_SAMPLE-then-primary-VIP Mandatory: db2_db2inst1_db2inst1_SAMPLE-clone:start db2_db2inst1_db2inst1_SAMPLE-primary-VIP:start
location prefer-db2_ip-172-31-15-79_db2inst1_0 db2_ip-172-31-15-79_db2inst1_0 100: ip-172-31-15-79
location prefer-db2_ip-172-31-15-79_eth0 db2_ip-172-31-15-79_eth0 100: ip-172-31-15-79
location prefer-db2_ip-172-31-10-145_db2inst1_0 db2_ip-172-31-10-145_db2inst1_0 100: ip-172-31-10-145
location prefer-db2_ip-172-31-10-145_eth0 db2_ip-172-31-10-145_eth0 100: ip-172-31-10-145
location prefer-db2inst1-ip-172-31-15-79-SAMPLE-primary-VIP db2_db2inst1_db2inst1_SAMPLE-primary-VIP 100: ip-172-31-15-79
location prefer-db2inst1-ip-172-31-10-145-SAMPLE-primary-VIP db2_db2inst1_db2inst1_SAMPLE-primary-VIP 100: ip-172-31-10-145
location prefer-ip-172-31-15-79-db2inst1-db2_db2inst1_db2inst1_SAMPLE-clone db2_db2inst1_db2inst1_SAMPLE-clone 100: ip-172-31-15-79
location prefer-ip-172-31-10-145-db2inst1-db2_db2inst1_db2inst1_SAMPLE-clone db2_db2inst1_db2inst1_SAMPLE-clone 100: ip-172-31-10-145
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=2.0.2-1.el8-744a30d655 \
        cluster-infrastructure=corosync \
        cluster-name=hadom \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        stop-all-resources=false \
        cluster-recheck-interval=60 \
        symmetric-cluster=false \
        last-lrm-refresh=1583509412
rsc_defaults rsc-options: \
        failure-timeout=60
rsc_defaults rsc_defaults-options: \
        is-managed=false
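
If the optional VIP from step 6 was created, the presence of its colocation and order constraints in this output can be confirmed with a simple text search. The following is a sketch that assumes the constraint lines start with the keywords colocation and order and contain "-primary-VIP", as in the example above; constraints_present is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "crm config show" text on stdin and checks that
# a colocation and an order constraint exist for the primary VIP resource.
constraints_present() {
    local cfg kind missing=0
    cfg=$(cat)
    for kind in colocation order; do
        if ! printf '%s\n' "$cfg" | grep -q "^$kind .*-primary-VIP"; then
            echo "Missing $kind constraint for the primary VIP"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ] && echo "VIP constraints found"
    return "$missing"
}
```

A possible invocation is: crm config show | constraints_present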