Creating a Mutual Failover Db2 instance on a Pacemaker-managed Linux cluster

Db2 11.5.8 and later provides a Mutual Failover high availability option for Db2 when deployed on a Pacemaker-managed Linux cluster. You create the cluster using the Db2 cluster manager (db2cm) utility.

Before you begin

Important: In Db2® 11.5.8 and later, Mutual Failover high availability is supported when using Pacemaker as the integrated cluster manager. In Db2 11.5.6 and later, the Pacemaker cluster manager for automated failover to HADR standby databases is packaged and installed with Db2. In Db2 11.5.5, Pacemaker is included and available for production environments. In Db2 11.5.4, Pacemaker is included as a technology preview only, for development, test, and proof-of-concept environments.

About this task

Db2 Mutual Failover is a highly available deployment where two hosts have an identical Db2 instance installation and a shared mount, with only one host active at one time. With this solution, the setup, configuration, and management of the deployment is handled with the db2cm utility, and the automation of the solution is monitored and controlled by the Pacemaker cluster manager.

The following placeholders are used in the command statements throughout this procedure. They represent values that you replace with ones appropriate to your organization:
  • <host1> and <host2> are the host names for the Mutual Failover hosts in the Db2 Linux cluster.
  • <host3> is the host name of the quorum device, which is used to make cluster management decisions when Pacemaker cannot.
  • <partition_name> is the partition on the host that holds the file system for the Db2 cluster.
  • <partition_number> is a unique number that identifies the database partition server in the Db2 Pacemaker cluster. For more information, see dbpartitionnum.
  • <network_interface_name> is the name of the device on the cluster.
  • <db2_fs_dir> is the mount point for the file system that is used on the cluster.
  • <db2inst1_dir> is the user directory created for the db2inst1 user.
  • <db2c_db2inst1> is the port name of your Db2 instance (see the example /etc/services entry after this list).
  • <db2_home_dir> is the directory for your Db2 instance.
  • <file_system> is the file system type chosen for your Db2 Linux cluster. For example, ext3 or ext4.
  • <database_name> is the name of the Db2 database.
  • <instance_name> is the name of the Db2 instance on the cluster.
  • <db2iadm1> is the administrators group for the Db2 instance.
  • <db2fadm1> is the fenced administrators group for the Db2 instance.
  • <db2inst1> is the name of the Db2 instance user.
  • <db2fenc1> is the name of the Db2 instance fenced user.
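For example, the <db2c_db2inst1> port name typically maps to an entry in the /etc/services file on both hosts. The following entry is only an illustrative sketch; the service name and port number shown are assumptions and must match the port name that you pass to db2icrt with the -p option in steps 1.l and 2.h:
  db2c_db2inst1    50000/tcp    # hypothetical Db2 instance connection port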
Attention: It is important that you follow the steps in this procedure correctly and verify your results. Specifically, ensure that you configure a mount mask correctly, as described in steps 1.h and 2.g, and set up SSH correctly. Failure to do so can result in data corruption.

Procedure

  1. Configure the first host:
    1. As the root user, create a directory on <host1> for the file system that is to be shared by the two Db2 Mutual Failover hosts (<host1> and <host2>).
      [root@<host1> ~]# mkdir /<db2_fs_dir> 
    2. Format a partition on the target device with the appropriate file system for the cluster:
      [root@<host1> ~]# mkfs.<file_system> /dev/<partition_name> 
      Note: For more information, see Recommended file systems.
    3. Get the Universal Unique Identifier (UUID) of the mount device that you created:
      [root@<host1> ~]# blkid
    4. Create a new entry in the /etc/fstab file on both hosts, using the UUID that you obtained in step 1.c:
      UUID=31a26d02-391c-4229-8c70-1d4dc75c0b70 /<db2_fs_dir> <file_system> acl,user_xattr,noauto 0 0 
    5. When the file system is listed in the fstab file, mount the new file system on this host, using the folder that you created in step 1.a as a mount point:
      [root@<host1> ~]# mount /<db2_fs_dir> 
    6. Optional: For any existing databases, add and associate database mount resources with an existing Pacemaker cluster:
      db2cm -add -dbMount <database_name> -partition 0 -instance <instance_name>
      Running this command creates database mount resources for the database directories if they do not already exist.
    7. Verify that the file system is listed in mounts:
      [root@<host1> ~]# mount | grep /<db2_fs_dir>
    8. Add a mask for the mount point:
      systemctl mask $(systemd-escape -p --suffix=mount "<db2_fs_dir>")
      This step prevents the file system from being mounted on multiple systems during a remote shell login to the instance user. Mounting on multiple systems can corrupt your data.
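      Optionally, you can confirm that the mount unit is masked. The following check is a sketch that assumes the same mount point placeholder used above; the expected output is masked:
      [root@<host1> ~]# systemctl is-enabled $(systemd-escape -p --suffix=mount "<db2_fs_dir>")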
    9. If they are not already configured, create administration groups, an instance user, and a fenced user on both hosts in your cluster:
      groupadd -g 990 db2iadm1
      groupadd -g 989 db2fadm1
      useradd -u 1002 -g db2iadm1 -m -d /home/<db2inst1> -s /usr/bin/ksh <db2inst1>
      useradd -u 1003 -g db2fadm1 -m -d /home/<db2fenc1> -s /usr/bin/ksh <db2fenc1>
      
    10. Install Db2 on both hosts in your cluster:
      ./db2_install -b /opt/ibm/db2/V115M8A -p SERVER -y
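      Optionally, you can confirm the installed Db2 level and installation path on each host with the db2ls command. This check is a sketch that assumes the installation path shown above:
      [root@<host1> ~]# /opt/ibm/db2/V115M8A/install/db2ls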
    11. Create an empty directory for the db2inst1 user and copy the existing .profile and .kshrc files over to the folder. If a different shell is used, migrate those shell initialization scripts, as needed:
      mkdir /<db2_fs_dir>/<db2inst1_dir>
      chown -R <db2inst1>.<db2iadm1> /<db2_fs_dir>
      usermod -d /<db2_fs_dir>/<db2inst1_dir> <db2inst1>
      cp -f /home/<db2inst1>/.profile /<db2_fs_dir>/<db2inst1_dir>
      cp -f /home/<db2inst1>/.kshrc /<db2_fs_dir>/<db2inst1_dir>
    12. Create your Db2 instance:
      [root@<host1> <db2_home_dir>]# /opt/ibm/db2/V115M8A/instance/db2icrt -p <db2c_db2inst1> -u <db2fenc1> <db2inst1>
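      Optionally, verify that the instance exists on this host by listing the instances. This check is a sketch that assumes the installation path shown above; the output should include <db2inst1>:
      [root@<host1> ~]# /opt/ibm/db2/V115M8A/instance/db2ilist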
    13. Configure passwordless SSH by adding the Db2 instance user's public SSH key to the authorized_keys file on <host1> (the shared file system is unmounted from this host in step 2.a):
      [root@<host1> ~]# cat /<db2_fs_dir>/<db2_home_dir>/.ssh/id_rsa.pub >> /<db2_fs_dir>/<db2_home_dir>/.ssh/authorized_keys
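      Note: This step assumes that an SSH key pair already exists for the instance user under /<db2_fs_dir>/<db2_home_dir>/.ssh. If it does not, one way to generate it (a sketch that assumes an RSA key with an empty passphrase) is:
      [root@<host1> ~]# su - <db2inst1> -c "mkdir -p ~/.ssh && chmod 700 ~/.ssh && ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"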
  2. Configure the second host:
    1. Unmount the shared file system from the first host:
      [root@<host1> ~]# umount /<db2_fs_dir>
    2. Verify that this directory is no longer mounted on this host:
      [root@<host1> ~]# mount | grep /<db2_fs_dir>
    3. Create the same mount point directory on the second host and then mount the shared file system to that directory:
      [root@<host2> ~]# mkdir /<db2_fs_dir>
      [root@<host2> ~]# mount /<db2_fs_dir>
    4. Confirm that the users are the same on both hosts:
      [root@<host2> ~]# ls -al /<db2_fs_dir>
      Note: If the list results show that the user and group ownership of the files and directories in the shared file system are different on the second host, then the UID and GID of the instance user and db2iadm1 group are not the same on both hosts. These values must be made the same on both hosts before proceeding.
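      One way to compare the IDs (a sketch that is not part of the original command sequence) is to run the following on both hosts and confirm that the UID and GID values match:
      [root@<host2> ~]# id <db2inst1>
      [root@<host2> ~]# getent group <db2iadm1>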
    5. With the shared file system mounted from step 2.c, add the Db2 instance user's SSH key to the authorized_keys file.
      Note: Because the home directory is shared, this key is identical to the one added from the first host. You must edit one of the entries in the authorized_keys file so that the host names reflect both hosts. For example, the two identical entries from the first host,
      ssh-rsa [key1]= db2inst1@<host1> 
      ssh-rsa [key1]= db2inst1@<host1>
      need to be changed so that the second entry indicates the second host:
      ssh-rsa [key1]= db2inst1@<host1> 
      ssh-rsa [key1]= db2inst1@<host2>
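      Optionally, verify that passwordless SSH works for the instance user on the host where the shared file system is currently mounted. This check is a sketch; you might be prompted to accept the host key on the first connection:
      [root@<host2> ~]# su - <db2inst1> -c "ssh <host2> hostname"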
    6. Configure the home directory on the second host and then remove the sqllib directory:
      chown -R <db2inst1>.<db2iadm1> /<db2_fs_dir>
      usermod -d /<db2_fs_dir>/<db2_home_dir> <db2inst1>
      rm -rf /<db2_fs_dir>/<db2_home_dir>/sqllib
    7. Add a mask for the mount point:
      systemctl mask $(systemd-escape -p --suffix=mount "<db2_fs_dir>") 
      This step prevents the file system from being mounted on multiple systems during a remote shell login to the instance user. Mounting on multiple systems can corrupt your data.
    8. Create the Db2 instance on the second host. This step is needed because some instance files, such as the Db2 registry, are created on the local disk:
      [root@<host2> ~]# /opt/ibm/db2/V115M8A/instance/db2icrt -p <db2c_db2inst1> -u <db2fenc1> <db2inst1>
  3. Configure Mutual Failover with db2cm:
    1. On the first host, go to /<db2_fs_dir>/<db2_home_dir>/sqllib/bin/ and locate the db2cm script.
    2. Run db2cm to create your Db2 cluster:
      root@<host1>$ ./db2cm -create -cluster -domain <domain_name> -host <host1> -publicEthernet <network_interface_name> \
      -host <host2> -publicEthernet <network_interface_name>
      Two cluster resources are created, using the naming convention db2_<hostname>_<network_interface_name>.
    3. Create a partition resource on the first host:
      root@<host1>$ ./db2cm -create -partition <partition_number> -instance <instance_name>
    4. Create a VIP resource on the first host:
      root@<host1>$ ./db2cm -create -primaryVIP <virtual IP address> -partition <partition_number> -instance <instance_name>
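      Optionally, confirm that the virtual IP address is now configured on the public interface of the active host. This check is a sketch that assumes the same interface placeholder used above; the output should include the <virtual IP address> value:
      root@<host1>$ ip -4 addr show <network_interface_name>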
    5. Create a Quorum Device on the first host:
      root@<host1>$ ./db2cm -create -qdevice <host3>
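      Optionally, check connectivity to the quorum device from a cluster node with the corosync-qdevice-tool utility. This check is a sketch and is not part of the db2cm syntax; the State field in the output should show Connected:
      root@<host1>$ corosync-qdevice-tool -s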
    6. Verify the configuration of your Mutual Failover, Pacemaker-managed Db2 cluster:
      root@<host1>$ ./db2cm -list
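      In addition to db2cm -list, you can view the underlying Pacemaker resource state directly with the standard crm_mon monitor. This is a sketch; the exact output depends on the Pacemaker level in use:
      root@<host1>$ crm_mon -1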

Examples

The following example shows the command syntax and output from creating a file system for your Pacemaker-managed, Mutual Failover Linux cluster (see Step 1.b):
[root@haap1 ~]# mkfs.ext3 /dev/sda1
mke2fs 1.42.9 (28-Dec-2013)
/dev/sda1 is entire device, not just one partition!
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=1024 blocks
3932160 inodes, 15728640 blocks
786432 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
480 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424
 
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
The following example shows the command syntax and output from retrieving the UUID of partition sda1 on host haap1 (see Step 1.c):
[root@haap1 ~]# blkid
...
/dev/sda1: UUID="31a26d02-391c-4229-8c70-1d4dc75c0b70" BLOCK_SIZE="4096" TYPE="ext3"
The following example shows the command syntax and output from creating a new entry in the /etc/fstab file on both hosts (see Step 1.d):
UUID=31a26d02-391c-4229-8c70-1d4dc75c0b70    /db2haap    ext3    acl,user_xattr,noauto    0 0
The following example shows the commands for creating administration groups and users on both Mutual Failover hosts (see Step 1.i):
groupadd -g 990 db2iadm1
groupadd -g 989 db2fadm1
useradd -u 1002 -g db2iadm1 -m -d /home/db2inst1 -s /usr/bin/ksh db2inst1
useradd -u 1003 -g db2fadm1 -m -d /home/db2fenc1 -s /usr/bin/ksh db2fenc1
The following example shows the commands for creating an empty directory on the first host for the Db2 instance user (db2inst1), and copying the existing .profile and .kshrc files over to the folder (see Step 1.k).
mkdir /db2haap/db2home
chown -R db2inst1.db2iadm1 /db2haap
usermod -d /db2haap/db2home db2inst1
cp -f /home/db2inst1/.profile /db2haap/db2home
cp -f /home/db2inst1/.kshrc /db2haap/db2home
The following example shows the command syntax and output from listing the contents of the file system directory (see Step 2.d):
[root@haap2 ~]# ls -al /db2haap
total 24
drwxr-xr-x   4 db2inst1 db2iadm1  4096 Sep 10 22:40 .
dr-xr-xr-x. 19 root     root       276 Sep 10 22:55 ..
drwxr-xr-x   6 db2inst1 db2iadm1  4096 Sep 10 22:45 db2home
drwx------   2 db2inst1 db2iadm1 16384 Sep 10 22:26 lost+found
[root@haap2 ~]# ls -al /db2haap/db2home/
total 36
drwxr-xr-x  6 db2inst1 db2iadm1 4096 Sep 10 22:45 .
drwxr-xr-x  4 db2inst1 db2iadm1 4096 Sep 10 22:40 ..
drwxr-xr-x  3 db2inst1 db2iadm1 4096 Sep 10 22:44 .config
drwxr-xr-x  3 db2inst1 db2iadm1 4096 Sep 10 22:44 .ibm
-rw-r--r--  1 root     root      172 Sep 10 22:43 .kshrc
drwxr-xr-x  3 db2inst1 db2iadm1 4096 Sep 10 22:44 .local
-rw-r--r--  1 root     root      173 Sep 10 22:44 .profile
-rw-------  1 db2inst1 db2iadm1   38 Sep 10 22:46 .sh_history
drwxrwsr-t 24 db2inst1 db2iadm1 4096 Sep 10 22:45 sqllib
[root@haap2 ~]# cat /db2haap/db2home/.ssh/id_rsa.pub >> /db2haap/db2home/.ssh/authorized_keys
The following example shows the output from running the db2cm -list command to verify the creation of a Mutual Failover cluster (see Step 3.f):
$ ./db2cm -list
      Cluster Status
 
Domain information:
Domain name               = db2ha
Pacemaker version         = 2.1.2-4.db2pcmk.el8
Corosync version          = 3.1.6
Current domain leader     = haap1
Number of nodes           = 2
Number of resources       = 5
 
Node information:
Name                State
----------------    --------
haap1              Online
haap2              Online
 
Resource Information:
 
Resource Name             = db2_haap1_eth0
  State                         = Online
  Managed                       = true
  Resource Type                 = Network Interface
    Node                        = haap1
    Interface Name              = eth0
 
Resource Name             = db2_haap2_eth0
  State                         = Online
  Managed                       = true
  Resource Type                 = Network Interface
    Node                        = haap2
    Interface Name              = eth0
 
Resource Name             = db2_db2inst1_0
  State                         = Online
  Managed                       = true
  Resource Type                 = Partition
  Instance                      = db2inst1
  Partition                     = 0
  Current Host                  = haap1
 
Resource Name             = db2_db2inst1_0-VIP
  State                         = Online
  Managed                       = true
  Resource Type                 = IP
    Node                        = haap1
    Ip Address                  = 10.11.82.38
  Location                      = haap1
 
Resource Name             = db2_db2inst1_0-mnt_db2haap
  State                         = Online
  Managed                       = true
  Resource Type                 = File System
  Device                        = "/dev/sdb"
  Mount Point                   = "/db2haap"
  File System Type              = ext3
  Current Host                  = haap1

Fencing Information:
  Configured
Quorum Information:
  Qdevice
 
Qdevice information
-------------------
Model:			Net
Node ID:		1
Configured node list:
    0	Node ID = 1
    1	Node ID = 2
Membership node list:	1, 2
 
Qdevice-net information
----------------------
Cluster name:		db2ha
QNetd host:		haap3:5403
Algorithm:		LMS
Tie-breaker:		Node with lowest node ID
State:			Connected