How To
Summary
This document provides an alternate Db2 HADR configuration with Pacemaker without using a third lightweight host as a quorum device arbitrator.
Objective
The objective of this document is to detail an alternative to the two-node HADR + quorum device best practice Pacemaker solution detailed in the IBM Documentation here:
Quorum devices support on Pacemaker - IBM Documentation
On Google Cloud, you do not necessarily need to configure a quorum device on a third host. Instead, you can configure fencing as described in this document.
The advantage of configuring a two-node HADR Pacemaker cluster with fencing is that it removes the requirement of a third host for the quorum device, thus reducing ongoing cost. The disadvantage is a longer recovery time from a primary host failure, due to the added time it takes to successfully fence the failed host from the cluster. Based on internal tests, it can take up to 6 times longer to recover from a primary host failure with fencing compared to using a quorum device host. To compensate for this effect, the HADR_PEER_WINDOW value of all databases must be set to at least 300 seconds.
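For example, assuming the database name GP1 used later in this document, the peer window could be set on both the primary and standby databases with a command similar to the following, issued as the Db2 instance owner:
db2 update db cfg for GP1 using HADR_PEER_WINDOW 300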
The choice of configuration must be based on your specific business requirements by taking recovery time and cost of implementation into account.
Fencing on Google Cloud is done with the fence_gce agent.
This support is available from Db2 11.5.9.0 running on Red Hat Enterprise Linux Version 8.6 or higher, or SUSE Linux Enterprise Server Release 15.4 or higher. The only supported hardware architecture is x86. The fencing agent for Google Cloud must be the one provided through the IBM Marketing Registration Services website, not other versions available elsewhere.
Environment
Refer to the following IBM Documentation for the list of platforms supported by Pacemaker; the same restrictions apply here:
Restrictions on Pacemaker - IBM Documentation
Refer to the “Configuring a clustered environment using the Db2 cluster manager (db2cm) utility” page of the IBM Documentation to deploy the automated HADR solution as usual:
Creating an HADR Db2 instance on a Pacemaker-managed Linux cluster - IBM Documentation
A prerequisite is the Google Cloud guest environment installed on all nodes in the cluster. The guest environment is deployed and set up automatically with each Google-provided public image. If you are using a custom image, ensure that the guest environment is set up according to this Google documentation: Guest environment | Compute Engine Documentation | Google Cloud
In addition, you need to launch Google Cloud Shell and authorize the gcloud utility. Details are described here:
Launch Cloud Shell | Google Cloud
gcloud | Google Cloud CLI Documentation
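If you run gcloud from Cloud Shell or directly on a cluster node, make sure the utility is authorized and points to the correct project before continuing. A minimal sketch, using the example project name db2pcmk2023 used throughout this document:
gcloud auth login
gcloud config set project db2pcmk2023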
To set up the Pacemaker cluster with fencing, it is beneficial to follow a consistent naming scheme for the different components. In our case, the names of the entities are derived from the cluster name and the Db2 instance in the cluster. In this document, we use placeholders for the entities that you need to replace with the values from your environment.
The fencing agent interacts with the Google Cloud infrastructure to start, stop, or reboot the virtual machines through a service account. It is best practice to create a dedicated service account and a custom role with the minimal set of privileges, and to assign this role to the service account. The fencing agent uses this service account to interact with the Google Cloud backend, and authenticates with an access key that is stored on each virtual machine in the cluster.
If you need to deviate from this setup with capabilities like key rotation or a centralized keystore, familiarize yourself with Google Cloud Identity and Access Management. You can use the following links as a starting point:
Identity and Access Management | IAM | Google Cloud
Identities for workloads | IAM Documentation | Google Cloud
Create and delete service account keys | IAM Documentation | Google Cloud
Best practices for managing service account keys | IAM Documentation | Google Cloud
Steps
1. Download and install the fencing agent
Navigate to this website: https://www-01.ibm.com/marketing/iwm/platform/mrs/assets?source=mrs-db2pcmk&_ga=2.31425788.296289340.1604345966-344484498.1579133947
Download the latest version of the Google Cloud fencing agent, for example Db2_RHEL8_GCE_fence-agents-4.12.1.tar.gz, from the IBM Marketing Registration Services website.
Unpack the archive by using the following commands:
gunzip Db2_RHEL8_GCE_fence-agents-4.12.1.tar.gz
tar -xf Db2_RHEL8_GCE_fence-agents-4.12.1.tar
These commands create the directory Db2_RHEL8_GCE_fence-agents-4.12.1.
Install the RPM packages
Switch to the newly created directory and then into the subdirectory that matches your operating system, and issue the following command:
For SLES:
zypper install --allow-unsigned-rpm *.rpm
For RHEL:
dnf install *.rpm
Note: The fencing agents must be installed on both nodes in the cluster.
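To confirm the installation on each node, you can check that the agent binary and the related packages are present, for example (exact package names can differ between the RHEL and SLES builds):
rpm -qa | grep fence-agents
ls -l /usr/sbin/fence_gce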
2. Decide on namespaces and IP addresses
Compile a list of all host names, including virtual host names, and update your DNS servers to enable proper IP address to host-name resolution. If a DNS server doesn't exist or you can't update and create DNS entries, you need to use the local host files of the individual virtual machines. For an introduction to DNS, refer to: Internal DNS | Compute Engine Documentation | Google Cloud
If you're using host file entries, make sure that the entries are applied to all virtual machines in the environment (a sample host file sketch is shown after the table below). Also, compile a list of names for the different entities required, according to the example shown in the following table.
Entity | Name
Google Cloud Project | db2pcmk2023
Google Cloud Region | europe-west1
Db2 Instance | db2gp1
Db2 database | GP1
Hostname/tag of cluster node 0 | pcmkdb01
Hostname/tag of cluster node 1 | pcmkdb02
Pacemaker Cluster Name | GP1cluster
Google Cloud Service Account | db2gp1-service-account
Google Cloud Role Name for Fencing | db2gp1fencer
Google Cloud Service Account email | db2gp1-service-account@db2pcmk2023.iam.gserviceaccount.com
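If you rely on local host files instead of DNS, equivalent entries must exist on every virtual machine in the cluster. A minimal /etc/hosts sketch using the example host names above, with placeholder IP addresses and an assumed internal domain for illustration only:
10.0.0.11   pcmkdb01.c.db2pcmk2023.internal   pcmkdb01
10.0.0.12   pcmkdb02.c.db2pcmk2023.internal   pcmkdb02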
3. Create Service Account
Before setting up the fencing agent, create the required Google Backend Services. This configuration can be done by using the graphical user interface in the Google Cloud console or the Google Cloud Shell. In this document, we use the Google Cloud Shell. For a reference of Google Cloud Shell Commands, refer to: gcloud | Google Cloud CLI Documentation
gcloud iam service-accounts create db2gp1-service-account --display-name="db2gp1-service-account-for-fencing" --project="db2pcmk2023"
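To confirm that the service account was created before continuing, you can describe it, for example:
gcloud iam service-accounts describe db2gp1-service-account@db2pcmk2023.iam.gserviceaccount.com --project=db2pcmk2023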
4. Create a Role with required permissions
In Cloud Shell, create a role and assign the required permissions. The required permissions are:
- compute.instances.get
- compute.instances.list
- compute.instances.reset
- compute.instances.start
- compute.instances.stop
- compute.zoneOperations.get
- logging.logEntries.create
- compute.zoneOperations.list
gcloud iam roles create db2gp1fencer --project=db2pcmk2023 --title=db2gp1fencer --description="Perform Pacemaker fencing actions on db2gp1" --stage=GA --permissions=compute.instances.get,compute.instances.list,compute.instances.reset,compute.instances.start,compute.instances.stop,compute.zoneOperations.get,logging.logEntries.create,compute.zoneOperations.list
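To verify the new role and its permission list, you can display it afterward, for example:
gcloud iam roles describe db2gp1fencer --project=db2pcmk2023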
5. Assign Role to the Service Account.
gcloud projects add-iam-policy-binding db2pcmk2023 --member serviceAccount:db2gp1-service-account@db2pcmk2023.iam.gserviceaccount.com --role=projects/db2pcmk2023/roles/db2gp1fencer --condition=None
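To confirm the binding, you can review the project IAM policy and look for an entry that maps the custom role to the service account email, for example:
gcloud projects get-iam-policy db2pcmk2023 --format=yaml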
Download the Access Key File
The key file is used to authenticate the service account against the Google Cloud backend. The file must be downloaded to both nodes in the cluster and is used as an optional argument when the fencing agent is configured in the Pacemaker cluster. You can run the gcloud command either in the Google Cloud Console or directly on one of the virtual machines in the cluster.
If you use the command in the Google Cloud Console, you can store the key file locally on your workstation and upload it to the virtual machines afterward.
A more convenient way is to execute the gcloud command on one of the nodes in the cluster, save the key file directly on this node, and copy it to the second node. In this document, the key file is stored in the directory /etc/db2pcmk_fence. To prepare the directory structure, run the following command sequence on both nodes in the cluster.
If your organization requires a different approach to key management, refer to:
Best practices for managing service account keys | IAM Documentation | Google Cloud
To store the key file, create a new directory, for example /etc/db2pcmk_fence as user root on both nodes in the cluster.
mkdir /etc/db2pcmk_fence
chmod 700 /etc/db2pcmk_fence
cd /etc/db2pcmk_fence
Generate and download the access key file for the service account that is used for the fencing agent.
Note: This command must be executed on only one of the virtual machines. In our case, we use the primary database server pcmkdb01.
The following example shows how to execute the gcloud command on the virtual machine itself.
gcloud iam service-accounts keys create db2gp1.json --iam-account=db2gp1-service-account@db2pcmk2023.iam.gserviceaccount.com
Once this file is downloaded to the virtual machine, copy the file to the second node and change the permissions.
scp db2gp1.json pcmkdb02:/etc/db2pcmk_fence
ssh pcmkdb02 "chmod 700 /etc/db2pcmk_fence/db2gp1.json"
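Before configuring the cluster, it can be useful to confirm that the key and the agent work together by querying the power status of the peer node manually. The following is a sketch only, run as user root on pcmkdb01; option names can vary between fence_gce versions, so check fence_gce --help on your system:
fence_gce --action=status --plug=pcmkdb02 --zone=europe-west1-b --project=db2pcmk2023 --serviceaccount=/etc/db2pcmk_fence/db2gp1.json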
6. Prepare the Cluster for Fencing
- Create a systemd drop-in file: systemctl edit corosync.service and add the lines:
[Service]
ExecStartPre=/bin/sleep 60
- Reload the systemd daemon: systemctl daemon-reload
- Check and add the wait_for_all: 1 clause to the /etc/corosync/corosync.conf file on both hosts:
quorum {
provider: corosync_votequorum
two_node: 1
wait_for_all: 1
}
- Increase the value for token from 10000 to 20000 in the /etc/corosync/corosync.conf file on both hosts:
totem {
version: 2
cluster_name: GP1cluster
transport: knet
token: 20000
crypto_cipher: aes256
crypto_hash: sha256
}
- Start the Pacemaker cluster on both hosts: crm cluster start
- Enable the fencing-related properties:
crm configure property stonith-enabled=true
crm configure property no-quorum-policy=stop
crm configure property priority-fencing-delay=60
- On one of the hosts, create the fencing agent primitive as follows:
crm configure primitive fence_db2_gcp_db2gp1 stonith:fence_gce \
    op monitor interval=300s timeout=120s \
    op start interval=0 timeout=60s \
    params serviceaccount="/etc/db2pcmk_fence/db2gp1.json" \
    pcmk_host_list="pcmkdb01,pcmkdb02" zone=europe-west1-b project=db2pcmk2023 \
    pcmk_reboot_timeout=300 \
    pcmk_monitor_retries=4 \
    pcmk_delay_max=30 \
    meta is-managed=true
crm resource manage fence_db2_gcp_db2gp1
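After the primitive is created, you can verify the quorum settings and the state of the fencing resource with commands such as the following:
corosync-quorumtool -s
crm status
crm configure show fence_db2_gcp_db2gp1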
Additional Information
To verify or perform maintenance on the fencing agent resource, unmanage the resource and check its status first by using the following commands:
crm resource unmanage fence_db2_gcp_db2gp1
crm resource status
crm resource status fence_db2_gcp_db2gp1
crm resource manage fence_db2_gcp_db2gp1
crm resource start fence_db2_gcp_db2gp1
To remove the fencing configuration from the cluster, use the following commands:
crm configure property stonith-enabled=false
crm configure property no-quorum-policy=ignore
crm configure delete fence_db2_gcp_db2gp1 --force
crm resource refresh
Document Location
Worldwide
Document Information
Modified date:
14 November 2023
UID
ibm17071303