When you use the integrated High Availability (HA) feature with Pacemaker to automate HADR, extra steps are required to update the operating system or the Db2® database system software, upgrade the hardware, or change the database configuration parameters. Follow this procedure to perform a rolling update in a Pacemaker automated HADR environment.
Before you begin
Before running the rolling update procedure, ensure that the following prerequisites are met:
- You have configured HADR for your Pacemaker-managed Linux cluster, either on two hosts or as a two-site multiple standby cluster with same-site failover automation on four hosts across two sites.
- The instances are running Db2 11.5.4 or later.
- If you are updating to Db2 11.5.5, you have downloaded the associated Pacemaker stack here. When updating to Db2 11.5.6 or later, the Pacemaker stack is installed by running the installFixPack command.
- The HADR pair is in PEER state.
Restrictions
Use this procedure to perform a rolling update on your Db2 database system and update the Db2 database product software to a new fix pack level in a Pacemaker automated HADR environment, for example, when applying a fix pack to the Db2 database product software.
- The Db2 instances must be currently running at Db2 11.5.4 or later.
A rolling update cannot be used to upgrade a Db2 database system from an earlier version to a later version. For example, you cannot use this procedure to upgrade from Db2 Version 10.5 to Db2 11.5. To upgrade a Db2 server in an automated HADR environment, see Upgrading Db2 servers in a TSA automated HADR environment.
You cannot use this procedure to update the Db2 HADR configuration
parameters. Updates to the HADR configuration parameters must be made separately. Because HADR
requires the parameters on the primary and standby to be the same, both the primary and standby
databases might need to be deactivated and updated at the same time.
The following procedure cannot be used to convert an existing Db2 HADR system that uses Tivoli SA MP (TSA) as a cluster manager to a newer Db2 level that uses Pacemaker as a cluster manager in a single step. Instead, first update the existing system to the new Db2 level while maintaining TSA as the integrated cluster manager. Once the update is complete, follow the steps outlined in Replacing an existing Tivoli SA MP-managed Db2 instance with a Pacemaker-managed HADR Db2 instance to switch to Pacemaker as the cluster manager.
The following procedure is only applicable when the existing Db2 HADR cluster is deployed using the Db2-provided Pacemaker cluster software stack. If the cluster to be updated uses Pacemaker provided by another vendor, all cluster resources should be removed following the procedures outlined by that Pacemaker supplier, then recreated with the Db2-provided Pacemaker cluster software stack using the db2cm utility. See Configuring high availability with the Db2 cluster manager utility (db2cm).
Procedure
- On each standby host, ensure that all databases have their HADR_ROLE set to STANDBY:
db2pd -hadr -db <database-name>
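This check can also be scripted; the following is a minimal sketch (the hadr_role helper is not part of Db2) that assumes the usual "HADR_ROLE = <value>" line format of db2pd -hadr output:

```shell
# Hypothetical helper: extract the HADR_ROLE value from db2pd -hadr
# output read on stdin ("HADR_ROLE = <value>" line format).
hadr_role() {
  awk '$1 == "HADR_ROLE" {print $3}'
}

# Example against a captured db2pd -hadr excerpt; in practice, pipe
# the live command output: db2pd -hadr -db <database-name> | hadr_role
role=$(printf 'HADR_ROLE = STANDBY\n' | hadr_role)
echo "$role"
```

A host is safe to proceed on only when every database reports STANDBY.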
- On each standby host, deactivate all databases to stop HADR while retaining
the role:
db2 deactivate db <database-name>
- On each standby host, stop all Db2 processes:
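The stop command itself is not shown above; as a sketch, assuming a single instance named db2inst1 (substitute your own instance owner), the Db2 processes can be stopped with db2stop run as the instance owner. With DRY_RUN=1 the sketch only prints the command:

```shell
# Sketch only: stop the Db2 instance on a standby host.
# db2inst1 is a placeholder instance name; with DRY_RUN=1 the command
# is printed rather than executed.
DRY_RUN=1
cmd="su - db2inst1 -c 'db2stop'"
if [ "$DRY_RUN" = "1" ]; then
  echo "$cmd"
else
  eval "$cmd"
fi
```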
- As the root user on each standby host, stop all Pacemaker and Corosync processes:
systemctl stop pacemaker
systemctl stop corosync
systemctl stop corosync-qdevice
Note: Only run the systemctl stop corosync-qdevice command if the Qdevice is
configured.
- Apply the update on each standby host.
- If you are not updating to a new Db2 fix pack or to a new major version of the operating system (for example, from Red Hat Enterprise Linux (RHEL) 8 to RHEL 9), you can proceed to step 9 after the change has been applied.
- If updating to a new major version of the operating system, for example, from RHEL 8 to RHEL 9, run the following command, then proceed to step 9:
db2InstallPCMK -i
- If updating to a new Db2 fix pack, follow the Installing offline fix pack updates to existing Db2 database products (Linux® and UNIX) procedure.
Important: If updating to Db2 11.5.6 or later, step 6 through step 8 are no longer necessary as the installFixPack command takes care of these tasks.
- If updating to Db2 11.5.5, on each standby host, install the new Pacemaker and Corosync packages that are provided by IBM®:
- If updating to Db2 11.5.5, as the root user on each standby host, copy the new db2cm utility from /<tarFilePath>/Db2/db2cm to /home/<inst_user>/sqllib/bin:
cp /<tarFilePath>/Db2/db2cm /home/<inst_user>/sqllib/bin
chmod 755 /home/<inst_user>/sqllib/bin/db2cm
- If updating to Db2 11.5.5, on each standby host, run the following commands as root to copy the resource agent scripts (db2hadr, db2inst, db2ethmon) from /<tarFilePath>/Db2agents into /usr/lib/ocf/resource.d/heartbeat/:
/home/<inst_user>/sqllib/bin/db2cm -copy_resources /<tarFilePath>/Db2agents -host <host1>
/home/<inst_user>/sqllib/bin/db2cm -copy_resources /<tarFilePath>/Db2agents -host <host2>
- As the root user on each standby host, start the Pacemaker and Corosync processes:
systemctl start pacemaker
systemctl start corosync
systemctl start corosync-qdevice
Note: Only run the systemctl start corosync-qdevice command if the Qdevice is
configured.
- As the root user on each standby host, check the configuration, either manually or by
running the crm_verify tool, if available:
crm_verify -L -V
Note: This command prints any errors in the configuration; if the configuration is valid, nothing is printed.
- On each standby host, start all Db2 processes:
- On each standby host, activate all databases:
db2 activate db <database-name>
- On the principal standby host, run a role switch for all databases:
db2 takeover hadr on db <database-name>
- If applying a new Db2 fix pack, after the role switch the old primary database disconnects because the new primary is running at a higher fix pack level.
- On the old primary host, repeat step 2 to step 12 to apply the update on this host.
Note: Exclude step 8, because that step is redundant if you already completed it the first time through.
Important: Step 15 to step 19 are only necessary if updating from Db2 11.5.5 to Db2 11.5.6 or later.
- Update the migration-threshold meta attribute for each database by deleting the existing attribute, then setting it with the new value.
Delete the existing attribute:
crm resource meta <database resource name> delete migration-threshold
Then set the new attribute:
crm resource meta <database resource name> set migration-threshold 1
The following example shows the command syntax for updating the migration threshold for an automated database named CORAL:
crm resource meta db2_db2inst1_db2inst1_CORAL delete migration-threshold
crm resource meta db2_db2inst1_db2inst1_CORAL set migration-threshold 1
- Update the failure-timeout attribute for each database:
crm resource meta <database resource name> set failure-timeout 10
The following example shows the command syntax for updating the failure-timeout attribute for a database named CORAL:
crm resource meta db2_db2inst1_db2inst1_CORAL set failure-timeout 10
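When several databases are automated, the two meta-attribute updates above can be applied in a loop. A sketch follows; the resource names are placeholders for your own, and DRY_RUN=1 makes the loop echo the crm commands instead of running them:

```shell
# Sketch: apply migration-threshold and failure-timeout to every
# database resource. Resource names below are placeholders; with
# DRY_RUN=1 each crm command is printed rather than executed.
DRY_RUN=1
run() { if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi; }

for res in db2_db2inst1_db2inst1_CORAL db2_db2inst1_db2inst1_CORAL2; do
  run crm resource meta "$res" delete migration-threshold
  run crm resource meta "$res" set migration-threshold 1
  run crm resource meta "$res" set failure-timeout 10
done
```

Set DRY_RUN=0 only after reviewing the printed commands against your actual resource names.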
- Ensure that the migration-threshold and failure-timeout attributes have been updated for each database:
crm config show <database resource-clone>
The following example shows the command syntax for viewing the updated resource configuration for an automated database named CORAL:
crm config show db2_db2inst1_db2inst1_CORAL-clone
ms db2_db2inst1_db2inst1_CORAL-clone db2_db2inst1_db2inst1_CORAL \
meta resource-stickiness=5000 migration-threshold=1 ordered=true promotable=true is-managed=true failure-timeout=10
- Update the cluster configuration to set symmetric-cluster to true:
crm configure property symmetric-cluster=true
- Update the Corosync configuration to use millisecond timestamps. This can be done while the cluster is online.
Edit the corosync.conf file:
crm corosync edit
Update the timestamp setting under the logging directive to hires instead of on. The final directive should look like the following:
logging {
to_logfile: yes
logfile: /var/log/cluster/corosync.log
to_syslog: yes
timestamp: hires
function_name: on
fileline: on
}
Push the change to the remote host:
crm corosync push <remote hostname>
Lastly, refresh Corosync so that it uses the new configuration:
crm corosync reload
- On the original primary host for each database, run a failback operation to set the HADR
roles back to their original state:
db2 takeover hadr on db <database-name>
- Verify that all the databases are in the PEER state:
db2pd -hadr -db <database-name>
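The PEER verification can be scripted as well; a minimal sketch (the check_peer helper is hypothetical, not part of Db2) that assumes the "HADR_STATE = <value>" line format of db2pd -hadr output:

```shell
# Hypothetical check: succeed only if the db2pd -hadr output on stdin
# reports HADR_STATE = PEER.
check_peer() { awk '$1 == "HADR_STATE" { if ($3 != "PEER") exit 1 }'; }

# Examples against captured excerpts; in practice, pipe the live output:
# db2pd -hadr -db <database-name> | check_peer
peer=$(printf 'HADR_STATE = PEER\n' | check_peer && echo yes || echo no)
notpeer=$(printf 'HADR_STATE = REMOTE_CATCHUP\n' | check_peer && echo yes || echo no)
echo "$peer $notpeer"
```

Run the check once per database; any database not in PEER state should be investigated before continuing.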
Important: Step 22 through step 24 are only necessary if updating to a new Db2 fix pack and the Qdevice is configured.
- If updating to a new Db2 fix pack and the Qdevice is configured, as the root user on the Qdevice host, stop the corosync-qnetd process:
systemctl stop corosync-qnetd
- If updating to a new Db2 fix pack and the Qdevice is configured, as the root user on the Qdevice host, update the corosync-qnetd package provided by IBM, depending on the version of Db2:
- For RHEL systems on Db2 11.5.5 and older:
dnf upgrade /<tarFilePath>/RPMS/<architecture>/corosync-qnetd
- For SLES systems on Db2 11.5.5 and older:
zypper in --allow-unsigned-rpm /<tarFilePath>/RPMS/<architecture>/corosync-qnetd
- For RHEL systems on Db2 11.5.6 and newer:
dnf upgrade <Db2_image>/db2/<platform>/pcmk/Linux/<OS_distribution>/<architecture>/corosync-qnetd
- For SLES systems on Db2 11.5.6 and newer:
zypper in --allow-unsigned-rpm <Db2_image>/db2/<platform>/pcmk/Linux/<OS_distribution>/<architecture>/corosync-qnetd
- If updating to a new Db2 fix pack and the Qdevice is configured, as the root user on the Qdevice host, start the corosync-qnetd process:
- Confirm that the cluster is in a healthy state:
crm resource show
Note: This might take Pacemaker around a minute to complete.
The following example shows the output from running the crm resource show command:
db2_db2tea1_eth1 (ocf::heartbeat:db2ethmon): Started
db2_kedge1_eth1 (ocf::heartbeat:db2ethmon): Started
db2_kedge1_db2inst1_0 (ocf::heartbeat:db2inst): Started
db2_db2tea1_db2inst2_0 (ocf::heartbeat:db2inst): Started
db2_kedge1_db2inst2_0 (ocf::heartbeat:db2inst): Started
Clone Set: db2_db2inst2_db2inst2_CORAL-clone [db2_db2inst2_db2inst2_CORAL] (promotable)
Masters: [ db2tea1 ]
Slaves: [ kedge1 ]
Clone Set: db2_db2inst2_db2inst2_CORAL2-clone [db2_db2inst2_db2inst2_CORAL2] (promotable)
Masters: [ db2tea1 ]
Slaves: [ kedge1 ]
db2_db2tea1_db2inst1_0 (ocf::heartbeat:db2inst): Started
Clone Set: db2_db2inst1_db2inst1_CORAL-clone [db2_db2inst1_db2inst1_CORAL] (promotable)
Masters: [ db2tea1 ]
Slaves: [ kedge1 ]
Clone Set: db2_db2inst1_db2inst1_CORAL2-clone [db2_db2inst1_db2inst1_CORAL2] (promotable)
Masters: [ db2tea1 ]
Slaves: [ kedge1 ]
No resources should be in an unmanaged state, and all resources should be started in the expected role.
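The "no unmanaged resources" condition can be checked mechanically; a sketch, assuming output in the format of the crm resource show example above (the inlined sample here is an abbreviated stand-in for the real command output):

```shell
# Sketch: flag any resource marked unmanaged in saved "crm resource show"
# output. The sample text is a stand-in; in practice, capture the live
# output with: sample=$(crm resource show)
sample='db2_db2tea1_eth1 (ocf::heartbeat:db2ethmon): Started
db2_kedge1_eth1 (ocf::heartbeat:db2ethmon): Started'

if printf '%s\n' "$sample" | grep -q unmanaged; then
  status="unhealthy"
else
  status="healthy"
fi
echo "$status"
```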