With Governor running on the designated primary and principal standby databases, HADR is
automatically restarted after a failover to another database. However, if a database becomes
unavailable unexpectedly (for example, during a site outage), the log streams of the old and new
primaries might diverge, and HADR then fails to start on the database that is trying to reintegrate.
About this task
In this case, the following message appears in the Db2 diagnostic log (located in the Db2u pod in
${DIAGPATH}/NODE000):
MESSAGE : ADM12500E The HADR standby database cannot be made consistent with
the primary database. The log stream of the standby database is
incompatible with that of the primary database. To use this database
as a standby, it must be recreated from a backup image or split
mirror of the primary database.
If this scenario occurs, take an online backup of the current primary database and restore it to
the standby database that is failing to start. To do this, manually take a backup from the current
primary and rerun the setup_config_hadr script with --db-role standby.
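Before you reinitialize a standby, it can help to confirm that the failed start was caused by divergent log streams and not by another error. The following is a minimal sketch, not part of the product tooling: the function name has_log_divergence is hypothetical, and the log file name db2diag.log in the usage comment is an assumption about what the directory contains.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Db2 command): exits 0 if the given
# diagnostic log file contains the ADM12500E message quoted above.
has_log_divergence() {
    local logfile="$1"
    grep -q 'ADM12500E' "$logfile"
}

# Example usage inside the Db2u pod (the file name db2diag.log is an assumption):
#   has_log_divergence "${DIAGPATH}/NODE000/db2diag.log" \
#       && echo "Standby must be recreated from a backup of the primary"
```

If the function returns a nonzero status, the message was not found, and the start failure probably has a different cause than the one this procedure addresses.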
Procedure
-
Stop HADR on the database that cannot reintegrate:
oc exec -it c-db2wh-primary-db2u-0 -- manage_hadr -stop
-
Determine the database that is the current primary by using the manage_hadr tool with the -status option.
In the following example, db2wh-aux is the current primary database, after a forced takeover. Note that HADR_ROLE = PRIMARY.
oc exec -it c-db2wh-aux-db2u-0 -- manage_hadr -status
# Output:
#######################################################################
### Db2 Warehouse high availability and ###
### disaster recovery (HADR) management ###
#######################################################################
Running HADR action -status on the database BLUDB ...
################################################################################
### The HADR status summary ###
################################################################################
Database Member 0 -- Database BLUDB -- Active -- Up 0 days 00:00:39 -- Date 2021-05-28-03.47.31.838856
####### Primary - Standby 1 ######
HADR_ROLE = PRIMARY
-
Exec into the current primary database Db2 Warehouse pod and switch to the database instance owner:
oc exec -it c-db2wh-aux-db2u-0 -- bash
su - db2inst1
-
Initiate an online backup of the database to the backup location ${BACKUPDIR} (/mnt/backup):
db2 backup db BLUDB online to ${BACKUPDIR}
-
Copy the keystore into the backup location:
tar -cjvf ${BACKUPDIR}/keystore.tar -C ${KEYSTORELOC} .
-
Update permissions on the backup directory so the Db2 Warehouse instance owner/group has read-write access:
sudo chmod 755 -R /mnt/backup
-
Copy the Db2 Warehouse backup file from the current primary database to the standby database:
## Copy from current primary database to a directory on the host called /tmp/hadr
oc rsync c-db2wh-aux-db2u-0:/mnt/backup/ /tmp/hadr
-
Run the setup_config_hadr script again to restore the database.
- Use standby for the --db-role to ensure that the database is reconfigured as a standby.
- If the database that is being reinitialized is the former primary database, use the designated
principal standby as primary, and use the designated primary as standby, leaving the auxiliary
standby databases as auxiliaries:
oc exec -it c-db2wh-primary-db2u-0 -- setup_config_hadr --db-role standby --primary-name db2wh-standby --standby-name db2wh-primary --primary-port 31384 --standby-port 32457 --aux1-name db2wh-aux --aux1-port 32649 --etcd-host my-etcd-client.my-etcd --etcd-port 2379 --multicluster
- If the database that is being reinitialized is the former principal standby database, use the
same parameters as used in the original setup:
oc exec -it c-db2wh-standby-db2u-0 -- setup_config_hadr --db-role standby --primary-name db2wh-primary --standby-name db2wh-standby --primary-port 32457 --standby-port 31384 --aux1-name db2wh-aux --aux1-port 32649 --etcd-host my-etcd-client.my-etcd --etcd-port 2379 --multicluster
-
Exec into the Db2 Warehouse pod again and, as the Db2 Warehouse instance owner,
check the HADR configuration setting HADR_LOCAL_SVC:
db2 get db cfg for bludb | grep -i hadr_local_svc
# Output:
HADR local service name (HADR_LOCAL_SVC) = 60007|31384
-
Verify that the first port number is correct for its designated original role:
- Primary: 60006
- Standby: 60007
- Aux1: 60008
- Aux2: 60009
If incorrect, edit the HADR configuration setting HADR_LOCAL_SVC so that it uses the correct
port. Only update the first port number, and use the existing value for the second port number:
db2 "update db cfg for BLUDB using HADR_LOCAL_SVC 60006|31384"
-
Exit the pod, and start HADR on the database as a standby:
oc exec -it c-db2wh-primary-db2u-0 -- manage_hadr -start_as standby
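The two checks in this procedure that rely on reading command output (finding the current primary from the manage_hadr -status output, and verifying the first port in HADR_LOCAL_SVC) can be sketched as small shell helpers. This is an illustrative sketch under stated assumptions, not part of the Db2 Warehouse tooling: all three function names are hypothetical, the role keywords primary, standby, aux1, and aux2 are labels chosen for this example, and the status parser assumes that the output contains a line of the form HADR_ROLE = PRIMARY, as in the sample output above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; none of these function names exist in Db2.

# Read `manage_hadr -status` output on stdin and print the first
# HADR_ROLE value found (for example, PRIMARY).
hadr_role_from_status() {
    sed -n 's/^[[:space:]]*HADR_ROLE[[:space:]]*=[[:space:]]*\([A-Z]*\).*$/\1/p' | head -n 1
}

# Print the expected first port of HADR_LOCAL_SVC for a designated
# original role, using the port list from the verification step.
expected_local_port() {
    case "$1" in
        primary) echo 60006 ;;
        standby) echo 60007 ;;
        aux1)    echo 60008 ;;
        aux2)    echo 60009 ;;
        *)       return 1 ;;
    esac
}

# Exit 0 if the first port of an HADR_LOCAL_SVC value such as
# "60007|31384" matches the expected port for the given role.
# Exit 2 if the role keyword is not recognized.
local_svc_matches_role() {
    local role="$1" value="$2" expected
    expected=$(expected_local_port "$role") || return 2
    [ "${value%%|*}" = "$expected" ]
}

# Example usage (pod names are the ones used in this procedure):
#   oc exec c-db2wh-aux-db2u-0 -- manage_hadr -status | hadr_role_from_status
#   local_svc_matches_role primary '60007|31384' || echo "First port needs updating"
```

With the sample value 60007|31384 shown above, the check fails for the role primary, which corresponds to the case where HADR_LOCAL_SVC must be updated to use 60006 as the first port.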