Examples: Takeover in a multiple HADR standby setup

This set of examples of takeovers (both forced and unforced) with multiple HADR standbys is based on a three-standby setup. The purpose of these examples is to show how the automatic reconfiguration works in a takeover situation.
Note: You can execute a graceful or a forced takeover on either the principle or the auxiliary HADR standby database. As every possible combination cannot be provided in an example, some examples are highlighted in this topic.
The initial setup for each of the examples is as follows:
  • a primary database (host1)
  • a principal standby (host2)
  • two auxiliary standbys (host3 and host4)
All of the databases are called hadr_db. The primary and principal standby have their synchronization mode set to SYNC and the standbys have theirs set to SUPERASYNC.
The configuration for each database is shown in Table 1.
Table 1. Configuration values for each HADR database
Configuration parameter Host1 Host2 Host3 Host4
hadr_target_list host2:40|host3:41|host4:42 host1:10|host3:41|host4:42 host2:40|host1:10|host4:42 host2:40|host1:10|host3:41
hadr_remote_host host2 host1 host1 host1
hadr_remote_svc 40 10 10 10
hadr_remote_inst dbinst2 dbinst1 dbinst1 dbinst1
hadr_local_host host1 host2 host3 host4
hadr_local_svc 10 40 41 42
Configured hadr_syncmode
(Refers to the explicitly set synchronization mode, which is used if the database becomes a primary)
SYNC SYNC SUPERASYNC SUPERASYNC
Effective hadr_syncmode
(Refers to the synchronization mode that is used if the database is currently a standby)
n/a SYNC SUPERASYNC SUPERASYNC

A principal standby takes over gracefully (role switch)

The DBA performs a takeover on the principal standby by issuing the following command on host2:
 DB2 TAKEOVER HADR ON DB hadr_db
After the takeover is completed successfully, host2 becomes the new primary and host1, which is the first entry in the hadr_target_list of host2 (as shown in Table 1), becomes its principal standby. Their sync mode is SYNC mode because host2 is configured with an hadr_syncmode of SYNC. The auxiliary standby targets, host3 and host4, have their hadr_remote_host and hadr_remote_svc pointing at the old primary, host1, but are automatically redirected to the new primary, host2. In this redirection, host3 and host4 update (persistently) their hadr_remote_host, hadr_remote_svc, and hadr_remote_inst configuration parameters. They reconnect to host2 as auxiliary standbys, and are told by host2 to use an effective synchronization mode of SUPERASYNC (regardless of what they have locally configured for hadr_syncmode). They do not update their settings for hadr_syncmode persistently. The configuration for each database is shown in Table 2.
Table 2. Configuration values for each HADR database after a role switch. Rows 3 to 5 in columns 4 and 5 have been bolded to show that they have been auto-reconfigured
Configuration parameter Host1 Host2 Host3 Host4
hadr_target_list host2:40|host3:41|host4:42 host1:10|host3:41|host4:42 host2:40|host1:10|host4:42 host2:40|host1:10|host3:41
hadr_remote_host host2 host1 host2 host2
hadr_remote_svc 40 10 40 40
hadr_remote_inst dbinst2 dbinst1 dbinst2 dbinst2
hadr_local_host host1 host2 host3 host4
hadr_local_svc 10 40 41 42
Configured hadr_syncmode SYNC SYNC SUPERASYNC SUPERASYNC
Effective hadr_syncmode SYNC n/a SUPERASYNC SUPERASYNC
Note: A number of values are not updated for the following reasons
  • Because host2 already has its hadr_remote_host and hadr_remote_svc configuration parameters pointing at its principal standby, host1, these values are not updated on host2.
  • Because host1 already has its hadr_remote_host and hadr_remote_svc configuration parameters pointing at the new primary, these values are not updated on host1.
  • Because host1's operational synchronization mode is SYNC and host3 and host4's operational synchronization modes are SUPERASYNC, there is no change for the effective synchronization mode.

An auxiliary standby takes over by force (failover)

A widespread power outage in City A results in the primary (host1) becoming unavailable. Normally, the principal standby (host2) which is in SYNC mode would be the best candidate for taking over and becoming the new primary, but the power outage means that host2 is momentarily unavailable as well. The DBA queries the two auxiliary standbys to determine which one has the most log data:
db2pd -hadr -db hadr_db | grep STANDBY_LOG_FILE,PAGE,POS
The DBA determines that host3 is the most up to date (although it is still a little behind in log replay) and picks that host as the new primary:
 DB2 TAKEOVER HADR ON DB hadr_db BY FORCE
After the takeover is completed successfully, host3 becomes the new primary. Meanwhile, host2 becomes available again. host3 informs host2 and host4 that it is now the primary. On host3, the values for hadr_remote_host, hadr_remote_svc, and hadr_remote_inst are reconfigured to point to host2, which is the principal standby because it is the first entry in the hadr_target_list on host3. On host2, the synchronization mode is reconfigured to SUPERASYNC because that is the setting for hadr_syncmode on host3; in addition, the hadr_remote_host, hadr_remote_svc, and hadr_remote_inst are updated (persistently). host4 is automatically redirected to the new primary, host3. In this redirection, host4 updates (persistently) its hadr_remote_host, hadr_remote_svc, and hadr_remote_inst configuration parameters. There is no automatic reconfiguration on host1 until it becomes available again. The configuration for each database is shown in Table 3.
Table 3. Configuration values for each HADR database after a failover. Rows 3 to 5 in columns 3 to 5 have been bolded to show that they have been auto-reconfigured
Configuration parameter Host1
(unavailable)
Host2 Host3 Host4
hadr_target_list host2:40|host3:41|host4:42 host1:10|host3:41|host4:42 host2:40|host1:10|host4:42 host2:40|host1:10|host3:41
hadr_remote_host host2 host3 host2 host3
hadr_remote_svc 40 41 40 41
hadr_remote_inst dbinst2 dbinst3 dbinst2 dbinst3
hadr_local_host host1 host2 host3 host4
hadr_local_svc 10 40 41 42
Configured hadr_syncmode SYNC SYNC SUPERASYNC SUPERASYNC
Effective hadr_syncmode n/a SUPERASYNC n/a SUPERASYNC
After a short period of time, host1 becomes available. The DBA tries to start host1 as a standby, but because host1 has more logs than were propagated to host3, host1 is rejected as part of the initial handshake with the new primary. The DBA takes a backup of the new primary, restores it to host1, and starts HADR on that host:
DB2 BACKUP DB hadr_db

DB2 RESTORE DB hadr_db

DB2 START HADR ON DB hadr_db AS STANDBY
As is shown in Table 4, host1 is reconfigured.
Table 4. Configuration values for a reintegrated standby. Various rows in column 2 have been bolded to show that they have been auto-reconfigured
Configuration parameter Host1 Host2 Host3 Host4
hadr_target_list host2:40|host3:41|host4:42 host1:10|host3:41|host4:42 host2:40|host1:10|host4:42 host2:40|host1:10|host3:41
hadr_remote_host host3 host3 host2 host3
hadr_remote_svc 41 41 40 41
hadr_remote_inst dbinst3 dbinst3 dbinst2 dbinst3
hadr_local_host host1 host2 host3 host4
hadr_local_svc 10 40 41 42
Configured hadr_syncmode SYNC SYNC SUPERASYNC SUPERASYNC
Effective hadr_syncmode SUPERASYNC SUPERASYNC n/a SUPERASYNC

If the DBA wants to make host1 the primary again, then all that is required is a failback, which will restore the original configuration shown in Table 1.

An auxiliary standby takes over by force (failover) in a SA MP or Pacemaker environment

This example is similar to the previous one, but HADR has been deployed with IBM® Tivoli® System Automation for Multiplatforms (SA MP) to automate failover.

A power failure in City A results in the principal standby (host2) becoming unavailable. Following that, there is an outage on the primary (host1). Normally, the cluster manager would automatically fail over to the principal standby (host2), but the power outage means that one of the auxiliary standbys needs to be the takeover target. Failover cannot be automated to auxiliary standbys, so the DBA must do it manually while keeping host1 offline to eliminate the possibility that the old primary will restart if a client connects to it.

Important: Multiple standby with Pacemaker is not available in version 11.5.4. Support for multiple standbys has been added for version 11.5.5 and later.
The DBA queries the two auxiliary standbys to determine which one has the most log data:
db2pd -hadr -db hadr_db | grep 'STANDBY_LOG_FILE,PAGE,POS'
The DBA determines that host3 is the most up to date and picks that host as the new primary.
Then, the DBA issues the force takeover on host3:
 DB2 TAKEOVER HADR ON DB hadr_db BY FORCE
After the takeover is completed successfully, host3 becomes the new primary. Meanwhile, host2 becomes available again. host3 informs host2 and host4 that it is now the primary. On host3, the values for hadr_remote_host, hadr_remote_svc, and hadr_remote_inst are reconfigured to point to host2, which is the principal standby because it is the first entry in the hadr_target_list on host3. On host2, the synchronization mode is reconfigured to SUPERASYNC because that is the setting for hadr_syncmode on host3; in addition, the hadr_remote_host, hadr_remote_svc, and hadr_remote_inst are updated (persistently). host4 is automatically redirected to the new primary, host3. In this redirection, host4 updates (persistently) its hadr_remote_host, hadr_remote_svc, and hadr_remote_inst configuration parameters. There is no automatic reconfiguration on host1. The configuration for each database is shown in Table 5.
Table 5. Configuration values for each HADR database after a failover. Rows 3 to 5 in columns 3 to 5 have been bolded to show that they have been auto-reconfigured
Configuration parameter Host1
(unavailable)
Host2 Host3 Host4
hadr_target_list host2:40|host3:41|host4:42 host1:10|host3:41|host4:42 host2:40|host1:10|host4:42 host2:40|host1:10|host3:41
hadr_remote_host host2 host3 host2 host3
hadr_remote_svc 40 41 40 41
hadr_remote_inst dbinst2 dbinst3 dbinst2 dbinst3
hadr_local_host host1 host2 host3 host4
hadr_local_svc 10 40 41 42
Configured hadr_syncmode SYNC SYNC SUPERASYNC SUPERASYNC
Effective hadr_syncmode n/a SUPERASYNC n/a SUPERASYNC
Once the original primary site recovers and host1 and host2 become available, the databases will fail to start as not all of their logs were propagated to host3. They must be manually reintegrated using the following steps:
  1. Disable SA MP on host1 and host2 (for SA MP only) by running:
    db2haicu -disable
  2. Deactivate the database on host1 and host2 by issuing the following command on both hosts:
    DB2 DEACTIVATE DB hadr_db
  3. Stop HADR on host1 and host2 to disable automation between them by running:
    DB2 STOP HADR ON DB hadr_db
  4. Drop the databases on host1 and host2 by running:
    DB2 DROP DB hadr_db
  5. Backup the new primary database on host3 online by running:
    DB2 BACKUP DB hadr_db online
  6. Copy the backup image to both host1 and host2 using the scp command.
  7. Restore the hadr_db database using the backup image on host1 and host2 by running:
    DB2 RESTORE DB hadr_db
  8. Reconfigure the restored database on each host by running the following command:
    On host1:
    DB2 "UPDATE DB CFG FOR hadr_db USING 
    HADR_TARGET_LIST host2:40|host3:41|host4:42 
    HADR_REMOTE_HOST host3 
    HADR_REMOTE_SVC  30 
    HADR_LOCAL_HOST  host1 
    HADR_LOCAL_SVC   10 
    HADR_SYNCMODE    sync 
    HADR_REMOTE_INST dbinst3"
    On host2:
    DB2 "UPDATE DB CFG FOR hadr_db USING 
    HADR_TARGET_LIST host1:10|host3:41|host4:42 
    HADR_REMOTE_HOST host3 
    HADR_REMOTE_SVC  41 
    HADR_LOCAL_HOST  host2 
    HADR_LOCAL_SVC   40 
    HADR_SYNCMODE    sync 
    HADR_REMOTE_INST dbinst3"
  9. Start HADR on host1 and host2 to reintegrate them as standbys by running:
    DB2 START HADR ON DB hadr_db AS STANDBY
After running these commands successfully, host1 will become an auxiliary standby and host2 will remain as principle standby. But in this case, no automation is supported between the new primary on host3 and the principle standby on host2. To enable automation, issue the takeover command on host1 manually to make the database primary again by running:
DB2 TAKEOVER HADR ON DB hadr_db