DB2 Version 10.1 for Linux, UNIX, and Windows

Examples: Takeover in HADR multiple standby mode

This set of examples of takeovers (both forced and unforced) in HADR multiple standby mode is based on a three-standby setup. The purpose of these examples is to show how the multiple standby automatic reconfiguration works in a takeover situation.

The initial setup for each of the examples is as follows:

a primary database (host1)
a principal standby (host2)
two auxiliary standbys (host3 and host4)

All of the databases are called hadr_db. The primary and the principal standby have their synchronization mode set to SYNC, and the auxiliary standbys have theirs set to SUPERASYNC.

The configuration for each database is shown in Table 1.

Table 1. Configuration values for each HADR database
Configuration parameter	Host1	Host2	Host3	Host4
hadr_target_list	host2:40|host3:41|host4:42	host1:10|host3:41|host4:42	host2:40|host1:10|host4:42	host2:40|host1:10|host3:41
hadr_remote_host	host2	host1	host1	host1
hadr_remote_svc	40	10	10	10
hadr_remote_inst	dbinst2	dbinst1	dbinst1	dbinst1
hadr_local_host	host1	host2	host3	host4
hadr_local_svc	10	40	41	42
Configured hadr_syncmode (Refers to the explicitly set synchronization mode, which is used if the database becomes a primary)	SYNC	SYNC	SUPERASYNC	SUPERASYNC
Effective hadr_syncmode (Refers to the synchronization mode that is used if the database is currently a standby)	n/a	SYNC	SUPERASYNC	SUPERASYNC
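
The role of the first hadr_target_list entry can be illustrated with a short Python sketch. It is only a model of the values in Table 1, not DB2 code; the function names parse_target_list and principal_standby are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple

# hadr_target_list values copied from Table 1.
TARGET_LISTS = {
    "host1": "host2:40|host3:41|host4:42",
    "host2": "host1:10|host3:41|host4:42",
    "host3": "host2:40|host1:10|host4:42",
    "host4": "host2:40|host1:10|host3:41",
}


def parse_target_list(value: str) -> List[Tuple[str, str]]:
    """Split a 'host:service|host:service' string into (host, service) pairs."""
    targets = []
    for entry in value.split("|"):
        # Split on the last colon so that the service part is always isolated.
        host, _, service = entry.rpartition(":")
        targets.append((host, service))
    return targets


def principal_standby(value: str) -> Tuple[str, str]:
    """Return the first entry, which is the principal standby
    when the owning database is the primary."""
    return parse_target_list(value)[0]


if __name__ == "__main__":
    for host, value in TARGET_LISTS.items():
        print(host, "-> principal standby would be", principal_standby(value))
```

In this model, host1 and host3 both name host2 as their principal standby, while host2 names host1.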

A principal standby takes over gracefully (role switch)

The DBA performs a takeover on the principal standby by issuing the following command on host2:

DB2 TAKEOVER HADR ON DB hadr_db

After the takeover completes successfully, host2 becomes the new primary, and host1, which is the first entry in the hadr_target_list of host2 (as shown in Table 1), becomes its principal standby. The two databases synchronize in SYNC mode because host2 is configured with an hadr_syncmode of SYNC.

The auxiliary standbys, host3 and host4, still have their hadr_remote_host and hadr_remote_svc pointing at the old primary, host1, but they are automatically redirected to the new primary, host2. During this redirection, host3 and host4 persistently update their hadr_remote_host, hadr_remote_svc, and hadr_remote_inst configuration parameters. They reconnect to host2 as auxiliary standbys, and host2 tells them to use an effective synchronization mode of SUPERASYNC, regardless of what they have configured locally for hadr_syncmode. They do not persistently update their hadr_syncmode settings. The configuration for each database is shown in Table 2.

Table 2. Configuration values for each HADR database after a role switch. The hadr_remote_host, hadr_remote_svc, and hadr_remote_inst values for Host3 and Host4 have been auto-reconfigured
Configuration parameter	Host1	Host2	Host3	Host4
hadr_target_list	host2:40|host3:41|host4:42	host1:10|host3:41|host4:42	host2:40|host1:10|host4:42	host2:40|host1:10|host3:41
hadr_remote_host	host2	host1	host2	host2
hadr_remote_svc	40	10	40	40
hadr_remote_inst	dbinst2	dbinst1	dbinst2	dbinst2
hadr_local_host	host1	host2	host3	host4
hadr_local_svc	10	40	41	42
Configured hadr_syncmode	SYNC	SYNC	SUPERASYNC	SUPERASYNC
Effective hadr_syncmode	SYNC	n/a	SUPERASYNC	SUPERASYNC

Note: A number of values are not updated, for the following reasons:

Because host2 already has its hadr_remote_host and hadr_remote_svc configuration parameters pointing at its principal standby, host1, these values are not updated on host2.
Because host1 already has its hadr_remote_host and hadr_remote_svc configuration parameters pointing at the new primary, these values are not updated on host1.
Because host1's operational synchronization mode is SYNC and host3 and host4's operational synchronization modes are SUPERASYNC, there is no change for the effective synchronization mode.
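
The redirection in this role switch can be modeled in a few lines of Python. This is a sketch of the rules stated in this example, not DB2 code: the function name and data layout are hypothetical, and dbinst4 is an assumed instance name because the tables never show the instance on host4.

```python
from typing import Dict, Iterable

# Local HADR endpoint of each database (hadr_local_svc and instance name).
# "dbinst4" is an assumption; the tables never list host4's instance.
LOCAL = {
    "host1": {"svc": 10, "inst": "dbinst1"},
    "host2": {"svc": 40, "inst": "dbinst2"},
    "host3": {"svc": 41, "inst": "dbinst3"},
    "host4": {"svc": 42, "inst": "dbinst4"},
}

# Host names from each database's hadr_target_list, in order (Table 1).
TARGET_LIST = {
    "host1": ["host2", "host3", "host4"],
    "host2": ["host1", "host3", "host4"],
    "host3": ["host2", "host1", "host4"],
    "host4": ["host2", "host1", "host3"],
}


def remote_settings_after_takeover(
    new_primary: str, available: Iterable[str]
) -> Dict[str, dict]:
    """Model the hadr_remote_* values of each available database after a takeover."""
    targets = TARGET_LIST[new_primary]
    principal = targets[0]  # first entry is the principal standby
    settings = {}
    for host in available:
        if host == new_primary:
            peer = principal      # the primary points at its principal standby
        elif host in targets:
            peer = new_primary    # every standby points at the new primary
        else:
            continue              # not in the primary's target list
        settings[host] = {
            "hadr_remote_host": peer,
            "hadr_remote_svc": LOCAL[peer]["svc"],
            "hadr_remote_inst": LOCAL[peer]["inst"],
        }
    return settings


if __name__ == "__main__":
    hosts = ["host1", "host2", "host3", "host4"]
    for host, cfg in remote_settings_after_takeover("host2", hosts).items():
        print(host, cfg)
```

With host2 as the new primary and all four databases available, the model yields the hadr_remote_host, hadr_remote_svc, and hadr_remote_inst rows of Table 2.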

An auxiliary standby takes over by force (failover)

A widespread power outage in City A makes the primary (host1) unavailable. Normally, the principal standby (host2), which is in SYNC mode, would be the best candidate for taking over as the new primary, but the power outage means that host2 is temporarily unavailable as well. The DBA queries the two auxiliary standbys to determine which one has the most log data:

db2pd -hadr -db hadr_db | grep -E 'PRIMARY_LOG_FILE,PAGE,POS|STANDBY_LOG_FILE,PAGE,POS'
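
Comparing the standbys' log positions could be scripted along the following lines. This Python sketch assumes that the STANDBY_LOG_FILE,PAGE,POS line has the form "name = file, page, position"; the helper names and the sample values are made up for illustration and are not output from a real system.

```python
import re
from typing import Dict

# Assumed line layout: STANDBY_LOG_FILE,PAGE,POS = <file>, <page>, <position>
_POS_LINE = re.compile(
    r"STANDBY_LOG_FILE,PAGE,POS\s*=\s*(\S+),\s*(\d+),\s*(\d+)"
)


def standby_log_pos(db2pd_output: str) -> int:
    """Extract the standby log position (third field) from db2pd -hadr text."""
    match = _POS_LINE.search(db2pd_output)
    if match is None:
        raise ValueError("no STANDBY_LOG_FILE,PAGE,POS line found")
    return int(match.group(3))


def most_up_to_date(outputs: Dict[str, str]) -> str:
    """Return the host whose standby log position is the highest."""
    return max(outputs, key=lambda host: standby_log_pos(outputs[host]))


# Made-up sample lines, one per auxiliary standby (illustration only).
SAMPLE = {
    "host3": "STANDBY_LOG_FILE,PAGE,POS = S0000012.LOG, 14151, 3685322664",
    "host4": "STANDBY_LOG_FILE,PAGE,POS = S0000012.LOG, 14098, 3685106920",
}

if __name__ == "__main__":
    print("takeover candidate:", most_up_to_date(SAMPLE))
```

The position field is compared rather than the file and page because it is a single number; whether that is the right field to compare on a given system is for the DBA to confirm.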

The DBA determines that host3 is the most up to date (although it is still a little behind in log replay) and picks that host as the new primary by issuing the following command on host3:

DB2 TAKEOVER HADR ON DB hadr_db BY FORCE

After the takeover completes successfully, host3 becomes the new primary. Meanwhile, host2 becomes available again, and host3 informs host2 and host4 that it is now the primary.

On host3, the values for hadr_remote_host, hadr_remote_svc, and hadr_remote_inst are reconfigured to point to host2, which is the principal standby because it is the first entry in the hadr_target_list on host3. On host2, the effective synchronization mode is reconfigured to SUPERASYNC because that is the setting for hadr_syncmode on host3; in addition, hadr_remote_host, hadr_remote_svc, and hadr_remote_inst are persistently updated. host4 is automatically redirected to the new primary, host3, and in this redirection it persistently updates its hadr_remote_host, hadr_remote_svc, and hadr_remote_inst configuration parameters. There is no automatic reconfiguration on host1 until it becomes available again. The configuration for each database is shown in Table 3.

Table 3. Configuration values for each HADR database after a failover. The hadr_remote_host, hadr_remote_svc, and hadr_remote_inst values for Host2, Host3, and Host4 have been auto-reconfigured
Configuration parameter	Host1 (unavailable)	Host2	Host3	Host4
hadr_target_list	host2:40|host3:41|host4:42	host1:10|host3:41|host4:42	host2:40|host1:10|host4:42	host2:40|host1:10|host3:41
hadr_remote_host	host2	host3	host2	host3
hadr_remote_svc	40	41	40	41
hadr_remote_inst	dbinst2	dbinst3	dbinst2	dbinst3
hadr_local_host	host1	host2	host3	host4
hadr_local_svc	10	40	41	42
Configured hadr_syncmode	SYNC	SYNC	SUPERASYNC	SUPERASYNC
Effective hadr_syncmode	n/a	SUPERASYNC	n/a	SUPERASYNC
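
The effective synchronization modes in Table 3 follow two rules stated in these examples: the principal standby uses the hadr_syncmode configured on the primary, and auxiliary standbys always use SUPERASYNC. The following Python sketch models those rules. The function name is hypothetical, and the model ignores any cases that these examples do not cover.

```python
from typing import Dict, Iterable, List, Optional

# Configured hadr_syncmode of each database (Table 1).
CONFIGURED = {
    "host1": "SYNC",
    "host2": "SYNC",
    "host3": "SUPERASYNC",
    "host4": "SUPERASYNC",
}


def effective_syncmodes(
    new_primary: str,
    primary_targets: List[str],
    available: Iterable[str],
) -> Dict[str, Optional[str]]:
    """Model the effective hadr_syncmode of every available database.

    primary_targets holds the host names in the new primary's
    hadr_target_list, in order; the first one is the principal standby.
    """
    principal = primary_targets[0]
    modes: Dict[str, Optional[str]] = {}
    for host in available:
        if host == new_primary:
            modes[host] = None  # shown as n/a: a primary has no effective mode
        elif host == principal:
            modes[host] = CONFIGURED[new_primary]  # dictated by the primary
        else:
            modes[host] = "SUPERASYNC"  # auxiliary standbys
    return modes


if __name__ == "__main__":
    # Failover to host3 while host1 is still unavailable.
    print(effective_syncmodes("host3", ["host2", "host1", "host4"],
                              ["host2", "host3", "host4"]))
```

For the failover to host3, the model gives SUPERASYNC for host2 even though host2 itself is configured with SYNC, which matches the last two rows of Table 3.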

After a short period of time, host1 becomes available. The DBA tries to start host1 as a standby, but because host1 has more logs than were propagated to host3, host1 is rejected as part of the initial handshake with the new primary. The DBA takes a backup of the new primary, restores it to host1, and starts HADR on that host:

DB2 BACKUP DB hadr_db

DB2 RESTORE DB hadr_db

DB2 START HADR ON DB hadr_db AS STANDBY
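
The decision between a direct restart and a full backup and restore can be summarized in a small Python sketch. It reduces the handshake check to a comparison of two log positions, which is a simplification of whatever DB2 verifies internally; the function names and the numbers in the usage lines are hypothetical.

```python
from typing import List


def can_rejoin_as_standby(old_primary_log_pos: int,
                          new_primary_takeover_pos: int) -> bool:
    """Simplified check: the old primary can rejoin directly only if it holds
    no log data beyond what the new primary had received at takeover."""
    return old_primary_log_pos <= new_primary_takeover_pos


def reintegration_plan(old_primary_log_pos: int,
                       new_primary_takeover_pos: int) -> List[str]:
    """Return the steps from this example that apply to the old primary."""
    if can_rejoin_as_standby(old_primary_log_pos, new_primary_takeover_pos):
        return ["START HADR ON DB hadr_db AS STANDBY (on the old primary)"]
    return [
        "BACKUP DB hadr_db (on the new primary)",
        "RESTORE DB hadr_db (on the old primary)",
        "START HADR ON DB hadr_db AS STANDBY (on the old primary)",
    ]


if __name__ == "__main__":
    # Made-up positions: the old primary is ahead, as host1 is in this example.
    for step in reintegration_plan(120, 100):
        print(step)
```

In this example host1 is ahead of host3, so the three-step path applies.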

As shown in Table 4, host1 is reconfigured.

Table 4. Configuration values for a reintegrated standby. Several values in the Host1 column have been auto-reconfigured
Configuration parameter	Host1	Host2	Host3	Host4
hadr_target_list	host2:40|host3:41|host4:42	host1:10|host3:41|host4:42	host2:40|host1:10|host4:42	host2:40|host1:10|host3:41
hadr_remote_host	host3	host3	host2	host3
hadr_remote_svc	41	41	40	41
hadr_remote_inst	dbinst3	dbinst3	dbinst2	dbinst3
hadr_local_host	host1	host2	host3	host4
hadr_local_svc	10	40	41	42
Configured hadr_syncmode	SYNC	SYNC	SUPERASYNC	SUPERASYNC
Effective hadr_syncmode	SUPERASYNC	SUPERASYNC	n/a	SUPERASYNC

If the DBA wants to make host1 the primary again, then all that is required is a failback, which will restore the original configuration shown in Table 1.

An auxiliary standby takes over by force (failover) in an SA MP environment

This example is similar to the previous one, but HADR has been deployed with IBM® Tivoli® System Automation for Multiplatforms (SA MP) to automate failover.

A power failure in City A makes the principal standby (host2) unavailable, and an outage on the primary (host1) follows. Normally, SA MP, the cluster manager, would automatically fail over to the principal standby (host2), but because host2 is unavailable, one of the auxiliary standbys must be the takeover target. Failover to auxiliary standbys cannot be automated, so the DBA must perform it manually. Before doing so, the DBA must ensure that SA MP is disabled so that if host1 or host2 becomes available, there is no possibility of a split-brain situation, in which more than one database operates independently as a primary. To do this, the DBA issues the following command on host1 and host2 (whenever they become available):

db2haicu -disable

In addition, the DBA needs to keep host1 offline to eliminate the possibility that the old primary will restart if a client connects to it.

The DBA queries the two auxiliary standbys to determine which one has the most log data:

db2pd -hadr -db hadr_db | grep 'STANDBY_LOG_FILE,PAGE,POS'

The DBA determines that host3 is the most up to date and picks that host as the new primary.

Then, the DBA issues the force takeover on host3:

DB2 TAKEOVER HADR ON DB hadr_db BY FORCE

After the takeover completes successfully, host3 becomes the new primary. Meanwhile, host2 becomes available again, and host3 informs host2 and host4 that it is now the primary.

On host3, the values for hadr_remote_host, hadr_remote_svc, and hadr_remote_inst are reconfigured to point to host2, which is the principal standby because it is the first entry in the hadr_target_list on host3. On host2, the effective synchronization mode is reconfigured to SUPERASYNC because that is the setting for hadr_syncmode on host3; in addition, hadr_remote_host, hadr_remote_svc, and hadr_remote_inst are persistently updated. host4 is automatically redirected to the new primary, host3, and in this redirection it persistently updates its hadr_remote_host, hadr_remote_svc, and hadr_remote_inst configuration parameters. There is no automatic reconfiguration on host1. The configuration for each database is shown in Table 5.

Table 5. Configuration values for each HADR database after a failover. The hadr_remote_host, hadr_remote_svc, and hadr_remote_inst values for Host2, Host3, and Host4 have been auto-reconfigured
Configuration parameter	Host1 (unavailable)	Host2	Host3	Host4
hadr_target_list	host2:40|host3:41|host4:42	host1:10|host3:41|host4:42	host2:40|host1:10|host4:42	host2:40|host1:10|host3:41
hadr_remote_host	host2	host3	host2	host3
hadr_remote_svc	40	41	40	41
hadr_remote_inst	dbinst2	dbinst3	dbinst2	dbinst3
hadr_local_host	host1	host2	host3	host4
hadr_local_svc	10	40	41	42
Configured hadr_syncmode	SYNC	SYNC	SUPERASYNC	SUPERASYNC
Effective hadr_syncmode	n/a	SUPERASYNC	n/a	SUPERASYNC