The initial setup for each of the examples is as follows:
- a primary database (host1)
- a principal standby (host2)
- two auxiliary standbys (host3 and host4)
All of the databases are called hadr_db. The primary and principal
standby have their synchronization mode set to SYNC and the auxiliary
standbys have theirs set to SUPERASYNC.
A principal standby takes over gracefully (role switch)
The DBA performs a takeover on the principal standby by issuing the
following command on host2:
DB2 TAKEOVER HADR ON DB hadr_db
After the takeover is completed successfully, host2 becomes the new
primary and host1, which is the first entry in the hadr_target_list of
host2 (as shown in Table 1), becomes its principal standby. Their
synchronization mode is SYNC because host2 is configured with an
hadr_syncmode of SYNC. The auxiliary standby targets, host3 and host4,
have their hadr_remote_host and hadr_remote_svc pointing at the old
primary, host1, but are automatically redirected to the new primary,
host2. In this redirection, host3 and host4 persistently update their
hadr_remote_host, hadr_remote_svc, and hadr_remote_inst configuration
parameters. They reconnect to host2 as auxiliary standbys, and host2
tells them to use an effective synchronization mode of SUPERASYNC,
regardless of what they have locally configured for hadr_syncmode.
They do not persistently update their settings for hadr_syncmode.
The configuration for each database is shown in Table 2.
Table 2. Configuration values for each HADR database after a role
switch. The hadr_remote_host, hadr_remote_svc, and hadr_remote_inst
values for Host3 and Host4 (rows 3 to 5 in columns 4 and 5) have been
auto-reconfigured.
Configuration parameter  | Host1                      | Host2                      | Host3                      | Host4
hadr_target_list         | host2:40|host3:41|host4:42 | host1:10|host3:41|host4:42 | host2:40|host1:10|host4:42 | host2:40|host1:10|host3:41
hadr_remote_host         | host2                      | host1                      | host2                      | host2
hadr_remote_svc          | 40                         | 10                         | 40                         | 40
hadr_remote_inst         | dbinst2                    | dbinst1                    | dbinst2                    | dbinst2
hadr_local_host          | host1                      | host2                      | host3                      | host4
hadr_local_svc           | 10                         | 40                         | 41                         | 42
Configured hadr_syncmode | SYNC                       | SYNC                       | SUPERASYNC                 | SUPERASYNC
Effective hadr_syncmode  | SYNC                       | n/a                        | SUPERASYNC                 | SUPERASYNC
Note: A number of values are not updated, for the following reasons:
- Because host2 already has its hadr_remote_host and hadr_remote_svc configuration
parameters pointing at its principal standby, host1, these values
are not updated on host2.
- Because host1 already has its hadr_remote_host and hadr_remote_svc configuration
parameters pointing at the new primary, these values are not updated
on host1.
- Because host1's operational synchronization mode is SYNC and host3
and host4's operational synchronization modes are SUPERASYNC, there
is no change for the effective synchronization mode.
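The redirection described above can be modeled in a few lines. The following is a minimal sketch rather than DB2 code: it assumes that every standby ends up pointing at the new primary and that the primary points at the first entry of its own hadr_target_list. The names TARGET_LISTS, LOCAL, principal_standby, and remote_settings are hypothetical, and the instance name dbinst4 is an assumption because the tables never show it.

```python
# Values copied from the hadr_target_list, hadr_local_svc, and hadr_remote_inst
# rows of Table 2. "dbinst4" is an assumption; no table shows host4's instance.
TARGET_LISTS = {
    "host1": "host2:40|host3:41|host4:42",
    "host2": "host1:10|host3:41|host4:42",
    "host3": "host2:40|host1:10|host4:42",
    "host4": "host2:40|host1:10|host3:41",
}
LOCAL = {
    "host1": ("10", "dbinst1"),
    "host2": ("40", "dbinst2"),
    "host3": ("41", "dbinst3"),
    "host4": ("42", "dbinst4"),
}


def principal_standby(primary):
    """The first hadr_target_list entry of the primary names its principal standby."""
    first_entry = TARGET_LISTS[primary].split("|")[0]
    return first_entry.split(":")[0]


def remote_settings(new_primary):
    """Model the hadr_remote_* values of every database once new_primary takes over."""
    settings = {}
    for host in TARGET_LISTS:
        # The primary points at its principal standby; every standby points at the primary.
        peer = principal_standby(new_primary) if host == new_primary else new_primary
        svc, inst = LOCAL[peer]
        settings[host] = {
            "hadr_remote_host": peer,
            "hadr_remote_svc": svc,
            "hadr_remote_inst": inst,
        }
    return settings


after_role_switch = remote_settings("host2")
for host in sorted(after_role_switch):
    print(host, after_role_switch[host])
```

Calling remote_settings("host2") reproduces the hadr_remote_host, hadr_remote_svc, and hadr_remote_inst rows of Table 2. The model only computes the resulting values; it does not distinguish between values that were rewritten and values that already happened to be correct, which is the distinction the note above draws.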
An auxiliary standby takes over by force (failover)
A widespread power outage in City A results in the primary (host1)
becoming unavailable. Normally, the principal standby (host2), which
is in SYNC mode, would be the best candidate for taking over and
becoming the new primary, but the power outage means that host2 is
temporarily unavailable as well. The DBA queries the two auxiliary
standbys to determine which one has the most log data:
db2pd -hadr -db hadr_db | grep -E 'PRIMARY_LOG_FILE,PAGE,POS|STANDBY_LOG_FILE,PAGE,POS'
The DBA determines that host3 is the most up to date (although it is
still a little behind in log replay) and picks that host as the new
primary by issuing the following command on host3:
DB2 TAKEOVER HADR ON DB hadr_db BY FORCE
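The comparison that precedes this takeover can be scripted. The following is a minimal sketch, assuming that each standby's db2pd output contains a line of the form STANDBY_LOG_FILE,PAGE,POS = <file>, <page>, <position> and that the standby with the largest position has received the most log data. The sample lines and their numbers are invented for illustration, and the names standby_log_position and most_up_to_date are hypothetical.

```python
import re

# Matches e.g. "STANDBY_LOG_FILE,PAGE,POS = S0000012.LOG, 14151, 3685322706".
LOG_LINE = re.compile(r"STANDBY_LOG_FILE,PAGE,POS\s*=\s*(\S+),\s*(\d+),\s*(\d+)")


def standby_log_position(db2pd_output):
    """Extract (log file, page, position) from one standby's db2pd -hadr output."""
    match = LOG_LINE.search(db2pd_output)
    if match is None:
        raise ValueError("no STANDBY_LOG_FILE,PAGE,POS line found")
    log_file, page, pos = match.groups()
    return log_file, int(page), int(pos)


def most_up_to_date(outputs):
    """Return the host whose standby log position is the largest."""
    return max(outputs, key=lambda host: standby_log_position(outputs[host])[2])


# Invented sample output, one entry per auxiliary standby.
SAMPLE = {
    "host3": "STANDBY_LOG_FILE,PAGE,POS = S0000012.LOG, 14151, 3685322706",
    "host4": "STANDBY_LOG_FILE,PAGE,POS = S0000012.LOG, 14098, 3685106350",
}

print(most_up_to_date(SAMPLE))
```

The position is compared as an integer so that the result does not depend on how the log file names sort. With the invented sample above, the sketch selects host3, which matches the choice the DBA makes in this example.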
After the takeover is completed successfully, host3 becomes the new
primary. Meanwhile, host2 becomes available again. host3 informs host2
and host4 that it is now the primary. On host3, the values for
hadr_remote_host, hadr_remote_svc, and hadr_remote_inst are
reconfigured to point to host2, which is the principal standby because
it is the first entry in the hadr_target_list on host3. On host2, the
effective synchronization mode is set to SUPERASYNC because that is
the setting for hadr_syncmode on host3; in addition, the
hadr_remote_host, hadr_remote_svc, and hadr_remote_inst configuration
parameters are persistently updated. host4 is automatically redirected
to the new primary, host3. In this redirection, host4 persistently
updates its hadr_remote_host, hadr_remote_svc, and hadr_remote_inst
configuration parameters. There is no automatic reconfiguration on
host1 until it becomes available again. The configuration for each
database is shown in Table 3.
Table 3. Configuration values for each HADR database after a failover.
The hadr_remote_host, hadr_remote_svc, and hadr_remote_inst values for
Host2, Host3, and Host4 (rows 3 to 5 in columns 3 to 5) have been
auto-reconfigured.
Configuration parameter  | Host1 (unavailable)        | Host2                      | Host3                      | Host4
hadr_target_list         | host2:40|host3:41|host4:42 | host1:10|host3:41|host4:42 | host2:40|host1:10|host4:42 | host2:40|host1:10|host3:41
hadr_remote_host         | host2                      | host3                      | host2                      | host3
hadr_remote_svc          | 40                         | 41                         | 40                         | 41
hadr_remote_inst         | dbinst2                    | dbinst3                    | dbinst2                    | dbinst3
hadr_local_host          | host1                      | host2                      | host3                      | host4
hadr_local_svc           | 10                         | 40                         | 41                         | 42
Configured hadr_syncmode | SYNC                       | SYNC                       | SUPERASYNC                 | SUPERASYNC
Effective hadr_syncmode  | n/a                        | SUPERASYNC                 | n/a                        | SUPERASYNC
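The effective synchronization modes in these tables follow a simple rule that can be written down as a sketch. It assumes what the text describes: the principal standby takes the hadr_syncmode configured on the primary, every auxiliary standby runs in SUPERASYNC regardless of its own setting, and the primary itself has no effective mode. The name effective_syncmodes is hypothetical, and the model treats every database as connected.

```python
# Configured hadr_syncmode per database, as listed in the tables.
CONFIGURED = {
    "host1": "SYNC",
    "host2": "SYNC",
    "host3": "SUPERASYNC",
    "host4": "SUPERASYNC",
}


def effective_syncmodes(primary, primary_target_list, configured):
    """Model the effective hadr_syncmode of each database for a given primary."""
    principal = primary_target_list[0]  # first entry is the principal standby
    modes = {}
    for host in configured:
        if host == primary:
            modes[host] = "n/a"  # a primary has no effective mode
        elif host == principal:
            modes[host] = configured[primary]  # dictated by the primary's setting
        else:
            modes[host] = "SUPERASYNC"  # auxiliary standbys always use SUPERASYNC
    return modes


# After the forced takeover, host3 is the primary; its target list starts with host2.
after_failover = effective_syncmodes("host3", ["host2", "host1", "host4"], CONFIGURED)
print(after_failover)
```

For host3 as the primary, the sketch gives SUPERASYNC for host2 and host4, as in the last row of the table above. It also reports SUPERASYNC for host1, which applies only once host1 is reachable again; while host1 is unavailable, it has no effective mode.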
After a short period of time, host1 becomes available.
The DBA tries to start host1 as a standby, but because host1 has more
logs than were propagated to host3, host1 is rejected as part of the
initial handshake with the new primary. The DBA takes a backup of
the new primary, restores it to host1, and starts HADR on that host:
DB2 BACKUP DB hadr_db
DB2 RESTORE DB hadr_db
DB2 START HADR ON DB hadr_db AS STANDBY
As shown in Table 4, host1 is reconfigured.
Table 4. Configuration values for a reintegrated standby. Various
values for Host1 (column 2) have been auto-reconfigured.
Configuration parameter  | Host1                      | Host2                      | Host3                      | Host4
hadr_target_list         | host2:40|host3:41|host4:42 | host1:10|host3:41|host4:42 | host2:40|host1:10|host4:42 | host2:40|host1:10|host3:41
hadr_remote_host         | host3                      | host3                      | host2                      | host3
hadr_remote_svc          | 41                         | 41                         | 40                         | 41
hadr_remote_inst         | dbinst3                    | dbinst3                    | dbinst2                    | dbinst3
hadr_local_host          | host1                      | host2                      | host3                      | host4
hadr_local_svc           | 10                         | 40                         | 41                         | 42
Configured hadr_syncmode | SYNC                       | SYNC                       | SUPERASYNC                 | SUPERASYNC
Effective hadr_syncmode  | SUPERASYNC                 | SUPERASYNC                 | n/a                        | SUPERASYNC
If the DBA wants to make host1 the primary again, then
all that is required is a failback, which will restore the original
configuration shown in Table 1.
An auxiliary standby takes over by force (failover) in an SA MP environment
This example is similar to the previous one, but HADR has been
deployed with IBM® Tivoli® System Automation for Multiplatforms
(SA MP) to automate failover.
A power failure in City A results in the principal standby (host2)
becoming unavailable. Following that, there is an outage on the
primary (host1). Normally, SA MP, the cluster manager, would
automatically fail over to the principal standby (host2), but the
power outage means that one of the auxiliary standbys needs to be the
takeover target. Failover to auxiliary standbys cannot be automated,
so the DBA must do it manually. However, before doing this, the DBA
needs to ensure that SA MP is disabled so that if host1 or host2
becomes available, there is no possibility of a split-brain situation,
in which more than one database is operating independently as a
primary. To do this, the DBA issues the following command on host1 and
host2 (whenever they become available):
db2haicu -disable
In addition, the DBA needs to keep host1 offline to eliminate the
possibility that the old primary will restart if a client connects to
it.
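The split-brain condition that these precautions guard against can be expressed as a simple check. The following is a minimal sketch, assuming that the db2pd -hadr output collected from each reachable host contains a line of the form HADR_ROLE = PRIMARY or HADR_ROLE = STANDBY. The sample strings are invented, and the names hadr_role, primaries, and is_split_brain are hypothetical.

```python
import re

ROLE_LINE = re.compile(r"HADR_ROLE\s*=\s*(\w+)")


def hadr_role(db2pd_output):
    """Return the HADR role reported in one host's db2pd -hadr output."""
    match = ROLE_LINE.search(db2pd_output)
    if match is None:
        raise ValueError("no HADR_ROLE line found")
    return match.group(1)


def primaries(outputs):
    """List the hosts that currently report themselves as a primary."""
    return sorted(host for host, text in outputs.items() if hadr_role(text) == "PRIMARY")


def is_split_brain(outputs):
    """More than one database acting as a primary is a split-brain situation."""
    return len(primaries(outputs)) > 1


# Invented sample: the old primary (host1) restarted after host3 took over by force.
SAMPLE = {
    "host1": "HADR_ROLE = PRIMARY",
    "host3": "HADR_ROLE = PRIMARY",
    "host4": "HADR_ROLE = STANDBY",
}

print(primaries(SAMPLE), is_split_brain(SAMPLE))
```

A check like this only detects the condition after the fact; disabling the cluster manager and keeping host1 offline, as described above, is what prevents it.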
The DBA queries the two auxiliary standbys to determine which one has
the most log data:
db2pd -hadr -db hadr_db | grep 'STANDBY_LOG_FILE,PAGE,POS'
The DBA determines that host3 is the most up to date and picks that
host as the new primary. Then, the DBA issues the force takeover on
host3:
DB2 TAKEOVER HADR ON DB hadr_db BY FORCE
After the takeover is completed successfully, host3 becomes the new
primary. Meanwhile, host2 becomes available again. host3 informs host2
and host4 that it is now the primary. On host3, the values for
hadr_remote_host, hadr_remote_svc, and hadr_remote_inst are
reconfigured to point to host2, which is the principal standby because
it is the first entry in the hadr_target_list on host3. On host2, the
effective synchronization mode is set to SUPERASYNC because that is
the setting for hadr_syncmode on host3; in addition, the
hadr_remote_host, hadr_remote_svc, and hadr_remote_inst configuration
parameters are persistently updated. host4 is automatically redirected
to the new primary, host3. In this redirection, host4 persistently
updates its hadr_remote_host, hadr_remote_svc, and hadr_remote_inst
configuration parameters. There is no automatic reconfiguration on
host1. The configuration for each database is shown in Table 5.
Table 5. Configuration values for each HADR database after a failover.
The hadr_remote_host, hadr_remote_svc, and hadr_remote_inst values for
Host2, Host3, and Host4 (rows 3 to 5 in columns 3 to 5) have been
auto-reconfigured.
Configuration parameter  | Host1 (unavailable)        | Host2                      | Host3                      | Host4
hadr_target_list         | host2:40|host3:41|host4:42 | host1:10|host3:41|host4:42 | host2:40|host1:10|host4:42 | host2:40|host1:10|host3:41
hadr_remote_host         | host2                      | host3                      | host2                      | host3
hadr_remote_svc          | 40                         | 41                         | 40                         | 41
hadr_remote_inst         | dbinst2                    | dbinst3                    | dbinst2                    | dbinst3
hadr_local_host          | host1                      | host2                      | host3                      | host4
hadr_local_svc           | 10                         | 40                         | 41                         | 42
Configured hadr_syncmode | SYNC                       | SYNC                       | SUPERASYNC                 | SUPERASYNC
Effective hadr_syncmode  | n/a                        | SUPERASYNC                 | n/a                        | SUPERASYNC