HADR takeover operations with multiple standbys
When an HADR standby database takes over as the primary database in a multiple standby setup, the operation differs in several important ways from a takeover with a single standby.
With HADR, there are two types of takeover: role switch and failover. Role switch, sometimes called graceful takeover or non-forced takeover, can be performed only when the primary is available, and it switches the roles of the primary and the standby. Failover, or forced takeover, can be performed when the primary is not available. It is commonly used in primary failure cases to make the standby the new primary. In a forced takeover, the old primary remains in the primary role, but the standby sends it a message to disable it.
Both types of takeover are supported with multiple standby databases, and any of the standby databases can take over as the primary. Keep in mind, though, that a standby that is not included in the new primary's target list is considered orphaned and cannot connect to the new primary.
After a takeover, the hadr_remote_host, hadr_remote_inst, and hadr_remote_svc configuration parameters are updated automatically:
- On the new primary: They refer to the principal standby (the first database listed in the new primary's target list).
- On the standbys: They refer to the new primary.
When an old primary is reintegrated as a standby, the START HADR AS STANDBY command first converts it to a standby. It can then also be automatically redirected to the new primary if it is listed in the new primary's target list.
Note: Orphaned standbys are not automatically updated in this way. If you want them to join as standbys, you need to ensure that they are in the new primary's target list and that they include the new primary in their own target lists.
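The redirection rules above can be illustrated with a small Python model. This is only a sketch of the behavior described in the text, not Db2 code: the HadrDb class, the apply_takeover function, and the single remote field (standing in for the database's remote host and service settings) are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HadrDb:
    """Hypothetical, simplified model of one HADR database."""
    address: str                                   # "host:port", as used in a target list
    target_list: List[str] = field(default_factory=list)
    remote: Optional[str] = None                   # stands in for the remote host/service settings


def apply_takeover(new_primary: HadrDb, others: List[HadrDb]) -> List[HadrDb]:
    """Update `remote` as described in the text and return the orphaned databases."""
    # On the new primary, the remote settings refer to the principal standby,
    # which is the first entry in the new primary's target list.
    new_primary.remote = new_primary.target_list[0] if new_primary.target_list else None

    orphans = []
    for db in others:
        if db.address in new_primary.target_list:
            # A standby listed by the new primary is redirected to it.
            db.remote = new_primary.address
        else:
            # An unlisted standby is orphaned; its settings are left unchanged.
            orphans.append(db)
    return orphans
```

For example, if the database at hostB:4000 takes over with a target list of hostA:4000 and hostC:4000, a standby at hostC:4000 ends up pointing at hostB:4000, while a standby at hostD:4000 is returned as orphaned and keeps its old setting.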
Role switch
Just as in single standby mode, role switch in multiple standby mode guarantees no data is lost between the old primary and new primary. Other standbys configured in the new primary's hadr_target_list configuration parameter are automatically redirected to the new primary and continue receiving logs.
Failover
Just as with one standby, if a failover with multiple standbys results in any data loss (meaning that the new primary does not have all of the data of the old primary), the log streams of the old and new primaries diverge and the old primary has to be reinitialized. For the other standbys, if a standby received logs from the old primary beyond the point of divergence, it has to be reinitialized. Otherwise, it can connect to the new primary and continue log shipping and replay. For this reason, check the log positions of all of the standbys before you fail over, and choose the standby with the most data as the failover target. You can query this information using the db2pd command or the MON_GET_HADR table function.
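The selection rule can be sketched in Python as follows. The sketch assumes that you have already collected one comparable log position per standby (for example, the standby log position reported by db2pd or MON_GET_HADR) into a dictionary. The function names and the dictionary layout are hypothetical and are not part of any Db2 interface.

```python
from typing import Dict, List


def pick_failover_target(standby_log_pos: Dict[str, int]) -> str:
    """Return the name of the standby that has received the most log data."""
    if not standby_log_pos:
        raise ValueError("no standbys to choose from")
    return max(standby_log_pos, key=standby_log_pos.get)


def must_reinitialize(standby_log_pos: Dict[str, int], new_primary: str) -> List[str]:
    """List the standbys whose received logs extend past the new primary's position."""
    # Assumption: the log streams diverge at the new primary's last received position.
    diverge_point = standby_log_pos[new_primary]
    return sorted(
        name
        for name, pos in standby_log_pos.items()
        if name != new_primary and pos > diverge_point
    )
```

With positions such as {"S1": 1000, "S2": 1500, "S3": 1200}, choosing S2 leaves no other standby to reinitialize, whereas choosing S1 would require reinitializing both S2 and S3.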