Choice of SUPERASYNC mode for disaster recovery using DB2 HADR
Introduction and background
HADR is a DB2 feature that provides high availability and disaster recovery via data replication. When enabled, database logs from the Primary database are shipped to the standby database in real time. The standby database continuously replays the received logs to stay in sync with the Primary database.
Starting with DB2 V9.5 Fix Pack 8 and DB2 V9.7 Fix Pack 5, you can specify SUPERASYNC as the hadr_syncmode. In this mode, the Primary is never blocked by log shipping to the standby.
This article explains the purpose of SUPERASYNC mode, how you can set up HADR in this mode, as well as the different transition states of standby in this mode. It includes use cases that suit implementation of SUPERASYNC mode including the pros and cons of using it.
Purpose of SUPERASYNC mode
In the other synchronization modes, the Primary can be blocked when the standby is slow to receive or replay logs, for example because of network hiccups or a lack of resources on the standby. SUPERASYNC mode was introduced to prevent this back pressure (slowed or blocked transactions) on the Primary.
How HADR works in SUPERASYNC mode
In SUPERASYNC mode, HADR EDU does the log shipping in the background, and does not interfere with the code path of the transaction, which means log shipping is out of the loop for committing transactions. Hence, it will not block the Primary from running the transactions.
The HADR pair never enters Peer state or Disconnected Peer state. The HADR state moves from Local catchup to Remote catchup and then stays in Remote catchup. HADR always ships logs from the Primary's log files on disk or from archived logs. It never enters Peer state, in which logs are shipped from the Primary database log buffer and the Primary database log writer can be slowed down.
This gives the best performance when compared to the other sync modes as shown in Figure 1.
Figure 1. Working of HADR in SUPERASYNC
- Primary writes the log records to the log files of the primary database.
- It then commits the transaction without waiting for log replication to the standby.
Setting up HADR pair in SUPERASYNC mode
To set up the HADR pair in SUPERASYNC mode, update the HADR_SYNCMODE db cfg parameter with SUPERASYNC as shown in Listings 1 through 3.
Listing 1. Update HADR_SYNCMODE
$ db2 update db cfg for hadrdb using HADR_SYNCMODE SUPERASYNC
DB20000I The UPDATE DATABASE CONFIGURATION command completed successfully.
SQL1363W One or more of the parameters submitted for immediate modification were not changed dynamically. For these configuration parameters, the database must be shutdown and reactivated before the configuration parameter changes become effective.
Listing 2. Deactivate database
$ db2 deactivate db hadrdb
DB20000I The DEACTIVATE DATABASE command completed successfully.
Listing 3. Activate database
$ db2 activate db hadrdb
DB20000I The ACTIVATE DATABASE command completed successfully.
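The three listings above can be chained in a small helper. This is only a sketch: the function name set_superasync and the DRY_RUN switch are hypothetical, and the exit-code check assumes the usual DB2 command line processor convention in which codes below 4 mean success or a warning (such as SQL1363W) and 4 or higher means an error.

```shell
# set_superasync DB_NAME
# Prints (DRY_RUN=1, the default here) or runs (DRY_RUN=0) the steps from
# Listings 1 through 3. DRY_RUN is a hypothetical switch of this sketch.
set_superasync() {
    db_name="$1"
    for cmd in \
        "db2 update db cfg for $db_name using HADR_SYNCMODE SUPERASYNC" \
        "db2 deactivate db $db_name" \
        "db2 activate db $db_name"
    do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"                      # only show what would be executed
        else
            rc=0
            $cmd || rc=$?
            # Assumption: CLP exit codes below 4 are success or warning.
            [ "$rc" -lt 4 ] || return "$rc"  # stop at the first real error
        fi
    done
}
```

Calling set_superasync hadrdb prints the three commands for review; setting DRY_RUN=0 would execute them in order.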
The state of a Primary or standby can be monitored using the MON_GET_HADR table function (which may not be available in V9.7) or the db2pd command with the -hadr option.
$ db2pd -db dbname -hadr
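For scripted checks, individual fields can be pulled out of the db2pd output. The helper below is a minimal sketch: the name hadr_field is hypothetical, and it assumes the "NAME = value" layout that db2pd -hadr prints in recent releases. Older releases may print a table instead, in which case this parser would not apply.

```shell
# hadr_field NAME
# Reads `db2pd -db <dbname> -hadr` output on stdin and prints the value
# of the first field whose name matches NAME exactly.
# Assumption: each field is printed as "NAME = value" with optional padding.
hadr_field() {
    awk -F' = ' -v name="$1" '
        { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }   # trim the field name
        key == name {
            val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", val)             # trim the value
            print val
            exit
        }
    '
}
```

A possible use is db2pd -db hadrdb -hadr | hadr_field HADR_STATE, which prints nothing when the field is absent.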
HADR state transition in SUPERASYNC mode
As shown in Figure 2, when a standby database is started, it enters Local catchup state and reads the log files that are available in the local log path. Once it reads the local log files, standby will enter into Remote catchup pending state and wait for the Primary connection. Once the Primary database is connected to a standby database, they will stay in Remote catchup state and will never get into Peer state to avoid back pressure on Primary.
Figure 2. States of the standby database in SUPERASYNC mode
When Primary and standby are connected in SUPERASYNC mode, the state of the standby will be RemoteCatchup, as shown in Figure 3.
Figure 3. Standby state as Remote catchup
When the standby becomes unavailable, Primary state will be Disconnected as shown in Figure 4.
Figure 4. Primary state as disconnected when standby is unavailable
And when the standby loses connection to Primary, the standby state will be RemoteCatchupPending as shown in Figure 5.
Figure 5. Standby state as RemoteCatchupPending
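The state rules described in this section can be captured in a small check, for example as part of a monitoring script. This is a sketch only: the function name superasync_state_check is hypothetical, and the spelling normalization (dropping underscores and spaces, ignoring case) is an assumption made so that spellings such as RemoteCatchup and REMOTE_CATCHUP are treated alike.

```shell
# superasync_state_check STATE
# Prints "expected" for states the text lists for SUPERASYNC mode,
# "unexpected" for the Peer states that should never occur in this mode,
# and "unknown" for anything else.
superasync_state_check() {
    # Normalize: drop underscores and spaces, then lowercase.
    state=$(printf '%s' "$1" | tr -d '_ ' | tr '[:upper:]' '[:lower:]')
    case "$state" in
        localcatchup|remotecatchuppending|remotecatchup|disconnected)
            echo "expected" ;;
        peer|disconnectedpeer)
            echo "unexpected" ;;
        *)
            echo "unknown" ;;
    esac
}
```

A result of "unexpected" would suggest that the pair is not actually running in SUPERASYNC mode, for example because the configuration change has not taken effect yet.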
DB2 disaster recovery scenarios where SUPERASYNC is configured
The following section describes the use case scenarios where SUPERASYNC can be configured as hadr_syncmode, and how it helps in disaster recovery with better Primary performance.
DB2 disaster recovery using HADR and HA using cluster service
The following scenario is applicable for DB2 V9.7.
Let's say Primary is set up in Location 1 for high availability between two machines (M1 and M2) using TSA or HACMP cluster service. The standby is set up in Location 2 for disaster recovery using HADR replication process in SUPERASYNC mode as shown in Figure 6.
Figure 6. DR using SUPERASYNC
User applications connect and execute transactions on the active primary server, for example M1. The logs are shipped to the standby server from M1. Since the standby is set up in SUPERASYNC mode, there is no back pressure on Primary due to the standby's distance from Primary (high network latency) or network hiccups. Hence, transaction performance on Primary is not degraded by the remote standby.
In case M1 on the Primary side (at Location 1) goes down, the other node which is set up for high availability such as Machine M2 in Location 1 is brought up automatically by the cluster service.
After this HA failover, log shipping will be performed from M2 to standby via HADR replication process. If Location 1 goes down (both M1 and M2 are down), standby can be brought up as Primary. Through this kind of setup, you can achieve best performance with high availability of database as well as recovery of database in case of disaster.
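The text does not name the command used to bring up the standby as Primary. A common way to do this is the TAKEOVER HADR command, issued on the standby; the sketch below assumes that command and only prints it so it can be reviewed first. The function name takeover_command is hypothetical. The BY FORCE variant is the one that would apply when Location 1 is unreachable, and in SUPERASYNC mode any log data not yet received by the standby would be lost.

```shell
# takeover_command DB_NAME [force]
# Prints the TAKEOVER HADR command to run on the standby.
# Without "force" it prints the graceful form, which needs a reachable Primary.
takeover_command() {
    db_name="$1"
    if [ "${2:-}" = "force" ]; then
        echo "db2 takeover hadr on db $db_name by force"
    else
        echo "db2 takeover hadr on db $db_name"
    fi
}
```

For example, takeover_command hadrdb force prints the forced takeover command for the database used in the earlier listings.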
Multiple standby for DR using HADR in SUPERASYNC
The following scenario is applicable for DB2 V10.1.
HADR with multiple standbys, a new feature in DB2 V10.1, lets you have up to three standby databases. One of these databases is designated as the Principal standby (which supports all the HADR sync modes), and the others are auxiliary standby databases (which support only SUPERASYNC mode). The Principal standby can be deployed in the same location as Primary. Auxiliary standbys can be deployed in a distant location, which can provide protection from a disaster at the Primary and Principal standby locations.
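In DB2 V10.1 the standbys of a database are listed in the hadr_target_list configuration parameter as host:port entries separated by the | character, with the first entry taken as the Principal standby. The helper below is a sketch of how such a value could be assembled; the function name build_target_list and the host names and port in the usage note are hypothetical.

```shell
# build_target_list PRINCIPAL [AUX1 [AUX2]]
# Joins one to three host:port entries with "|" in the order given.
# The first argument is meant to be the Principal standby.
build_target_list() {
    if [ "$#" -lt 1 ] || [ "$#" -gt 3 ]; then
        echo "expected one to three host:port entries" >&2
        return 1
    fi
    list=""
    for entry in "$@"; do
        list="${list:+$list|}$entry"    # add "|" only between entries
    done
    echo "$list"
}
```

The result could then be passed to the configuration update, for example db2 update db cfg for hadrdb using HADR_TARGET_LIST "$(build_target_list principal.example.com:4000 aux1.example.com:4000)". The quotes keep the shell from interpreting the | character.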
The following are the two possible scenarios to set up multiple standby for disaster recovery using HADR in SUPERASYNC mode.
- Scenario 1: DB2 high availability and disaster recovery using HADR
- Scenario 2: DB2 high availability and multiple DR using HADR
Scenario 1: DB2 high availability and disaster recovery using HADR
In this scenario, high availability has been set up between Primary and Principal standby at Location 1 using TSA cluster service. Auxiliary standby is set up for DR at Location 2, as shown in Figure 7.
Figure 7. HA and DR using multiple standby
User applications connect and execute transactions on the Primary server. Transaction logs will be shipped from the Primary to the Principal and auxiliary standby servers. Since auxiliary standby is set up in SUPERASYNC, there will not be any back pressure on Primary due to its distance from Primary (high network latency) or network hiccups.
If there is an outage on the Primary, the Principal standby will be brought up as Primary automatically using the cluster service (TSA), and the new Primary then ships the logs to the standby at Location 2.
If any disaster occurs at Location 1 (both Primary and Principal standby are down), then the standby at Location 2 can be brought up as Primary. Since you used the SUPERASYNC mode at the disaster recovery site, you can achieve the best performance at Primary by avoiding back pressure due to distant location or network delay.
Scenario 2: DB2 high availability and multiple DR using HADR
In this scenario, high availability has been set up between Primary and Principal standby at Location 1 using TSA cluster service. For disaster recovery, Auxiliary standby1 is set up at Location 2, and Auxiliary standby2 is set up at Location 3, as shown in Figure 8.
Figure 8. HA and multiple DR using multiple standby
User applications connect and execute transactions on the Primary server. Transaction logs are shipped from the Primary to the Principal standby and also to both auxiliary standby servers. Since auxiliary standbys are set up in SUPERASYNC, these will not affect the activity on Primary due to the distance or the network delay.
If there is an outage on the Primary, the Principal standby will be brought up as Primary automatically using the cluster service (TSA), and the logs will then be shipped from the new Primary to the other standbys.
If any disaster occurs at Location 1, one of the auxiliary standbys can be brought up as Primary. Applications then connect to this new Primary, and logs are sent from the new Primary to the remaining standby. Since you used the SUPERASYNC mode at disaster recovery sites, you can achieve the best performance at Primary by avoiding back pressure due to distant location or network delay.
In this article, you have learned the benefits and drawbacks of using SUPERASYNC mode, which are as follows.
- The transaction response time in this mode is shorter than in any other synchronization mode, but this mode also has the highest probability of transaction loss if a failure occurs on the Primary system. This mode is useful when you do not want your transactions on Primary to be blocked or to experience prolonged response times due to network-related problems.
- Transaction commits on the Primary database are not affected by the HADR network or the standby server. As a result, the log gap between the Primary database and the standby database can grow continuously. A large log gap results in a long graceful takeover time, and all data in the log gap is lost if a disaster occurs on the Primary system. Hence it is important to monitor the log gap using the hadr_log_gap monitor element or the db2pd -hadr command. If you observe that the log gap is not acceptable, investigate the network performance or the relative speed of the standby database and take corrective actions to control the log gap.
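The log gap monitoring recommended above could be automated along the following lines. This is a sketch under stated assumptions: the function name log_gap_exceeds is hypothetical, the threshold is whatever your recovery objectives allow, and the parser assumes that db2pd -hadr prints the gap as a line of the form "HADR_LOG_GAP(bytes) = value", which may differ between releases.

```shell
# log_gap_exceeds THRESHOLD_BYTES
# Reads `db2pd -db <dbname> -hadr` output on stdin.
# Returns 0 if the reported log gap is larger than the threshold,
# 1 if it is not, and 2 if no log gap field was found.
log_gap_exceeds() {
    threshold="$1"
    gap=$(awk -F' = ' '
        $1 ~ /HADR_LOG_GAP\(bytes\)/ {
            gsub(/[^0-9]/, "", $2)      # keep only the digits of the value
            print $2
            exit
        }')
    [ -n "$gap" ] || return 2           # field missing from the output
    [ "$gap" -gt "$threshold" ]
}
```

For example, db2pd -db hadrdb -hadr | log_gap_exceeds 104857600 succeeds when the gap is above 100 MB, which a cron job could use to raise an alert.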