High availability disaster recovery (HADR) in Db2 pureScale environments

Db2 high availability disaster recovery (HADR) is supported in Db2 pureScale environments. A Db2 pureScale environment already provides continuous availability; HADR adds disaster recovery (DR) protection. With a second copy of the data at the standby site, you are protected from total failure at the primary site.

Configuring and managing HADR in a Db2 pureScale environment is very similar to configuring and managing HADR in other environments. You create a standby database by restoring a backup image or split mirror of the primary database, set the HADR configuration parameters, and start HADR first on the standby and then on the primary. The standby can quickly take over as the primary in the event of a role switch. The administration commands are the same as those for HADR in other environments. For monitoring, however, you can use only the db2pd command and the MON_GET_HADR table function. Other monitor interfaces, such as snapshot, do not report HADR information in Db2 pureScale environments.

There are, however, some important differences for HADR in Db2 pureScale environments. An HADR pair is made up of a primary cluster and a standby cluster. Each cluster is made up of multiple members and at least one cluster caching facility, and the member topology must be the same in the two clusters.

The member from which you issue the START HADR command, on both the primary and the standby, is designated as the preferred replay member. When the database operates as a standby, only one member (the replay member) is activated. The database selects the preferred replay member as the replay member if the Db2 instance is online on that member; otherwise, another member is selected. The replay member replays all of the logs, and the other members are not activated.

An HADR TCP connection is established between each member on the primary and the current replay member on the standby. Each member on the primary ships its logs to the standby replay member through its TCP connection. The HADR standby merges and replays the log streams. If the standby cannot connect to a particular member, A, on the primary (because of network problems or because the member is inactive), another member, B, on the primary that can connect to the standby sends the logs for member A to the standby. This process is known as assisted remote catchup.
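The replay member selection and assisted remote catchup rules described above can be illustrated with a small model. This is only a sketch in Python, not Db2 code: the Member class and the functions select_replay_member and plan_log_shipping are hypothetical names invented for illustration. Two details are assumptions, because the text does not say how Db2 decides them: which member is chosen when the preferred replay member is offline (the lowest-numbered online member here), and which connected member assists an unreachable one (the first connected member here).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Member:
    """Hypothetical model of one member in a Db2 pureScale cluster."""
    member_id: int
    online: bool = True             # is the Db2 instance online on this member?
    can_reach_standby: bool = True  # meaningful on the primary cluster only


def select_replay_member(standby: List[Member], preferred_id: int) -> Optional[int]:
    """Pick the single member that replays logs on the standby cluster."""
    online_ids = [m.member_id for m in standby if m.online]
    if preferred_id in online_ids:
        return preferred_id
    # Assumption: fall back to the lowest-numbered online member.
    # The actual fallback rule is not stated in the text.
    return min(online_ids) if online_ids else None


def plan_log_shipping(primary: List[Member]) -> Dict[int, Optional[int]]:
    """Map each primary log stream (keyed by member id) to the member that ships it."""
    connected = [m.member_id for m in primary if m.online and m.can_reach_standby]
    plan: Dict[int, Optional[int]] = {}
    for m in primary:
        if m.member_id in connected:
            plan[m.member_id] = m.member_id  # member ships its own logs
        elif connected:
            # Assisted remote catchup: a connected member ships the logs
            # on behalf of the unreachable or inactive member.
            # Assumption: the first connected member is the assistant.
            plan[m.member_id] = connected[0]
        else:
            plan[m.member_id] = None         # no member can reach the standby
    return plan


if __name__ == "__main__":
    standby = [Member(0, online=False), Member(1), Member(2)]
    print(select_replay_member(standby, preferred_id=0))  # prints 1

    primary = [Member(0), Member(1, can_reach_standby=False), Member(2, online=False)]
    print(plan_log_shipping(primary))  # prints {0: 0, 1: 0, 2: 0}
```

In the example, member 0 is the preferred replay member but is offline on the standby, so another member takes the replay role. On the primary, member 1 has a network problem and member 2 is inactive, so member 0 ships all three log streams, which corresponds to assisted remote catchup for members 1 and 2.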