Db2 database clustering

IBM® offers several high availability Db2® configurations for the metadata repository database.

IBM InfoSphere® Information Server supports these configurations:

  • Db2 clustering by using high availability clustering software
  • Db2 with high availability disaster recovery (HADR)

In either configuration, you can use Db2 automatic client reroute to enable IBM InfoSphere Information Server processes to reconnect to a standby node when a failover occurs.

Database clustering

To provide a high availability Db2 configuration, you can create a Db2 cluster across computers. In this configuration, the metadata repository database is shared between nodes in the cluster. If a failover occurs, another node in the cluster provides Db2 functionality. To provide high availability, set up your cluster in an active-passive configuration with a single active Db2 instance on one computer and one or more passive instances on the other computers. If the Db2 instance encounters a problem or fails, a passive instance can take over.
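For illustration, the following sketch shows the kind of steps that the cluster management software automates when a passive node takes over. The instance name (db2inst1), database name (XMETA), and storage device and mount point are assumptions; substitute the values for your environment.

    # On the node that is taking over (hypothetical device and mount point):
    mount /dev/db2vg/db2lv /db2data                     # attach the shared storage that holds the database
    su - db2inst1 -c "db2start"                         # start the Db2 instance on this node
    su - db2inst1 -c "db2 activate database XMETA"      # bring the metadata repository database online

In practice, the HA cluster management software runs these actions for you; the sketch only illustrates what a failover involves.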

To manage this configuration, you can choose from several high availability cluster management software products. This software maintains a "heartbeat" signal between the nodes in the cluster. If the heartbeat fails on the active node, the software initiates failover to another node.

With this configuration, Db2 failover is automatic, but it might take several minutes while the new instance acquires resources, replays committed transactions from the transaction logs, and rolls back uncommitted transactions. To minimize interruption and manual intervention, configure Db2 automatic client reroute. This function causes other components in the IBM InfoSphere Information Server instance, such as IBM WebSphere® Application Server, to automatically reconnect to the new Db2 instance.
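As a sketch, you enable automatic client reroute by registering the alternate (standby) server with the Db2 server that hosts the metadata repository database. The database name XMETA, the host name, and the port are assumptions.

    # Run as the Db2 instance owner on the active node.
    # Clients that connect to XMETA receive the alternate server information
    # and retry it automatically if the connection is lost.
    db2 update alternate server for database XMETA using hostname repo-standby.example.com port 50000

The alternate server information is propagated to clients, including the JDBC data sources that IBM WebSphere Application Server uses, at connection time.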

This configuration does not provide redundancy for the database itself. Instead, it provides high availability for database client processes and smooths the reconnection to the new node. To provide redundancy for the database itself, implement high availability disaster recovery (HADR).

The following diagram shows a topology that includes a clustered metadata repository tier. In this case, HA cluster management software on the metadata repository computers monitors the processes that are specific to Db2. The software also monitors the health of the computer hardware and the network.

Figure 1. Topology with a clustered metadata repository tier
This diagram shows a topology that includes a metadata repository tier where Db2 clustering is implemented across two computers. In the figure, dotted lines show communication between computers. At the top of the diagram, there are three client workstation computers labeled "client tier". HTTP clients within the workstations communicate with the load balancer. The load balancer communicates with two web servers. A backup load balancer communicates with the load balancer. Below these components is a line that indicates the firewall. The web servers each communicate through the firewall with two application server nodes. These nodes are grouped in a cluster. EJB clients within the client workstations communicate with the application server nodes. A separate Deployment Manager computer also communicates with the application server nodes. The nodes plus the Deployment Manager computer make up the services tier. In the lower left corner there are two computers that share a storage area network (SAN). These components make up the engine tier. One of the computers is labeled "active" and the other is labeled "passive". Each computer is running HA cluster management software. There is a heartbeat between the two computers. Each node in the services tier communicates with the "active" computer. In the lower right corner there are two computers labeled "metadata repository computer". One is labeled "active" and the other is labeled "standby". They share a Db2 database. There is a heartbeat between the two computers. Each node in the services tier communicates with the "active" computer.

High availability disaster recovery (HADR)

To provide high availability at the Db2 database level, you can create a Db2 high availability disaster recovery (HADR) configuration. In this configuration, a complete, separate copy of the database is maintained on a standby node at a local or remote location. The primary Db2 database processes transactions. It uses internal HADR processes to replicate transaction logs to the standby node, where the logs are stored. A process at the standby node then replays the logged transactions directly to the standby database. The two copies are maintained in synchronization or near-synchronization with each other.
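The following sketch outlines how an HADR pair for the metadata repository database might be configured. The database name (XMETA), host names, service ports, instance name, backup path, and synchronization mode are assumptions; choose values that match your environment and recovery objectives.

    # 1. On the primary: enable log archiving, then back up the database to seed the standby.
    db2 update db cfg for XMETA using LOGARCHMETH1 LOGRETAIN
    db2 backup database XMETA to /backups

    # 2. On the standby: restore the backup (the database stays in rollforward-pending state).
    db2 restore database XMETA from /backups

    # 3. On each node: set the HADR configuration parameters (primary values shown).
    db2 update db cfg for XMETA using HADR_LOCAL_HOST repo-primary.example.com HADR_LOCAL_SVC 55001
    db2 update db cfg for XMETA using HADR_REMOTE_HOST repo-standby.example.com HADR_REMOTE_SVC 55002
    db2 update db cfg for XMETA using HADR_REMOTE_INST db2inst1 HADR_SYNCMODE NEARSYNC

    # 4. Start HADR on the standby first, then on the primary.
    db2 start hadr on database XMETA as standby
    db2 start hadr on database XMETA as primary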

Db2 processes normally access only the primary copy of the database. If the primary copy fails, an administrator triggers the standby copy to take over the transactional workload. You can set up automatic client reroute to make this failover nearly transparent. When the primary copy becomes unavailable, automatic client reroute first retries the connection to the primary copy. If that reconnection fails, automatic client reroute determines whether the standby copy is available. If the standby copy is available, automatic client reroute reroutes the application server connection to it. In-flight transactions are rolled back and then reissued against the standby copy. You can also configure automatic client reroute to connect to an alternate standby copy if the first standby copy is unavailable. Failover is quick (normally 10 seconds to 15 seconds if the servers are on the same LAN segment).
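The takeover itself is a single command that you run on the standby node. A minimal sketch, again assuming a database named XMETA:

    # Planned role switch: the primary and standby copies swap roles cleanly.
    db2 takeover hadr on database XMETA

    # Unplanned failover: force the standby to become the primary when the
    # original primary is down or unreachable.
    db2 takeover hadr on database XMETA by force

    # Check the HADR role, state, and log position of the local database copy.
    db2pd -db XMETA -hadr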

If the unavailable database becomes available again, it is automatically reintegrated as the new standby database and is resynchronized.
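If reintegration does not happen automatically in your environment (for example, after a takeover by force), an administrator can restart the returning copy in the standby role; the database name XMETA is again an assumption.

    # On the former primary node, after it becomes available again.
    db2 start hadr on database XMETA as standby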

The following diagram shows an IBM InfoSphere Information Server implementation in which the metadata repository tier is set up in an HADR configuration.

Figure 2. HADR configuration
This diagram shows a topology that includes a metadata repository tier where Db2 HADR is implemented across two computers. In the figure, dotted lines show communication between computers. At the top of the diagram, there are three client workstation computers labeled "client tier". The client workstations communicate with the load balancer. The load balancer communicates with two web servers. A backup load balancer communicates with the load balancer. Below these components is a line that indicates the firewall. The web servers each communicate through the firewall with two application server nodes. These nodes are grouped in a cluster. A separate Deployment Manager computer communicates with the application server nodes. The nodes plus the Deployment Manager computer make up the services tier. In the lower left corner there are two computers that share a storage area network (SAN). These components make up the engine tier. One of the computers is labeled "active" and the other is labeled "passive". Each computer is running HA cluster management software. There is a heartbeat between the two computers. Each node in the services tier communicates with the "active" computer. In the lower right corner there are two computers labeled "metadata repository computer". One is labeled "primary" and the other is labeled "standby". Each computer hosts its own copy of the Db2 database. There is a heartbeat between the two computers. Each node in the services tier communicates with the "primary" computer. Transaction log records are shown being copied from the primary database and replayed on the standby database.