Overview

Overview of High-Availability Clustering

High-availability (HA) clustering uses clustering software and special-purpose hardware to minimize system downtime. An HA cluster is a group of computing resources implemented to keep software and hardware computing services highly available. It does so by maintaining redundant groups of resources (such as CPU, disk storage, network connections, and software applications) that provide service when the primary system resources fail. webMethods Broker can run in an HA cluster environment under Windows or UNIX.

In a clustered environment, groups of resources (such as CPU, disk storage, network connections, and software applications) are connected to shared storage hardware and are managed by cluster control software. When the primary system fails, the cluster software switches control to a secondary group of resources.

Without an HA cluster, a failed resource remains unavailable until it is brought back online. Because resources depend on one another, a single failed resource can make the entire computing environment unusable. HA clustering remedies this problem by detecting hardware or software failures and immediately starting (or failing over to) the redundant resources on another node, without requiring administrative intervention. As part of this failover process, the clustering software starts the resources on the redundant node in a predefined order (the resource dependency order) to ensure that the entire node comes up properly.
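
The ordered startup described above can be sketched as a small shell function. This is a minimal illustration only, not the behavior of any particular cluster product: the resource names in RESOURCES and the start_resource function are hypothetical stand-ins for whatever the cluster software actually starts.

```shell
#!/bin/sh
# Hypothetical resource names, listed in the order they must start.
RESOURCES="virtual_ip shared_disk application"

# Hypothetical stand-in for the real per-resource start action.
# Replace the body with the command that actually starts the resource.
start_resource() {
    echo "starting $1"
}

# Start each resource in the predefined order and stop at the first
# failure, so that a dependent resource never starts before the
# resource it depends on.
start_in_order() {
    for res in $RESOURCES; do
        if ! start_resource "$res"; then
            echo "failed to start $res" >&2
            return 1
        fi
    done
    return 0
}
```

Calling start_in_order on the redundant node starts the three resources one after another. If any start action returns a non-zero status, the function stops and reports the failure instead of continuing with the resources that depend on it.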

Virtual IP Addresses

For client applications to access services in an HA cluster in a transparent way, a virtual IP address must be supplied to the client applications. This virtual IP address is usually referred to as the "logical host." This logical host identity is a network address (or host name) and is not tied to a single cluster node.

When failover happens, the cluster control software resolves the virtual IP address to the physical IP address of the active node in the cluster. (A virtual IP address is like any other IP address except that it does not have a fixed host or node to resolve to. At run time it resolves to whichever node the address is currently bound to and reachable on.) The client application should not be affected in any way other than experiencing a brief outage of the services.
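
Because clients see only a brief outage, a client-side script can often ride through a failover by retrying against the logical host. The following is a minimal sketch under that assumption. The retry function and the host name in the usage comment are hypothetical and are not part of webMethods Broker.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or MAX_ATTEMPTS is reached, sleeping
# DELAY_SECONDS between attempts. Returns 0 on success, 1 otherwise.
# Note: max, delay, and attempt are global variables in POSIX sh.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Do not sleep after the final failed attempt.
        [ "$attempt" -le "$max" ] && sleep "$delay"
    done
    return 1
}

# Example (hypothetical logical host name):
#   retry 10 3 ping -c 1 broker-vip.example.com
```

The retry count and delay should be chosen so that their product comfortably exceeds the failover time observed in your own cluster.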

Types of Cluster Configuration

There are three basic HA cluster configurations:

Active/Passive. A two-node configuration with one node being active and one node in a standby state at any given time. This is the most common configuration, and the one that is recommended for use with Broker applications.
Active/Active. A two-node configuration with both nodes being active at the same time. Each node typically runs different sets (or instances) of services. When one node fails, the services on the failed node fail over to the surviving node.
N-to-1. A multi-node configuration with one dedicated spare node. When a node fails, its services fail over to the spare node. After the failed node is recovered, services are restored from the spare node to the original node and the spare node returns to its standby state.

In addition to these basic configurations, there are advanced HA configurations that are beyond the scope of this manual. For more information, see the product documentation for your cluster software.

Running Applications in a High-Availability Cluster

Most applications can run in an HA cluster environment provided that they have:

Defined start, stop, and monitor procedures
The ability to store the application's state information and data on a shared disk
The ability to survive a crash and restart themselves in a known state
The ability to meet license requirements and host name dependencies

The webMethods Broker application fully meets all of these requirements.

The defined start, stop, and monitor procedures are usually provided as Unix shell scripts that must be incorporated into the cluster control software's infrastructure. Some custom coding will be required to enable the cluster control software to invoke these scripts to control the application.
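
The kind of glue code involved can be sketched as a single dispatcher that maps the action requested by the cluster control software to the matching procedure. This is an illustration only: the app_start, app_stop, and app_monitor functions are hypothetical placeholders, and the argument convention (one action word) is an assumption, since each cluster product defines its own calling interface.

```shell
#!/bin/sh
# Hypothetical placeholders: replace each body with the real start,
# stop, and monitor command for the application.
app_start()   { echo "start requested"; }
app_stop()    { echo "stop requested"; }
app_monitor() { echo "monitor requested"; }

# Map the action passed in by the cluster control software to the
# matching procedure. Unknown actions return status 2.
dispatch() {
    case "$1" in
        start)   app_start ;;
        stop)    app_stop ;;
        monitor) app_monitor ;;
        *)       echo "usage: start|stop|monitor" >&2
                 return 2 ;;
    esac
}

# A wrapper script registered with the cluster software would end with:
#   dispatch "$1"
```

Keeping the three procedures behind one entry point means the exit status of the selected procedure is passed straight back to the cluster control software, which typically uses it to decide whether the action succeeded.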

The cluster control software will determine the health of the resources by periodically probing them using monitor scripts. When the cluster control software determines one of the resources in the cluster has failed, it will shut down the remaining active resources on that cluster node and then start the resources on the spare node.
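
A monitor probe is often just a command whose exit status reports health. The sketch below assumes a hypothetical application that records its process ID in a PID file. It does not describe how the webMethods Broker monitor script works, and the is_alive function name is an illustration only.

```shell
#!/bin/sh
# is_alive PIDFILE
# Succeed (status 0) if PIDFILE names a running process, fail otherwise.
# Assumption: the probe runs as the same user as the application (or as
# root), because kill -0 also fails when permission is denied.
is_alive() {
    pidfile=$1
    # A missing or unreadable PID file counts as a failed probe.
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric content.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 checks that the process exists without sending a signal.
    kill -0 "$pid" 2>/dev/null
}
```

The cluster control software would run such a probe at its configured interval and treat a non-zero exit status as a failed resource.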

webMethods Broker provides three Korn shell scripts plus one configuration file for starting, stopping, and monitoring a Broker Server. The scripts and the configuration file are available in the webMethods Broker_directory/scripts/generic_cluster directory.

Running the webMethods Broker Application in a High-Availability Cluster

webMethods Broker runs as a service in a cluster. Within a cluster, there can be only a single instance of a Broker Server running at any given time. The spare Broker Server is stopped.

When a client makes a request to a Broker, the Broker handles the request in much the same way as in an unclustered environment. In a clustered environment, however, the Broker writes the client information to a shared disk instead of a private data store. The diagram below illustrates the flow of documents through a typical clustered environment.

If a Broker fails, the cluster software starts the spare Broker, and subsequent requests for the session are redirected to that Broker once it is active and running, as shown in the diagram below.

Setting Up webMethods Broker to Run in a High-Availability Cluster Environment

There are three task categories when setting up webMethods Broker to run in an HA cluster environment:

System and network administration tasks performed by the user's system administrators.
Cluster hardware and software (for example, Veritas, HP ServiceGuard, IBM HACMP) installation performed by cluster installation consultants.
webMethods Broker and HA script installation performed by a webMethods administrator.

The table below summarizes the steps for configuring webMethods Broker to run in an HA cluster environment. The table columns User's SysAd, Cluster Vendor, and webMethods Administrator indicate which party is responsible for each task: the user's system administrator, the cluster vendor's installation consultant, or the webMethods administrator, respectively.

Important: You must perform steps 7, 8, and 9 for each node.

Step No.	Task	Comments	User's SysAd	Cluster Vendor	webMethods Administrator
1	Review this book	Read this book to gain a better understanding of the installation and configuration process.			X
2	Install HA cluster environment			X
3	Configure HA cluster environment including the shared disk storage			X
4	Administer the HA cluster environment so it is ready for software installation		X
5	Configure the external network connection to the HA cluster and create the virtual host (virtual IP address) for the HA cluster		X
6	Test the basic HA installation to ensure it functions properly		X	X
7a	Install and configure webMethods Broker on the cluster node	See Install and Configure the webMethods Broker Software for instructions.		X	X
7b	Test that webMethods Broker runs on the cluster node	See Verify That webMethods Broker Is Running for instructions.			X
8	Update the webMethods Broker Monitor configuration file	See Configure webMethods Broker Monitor for instructions.			X
9a	Configure the webMethods Broker HA scripts.	See Configure the High-Availability Cluster Scripts for instructions.			X
9b	Update the HA script configuration file with appropriate system parameters from the HA cluster.	See Edit the wment-defs.sh File for instructions.			X
9c	Test the HA scripts to ensure they function properly.	See Verify the webMethods Scripts Work Correctly for instructions.			X
10	Incorporate each node's HA scripts into the cluster control software.			X	X
11	Test the entire HA installation with the webMethods Broker application running to ensure it functions (fails over) properly.	You can verify this in two ways: (1) a manual failover using the cluster's vendor-specific commands; (2) an automatic failover triggered by provoking or simulating a webMethods Broker crash.	X	X	X