High availability topologies

IBM® provides different high availability solutions for each IBM InfoSphere® Information Server tier.

Increasing availability (implementing a high availability solution) refers to maximizing the percentage of time that the system services are operational. To increase availability, you implement topologies and technologies that introduce redundancy. The aim is to reduce the number of single points of failure or to eliminate them entirely. Single points of failure are elements whose failure causes critical aspects of the system to stop operating.
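To make "percentage of time that the system services are operational" concrete, the following minimal Python sketch converts between uptime, availability, and yearly downtime, and shows how redundancy can raise availability. The function names are illustrative and are not part of any IBM product. The redundancy formula assumes that the redundant copies fail independently and that one working copy is enough, which real deployments only approximate.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Fraction of total time that the services were operational."""
    return uptime_hours / (uptime_hours + downtime_hours)


def annual_downtime_hours(avail: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


def redundant_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant components.

    Assumption: failures are independent and one working copy suffices.
    """
    return 1.0 - (1.0 - single) ** copies


# A component that is available 99% of the time is down about 87.6 hours a year.
print(annual_downtime_hours(0.99))

# Two such components in a redundant pair: about 99.99% under the assumptions above.
print(redundant_availability(0.99, 2))
```

The second result is an upper bound in practice, because shared dependencies such as storage, power, or network can take down both copies at once.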

With the solutions that IBM provides for each tier, you can design many different highly available configurations, from relatively simple setups to complex installations.

The following table lists high availability solutions for each tier:

Table 1. Tiers and high availability solutions
Tier	Solutions
Engine tier	Active-passive topology managed by high availability cluster-management software, such as IBM Tivoli® System Automation for Multiplatforms
Services tier	Either of the following solutions: active-passive topology managed by high availability cluster-management software, such as Tivoli System Automation for Multiplatforms; active-active topology with IBM WebSphere® Application Server clustering
Metadata repository tier	Any of the following solutions: active-passive topology managed by high availability cluster-management software, such as Tivoli System Automation for Multiplatforms; IBM Db2® clustering; Db2 high availability disaster recovery (HADR); Oracle Real Application Clusters (RAC)

In general, the higher the level of overall availability that you want to achieve, the more complex the system that you must design and maintain. High availability systems typically require more hardware. For these reasons, give careful consideration to the level of availability that you require within each software tier in the system. You might want a different level of availability within your development system than you have within your production system.

To install a highly available topology, you must have a solid understanding of network technologies such as protocols, layers, and devices. Successfully deploying highly available topologies, especially ones that include clustering, is a technically complex process that requires a high degree of technical expertise.

The server and software topology that you choose is only part of a high availability solution. You must also introduce redundancy at a number of different levels to reduce or eliminate single points of failure. When determining which topology and implementation to choose, consider the following design factors.

Level of high availability

When considering your topology, evaluate the level of high availability that you require:

Consider the amount of automation that you need. Must the system take care of failover and recovery automatically, or is a system administrator available to intervene?
Consider how your needs might differ depending upon how the system is used, such as whether it is a development system, testing system, or production system. For example, how important is it that the development system or testing system is highly available?
Consider the level of high availability that you require for different software tiers. The tiers are used differently depending upon whether the system is used primarily for development or in production. The tiers are also used differently depending upon the product modules that you have installed.
Important: For products that have components installed on multiple tiers, a failure on one tier typically renders the entire system nonoperational until the problem is fixed or a failover occurs. However, this does not apply to client-only products, unless the client computer is affected.
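Because a failure on one tier typically stops the whole system, the tiers behave like components in series. The following Python sketch estimates overall availability as the product of the per-tier availabilities. The tier figures are hypothetical, and the calculation assumes that tier failures are independent.

```python
import math


def system_availability(tier_availabilities: dict) -> float:
    """Overall availability when every tier must be up (tiers in series).

    Assumption: tier failures are independent of one another.
    """
    return math.prod(tier_availabilities.values())


# Hypothetical per-tier availabilities, for illustration only.
tiers = {
    "engine": 0.995,
    "services": 0.999,
    "metadata repository": 0.999,
}

overall = system_availability(tiers)
weakest = min(tiers.values())

# The overall figure (about 99.30%) is lower than the weakest tier (99.50%).
print(f"overall: {overall:.4%}, weakest tier: {weakest:.4%}")
```

Under this model the system is never more available than its least available tier, so raising the availability of that tier usually gives the largest improvement.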

Performance and throughput requirements

The amount of scalability that each high availability solution offers differs from solution to solution. When you are choosing a topology and architecture, consider how the system might need to scale to support greater performance and throughput requirements in the future. Other considerations include the configuration of your network, your electrical infrastructure, and your backup, restore, and disaster recovery contingencies. For examples of how the solutions differ, see the scenario comparisons.

See Capacity planning to analyze your performance, throughput, and storage needs.

Security requirements

Different topologies lend themselves to different security possibilities. Consider how you want to implement firewalls and other security precautions between the different tiers, between the computers in each tier, and between the system and external data sources and targets. These considerations are especially important for the services tier and the engine tier, which must have fast communication paths with external systems.

For more information about security, see the IBM WebSphere Application Server documentation.

Complexity factors

High availability and scalability add layers of complexity to the installation. Eliminating single points of failure requires implementation of complex redundant hardware and software components.

When you design your system, consider the amount of complexity that your IT staff must support. Is there sufficient in-house expertise to support the system that you design? If your high availability features fail, the amount of time you need to bring the system online again might negate any uptime gains that your features provide.
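The trade-off described above can be sketched with a rough expected-downtime model in Python. All of the numbers are hypothetical, and the model is a simplification: each failure either is handled by automatic failover or falls back to manual recovery, and manual recovery is assumed to take longer on the more complex system.

```python
def expected_annual_downtime(failures_per_year: float,
                             failover_success_rate: float,
                             failover_hours: float,
                             manual_recovery_hours: float) -> float:
    """Expected hours of downtime per year under a simple two-outcome model."""
    per_failure = (failover_success_rate * failover_hours
                   + (1.0 - failover_success_rate) * manual_recovery_hours)
    return failures_per_year * per_failure


# Simple system: no failover, every failure needs a 2-hour manual recovery.
simple = expected_annual_downtime(4, 0.0, 0.0, 2.0)

# Complex high availability system that the staff supports well:
# failover almost always works.
ha_well_supported = expected_annual_downtime(4, 0.95, 0.05, 8.0)

# The same system when failover often fails and manual recovery is slow.
ha_poorly_supported = expected_annual_downtime(4, 0.60, 0.05, 8.0)

print(simple, ha_well_supported, ha_poorly_supported)
```

With these illustrative inputs, the well-supported configuration has less expected downtime than the simple system, while the poorly supported one has more.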

Begin with a relatively simple system that is well within your ability to support. As your IT group becomes familiar with support of the system, implement the high availability features on a staggered schedule until your entire system is in place. This approach also gives the group experience with changing the system and provides them with knowledge for future system scaling.

Maintainability factors

Consider the maintenance costs of the system that you design. Determine which aspects of maintenance to automate and which to leave in the hands of support personnel.

High availability features can simplify system maintenance. For example, in a clustered system, you can take servers offline for certain updates without rendering the entire system nonoperational.

In an organization where specific departments are responsible for different systems, consider isolating the tiers on separate computers so each department can "own" a tier. For example, in an organization where a group of database administrators is responsible for corporate databases, consider isolating the metadata repository tier on computers that are within their control.

Cost

Implementing high availability typically adds to the initial cost of the system. Extra hardware, software, training, and other costs make the initial outlay larger. However, the productivity gains that the highly available system provides might make up for these costs.