Managing Single Points of Failure in a Clustered Installation

Failover and clustering are not the same thing. In general, Sterling B2B Integrator should be scalable and reliable, with minimal downtime. Clustering is one way to meet this objective.

To achieve high availability in a system, you must:
  1. Consider potential failure points.
  2. Establish the level of risk for each failure: how often it is likely to occur and how long recovery would take. Most components cannot be unavailable for very long.
  3. Weigh this level of risk against the cost of eliminating the potential failure. This cost may be that of a redundant backup measure that reduces the time needed to recover from the loss. For example, if information stored in a database can be recovered by restoring from a backup that is eight hours old and then re-running any updates, this may be an acceptable risk compared with the cost of doing backups every two to four hours, or of maintaining a mirrored image of the database.
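
The trade-off in step 3 can be sketched as a small calculation. The following Python example is only an illustration, not part of Sterling B2B Integrator: the RecoveryOption type, the function names, and every cost, recovery time, and failure rate are hypothetical placeholders that you would replace with figures from your own environment. It adds the yearly cost of each recovery option to the expected cost of the downtime that the option still leaves, and picks the lowest total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryOption:
    """One way of protecting the database (all figures are hypothetical)."""
    name: str
    annual_cost: float      # assumed yearly cost of operating this option
    recovery_hours: float   # assumed time to restore and re-run updates


def expected_annual_cost(option, failures_per_year, cost_per_downtime_hour):
    """Cost of the option plus the expected cost of the downtime it leaves."""
    downtime_cost = (
        failures_per_year * option.recovery_hours * cost_per_downtime_hour
    )
    return option.annual_cost + downtime_cost


def cheapest_option(options, failures_per_year, cost_per_downtime_hour):
    """Return the option with the lowest expected yearly cost."""
    return min(
        options,
        key=lambda o: expected_annual_cost(
            o, failures_per_year, cost_per_downtime_hour
        ),
    )


# Illustrative figures only; none of these numbers come from IBM.
OPTIONS = [
    RecoveryOption("backup every 8 hours", annual_cost=5_000, recovery_hours=6.0),
    RecoveryOption("backup every 2 hours", annual_cost=20_000, recovery_hours=3.0),
    RecoveryOption("mirrored database", annual_cost=80_000, recovery_hours=0.25),
]

if __name__ == "__main__":
    # Rare failures and cheap downtime versus frequent failures and costly downtime.
    for rate, hourly in [(0.5, 1_000), (2.0, 50_000)]:
        best = cheapest_option(OPTIONS, rate, hourly)
        print(f"{rate} failures/year at {hourly}/hour: {best.name}")
```

With these placeholder figures, the eight-hour backup is the cheapest choice when failures are rare and downtime is inexpensive, while the mirrored database becomes the cheapest choice when failures are frequent and downtime is costly. The model is deliberately simple; it ignores factors such as data that cannot be re-created and contractual availability commitments.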

To build a system without a single point of failure, you must consider all levels of the environment. If any level is unprotected, then the whole system is not protected. Consider all of the following levels:
  1. System hardware
  2. Database server
  3. External applications/systems that Sterling B2B Integrator is integrating
  4. Sterling B2B Integrator
  5. Application server
  6. Web server
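
The rule that one unprotected level leaves the whole system unprotected can be expressed as a simple check. The Python sketch below is a hypothetical planning aid, not an IBM tool: the function names and the inventory dictionary are assumptions, and it treats a level as protected only when at least two independent instances of it exist.

```python
# The six levels listed above, in the same order.
LEVELS = [
    "System hardware",
    "Database server",
    "External applications/systems",
    "Sterling B2B Integrator",
    "Application server",
    "Web server",
]


def single_points_of_failure(instances_per_level):
    """Return the levels that have fewer than two independent instances.

    A level missing from the inventory is counted as having no instances.
    """
    return [
        level for level in LEVELS
        if instances_per_level.get(level, 0) < 2
    ]


def is_protected(instances_per_level):
    """The system is protected only if every level is replicated."""
    return not single_points_of_failure(instances_per_level)


if __name__ == "__main__":
    # Hypothetical inventory: everything is replicated except the database.
    inventory = {level: 2 for level in LEVELS}
    inventory["Database server"] = 1
    print(single_points_of_failure(inventory))
    print(is_protected(inventory))
```

In this example inventory, the single database server is reported as the remaining single point of failure, so the system as a whole is not considered protected even though the other five levels are replicated.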

To protect a level, you must replicate it, and clustering affects how these replicas are handled. Replication is often used not only to back up the primary component, but also to offload processing to the backup component while the primary component is still running.

While you must consider all of these issues, the IBM® services staff can provide expertise in configuring and setting up Sterling B2B Integrator to avoid single points of failure for the application server and Sterling B2B Integrator, along with some general guidance on avoiding single points of failure in other areas. Even with this help, however, you must still provide your own expertise in eliminating single points of failure for external applications, the database server, and system hardware.