High availability in Cloud Pak for Business Automation

Learn about the high availability options in IBM Cloud Pak® for Business Automation.

High availability topology

The following image shows a high availability topology for Cloud Pak for Business Automation.

Figure: Cloud Pak for Business Automation high availability topology

To configure your environment for high availability:

  1. Configure your pods to run across different worker nodes, as shown in the scheduling sketch after this list.
  2. Set up storage that includes high availability features, for example, Red Hat OpenShift Data Foundation (previously Red Hat OpenShift Container Storage).
  3. Deploy your control planes and worker nodes across multiple availability zones.
  4. Distribute your virtual machines or hardware across multiple availability zones or locations. For example, you can host your virtual machines across private and public clouds.
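
For step 1, you can use Kubernetes scheduling rules to keep replicas of the same component on different worker nodes. The following snippet is a minimal sketch of a pod anti-affinity rule; the deployment name, labels, and image are illustrative, not actual Cloud Pak resources.

```yaml
# Sketch: spread the replicas of one component across worker nodes by
# requiring at most one pod with the same label per node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-component          # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-component
  template:
    metadata:
      labels:
        app: example-component
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: example-component
              topologyKey: kubernetes.io/hostname   # one pod per node
      containers:
        - name: app
          image: example-component:latest           # illustrative image
```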

The following diagram shows what your deployment might look like when you distribute your resources across multiple availability zones. The example deployment provides automatic recovery, data mirroring, and no data loss. Zones 1-3 in the following diagram are also known as availability zones or fault domains. Each zone has a control plane and worker nodes in Red Hat OpenShift Container Platform, and a storage instance in Red Hat OpenShift Data Foundation.

Figure: Example Cloud Pak for Business Automation high availability topology across multiple availability zones

The deployment in the diagram uses the following synchronous architecture to minimize the impact of a local data center failure:

  • The disaster recovery sites and availability zones can be connected by a metropolitan area network (MAN) or a campus area network (CAN).
  • Each availability zone is mapped to a fault domain.
  • An odd number of availability zones or fault domains is needed to maintain the cluster quorum.
  • Network latency between zones typically does not exceed 5 milliseconds.
  • Red Hat OpenShift Container Platform ensures that:
    • Pods and nodes are scheduled across zones during deployment (see the zone-spread sketch after this list).
    • Data that is stored in each Red Hat OpenShift Data Foundation instance is consistent across each zone.
    • Applications are automatically recovered without disruption across zones.
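
At the zone level, scheduling relies on the standard topology.kubernetes.io/zone node label. The following pod template fragment is a minimal sketch of a topology spread constraint that keeps replicas evenly distributed across the zones; the app label is illustrative.

```yaml
# Sketch: pod spec fragment that spreads replicas evenly across
# availability zones (nodes must carry the topology.kubernetes.io/zone
# label, which cloud providers and installers usually set).
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: example-component   # illustrative label
```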

High availability of EDB Postgres

High availability of EDB Postgres on Red Hat OpenShift involves setting up a replica cluster. A replica cluster is a set of Postgres instances on replica nodes that the EDB Postgres operator creates and manages in Kubernetes. For more information, see Replica clusters.

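As a rough illustration, the following snippet is a minimal sketch of a three-instance cluster definition that uses the EDB Postgres for Kubernetes Cluster resource; the operator keeps one primary and two streaming replicas. The cluster name and storage size are illustrative assumptions.

```yaml
# Sketch: a three-instance Postgres cluster (one primary plus two
# streaming replicas). Name and storage size are illustrative.
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
metadata:
  name: example-postgres
spec:
  instances: 3                     # one primary, two replicas
  primaryUpdateStrategy: unsupervised
  storage:
    size: 10Gi
```
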
When a replica cluster is configured and the primary node fails, one of the replica nodes is promoted to become the new primary node, and the cluster continues to operate with minimal interruption. The EDB Postgres replication documentation covers the following topics:

  • Configuring streaming replication between a primary node and replica nodes for a Postgres cluster.
  • Creating and configuring a Postgres replica.
  • Setting up monitoring and alerting for Postgres replication (see the sketch after this list).
  • Testing failover and promoting a replica to a primary node.
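
For the monitoring item in this list, EDB Postgres for Kubernetes can expose Prometheus metrics for the cluster through a pod monitor. The following fragment is a minimal sketch that assumes the Prometheus operator is installed; whether you enable it depends on your monitoring stack.

```yaml
# Sketch: enable a Prometheus PodMonitor for the Postgres cluster
# (assumes the Prometheus operator is available in the cluster).
spec:
  monitoring:
    enablePodMonitor: true
```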

High availability in Cloud Pak for Business Automation components

Depending on the components that you choose and your architecture, built-in or configurable high availability features are available.

PostgreSQL database

The PostgreSQL database clusters have multiple replicas. For more information about the cluster architecture, see Architecture. One database pod is designated as the primary database service, while the other database pods are in standby status and synchronize with the data in the primary database server. If the primary database becomes unavailable, Kubernetes automatically moves the service to another instance of the database cluster, which becomes the new primary database. Production deployments use 2 replicas by default, where 1 is the primary and the other is the backup. Starter deployments have 1 replica only and do not support high availability.

Figure: High availability architecture for a PostgreSQL database cluster
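
Applications do not need to track which database pod is currently the primary. The EDB Postgres operator maintains stable Kubernetes services for the cluster, and the read-write service always resolves to the current primary. The following fragment is a minimal sketch of pointing an application at that service; the cluster name, and therefore the service name, is an assumption for illustration.

```yaml
# Sketch: connect through the read-write service, which the operator
# keeps targeted at the current primary after a failover.
# "example-postgres" is an assumed cluster name; the operator exposes
# the service as <cluster-name>-rw.
env:
  - name: DB_HOST
    value: example-postgres-rw
  - name: DB_PORT
    value: "5432"
```
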
Business Teams Service

Business Teams Service is part of Cloud Pak foundational services and provides team management. For Business Teams Service high availability, the following features are available:

  • Business Teams Service uses Cloud Native PostgreSQL, which has built-in functions for high availability and disaster recovery.
  • Business Teams Service supports scaling by using predefined deployment profiles and sizes. For more information, see Post-installation.

You can back up and recover Business Teams Service data by using storage providers such as Amazon Simple Storage Service (Amazon S3), Google Cloud, and Microsoft Azure. For more information, see Backing up and restoring.
Business Automation Insights

High availability of Business Automation Insights is covered in the Business Automation Insights documentation.

Differences between high availability in Cloud Pak for Business Automation and on-premises environments

The following tables show the differences between high availability in Cloud Pak for Business Automation and on-premises environments for IBM FileNet® Content Manager and IBM Business Automation Workflow.

Table 1. Differences between high availability in IBM FileNet Content Manager in Cloud Pak for Business Automation and in an on-premises environment

| Feature | For an on-premises WebSphere® Application Server cluster | For an IBM Cloud Pak for Business Automation environment |
| --- | --- | --- |
| Multiple IBM FileNet Content Manager server instances | The server instances are based on the traditional WebSphere Application Server Network Deployment cluster. If the WebSphere Application Server Java virtual machine (JVM) crashes, it is restarted automatically. The node agent manages the crash. | The server instances are based on the Liberty server. Crashed pods are restarted automatically and managed by the Kubernetes kubelet. |
| HTTP load balancer | The load balancer is provided by the IBM HTTP Server or another load balancer. | The load balancer is provided by the router in Red Hat OpenShift or Kubernetes. |
| Session affinity | Session affinity or session persistence is provided by the IBM HTTP Server WebSphere Application Server plug-in or through the load balancer. | Session affinity is provided by Red Hat OpenShift and Kubernetes. |
| Workload manager | Workload Manager provides high availability for Enterprise JavaBeans protocol routing to cluster members. | Not available in this environment. The Enterprise JavaBeans protocol is not used. |
| Logging | Log files are stored on the file system. | Pod and log files can be forwarded to a logging service such as Elasticsearch, Logstash, and Kibana (ELK) or Elasticsearch, Fluentd, and Kibana (EFK). |
| Health check | The node agent and the IBM HTTP Server WebSphere Application Server plug-in monitor the health status of the application server instance. | Red Hat OpenShift or the Kubernetes kubelet process maintains the configured state. The liveness and readiness probes monitor the health status of the application inside the container (see the probe sketch after this table). |
| Scalability | Scale the existing cluster resources by using a dynamic cluster. | Scale the existing cluster resources by using the custom resource. |
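
The health check rows rely on standard Kubernetes probes. The following fragment is a generic sketch of liveness and readiness probes on a container; the endpoint paths, port, and timings are illustrative values, not the settings that the operator actually configures.

```yaml
# Sketch: liveness and readiness probes that the kubelet uses to monitor
# the application inside the container. Paths, port, and timings are
# illustrative.
containers:
  - name: app
    image: example-component:latest
    livenessProbe:
      httpGet:
        path: /healthz
        port: 9443
      initialDelaySeconds: 60
      periodSeconds: 30
    readinessProbe:
      httpGet:
        path: /ready
        port: 9443
      initialDelaySeconds: 30
      periodSeconds: 10
```
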
Table 2. Differences between high availability in IBM Business Automation Workflow in Cloud Pak for Business Automation and in an on-premises environment

| Feature | For an on-premises WebSphere Application Server cluster | For an IBM Cloud Pak for Business Automation environment |
| --- | --- | --- |
| Multiple Business Automation Workflow server instances | The server instances are based on the traditional Business Automation Workflow cluster. If the server crashes, it is restarted automatically. The node agent manages the crash. | The server instances are based on the Liberty server. Crashed pods are restarted automatically and managed by Red Hat OpenShift or Kubernetes. |
| Load balancer | The load balancer is provided by the IBM HTTP Server or another load balancer. | The load balancer is provided by the router in Red Hat OpenShift or Kubernetes. |
| Session affinity | Session affinity is provided by the IBM HTTP Server plug-in. | Session affinity is provided by Red Hat OpenShift or Kubernetes. |
| Transaction service | The transaction log is stored in the database. The high availability manager in the WebSphere Application Server cluster provides peer transaction recovery. | The transaction log is stored in the database. The Liberty transaction service provides peer recovery. |
| Messaging service | The message engine provides the messaging service. | The Java Message Service pod provides the messaging capability. |
| Logging service | The log files are stored on the file system. | The log files are stored on the persistent volume claim (PVC) or persistent volume (PV) that is provided by Red Hat OpenShift or Kubernetes. |
| Health check | The node agent maintains the health check. | Red Hat OpenShift or the Kubernetes kubelet process maintains the configured state. The liveness and readiness probes monitor the health status of the application inside the container. |
| Scalability | Adding nodes to the cluster or extending the deployment environment requires complex steps. | Scaling up and down requires fewer steps (see the scaling sketch after this table). |
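
For the scalability row, scaling in Cloud Pak for Business Automation is driven by editing the custom resource rather than by reconfiguring cluster nodes. The following fragment is only a sketch: the kind and API group match the Cloud Pak for Business Automation custom resource, but the deployment name and the replica field are illustrative assumptions, and the exact parameter path depends on the component and version.

```yaml
# Sketch: scale a component by editing the Cloud Pak for Business
# Automation custom resource. The component section and replica field
# below are hypothetical; check your component documentation for the
# exact parameter.
apiVersion: icp4a.ibm.com/v1
kind: ICP4ACluster
metadata:
  name: icp4adeploy                  # assumed deployment name
spec:
  example_component_configuration:   # hypothetical section
    replica_count: 3                 # hypothetical field
```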