Architecture Overview

Learn more about the High Availability (HA) controller +n compute topology and architecture.

The HA controller +n compute topology uses the Pacemaker cluster stack to provide cluster resource management. As a result, the topology achieves high availability of the IBM® Cloud Manager with OpenStack controller services. The Pacemaker cluster stack is designed to recover from single points of failure, such as service, resource, and node failures. However, cascading failures or multiple points of failure (for example, losing most of the HA controller nodes) can require manual intervention to recover the cloud environment. Pacemaker interacts with the controller services by using resource agents (RAs) that are provided by IBM Cloud Manager with OpenStack.
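
Pacemaker drives each controller service through its resource agent: it calls the agent with an action such as start, stop, or monitor and interprets a standard OCF exit code. The following Python sketch illustrates only that OCF contract; the service name, health check, and start/stop commands are placeholders and are not the resource agents that IBM Cloud Manager with OpenStack supplies.

```python
#!/usr/bin/env python
"""Minimal sketch of the OCF resource-agent contract that Pacemaker expects.

Not one of the IBM-supplied resource agents; the service name and the
start/stop/health-check commands are placeholders.
"""
import subprocess
import sys

# Standard OCF exit codes that Pacemaker understands.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7

SERVICE = "example-api"  # placeholder controller service name


def is_running():
    # Placeholder health check; a real agent probes the service itself.
    return subprocess.call(["pgrep", "-f", SERVICE]) == 0


def main(action):
    # Pacemaker passes the requested action as the first argument.
    if action == "monitor":
        return OCF_SUCCESS if is_running() else OCF_NOT_RUNNING
    if action == "start":
        rc = subprocess.call(["service", SERVICE, "start"])
        return OCF_SUCCESS if rc == 0 else OCF_ERR_GENERIC
    if action == "stop":
        rc = subprocess.call(["service", SERVICE, "stop"])
        return OCF_SUCCESS if rc == 0 else OCF_ERR_GENERIC
    return OCF_ERR_GENERIC


if __name__ == "__main__":
    sys.exit(main(sys.argv[1] if len(sys.argv) > 1 else "monitor"))
```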

Many of the IBM Cloud Manager with OpenStack controller services have an active/active configuration (that is, a cloned resource in Pacemaker terminology), which means that the service runs on every HA controller node. However, some controller services have an active/passive configuration (that is, a primitive resource in Pacemaker terminology) and run on only one HA controller node at a time. If the active node fails, Pacemaker fails over the service to another available HA controller node.
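
The placement difference between the two configurations can be pictured with a small toy model (not product code): a cloned resource keeps an instance on every online HA controller node, while a primitive resource runs on exactly one node and is moved by Pacemaker when that node fails. The node names below are hypothetical.

```python
"""Toy illustration of how Pacemaker places cloned (active/active) and
primitive (active/passive) resources on a three-node HA controller cluster."""

nodes = ["controller1", "controller2", "controller3"]  # hypothetical nodes

# A cloned (active/active) resource runs on every online node.
clone_instances = set(nodes)

# A primitive (active/passive) resource runs on exactly one node at a time.
primitive_node = nodes[0]


def fail_node(failed):
    """Pacemaker's reaction, in miniature: the clone loses one instance,
    while the primitive is restarted on another available node."""
    global primitive_node
    nodes.remove(failed)
    clone_instances.discard(failed)
    if primitive_node == failed:
        primitive_node = nodes[0]  # fail over to a surviving node


fail_node("controller1")
print(sorted(clone_instances))  # ['controller2', 'controller3']
print(primitive_node)           # 'controller2'
```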

The HA controller +n compute topology uses DB2® for the database controller service. This service has an active/standby configuration (that is, a master resource in Pacemaker terminology). In an active/standby configuration, the service runs on each of the HA controller nodes (up to four nodes, a limit that is imposed by the DB2 HADR design), but only one node is the master and handles the database requests from the IBM Cloud Manager with OpenStack services. This configuration, together with the built-in DB2 HADR support, provides high availability for the database controller service.
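
For illustration, the current HADR role of the database can be checked from Python with the ibm_db driver and the MON_GET_HADR table function that recent DB2 releases provide. The database name, address, port, and credentials in this sketch are placeholders, not product defaults.

```python
"""Sketch: query the current DB2 HADR role with the ibm_db Python driver."""
import ibm_db

dsn = (
    "DATABASE=example;"      # placeholder database name
    "HOSTNAME=192.0.2.10;"   # placeholder address of the database service
    "PORT=50000;"
    "PROTOCOL=TCPIP;"
    "UID=db2inst1;"
    "PWD=secret;"
)

conn = ibm_db.connect(dsn, "", "")
# MON_GET_HADR reports the HADR role (PRIMARY or STANDBY) and the
# replication state for the connected member.
stmt = ibm_db.exec_immediate(
    conn, "SELECT HADR_ROLE, HADR_STATE FROM TABLE(MON_GET_HADR(NULL))"
)
row = ibm_db.fetch_assoc(stmt)
while row:
    print(row["HADR_ROLE"], row["HADR_STATE"])
    row = ibm_db.fetch_assoc(stmt)
ibm_db.close(conn)
```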

RabbitMQ is the messaging controller service that is used by the HA controller +n compute topology. This service has an active/active configuration and uses the clustering support that is built into RabbitMQ to achieve high availability for the messaging controller service.
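
As an illustration of client-side behavior against the RabbitMQ cluster, the pika client library can be given the connection parameters of several cluster nodes and tries them in order, so a connection still succeeds when one node is down. The host names, port, credentials, and queue name below are placeholders.

```python
"""Sketch: connect to the RabbitMQ cluster by trying each node in turn."""
import pika

credentials = pika.PlainCredentials("guest", "secret")  # placeholder account
node_params = [
    pika.ConnectionParameters(host=host, port=5672, credentials=credentials)
    for host in ("controller1", "controller2", "controller3")
]

# pika accepts a sequence of connection parameters and tries each entry in
# order, which provides client-side failover across the cluster nodes.
connection = pika.BlockingConnection(node_params)
channel = connection.channel()
channel.queue_declare(queue="demo", durable=True)  # placeholder queue
connection.close()
```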

The HA controller +n compute topology requires a virtual IP address (VIP) for high availability and load balancing of the IBM Cloud Manager with OpenStack controller services. The VIP is managed by Pacemaker, and only one HA controller node is assigned the VIP at a time. The node that is assigned the VIP also runs HAProxy and the IBM DB2 HADR master. HAProxy provides load balancing across the HA controller nodes for the IBM Cloud Manager with OpenStack API and UI controller services. In addition, HAProxy assists with high availability by preventing requests from being sent to services or nodes that are not available. DB2 is accessed through the VIP and not through HAProxy. RabbitMQ is not accessed through the VIP or HAProxy; instead, clients are configured to use the list of RabbitMQ cluster nodes, and failover between these nodes is handled automatically by the clients. Users and applications access the cloud environment by using the VIP, which, along with HAProxy, routes requests to one of the available HA controller nodes.
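
For example, a client reaches the OpenStack API controller services only through the VIP, and HAProxy then forwards the request to an available HA controller node. The following sketch uses the openstacksdk library; the VIP address, port, credentials, and domain and project names are placeholders.

```python
"""Sketch: reach the OpenStack API controller services through the VIP."""
import openstack

conn = openstack.connect(
    auth_url="https://192.0.2.10:5000/v3",  # Keystone reached through the VIP
    username="admin",
    password="secret",
    project_name="admin",
    user_domain_name="Default",
    project_domain_name="Default",
)

# Each request goes to the VIP; HAProxy forwards it to an available
# HA controller node.
for server in conn.compute.servers():
    print(server.name)
```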

Optionally, the HA controller +n compute topology can define a virtual public IP address (VPIP) for high availability and load balancing of the IBM Cloud Manager with OpenStack controller public services. The VPIP is managed by Pacemaker, and only one HA controller node is assigned the VPIP at a time. The node that is assigned the VPIP also runs HAProxy and the IBM DB2 HADR master. HAProxy provides load balancing across the HA controller nodes for the IBM Cloud Manager with OpenStack API and UI controller public services. In addition, HAProxy assists with high availability by preventing requests from being sent to services or nodes that are not available.
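
When a VPIP is defined, the public endpoints in the service catalog typically resolve to the VPIP while the internal endpoints remain on the VIP, and a client selects between them by interface. The sketch below assumes that endpoint layout; the addresses, credentials, and domain and project names are placeholders.

```python
"""Sketch: compare the public and internal endpoints in the service catalog."""
import openstack

conn = openstack.connect(
    auth_url="https://203.0.113.10:5000/v3",  # assumed public Keystone endpoint (VPIP)
    username="demo",
    password="secret",
    project_name="demo",
    user_domain_name="Default",
    project_domain_name="Default",
)

# Ask the service catalog for the compute endpoint on each interface; with a
# VPIP defined, the two URLs are expected to point at different addresses.
for interface in ("public", "internal"):
    url = conn.session.get_endpoint(service_type="compute", interface=interface)
    print(interface, url)
```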

For more information on Pacemaker, DB2, RabbitMQ, or HAProxy, see the following links: