Learn about the solutions that are available to prevent planned and unplanned outages.
Comparison of solutions

- High Availability Disaster Recovery (HADR): supported in a hybrid deployment. For more information, see Setting up high availability disaster recovery in a hybrid deployment.
- Geo-redundancy: supported as a technology preview in a full cloud deployment. For more information, see Installing a geo-redundant deployment on Red Hat OpenShift Container Platform.
In a geo-redundant deployment, the primary deployment is located on one Red Hat® OpenShift Container Platform cluster and the secondary deployment is located on a different cluster. The individual Cassandra data centers in each deployment are replicated to synchronize the event and topology data across the clusters. Geo-redundancy is available only as a technology preview in a full cloud deployment. For more information, see Installing a geo-redundant deployment on Red Hat OpenShift Container Platform.
High availability disaster recovery
A high availability disaster recovery (HADR) hybrid deployment is composed of cloud native components on Red Hat OpenShift Container Platform along with an on-premises installation that has multiple IBM Netcool/OMNIbus WebGUI instances.
The on-premises WebGUI or DASH servers can be set up for load balancing by using an HTTP server that distributes the on-premises UI load. If the primary WebGUI instance fails, users are routed seamlessly to the backup WebGUI instance.
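As a rough illustration, the HTTP server in front of two WebGUI instances can be configured with Apache-style load-balancing directives such as the following. This is a sketch only: the host names, ports, routes, and context root are placeholders, not values from the product documentation.

```apache
# Sketch: balance the on-premises UI load across two WebGUI instances.
# All host names, ports, and paths below are placeholders.
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so

<Proxy "balancer://webgui">
    BalancerMember "http://webgui1.example.com:16310" route=node1
    # status=+H marks this member as a hot standby: it receives traffic
    # only when the other member is unavailable.
    BalancerMember "http://webgui2.example.com:16310" route=node2 status=+H
    ProxySet lbmethod=byrequests stickysession=JSESSIONID
</Proxy>

ProxyPass        "/ibm/console" "balancer://webgui/ibm/console"
ProxyPassReverse "/ibm/console" "balancer://webgui/ibm/console"
```

With this shape of configuration, a failed primary member is detected by the proxy and requests flow to the standby member without user intervention.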
For disaster recovery, automatic and manual failover and failback between Netcool Operations Insight deployments are supported. If the primary ObjectServer fails, the secondary ObjectServer takes over. In a HADR hybrid deployment, only cloud native analytics policies are pushed to the backup cluster, through the backup or restore pods. No event or topology data is synchronized across the Cassandra instances, because the Cassandra instances do not communicate with each other.
A HADR hybrid deployment provides the following capabilities:

- Continuous grouping of events between two hybrid deployments.
- Connection of more than one WebGUI instance to the same hybrid deployment.
- Automatic and manual failover and failback between deployments.
- Backup and restore of cloud native analytics policies.
On-premises WebGUI access is through the HTTP load balancer. The HTTP load balancer enables high availability by distributing the workload among the WebGUI instances.
DASH is set up to use single sign-on (SSO) with the ObjectServer as the repository to store the OAuth tokens. Public-private key pairs on each DASH instance confirm the validity of the LTPA tokens.
ObjectServer traffic flows between the on-premises aggregation ObjectServer and the WebGUI instances. The traffic includes UI configuration metadata, authentication, and event data.
The console integration with the on-premises HTTP load balancer is updated by the active deployment. At any one time, either the primary or the backup cloud deployment updates the console integration.
Certificate authority (CA) signed certificates allow communication between the WebGUI instances. These CA signed certificates are loaded into the HTTP load balancer and are also added to the user-certificates configmap. The common UI services load the CA signed certificates from the configmap for the cluster connection to the HTTP load balancer.
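For illustration, a configmap carrying the CA signed certificates might look like the following manifest. Only the configmap name user-certificates comes from this document; the namespace, data key, and certificate body are placeholders.

```yaml
# Sketch of the user-certificates configmap.
# The namespace and key name are assumptions; the certificate body is elided.
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-certificates
  namespace: noi            # placeholder namespace
data:
  ca.crt: |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
```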
HAProxy directs users to the currently active deployment. The cloud Netcool Operations Insight UI components query HAProxy to determine the OAuth token for the associated WebGUI instance.
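The active/standby routing can be sketched in HAProxy configuration terms as follows. The backend names, host names, ports, and health-check path are assumptions for illustration, not documented values.

```
# haproxy.cfg sketch: route UI traffic to the active deployment.
# Host names, ports, and the health-check path are placeholders.
frontend noi_ui
    bind *:443
    default_backend noi_deployments

backend noi_deployments
    option httpchk GET /healthz        # probe path is an assumption
    # The "backup" keyword sends traffic to the second server only
    # when the primary server fails its health checks.
    server primary primary-cluster.example.com:443 check
    server backup  backup-cluster.example.com:443  check backup
```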
The coordinator service in the backup deployment tries to connect, through HAProxy, to the coordinator service in the primary deployment to determine the state of the primary deployment. If the primary coordinator service is not reachable, the backup coordinator service initiates the failover.
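The failover decision described above can be sketched in Python. The health endpoint, URL, and function names are illustrative assumptions; the real coordinator service API is not documented here.

```python
import urllib.error
import urllib.request

def primary_is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Probe the primary coordinator service through the HAProxy address.

    The health-check style endpoint is an assumption for illustration.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        return False

def decide_backup_role(primary_url: str, probe=primary_is_reachable) -> str:
    """Backup coordinator logic: fail over only when the primary is unreachable."""
    return "standby" if probe(primary_url) else "active"
```

With the probe stubbed out, `decide_backup_role(url, probe=lambda u: False)` returns `"active"`, which models the backup deployment taking over when the primary coordinator cannot be reached.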