Setting up a multicluster environment

The IBM® Sterling Intelligent Promising multicluster setup enables data replication across multiple data centers to ensure increased availability and consistency.

Remember: It is not mandatory to implement a multicluster setup. However, a multicluster setup is recommended for production environments.
Context

The Sterling Intelligent Promising relies primarily on two databases: Cassandra and Elasticsearch. Depending on its type, data is stored in one or both databases.

  • Inventory data (supply, demand, reservation): Initially stored in Cassandra, then synchronized with Elasticsearch.
  • Inventory audit data: Exclusively stored in Elasticsearch.
  • Rules configuration data: Exclusively stored in Elasticsearch.
  • Catalog data: Exclusively stored in Elasticsearch.
Need for data replication in a multicluster setup
In a multicluster setup, two data centers are typically available. If an issue arises with one data center, the API traffic can be diverted to the other data center to ensure business continuity and maintain availability of services. To achieve this, it is crucial to keep the databases synchronized between both data centers.
Cassandra data replication
Cassandra natively supports a multiple data center (DC) cluster topology. Ensure that cross-DC data replication is enabled for Cassandra; it is essential for maintaining data consistency and availability across data centers.
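For example, cross-DC replication in Cassandra is typically enabled per keyspace through NetworkTopologyStrategy. The following CQL is an illustrative sketch only; the keyspace name, data center names, and replication factors are placeholders, not values mandated by Sterling Intelligent Promising.

```sql
-- Illustrative only: replicate a keyspace across two data centers.
-- 'sip_keyspace', 'dc1', and 'dc2' are placeholder names.
ALTER KEYSPACE sip_keyspace
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'dc1': 3,   -- three replicas in data center 1
  'dc2': 3    -- three replicas in data center 2
};
```

After you change the replication settings, run a full repair (for example, nodetool repair --full) so that existing data is streamed to the newly added data center.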
Elasticsearch data replication
Elasticsearch can natively replicate data across different Elasticsearch clusters. However, if you set up an independent Elasticsearch cluster within each Kubernetes cluster, the two clusters operate independently. The Sterling Intelligent Promising provides a mechanism for near real-time data replication between these two clusters.

Data ingestion into Elasticsearch is driven through Kafka topics. The Sterling Intelligent Promising supports replicating Kafka data, and hence the data that reaches Elasticsearch, by using Kafka MirrorMaker. Setting up mirrored topics keeps topic data consistent across clusters and provides near real-time replication of Elasticsearch data across multiple data centers. Even if one data center goes down, the capabilities that depend on Elasticsearch remain operational through the replicated data in the other data center, which maintains business continuity and provides a seamless user experience.
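As an illustration, topic mirroring can be set up with Kafka MirrorMaker 2. The following sketch assumes the Strimzi operator's KafkaMirrorMaker2 resource, which is one way to run MirrorMaker 2 on Kubernetes, not the only one; the cluster aliases, bootstrap addresses, and topics pattern are placeholders for your environment.

```yaml
# Illustrative sketch (assumes the Strimzi operator is installed):
# mirror SIP topics from the dc1 Kafka cluster into the dc2 cluster.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  name: sip-mirrormaker
  namespace: test
spec:
  replicas: 1
  connectCluster: dc2            # the target cluster runs the connectors
  clusters:
    - alias: dc1
      bootstrapServers: kafka-dc1.example.com:9092
    - alias: dc2
      bootstrapServers: kafka-dc2.example.com:9092
  mirrors:
    - sourceCluster: dc1
      targetCluster: dc2
      topicsPattern: "sip-.*"    # mirror only topics with the sip- prefix
      sourceConnector:
        config:
          replication.factor: 3
```

Note that MirrorMaker 2's default replication policy prepends the source cluster alias (here, dc1.) to mirrored topic names, which produces names such as dc1.sip-catalog-update-attribute in the destination cluster.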

About this task

To implement a multicluster setup, configure the mirrored Kafka topics and update the required custom resources to replicate Elasticsearch data.

Procedure

  • Mirror the Kafka topics for each service as demonstrated in the following sections.
  • Configure the custom resources as shown in the following steps.
    1. To enable Elasticsearch data replication, set multiDCEnabled and replicationEnabled to true.
      apiVersion: apps.sip.ibm.com/v1beta1
      kind: SIPEnvironment
      metadata:
        name: sip
        namespace: test
      spec:
        multiDCEnabled: true
        externalServices:
          elasticSearch:
            replicationEnabled: true
      Remember: If the cloud provider that you chose for deployment offers Elasticsearch data replication and you choose to use it, set replicationEnabled in elasticSearch to false. In that case, you also do not need to create mirrored topics for Elasticsearch data replication, and the application does not replicate the Elasticsearch data itself.
      apiVersion: apps.sip.ibm.com/v1beta1
      kind: SIPEnvironment
      metadata:
        name: sip
        namespace: test
      spec:
        multiDCEnabled: true
        externalServices:
          elasticSearch:
            replicationEnabled: false
    2. To mirror Kafka topics, define the following Kafka prefixes for the topics that are used across different data centers.
      1. topicPrefix: The prefix that is used for Kafka topics in the current data center. Use the same topicPrefix for all clusters.
      2. mirrorTopicPrefix: The prefix that Kafka MirrorMaker uses to label replicated topics in destination clusters. This attribute helps you identify mirrored topics because the specified prefix is prepended to their names during replication.
        Example: If topicPrefix is sip and mirrorTopicPrefix is dc1, a topic that is named sip-catalog-update-attribute in the source cluster is replicated as dc1.sip-catalog-update-attribute in the destination cluster. This clear labeling helps in identifying the topics that are mirrored and from which source data centers they originate.
        Note: You must set up Kafka MirrorMaker yourself to handle the actual replication process. The Sterling Intelligent Promising reads data from the mirrored topics by using the specified naming convention.
      3. crossDCTopicPrefix: An optional prefix used specifically for Kafka topics in other data centers. If not provided, topicPrefix is used for mirrored topics.
    3. Configure the optional environment parameter to specify the environment context for topic names. If you specify it, use the same value for all clusters.
      The following example shows the Kafka prefixes and the environment parameter configured in SIPEnvironment.
      apiVersion: apps.sip.ibm.com/v1beta1
      kind: SIPEnvironment
      metadata:
        name: sip
        namespace: test
      spec:
        environment: ""
        externalServices:
          kafka:
            topicPrefix: ""
            mirrorTopicPrefix: ""
            crossDCTopicPrefix: ""
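To make the prefixes concrete, the following sketch fills in the illustrative values used in the mirrorTopicPrefix example above (sip and dc1); the environment value dev is a hypothetical placeholder, not a default.

```yaml
# Illustrative values only, not defaults.
apiVersion: apps.sip.ibm.com/v1beta1
kind: SIPEnvironment
metadata:
  name: sip
  namespace: test
spec:
  environment: dev               # placeholder; must match across clusters
  externalServices:
    kafka:
      topicPrefix: sip           # local topics, e.g. sip-catalog-update-attribute
      mirrorTopicPrefix: dc1     # mirrored topics, e.g. dc1.sip-catalog-update-attribute
      crossDCTopicPrefix: sip    # optional; defaults to topicPrefix
```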