Installing an active-active configuration

Install on three or more clusters in an active-active configuration.

This topic describes how to install IBM® Cloud Pak for Network Automation on three clusters in an active-active configuration, although you can install on more than three clusters.

The term cluster 1 here refers to the first cluster that you install. Cluster 1 becomes the SymmetricDS registration node. Clusters 2 and 3, and any other clusters that you install after cluster 1, must register with the registration node.

You must complete the following sets of steps to install in an active-active configuration:

  1. Install the orchestration manager operator and update the custom resource on cluster 1
  2. Add cluster details to the custom resource for cluster 1
  3. Create an instance on cluster 1
  4. Gather data for the installation of clusters 2 and 3
  5. Install the orchestration manager operator on clusters 2 and 3
  6. Add details of cluster 1 to the custom resource of clusters 2 and 3
  7. Update ZooKeeper details, apply changes, and verify on all clusters
  8. Configure Kafka MirrorMaker on all clusters

Install the orchestration manager operator and update the custom resource on cluster 1

For information about how to install the IBM Cloud Pak for Network Automation Orchestration Manager operator on the first cluster, see Installing IBM Cloud Pak for Network Automation Orchestration Manager.

If you want to customize your installation, do it after you install the operator. You must use a custom resource (CR) to specify how the orchestration manager operator is installed and configured in your environment. For information about how to customize your installation by modifying the CR settings, see Install the Cloud Pak with the Red Hat OpenShift CLI. For reference information about CR settings, see Custom resources.

Add cluster details to the custom resource for cluster 1

To enable multicluster services, you must complete the spec.advanced.clusters section of the CR. Add details of cluster 1 to the spec.advanced.clusters.local attribute.

Add the following details of the other clusters to the spec.advanced.clusters.remote attribute:

  • kafka.bootstrapServers: Hostname and port of the Kafka bootstrap server
  • kafka.authentication.passwordSecret.secretName: Name of the Kafka secret that you created for the cluster
  • kafka.authentication.passwordSecret.password: Attribute in the secret that contains the bootstrap server password
  • kafka.tls.trustedCertificates.secretName: Name of the Kafka secret that you created for the cluster
  • kafka.tls.trustedCertificates.certificate: Attribute in the secret that contains the SSL certificate

The following example CR is for cluster 1. At this point in the installation, it includes as many details of clusters 2 and 3 as are available. The values for the advanced.clusters.remote.kafka.bootstrapServers attributes are derived from the namespaces and cluster names of the remote clusters. These values must match the actual Kafka bootstrap server values that you configure in Configure Kafka MirrorMaker on all clusters.

apiVersion: tnc.ibm.com/v1beta1
kind: Orchestration
metadata:
  name: cluster-1
spec:
  license:
    accept: true
  version: 2.7.6
  featureconfig:
    siteplanner: true
    logging: false
  advanced:
    synchronizeVaultChanges: false
    multitenant: false
    podSettings:
      zookeeperlocks:
        replicas: 3
    clusters:
      local:  # Replication config settings for local cluster
        alias: cluster-1
        kafka:
          mm2:
            replicas: 1
      remote:  # Replication config settings for remote clusters
        - alias: cluster-2                
          kafka:
            bootstrapServers: cp4na-o-events-kafka-bootstrap-lifecycle-manager.apps.cluster-2.cp.fyre.ibm.com:443
            authentication:
              passwordSecret:
                secretName: cp4na-o-kafka-cluster-all
                password: password-2
            tls:
              trustedCertificates:
                - secretName: cp4na-o-kafka-cluster-all
                  certificate: ca-cluster-2.crt
          zookeeper:
        - alias: cluster-3
          kafka:
            bootstrapServers: cp4na-o-events-kafka-bootstrap-lifecycle-manager.apps.cluster-3.cp.fyre.ibm.com:443
            authentication:
              passwordSecret:
                secretName: cp4na-o-kafka-cluster-all
                password: password-3
            tls:
              trustedCertificates:
                - secretName: cp4na-o-kafka-cluster-all
                  certificate: ca-cluster-3.crt
          zookeeper: 

Create an instance on cluster 1

On cluster 1, switch to the namespace where you installed the orchestration manager operator:

oc project <cp4na_namespace>

Create an instance of IBM Cloud Pak for Network Automation on cluster 1 by running the following command:

oc create -f cluster-1.yaml

cluster-1.yaml is the name of the CR file.

To verify that the instance is successfully created, run the following command:

oc get orchestration

The instance might take some time to create. When the Status value is Ready, your instance is created.
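
To watch the instance status from the command line instead of rerunning the command, you can add the standard -w (watch) flag. The namespace placeholder is illustrative:

oc get orchestration -n <cp4na_namespace> -w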

Gather data for the installation of clusters 2 and 3

Complete the following steps to gather data from cluster 1 to use during the installation of clusters 2 and 3:

  1. On cluster 1, export the secret cp4na-o-vault-keys to a file by running the following command:
    oc get secret cp4na-o-vault-keys -o yaml > cp4na-o-vault-keys-cluster-1.yaml

    The secret must be in a format similar to the following example:

    apiVersion: v1
    data:
      alm_token: NjA1MjVhODEtY2QyYy0xMWVjLTk0OTctMGE1ODBhZmUxMDE3
      key1: b96e8804bf5f4642f8c351429f2eb9959fe9c514d544c2c39aab4d0e101eafc0
      root_token: cy45NXJQS0xOeWZKaHlzOTFqZDVURnM1a0Q=
    metadata:
      name: cp4na-o-vault-keys
      namespace: <cp4na_namespace>
    kind: Secret
    
  2. Copy the exported secret file from cluster 1 to clusters 2 and 3.
  3. On clusters 2 and 3, delete the following attributes from the exported secret file:
    • All the attributes in the metadata section, except for the name and namespace attributes.
    • The type: Opaque attribute-value pair.
  4. On clusters 2 and 3, in the exported secret file, set the namespace attribute to the namespace where you installed the orchestration manager operator. The edited file is similar to the example that follows these steps.
  5. On clusters 2 and 3, create the secret cp4na-o-vault-keys by running the following command on each cluster:
    oc create -f cp4na-o-vault-keys-cluster-1.yaml
  6. Cluster 1 is the SymmetricDS registration node, which is the node that subsequent clusters register with when you install them. On cluster 1, get the SymmetricDS registration URL from the cluster 1 orchestration status by running the following command:
    oc get orchestration cluster-1 -o jsonpath='{.status.endpoints.symmetricds.registrationURL}'
  7. On cluster 1, get the IP address and port details for the ZooKeeper load-balancing servers by running the following command:
    oc get svc | grep cp4na-o-zookeeper-locking-ext

    Save the details. You use them later to create the ZooKeeper section in the CRs for clusters 2 and 3.

  8. On cluster 1, get the external bootstrap service address for Kafka by running the following command:
    oc get routes cp4na-o-events-kafka-bootstrap -o jsonpath='{.spec.host}' -n <cp4na_namespace>

    Save the details. You use them later to create the Kafka section in the CRs for clusters 2 and 3.
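
After you complete steps 3 and 4, the exported secret file that you create on clusters 2 and 3 is similar to the following sketch, which reuses the sample values from step 1. Your data values differ, and <cp4na_namespace> is the namespace where you installed the orchestration manager operator.

apiVersion: v1
data:
  alm_token: NjA1MjVhODEtY2QyYy0xMWVjLTk0OTctMGE1ODBhZmUxMDE3
  key1: b96e8804bf5f4642f8c351429f2eb9959fe9c514d544c2c39aab4d0e101eafc0
  root_token: cy45NXJQS0xOeWZKaHlzOTFqZDVURnM1a0Q=
metadata:
  name: cp4na-o-vault-keys
  namespace: <cp4na_namespace>
kind: Secret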

Install the orchestration manager operator on clusters 2 and 3

Complete the following steps to install the orchestration manager operator on clusters 2 and 3:

  1. Install the IBM Cloud Pak for Network Automation Orchestration Manager operator on cluster 2. For more information about how to install the orchestration manager operator, see Installing IBM Cloud Pak for Network Automation Orchestration Manager.
  2. Specify how the orchestration manager operator is installed and configured in a CR. For information about how to customize your installation by modifying the CR settings, see Install the Cloud Pak with the Red Hat OpenShift CLI. For reference information about CR settings, see Custom resources.
Tip: After you install the orchestration manager operator on clusters 2 and 3, the pods for the Daytona intent engine enter a CrashLoopBackOff state. You can continue with the installation of the active-active configuration while the pods are in this state. The CrashLoopBackOff state resolves later in this procedure, after you update the CR for cluster 1 with the ZooKeeper details for clusters 2 and 3 and apply those changes.
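
To see which pods are affected while you wait, you can optionally list the pods that report this state. The namespace placeholder is illustrative:

oc get pods -n <cp4na_namespace> | grep CrashLoopBackOff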

Add details of cluster 1 to the custom resource of clusters 2 and 3

To enable multicluster services on the other clusters, complete the following steps. In these steps, you use the information that you gathered in the Gather data for the installation of clusters 2 and 3 procedure.

  1. Enter the SymmetricDS registration URL in the spec.advanced.clusters.local.db.symmetricds.registrationUrl attribute. The first cluster that you installed, cluster 1, is the SymmetricDS registration node. Any other clusters that you install must register with the registration node.
    Tip: You got the SymmetricDS registration URL from cluster 1 in the Gather data for the installation of clusters 2 and 3 steps.
  2. Assign a unique SymmetricDS sequence number to cluster 2 in the spec.advanced.clusters.local.db.sequence.rangeMultiplier attribute. You don't need to set a sequence number for cluster 1, but all other clusters require one. Cluster 2 must have the sequence number 1, cluster 3 must have the sequence number 2, and so on. For example, for a three-cluster deployment, the sequence numbers are as follows:
    • Cluster 1 - No sequence number
    • Cluster 2 - 1
    • Cluster 3 - 2
  3. Add the IP addresses and port details for the ZooKeeper load-balancing services for cluster 1 to the spec.advanced.clusters.remote.zookeeper section of cluster 2.

    Each cluster has three ZooKeeper servers. The following example shows the spec.advanced.clusters.remote.zookeeper section for cluster 2.

          remote:  # Replication configuration settings for remote clusters.
            - alias: cluster-1
              zookeeper:
                server.1: 192.0.2.0:31079:32458
                server.2: 192.0.2.0:32544:32432
                server.3: 192.0.2.0:32579:32658

    The IP address of cluster 1 is 192.0.2.0. The servers on cluster 1 have server IDs 1, 2, and 3. For more information about the numbering pattern for ZooKeeper services, see ZooKeeper server ID numbering.

    The following example from a CR shows the configuration for cluster 2, where cluster 1 is listed as a remote cluster. The values in the CR include the SymmetricDS, ZooKeeper, and Kafka values that you gathered in the Gather data for the installation of clusters 2 and 3 procedure. The example shows how the CR might appear before you add the details for cluster 3, which is why cluster 3 does not appear.

    apiVersion: tnc.ibm.com/v1beta1
    kind: Orchestration
    metadata:
      name: cluster-2
    spec:
      license:
        accept: true
      version: 2.7.6
      featureconfig:
        siteplanner: true
        logging: false
      advanced:
        podSettings:
          zookeeperlocks:
            replicas: 3
        clusters:
          local:  # Replication configuration settings for the local cluster.
            alias: cluster-2
            kafka:
              mm2:
                replicas: 1
            db:
              symmetricds:
                # Registration URL is the URL to contact for registration with SymmetricDS.  
                # Registration URL is empty on the cluster that acts as the SymmetricDS registration server.  
                registrationUrl: https://cp4na-o-symmetricds.cluster-1.cp.fyre.ibm.com/sync/cluster-1
              sequence:
                rangeMultiplier: 1
          remote:  # Replication configuration settings for remote clusters.
            - alias: cluster-1 
              kafka: # The bootstrapServers value is obtained from cluster 1 after installation; the cp4na-o-kafka-cluster-all secret is created in Configure Kafka MirrorMaker on all clusters
                bootstrapServers: cp4na-o-events-kafka-bootstrap-cluster-1.ibm.com:443
                authentication:
                  passwordSecret:
                    secretName: cp4na-o-kafka-cluster-all
                    password: password-1
                tls:
                  trustedCertificates:
                    - secretName: cp4na-o-kafka-cluster-all
                      certificate: ca-cluster-1.crt
              zookeeper:
                server.1: 192.0.2.0:31079:32458
                server.2: 192.0.2.0:32544:32432
                server.3: 192.0.2.0:32579:32658
  4. Repeat steps 1 to 3 for cluster 3.
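
For example, when you repeat the steps for cluster 3, the spec.advanced.clusters.local section of its CR might look like the following sketch. The registration URL is the same URL that you got from cluster 1, the sequence number follows the pattern in step 2, and the remaining values mirror the cluster 2 example; adjust them for your environment.

          local:  # Replication configuration settings for the local cluster.
            alias: cluster-3
            kafka:
              mm2:
                replicas: 1
            db:
              symmetricds:
                registrationUrl: https://cp4na-o-symmetricds.cluster-1.cp.fyre.ibm.com/sync/cluster-1
              sequence:
                rangeMultiplier: 2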

To view sample custom resources for cluster 1 and cluster 3, see Sample custom resources for active-active configuration.

ZooKeeper server ID numbering

IBM Cloud Pak for Network Automation uses ZooKeeper to coordinate services. Typically, each cluster that you install has three ZooKeeper servers. IBM Cloud Pak for Network Automation assigns sequential server ID numbers to the ZooKeeper servers.

For example, if your active-active configuration has three clusters, you might have a total of nine ZooKeeper servers. In that scenario, the servers are assigned sequential IDs as follows:

  • Cluster 1, the SymmetricDS registration node: server.1, server.2, server.3
  • Cluster 2: server.4, server.5, server.6
  • Cluster 3: server.7, server.8, server.9

Follow this server number pattern when you update the spec.advanced.clusters.remote.zookeeper section of the CR for clusters that you install.

For cluster 2, in the CR, the spec.advanced.clusters.remote.zookeeper section must include details of cluster 1 similar to the following example:

          zookeeper:
            server.1: 192.0.2.0:31079:32458
            server.2: 192.0.2.0:32544:32432
            server.3: 192.0.2.0:32579:32658

After you install cluster 2, you must update the spec.advanced.clusters.remote.zookeeper section of the CR for cluster 1 to include details of the second cluster similar to the following example:

          zookeeper:
            server.4: 192.0.2.1:31079:32458
            server.5: 192.0.2.1:32544:32432
            server.6: 192.0.2.1:32579:32658

The IP address of cluster 1 is 192.0.2.0; the IP address of cluster 2 is 192.0.2.1.
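
Putting the numbering pattern together, after clusters 2 and 3 are installed, the remote entries in the cluster 1 CR might contain ZooKeeper details like the following sketch. Only the ZooKeeper parts of the entries are shown, and the IP addresses and ports are illustrative; use the values that you gather from the cp4na-o-zookeeper-locking-ext services on clusters 2 and 3.

      remote:  # Replication config settings for remote clusters
        - alias: cluster-2
          zookeeper:
            server.4: 192.0.2.1:31079:32458
            server.5: 192.0.2.1:32544:32432
            server.6: 192.0.2.1:32579:32658
        - alias: cluster-3
          zookeeper:
            server.7: 192.0.2.2:31079:32458
            server.8: 192.0.2.2:32544:32432
            server.9: 192.0.2.2:32579:32658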

Update ZooKeeper details, apply changes, and verify on all clusters

Follow these steps to update the CR for cluster 1 with the ZooKeeper details for clusters 2 and 3. The steps also describe how to apply the configuration changes, and verify the ZooKeeper details on all clusters.

  1. On cluster 1, update the spec.advanced.clusters.remote.zookeeper sections of the CR with the ZooKeeper details for clusters 2 and 3. That is, add the IP addresses and port details of the remote ZooKeeper servers for clusters 2 and 3 to the CR of cluster 1.

    The CR doesn't need to include details of the local ZooKeeper servers.

  2. On all clusters, apply the configuration changes that you made in the CRs to the deployment. For example, to apply the changes to cluster 1, run the following command:
    oc apply -f cluster-1.yaml

    cluster-1.yaml is the name of the CR file of cluster 1.

    You must also apply the configuration changes that you made in the CRs to clusters 2 and 3.

    Troubleshooting: If the Vault pods fail to install, you might need to run the following commands to grant the postgres role to the app database user:
    oc exec -it cp4na-o-postgresql-1 -- psql -U postgres -d app
    GRANT postgres TO app;
  3. (Optional) If you have large amounts of data, the initial data load between clusters might take a long time to replicate. To avoid long load times, disable all foreign key checks during the initial load by setting the replication role to replica, as described in the following steps:
    1. After PostgreSQL is fully installed and running, but before the IBM Cloud Pak for Network Automation microservices are running, access the cp4na-o-postgresql pod:
      oc exec -it $(oc get cluster cp4na-o-postgresql -o jsonpath --template '{.status.targetPrimary}') -- bash
    2. If the postgres user is set up as a superuser, log in to the pod as the postgres user by running the following command:
      psql --user postgres
    3. Set the replication role to replica by running the following command:
      alter role app SET session_replication_role = 'replica';
    4. Monitor the SymmetricDS pod logs to identify when the initial loading of data is complete. During loading, log messages like the following message appear:
      2022-06-28 15:49:05,623 INFO [cluster-d] [DataLoaderService] [qtp1710814638-15] 148430 data and 15 batches loaded 
      during push request from cluster-a:cluster-a:cluster-a
      When the loading is complete, these log messages stop.
    5. You must now set the replication role back to origin. Repeat substeps 1 and 2 of this step to log in to the cp4na-o-postgresql pod.
    6. Set the replication role to origin by running the following command:
      alter role app SET session_replication_role = 'origin';
  4. On all clusters, verify the server IDs for the ZooKeeper pods. To display the server IDs, run the following commands:
    for pod in 0 1 2
    do
    oc exec -it cp4na-o-zookeeper-locking-$pod -- cat /var/lib/zookeeper/data/myid
    done
    Confirm that the server IDs are the same as the ones that are described in ZooKeeper server ID numbering.
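
For example, on cluster 2 the same loop is expected to return the IDs that the numbering pattern assigns to that cluster:

# Run on cluster 2; the three ZooKeeper pods report server IDs 4, 5, and 6
for pod in 0 1 2
do
oc exec -it cp4na-o-zookeeper-locking-$pod -- cat /var/lib/zookeeper/data/myid
done
# Expected output:
# 4
# 5
# 6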

Configure Kafka MirrorMaker on all clusters

To configure Kafka MirrorMaker (MM2), you must add Kafka bootstrap server and authentication details to the CR for each cluster. In these steps, you use the information that you gathered in the Gather data for the installation of clusters 2 and 3 procedure.

To configure MM2, follow these steps:

  1. On cluster 1, create an environment variable to store the Kafka password in. For example, create an environment variable called PWCLUSTER1.
  2. On cluster 1, get the Kafka password for Kafka MirrorMaker (MM2) and store it in the environment variable. The following example commands store the password in PWCLUSTER1:
    NAMESPACE=cp4na
    PWCLUSTER1=$(oc get secret cp4na-o-kafka-user -o jsonpath='{.data.password}' -n $NAMESPACE | base64 -d)
    echo $PWCLUSTER1
  3. Repeat steps 1 and 2 for cluster 2 and cluster 3 to store the Kafka password in environment variables.
    Tip: Use variable names that identify the cluster. For example, create PWCLUSTER2 and PWCLUSTER3.
  4. Copy the password environment variables for each cluster to the other clusters. That is, copy PWCLUSTER1 from cluster 1 to clusters 2 and 3, copy PWCLUSTER2 from cluster 2 to clusters 1 and 3, and so on.
  5. On cluster 1, get the SSL certificate for the Kafka bootstrap address and save the certificate to a file.
    1. Inspect the Kafka instance by running the following command:
      oc get kafka cp4na-o-events -o yaml
    2. Save the certificate for cp4na-o-events-kafka-bootstrap to a file by running the following command:
      oc get kafka -o jsonpath='{.items[0].status.listeners[2].certificates[0]}' > ca-cluster-1.crt
  6. Repeat step 5 for cluster 2 and cluster 3. Use certificate file names that identify the cluster. For example, create ca-cluster-2.crt and ca-cluster-3.crt.
  7. Copy the certificates for each cluster to the other clusters. That is, copy the certificates for clusters 2 and 3 to cluster 1, copy the certificates for clusters 1 and 3 to cluster 2, and so on.
  8. On cluster 1, use the SSL certificate and the password environment variables to create a secret. For example, the following command creates a secret that is called cp4na-o-kafka-cluster-all from the certificates and the password environment variables:
    oc create secret generic cp4na-o-kafka-cluster-all --from-file ./ca-cluster-1.crt \
    --from-file ./ca-cluster-2.crt --from-file ./ca-cluster-3.crt \
    --from-literal=password-1=$PWCLUSTER1 --from-literal=password-2=$PWCLUSTER2 \
    --from-literal=password-3=$PWCLUSTER3
  9. Repeat step 8 for cluster 2 and cluster 3.
  10. On cluster 1, update the spec.advanced.clusters.remote.kafka sections of the CR to include the Kafka bootstrap server and authentication details for clusters 2 and 3.
  11. On cluster 2, update the spec.advanced.clusters.remote.kafka sections of the CR to include the Kafka bootstrap server and authentication details for clusters 1 and 3.
  12. On cluster 3, update the spec.advanced.clusters.remote.kafka sections of the CR to include the Kafka bootstrap server and authentication details for clusters 1 and 2.
  13. On all clusters, apply the changes that you made to the CRs for the other clusters by running the following command:
    oc apply -f cluster-n-cr.yaml

    cluster-n-cr.yaml is the name of the CR file of the cluster.
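
As an optional check after you create the secrets in steps 8 and 9, you can confirm that each cluster's secret contains the certificate and password keys that the CRs reference. The namespace placeholder is illustrative:

oc describe secret cp4na-o-kafka-cluster-all -n <cp4na_namespace>
# The Data section is expected to list these keys:
#   ca-cluster-1.crt, ca-cluster-2.crt, ca-cluster-3.crt
#   password-1, password-2, password-3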

What to do next

Complete the following tasks after you install IBM Cloud Pak for Network Automation in an active-active configuration:
Configure your users' access control permissions
You must have administrator permissions to configure users. To configure your users and their access permissions, complete one of the following items:
  • If you don't want to use object-based access control (OBAC), configure an LDAP connection for your Red Hat OpenShift cluster. Then map users and user groups from your LDAP directory into the cluster and set the access permissions of users and user groups. For more information, see Configuring an LDAP connection, Mapping users to LDAP roles and groups, and IBM Cloud Pak Managing users.
  • If you don't want to use LDAP, SAML and OIDC are available with IBM Cloud Pak foundational services. For more information, see IBM Cloud Pak foundational services Authentication types.
  • If you want to use OBAC, spend some time planning the structure of your object groups and user groups.

    In IBM Cloud Pak for Network Automation, you can use object groups, user groups, and users to specify different access control settings for your assembly instances, deployment locations, infrastructure keys, network packages, and secret groups. You can also set the permissions that apply to the user groups.

    You can assign several user groups to each object group. You can also assign users to multiple user groups. When you set permissions for the user groups, you must consider how you want to structure access to objects.

    When you implement your object groups and user groups, you can follow these steps:

    1. Decide which types of user your IBM Cloud Pak for Network Automation deployment needs to support. For example, you might need users who can only read all objects, users who can also create and update some types of object, users who can do most administrative tasks, and users who can do all administrative tasks, including creating object groups. You can then define the permissions for each type of user.
    2. Decide how you want to place the objects in object groups and what type of access to give the different types of user to the object groups. These decisions help to determine which user groups you need to create.
    3. Create the user groups that allow the user types the access they require. Two levels of permission apply:
      • Role-based access control (RBAC) permissions, which apply to user groups and roles. Users are assigned user groups and roles. To set RBAC permissions, click Administration > Access control from the navigation menu in the IBM Cloud Pak® console.
      • Object-based access control (OBAC) permissions, which apply to object groups.
      To use an OBAC permission, a user must also have the corresponding RBAC permission. An OBAC permission without its corresponding RBAC permission has no effect.

      For more information about how to create user groups, see Mapping users to LDAP roles and groups.

    4. Create the object groups and assign the user groups that you created to each object group. Before you assign a user group to an object group, the permissions must already be set in the user group. For more information, see Managing object groups.

      Before you create your object groups, consider creating an extra administrative role that can't create object groups. You might want to restrict the permission to create object groups to administrators with full permissions.

      When you plan your object groups and user groups, consider associating each user group with a set of permissions for only one object group. For example, you might create the following groups:
      • An object group that is called North.
      • A user group that is called North_FullAdmin and has full read, write, update, and delete permissions to only the North object group.
      • A second user group that is called North_ReadOnly and has read-only permissions to only the North object group.

      By using this convention, you can easily identify what permissions a user has. To identify the permissions, view the user groups in a user's details.

    5. Add users to the user groups. You can complete this step in an LDAP directory, then configure an LDAP connection to your Red Hat OpenShift cluster. For more information, see Configuring an LDAP connection.
Configure multitenancy
You can enable multitenancy and configure tenant administrators and users after you install IBM Cloud Pak for Network Automation. For more information, see Configuring multitenancy.
Log in to IBM Automation
Log in to the IBM Automation UI to access IBM Cloud Pak for Network Automation features, such as the orchestration and Site Planner components.
See Logging in to the IBM Cloud Pak console.
Deploy resource drivers
Before you can use the orchestration component to automate your lifecycle processes, you must deploy the resource drivers. Resource drivers run lifecycle and operation requests from the orchestration component. See Resource drivers.