Installing an active-active configuration on AWS

Install on three or more self-managed clusters in an active-active configuration on Amazon Web Services (AWS) for high availability (HA).

About this task

Important:

To install an active-active configuration on Amazon Web Services (AWS), you install the Cloud Pak on three or more self-managed clusters. The first cluster where you install becomes the primary cluster, and the other clusters where you install become secondary clusters.

In this task, the term cluster 1 refers to the primary cluster, and the term cluster <n> refers to a secondary cluster. After you complete the installation steps that refer to cluster 1 on the primary cluster, make sure that you complete the steps that refer to cluster <n> on each secondary cluster.

For example, after you install on cluster 1, you might want to install on two secondary clusters that are called cluster 2 and cluster 3. In this case, complete the installation steps that refer to cluster <n> on cluster 2, then complete them again on cluster 3.

When you configure an active-active configuration of the Cloud Pak, you must ensure that the clusters can communicate with each other and synchronize their data. To enable this communication, you update the custom resource (CR) on each cluster with information about the synchronization services that run on the other clusters.

The following services are used to synchronize data between clusters:

Kafka
Data is synchronized by using Kafka MirrorMaker (MM2) on a Red Hat® OpenShift® Container Platform route.
PostgreSQL
Data is synchronized by using SymmetricDS on a Red Hat OpenShift Container Platform route. After you install on cluster 1, that cluster becomes the SymmetricDS registration node. Any other clusters where you install must register with the registration node.
ZooKeeper
Kubernetes load-balancing services are used to synchronize data and balance workloads across all clusters. These services enable ZooKeeper to work as a single ensemble across all clusters.
Vault
Vault data is stored in PostgreSQL, but the decryption keys must be common across all clusters to enable reading of the Vault data.
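
To enable this synchronization, the Kafka bootstrap route and the ZooKeeper load-balancing services on each cluster must be reachable from the other clusters. The following commands are a minimal sketch for listing those endpoints on a cluster; they assume the default resource names that are used later in this task and a <cp4na_namespace> placeholder for the installation namespace.

# List the external Kafka bootstrap route that MirrorMaker uses.
oc get routes -n <cp4na_namespace> | grep cp4na-o-events-kafka-bootstrap

# List the external ZooKeeper load-balancing services.
oc get svc -n <cp4na_namespace> | grep cp4na-o-zookeeper-locking-ext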

Before you begin

Complete the steps in Preparing to install an active-active configuration on AWS.

Installing the Cloud Pak in an active-active configuration on AWS

Complete the following sets of steps to install IBM Cloud Pak for Network Automation in an active-active configuration on AWS:

  1. On the primary cluster, that is, cluster 1, complete these sets of steps:
    1. Install the orchestration manager operator on cluster 1.
    2. Create the CR for cluster 1.
    3. Optional: Customize the CR for cluster 1.
    4. Create an instance on cluster 1.
    5. Gather data for the installation on the secondary clusters.
  2. On each secondary cluster, that is, each instance of cluster <n>, complete these sets of steps:
    1. Install the orchestration manager operator on cluster <n>.
    2. Create Kafka secrets for remote clusters on cluster <n>.
    3. Create the CR for cluster <n>.
    4. Optional: Customize the CR for cluster <n>.
    5. Create an instance on cluster <n>.
    6. Gather the Kafka and ZooKeeper details for cluster <n>.
    7. Add the Kafka and ZooKeeper details for cluster <n> to all other clusters.
  3. Reset the replication role setting for the app database role on all clusters.

Install the orchestration manager operator on cluster 1

For information about how to install the IBM Cloud Pak for Network Automation Orchestration Manager operator, see Installing IBM Cloud Pak for Network Automation Orchestration Manager.

Create the CR for cluster 1

On cluster 1, create a CR to specify how the orchestration manager operator is installed and configured in your environment. When you create the CR, complete these steps:
  1. In the metadata section, set the value of the namespace key to the namespace where you want to install the Cloud Pak.
  2. In the metadata.annotations section, include the following annotation to ensure that the ZooKeeper service uses a Network Load Balancer (NLB), rather than the default AWS Classic Load Balancer:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
  3. In the spec.license section, set the value of the accept setting to true to accept the license.
  4. In the spec.advanced.podSettings section, set the number of replicas to 3 for the zookeeperlocks and kafka services.
  5. To enable multicluster services, complete the spec.advanced.clusters section.
    • Add an alias that identifies cluster 1 in the spec.advanced.clusters.local.alias attribute.
    • Later in the installation process, you add details of other clusters to the spec.advanced.clusters.remote attribute.
  6. In the spec.storage section, for the synchronization services, specify the gp3 and efs-sc storage classes that you configured in Preparing to install an active-active configuration on AWS.

You can use the following example to create a CR for cluster 1:

apiVersion: tnc.ibm.com/v1beta1
kind: Orchestration
metadata:
  name: cluster-1
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb  
spec:
  license:
    accept: true
  version: 2.7.6
  featureconfig:
    siteplanner: true
  advanced:
    imagePullPolicy: Always
    podSettings:
      zookeeperlocks:
        replicas: 3
      kafka:
        replicas: 3
    clusters:
      local:
        alias: cluster-1
  storage:
    kafka:
      storageClassName: gp3
    zookeeper:
      storageClassName: gp3
    zookeeperlocks:
      storageClassName: gp3
    zenFile:
      storageClassName: efs-sc
    zenBlock:
      storageClassName: gp3
    postgres:
      storageClassName: gp3
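
Before you create the instance, you can optionally ask the API server to validate the CR without creating any resources. This is a minimal sketch; it assumes that the CR is saved in a file that is called cluster-cr.yaml, which is the file name that is used in the later steps.

# Server-side dry run: validate the CR against the Orchestration schema without creating it.
oc create -f cluster-cr.yaml --dry-run=server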

Optional: Customize the CR for cluster 1

You might also want to customize the CR in other ways. For example, you might want to increase the settings for CPU, memory, and replicas for the microservices. For more information, see Custom resources.

Create an instance on cluster 1

On cluster 1, complete these steps:

  1. Create an instance of IBM Cloud Pak for Network Automation by running this command:
    oc create -f cluster-cr.yaml

    cluster-cr.yaml is the name of the CR file.

  2. After PostgreSQL is fully installed and running, but before the Cloud Pak microservices are running, disable all foreign key checks in the database by completing these steps:
    1. Access the cp4na-o-postgresql pod by running this command:
      oc exec -it $(oc get cluster cp4na-o-postgresql -o jsonpath --template '{.status.targetPrimary}') -- bash
    2. If the postgres user is set up as a superuser, log in to the pod as the postgres user by running this command:
      psql --user postgres
    3. Update the replication role setting for the app role to replica by running this command:
      alter role app SET session_replication_role = 'replica';
  3. Verify that the instance is successfully created by running this command:

    oc get orchestration -n <cp4na_namespace>

    <cp4na_namespace> is the namespace where you installed the orchestration manager operator.

    The instance might take some time to create. When the Status value is Ready, your instance is created.
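
The timing of step 2 matters: PostgreSQL must be running before you change the replication role, and the change must be in place before the Cloud Pak microservices start. The following commands are a minimal sketch for watching the progress; they assume the cp4na-o-postgresql pod naming from step 2 and the namespace placeholder from step 3.

# Watch the PostgreSQL pods until they are running.
oc get pods -n <cp4na_namespace> -w | grep cp4na-o-postgresql

# Watch the overall installation status until it reports Ready.
oc get orchestration -n <cp4na_namespace> -w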

Gather data for the installation on the secondary clusters

Complete the following steps to gather data from cluster 1 to use during the installation of the Cloud Pak on the secondary clusters:

  1. Export the secret cp4na-o-vault-keys to a file.
    1. Run this command to export the keys to a YAML file:
      oc get secret cp4na-o-vault-keys -o yaml > cp4na-o-vault-keys-cluster-1.yaml
    2. Delete the following attributes from the metadata section of the YAML file:
      • labels
      • ownerReferences
      • uid
      • creationTimestamp
      • resourceVersion
  2. Cluster 1 is the SymmetricDS registration node, which is the node that other clusters register with when you install them. Get the SymmetricDS registration URL from the cluster 1 orchestration status by running this command:
    oc get orchestration cluster-1 -o jsonpath='{.status.endpoints.symmetricds.registrationURL}'
  3. Get the domain name and port details for the ZooKeeper load-balancing servers by running this command:
    oc get svc | grep cp4na-o-zookeeper-locking-ext

    Save the details. You use them later to create the ZooKeeper section in the CRs for the secondary clusters.

  4. Get the Kafka password for Kafka MirrorMaker (MM2) by running this command:
    oc get secret cp4na-o-kafka-user -o jsonpath='{.data.password}' -n <cp4na_namespace> | base64 -d

    Save the details. You use them later to create the Kafka section in the CRs for the secondary clusters.

  5. Get the external bootstrap service address for Kafka by running this command:
    oc get routes cp4na-o-events-kafka-bootstrap -o jsonpath='{.spec.host}' -n <cp4na_namespace> 

    Save the details. You use them later to create the Kafka section in the CRs for the secondary clusters.

  6. Get the SSL certificate for the Kafka bootstrap address and save the certificate to a file.
    1. Inspect the Kafka instance by running this command:
      oc get kafka cp4na-o-events -o yaml
    2. Save the certificate for cp4na-o-events-kafka-bootstrap to a file that is called ca.crt by running this command:
      oc get kafka -o jsonpath='{.items[0].status.listeners[2].certificates[0]}' > ca.crt
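
If you prefer to gather these values in one pass, the following sketch collects them by using the same commands as the previous steps. It assumes the default resource names shown in those steps; remember to delete the metadata attributes that are listed in step 1 from the exported vault keys file.

NS=<cp4na_namespace>

# Vault keys secret for the secondary clusters.
oc get secret cp4na-o-vault-keys -n $NS -o yaml > cp4na-o-vault-keys-cluster-1.yaml

# SymmetricDS registration URL.
oc get orchestration cluster-1 -n $NS -o jsonpath='{.status.endpoints.symmetricds.registrationURL}'

# ZooKeeper load-balancer domain names and ports.
oc get svc -n $NS | grep cp4na-o-zookeeper-locking-ext

# Kafka MirrorMaker password, external bootstrap address, and SSL certificate.
oc get secret cp4na-o-kafka-user -n $NS -o jsonpath='{.data.password}' | base64 -d
oc get routes cp4na-o-events-kafka-bootstrap -n $NS -o jsonpath='{.spec.host}'
oc get kafka -n $NS -o jsonpath='{.items[0].status.listeners[2].certificates[0]}' > ca.crt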

Install the orchestration manager operator on cluster <n>

On cluster <n>, complete these steps:
  1. Create the secret cp4na-o-vault-keys by running this command:
    oc create -f cp4na-o-vault-keys-cluster-1.yaml

    cp4na-o-vault-keys-cluster-1.yaml is the file that you created on cluster 1 that contains the cp4na-o-vault-keys secret.

  2. Install the IBM Cloud Pak for Network Automation Orchestration Manager operator on cluster <n>. For more information, see Installing IBM Cloud Pak for Network Automation Orchestration Manager.
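
Because the Vault decryption keys must be identical on every cluster, you might want to confirm that the imported secret contains the same data keys as the secret on cluster 1. A minimal sketch:

# List the data keys and sizes in the secret; they should match what you see on cluster 1.
oc describe secret cp4na-o-vault-keys -n <cp4na_namespace>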

Create Kafka secrets for remote clusters on cluster <n>

On cluster <n>, use the Kafka information that you gathered about remote clusters, such as cluster 1, to create secrets. For each remote cluster, create a secret that contains that cluster's Kafka SSL certificate and bootstrap server password by running this command:

oc create secret generic <remote-cluster-kafka-secret-name> --from-file <remote-kafka-cert_file> --from-literal=password=<remote-kafka-password>

<remote-cluster-kafka-secret-name> is a name that you choose to identify this secret.

<remote-kafka-cert_file> is the remote cluster's Kafka SSL certificate file that you created, that is, ca.crt.

<remote-kafka-password> is the password for the remote cluster's Kafka bootstrap server.

This example shows the type of information that is contained in the secret:
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: cp4na-o-kafka-cluster-1
  namespace: lifecycle-manager
data:
  password: Q05RM01BRGMyNDZ4VmhuUEh0TkltS0N2U1dxWXFXZHM=
  ca.crt: |-
    LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZMVENDQXhXZ0F3SUJBZ0lVZkVCaCtCaTVK
    bHFacTN5b2Yrak1STVh2aGRjd0RRWUpLb1pJaHZjTkFRRU4KQlFBd0xURVRNQkVHQTFVRUNnd0th

Later, when you create the CR for cluster <n>, you include the following items in the remote cluster settings for Kafka:
  • The name of the secret.
  • The name of the attribute in the secret that contains the password, that is, password.
  • The name of the attribute that contains the SSL certificate content, that is, ca.crt.
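
To confirm that a secret contains a valid certificate and the expected password, you can decode its contents. This sketch assumes that the openssl client is available and uses the secret name cp4na-o-kafka-cluster-1 from the previous example.

# Check the subject and expiry date of the stored certificate.
oc get secret cp4na-o-kafka-cluster-1 -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -subject -enddate

# Check the stored bootstrap server password.
oc get secret cp4na-o-kafka-cluster-1 -o jsonpath='{.data.password}' | base64 -d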

Create the CR for cluster <n>

On cluster <n>, create a CR to specify how the orchestration manager operator is installed and configured in your environment.

When you create the CR, complete the following steps:
  1. In the metadata section, set the value of the namespace key to the namespace where you want to install the Cloud Pak.
  2. In the metadata.annotations section, include the following annotation to ensure that the ZooKeeper service uses a Network Load Balancer (NLB), rather than the default AWS Classic Load Balancer:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
  3. In the spec.license section, set the value of the accept setting to true to accept the license.
  4. In the spec.advanced.podSettings section, set the number of replicas to 3 for the zookeeperlocks and kafka services.
  5. Update the spec.advanced.clusters section by completing these steps:
    1. Add an alias that identifies cluster <n> in the spec.advanced.clusters.local.alias attribute.
    2. Enter the SymmetricDS registration URL, which you gathered from cluster 1, in the spec.advanced.clusters.local.db.symmetricds.registrationUrl attribute.
    3. Assign a unique SymmetricDS sequence number for cluster <n> in the spec.advanced.clusters.local.db.sequence.rangeMultiplier attribute. The first cluster that you install after cluster 1 must have the sequence number 1, the next cluster must have number 2, and so on. For example, if your deployment has three clusters, the sequence numbers are as follows:
      • Cluster 1 - No sequence number
      • Cluster 2 - 1
      • Cluster 3 - 2
    4. Add information in the spec.advanced.clusters.remote section about each remote cluster where you already installed the Cloud Pak.
      • Add an alias that identifies the remote cluster in the .remote.alias attribute.
      • Add the Kafka bootstrap server address for the remote cluster in the .remote.kafka.bootstrapServers attribute.
      • In the .remote.kafka.authentication.passwordSecret section, add the name of the Kafka secret that you created for the remote cluster and the attribute in the secret that contains the bootstrap server password.
      • In the .remote.kafka.tls.trustedCertificates section, add the name of the Kafka secret and the attribute that contains the SSL certificate.
      • Add the ZooKeeper service details for the remote cluster to the spec.advanced.clusters.remote.zookeeper section.
        Note:

        On each cluster, the ZooKeeper service consists of three servers. The Cloud Pak assigns sequential IDs to these servers. On cluster 1, the assigned IDs are 1, 2, and 3. On the next cluster where you install, the assigned IDs are 4, 5, and 6, and so on.

        This example shows the spec.advanced.clusters.remote.zookeeper section for a CR on a secondary cluster. Because cluster 1 is remote to this secondary cluster, the section must contain the ZooKeeper service details for cluster 1. For each ZooKeeper server on cluster 1, the server ID, domain name, and port details are included.

              remote:
                - alias: cluster-1
                  zookeeper:
                    server.1: a8605356a4b0d4223bb1030c04b357cd-a0dc7511d3ba8d26.elb.eu-west-1.amazonaws.com:2888:3888
                    server.2: a0884ba91b86945a9899e02bd13dd041-9e1adf15da6a3904.elb.eu-west-1.amazonaws.com:2888:3888
                    server.3: a90c7fe26e12e48b4a1dc1463d4e0661-385f08d4d4cbdf40.elb.eu-west-1.amazonaws.com:2888:3888
  6. In the spec.storage section, for the synchronization services, such as Kafka, specify the gp3 and efs-sc storage classes that you configured in Preparing to install an active-active configuration on AWS.

This example shows the CR for a secondary cluster that is called cluster-2 after the SymmetricDS, Kafka, and ZooKeeper details for a primary cluster that is called cluster-1 are added:

apiVersion: tnc.ibm.com/v1beta1
kind: Orchestration
metadata:
  name: cluster-2
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb  
spec:
  license:
    accept: true
  version: 2.7.6
  featureconfig:
    siteplanner: true
  advanced:
    imagePullPolicy: Always
    podSettings:
      zookeeperlocks:
        replicas: 3
      kafka:
        replicas: 3
    clusters:
      local:
        alias: cluster-2
        kafka:
          mm2:
            replicas: 1
        db:
          symmetricds:
            registrationUrl: https://cp4na-o-symmetricds-lifecycle-manager.apps.cluster-1.26ah.p1.openshiftapps.com/sync/cluster-1
          sequence:
            rangeMultiplier: 1
      remote:
        - alias: cluster-1
          kafka:
            bootstrapServers: cp4na-o-events-kafka-bootstrap-lifecycle-manager.apps.cluster-1.26ah.p1.openshiftapps.com:443
            authentication:
              passwordSecret:
                secretName: cp4na-o-kafka-cluster-1
                password: password
            tls:
              trustedCertificates:
                - secretName: cp4na-o-kafka-cluster-1
                  certificate: ca.crt
          zookeeper:
            server.1: a8605356a4b0d4223bb1030c04b357cd-a0dc7511d3ba8d26.elb.eu-west-1.amazonaws.com:2888:3888
            server.2: a0884ba91b86945a9899e02bd13dd041-9e1adf15da6a3904.elb.eu-west-1.amazonaws.com:2888:3888
            server.3: a90c7fe26e12e48b4a1dc1463d4e0661-385f08d4d4cbdf40.elb.eu-west-1.amazonaws.com:2888:3888
  storage:
    kafka:
      storageClassName: gp3
    zookeeper:
      storageClassName: gp3
    zookeeperlocks:
      storageClassName: gp3
    zenFile:
      storageClassName: efs-sc
    zenBlock:
      storageClassName: gp3
    postgres:
      storageClassName: gp3

Optional: Customize the CR for cluster <n>

On cluster <n>, you might also want to customize the CR in other ways. For example, you might want to increase the settings for CPU, memory, and replicas for the microservices. For more information, see Custom resources.

Create an instance on cluster <n>

On cluster <n>, complete these steps:

  1. Create an instance of IBM Cloud Pak for Network Automation by running this command:
    oc create -f cluster-cr.yaml

    cluster-cr.yaml is the name of the CR file.

  2. After PostgreSQL is fully installed and running, but before the Cloud Pak microservices are running, disable all foreign key checks in the database by completing these steps:
    1. Access the cp4na-o-postgresql pod by running this command:
      oc exec -it $(oc get cluster cp4na-o-postgresql -o jsonpath --template '{.status.targetPrimary}') -- bash
    2. If the postgres user is set up as a superuser, log in to the pod as the postgres user by running this command:
      psql --user postgres
    3. Update the replication role setting for the app role to replica by running this command:
      alter role app SET session_replication_role = 'replica';
  3. Verify that the instance is successfully created by running this command:

    oc get orchestration -n <cp4na_namespace>

    <cp4na_namespace> is the namespace where you installed the orchestration manager operator.

    The instance might take some time to create. When the Status value is Ready, your instance is created.
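
Optionally, you can confirm that cluster <n> registered with the SymmetricDS registration node on cluster 1. The following sketch is based on standard SymmetricDS behavior (registered nodes are recorded in the sym_node table) rather than on this task, and the database that holds the SymmetricDS tables depends on your deployment, so treat the connection details as assumptions.

# Run on cluster 1, inside the cp4na-o-postgresql pod (see the earlier psql steps).
# <symmetricds_database> is a placeholder for the database that holds the SymmetricDS tables.
psql --user postgres --dbname <symmetricds_database> -c "select node_id, sync_enabled from sym_node;"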

Gather the Kafka and ZooKeeper details for cluster <n>

On cluster <n>, complete these steps:

  1. Get the Kafka password for Kafka MirrorMaker (MM2) by running this command:
    oc get secret cp4na-o-kafka-user -o jsonpath='{.data.password}' -n <cp4na_namespace> | base64 -d

    Save the details. You add them later to the Kafka section in the CRs for all other clusters.

  2. Get the external bootstrap service address for Kafka by running this command:
    oc get routes cp4na-o-events-kafka-bootstrap -o jsonpath='{.spec.host}' -n <cp4na_namespace> 

    Save the details. You add them later to the Kafka section in the CRs for all other clusters.

  3. Get the SSL certificate for the Kafka bootstrap address and save the certificate to a file.
    1. Inspect the Kafka instance by running the following command:
      oc get kafka cp4na-o-events -o yaml
    2. Save the certificate for cp4na-o-events-kafka-bootstrap to a file that is called ca.crt by running the following command:
      oc get kafka -o jsonpath='{.items[0].status.listeners[2].certificates[0]}' > ca.crt
  4. Get the domain name and port details for the ZooKeeper load-balancing servers by running this command:
    oc get svc | grep cp4na-o-zookeeper-locking-ext

    Save the details. You add them later to the ZooKeeper section in the CRs for all other clusters.
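
When you add these details to the CRs on the other clusters, each load-balancer domain name becomes one server.<id> entry with the quorum and election ports appended. The following sketch is one way to generate those lines. It assumes that the external domain name is in the fourth column of the oc get svc output and that the ports are 2888 and 3888, as in most of the examples in this task; set START_ID to the first ZooKeeper server ID that is assigned to cluster <n> (4 for the second cluster, 7 for the third, and so on).

START_ID=4
oc get svc -n <cp4na_namespace> | grep cp4na-o-zookeeper-locking-ext | \
  awk -v id=$START_ID '{ printf "server.%d: %s:2888:3888\n", id++, $4 }'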

Add the Kafka and ZooKeeper details for cluster <n> to all other clusters

On all clusters where you installed the Cloud Pak except cluster <n>, complete these steps:
  1. Create a secret that contains the Kafka details that you gathered for cluster <n> by running this command:
    oc create secret generic <cluster-n-kafka-secret-name> --from-file <cluster-n-kafka-cert_file> --from-literal=password=<cluster-n-kafka-password>

    <cluster-n-kafka-secret-name> is a name that you choose to identify this secret.

    <cluster-n-kafka-cert_file> is the Kafka SSL certificate file that you created on cluster <n>.

    <cluster-n-kafka-password> is the password for the Kafka bootstrap server on cluster <n>.

  2. Update the CR with the Kafka and ZooKeeper details for cluster <n>.
    • Add the Kafka bootstrap server, authentication, and certificate details for cluster <n> to the spec.advanced.clusters.remote.kafka section.
    • Add the ZooKeeper service details for cluster <n> to the spec.advanced.clusters.remote.zookeeper section.
  3. Apply the CR updates by running this command:
    oc apply -f cluster-cr.yaml

    cluster-cr.yaml is the name of the CR file.

This example shows the CR for a primary cluster that is called cluster-1 after the Kafka and ZooKeeper details for two secondary clusters are added:

apiVersion: tnc.ibm.com/v1beta1
kind: Orchestration
metadata:
  name: cluster-1
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb    
spec:
  license:
    accept: true
  version: 2.7.6
  featureconfig:
    siteplanner: true
  advanced:
    imagePullPolicy: Always
    podSettings:
      zookeeperlocks:
        replicas: 3
      kafka:
        replicas: 3
    clusters:
      local:
        alias: cluster-1
        kafka:
          mm2:
            replicas: 1
      remote:
        - alias: cluster-2
          kafka:
            bootstrapServers: cp4na-o-events-kafka-bootstrap-lifecycle-manager.apps.cluster-2.zp0z.p1.openshiftapps.com:443
            authentication:
              passwordSecret:
                secretName: cp4na-o-kafka-cluster-2
                password: password
            tls:
              trustedCertificates:
                - secretName: cp4na-o-kafka-cluster-2
                  certificate: ca.crt
          zookeeper:
            server.4: a6691ffe18c5b4ce7874872d6d777705-e01b7845d7af5a36.elb.eu-west-1.amazonaws.com:2888:3888
            server.5: ab540e51d6e384891884d8bbd8e32be1-b8d20b05c9212790.elb.eu-west-1.amazonaws.com:2888:3888
            server.6: a5eeea8330f894848b660f0742e001e8-ec12170ceb11a96d.elb.eu-west-1.amazonaws.com:2888:3888
        - alias: cluster-3
          kafka:
            bootstrapServers: cp4na-o-events-kafka-bootstrap-lifecycle-manager.apps.cluster-3.zp0z.p1.openshiftapps.com:443
            authentication:
              passwordSecret:
                secretName: cp4na-o-kafka-cluster-3
                password: password
            tls:
              trustedCertificates:
                - secretName: cp4na-o-kafka-cluster-3
                  certificate: ca.crt
          zookeeper:
            server.7: a6691ffe18c5b4ce7874872d6d777705-b8d20b05c9212790.elb.eu-west-1.amazonaws.com:2999:3999
            server.8: ab540e51d6e384891884d8bbd8e32be1-ec12170ceb11a96d.elb.eu-west-1.amazonaws.com:2999:3999
            server.9: a5eeea8330f894848b660f0742e001e8-e01b7845d7af5a36.elb.eu-west-1.amazonaws.com:2999:3999
  storage:
    kafka:
      storageClassName: gp3
    zookeeper:
      storageClassName: gp3
    zookeeperlocks:
      storageClassName: gp3
    zenFile:
      storageClassName: efs-sc
    zenBlock:
      storageClassName: gp3
    postgres:
      storageClassName: gp3
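
After you apply the updated CR, Kafka data from each remote cluster is replicated by MirrorMaker (MM2). As a quick check that the MirrorMaker pods are created for the new remote clusters, you can filter the pod list. The mm2 name filter is an assumption that is based on the kafka.mm2 setting in the CR, so adjust it if your pod names differ.

# List the MirrorMaker pods that replicate Kafka data from the remote clusters.
oc get pods -n <cp4na_namespace> | grep mm2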

Reset the replication role setting for the app database role on all clusters

On all clusters, complete these steps:
  1. Access the cp4na-o-postgresql pod by running this command:
    oc exec -it $(oc get cluster cp4na-o-postgresql -o jsonpath --template '{.status.targetPrimary}') -- bash
  2. If the postgres user is set up as a superuser, log in to the pod as the postgres user by running this command:
    psql --user postgres
  3. Reset the replication role setting for the app role to origin by running this command:
    alter role app SET session_replication_role = 'origin';
  4. Verify that the replication role setting is reset correctly by running this command:
    select useconfig from pg_user where usename='app';
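
If you prefer to run the reset non-interactively, the following sketch chains the same commands together. It assumes the same pod access and postgres superuser as the preceding steps; after the reset, the useconfig column should include session_replication_role=origin.

# Reset the replication role for the app role and show the resulting role configuration.
oc exec -it $(oc get cluster cp4na-o-postgresql -o jsonpath --template '{.status.targetPrimary}') -- \
  psql --user postgres -c "alter role app SET session_replication_role = 'origin';" \
                       -c "select useconfig from pg_user where usename='app';"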

What to do next

Complete the following task after you install IBM Cloud Pak for Network Automation in an active-active configuration:
Configure multitenancy
You can enable multitenancy and configure tenant administrators and users after you install IBM Cloud Pak for Network Automation. For more information, see Configuring multitenancy.