Installing Netcool Operations Insight

Follow these instructions to install a geo-redundant deployment of IBM® Netcool® Operations Insight® on OpenShift®.

Before you begin

Ensure that you complete the prerequisites listed in Configuring prerequisites.

About this task

Complete the following steps to install a geo-redundant deployment of IBM Netcool Operations Insight on OpenShift.

Procedure

Create a Netcool Operations Insight instance for your geo-redundant deployment.

  1. Ensure that you are logged in to the new namespace on the primary cluster.
  2. Start the installation on the primary cluster.
    oc apply -f <primary_yaml_file_name>.yaml
    Where <primary_yaml_file_name> is the name of your primary custom resource (CR) YAML file.
    The operator pauses after the statefulset applications are deployed. This pause provides time to validate the Cassandra service on the primary cluster.
  3. Confirm that three nodes at site one are connected and that the auth_schema has a replication factor of three in dc-1 (dc-1: 3).
  4. Check that the Cassandra nodes are up and working. Verify that nodetool status displays one data center with the three expected nodes. All nodes must have a status of Up and a state of Normal (UN), as in the following example:
    for node in 0 1 2; do oc exec -ti ${PRIMARY_NAME}-cassandra-$node-0 -- nodetool status; done
    Where PRIMARY_NAME is the release name of the primary cluster.
    Datacenter: dc-1
    ================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load       Tokens       Owns (effective)  Host ID                               Rack
    UN  10.40.198.253  76.44 KiB  16           67.8%             3f6430a6-ae07-4dcb-916d-29b363fee962  rack-1
    UN  10.40.194.121  76.58 KiB  16           63.4%             1840d007-d550-41c6-857a-57ae408396d6  rack-1
    UN  10.40.197.182  76.57 KiB  16           68.7%             fd5d4965-c6f2-44f6-916c-35895bad523b  rack-1
    
    Datacenter: dc-1
    ================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load       Tokens       Owns (effective)  Host ID                               Rack
    UN  10.40.198.253  76.44 KiB  16           67.8%             3f6430a6-ae07-4dcb-916d-29b363fee962  rack-1
    UN  10.40.194.121  76.58 KiB  16           63.4%             1840d007-d550-41c6-857a-57ae408396d6  rack-1
    UN  10.40.197.182  76.57 KiB  16           68.7%             fd5d4965-c6f2-44f6-916c-35895bad523b  rack-1
    
    Datacenter: dc-1
    ================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load       Tokens       Owns (effective)  Host ID                               Rack
    UN  10.40.198.253  76.44 KiB  16           67.8%             3f6430a6-ae07-4dcb-916d-29b363fee962  rack-1
    UN  10.40.194.121  76.58 KiB  16           63.4%             1840d007-d550-41c6-857a-57ae408396d6  rack-1
    UN  10.40.197.182  76.57 KiB  16           68.7%             fd5d4965-c6f2-44f6-916c-35895bad523b  rack-1
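The visual check of the nodetool status output can also be scripted. The following is a minimal sketch, not part of the product: count_un_nodes is a hypothetical helper that assumes the output layout shown in the example, where each data center starts with a "Datacenter:" line and each healthy node row starts with UN. It reads the output of one nodetool status run on standard input and prints the number of UN nodes for each data center.

```shell
# Hypothetical helper (not part of the product): count the Cassandra nodes
# in the Up/Normal (UN) state for each data center.
# Reads `nodetool status` output on stdin and prints one line per data
# center in the form "<datacenter> <count>".
count_un_nodes() {
  awk '
    $1 == "Datacenter:" {
      dc = $2
      if (!(dc in count)) { count[dc] = 0; order[++n] = dc }
      next
    }
    $1 == "UN" && dc != "" { count[dc]++ }
    END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
  '
}

# Usage sketch: check one node of the primary cluster. The -ti flags are
# omitted because the output is piped rather than shown in a terminal.
#   oc exec "${PRIMARY_NAME}-cassandra-0-0" -- nodetool status | count_un_nodes
```

For a healthy primary-only deployment, the helper is expected to report three nodes for dc-1. After the backup cluster joins, the same helper should report three nodes for each of dc-1 and dc-2.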
  5. Check the auth_schema replication. Run the following command to open a shell in the Cassandra pod:
    oc exec -ti ${PRIMARY_NAME}-cassandra-0-0 -n ${PRIMARY_NAMESPACE} -- bash
    Where:
    • PRIMARY_NAME is the release name of the primary cluster.
    • PRIMARY_NAMESPACE is the namespace of the primary cluster.
  6. Verify that the auth_schema (shown as the system_auth keyspace in the query output) has three replicas in the data center, and that the replication is updated to all three nodes in dc-1, as in the following example:
    $ cqlsh -u $CASSANDRA_USER -p $CASSANDRA_PASS -e "SELECT * FROM system_schema.keyspaces ;"
    
    keyspace_name      | durable_writes | replication
    --------------------+----------------+---------------------------------------------------------------------------------------------
            system_auth |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc-1': '3'}
    ...
    (7 rows)
  7. Wait for the Cassandra, Kafka, and Web GUI pods to be running on the primary cluster, as in the following example:
    oc get pods -n ${PRIMARY_NAMESPACE}
    Where PRIMARY_NAMESPACE is the namespace of the primary cluster.
    NAME                                  READY   STATUS                       RESTARTS   AGE
    noi-operator-6cdd9d867c-fgmcd         1/1     Running                      0          47m
    primary-verifysecrets-gwg64           0/1     Completed                    0          46m
    primary-zookeeper-0                   1/1     Running                      0          45m
    primary-nciserver-0                   1/2     CreateContainerConfigError   0          44m
    primary-couchdb-0                     1/1     Running                      0          44m
    primary-openldap-0                    1/1     Running                      0          44m
    primary-ibm-ea-dr-coordinator-ser-0   1/1     Running                      0          44m
    primary-ibm-redis-server-0            2/2     Running                      0          44m
    primary-cassandra-0-0                 1/1     Running                      0          44m
    primary-ibm-redis-server-1            2/2     Running                      0          43m
    primary-ibm-redis-server-2            2/2     Running                      0          43m
    primary-cassandra-1-0                 1/1     Running                      0          41m
    primary-cassandra-2-0                 1/1     Running                      0          38m
    primary-kafka-0                       2/2     Running                      0          34m
    primary-kafka-2                       2/2     Running                      0          34m
    primary-kafka-1                       2/2     Running                      0          34m
    primary-elasticsearch-0               1/1     Running                      0          34m
    primary-impactgui-0                   2/2     Running                      0          34m
    primary-ncoprimary-0                  1/1     Running                      0          32m
    primary-webgui-primary-0              2/2     Running                      0          6m52s
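Instead of scanning the pod list by eye, the wait can be sketched as a filter over the oc get pods output. The following is an illustrative sketch only: pods_not_running is a hypothetical helper, and the name pattern cassandra|kafka|webgui is an assumption based on the pod names in the example above. It prints the matching pods that are not in the Running status or whose READY counts differ, so empty output means that the selected pods are running.

```shell
# Hypothetical helper (not part of the product): list pods whose name
# matches a pattern and that are not yet fully running.
# Reads `oc get pods` output on stdin.
# $1: extended regular expression that is matched against the pod name.
pods_not_running() {
  awk -v pattern="$1" '
    NR == 1 && $1 == "NAME" { next }    # skip the header row
    $1 ~ pattern {
      split($2, ready, "/")             # READY column, for example 1/2
      if ($3 != "Running" || ready[1] != ready[2]) print $1
    }
  '
}

# Usage sketch: empty output means the selected pods are running.
#   oc get pods -n "${PRIMARY_NAMESPACE}" | pods_not_running 'cassandra|kafka|webgui'
```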
  8. Ensure that you are logged in to the new namespace on the backup cluster.
  9. Start the installation on the backup cluster.
    oc apply -f <backup_yaml_file_name>.yaml
    Where <backup_yaml_file_name> is the name of your backup CR YAML file.
    The operator pauses after the statefulset applications are deployed. This pause provides time to validate the Cassandra service on the backup cluster.
  10. Confirm that three nodes from the primary cluster and three nodes from the backup cluster are connected and that the auth_schema has a replication factor of three in each data center (dc-1: 3, dc-2: 3).
  11. Check that the Cassandra nodes are up and working. Verify that nodetool status displays two data centers with three nodes each. All nodes must have a status of Up and a state of Normal (UN), as in the following example:
    for node in 0 1 2; do oc exec -ti ${BACKUP_NAME}-cassandra-$node-0 -- nodetool status; done
    Where BACKUP_NAME is the release name of the backup cluster.
    Datacenter: dc-1
    ================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load        Tokens       Owns (effective)  Host ID                               Rack
    UN  10.40.198.253  145.37 KiB  16           41.6%             55eb42c5-fdd6-4b59-a68b-3004492dd19a  rack-1
    UN  10.40.194.121  147.3 KiB   16           26.4%             658cfeae-3779-4cce-90ef-901b20357730  rack-1
    UN  10.40.197.182  145.23 KiB  16           34.0%             5c545374-cfdb-4261-8591-3aef19e59d02  rack-1
    
    Datacenter: dc-2
    ================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load        Tokens       Owns (effective)  Host ID                               Rack
    UN  10.30.233.165  168.91 KiB  16           35.6%             fce1576c-4865-4ed4-8add-45a4b6792a0f  rack-1
    UN  10.30.235.34   315.72 KiB  16           29.6%             b53be031-8bea-4924-abb3-06c30aee1c91  rack-1
    UN  10.30.233.134  163.28 KiB  16           32.8%             8990b837-041e-4d74-8af5-5e7727f87564  rack-1
    
    
    Datacenter: dc-1
    ================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load        Tokens       Owns (effective)  Host ID                               Rack
    UN  10.40.198.253  145.37 KiB  16           41.6%             55eb42c5-fdd6-4b59-a68b-3004492dd19a  rack-1
    UN  10.40.194.121  147.3 KiB   16           26.4%             658cfeae-3779-4cce-90ef-901b20357730  rack-1
    UN  10.40.197.182  145.23 KiB  16           34.0%             5c545374-cfdb-4261-8591-3aef19e59d02  rack-1
    
    Datacenter: dc-2
    ================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load        Tokens       Owns (effective)  Host ID                               Rack
    UN  10.30.233.165  168.91 KiB  16           35.6%             fce1576c-4865-4ed4-8add-45a4b6792a0f  rack-1
    UN  10.30.235.34   315.72 KiB  16           29.6%             b53be031-8bea-4924-abb3-06c30aee1c91  rack-1
    UN  10.30.233.134  163.28 KiB  16           32.8%             8990b837-041e-4d74-8af5-5e7727f87564  rack-1
    
    
    Datacenter: dc-1
    ================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load        Tokens       Owns (effective)  Host ID                               Rack
    UN  10.40.198.253  145.37 KiB  16           41.6%             55eb42c5-fdd6-4b59-a68b-3004492dd19a  rack-1
    UN  10.40.194.121  147.3 KiB   16           26.4%             658cfeae-3779-4cce-90ef-901b20357730  rack-1
    UN  10.40.197.182  145.23 KiB  16           34.0%             5c545374-cfdb-4261-8591-3aef19e59d02  rack-1
    
    Datacenter: dc-2
    ================
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address        Load        Tokens       Owns (effective)  Host ID                               Rack
    UN  10.30.233.165  168.91 KiB  16           35.6%             fce1576c-4865-4ed4-8add-45a4b6792a0f  rack-1
    UN  10.30.235.34   315.72 KiB  16           29.6%             b53be031-8bea-4924-abb3-06c30aee1c91  rack-1
    UN  10.30.233.134  163.28 KiB  16           32.8%             8990b837-041e-4d74-8af5-5e7727f87564  rack-1
  12. Check the auth_schema replication. Run the following command to open a shell in the Cassandra pod:
    oc exec -ti ${BACKUP_NAME}-cassandra-0-0 -n ${BACKUP_NAMESPACE} -- bash
    Where:
    • BACKUP_NAME is the release name of the backup cluster.
    • BACKUP_NAMESPACE is the namespace of the backup cluster.
  13. Verify that the auth_schema (shown as the system_auth keyspace in the query output) has three replicas in each data center, and that the replication is updated to all three nodes in dc-1 and dc-2, as in the following example:
    $ cqlsh -u $CASSANDRA_USER -p $CASSANDRA_PASS -e "SELECT * FROM system_schema.keyspaces ;"
    keyspace_name      | durable_writes | replication
    --------------------+----------------+---------------------------------------------------------------------------------------------
            system_auth |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc-1': '3', 'dc-2': '3'}
    ...
    (7 rows)
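The replication check on the query output can be sketched as a small script. This is a hedged example, not a product tool: has_replication is a hypothetical helper that assumes the cqlsh table layout shown above, with the keyspace name in the first column and the replication map printed as 'dc-name': '3'. It succeeds only when the named keyspace row lists three replicas for every data center that you pass to it.

```shell
# Hypothetical helper (not part of the product): succeed only if the row
# for a keyspace reports three replicas in every listed data center.
# Reads the cqlsh query output on stdin.
# $1: keyspace name; remaining arguments: data center names.
has_replication() {
  keyspace="$1"
  shift
  # Keep only the row whose first column is the keyspace name.
  row=$(awk -F'|' -v ks="$keyspace" '{
    name = $1
    gsub(/[ \t]/, "", name)
    if (name == ks) print
  }')
  [ -n "$row" ] || return 1
  for dc in "$@"; do
    case "$row" in
      *"'$dc': '3'"*) ;;    # this data center has three replicas
      *) return 1 ;;
    esac
  done
  return 0
}

# Usage sketch: save the query output from the Cassandra pod to a file
# (keyspaces.txt is a hypothetical file name), then run:
#   has_replication system_auth dc-1 dc-2 < keyspaces.txt && echo "replication OK"
```

For the primary-only check, pass dc-1 as the single data center name.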
  14. Wait for the Cassandra and Kafka pods to be running on the backup cluster, as in the following example:
    oc get pods -n ${BACKUP_NAMESPACE}
    Where BACKUP_NAMESPACE is the namespace of the backup cluster.
    
    NAME                                  READY   STATUS                       RESTARTS   AGE
    noi-operator-6cdd9d867c-shs2d         1/1     Running                      0          30m
    backup-verifysecrets-bnqbg            0/1     Completed                    0          29m
    backup-zookeeper-0                    1/1     Running                      0          29m
    backup-cassandra-0-0                  1/1     Running                      0          28m
    backup-cassandra-1-0                  1/1     Running                      0          25m
    backup-cassandra-2-0                  1/1     Running                      0          22m
    backup-kafka-1                        2/2     Running                      0          18m
    backup-ibm-ea-dr-coordinator-serv-0   1/1     Running                      0          18m
    backup-ibm-redis-server-0             2/2     Running                      0          18m
    backup-kafka-2                        2/2     Running                      0          18m
    backup-couchdb-0                      1/1     Running                      0          18m
    backup-elasticsearch-0                1/1     Running                      0          18m
    backup-kafka-0                        2/2     Running                      0          18m
    backup-impactgui-0                    2/2     Running                      0          18m
    backup-ncobackup-0                    2/2     Running                      0          18m
    backup-openldap-0                     1/1     Running                      0          18m
    backup-nciserver-0                    1/2     CreateContainerConfigError   0          18m
    backup-ibm-redis-server-1             2/2     Running                      0          18m
    backup-ibm-redis-server-2             2/2     Running                      0          17m
    backup-webgui-backup-0                0/2     Init:1/2                     0          8s
  15. Create an empty configmap on the primary cluster to allow the operator to continue the installation. The installation is paused and waits for this configmap.
    oc create configmap ${PRIMARY_NAME}-cassandraready -n ${PRIMARY_NAMESPACE}
    Where:
    • PRIMARY_NAME is the release name of the primary cluster.
    • PRIMARY_NAMESPACE is the namespace of the primary cluster.
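If you script this step, it can help to make it safe to rerun. The following is a minimal sketch under that assumption: create_cassandraready is a hypothetical wrapper around the oc get configmap and oc create configmap commands. It creates the empty configmap only when it does not exist yet, and it can be reused for the backup cluster by passing the backup release name and namespace.

```shell
# Hypothetical wrapper (not part of the product): create the empty
# <release>-cassandraready configmap unless it already exists.
# $1: release name; $2: namespace.
create_cassandraready() {
  name="$1-cassandraready"
  if oc get configmap "$name" -n "$2" >/dev/null 2>&1; then
    echo "configmap $name already exists in $2"
  else
    oc create configmap "$name" -n "$2"
  fi
}

# Usage sketch:
#   create_cassandraready "${PRIMARY_NAME}" "${PRIMARY_NAMESPACE}"
```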
  16. Run the following command to open a shell in the Cassandra pod:
    oc exec -ti ${PRIMARY_NAME}-cassandra-0-0 -n ${PRIMARY_NAMESPACE} -- bash
    Where:
    • PRIMARY_NAME is the release name of the primary cluster.
    • PRIMARY_NAMESPACE is the namespace of the primary cluster.
  17. Run the following command to verify that the configured credentials work, and that the replication is updated to all three nodes in dc-1 and dc-2, as in the following example:
    $ cqlsh -u $CASSANDRA_USER -p $CASSANDRA_PASS -e "SELECT * FROM system_schema.keyspaces ;"
    keyspace_name      | durable_writes | replication
    --------------------+----------------+---------------------------------------------------------------------------------------------
            ea_policies |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc-1': '3', 'dc-2': '3'}
             janusgraph |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc-1': '3', 'dc-2': '3'}
            system_auth |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc-1': '3', 'dc-2': '3'}
          system_schema |           True |                                     {'class': 'org.apache.cassandra.locator.LocalStrategy'}
     system_distributed |           True |         {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}
                 system |           True |                                     {'class': 'org.apache.cassandra.locator.LocalStrategy'}
            mime_config |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc-1': '3', 'dc-2': '3'}
              ea_events |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc-1': '3', 'dc-2': '3'}
          system_traces |           True |         {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '2'}
       noi_alertdetails |           True | {'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy', 'dc-1': '3', 'dc-2': '3'}
    
    (10 rows)
    
  18. Check that all the pods on the primary cluster are running.
  19. After all the pods on the primary cluster are running, create an empty configmap on the backup cluster to allow the operator to continue the installation. The installation is paused and waits for this configmap.
    oc create configmap ${BACKUP_NAME}-cassandraready -n ${BACKUP_NAMESPACE}
    Where:
    • BACKUP_NAME is the release name of the backup cluster.
    • BACKUP_NAMESPACE is the namespace of the backup cluster.
  20. Check that all the pods on the backup cluster are running.
    Note: If you scale down both the primary and the backup pods to zero, you must scale the primary pod back up first. The ObjectServer bidirectional gateway must connect successfully to both the primary and backup ObjectServers. The gateway is colocated with the backup ObjectServer. If the gateway fails to connect to the primary ObjectServer, the gateway terminates and the backup ObjectServer remains in a CrashLoopBackOff state.
  21. Set up the cloud native analytics triggers on the ObjectServer.
    Important: To set up the cloud native analytics triggers, complete the following steps on the backup cluster only.
    1. Get the pod name.
      oc get po -o name | grep eanoiactionservice
      The pod name is displayed, similar to the following output:
      pod/backup-ea-noi-layer-eanoiactionservice-6b6fc7b46f-6qphh
      In this example, backup is the Netcool Operations Insight backup release name.
    2. Log in to the eanoiactionservice pod.
      oc rsh <pod_name>
      Where <pod_name> is the pod name that is returned by the oc get po -o name | grep eanoiactionservice command.
    3. Set up the cloud native analytics triggers on the ObjectServer.
      npm run setup:all
  22. Set up the datalayer triggers to support incidents.
    Important: To set up the datalayer triggers, complete the following steps on the backup cluster only.
    1. Log in to one of the datalayer pods.
      oc rsh $(oc get pods|grep ncodatalayer-agg-std|head -1|awk '{print $1}')
    2. Run the following command from the datalayer pod to set up the datalayer triggers.
      npm run setupdb