Rolling back IBM Netcool Operations Insight on Red Hat OpenShift

Learn how to roll back from version 1.6.12 to version 1.6.11 or version 1.6.10. Roll back by using the Red Hat® OpenShift® Container Platform Operator Lifecycle Manager (OLM) user interface (UI) or the command line.

Rolling back to version 1.6.11

Roll back to version 1.6.11 from the command line or from the OLM UI.
  1. Scale down the inventory service.
    1. Find the inventory pod and note its replication level.
      oc get pods --field-selector=status.phase=Running --no-headers=true --output=custom-columns=CNAME:.metadata.ownerReferences[0].name | grep ${release}-topology-inventory | uniq --count
    2. Scale down the inventory pod.
      oc scale deployment --replicas=0 ${release}-topology-inventory
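The two substeps above can be combined into a short, hedged sketch that records the replica count before scaling down, so that the service can be restored to the same level later. The REPLICAS variable name is illustrative, and the sketch assumes a live cluster with ${release} set as in the commands above.

```shell
# Record the current replica count of the inventory deployment so that it
# can be restored to the same level after the rollback.
REPLICAS=$(oc get deployment ${release}-topology-inventory -o jsonpath='{.spec.replicas}')
echo "Saved replica count: ${REPLICAS}"

# Scale the inventory service down to zero.
oc scale deployment ${release}-topology-inventory --replicas=0
```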
  2. Clear the inventory database.
    1. Find and store the inventory database administrator username.
      oc get secret $release-topology-postgres-admin -o jsonpath='{.data.username}' | base64 -d
    2. Find and store the inventory database user password.
      oc get secret $release-topology-postgres-admin -o jsonpath='{.data.password}' | base64 -d
    3. Find and store the inventory database name.
      oc get asmformation $release-topology -o jsonpath='{.spec.helmValues.global.postgres.dbname}'
    4. Find and store the inventory schema name.
      oc get asmformation $release-topology -o jsonpath='{.spec.helmValues.global.postgres.schema}'
    5. Run the following command to determine which is the primary pod in the database cluster.
      oc get cluster -o custom-columns=:.status.currentPrimary
    6. Clear the inventory database data. Replace the variables in the following example with the values that you gathered in the previous steps.
      oc exec -it <database_cluster_primary_pod> -- /bin/bash -c 'psql --host localhost --username <database_user> --dbname <database_name> --command "DROP SCHEMA IF EXISTS <database_schema> CASCADE;"'
      When prompted, enter the inventory database administrator password that you gathered earlier.
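Substeps 1 through 6 can be sketched as a single sequence that captures each value in a shell variable before dropping the schema. The variable names are illustrative, and using PGPASSWORD to avoid the interactive password prompt is an assumption to verify in your environment.

```shell
# Gather the connection details from the secret and the ASMFormation resource.
DB_USER=$(oc get secret ${release}-topology-postgres-admin -o jsonpath='{.data.username}' | base64 -d)
DB_PASS=$(oc get secret ${release}-topology-postgres-admin -o jsonpath='{.data.password}' | base64 -d)
DB_NAME=$(oc get asmformation ${release}-topology -o jsonpath='{.spec.helmValues.global.postgres.dbname}')
DB_SCHEMA=$(oc get asmformation ${release}-topology -o jsonpath='{.spec.helmValues.global.postgres.schema}')
PRIMARY_POD=$(oc get cluster -o custom-columns=:.status.currentPrimary --no-headers | head -n 1)

# Drop the inventory schema on the primary database pod.
# PGPASSWORD is an assumption to avoid the interactive prompt.
oc exec -it "${PRIMARY_POD}" -- /bin/bash -c \
  "PGPASSWORD='${DB_PASS}' psql --host localhost --username ${DB_USER} --dbname ${DB_NAME} --command \"DROP SCHEMA IF EXISTS ${DB_SCHEMA} CASCADE;\""
```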
  3. To avoid issues with CouchDB or Redis pods after the rollback, complete the following steps.
    1. If your deployment has more than one CouchDB replica, for example a production-size deployment, scale the CouchDB StatefulSet to zero.
      oc scale sts <release-name>-couchdb --replicas=0
    2. Scale the Redis StatefulSet to zero.
      oc scale sts <release-name>-ibm-redis-server --replicas=0
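Before moving on, you can confirm that both StatefulSets report zero ready replicas. This check is a sketch that assumes the StatefulSet names shown above.

```shell
# Both StatefulSets should show READY as 0 (or <none>) before the rollback.
oc get sts <release-name>-couchdb <release-name>-ibm-redis-server \
  -o custom-columns=NAME:.metadata.name,READY:.status.readyReplicas
```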
  4. Roll back to version 1.6.11.
    • Command line: To roll back from the command line, use the oc edit noi command and change the version back to version 1.6.11. Update the following parameters.
      spec:
        postgresql:
          parameters:
            archiveTimeout: 5min
            checkpointCompletionTarget: "0.9"
            defaultStatisticsTarget: "100"
            dynamicSharedMemoryType: posix
            effectiveCacheSize: 3GB
            effectiveIoConcurrency: "200"
            hugePages: try
            maintenanceWorkMem: 512MB
            maxConnections: "100"
            maxParallelMaintenanceWorkers: "2"
            maxParallelWorkers: "4"
            maxParallelWorkersPerGather: "2"
            maxReplicationSlots: "32"
            maxWalSize: 16GB
            maxWorkerProcesses: "32"
            minWalSize: 4GB
            randomPageCost: "1.1"
            sharedBuffers: 1GB
            sharedMemoryType: mmap
            sharedPreloadLibraries: auto_explain
            walBuffers: 16MB
            walKeepSize: 512MB
            walReceiverTimeout: 5s
            walSenderTimeout: 5s
            workMem: 6MB
      Save changes to the noi instance. The pods restart.
    • OLM UI: To roll back from the OLM UI, go to Operators > Installed Operators > IBM® Netcool® for AIOps Event Manager. Then, select the NOI tab. Select your deployment and then select the YAML tab. Change the version back to 1.6.11, update the Postgres parameters as described in the command-line step, and save the changes.
    The inventory service scales up automatically.
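To confirm that the rollback took effect and to watch the restart complete, a hedged check follows; the .spec.version field path is an assumption based on the oc edit noi step above.

```shell
# Confirm that the noi instance now reports the rolled-back version.
oc get noi -o jsonpath='{.items[0].spec.version}'

# Watch the pods cycle until they are all Running again.
oc get pods --watch
```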
  5. Remove the ibm-hdm-analytics-dev-aidl-ca configmap.
    oc delete configmap ibm-hdm-analytics-dev-aidl-ca
  6. Rebroadcast data to the inventory service. If the inventory data is out of sync with the data in the Cassandra database, resynchronize it by calling the rebroadcast API of the Topology service, specifying a tenantId. This call triggers a rebroadcast of all known resources on Kafka, and the inventory service then indexes those resources in PostgreSQL:
    https://master_fqdn/1.0/topology/swagger#!/Crawlers/rebroadcastTopology
    The rebroadcast crawler can also be run from the Data Administration Routines UI. Log in to the Administration > Topology Configuration UI and select the Data administration routines tile. For more information about running routines, see Running data administration routines in the IBM Agile Service Manager documentation.
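The rebroadcast call from step 6 can be sketched with curl. The endpoint path, tenant header name, and basic authentication shown here are assumptions; confirm them against the Swagger page referenced above.

```shell
# Trigger a rebroadcast of all known resources for one tenant.
# The X-TenantID header name and the credentials are assumptions; verify in Swagger.
curl -k -X POST \
  -u "<asm_user>:<asm_password>" \
  -H "X-TenantID: <tenant_id>" \
  "https://master_fqdn/1.0/topology/crawlers/rebroadcast"
```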

Rolling back to version 1.6.10

  1. Scale down the inventory service.
    1. Find the inventory pod and note its replication level.
      oc get pods --field-selector=status.phase=Running --no-headers=true --output=custom-columns=CNAME:.metadata.ownerReferences[0].name | grep ${release}-topology-inventory | uniq --count
    2. Scale down the inventory pod.
      oc scale deployment --replicas=0 ${release}-topology-inventory
  2. Clear the inventory database.
    1. Find and store the inventory database administrator username.
      oc get secret $release-topology-postgres-admin -o jsonpath='{.data.username}' | base64 -d
    2. Find and store the inventory database user password.
      oc get secret $release-topology-postgres-admin -o jsonpath='{.data.password}' | base64 -d
    3. Find and store the inventory database name.
      oc get asmformation $release-topology -o jsonpath='{.spec.helmValues.global.postgres.dbname}'
    4. Find and store the inventory schema name.
      oc get asmformation $release-topology -o jsonpath='{.spec.helmValues.global.postgres.schema}'
    5. Run the following command to determine which is the primary pod in the database cluster.
      oc get cluster -o custom-columns=:.status.currentPrimary
    6. Clear the inventory database data. Replace the variables in the following example with the values that you gathered in the previous steps.
      oc exec -it <database_cluster_primary_pod> -- /bin/bash -c 'psql --host localhost --username <database_user> --dbname <database_name> --command "DROP SCHEMA IF EXISTS <database_schema> CASCADE;"'
      When prompted, enter the inventory database administrator password that you gathered earlier.
  3. To avoid issues with CouchDB or Redis pods after the rollback, complete the following steps.
    1. If your deployment has more than one CouchDB replica, for example a production-size deployment, scale the CouchDB StatefulSet to zero.
      oc scale sts <release-name>-couchdb --replicas=0
    2. Scale the Redis StatefulSet to zero.
      oc scale sts <release-name>-ibm-redis-server --replicas=0
  4. Roll back to version 1.6.10.
    • Command line: To roll back from the command line, use the oc edit noi command and change the version back to version 1.6.10. Update the following parameters.
      spec:
        helmValuesNOI:
          ncobackup.image.objserv.digest: sha256:83a6e9855e5c8767e4a1e27df7cf3c461e6a3bcb5faeede3f89832651671fbef
          ncobackup.image.objserv.name: nco-objserv
          ncobackup.image.objserv.tag: rel20-rollback-5
          ncoprimary.image.digest: sha256:83a6e9855e5c8767e4a1e27df7cf3c461e6a3bcb5faeede3f89832651671fbef
          ncoprimary.image.name: nco-objserv
          ncoprimary.image.tag: rel20-rollback-5
      Also, update the following Postgres parameters.
      spec:
        postgresql:
          parameters:
            archiveTimeout: 5min
            checkpointCompletionTarget: "0.9"
            defaultStatisticsTarget: "100"
            dynamicSharedMemoryType: posix
            effectiveCacheSize: 3GB
            effectiveIoConcurrency: "200"
            hugePages: try
            maintenanceWorkMem: 512MB
            maxConnections: "100"
            maxParallelMaintenanceWorkers: "2"
            maxParallelWorkers: "4"
            maxParallelWorkersPerGather: "2"
            maxReplicationSlots: "32"
            maxWalSize: 16GB
            maxWorkerProcesses: "32"
            minWalSize: 4GB
            randomPageCost: "1.1"
            sharedBuffers: 1GB
            sharedMemoryType: mmap
            sharedPreloadLibraries: auto_explain
            walBuffers: 16MB
            walKeepSize: 512MB
            walReceiverTimeout: 5s
            walSenderTimeout: 5s
            workMem: 6MB
      Save changes to the noi instance. The pods restart.
    • OLM UI: To roll back from the OLM UI, go to Operators > Installed Operators > IBM Cloud Pak for AIOps Event Manager. Then, select the NOI tab. Select your deployment and then select the YAML tab. Change the version back to 1.6.10, update the image and Postgres parameters as described in the command-line step, and save the changes.
    • (Optional) If you rolled back to version 1.6.10, restart the cem-operator pod to ensure that it runs the correct image digests for version 1.6.10. Find the pod name with oc get pods | grep cem-operator, then delete the pod so that it is re-created.
      oc delete pod <release_name>-cem-operator-<pod_id>
    • Remove the ibm-hdm-analytics-dev-aidl-ca configmap by running the following command.
      oc delete configmap ibm-hdm-analytics-dev-aidl-ca
    • Rebroadcast data to the inventory service. If the inventory data is out of sync with the data in the Cassandra database, resynchronize it by calling the rebroadcast API of the Topology service, specifying a tenantId. This call triggers a rebroadcast of all known resources on Kafka, and the inventory service then indexes those resources in PostgreSQL:
      https://master_fqdn/1.0/topology/swagger#!/Crawlers/rebroadcastTopology
      The rebroadcast crawler can also be run from the Data Administration Routines UI. Log in to the Administration > Topology Configuration UI and select the Data administration routines tile. For more information about running routines, see Running data administration routines in the IBM Agile Service Manager documentation.
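If you prefer not to edit the YAML interactively, the image overrides from the command-line step can be sketched as a single oc patch. The instance name is a placeholder, and patching the dotted helmValuesNOI keys with a merge patch is an assumption to verify against your operator version.

```shell
# Apply the 1.6.10 ObjectServer image overrides with a merge patch.
# <noi_instance_name> is a placeholder for your noi custom resource.
oc patch noi <noi_instance_name> --type merge -p '
spec:
  helmValuesNOI:
    ncobackup.image.objserv.digest: sha256:83a6e9855e5c8767e4a1e27df7cf3c461e6a3bcb5faeede3f89832651671fbef
    ncobackup.image.objserv.name: nco-objserv
    ncobackup.image.objserv.tag: rel20-rollback-5
    ncoprimary.image.digest: sha256:83a6e9855e5c8767e4a1e27df7cf3c461e6a3bcb5faeede3f89832651671fbef
    ncoprimary.image.name: nco-objserv
    ncoprimary.image.tag: rel20-rollback-5
'
```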