Re-install with appDisco observer fails

Re-installing Netcool® Operations Insight® on OpenShift® with the topology analytics application discovery observer enabled causes the installation to fail.

Problem

Netcool Operations Insight on OpenShift was previously installed and then uninstalled. When you attempt to reinstall it with the topology analytics application discovery service enabled, the installation fails.

Cause

The appDisco observer is not uninstalled properly, and residual processes from the original Netcool Operations Insight on OpenShift deployment continue to run and cause the new installation to fail.

Resolution

  1. Remove the residual app-disco config maps, secrets, and routes:
    oc get configmap,secret,route -o name | grep app-disco | xargs oc delete
  2. Delete the topology-secret-manager job:
    oc delete job <release-name>-topology-secret-manager
  3. Check that a new topology-secret-manager job is created and completes successfully. This should take no longer than ten minutes. Use the oc get jobs command to confirm that the topology-secret-manager job has run:
    oc get jobs <release-name>-topology-secret-manager
    Example system output (where <release-name> is noi):
    NAME                                           COMPLETIONS      DURATION         AGE
    ...
    noi-topology-secret-manager                           1/1           14s        4m
    ...
  4. If the topology services are still not running, the topology analytics operator might not have detected that the <release-name>-topology-secret-manager job has run. In that case, the operator must be triggered manually to restart the topology services. Perform the following additional steps.
  5. Confirm that the topology services are not running:
    oc get pods | grep topology
    The output should show that no topology services are running.
  6. Check the topology analytics operator logs. The following errors may be erroneously recorded:
    • Secret-manager job not finished
    • Cassandra secret generator job not finished
    • App Disco init job not finished
    • Statefulsets not ready
  7. Run the oc edit NOI command, and then save the configuration file. Saving triggers the topology analytics operator to restart the topology analytics services, which you can confirm by using:
    oc get pods | grep topology
    All topology analytics services should be running within ten minutes.
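
The checks in steps 3, 5, and 7 can also be scripted. The following is a minimal sketch and not part of the product: the function names job_completed and topology_pods_not_running are hypothetical, and the sketch assumes the default column layout of oc get jobs (NAME, COMPLETIONS, ...) and oc get pods (NAME, READY, STATUS, ...). Both functions read command output on standard input, so they can be tried against saved output before they are used on a live cluster.

```shell
#!/bin/sh

# Hypothetical helper: reads `oc get jobs` output on stdin and returns 0 only
# if the named job reports all completions done (for example 1/1).
job_completed() {
  awk -v job="$1" '
    $1 == job {
      n = split($2, c, "/")           # COMPLETIONS column, for example "1/1"
      if (n == 2 && c[1] == c[2]) ok = 1
    }
    END { exit (ok ? 0 : 1) }
  '
}

# Hypothetical helper: reads `oc get pods` output on stdin and prints the name
# of every topology pod whose STATUS column is neither Running nor Completed.
# Note that it checks the STATUS column only, not the READY column.
topology_pods_not_running() {
  awk '$1 ~ /topology/ && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Example usage against a live cluster (assumes the release name is noi):
#   oc get jobs | job_completed noi-topology-secret-manager && echo "job completed"
#   oc get pods | topology_pods_not_running
```

If topology_pods_not_running prints nothing after step 7, every topology pod that oc get pods lists is in the Running or Completed state. The function does not detect the case where no topology pods exist at all, so still review the oc get pods output as described in step 5.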