Recovering from a failed scale-up in a deployment with Manta Data Lineage (IBM Knowledge Catalog)

After installing or upgrading to version 5.1, scaling up IBM Knowledge Catalog, IBM Knowledge Catalog Premium, or IBM Knowledge Catalog Standard from small_mincpu or small to medium or large can fail in a deployment with IBM Manta Data Lineage.

Symptoms

If connections to the Neo4j database fail, check the overall CR status for your IBM Knowledge Catalog edition (IBM Knowledge Catalog, IBM Knowledge Catalog Premium, or IBM Knowledge Catalog Standard) and for Manta Data Lineage. Then check the status of the Neo4j CR with the following command:

oc get neo4jcluster -n ${PROJECT_CPD_INST_OPERANDS}

If the Neo4j CR reports a failed status, continue with the steps in the Resolving the problem section.
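
The status check can be scripted. The following is a minimal sketch, not an official procedure. It assumes that the failed state appears as text in the default table output of `oc get`. The helper name `neo4j_cr_failed` is hypothetical, and the exact column layout of the output is not described in this topic, so the match is case-insensitive and not tied to a specific column.

```shell
#!/usr/bin/env bash
# Sketch: detect a failed Neo4j CR from `oc get` table output.
# Assumption: the failed state is printed as text in the default output.

neo4j_cr_failed() {
  # Reads `oc get` output on stdin, skips the header row, and
  # returns 0 if any remaining line mentions "failed" (any case).
  tail -n +2 | grep -qi 'failed'
}

# Usage against a live cluster (requires oc and a logged-in session):
#   oc get neo4jcluster -n "${PROJECT_CPD_INST_OPERANDS}" | neo4j_cr_failed \
#     && echo "Neo4j CR failed: continue with Resolving the problem"
```

Because the helper only reads standard input, it can be checked against saved command output before it is used on a cluster.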

Resolving the problem

To fix the issue, complete these steps:

  1. Put the operator for your IBM Knowledge Catalog edition into maintenance mode. In the following command, replace <ikc_service> and <ikc_cr> with the appropriate values for your deployment:

    oc patch <ikc_service> <ikc_cr> --patch '{"spec": {"ignoreForMaintenance": true}}' --type='merge'
    
  2. Log in to the Neo4j operator and delete any additional servers.

    1. Log in to the operator:

      oc rsh -n ${PROJECT_CPD_INST_OPERATORS} $(oc get po -l control-plane=ibm-cpd-neo4j-operator -n ${PROJECT_CPD_INST_OPERATORS} -o name)
      
    2. List the Helm releases and delete the additional servers.

      Important: Do not delete the data-lineage-neo4j-server1 server.

      helm list -n ${PROJECT_CPD_INST_OPERANDS}
      helm delete -n ${PROJECT_CPD_INST_OPERANDS} data-lineage-neo4j-server2
      helm delete -n ${PROJECT_CPD_INST_OPERANDS} data-lineage-neo4j-server3
      
    3. Exit the operator.

  3. Delete the load balancer service:

    oc delete svc data-lineage-neo4j-lb-neo4j -n ${PROJECT_CPD_INST_OPERANDS}
    
  4. Delete the PVCs for the servers that you just removed:

    oc delete pvc data-data-lineage-neo4j-server2-0 data-data-lineage-neo4j-server3-0 transactions-data-lineage-neo4j-server2-0 transactions-data-lineage-neo4j-server3-0 -n ${PROJECT_CPD_INST_OPERANDS}
    
  5. Edit the Neo4j CR file and reset the value for clusterPrimaries.

    1. Edit the CR file:

      oc edit neo4j data-lineage-neo4j-cr -n ${PROJECT_CPD_INST_OPERANDS}
      
    2. Search for the clusterPrimaries entry and set its value to 1:

      clusterPrimaries: 1
      
  6. Force a restart of the Neo4j operator:

    oc delete po -l control-plane=ibm-cpd-neo4j-operator -n ${PROJECT_CPD_INST_OPERATORS}
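
To reduce the risk of a typo when removing the load balancer service and the PVCs, the commands can be generated and reviewed before they are run. The following is a minimal sketch under stated assumptions. The function names `pvc_names_for_servers` and `print_cleanup_commands` are hypothetical. The PVC naming pattern is inferred from the names in step 4, and the `-n` flag on the PVC deletion assumes that the PVCs live in the same project as the Helm releases. The script only prints commands and does not change the cluster.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) the cleanup commands for the removed servers.

pvc_names_for_servers() {
  # Prints the data and transactions PVC names for each server number given.
  # Refuses server 1, which must not be deleted.
  local n
  for n in "$@"; do
    if [ "$n" = "1" ]; then
      echo "refusing: data-lineage-neo4j-server1 must not be deleted" >&2
      return 1
    fi
    echo "data-data-lineage-neo4j-server${n}-0"
    echo "transactions-data-lineage-neo4j-server${n}-0"
  done
}

print_cleanup_commands() {
  # Requires PROJECT_CPD_INST_OPERANDS to be set in the environment.
  local ns="${PROJECT_CPD_INST_OPERANDS:?set PROJECT_CPD_INST_OPERANDS}"
  local pvcs
  pvcs="$(pvc_names_for_servers "$@")" || return 1
  echo "oc delete svc data-lineage-neo4j-lb-neo4j -n ${ns}"
  # Join the PVC names onto one line for a single delete command.
  echo "oc delete pvc $(printf '%s\n' "$pvcs" | paste -sd ' ' -) -n ${ns}"
}

# Usage: review the printed commands, then run them manually.
#   print_cleanup_commands 2 3
```

Printing the commands instead of running them keeps the destructive steps under manual control, and the guard on server 1 mirrors the Important note in step 2.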
    
