Failed to install CSV for kubernetes-nmstate-operator

While installing IBM Fusion HCI System, the installation fails with the error message Failed to install CSV for kubernetes-nmstate-operator. Check log file /home/kni/logs/installoperator_playbook.log for more details.

Diagnosis

Run all commands as a kni user from the provisioner node.

  1. See log file /home/kni/logs/installoperator_playbook.log and search for the following string:
    Wait for CSV kubernetes-nmstate-operator to get installed on second attempt
  2. If the logs do not give enough information, then continue further diagnostics.
  3. As a kni user, run oc get csv -n openshift-nmstate on the provisioned (also known as RU7 or compute-1-ru7 node). Check for `Failed' status.
    NAME                                              DISPLAY                       VERSION               REPLACES   PHASE
    kubernetes-nmstate-operator.4.14.0-202311021650   Kubernetes NMState Operator   4.14.0-202311021650              Failed
    
  4. Check which pods are not Running or in Completed state:
    oc get pods -n openshift-nmstate
    See the sample output for pods that are not in 'Running' STATUS:
    NAME                                      READY   STATUS    RESTARTS       AGE
    nmstate-cert-manager-64474fd576-v75sq     1/1     Running   0              2d2h
    nmstate-console-plugin-5bb9bbf97c-ssjf2   1/1     Running   0              2d2h
    nmstate-handler-2xl6x                     1/1     Running   4 (4d1h ago)   4d1h
    nmstate-handler-8xktl                     1/1     Running   0              4d2h
    nmstate-handler-97pzc                     1/1     Running   3 (4d1h ago)   4d1h
    nmstate-handler-ggbpt                     1/1     Running   0              4d2h
    nmstate-handler-p9wsn                     1/1     Running   0              4d2h
    nmstate-handler-pjjdr                     1/1     Running   3 (4d1h ago)   4d1h
    nmstate-operator-c89955d9-snmh4           1/1     Running   0              2d2h
    nmstate-webhook-69477dbc4f-6v7l9          1/1     Running   0              2d2h
    nmstate-webhook-69477dbc4f-mqwmg          1/1     Running   0              2d2h
    
  5. See the details of pod which are not in 'Running' STATUS:
    oc descibe pod <POD NAME> -n openshift-nmstate
  6. Scroll and examine the Events section of the output:
    Events:
      Type     Reason          Age   From               Message
      ----     ------          ----  ----               -------
      Normal   Scheduled       98m   default-scheduler  Error: ImagePullBackOff
  7. If error from the describe command indicates authorisation or authentication error, then check whether the credentials you provided during the stage2 installation are correct.
    • For online installation, check pull-secret that you provided.
    • For installation from enterprise registry, check the provided credentials for your enterprise registry (for both multiple and single repositories).
    • Try to pull the image manually on one OpenShift® Container Platform node by connecting to the node from provisioner by using oc debug node/<NODE-NAM>.
  8. If the error from the previous describe command indicates manifest unknown error, then do the following checks:
    • For installation from enterprise registry, check whether your enterprise registry is configured properly and reachable from provisooner (RU7).
    • For installation from enterprise registry, make sure that the redhat kubernetes-nmstate-operator images are mirrored correctly.
    • Make sure that the user mirror images with right digest to right path exist in the enterprise registry.

Next actions

Take corrective actions and rerun installation.