Failed to deploy nmstate-operator

Failed to deploy nmstate-operator. Check the log file /home/kni/logs/installoperator_playbook.log for more details.

Diagnosis

Run all commands as a kni user from the provisioner node (also known as RU7 or compute-1-ru7).

  1. Run the following command to check whether the catalog source exists:
    oc get catsrc -n openshift-marketplace
  2. Verify whether the catalog source with name redhat-operators exists.
    NAME                          DISPLAY                   TYPE   PUBLISHER   AGE
    certified-operators           Certified Operators       grpc   Red Hat     3d9h
    community-operators           Community Operators       grpc   Red Hat     3d9h
    ibm-operator-catalog          IBM Operator Catalog      grpc   IBM         3d2h
    isf-data-foundation-catalog   Data Foundation Catalog   grpc   IBM         3d5h
    isf-catalog                                             grpc               8h
    redhat-marketplace            Red Hat Marketplace       grpc   Red Hat     3d9h
    redhat-operators              Red Hat Operators         grpc   Red Hat     3d9h
    
  3. If the catalog source exists, run the following command to check for the catalog source pod:
    oc get pods -n openshift-marketplace
  4. Verify that a pod whose name starts with redhat-operators- is present and in the Running state, with 1/1 containers in the READY field. For example:
    NAME                                  READY   STATUS             RESTARTS   AGE
    isf-catalog-wh8zn                     1/1     Running            0          8h
    marketplace-operator-dd4dff8f-jxwm4   1/1     Running            0          32h
    redhat-marketplace-d9dl8              1/1     Running            0          32h
    redhat-operators-gzk4q                1/1     Running            0          32h
    
  5. If the pod is not running, run the following command to see the details of the pod:
    oc describe pod <POD NAME> -n openshift-marketplace
  6. Scroll to the Events section of the output and examine it. For example:
    Events:
      Type     Reason          Age   From               Message
      ----     ------          ----  ----               -------
      Normal   Scheduled       98m   default-scheduler  Error: ImagePullBackOff 
    
  7. If the error from the describe command is an authorization or authentication error, check the credentials that you provided on the stage2 installation GUI page:
    • For online installation, check the pull secret that you provided.
    • For installation from enterprise registry, check the provided credentials for your enterprise registry (for both multiple and single repositories).
    • Try to pull the image manually on one OpenShift® Container Platform node by connecting to the node from the provisioner by using oc debug node/<NODE-NAME>.
  8. If the error from the describe command (step 5) indicates a manifest unknown error, check the following:
    • For installation from enterprise registry, check whether your enterprise registry is configured properly and reachable from the provisioner (RU7).
    • For installation from enterprise registry, make sure that the kubernetes-nmstate-operator images are mirrored correctly.
    • Make sure that the mirrored images exist in the enterprise registry with the correct digest and at the correct path.
  9. If the redhat-operators pod is running successfully, run the following command to check that the openshift-nmstate namespace exists:
    oc get namespace | grep openshift-nmstate
    
    Sample output:
    openshift-nmstate                                  Active   4d1h
    
    If the previous oc get namespace | grep openshift-nmstate command gives a blank output, the namespace is not created. Check the log file /home/kni/logs/installoperator_playbook.log for the following message:
    message":"etcdserver: request timed out","code":500}\n'", "reason": "Internal
          Server Error", "status": 500}
    This message indicates an API issue caused by limited network bandwidth or other latency. If retrying the installation fails, contact IBM Support. If you do not find any API-related issue in the logs, also contact IBM Support.
  10. If the namespace is created, check for the operator group named openshift-nmstate:
    oc get og -n openshift-nmstate
    
    Sample output:
    NAME                AGE
    openshift-nmstate   4d2h
    
    If the previous oc get og -n openshift-nmstate command gives a blank output, the operator group is not created. Check the log file /home/kni/logs/installoperator_playbook.log for the following message:
    message":"etcdserver: request timed out","code":500}\n'", "reason": "Internal
          Server Error", "status": 500}
    This message indicates an API issue caused by limited network bandwidth or other latency. If retrying the installation fails, contact IBM Support. If you do not find any API-related issue in the logs, also contact IBM Support.
  11. If the operator group is created, run the following command to check for the kubernetes-nmstate-operator subscription:
    oc get sub -n openshift-nmstate
    
    Sample output:
    NAME                          PACKAGE                       SOURCE             CHANNEL
    kubernetes-nmstate-operator   kubernetes-nmstate-operator   redhat-operators   stable
    
    If the previous oc get sub -n openshift-nmstate command gives a blank output, the subscription is not created. Check the log file /home/kni/logs/installoperator_playbook.log for the following message:
    message":"etcdserver: request timed out","code":500}\n'", "reason": "Internal
          Server Error", "status": 500}
    This message indicates an API issue caused by limited network bandwidth or other latency. If retrying the installation fails, contact IBM Support. If you do not find any API-related issue in the logs, also contact IBM Support.
  12. If the subscription is created, check for the kubernetes-nmstate-operator install plan:
    oc get ip -n openshift-nmstate
    
    Sample output:
    NAME            CSV                                               APPROVAL    APPROVED
    install-kz5sb   kubernetes-nmstate-operator.4.14.0-202311021650   Automatic   true
    
  13. If the oc get ip -n openshift-nmstate command gives a blank output, the install plan is not created. In that case, run the oc get job -n openshift-marketplace command to check the jobs. For example:
    NAME                                                              COMPLETIONS   DURATION   AGE
    0308d219fbaf8eec9372a4314df5af975f06f2ff3cd55c8fc33489e88deb346   1/1           9s         3d2h
    08dd142dfe6be404a597f0c82c8bad5787997272aa8867842e1018e06b3c32f   1/1           8s         3d4h
    0f59dc23d14241f8977ca45073457ddfaca24a68c7d0a6c8e14edf4d9bca337   1/1           7s         3d6h
    19a4f4d1b807398820a95a3b292d4f595c442ae7861c49d5b27d09624c0e9b1   1/1           10s        3d
    235cc047aad86ecb3c6b1d2d45ebf4bcbd54d89c352e386745edaa2b5e89b33   1/1           9s         3d2h
    31c5fff269b3664c17b94430b42e652ab175ec2ebee85e6c7df01a7da6ff600   1/1           7s         3d3h
    324ae09e681c9662339ae2dd9a38b2ad70c2430272c3229da527fd0f2b77fb8   1/1           12s        3d9h
    56d513408bd5a601c1701a5ca18fa71ff4e8dc049c5f205ecfcde1897ee03bd   1/1           7s         3d6h
    63d794fe8f947104f804d2db67793404dd4c95756e405bf30f5bd4ac59409b1   1/1           9s         3d
    69cd2f81af46d28954bf54d01ba81582a3ee4059da62aba89fc4762f7c1d199   1/1           10s        3d6h
    6f568dbc4626d4adcf92316d6d9e0a7250ae118b5cad7d97cac9525bf810ee3   1/1           9s         3d3h
    7a16b39766293d24b1ea674bf2221c218603f148aa5b01def75c3c792dd6ee5   1/1           27s        3d6h
    82067d1666f17dd7acd36ec050bfa065850240fcaaef5aba3ad11a91d6aa63f   1/1           9s         3d
    963e64e4cb5615267ab3821d47a8acb6f04586013df3472949178b49f049c64   1/1           8s         3d6h
    96eb43df54bdc1c822600b707e1078090ed2cfdd3587360a7545525097f52a6   1/1           9s         3d6h
    c7906c42e8405161bde11a3e044dbc23c87bf31e17393e061c917b6074ff28a   1/1           12s        3d2h
    c7f12c9dc97b5a771181f4fb58ecf7cd812f8d8a5756f810d40e5295f345f73   1/1           38s        3d
    d0d80a9e2553e077d37d4a407fe3a58b4b44596df02efad7d5d962e0505680b   1/1           22s        3d
    d68f3512c112308d77831d6c2765ee4c8e8d1960661a33dad59f0419e9fd732   1/1           9s         3d3h
    f31ed4a99e495429a1d47697faef595eb7a0f59b40d43617f8f3e2018bd1842   1/1           7s         3d6h
    
  14. If any job does not show 1/1 in the COMPLETIONS column, run the oc get job <JOB NAME> -n openshift-marketplace command and check for errors.
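
The checks in steps 9 to 12 follow a fixed order (namespace, operator group, subscription, install plan), so they can be chained to report the first resource that is missing. The following is only a minimal sketch under stated assumptions, not part of the installer: the function name first_missing_nmstate_resource is hypothetical, it assumes that oc is already logged in to the cluster as the kni user, and it treats an empty -o name listing as "not created".

```shell
# Hypothetical helper (not shipped with the product): walks the checks from
# steps 9-12 in order and prints the first resource that is missing in the
# openshift-nmstate namespace, or "none" when all four are present.
first_missing_nmstate_resource() {
  _ns="openshift-nmstate"

  # Step 9: the namespace itself. oc exits non-zero when it is not found.
  if ! oc get namespace "$_ns" -o name >/dev/null 2>&1; then
    echo "namespace"
    return 1
  fi

  # Steps 10-12: operator group (og), subscription (sub), install plan (ip),
  # using the same short names as the steps above.
  # Assumption: empty output from "-o name" means the resource was not created.
  for _kind in og sub ip; do
    if [ -z "$(oc get "$_kind" -n "$_ns" -o name 2>/dev/null)" ]; then
      echo "$_kind"
      return 1
    fi
  done

  echo "none"
  return 0
}
```

Calling first_missing_nmstate_resource prints namespace, og, sub, ip, or none. The printed name points to the step to revisit; it does not replace checking /home/kni/logs/installoperator_playbook.log for the API timeout message.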

Next actions

Take corrective actions and rerun the installation.
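
Before rerunning, it can help to repeat the table checks from steps 4 and 14 without reading every row by eye. The following awk filters are a minimal sketch: the function names incomplete_jobs and unhealthy_catalog_pods are hypothetical, and both assume the default column layout shown in the sample outputs (NAME and COMPLETIONS for jobs; NAME, READY, and STATUS for pods).

```shell
# Hypothetical filter for step 14: reads "oc get job" table output on stdin
# and prints the names of jobs whose COMPLETIONS column is not 1/1.
incomplete_jobs() {
  # NR > 1 skips the header row; NF >= 2 skips blank lines.
  awk 'NR > 1 && NF >= 2 && $2 != "1/1" { print $1 }'
}

# Hypothetical filter for step 4: reads "oc get pods" table output on stdin
# and prints redhat-operators- pods that are not 1/1 READY and Running.
unhealthy_catalog_pods() {
  awk 'NR > 1 && index($1, "redhat-operators-") == 1 &&
       ($2 != "1/1" || $3 != "Running") { print $1 }'
}
```

For example, pipe oc get job -n openshift-marketplace into incomplete_jobs, or oc get pods -n openshift-marketplace into unhealthy_catalog_pods. An empty result from unhealthy_catalog_pods only means that no listed redhat-operators- pod is unhealthy; it does not confirm that such a pod exists, so the presence check from step 4 still applies.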