Pods displaying ImagePullBackOff status

Troubleshoot the issue where pods display the ImagePullBackOff status.

Problem

Pods display ImagePullBackOff as their status and never reach a Running status.

For example:

kubectl get pods -n cp4mcm-cloud-native-monitoring

NAME                                                         READY   STATUS              RESTARTS   AGE
ibm-monitoring-dataprovider-mgmt-operator-6c58fd8fc4-2gxq7   0/1     ImagePullBackOff    0          3m55s

Cause

The following reasons might cause the ImagePullBackOff status.

Offline installation:

  1. You are using a self-signed certificate for your docker registry. The docker daemon that Kubernetes uses on the managed cluster does not trust the self-signed certificate. As a result, an x509: certificate signed by unknown authority error is returned when you describe the pod with the ImagePullBackOff status.

  2. The status might be caused by incorrect login credentials for the local docker registry. A quick way to inspect the secrets that hold these credentials is shown after this list.

    a. For remote deployment, an incorrect docker login credential for the local docker registry exists in the pull-secret in the openshift-config namespace on the Hub cluster, which results in an authentication required error.

    b. For local deployment through OperatorHub, an incorrect docker login credential for the local docker registry exists in the pull-secret in the openshift-config namespace on the managed cluster, which results in an authentication required error.

    c. For local deployment by script, an incorrect docker login credential for the local docker registry exists in the ibm-management-pull-secret in the cp4mcm-cloud-native-monitoring namespace on the managed cluster, which results in an authentication required error.
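
For reference, you can inspect the docker credentials that these secrets contain. The following commands are an example only and assume that the secrets are of type docker-registry (dockerconfigjson); run them on the cluster that applies to your deployment:

# Remote deployment: run on the Hub cluster. Local deployment through OperatorHub: run on the managed cluster.
oc get secret pull-secret -n openshift-config -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d

# Local deployment by script: run on the managed cluster.
kubectl get secret ibm-management-pull-secret -n cp4mcm-cloud-native-monitoring -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d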

Online installation:

It might be caused by incorrect login credentials for the entitled registry. You can verify the credentials with the docker login check that is shown after this list.

  1. For remote deployment, an incorrect or expired docker login credential for the entitled registry exists in the secret that you create in 3. Create the entitled registry secret when you install IBM Cloud Pak for Multicloud Management on the hub cluster, which results in an authentication required error.
  2. For local deployment through OperatorHub or by script, an incorrect docker login credential for the entitled registry exists in the ibm-management-pull-secret in the cp4mcm-cloud-native-monitoring namespace on the managed cluster, which results in an authentication required error.
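
To check whether the credentials are still valid, you can try a docker login against the entitled registry. The registry host and user name shown here are the values that are commonly used for the IBM entitled registry; substitute your own values if they differ:

docker login cp.icr.io --username cp --password <your entitlement key>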

Resolving the problem

  1. Get a detailed description of the pod with the ImagePullBackOff status. For example:

    kubectl describe pod ibm-monitoring-dataprovider-mgmt-operator-6c58fd8fc4-2gxq7 -n cp4mcm-cloud-native-monitoring
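
    Optionally, you can also list recent events in the namespace to locate the image pull failure quickly. This is a convenience only, not a required step:

    kubectl get events -n cp4mcm-cloud-native-monitoring --sort-by=.lastTimestamp | grep -i pull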
    
  2. If the error is the x509 certificate error, for example:

    ...
    Warning  Failed     <invalid>                      kubelet, worker1.stubbly.cp.fyre.ibm.com  Failed to pull image "bordure-inf.fyre.ibm.com:5555/cp/cp4mcm/ibm-monitoring-dataprovider-mgmt-operator@sha256:38c703640ead573c0fc8c766a2fa48ed3a2de90f30a6c2ade9cbbd8ec747ff1e": rpc error: code = Unknown desc = error pinging docker registry bordure-inf.fyre.ibm.com:5555: Get https://bordure-inf.fyre.ibm.com:5555/v2/: x509: certificate signed by unknown authority
    

    This error indicates that you installed Monitoring DataProvider Management in offline mode, and you must instruct docker to trust the self-signed certificate that your docker registry uses. To do so, perform step 1 and step 2 in Local deployment from OperatorHub.
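
    For reference only (the linked steps are authoritative), on nodes that run the docker daemon, trusting a self-signed registry certificate typically means placing the registry CA certificate under /etc/docker/certs.d. The registry host, port, and certificate file name in this sketch are examples taken from the error message above; substitute your own:

    mkdir -p /etc/docker/certs.d/bordure-inf.fyre.ibm.com:5555
    cp ca.crt /etc/docker/certs.d/bordure-inf.fyre.ibm.com:5555/ca.crt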

  3. If the error is an authentication required error, it might be caused by an invalid or expired docker-registry secret.

    Offline installation:
    For remote deployment, the cause might be an expired LOCAL_DOCKER_PASSWORD that you set when you install IBM Cloud Pak for Multicloud Management offline on the Hub cluster. For local deployment, the cause might be an invalid or expired docker-registry secret that you create in step 4 of Local deployment from OperatorHub or step 3 of Local deployment by script.

    To resolve this error, do the following steps:

    a. For remote deployment, verify the global image pull secret on the Hub cluster. Make sure that it contains the correct login credentials for your local docker registry. If the credentials are incorrect, update them and wait for the change to roll out to all cluster nodes. Example commands are shown after this list.

    b. For local deployment through OperatorHub, verify the global image pull secret on the managed cluster. Make sure that it contains the correct login credentials for your local docker registry. If the credentials are incorrect, update them and wait for the change to roll out to all cluster nodes.

    c. For local deployment by script, delete and recreate the ibm-management-pull-secret secret that you created in the cp4mcm-cloud-native-monitoring namespace on the managed cluster.

    d. Uninstall Monitoring DataProvider Management, reinstall it, and check that the pod is created and reaches a Running status.
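
    The following commands illustrate steps a through c. The file path, registry address, and credentials are placeholders, and the oc set data form requires a recent oc client:

    # Steps a and b: inspect and, if needed, update the global image pull secret.
    oc get secret pull-secret -n openshift-config -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
    oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=<path to updated .dockerconfigjson>

    # Step c: recreate ibm-management-pull-secret on the managed cluster.
    kubectl delete secret ibm-management-pull-secret -n cp4mcm-cloud-native-monitoring
    kubectl create secret docker-registry ibm-management-pull-secret \
      --docker-server=<local docker registry>:<port> \
      --docker-username=<user name> \
      --docker-password=<password> \
      -n cp4mcm-cloud-native-monitoring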

    Online installation:

    For remote deployment, the cause might be expired docker login credentials for the entitled registry in the secret that you create in 3. Create the entitled registry secret when you install IBM Cloud Pak for Multicloud Management on the hub cluster. For local deployment, the cause might be an invalid or expired docker-registry secret that you create in step 7 of Local deployment from OperatorHub or step 3 of Local deployment by script.

    To resolve this error, do the following steps:

    a. For remote deployment, verify the secret that you create in 3. Create the entitled registry secret when you install IBM Cloud Pak for Multicloud Management on the hub cluster. Make sure that it contains the correct login credentials for the entitled registry. If the credentials are incorrect, update them.

    b. For local deployment through OperatorHub and local deployment by script, delete and recreate the ibm-management-pull-secret secret that you created in the cp4mcm-cloud-native-monitoring namespace on the managed cluster. An example is shown after this list.

    c. Uninstall Monitoring DataProvider Management, reinstall it, and check that the pod is created and reaches a Running status.
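
    For step b, recreating the secret typically looks like the following example. The registry host and user name shown are the values that are commonly used for the IBM entitled registry; substitute your own values if they differ:

    kubectl delete secret ibm-management-pull-secret -n cp4mcm-cloud-native-monitoring
    kubectl create secret docker-registry ibm-management-pull-secret \
      --docker-server=cp.icr.io \
      --docker-username=cp \
      --docker-password=<your entitlement key> \
      -n cp4mcm-cloud-native-monitoring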