Pod Issues

A GUI pod is stuck in 3/4 running containers with multiple restarts

During an upgrade or a rolling pod update, the Liberty container in a GUI pod has multiple restarts but does not recover.

kubectl get pod -n ibm-spectrum-scale | grep gui

NAME                               READY   STATUS    RESTARTS          AGE
ibm-spectrum-scale-gui-0           3/4     Running   705 (4m18s ago)   3d
ibm-spectrum-scale-gui-1           4/4     Running   1 (17h ago)       2d16h

Manually delete the pod to fix the issue:

  1. Identify the problematic GUI pod:

     kubectl get pod -n ibm-spectrum-scale | grep gui
    
     ibm-spectrum-scale-gui-0           3/4     Running   705 (4m18s ago)   3d
    
  2. Delete the GUI pod:

     kubectl delete pod -n ibm-spectrum-scale <gui_pod_name>
    

    Example output:

     kubectl delete pod -n ibm-spectrum-scale ibm-spectrum-scale-gui-0
     pod "ibm-spectrum-scale-gui-0" deleted
    
  3. Verify that the GUI pod recovers with 4/4 containers in READY.

     kubectl get pod -n ibm-spectrum-scale | grep gui
    
     ibm-spectrum-scale-gui-0           4/4     Running   0          4m
     ibm-spectrum-scale-gui-1           4/4     Running   0          16m
    
  4. Verify that the GUI pods are at the correct level.

     for pod in `kubectl get pods -lapp.kubernetes.io/name=gui -n ibm-spectrum-scale -ojson | jq -r .items[].metadata.name`; do
         echo -e "\n==== $pod ===="
         kubectl logs $pod -n ibm-spectrum-scale -c liberty | grep 'GPFS GUI'
     done
    
     ==== ibm-spectrum-scale-gui-0 ====
     GPFS GUI Version:5.2.3-0
     GPFS GUI Build Date: 20241119-1908
    
     ==== ibm-spectrum-scale-gui-1 ====
     GPFS GUI Version:5.2.3-0
     GPFS GUI Build Date: 20241119-1908
    

Error: daemon and kernel extension do not match

This error occurs after an unintentional upgrade of IBM Storage Scale container native and presents itself as the GPFS state being down. The following error can be found in /var/adm/ras/mmfs.log.latest on the core pods.

Error: daemon and kernel extension do not match
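
A minimal check from one of the core pods (a sketch; the pod name is a placeholder, and the gpfs container is the one used elsewhere in this document):

kubectl exec -n ibm-spectrum-scale <core_pod_name> -c gpfs -- \
grep "daemon and kernel extension do not match" /var/adm/ras/mmfs.log.latest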

To prevent this issue, follow proper upgrade procedures as the kernel module cannot be unloaded when a file system is in use.

To resolve this problem, restart the node or follow procedures to remove application workloads and use the following command to unload the kernel module. For more information, see Removing applications.

rmmod tracedev mmfs26 mmfslinux

Error: could not insert module, required key not available

This error occurs when secure boot is enabled on Red Hat OpenShift nodes. The issue presents itself as the GPFS state being down, which you can check by entering the following command:

kubectl exec -n ibm-spectrum-scale \
$(kubectl get pods -lapp.kubernetes.io/name=core -n ibm-spectrum-scale -ojsonpath="{.items[0].metadata.name}") \
-- mmgetstate -a

The output appears as shown:

 Node number  Node name        GPFS state
-------------------------------------------
       1      worker0          active
       2      worker1          down
       3      worker2          active

The following error can be found in /var/adm/ras/mmfs.log.latest on the core pods that have GPFS state down.

ERROR: could not insert module /lib/modules/4.18.0-372.53.1.el8_6.x86_64/extra/tracedev.ko: Required key not available

To verify the secure boot state and resolve the problem, see Validate secure boot.
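
As a quick spot check, you can query the secure boot state of a node from a debug pod (a sketch; the node name is illustrative and assumes the mokutil utility is available on the node):

oc debug node/worker1.example.com -- chroot /host mokutil --sb-state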

MountVolume.SetUp failed for volume ssh-keys

Warning FailedMount 83m (x5 over 83m) kubelet, worker-0.example.ibm.com  MountVolume.SetUp failed for volume "ssh-keys" : secret "ibm-spectrum-scale-ssh-key-secret" not found

Check the pod and secret creation times. It is common for the ibm-spectrum-scale-ssh-key-secret to be created after the core pods are deployed, in which case the pods cannot find the secret because it does not exist yet.
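
A minimal way to compare the two creation timestamps (a sketch, using the secret name from the event and the core pod label selector used elsewhere in this document):

kubectl get secret ibm-spectrum-scale-ssh-key-secret -n ibm-spectrum-scale \
-o jsonpath='{.metadata.creationTimestamp}{"\n"}'
kubectl get pods -lapp.kubernetes.io/name=core -n ibm-spectrum-scale \
-o custom-columns=NAME:.metadata.name,CREATED:.metadata.creationTimestamp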

The message can be misleading as it takes time for the operator to create all the resources. This error is transient and will resolve itself after the secret is present.

If the core pods are not in Running state and the secret has been created and present for some time, deleting the core pods resolves the issue. This action causes the pods to be re-created and the secret to be mounted successfully.
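
For example, a minimal sketch for deleting an affected core pod (the pod name is a placeholder):

kubectl delete pod -n ibm-spectrum-scale <core_pod_name>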

Unable to retrieve some image pull secrets

Starting with Red Hat OpenShift 4.15 (Kubernetes 1.28), warning messages appear when secrets that are referenced in a service account are not created in the namespace. Configuring ICR entitlement by using the Red Hat OpenShift global pull secret results in these messages appearing in the events.

"Unable to retrieve some image pull secrets (...); attempting to pull the image may not succeed."

After the namespace secrets are created, the pods need to be restarted for the warning messages to quiesce. For core pods only, use the following annotation to have the operator orchestrate the pod restarts while preserving quorum:

kubectl annotate pod -lapp.kubernetes.io/name=core -n ibm-spectrum-scale scale.spectrum.ibm.com/pending=delete

For other pods, you must delete them manually.
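
One way to confirm that the warning is no longer being emitted is to filter recent events for the message text (a sketch):

kubectl get events -n ibm-spectrum-scale | grep "Unable to retrieve some image pull secrets"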

GUI or Grafana bridge pods fail to start, no data returned from pmcollector to front-end applications

An issue exists where no data is returned to front-end applications that actively consume performance metrics from the IBM Storage Scale pmcollector. The issue can also present itself as the Grafana bridge pod failing to start. If you experience this issue, apply the following workaround.

  1. Check the NodeNetworkConfigurationPolicy resources to determine which network interfaces are configured for the node network.

    • List the NodeNetworkConfigurationEnactments, which show the policy that is applied to each node:

      kubectl get nnce
      

      Example:

      # kubectl get nnce
      NAME                                                         STATUS
      compute-0.mycluster.example.com.bond1-ru5-policy   SuccessfullyConfigured
      compute-1.mycluster.example.com.bond1-ru6-policy   SuccessfullyConfigured
      compute-2.mycluster.example.com.bond1-ru7-policy   SuccessfullyConfigured
      control-0.mycluster.example.com.bond1-ru2-policy   SuccessfullyConfigured
      control-1.mycluster.example.com.bond1-ru3-policy   SuccessfullyConfigured
      control-2.mycluster.example.com.bond1-ru4-policy   SuccessfullyConfigured
      
    • Describe the NodeNetworkConfigurationEnactment to identify the network interface that is being used.

      Example:

      # kubectl describe nnce compute-0.mycluster.example.com.bond1-ru5-policy | grep Name
      Name:         compute-0.mycluster.example.com.bond1-ru5-policy
      Namespace:
      Name:           bond1-ru5-policy
        Name:         bond1
        Name:         bond1.3201
      

      In this particular example, the bond interfaces are configured for the node network traffic.

  2. Change the Performance Data Collection rules to limit the discovery of network adapters to only the configured interfaces.

    • Stop the sensor activities on all core nodes

      kubectl get pods -lapp.kubernetes.io/name=core -n ibm-spectrum-scale \
      -ojsonpath="{range .items[*]}{.metadata.name}{'\n'}{end}" | \
      xargs -I{} kubectl exec {} -n ibm-spectrum-scale -c gpfs -- \
      bash -c 'kill $(pgrep -fx "/opt/IBM/zimon/sbin/pmsensors -E /opt/IBM/zimon -C /etc/scale-pmsensors-configuration/ZIMonSensors.cfg -R /var/run/perfmon")'
      
    • Review the current filter settings for the Network sensor in the Performance Data Collection rules. These are stored in the ibm-spectrum-scale-pmsensors-config configmap.

      kubectl describe cm ibm-spectrum-scale-pmsensors-config -n ibm-spectrum-scale | grep filter | grep netdev
      

      Example output:

      # kubectl describe cm ibm-spectrum-scale-pmsensors-config -n ibm-spectrum-scale | grep filter | grep netdev
      filter = "netdev_name=veth.*|docker.*|flannel.*|cali.*|cbr.*"
      

      The filter value lists the interface name patterns that are excluded from discovery.

    • Edit the ibm-spectrum-scale-pmsensors-config configmap with the following command:

      kubectl edit configmap ibm-spectrum-scale-pmsensors-config -n ibm-spectrum-scale
      

      Replace the substring netdev_name=veth.*|docker.*|flannel.*|cali.*|cbr.* with netdev_name=^((?!bond).)*

      The bond interface is used in this example. Replace bond with the adapter name that is used by your network interface.

    • Verify that the ibm-spectrum-scale-pmsensors-config configmap now reflects the wanted adapter.

      kubectl describe cm ibm-spectrum-scale-pmsensors-config -n ibm-spectrum-scale | grep filter | grep netdev
      
  3. Clean up the metadata keys in the pmcollector database that are not related to the configured node network interfaces. Open a remote shell into each pmcollector pod and issue the following commands.

     kubectl -n ibm-spectrum-scale exec -c pmcollector -it \
     $(kubectl get pods -n ibm-spectrum-scale -lapp.kubernetes.io/name=pmcollector -o jsonpath='{.items[0].metadata.name}') -- sh
    
     echo "delete key .*|Network|[a-f0-9]{15}|.*" | /opt/IBM/zimon/zc 0
    
     echo "topo -c -d 6" | /opt/IBM/zimon/zc 0| grep Network | cut -d'|' -f2-3 | sort | uniq -c | sort -n | tail -50
    

    Then, exit the container.

    Example:

     # kubectl -n ibm-spectrum-scale exec -c pmcollector -it \
     $(kubectl get pods -lapp.kubernetes.io/name=pmcollector -o jsonpath='{.items[0].metadata.name}') -- sh
     sh-4.4$ echo "delete key .*|Network|[a-f0-9]{15}|.*" | /opt/IBM/zimon/zc 0
     sh-4.4$ echo "topo -c -d 6" | /opt/IBM/zimon/zc 0| grep Network | cut -d'|' -f2-3 | sort | uniq -c | sort -n | tail -50
         96 Network|bond0
         96 Network|bond1
         96 Network|bond1.3201
         96 Network|lo
     sh-4.4$ exit
    
     # kubectl -n ibm-spectrum-scale exec -c pmcollector -it \
     $(kubectl get pods -lapp.kubernetes.io/name=pmcollector -o jsonpath='{.items[1].metadata.name}') -- sh
     sh-4.4$ echo "delete key .*|Network|[a-f0-9]{15}|.*" | /opt/IBM/zimon/zc 0
     sh-4.4$ echo "topo -c -d 6" | /opt/IBM/zimon/zc 0| grep Network | cut -d'|' -f2-3 | sort | uniq -c | sort -n | tail -50
         96 Network|bond0
         96 Network|bond1
         96 Network|bond1.3201
         96 Network|lo
     sh-4.4$ exit
    
  4. Start the sensor jobs on all core nodes

     kubectl get pods -lapp.kubernetes.io/name=core -n ibm-spectrum-scale \
     -ojsonpath="{range .items[*]}{.metadata.name}{'\n'}{end}" | \
     xargs -I{} kubectl exec {} -n ibm-spectrum-scale -c gpfs -- \
     /opt/IBM/zimon/sbin/pmsensors -E /opt/IBM/zimon -C /etc/scale-pmsensors-configuration/ZIMonSensors.cfg -R /var/run/perfmon
    
  5. Delete the pmcollector and Grafana bridge pods to pick up the configuration changes.

     kubectl delete pod -n ibm-spectrum-scale -lapp.kubernetes.io/instance=ibm-spectrum-scale,app.kubernetes.io/name=pmcollector
     kubectl delete pod -n ibm-spectrum-scale -lapp.kubernetes.io/instance=ibm-spectrum-scale,app.kubernetes.io/name=grafanabridge
    

    After some time, the pmcollector and Grafana bridge pods are redeployed by the ibm-spectrum-scale-operator.
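
    To confirm that the pods are redeployed and reach the Running state, a minimal check using the name labels from the delete commands above:

     kubectl get pods -n ibm-spectrum-scale -lapp.kubernetes.io/name=pmcollector
     kubectl get pods -n ibm-spectrum-scale -lapp.kubernetes.io/name=grafanabridge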

pmcollector pod is in pending state during OpenShift Container Platform upgrade or reboot

Events:
  Type     Reason            Age                    From               Message
  ----     ------            ----                   ----               -------
  Warning  FailedScheduling  65s (x202 over 4h43m)  default-scheduler  0/6 nodes are available: 1 node(s) were unschedulable, 2 node(s) had volume node affinity conflict, 3 node(s) had taint {node-role.kubernetes.io/master:}, that the pod didn't tolerate.

This issue is caused by a problem during an OpenShift Container Platform upgrade or by a worker node that has not been reset to schedulable after a reboot. The pmcollector pod remains in a Pending state until the pod and its respective Persistent Volume can be bound to a worker node.

# kubectl get nodes
NAME                  STATUS                     ROLES    AGE     VERSION
master0.example.com   Ready                      master   5d18h   v1.18.3+2fbd7c7
master1.example.com   Ready                      master   5d18h   v1.18.3+2fbd7c7
master2.example.com   Ready                      master   5d18h   v1.18.3+2fbd7c7
worker0.example.com   Ready                      worker   5d18h   v1.17.1+45f8ddb
worker1.example.com   Ready,SchedulingDisabled   worker   5d18h   v1.17.1+45f8ddb
worker2.example.com   Ready                      worker   5d18h   v1.17.1+45f8ddb

If the Persistent Volume has Node Affinity to the host that has SchedulingDisabled, the pmcollector pod remains in Pending state until the node associated with the PV becomes schedulable.

# kubectl describe pv worker1.example.com-pv
Name:              worker1.example.com-pv
Labels:            app=scale-pmcollector
Annotations:       pv.kubernetes.io/bound-by-controller: yes
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      ibm-spectrum-scale-internal
Status:            Bound
Claim:             example/datadir-ibm-spectrum-scale-pmcollector-1
Reclaim Policy:    Delete
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          25Gi
Node Affinity:
  Required Terms:
    Term 0:        kubernetes.io/hostname in [worker1.example.com]
Message:
Source:
    Type:  LocalVolume (a persistent volume backed by local storage on a node)
    Path:  /var/mmfs/pmcollector

If the issue was with OpenShift Container Platform upgrade, fixing the upgrade issue should resolve the pending pod.

If the issue is due to a worker node in the SchedulingDisabled state and not due to a failed OpenShift Container Platform upgrade, re-enable scheduling for the worker with the oc adm uncordon command.
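
For example, using the node from the sample output above:

oc adm uncordon worker1.example.com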

pmsensors data shows null after failure of the pmcollector node

If a node that is running a pmcollector pod is drained and then uncordoned, the pmcollector pods are assigned new IP addresses. This causes an issue with the pmsensors process, which displays the following message:

Connection to scale-pmcollector-0.scale-pmcollector successfully established.

But an error is reported:

Error on socket to scale-pmcollector-0.scale-pmcollector: No route to host (113)
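
Both messages are recorded in /var/log/zimon/ZIMonSensors.log. A minimal way to view that log from a core pod (a sketch; the pod name is a placeholder, and the gpfs container is the one used elsewhere in this document):

kubectl exec <core_pod_name> -n ibm-spectrum-scale -c gpfs -- tail -n 50 /var/log/zimon/ZIMonSensors.log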

The issue can also be seen on the pmcollector pod:

# echo "get metrics cpu_user bucket_size 5 last 10" | /opt/IBM/zimon/zc 0
1:      worker1
2:      worker2
Row Timestamp               cpu_user
1   2020-11-16 05:27:25     null
2   2020-11-16 05:27:30     null
3   2020-11-16 05:27:35     null
4   2020-11-16 05:27:40     null
5   2020-11-16 05:27:45     null
6   2020-11-16 05:27:50     null
7   2020-11-16 05:27:55     null
8   2020-11-16 05:28:00     null
9   2020-11-16 05:28:05     null
10  2020-11-16 05:28:10     null

If the IP addresses of the scale-pmcollector pods change, the pmsensors process must be stopped and restarted manually on all core pods to resume performance metrics collection.

To stop the pmsensors process, run these commands on all the ibm-spectrum-scale-core pods. The PMSENSORPID variable holds the result of the kubectl exec command. If this variable is empty, no pmsensors process is running and you do not need to run the kill command.

PMSENSORPID=$(kubectl exec <core_pod_name> -n ibm-spectrum-scale -- pgrep -fx '/opt/IBM/zimon/sbin/pmsensors -C /etc/scale-pmsensors-configuration/ZIMonSensors.cfg -R /var/run/perfmon')
echo $PMSENSORPID
kubectl exec <core_pod_name> -n ibm-spectrum-scale -- kill $PMSENSORPID

To start the service again, enter this command on all the core pods.

kubectl exec <core_pod_name> -n ibm-spectrum-scale -- /opt/IBM/zimon/sbin/pmsensors -C /etc/scale-pmsensors-configuration/ZIMonSensors.cfg -R /var/run/perfmon
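
As a convenience, a sketch that restarts the process across all core pods in one pass (assuming the core pod label selector and the gpfs container used elsewhere in this document):

# restart pmsensors in the gpfs container of every core pod
for pod in $(kubectl get pods -lapp.kubernetes.io/name=core -n ibm-spectrum-scale \
    -ojsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'); do
    kubectl exec $pod -n ibm-spectrum-scale -c gpfs -- \
        /opt/IBM/zimon/sbin/pmsensors -C /etc/scale-pmsensors-configuration/ZIMonSensors.cfg -R /var/run/perfmon
done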

pmcollector pods are not in Running state

The following reasons can cause the pmcollector pods to not be in the Running state:

pids_limit set higher than podPidsLimit, but not being honored

With Red Hat OpenShift Container Platform 4.11, certain CRI-O fields that were introduced before the equivalent support existed in kubelet have been deprecated. One of those deprecated fields is pids_limit, which was configured in the ContainerRuntimeConfig CR. For more information, see CRI-O should deprecate log size max and pids limit options.

If you applied a custom MCO configuration with a pids_limit value higher than 4096, the container limits are restricted by the default podPidsLimit value in kubelet.conf. This default is set to 4096 on OpenShift Container Platform 4.11 and later. To increase this value, complete the following steps:

It is highly recommended that you are at IBM Storage Scale container native v5.1.5 or higher before making changes to MachineConfig, as the IBM Storage Scale container native operator orchestrates the updates to MachineConfig in an attempt to keep the IBM Storage Scale cluster operational.

  1. Define the podPidsLimit in the KubeletConfig custom resource.

     apiVersion: machineconfiguration.openshift.io/v1
     kind: KubeletConfig
     metadata:
       name: 01-worker-ibm-spectrum-scale-increase-pid-limit
     spec:
       machineConfigPoolSelector:
         matchLabels:
           pools.operator.machineconfiguration.openshift.io/worker: ''
       kubeletConfig:
         podPidsLimit: 8192
    
  2. Delete the IBM Storage Scale container native ContainerRuntimeConfig resource to reset the container runtime default to 0, which is effectively unlimited:

     kubectl delete ContainerRuntimeConfig 01-worker-ibm-spectrum-scale-increase-pid-limit
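
Both changes are rolled out by the Machine Config Operator to the worker nodes. A minimal sketch for watching the rollout and spot-checking the resulting limit on a node (the node name is illustrative; the kubelet configuration path assumes the OpenShift default):

kubectl get mcp worker -w
oc debug node/worker0.example.com -- chroot /host grep podPidsLimit /etc/kubernetes/kubelet.conf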