Troubleshooting operator
- operator pod is not successfully deployed
- operator pod shows container restarts
- operator pod is terminated with OOMKilled
operator pod is not successfully deployed
No operator pod exists in the ibm-spectrum-scale-operator namespace.
-
Verify that all worker nodes in the Red Hat OpenShift Container Platform cluster are in a Ready state. Nodes that are NotReady might be preventing the operator pod from being scheduled.
kubectl get nodes
# kubectl get nodes
NAME                  STATUS     ROLES    AGE   VERSION
master0.example.com   Ready      master   65d   v1.18.3+6c42de8
master1.example.com   Ready      master   65d   v1.18.3+6c42de8
master2.example.com   Ready      master   65d   v1.18.3+6c42de8
worker0.example.com   NotReady   worker   65d   v1.18.3+6c42de8
worker1.example.com   NotReady   worker   65d   v1.18.3+6c42de8
worker2.example.com   NotReady   worker   65d   v1.18.3+6c42de8
-
Inspect the operator namespace and look for details that might point to problems.
-
Check the operator deployment:
kubectl get deployment -n ibm-spectrum-scale-operator
Describe the deployment for more details:
kubectl describe deployment ibm-spectrum-scale-controller-manager -n ibm-spectrum-scale-operator
-
Check the operator replicaset:
kubectl get replicasets -n ibm-spectrum-scale-operator
Describe the replicaset for more details:
kubectl describe replicaset $(kubectl get replicasets -n ibm-spectrum-scale-operator -ojson | \
jq -r '.items[0].metadata.name') -n ibm-spectrum-scale-operator
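The node check above can be narrowed to just the problem nodes. The following is a minimal sketch, not part of the product: `not_ready_nodes` is a hypothetical helper name, and it assumes the default `kubectl get nodes` column layout (NAME first, STATUS second). It reads the command output on standard input, so it can also be run against saved output.

```shell
# Print the names of nodes whose STATUS column is anything other than
# exactly "Ready" (for example NotReady or Ready,SchedulingDisabled).
# Hypothetical helper; assumes the default `kubectl get nodes` columns.
not_ready_nodes() {
  awk '$1 != "NAME" && NF >= 2 && $2 != "Ready" { print $1 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get nodes | not_ready_nodes
```

An empty result suggests that node readiness is not what is blocking scheduling, in which case the deployment and replicaset descriptions are the next place to look.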
operator pod shows container restarts
Kubernetes keeps the logs of both the current container and the previously terminated container.
Inspect the logs of the previous container for errors that might be causing the restarts.
kubectl logs -p $(kubectl get pods -n ibm-spectrum-scale-operator -ojson | \
jq -r '.items[0].metadata.name') -n ibm-spectrum-scale-operator
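Before pulling the previous logs, it can help to confirm which pods have actually restarted. This is a minimal sketch under stated assumptions: `restarted_pods` is a hypothetical helper name, and it assumes the default `kubectl get pods` column layout, in which RESTARTS is the fourth column. It reads the command output on standard input.

```shell
# Print "<pod name> <restart count>" for every pod with at least one restart.
# Hypothetical helper; assumes RESTARTS is the fourth column. Newer kubectl
# versions may append text such as "(5m ago)"; the count is still field four.
restarted_pods() {
  awk '$1 != "NAME" && ($4 + 0) > 0 { print $1, $4 + 0 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods -n ibm-spectrum-scale-operator | restarted_pods
```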
operator pod is terminated with OOMKilled
The default memory limit of the operator is too low for the cluster configuration and must be increased.
-
Edit the operator deployment:
oc edit deploy -n ibm-spectrum-scale-operator ibm-spectrum-scale-controller-manager
-
Increase the memory limit of the operator deployment. This example uses 1Gi; the appropriate value depends on the environment.
spec:
  containers:
  - args:
    resources:
      limits:
        cpu: 1500m
        memory: 1Gi
-
Save and quit (:wq)
-
Verify the deployment restarts the operator pod.
oc get pod -n ibm-spectrum-scale-operator
NAME                                                    READY   STATUS    RESTARTS   AGE
ibm-spectrum-scale-controller-manager-dfbb9c87d-xt9qd   1/1     Running   0          23s
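To check whether a container's last termination was an out-of-memory kill, the termination reason recorded in the pod status can be inspected. The sketch below is illustrative only: `was_oom_killed` is a hypothetical helper name, and it simply scans whatever reasons are passed on standard input for the word OOMKilled.

```shell
# Succeed (exit status 0) if any termination reason on stdin is OOMKilled.
# Hypothetical helper; input is a whitespace-separated list of reasons.
was_oom_killed() {
  grep -qw 'OOMKilled'
}

# Usage against a live cluster (requires kubectl access). The jsonpath
# expression prints the last termination reason of each container:
#   kubectl get pods -n ibm-spectrum-scale-operator \
#     -o jsonpath='{.items[*].status.containerStatuses[*].lastState.terminated.reason}' \
#     | was_oom_killed && echo "last termination was OOMKilled"
```

Because a raised limit replaces the pod, the new pod has no previous termination recorded, so this check is most useful before the edit or if restarts reappear later.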