Pods fail with CrashLoopBackOff
Pods remain in CrashLoopBackOff status and do not recover.
Symptoms
The pod fails with a message similar to the following:
unexpected error watching template /etc/nginx/template/nginx.tmpl: no space left on device
Causes
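No space is left on the device. The file system might be out of disk space or inodes, or the limit on inotify watches might have been reached.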
Resolving the problem
- Determine whether the problem is a file system issue.
  - Run the following command to determine whether disk space is full:
    df -h
  - Run the following command to determine whether inode space is full:
    df -i
  - Run the following command to see whether an unreleased file descriptor is marked as deleted:
    lsof
- Use the following command to check inotify usage:
  lsof | grep inotify | wc -l
  Use the following command to check the current value of the limit:
  sysctl fs.inotify.max_user_watches
  You might have reached the limit on the total number of inotify watches. You can increase the limit in fs.inotify.max_user_watches and restart the pods.
  # sysctl fs.inotify.max_user_watches=524288
  fs.inotify.max_user_watches = 524288
  # kubectl delete pod nginx-ingress-lb-amd64-6j9zm -n kube-system
  pod "nginx-ingress-lb-amd64-6j9zm" deleted
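
The checks in this topic can be gathered into one script. The following is a minimal sketch, not part of the product. The 90 percent threshold and the function names over_threshold and report_inotify are assumptions made for illustration, and the script assumes a Linux node where df accepts the -P and -i options.

```shell
#!/bin/sh
# Sketch of the checks in this topic, meant to be run on the affected node.
# THRESHOLD is an assumed value for illustration; this topic does not define one.
THRESHOLD="${THRESHOLD:-90}"

# Read `df -P` style output on stdin and print "<mount point> <use%>" for
# every file system at or above the given percentage. Rows whose fifth
# column is not a percentage (for example "-") are ignored.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)
            if (pct ~ /^[0-9]+$/ && pct + 0 >= limit + 0) print $6, $5
        }'
}

# Print the number of inotify entries that lsof reports next to the
# configured watch limit. Skips the check when a tool is missing.
report_inotify() {
    if ! command -v lsof >/dev/null 2>&1 || ! command -v sysctl >/dev/null 2>&1; then
        echo "lsof or sysctl not found; skipping inotify check"
        return 0
    fi
    in_use=$(lsof 2>/dev/null | grep -c inotify)
    limit=$(sysctl -n fs.inotify.max_user_watches 2>/dev/null)
    echo "inotify entries in lsof: ${in_use}; fs.inotify.max_user_watches: ${limit:-unknown}"
}

echo "File systems at or above ${THRESHOLD}% disk usage:"
df -P 2>/dev/null | over_threshold "$THRESHOLD"

echo "File systems at or above ${THRESHOLD}% inode usage:"
df -P -i 2>/dev/null | over_threshold "$THRESHOLD"

report_inotify
```

Run the script on the node that hosts the failing pod. An empty list under a heading means that no file system reached the threshold. Mount points that contain spaces are not handled by this sketch.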