The vmware-infra-* pods in CrashLoopBackOff error states
You can encounter an error where several vmware-infra-* pods crash and restart constantly. These pods can appear to be failing with an authentication error.
When this error occurs, the output from checking the pod status can resemble the following example:
NAMESPACE                              NAME                                               READY   STATUS             RESTARTS   AGE
management-infrastructure-management   1-vmware-infra-event-catcher-30-65d7bfd967-tcsrg   0/1     CrashLoopBackOff   399        44h
management-infrastructure-management   1-vmware-infra-operations-30-6dc9f87b75-mqfpz      0/1     CrashLoopBackOff   402        44h
management-infrastructure-management   1-vmware-infra-refresh-30-84fcd6cbd4-fc68w         0/1     Error              412        44h
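If the pod list is long, filtering it for affected pods can help. The following is a minimal Python sketch, not part of the product, that scans pod status text in the column layout shown above and returns the vmware-infra-* pods whose status is CrashLoopBackOff or Error. The function name failing_vmware_pods and the SAMPLE constant are hypothetical names used only for illustration, and the sketch assumes whitespace-separated columns in the order NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE.

```python
# Illustrative sketch only: the helper name and sample text are assumptions,
# not part of the product. Columns are assumed to be whitespace-separated.
SAMPLE = """\
NAMESPACE NAME READY STATUS RESTARTS AGE
management-infrastructure-management 1-vmware-infra-event-catcher-30-65d7bfd967-tcsrg 0/1 CrashLoopBackOff 399 44h
management-infrastructure-management 1-vmware-infra-operations-30-6dc9f87b75-mqfpz 0/1 CrashLoopBackOff 402 44h
management-infrastructure-management 1-vmware-infra-refresh-30-84fcd6cbd4-fc68w 0/1 Error 412 44h
"""

FAILING_STATUSES = ("CrashLoopBackOff", "Error")


def failing_vmware_pods(listing):
    """Return (namespace, name, status, restarts) for failing vmware-infra pods."""
    failing = []
    for line in listing.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or malformed rows
        namespace, name, _ready, status, restarts = fields[:5]
        if "vmware-infra-" in name and status in FAILING_STATUSES:
            failing.append((namespace, name, status, int(restarts)))
    return failing


if __name__ == "__main__":
    for namespace, name, status, restarts in failing_vmware_pods(SAMPLE):
        print(f"{namespace}/{name}: {status} after {restarts} restarts")
```

In practice, the listing text would come from whatever command you use to check pod status rather than from a hard-coded sample.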
When you check the logs, these pods appear to be failing with an authentication error:
{"@timestamp":"2020-08-02T21:13:00.050303 ","hostname":"1-vmware-infra-refresh-30-84fcd6cbd4-fc68w","pid":7,"tid":"2ae9ee5f997c","level":"err","message":"MIQ(ManageIQ::Providers::Vmware::InfraManager::RefreshWorker::Runner) ID [2241] PID [7] GUID [403658a2-96a2-4823-b996-94cd6472dffd] EMS id [30] failed authentication check. Worker exiting."}
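To decide which provider to fix or delete, it helps to know which EMS id the failing worker reports. The following Python sketch, offered as an illustration under stated assumptions rather than a supported tool, parses a JSON log line of the form shown above and extracts the EMS id when the message contains "failed authentication check". The function name failed_auth_ems_id is hypothetical, and the sketch assumes that every log line is a single JSON object with a "message" field, as in the sample.

```python
import json
import re

# Sample log line copied from the example above.
LOG_LINE = '{"@timestamp":"2020-08-02T21:13:00.050303 ","hostname":"1-vmware-infra-refresh-30-84fcd6cbd4-fc68w","pid":7,"tid":"2ae9ee5f997c","level":"err","message":"MIQ(ManageIQ::Providers::Vmware::InfraManager::RefreshWorker::Runner) ID [2241] PID [7] GUID [403658a2-96a2-4823-b996-94cd6472dffd] EMS id [30] failed authentication check. Worker exiting."}'

# Pattern based on the wording of the sample message; other messages may differ.
AUTH_FAILURE = re.compile(r"EMS id \[(\d+)\] failed authentication check")


def failed_auth_ems_id(line):
    """Return the EMS id from an authentication-failure log line, else None."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return None  # not a JSON log line
    if not isinstance(entry, dict):
        return None
    match = AUTH_FAILURE.search(str(entry.get("message", "")))
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(failed_auth_ems_id(LOG_LINE))
```

For the sample line, the function returns the EMS id reported in the message, which identifies the provider whose connection is failing.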
This error can occur when VMware cannot be reached, whether because of an authentication issue, a network issue, or some other reason. When this connection issue occurs, the pods enter the CrashLoopBackOff or Error state.
Solution: To resolve this error, take one of the following actions:
- Fix the credentials or network issue so that the pods can recover.
- Delete the provider to stop the CrashLoopBackOff.