Prometheus pod crashes - OOMKilled

Symptoms

The Prometheus pod crashes repeatedly. When you run the following command to get the pod details, the output shows that the Prometheus container terminated with the reason OOMKilled.

kubectl describe po prometheus-monitoring-prometheus-0 -n kube-system
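The describe output is long, so it can help to print only the termination reason. The following is a minimal sketch, assuming the pod name and namespace from the preceding command; the helper name last_termination_reason is hypothetical and only wraps a standard kubectl get call with a JSONPath expression.

```shell
#!/bin/sh
# Print the last termination reason of each container in a pod.
# Usage: last_termination_reason <pod> <namespace>
# The function name is illustrative; it is not part of kubectl.
last_termination_reason() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
}

# Example call (requires access to the cluster):
# last_termination_reason prometheus-monitoring-prometheus-0 kube-system
```

A container that was stopped for exceeding its memory limit is expected to report OOMKilled here. An empty result means that no earlier termination is recorded for the container.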

Causes

Prometheus is under a heavy workload. The volume of metrics data that it handles requires more memory than the limit that is currently set on the Prometheus container.

Resolving the problem

The following options can resolve this problem.

You can reduce the existing Prometheus workload. For example, lower the scrape frequency by increasing the value of scrape_interval in the Prometheus configuration.
You can increase the memory limits on the Prometheus container.
If your Prometheus container continues to crash and no error messages appear in the logs, too much data might remain in the /prometheus/prometheus-db/data/wal path inside your Prometheus container. To work around this problem, delete the /wal folder from the volume.

Note: Deleting the /wal folder can lead to data loss.
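Before you choose one of these options, it can be useful to check the current memory limit and the size of the /wal folder. The following is a rough sketch, not a prescribed procedure. The pod name and namespace come from the command in this topic. The container name prometheus and the helper names memory_limits and wal_size are assumptions that might not match your cluster.

```shell
#!/bin/sh
# Pod and namespace as used earlier in this topic.
POD=prometheus-monitoring-prometheus-0
NS=kube-system
# Assumption: the container is named "prometheus"; adjust if yours differs.
CONTAINER=prometheus

# Print the memory limit configured on each container in the pod,
# one "name<TAB>limit" pair per line.
memory_limits() {
  kubectl get pod "$POD" -n "$NS" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.limits.memory}{"\n"}{end}'
}

# Print the disk space used by the WAL folder inside the container.
# This only works while the container is running.
wal_size() {
  kubectl exec "$POD" -n "$NS" -c "$CONTAINER" -- \
    du -sh /prometheus/prometheus-db/data/wal
}
```

Because kubectl exec needs a running container, wal_size might fail while the pod is restarting. In that case, inspect the folder directly on the volume.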