
kubectl top pods and docker stats show different memory statistics

Troubleshooting


Problem

The output from kubectl top pod <POD> and docker stats <ContainerID> returns mismatched memory statistics.
For example:
[root@icp1 ~]# kubectl top pod icp-mongodb-2 -n kube-system
NAME            CPU(cores)   MEMORY(bytes)
icp-mongodb-2   28m          1510Mi

[root@icp1 ~]# docker stats --no-stream 15d29f7aa89c
CONTAINER ID       NAME                                                                               CPU %     MEM USAGE / LIMIT     MEM %     NET I/O     BLOCK I/O         PIDS
15d29f7aa89c       k8s_icp-mongodb_icp-mongodb-2_kube-system_29db5101-0c29-11ea-9808-000c2943687d_0   1.94%     1.214GiB / 23.39GiB   5.19%     0B / 0B     68.4MB / 64.9GB   398

The memory usage reported by kubectl top is about 1.5 GiB, while docker stats reports about 1.2 GiB.
The difference is even more evident for other containers.
It is not clear whether one of the two tools is wrong or whether they are collecting different types of data.

Resolving The Problem

The two tools collect data from different sources, and they also refer to different metrics.

kubectl top pod uses the memory working set: you can compare the output of kubectl top with the value of the metric "container_memory_working_set_bytes" in Prometheus.

If you run this query in Prometheus:

container_memory_working_set_bytes{pod_name=~"<pod-name>", container_name=~"<container-name>", container_name!="POD"}

you will get a value in bytes that almost matches the output of kubectl top pods.

This value is collected by cAdvisor.
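If you want to check the raw cAdvisor value yourself, the kubelet exposes cAdvisor metrics, and they can be read through the API server proxy. This is only a sketch: it assumes your account is allowed to proxy to the kubelet, and <node-name> is a placeholder for the node that runs the pod:

kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics/cadvisor" | grep 'container_memory_working_set_bytes.*icp-mongodb-2'

Each matching line reports the working set in bytes for one container of the pod, which is the same value scraped by Prometheus.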

docker stats instead collects its metrics directly from the operating system, specifically from the special files under /sys/fs/cgroup/memory.

docker stats reports as memory usage the result of usage_in_bytes - cache.

Computing usage_in_bytes - cache yourself may still not match perfectly what docker stats displays, because the Docker CLI also subtracts shared memory from the value before showing it, but this is essentially how it works.

Let's work through an example by looking at the memory consumption of the logging-elk-data-0 pod:

[root@icpmstr1 ~]# kubectl top pods logging-elk-data-0
NAME                 CPU(cores)   MEMORY(bytes)
logging-elk-data-0   211m         2389Mi

kubectl top pods shows 2389Mi

If I run the following query on Prometheus:

container_memory_working_set_bytes{pod_name=~"logging-elk-data-0", container_name=~"es-data", container_name!="POD"}

I will get the same value shown in the kubectl top pods output.
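To compare this with docker stats, we first need the container ID of the es-data container. One way to get it (a sketch, assuming the pod runs in the kube-system namespace, as shown in the docker stats output below) is to read it from the pod status:

kubectl get pod logging-elk-data-0 -n kube-system -o jsonpath='{.status.containerStatuses[?(@.name=="es-data")].containerID}'

This returns the ID prefixed with docker://; the short form of that ID is what docker stats expects on the node where the container runs.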

Now let's look at docker stats instead: having identified the container ID, we can run the command below on the node where the container runs:

 [root@icpmgmt1 ~]# docker stats 5e1ddf0694f2 --no-stream

CONTAINER ID       NAME                                                                               CPU %              MEM USAGE / LIMIT    MEM %              NET I/O            BLOCK I/O          PIDS

5e1ddf0694f2       k8s_es-data_logging-elk-data-0_kube-system_2aee173d-20ae-11ea-a976-000c29d92eeb_0  44.53%             1.859GiB / 2.861GiB  64.98%             0B / 0B            64.8GB / 1.73TB    56
docker stats instead shows 1.859GiB, which differs significantly from the kubectl top pods output.
Let's have a look at the memory metrics it uses to calculate this value:

curl --unix-socket /var/run/docker.sock "http://localhost/v1.24/containers/5e1ddf0694f2/stats"

 "usage":3019055104,

"active_anon":1357660160,

"active_file":529457152,

"cache":1022935040

"inactive_anon":638459904,

"inactive_file":493477888,

"mapped_file":  13770752,

"rss":   1996120064,

"rss_huge": 8388608
We know that "memory used" is calculated by doing:
"usage" - "cache"
So in this case it is:

3019055104- 1022935040 = 1996120064 ==> 1,903 Gib
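The same subtraction can be reproduced in a single command against the Docker API, for example with jq (a sketch, assuming jq is installed on the node):

curl -s --unix-socket /var/run/docker.sock "http://localhost/v1.24/containers/5e1ddf0694f2/stats?stream=false" | jq '.memory_stats.usage - .memory_stats.stats.cache'

The result is the usage-minus-cache value in bytes; it will differ slightly from the numbers above because the statistics are sampled live.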

As anticipated above, the calculated value may still not perfectly match the value shown by docker stats, because the Docker CLI also subtracts shared memory before displaying it; in this example the two values happen to be very close.

I could also have taken the above metrics from this file:

cat /sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod2aee173d_20ae_11ea_a976_000c29d92eeb.slice/docker-5e1ddf0694f27aed958c0ed917e364a2da13470db6c10c9a37df90b8d715fb3a.scope/memory.stat

This file does not include the "usage" metric, which is exposed in a separate file (memory.usage_in_bytes) within the same folder.
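The same calculation can also be reproduced directly from the cgroup files, for example with this sketch (the cgroup path is the one shown above and is specific to this node and container):

CGROUP=/sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod2aee173d_20ae_11ea_a976_000c29d92eeb.slice/docker-5e1ddf0694f27aed958c0ed917e364a2da13470db6c10c9a37df90b8d715fb3a.scope
echo $(( $(cat $CGROUP/memory.usage_in_bytes) - $(awk '/^cache /{print $2}' $CGROUP/memory.stat) ))

This prints the usage-minus-cache value in bytes, the same figure docker stats starts from.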

So, as you can see, we have found where the two tools take their metrics from.

They do show different values, but that happens because they refer to different metrics, collected in different ways.

Now, what is the best one to consider for your monitoring purposes?

I would suggest using the value from kubectl top, because it is the one shown in Prometheus charts and because container_memory_working_set_bytes is what the OOM killer watches to decide whether a container must be killed.
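If you build Prometheus charts or alerts on this metric, one useful query pattern (a sketch; container_spec_memory_limit_bytes is another cAdvisor metric, and containers without a memory limit should be excluded because their reported limit is not meaningful) is the ratio of the working set to the configured memory limit, which shows how close each container is to being OOM killed:

container_memory_working_set_bytes{container_name!="POD"} / container_spec_memory_limit_bytes{container_name!="POD"} > 0.8

Containers returned by this query are using more than 80% of their memory limit.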

Document Location

Worldwide

[{"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Product":{"code":"SSBS6K","label":"IBM Cloud Private"},"Component":"","Platform":[{"code":"PF016","label":"Linux"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"LOB45","label":"Automation"}}]

Product Synonym

ICP; IBM Cloud Private

Document Information

Modified date:
05 March 2020

UID

ibm13373017