IBM Support

Enable Cgroup to get the real job swap usage.

Question & Answer


Question

If you run a job that uses shared memory on a host whose Linux kernel is older than 2.6.24, the reported job swap value may be unreasonably large.

Cause

For LSF 9.1 and later, job processes can be tracked through Linux Cgroup, which is supported on x86_64 and PowerPC Linux hosts with kernel version 2.6.24 or later. LSF calculates the job swap in one of the following ways.

1. If the LSF_LINUX_CGROUP_ACCT=n parameter is set in the lsf.conf file, LSF uses PIM to collect the memory and swap usage of all processes in a job.

1) If the EGO_PIM_SWAP_REPORT=n parameter is set in the lsf.conf file (the default), swap usage is the total virtual memory (VSZ) of all job processes.

2) If the EGO_PIM_SWAP_REPORT=y parameter is set in the lsf.conf file, the resident set size (RSS), which is the portion of a process's memory that is held in main memory, is subtracted from the virtual memory usage. Swap usage is then the sum of (VSZ - RSS) over all job processes.

2. If the LSF_LINUX_CGROUP_ACCT=y parameter is set in the lsf.conf file, LSF uses the cgroup memory subsystem to collect the memory and swap usage of all processes in a job. The job swap is the total swap usage of all processes in a job.
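The three calculations above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not LSF's actual implementation: the `ProcUsage` class, the `job_swap_kb` function, and the `cgroup_swap_kb` argument are hypothetical names, and the per-process figures are made-up sample values.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ProcUsage:
    """Per-process memory figures in KB (hypothetical sample structure)."""
    vsz_kb: int  # total virtual memory size of the process
    rss_kb: int  # resident set size of the process


def job_swap_kb(procs: Iterable[ProcUsage],
                cgroup_acct: bool = False,
                pim_swap_report: bool = False,
                cgroup_swap_kb: Optional[int] = None) -> int:
    """Return the job swap figure for the mode selected by the two flags."""
    if cgroup_acct:
        # LSF_LINUX_CGROUP_ACCT=y: take the swap usage reported for the
        # job's cgroup (passed in here as an assumed, precomputed value).
        if cgroup_swap_kb is None:
            raise ValueError("cgroup accounting needs a cgroup swap figure")
        return cgroup_swap_kb
    procs = list(procs)
    if pim_swap_report:
        # EGO_PIM_SWAP_REPORT=y: sum of (VSZ - RSS) over all job processes.
        return sum(p.vsz_kb - p.rss_kb for p in procs)
    # EGO_PIM_SWAP_REPORT=n (default): sum of VSZ over all job processes.
    return sum(p.vsz_kb for p in procs)


# Made-up sample job with two processes.
sample = [ProcUsage(1_000_000, 200_000), ProcUsage(1_000_000, 300_000)]
default_swap = job_swap_kb(sample)                         # 2_000_000
rss_adjusted = job_swap_kb(sample, pim_swap_report=True)   # 1_500_000
cgroup_swap = job_swap_kb(sample, cgroup_acct=True, cgroup_swap_kb=0)  # 0
```

Note that in both PIM modes the result is derived from virtual memory sizes, so it can be far larger than the amount of memory actually swapped out.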



When a job runs on a host with a Linux kernel older than 2.6.24 (where Linux Cgroup is not supported), the job swap is calculated by default as the total virtual memory (VSZ) of all job processes. For jobs that use shared memory, LSF cannot determine how much of each process's VSZ is shared, so the shared portion is counted once for every process that maps it. This is why the job swap value may be unreasonably large.
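The effect of counting shared memory once per process can be shown with a small worked example. The sketch below assumes a simplified job in which every process has the same private virtual memory and maps the same shared segment; the function names and the sizes are hypothetical and chosen only to make the arithmetic visible.

```python
def vsz_sum_kb(n_procs: int, private_kb: int, shared_kb: int) -> int:
    """Sum of per-process VSZ when every process maps the same shared segment."""
    # Each process's VSZ includes the whole shared segment,
    # so the segment is counted n_procs times in the total.
    return n_procs * (private_kb + shared_kb)


def distinct_virtual_kb(n_procs: int, private_kb: int, shared_kb: int) -> int:
    """Virtual memory of the job with the shared segment counted only once."""
    return n_procs * private_kb + shared_kb


# Assumed job: 8 processes, 100 MB private each, one 4 GB shared segment.
n, private, shared = 8, 100 * 1024, 4 * 1024 * 1024

reported = vsz_sum_kb(n, private, shared)           # 34_373_632 KB
distinct = distinct_virtual_kb(n, private, shared)  # 5_013_504 KB
overcount = reported - distinct                     # 29_360_128 KB
```

In this example the overcount equals (n - 1) times the shared segment size, which is why the reported value grows with the number of processes in the job. Even the de-duplicated figure is a virtual memory size, not a measurement of swap actually in use.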

In short, the job swap value is unreasonably large because Linux Cgroup is not enabled on the host where the job runs.

Answer

When a job that uses shared memory runs on a host with a Linux kernel older than 2.6.24 (where Linux Cgroup is not supported), the reported job swap value is not the real swap usage of the job and may be unreasonably large. To get the real swap usage of the job, upgrade the Linux kernel to 2.6.24 or later and enable Linux Cgroup accounting by setting the LSF_LINUX_CGROUP_ACCT=y parameter in the lsf.conf file.
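Before enabling Linux Cgroup, it can help to confirm that a host's kernel meets the 2.6.24 minimum. The following is a minimal sketch, not an LSF tool: the function names are hypothetical, and it only compares the kernel release string (as returned by `platform.release()` or `uname -r`) against the minimum version. It does not check whether the cgroup memory subsystem is actually mounted or enabled.

```python
import platform
import re

# Minimum kernel version for Linux Cgroup support, as stated above.
MIN_CGROUP_KERNEL = (2, 6, 24)


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor, patch) from a release string such as '2.6.32-754.el6'."""
    m = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # Missing components (for example '5.4') are treated as 0.
    return tuple(int(g) if g else 0 for g in m.groups())


def kernel_supports_cgroup(release: str) -> bool:
    """Return True if the release is at or above the minimum kernel version."""
    return parse_kernel_release(release) >= MIN_CGROUP_KERNEL


if __name__ == "__main__":
    release = platform.release()
    print(release, kernel_supports_cgroup(release))
```

Run on an execution host, the script prints the kernel release followed by True or False.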

Product: Platform LSF
Platform: Linux
Version: 9.1.0, 9.1.1, 9.1.2, 9.1.3

Document Information

Modified date:
17 June 2018

UID

isg3T1024098