
Enforcing Platform LSF job memory and swap with Linux cgroups

Linux control groups (cgroups) limit, account for, and isolate the resource usage (CPU, memory, swap space, disk I/O, and so on) of groups of processes. They aggregate and partition sets of tasks, together with all their future children, into hierarchical groups with specialized behavior. The Linux kernel has supported cgroups since version 2.6.24.

All LSF job processes are controlled by the Linux cgroup system. If the job processes on a host use more memory than the defined limit, the Linux cgroup memory subsystem kills the job immediately.

Linux kernel 2.6.34 introduced a generic, eventfd-based API for notification of cgroup status changes. With eventfd, LSF is notified when job processes use more memory than the limit. LSF then kills all processes of the job and records a specific termination reason, which is written to the LSF job accounting file and displayed by bjobs -l.

The following steps enable cgroup memory and swap enforcement in LSF.

Step 1: Mount the cgroup system

  1. Check whether cgroups are supported by your Linux system.

The Linux kernel has supported cgroups since v2.6.24 and eventfd-based notification of cgroup status changes since v2.6.34.

  2. Check whether the cgroup system is already mounted:

$ grep cgroup /proc/mounts
cgroup /cgroup/cpuset cgroup rw,relatime,cpuset 0 0
cgroup /cgroup/memory cgroup rw,relatime,memory 0 0
cgroup /cgroup/freezer cgroup rw,relatime,freezer 0 0
cgroup /cgroup/cpuacct cgroup rw,relatime,cpuacct 0 0

In this output, four cgroup subsystems are mounted: cpuset, memory, freezer, and cpuacct.

  3. Mount the cgroup subsystems.

Edit the following lines in /etc/fstab:

cgroup /cgroup/freezer cgroup freezer 0 0
cgroup /cgroup/cpuacct cgroup cpuacct 0 0
cgroup /cgroup/memory cgroup memory 0 0
cgroup /cgroup/cpuset cgroup cpuset 0 0

  4. Run the following command as root:

# mount -a -t cgroup
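The checks in Step 1 can be scripted. The following is a minimal sketch, not an LSF tool: the function names version_at_least and cgroup_subsystem_mounted are hypothetical helpers, and the sketch assumes a cgroup (v1) mount table in the format shown above.

```shell
# Hypothetical helpers for the Step 1 checks; they are not part of LSF.

# Succeed if kernel version $1 (for example "2.6.32-431.el6.x86_64")
# is at least version $2 (for example "2.6.24").
version_at_least() {
    # "${1%%-*}" drops the distribution suffix after the first "-".
    awk -v have="${1%%-*}" -v need="$2" 'BEGIN {
        split(have, h, "."); split(need, n, ".")
        for (i = 1; i <= 3; i++) {
            if (h[i] + 0 > n[i] + 0) exit 0
            if (h[i] + 0 < n[i] + 0) exit 1
        }
        exit 0
    }'
}

# Succeed if subsystem $1 appears in the options of a cgroup mount
# listed in mounts file $2 (defaults to /proc/mounts).
cgroup_subsystem_mounted() {
    awk -v want="$1" '
        $3 == "cgroup" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == want) found = 1
        }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Example use on the local host:
#   version_at_least "$(uname -r)" 2.6.24 && echo "cgroups supported"
#   version_at_least "$(uname -r)" 2.6.34 && echo "eventfd notification supported"
#   cgroup_subsystem_mounted memory || echo "memory subsystem not mounted"
```

The version comparison looks only at the first three numeric fields of the kernel string, which is enough for the two thresholds named in this document.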

Step 2: Enable LSF memory and swap enforcement

  1. Add the following parameter to $LSF_ENVDIR/lsf.conf:

LSB_RESOURCE_ENFORCE="memory"

  2. Run the lsfrestart command as the LSF administrator to restart LSF on all hosts.
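Before restarting, it can help to confirm that the parameter is present in the file. The following is a minimal sketch with hypothetical function names (resource_enforce_value and memory_enforcement_enabled). It assumes the value is a space-separated list in double quotes and that, if the parameter appears more than once, the last uncommented assignment is the one of interest.

```shell
# Hypothetical helpers; they only read the file and do not change LSF.

# Print the value of LSB_RESOURCE_ENFORCE from the lsf.conf-style file $1,
# with the surrounding double quotes removed. Commented lines are ignored.
resource_enforce_value() {
    sed -n 's/^[[:space:]]*LSB_RESOURCE_ENFORCE=//p' "$1" | tail -n 1 | tr -d '"'
}

# Succeed if "memory" is one of the listed resources.
memory_enforcement_enabled() {
    case " $(resource_enforce_value "$1") " in
        *" memory "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use:
#   memory_enforcement_enabled "$LSF_ENVDIR/lsf.conf" && echo "memory enforcement configured"
```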

Step 3: Verify LSF memory and swap enforcement

Submit a job with memory or swap enforcement requirements.

Use bsub -M to specify the memory limit for the job, and bsub -v to specify the swap limit. You can also configure memory and swap limits in a queue (lsb.queues) and in an application profile (lsb.applications) with the MEMLIMIT and SWAPLIMIT parameters. Memory and swap limits are enforced per slot/task.

Examples

In the following example, my_program requests 2 slots with a 100 MB memory limit per slot. Because the job runs on a single host, LSF sets up a cgroup memory subsystem with a 200 MB limit. When the job uses more than 200 MB, it is terminated.

$ bsub -n 2 -M 100 -R "span[hosts=1]" my_program
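The arithmetic behind the 200 MB figure can be sketched as follows. The function name host_memory_limit_mb is a hypothetical helper used only for illustration, and the sketch assumes all of the job's slots land on one host, as the span[hosts=1] requirement ensures.

```shell
# Hypothetical helper: per-host limit = per-slot limit * slots on that host.
host_memory_limit_mb() {
    per_slot_mb=$1
    slots_on_host=$2
    echo $(( per_slot_mb * slots_on_host ))
}

# The example above: a 100 MB limit per slot, 2 slots on a single host.
host_memory_limit_mb 100 2
```

For the values in this example, the helper prints 200.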

In the following example, my_program requests a 100 MB memory limit and a 50 MB swap limit. After my_program uses more than 100 MB of memory, the cgroup starts to use swap for the job processes. The job is not killed until its combined usage reaches 150 MB (100 MB memory + 50 MB swap).

$ bsub -M 100 -v 50 my_program
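In the cgroup (v1) memory subsystem, memory.limit_in_bytes caps memory alone and memory.memsw.limit_in_bytes caps memory plus swap together. The following sketch shows how the two limits in this example would translate to those files. It is an assumption that LSF derives the values this way, and cgroup_v1_limits is a hypothetical helper name.

```shell
# Hypothetical helper: convert a memory limit $1 and a swap limit $2 (both in MB)
# to the byte values of the two cgroup v1 memory limit files.
cgroup_v1_limits() {
    mem_mb=$1
    swap_mb=$2
    mb=$(( 1024 * 1024 ))
    echo "memory.limit_in_bytes=$(( mem_mb * mb ))"
    # The memsw limit counts memory and swap together, so the two are summed.
    echo "memory.memsw.limit_in_bytes=$(( (mem_mb + swap_mb) * mb ))"
}

# The example above: 100 MB memory and 50 MB swap.
cgroup_v1_limits 100 50
```

For these values the helper prints 104857600 bytes (100 MB) for the memory limit and 157286400 bytes (150 MB) for the combined limit.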

Step 4: Check job status

If the cgroup system supports eventfd notification, bjobs -l shows the termination reason when a job is killed because it exceeds a memory limit. For example:

$ bjobs -l 950
Job <950>, User <user1>, Project <default>, Status <EXIT>, Queue <normal>, Comm
                     and <./eatmem 200>
Thu Oct 17 02:25:49: Submitted from host <hostA>, CWD </home/user1>;
 MEMLIMIT
    100 M
Thu Oct 17 02:25:49: Started on <hostA>, Execution Home </home/user1>,
                     Execution CWD </home/user1>;
Thu Oct 17 02:26:01: Exited with exit code 137. The CPU time used is 0.7 second
                     s.
Thu Oct 17 02:26:01: Completed <exit>; TERM_MEMLIMIT: job killed after reaching
                      LSF memory usage limit.
 MEMORY USAGE:
 MAX MEM: 100 Mbytes
 SCHEDULING PARAMETERS:
           r15s   r1m  r15m   ut      pg    io   ls    it    tmp    swp    mem
 loadSched   -     -     -     -       -     -    -     -     -      -      - 
 loadStop    -     -     -     -       -     -    -     -     -      -      - 
 RESOURCE REQUIREMENT DETAILS:
 Combined: select[type == any] order[r15s:pg]
 Effective: select[type == any] order[r15s:pg]
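A script can test for this termination reason by searching the bjobs -l text. The following is a minimal sketch with hypothetical helper names (killed_by_memlimit and signal_from_exit_code). It assumes the TERM_MEMLIMIT keyword is not split across wrapped output lines, which holds for the output shown above.

```shell
# Hypothetical helpers for reading "bjobs -l" output; they are not part of LSF.

# Succeed if the "bjobs -l" text on stdin reports the TERM_MEMLIMIT reason.
killed_by_memlimit() {
    grep -q 'TERM_MEMLIMIT'
}

# Print the signal number implied by an exit code above 128
# (137 - 128 = 9, which is SIGKILL on Linux); print nothing otherwise.
signal_from_exit_code() {
    if [ "$1" -gt 128 ]; then
        echo $(( $1 - 128 ))
    fi
}

# Example use:
#   bjobs -l 950 | killed_by_memlimit && echo "job 950 exceeded its memory limit"
#   signal_from_exit_code 137
```

The exit code 137 in the output above corresponds to signal 9 (SIGKILL), which is consistent with the job processes being killed rather than exiting on their own.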