Resource management

The following new features affect resource management and allocation.

Exclude swap threshold when enforcing job memory limits

When you use the LSB_MEMLIMIT_ENF_CONTROL parameter in the lsf.conf file to specify how LSF enforces job memory limits, you can now exclude the swap threshold and specify only the memory threshold.

To exclude the swap threshold, specify a value of 0 for the swap threshold in the LSB_MEMLIMIT_ENF_CONTROL parameter in the lsf.conf file:

LSB_MEMLIMIT_ENF_CONTROL=<Memory Threshold>:0:<Check Interval>:[all]
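
As an illustration, the following lsf.conf line enforces only the memory threshold. The numeric values are placeholders chosen for this sketch, not recommended settings; check the parameter reference for the valid ranges in your LSF version.

```shell
# lsf.conf (illustrative values, not recommendations)
# Fields: <Memory Threshold>:<Swap Threshold>:<Check Interval>:[all]
# The second field is 0, so the swap threshold is excluded.
LSB_MEMLIMIT_ENF_CONTROL=90:0:60:all
```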

Set hard memory limits instead of per-process (soft) memory limits

When you set a memory limit for all the processes that belong to a job with the bsub -M command option, LSF sets a per-process soft memory limit by default. This means that when a job exceeds the memory limit, LSF passes the memory limit to the operating system. UNIX operating systems that support RLIMIT_RSS for the setrlimit() function can apply the memory limit to each process.

You can now disable this feature and force LSF to kill a job as soon as it exceeds the memory limit by adding an exclamation point (!) to the end of the memory limit that you specify with the bsub -M command option:

bsub -M memlimit[!]

If you specify the exclamation point, the memory limit is a hard limit, and LSF kills the job as soon as it exceeds this hard memory limit and does not wait for the host memory and swap threshold to be reached.
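
The difference between the two forms can be sketched as follows. The limit value and the job script name are hypothetical, and the units of the limit follow your cluster's configuration. The commands are printed rather than submitted so that the sketch runs on a machine without LSF.

```shell
# Hypothetical limit value and job script, used only for illustration.
memlimit=4096
job=./my_job.sh

# Default behavior: per-process (soft) memory limit.
soft_cmd="bsub -M ${memlimit} ${job}"

# A trailing "!" turns the value into a hard limit.
hard_limit="${memlimit}"'!'

# Single quotes around the value keep shells with history expansion
# (for example, interactive bash or csh) from interpreting the "!".
hard_cmd="bsub -M '${hard_limit}' ${job}"

# Print the commands instead of submitting them.
echo "$soft_cmd"
echo "$hard_cmd"
```

To submit for real, run the printed command on a host where bsub is available.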

Specify resource reservation method in the rusage string

You can now specify the resource reservation method at the job, application, or queue level by specifying the method in the resource usage (rusage) string of the resource requirements in the bsub -R option, or in the RES_REQ parameter in the lsb.applications or lsb.queues file. You can also specify the GPU resource reservation method in the bsub -gpu option, in the GPU_REQ parameter in the lsb.applications or lsb.queues file, or in the LSB_GPU_REQ parameter in the lsf.conf file. Previously, you could specify the resource reservation method only at the cluster level, by using the METHOD parameter in the ReservationUsage section of the lsb.resources file.

The resource reservation value and method at the job level override the value and method at the application level, which override those at the queue level, which override those at the cluster level.

Specify the resource reservation method by using the /task, /job, or /host keyword after the numeric value in the rusage string, or by using the /task or /host keyword in the GPU requirements string. You can specify resource reservation methods only for consumable resources. Specify the resource reservation methods as follows:

  • value/task

    Specifies per-task reservation of the specified resource. This is the equivalent of specifying PER_TASK for the METHOD parameter in the ReservationUsage section of the lsb.resources file.

  • value/job

    Specifies per-job reservation of the specified resource. This is the equivalent of specifying PER_JOB for the METHOD parameter in the ReservationUsage section of the lsb.resources file. You cannot specify per-job reservation in the GPU requirements string.

  • value/host

    Specifies per-host reservation of the specified resource. This is the equivalent of specifying PER_HOST for the METHOD parameter in the ReservationUsage section of the lsb.resources file.
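
To make the three methods concrete, the following sketch compares the total amount reserved for one hypothetical job. It assumes the straightforward reading of the keywords (the value counts once per task, once per host, or once per job); the scenario and numbers are invented for illustration, and LSF's actual accounting may differ in detail.

```shell
# Illustrative scenario (assumed, not from the product documentation):
# a job with 4 tasks spread over 2 hosts that requests rusage[mem=10/<method>].
value=10
tasks=4
hosts=2

# /task: the value is reserved once for every task.
per_task_total=$(( value * tasks ))

# /host: the value is reserved once on every host that the job uses.
per_host_total=$(( value * hosts ))

# /job: the value is reserved once for the whole job.
per_job_total=$value

echo "task=${per_task_total} host=${per_host_total} job=${per_job_total}"
```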

For example:
  • RES_REQ="rusage[mem=10/job:duration=10:decay=0]"
  • RES_REQ="rusage[mem=(50 20)/task:duration=(10 5):decay=0]"
  • GPU_REQ="num=2/task:mode=shared:j_exclusive=yes"
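
The same requirement strings can be given at the job level, where they take precedence over the application and queue settings. The sketch below reuses two of the strings above; the job script name is hypothetical, and the commands are printed rather than submitted so that the sketch runs without LSF.

```shell
# Hypothetical job script; the requirement strings come from the examples above.
job=./my_job.sh

# Per-job memory reservation through the rusage string of bsub -R.
rusage_cmd="bsub -R \"rusage[mem=10/job:duration=10:decay=0]\" ${job}"

# Per-task GPU reservation through bsub -gpu.
gpu_cmd="bsub -gpu \"num=2/task:mode=shared:j_exclusive=yes\" ${job}"

# Print the commands instead of submitting them.
echo "$rusage_cmd"
echo "$gpu_cmd"
```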

Stripe tasks of a parallel job across free resources on candidate hosts

You can now enable LSF to stripe tasks of a parallel job across the free resources of the candidate hosts.

Enable job striping by using the stripe keyword in the span string of the resource requirements in the bsub -R option, or in the RES_REQ parameter in the lsb.applications or lsb.queues file. The span string at the job level overrides the span string at the application level, which overrides the span string at the queue level. You can limit the maximum number of tasks that are allocated to a candidate host by specifying a number for the stripe keyword.

For example:
  • span[stripe]
  • span[stripe=2]
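
As a sketch of how the per-host cap interacts with the task count, the following builds a job-level submission with a stripe limit and computes the smallest number of hosts that could hold the job. The task count and job script name are hypothetical; the host calculation is simple arithmetic that follows from the cap, not a description of the LSF scheduler.

```shell
# Hypothetical task count, per-host cap, and job script.
tasks=8
per_host=2
job=./my_parallel_job.sh

# Request 8 tasks, with at most 2 tasks on any candidate host.
stripe_cmd="bsub -n ${tasks} -R \"span[stripe=${per_host}]\" ${job}"

# With at most per_host tasks on each host, the job needs at least
# ceil(tasks / per_host) hosts.
min_hosts=$(( (tasks + per_host - 1) / per_host ))

# Print the command instead of submitting it.
echo "$stripe_cmd"
echo "minimum hosts: ${min_hosts}"
```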