LSB_JOB_MEMLIMIT

Determines whether the memory limit is a per-process limit enforced by the OS or whether it is a per-job limit enforced by LSF.

Syntax

LSB_JOB_MEMLIMIT=y | n

Description

  • The per-process limit is enforced by the OS when the memory allocated to one process of the job exceeds the memory limit.
  • The per-job limit is enforced by LSF when the sum of the memory allocated to all processes of the job exceeds the memory limit.
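The two checks described above differ only in what is compared against the limit. The following is a minimal illustrative sketch, not LSF code: the function names and the sample figures are hypothetical, and memory values are assumed to be in the same unit as the limit.

```python
def exceeds_per_process_limit(process_mem, mem_limit):
    """Per-process view: True if any single process is above the limit."""
    return any(mem > mem_limit for mem in process_mem)


def exceeds_per_job_limit(process_mem, mem_limit):
    """Per-job view: True if all processes together are above the limit."""
    return sum(process_mem) > mem_limit


# Hypothetical job with three processes and a memory limit of 1000.
usage = [400, 400, 400]
print(exceeds_per_process_limit(usage, 1000))  # False: no process exceeds 1000
print(exceeds_per_job_limit(usage, 1000))      # True: 1200 in total exceeds 1000
```

In this example the same job passes the per-process check but fails the per-job check, which is the practical difference between the two settings.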

This parameter applies to memory limits set when a job is submitted with bsub -M mem_limit, and to memory limits set for queues with the MEMLIMIT parameter in the lsb.queues file.

The setting of the LSB_JOB_MEMLIMIT parameter has the following effect on how the limit is enforced:


LSB_JOB_MEMLIMIT    LSF-enforced per-job limit    OS-enforced per-process limit
y                   Enabled                       Disabled
n or not defined    Disabled                      Enabled

When LSB_JOB_MEMLIMIT is y, the LSF-enforced per-job limit is enabled, and the OS-enforced per-process limit is disabled.

When LSB_JOB_MEMLIMIT is n or not defined, the LSF-enforced per-job limit is disabled, and the OS-enforced per-process limit is enabled.

LSF-enforced per-job limit
When the total memory allocated to all processes in the job exceeds the memory limit, LSF sends the following signals to kill the job: SIGINT, SIGTERM, then SIGKILL. The interval between signals is 10 seconds by default.

On UNIX, the time interval between SIGINT, SIGTERM, and SIGKILL can be configured with the JOB_TERMINATE_INTERVAL parameter in lsb.params.
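The signal sequence can be pictured as a simple schedule. The sketch below is illustrative only: the function name is hypothetical, offsets are measured in seconds from the first signal, and the default of 10 reflects the default interval stated above.

```python
def termination_schedule(interval_seconds=10):
    """Return (signal name, seconds after the first signal) in the order sent."""
    signals = ["SIGINT", "SIGTERM", "SIGKILL"]
    return [(name, index * interval_seconds) for index, name in enumerate(signals)]


for name, offset in termination_schedule():
    print(f"{name} at +{offset}s")
```

Passing a different value, for example termination_schedule(30), models the spacing that would result from a different interval setting.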

OS-enforced per-process limit
When the memory allocated to one process of the job exceeds the memory limit, the operating system enforces the limit. LSF passes the memory limit to the operating system. Some operating systems apply the memory limit to each process, and some do not enforce the memory limit at all.

OS memory limit enforcement is only available on systems that support RLIMIT_RSS for setrlimit().
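As a rough way to see whether a host exposes this interface, the following sketch uses Python's standard resource module, which wraps getrlimit() and setrlimit() on UNIX. The function name is hypothetical, and this is not how LSF itself performs the check. The presence of the constant shows only that the interface exists, not that the operating system acts on the limit.

```python
def os_exposes_rss_limit():
    """Return True if this platform defines RLIMIT_RSS for setrlimit()."""
    try:
        import resource  # UNIX-only standard library module
    except ImportError:
        return False  # for example, Windows has no resource module
    return hasattr(resource, "RLIMIT_RSS")


if os_exposes_rss_limit():
    import resource
    soft, hard = resource.getrlimit(resource.RLIMIT_RSS)
    print("RLIMIT_RSS is defined; current soft and hard limits:", soft, hard)
else:
    print("RLIMIT_RSS is not available on this platform")
```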

The following operating systems do not support the memory limit at the OS level, so jobs on them run without a memory limit:
  • Windows
  • Sun Solaris 2.x

Default

Not defined. The per-process memory limit enforced by the OS is enabled, and the per-job memory limit enforced by LSF is disabled.

Notes

To make changes to the LSB_JOB_MEMLIMIT parameter take effect, use the command bctrld restart sbd all to restart all sbatchds in the cluster.

If LSB_JOB_MEMLIMIT is set, it overrides the LSB_MEMLIMIT_ENFORCE parameter, which is then ignored.

The difference between LSB_JOB_MEMLIMIT set to y and LSB_MEMLIMIT_ENFORCE set to y is that with LSB_JOB_MEMLIMIT, only the per-job memory limit enforced by LSF is enabled. The per-process memory limit enforced by the OS is disabled. With LSB_MEMLIMIT_ENFORCE set to y, both the per-job memory limit enforced by LSF and the per-process memory limit enforced by the OS are enabled.
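The interaction between the two parameters can be summarized as a small decision function. This is a sketch of one reading of the rules above, not LSF code: the function name is hypothetical, None stands for "not defined", and it assumes that LSB_MEMLIMIT_ENFORCE set to n behaves like the default.

```python
def effective_enforcement(job_memlimit=None, memlimit_enforce=None):
    """Return (lsf_per_job_enabled, os_per_process_enabled).

    Each argument is "y", "n", or None when the parameter is not defined.
    """
    def is_yes(value):
        return value is not None and value.lower() == "y"

    if job_memlimit is not None:
        # LSB_JOB_MEMLIMIT is set, so LSB_MEMLIMIT_ENFORCE is ignored.
        return (True, False) if is_yes(job_memlimit) else (False, True)
    if is_yes(memlimit_enforce):
        # Both kinds of enforcement are enabled.
        return (True, True)
    # Default: only the OS-enforced per-process limit applies.
    return (False, True)


print(effective_enforcement(job_memlimit="y", memlimit_enforce="y"))  # (True, False)
print(effective_enforcement(memlimit_enforce="y"))                    # (True, True)
print(effective_enforcement())                                        # (False, True)
```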

Changing the default Terminate job control action

You can define a different Terminate action in the lsb.queues file with the parameter JOB_CONTROLS if you do not want the job to be killed. For more details on job controls, see Administering IBM® Spectrum LSF.

Limitations

If a job is running and the parameter is changed, LSF is not able to reset the type of limit enforcement for running jobs.
  • If the parameter is changed from per-process limit enforced by the OS to per-job limit enforced by LSF (LSB_JOB_MEMLIMIT=n or not defined changed to LSB_JOB_MEMLIMIT=y), both the per-process limit and the per-job limit affect the running job. This means that signals may be sent to the job either when the memory allocated to an individual process exceeds the memory limit or when the sum of the memory allocated to all processes of the job exceeds the limit. A job that is running may be killed by LSF.
  • If the parameter is changed from per-job limit enforced by LSF to per-process limit enforced by the OS (LSB_JOB_MEMLIMIT=y changed to LSB_JOB_MEMLIMIT=n or not defined), the job is allowed to run without limits because the per-process limit was previously disabled.
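The two cases above can be restated as a lookup on the value in effect when the job started and the value in effect now. This sketch is illustrative only: the function name and the labels "lsf_per_job" and "os_per_process" are hypothetical, and None stands for "not defined".

```python
def enforcement_for_running_job(value_at_start, value_now):
    """Return the set of limits that affect a job that is already running."""
    started_per_job = value_at_start == "y"
    now_per_job = value_now == "y"
    if not started_per_job and now_per_job:
        # Changed from OS-enforced to LSF-enforced: both limits affect the job.
        return {"os_per_process", "lsf_per_job"}
    if started_per_job and not now_per_job:
        # Changed from LSF-enforced to OS-enforced: the job runs without limits.
        return set()
    # Parameter unchanged: the original enforcement type still applies.
    return {"lsf_per_job"} if started_per_job else {"os_per_process"}


print(sorted(enforcement_for_running_job("n", "y")))  # ['lsf_per_job', 'os_per_process']
print(sorted(enforcement_for_running_job("y", "n")))  # []
```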

See also

LSB_MEMLIMIT_ENFORCE, LSB_MOD_ALL_JOBS, lsb.queues, bsub, JOB_TERMINATE_INTERVAL in lsb.params