How resource allocation limits work
By default, resource consumers like users, hosts, queues, or projects are not limited in the resources available to them for running jobs.
Resource allocation limits configured in lsb.resources specify the following restrictions:
- The maximum amount of a requested resource that can be allocated during job scheduling for different classes of jobs to start
- Which resource consumers the limits apply to
If all of the resource has been consumed, no more jobs can be started until some of the resource is released.
For example, by limiting the maximum amount of memory for each of your hosts, you can make sure that your system operates at optimal performance. By defining a memory limit for some users submitting jobs to a particular queue and a specified set of hosts, you can prevent these users from using up all the memory in the system at one time.
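For example, a Limit section along the following lines caps the total memory that two users can consume in the normal queue on a pair of hosts. This is only a sketch; the user, host, and limit names are placeholders.
Begin Limit
NAME   = dev_mem_limit
USERS  = user1 user2
QUEUES = normal
HOSTS  = hostA hostB
MEM    = 2048
End Limit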
Jobs must specify resource requirements
For limits to apply, the job must specify resource requirements (the bsub -R rusage string or the RES_REQ parameter in the lsb.queues file). For example, a memory allocation limit of 4 MB is configured in lsb.resources:
Begin Limit
NAME = mem_limit1
MEM = 4
End Limit
A job is submitted with an rusage resource requirement that exceeds this limit:
bsub -R "rusage[mem=5]" uname
and remains pending:
bjobs -p 600
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
600 user1 PEND normal hostA uname Aug 12 14:05
Resource (mem) limit defined cluster-wide has been reached;
A job is submitted with a resource requirement within the configured limit:
bsub -R"rusage[mem=3]" sleep 100
and is allowed to run:
bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
600 user1 PEND normal hostA uname Aug 12 14:05
604 user1 RUN normal hostA sleep 100 Aug 12 14:09
Resource usage limits and resource allocation limits
Resource allocation limits are not the same as resource usage limits, which are enforced during job run time. For example, you set CPU limits, memory limits, and other limits that take effect after a job starts running.
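The difference is visible at submission time. As a rough sketch (myjob is a placeholder command; the units of -M and -c depend on cluster settings such as LSF_UNIT_FOR_LIMITS):
bsub -M 4096 -c 10 myjob
# -M (memory) and -c (CPU time) set usage limits that are enforced while the job runs
bsub -R "rusage[mem=4096]" myjob
# the rusage request is what allocation limits in lsb.resources are checked
# against when the scheduler decides whether to start the job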
Resource reservation limits and resource allocation limits
Resource allocation limits are not the same as queue-based resource reservation limits, which are enforced during job submission. The parameter RESRSV_LIMIT in the lsb.queues file specifies allowed ranges of resource values, and jobs submitted with resource requests outside of this range are rejected.
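As an illustration only, a queue might restrict the memory that jobs can reserve to a range between 16 and 32 units. The bracketed resource/minimum/maximum form shown here is an assumption; see the RESRSV_LIMIT description in the lsb.queues reference for the exact syntax.
Begin Queue
QUEUE_NAME   = normal
RESRSV_LIMIT = [mem 16 32]
End Queue
With such a range in place, a job whose rusage memory request falls outside 16 to 32 is rejected at submission instead of being left pending.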
How LSF enforces limits
Resource allocation limits are enforced so that they apply to the following:
- All jobs in the cluster
- Several kinds of resources:
  - Job slots by host
  - Job slots per processor
  - Running and suspended jobs
  - Memory (MB or percentage)
  - Swap space (MB or percentage)
  - Tmp space (MB or percentage)
  - Other shared resources
- Several kinds of resource consumers:
  - Users and user groups (all users or per-user)
  - Hosts and host groups (all hosts or per-host)
  - Queues (all queues or per-queue)
  - Projects (all projects or per-project)
- Combinations of consumers (see the sketch after this list):
  - For jobs running on different hosts in the same queue
  - For jobs running from different queues on the same host
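For example, a minimal sketch of a Limit section that combines consumers, capping the memory that jobs from two queues can use on any single host (queue names and values are illustrative):
Begin Limit
NAME     = queue_host_mem
QUEUES   = normal short
PER_HOST = all
MEM      = 1024
End Limit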
How LSF counts resources
Resources on a host are not available if they are taken by jobs that have been started, but have not yet finished. This means running and suspended jobs count against the limits for queues, users, hosts, projects, and processors that they are associated with.
Job slot limits
Job slot limits can correspond to the maximum number of jobs that can run at any point in time. For example, a queue cannot start jobs if it has no job slots available, and jobs cannot run on hosts that have no available job slots.
Limits such as QJOB_LIMIT (lsb.queues), HJOB_LIMIT (lsb.queues), UJOB_LIMIT (lsb.queues), MXJ (lsb.hosts), JL/U (lsb.hosts), MAX_JOBS (lsb.users), and MAX_PEND_SLOTS (lsb.users and lsb.params) limit the number of job slots. When the workload is sequential, job slots are usually equivalent to jobs. For parallel or distributed applications, these are true job slot limits and not job limits.
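For instance, a queue definition in the lsb.queues file might combine several of these slot limits. The values here are illustrative: QJOB_LIMIT caps the total slots the queue can use, UJOB_LIMIT caps the slots any one user can use in the queue, and HJOB_LIMIT caps the slots the queue can use on any one host.
Begin Queue
QUEUE_NAME = normal
QJOB_LIMIT = 100
UJOB_LIMIT = 10
HJOB_LIMIT = 4
End Queue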
Job limits
Job limits, specified by JOBS in a Limit section in lsb.resources, correspond to the maximum number of running and suspended jobs allowed at any point in time. MAX_PEND_JOBS (lsb.users and lsb.params) limits the number of pending jobs. If both job limits and job slot limits are configured, the most restrictive limit is applied.
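A sketch of a Limit section that caps both jobs and slots for each user, so that for parallel workloads whichever limit is reached first applies (the values are illustrative):
Begin Limit
NAME     = per_user_jobs
PER_USER = all
JOBS     = 20
SLOTS    = 40
End Limit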
Resource reservation and backfill
When processor or memory reservation occurs, the reserved resources count against the limits for users, queues, hosts, projects, and processors. When backfilling of parallel jobs occurs, the backfill jobs do not count against any limits.
IBM® Spectrum LSF multicluster capability
Limits apply only to the cluster where the lsb.resources file is configured. If the cluster leases hosts from another cluster, limits are enforced on those hosts as if they were local hosts.
Switched jobs can exceed resource allocation limits
If a switched job (the bswitch command) has not been dispatched, the job behaves as if it were submitted to the new queue in the first place, and the JOBS limit is enforced in the target queue. If a switched job has already been dispatched, it keeps running in the target queue even if that queue's resource allocation limits are exceeded as a result. For example, given the following TMP and JOBS limits for the normal and short queues:
Begin Limit
USERS QUEUES SLOTS TMP JOBS
- normal - 20 2
- short - 20 2
End Limit
bsub -q normal -R"rusage[tmp=20]" sleep 1000
bsub -q short -R"rusage[tmp=20]" sleep 1000
bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
16 user1 RUN normal hosta hosta sleep 1000 Aug 30 16:26
17 user1 PEND normal hosta sleep 1000 Aug 30 16:26
18 user1 PEND normal hosta sleep 1000 Aug 30 16:26
19 user1 RUN short hosta hosta sleep 1000 Aug 30 16:26
20 user1 PEND short hosta sleep 1000 Aug 30 16:26
21 user1 PEND short hosta sleep 1000 Aug 30 16:26
blimits
INTERNAL RESOURCE LIMITS:
NAME USERS QUEUES SLOTS TMP JOBS
NONAME000 - normal - 20/20 1/2
NONAME001 - short - 20/20 1/2
bswitch short 16
bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
17 user1 RUN normal hosta hosta sleep 1000 Aug 30 16:26
18 user1 PEND normal hosta sleep 1000 Aug 30 16:26
19 user1 RUN short hosta hosta sleep 1000 Aug 30 16:26
16 user1 RUN short hosta hosta sleep 1000 Aug 30 16:26
20 user1 PEND short hosta sleep 1000 Aug 30 16:26
21 user1 PEND short hosta sleep 1000 Aug 30 16:26
blimits
INTERNAL RESOURCE LIMITS:
NAME USERS QUEUES SLOTS TMP JOBS
NONAME000 - normal - 20/20 1/2
NONAME001 - short - 40/20 2/2
bswitch short 17
bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
18 user1 RUN normal hosta hosta sleep 1000 Aug 30 16:26
19 user1 RUN short hosta hosta sleep 1000 Aug 30 16:26
16 user1 RUN short hosta hosta sleep 1000 Aug 30 16:26
17 user1 RUN short hosta hosta sleep 1000 Aug 30 16:26
20 user1 PEND short hosta sleep 1000 Aug 30 16:26
21 user1 PEND short hosta sleep 1000 Aug 30 16:26
blimits
INTERNAL RESOURCE LIMITS:
NAME USERS QUEUES SLOTS TMP JOBS
NONAME000 - normal - 20/20 1/2
NONAME001 - short - 60/20 3/2
Limits for resource consumers
Resource allocation limits are applied according to the kind of resource consumer: host groups, compute units, users, and user groups.
Host groups and compute units
If a limit is specified for a host group or compute unit, the total amount of a resource used by all hosts in that group or unit is counted. If a host is a member of more than one group, each job running on that host is counted against the limit for all groups to which the host belongs.
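For example, a limit on a host group (hgroup1 is a hypothetical group defined in the lsb.hosts file) counts usage across all of its member hosts together:
Begin Limit
NAME  = hgroup_tmp
HOSTS = hgroup1
TMP   = 4096
End Limit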
Limits for users and user groups
Jobs are normally queued on a first-come, first-served (FCFS) basis. It is possible for some users to abuse the system by submitting a large number of jobs; jobs from other users must wait until these jobs complete. Limiting resources by user prevents users from monopolizing all the resources.
Users can submit an unlimited number of jobs, but if they have reached their limit for any resource, the rest of their jobs stay pending, until some of their running jobs finish or resources become available.
If a limit is specified for a user group, the total amount of a resource used by all users in that group is counted. If a user is a member of more than one group, each of that user’s jobs is counted against the limit for all groups to which that user belongs.
Use the keyword all to configure limits that apply to each user or user group in a cluster. This is useful if you have a large cluster but only want to exclude a few users from the limit definition.
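For example, assuming the usual ~ exclusion syntax for consumer lists (the user names are placeholders), a per-user limit can cover every user except two:
Begin Limit
NAME     = everyone_but_two
PER_USER = all ~userA ~userB
SLOTS    = 8
End Limit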
You can use ENFORCE_ONE_UG_LIMITS=Y combined with bsub -G to have better control over limits when user groups have overlapping members. When set to Y, only the specified user group’s limits (or those of any parent user group) are enforced. If set to N, the most restrictive job limits of any overlapping user/user group are enforced.
Per-user limits on users and groups
Per-user limits are enforced individually: on each user listed, or on each user in a user group listed. If a user group contains a subgroup, the limit also applies to each member of the subgroup, recursively. Per-user limits that use the keyword all apply to each user in the cluster. If user groups are configured, the limit applies to each member of the user group, not to the group as a whole.
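The difference between group-wide and per-user enforcement can be sketched with two Limit sections; ugroupA is a hypothetical user group defined in the lsb.users file. The first section gives all members of ugroupA 100 slots to share; the second gives each individual member up to 10 slots.
Begin Limit
NAME  = group_total
USERS = ugroupA
SLOTS = 100
End Limit

Begin Limit
NAME     = group_each
PER_USER = ugroupA
SLOTS    = 10
End Limit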
Resizable jobs
When a resize allocation request is scheduled for a resizable job, all resource allocation limits (job and slot) are enforced.
Once the new allocation is satisfied, it consumes limits such as SLOTS, MEM, SWAP, and TMP for queues, users, projects, hosts, or the cluster as a whole. However, the new allocation does not consume job limits such as job group limits, job array limits, and the non-host-level JOBS limit.
Releasing part of an allocation from a resizable job frees general limits that belong to the allocation, but not the actual job limits.
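On the submission side, a rough sketch (myjob and the slot range are placeholders): an autoresizable job asks for a minimum and maximum number of slots, and each time the scheduler grows the allocation, the additional slots and their rusage are checked against, and counted toward, the SLOTS, MEM, SWAP, and TMP limits described above.
bsub -ar -n "4,16" -R "rusage[mem=512]" myjob
# -ar marks the job autoresizable; it starts with at least 4 slots and LSF
# may later grow the allocation toward 16 slots, subject to allocation limits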