Affinity string
An affinity resource requirement string specifies CPU and memory binding requirements for the tasks of jobs. An affinity[] resource requirement section controls CPU and memory resource allocations and specifies the distribution of processor units within a host according to the hardware topology information that LSF collects.
Affinity sections are accepted by bsub -R and, for non-running jobs, by bmod -R. They can also be specified in the RES_REQ parameter in lsb.applications and lsb.queues.
Syntax
The affinity string supports the following syntax:
affinity[pu_type[*count] | [pu_type(pu_num[,pu_options])[*count]] [:cpubind=numa | socket | core | thread] [:membind=localonly | localprefer] [:distribute=task_distribution]]
- pu_type[*count] | [pu_type(pu_num[,pu_options])[*count]]
The requested processor units for the job tasks are specified by pu_type, which indicates the type and number of processor units the tasks can run on. The processor unit type can be one of numa, socket, core, or thread. pu_num specifies the number of processor units for each task.
For compatibility with IBM LoadLeveler, the options mcm and cpu are also supported. mcm is an alias for the numa processor unit type, and cpu is an alias for the thread processor unit type.
For example, the following affinity requirement requests 5 cores per task:
affinity[core(5)]
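The alias rule above can be sketched in a few lines of Python. This is an illustration only: pu_request is a hypothetical helper, not part of LSF, and it resolves the mcm and cpu aliases to their canonical names before formatting the string.

```python
# Hypothetical helper (not an LSF API): formats the simplest affinity[] request.
PU_TYPES = ("numa", "socket", "core", "thread")
ALIASES = {"mcm": "numa", "cpu": "thread"}  # LoadLeveler-compatible aliases


def pu_request(pu_type: str, pu_num: int) -> str:
    """Return a string such as 'affinity[core(5)]' after resolving aliases."""
    canonical = ALIASES.get(pu_type, pu_type)
    if canonical not in PU_TYPES:
        raise ValueError(f"unknown processor unit type: {pu_type!r}")
    if pu_num < 1:
        raise ValueError("pu_num must be a positive integer")
    return f"affinity[{canonical}({pu_num})]"


print(pu_request("core", 5))  # affinity[core(5)]
print(pu_request("mcm", 1))   # affinity[numa(1)]
```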
Further processor unit specification is provided by pu_options, which have the following syntax:
same=level[,exclusive=(level[,scope])]
where:
- same=level
Controls where processor units are allocated from. Processor unit level can be one of numa, socket, core, or thread. The level for same must be higher than the specified processor unit type.
For example, the following requests 2 threads from the same core:
affinity[thread(2,same=core)]
- exclusive=(level[,scope [| scope]])
Constrains the level at which processor units can be allocated exclusively to a job or task. The level for exclusive can be one of numa, socket, or core. The scope for exclusive can be one of the following, or a combination separated by a logical OR (|):
intask means that the allocated processor unit cannot be shared by different allocations in the same task.
injob means that the allocated processor unit cannot be shared by different tasks in the same job.
alljobs means that the allocated processor unit cannot be shared by different jobs. The alljobs scope can only be used if EXCLUSIVE=Y is configured in the queue.
For example, the following requests 2 threads for each task from the same core, with the socket allocated exclusively within the job. No other tasks in the same job can run on the allocated socket, but other jobs, or tasks from other jobs, can run on that socket: affinity[thread(2,same=core,exclusive=(socket,injob))]
Note: EXCLUSIVE=Y or EXCLUSIVE=CU[cu_type] must be configured in the queue to enable affinity jobs to use CPUs exclusively, when the alljobs scope is specified in the exclusive option.
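The pu_options rules can be checked mechanically. The following Python sketch is not an LSF interface: pu_spec and its arguments are hypothetical names, and the validation covers only the constraints stated above (the same level must be higher than the processor unit type, and exclusive accepts the listed levels and scopes).

```python
LEVELS = ["thread", "core", "socket", "numa"]  # lowest to highest
EXCLUSIVE_LEVELS = {"numa", "socket", "core"}
SCOPES = {"intask", "injob", "alljobs"}


def pu_spec(pu_type, pu_num, same=None, exclusive=None):
    """Format pu_type(pu_num[,pu_options]).

    exclusive is None or a (level, [scope, ...]) pair; names are illustrative.
    """
    options = []
    if same is not None:
        # same= must name a level above the requested processor unit type.
        if LEVELS.index(same) <= LEVELS.index(pu_type):
            raise ValueError("same= level must be higher than the processor unit type")
        options.append(f"same={same}")
    if exclusive is not None:
        level, scopes = exclusive
        if level not in EXCLUSIVE_LEVELS:
            raise ValueError(f"invalid exclusive level: {level!r}")
        unknown = set(scopes) - SCOPES
        if unknown:
            raise ValueError(f"invalid exclusive scope(s): {sorted(unknown)}")
        scope_part = ("," + "|".join(scopes)) if scopes else ""
        options.append(f"exclusive=({level}{scope_part})")
    suffix = ("," + ",".join(options)) if options else ""
    return f"{pu_type}({pu_num}{suffix})"


print(pu_spec("thread", 2, same="core", exclusive=("socket", ["injob"])))
# thread(2,same=core,exclusive=(socket,injob))
```

Whether the queue allows the alljobs scope is a configuration matter (EXCLUSIVE in the queue) that this sketch does not check.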
- *count
Specifies a multiple of processor unit requests. This is convenient for requesting the same processor unit allocation for a number of tasks.
For example, the following affinity request allocates 4 threads per task from 2 cores, 2 threads in each core. The cores must come from different sockets:
affinity[thread(2,same=core,exclusive=(socket,intask))*2]
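The arithmetic behind *count can be sketched as follows. The function units_per_task is a hypothetical helper that reads only the pu_type(pu_num...)*count form with two simple regular expressions; it is not a full parser of the affinity syntax.

```python
import re


def units_per_task(affinity: str):
    """Return (pu_type, total processor units per task) for pu_type(pu_num...)[*count]."""
    head = re.search(r"affinity\[(\w+)\((\d+)", affinity)
    if head is None:
        raise ValueError("expected the form affinity[pu_type(pu_num...)]")
    count = re.search(r"\)\*(\d+)", affinity)
    multiplier = int(count.group(1)) if count else 1  # no *count means one request
    return head.group(1), int(head.group(2)) * multiplier


print(units_per_task("affinity[thread(2,same=core,exclusive=(socket,intask))*2]"))
# ('thread', 4)
print(units_per_task("affinity[core(5)]"))
# ('core', 5)
```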
- cpubind=numa | socket | core | thread
Specifies the CPU binding policy for tasks. If the level of cpubind is the same as or lower than the specified processor unit type (pu_type), the lowest processor unit is used. If the level of cpubind is higher than the requested processor unit type, the entire processor unit containing the allocation is used for CPU binding.
For example:
affinity[core(2):cpubind=thread]
If the allocated cores are /0/0/0 and /0/0/1, the CPU binding list will contain all threads under /0/0/0 and /0/0/1.
affinity[core(2):cpubind=socket]
If the allocated cores are /0/0/0 and /0/0/1, the CPU binding list will contain all threads under the socket /0/0.
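The two cpubind cases can be reproduced with a small Python model. It is a sketch under stated assumptions: the host layout is invented (one NUMA node, one socket, 4 cores, 2 threads per core), paths follow the /numa/socket/core notation used above with a fourth component for the thread, and binding_list is a hypothetical function, not how LSF computes the list internally.

```python
# Path depth of each level in "/numa/socket/core/thread" notation.
DEPTH = {"numa": 1, "socket": 2, "core": 3, "thread": 4}


def binding_list(allocated, pu_type, cpubind, host_threads):
    """Return the thread paths a task would be bound to."""
    if DEPTH[cpubind] >= DEPTH[pu_type]:
        # cpubind is the same level or lower: bind within the allocation itself.
        prefixes = set(allocated)
    else:
        # cpubind is higher: widen to the unit that contains the allocation.
        keep = DEPTH[cpubind] + 1  # +1 for the empty string before the leading "/"
        prefixes = {"/".join(path.split("/")[:keep]) for path in allocated}
    return sorted(
        t for t in host_threads
        if any(t == p or t.startswith(p + "/") for p in prefixes)
    )


# Hypothetical host: NUMA node 0, socket 0, 4 cores, 2 threads per core.
host_threads = [f"/0/0/{core}/{thread}" for core in range(4) for thread in range(2)]
cores = ["/0/0/0", "/0/0/1"]

print(binding_list(cores, "core", "thread", host_threads))  # threads of cores 0 and 1
print(binding_list(cores, "core", "socket", host_threads))  # every thread in socket /0/0
```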
- membind=localonly | localprefer
Specifies the physical NUMA memory binding policy for tasks.
localonly limits the processes within the policy to allocate memory only from the local NUMA node. Memory is allocated if the available memory is greater than or equal to the memory requested by the task.
localprefer specifies that LSF should try to allocate physical memory from the local NUMA node first. If this is not possible, LSF allocates memory from a remote NUMA node. Memory is allocated if the available memory is greater than zero.
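The two allocation conditions can be written as a single predicate. This is only a sketch of the rule as stated above: membind_allows is a hypothetical name, and it assumes that "available memory" refers to the memory available on the local NUMA node, which the text does not spell out.

```python
def membind_allows(policy: str, available_mb: int, requested_mb: int) -> bool:
    """Return True if memory would be allocated under the given membind policy."""
    if policy == "localonly":
        # Local node must be able to satisfy the whole request.
        return available_mb >= requested_mb
    if policy == "localprefer":
        # Any available memory is enough; the remainder may come from a remote node.
        return available_mb > 0
    raise ValueError(f"unknown membind policy: {policy!r}")


print(membind_allows("localonly", 512, 1024))    # False
print(membind_allows("localprefer", 512, 1024))  # True
```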
- distribute=task_distribution
Specifies how LSF distributes tasks of a submitted job on a host. Specify task_distribution according to the following syntax:
- pack | pack(type=1)
LSF attempts to pack tasks in the same job on as few processor units as possible, in order to make processor units available for later jobs with the same binding requirements.
pack(type=1) forces LSF to pack all tasks for the job into the processor unit specified by type, where type is one of numa, socket, core, or thread. The difference between pack and pack(type=1) is that LSF will pend the job if pack(type=1) cannot be satisfied.
Use pack to allow your application to use memory locality.
For example, a job has the following affinity requirements:
bsub -n 6 -R "span[hosts=1] affinity[core(1):distribute=pack]"
The job asks for 6 slots, running on a single host. Each slot maps to 1 core, and LSF tries to pack all 6 cores as close together as possible on a single NUMA node or socket.
The following example packs all job tasks on a single NUMA node:
affinity[core(1,exclusive=(socket,injob)):distribute=pack(numa=1)]
In this allocation, each task needs 1 core and no other tasks from the same job can allocate CPUs from the same socket. All tasks in the same job are packed on one NUMA node.
- balance
LSF attempts to distribute job tasks equally across all processor units. Use balance to make as many processor units available to your job as possible.
- any
LSF attempts no job task placement optimization. LSF chooses the first available processor units for task placement.
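When submissions are scripted, the distribute keyword and the surrounding bsub arguments can be assembled programmatically. The sketch below is illustrative: distribute_clause and bsub_argv are hypothetical helpers, ./my_app is a placeholder job command, and the script only builds and prints the command line without running bsub.

```python
import shlex

PU_TYPES = ("numa", "socket", "core", "thread")


def distribute_clause(policy, pack_type=None):
    """Return distribute=pack, distribute=pack(type=1), distribute=balance, or distribute=any."""
    if policy not in ("pack", "balance", "any"):
        raise ValueError(f"unknown task distribution: {policy!r}")
    if pack_type is None:
        return f"distribute={policy}"
    if policy != "pack" or pack_type not in PU_TYPES:
        raise ValueError("pack_type requires policy 'pack' and a valid processor unit type")
    return f"distribute=pack({pack_type}=1)"


def bsub_argv(slots, pu, clause, job_command):
    """Build an argument list for a single-host job; job_command is a list of strings."""
    res_req = f"span[hosts=1] affinity[{pu}:{clause}]"
    return ["bsub", "-n", str(slots), "-R", res_req, *job_command]


# "./my_app" is a placeholder; nothing is submitted here.
argv = bsub_argv(6, "core(1)", distribute_clause("pack"), ["./my_app"])
print(shlex.join(argv))
print(distribute_clause("pack", "numa"))  # distribute=pack(numa=1)
```

Passing the arguments as a list (for example to subprocess.run) avoids shell quoting problems with the brackets and parentheses in the resource requirement string.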
Examples
affinity[core(5,same=numa):cpubind=numa:membind=localonly]
Each task requests 5 cores in the same NUMA node and binds the tasks on the NUMA node with mandatory memory binding.
The following binds a multithreaded job on a single NUMA node:
affinity[core(3,same=numa):cpubind=numa:membind=localprefer]
The following distributes tasks across sockets:
affinity[core(2,same=socket,exclusive=(socket,injob|alljobs)):cpubind=socket]
Each task needs 2 cores from the same socket and binds each task at the socket level. The allocated socket is exclusive: no other tasks can use it.
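Strings like these follow directly from the syntax, so they can be composed from their parts. The following Python sketch uses a hypothetical affinity_string helper (not an LSF API) that joins the processor unit specification with optional cpubind, membind, and distribute keywords.

```python
CPUBIND_LEVELS = ("numa", "socket", "core", "thread")
MEMBIND_POLICIES = ("localonly", "localprefer")


def affinity_string(pu, cpubind=None, membind=None, distribute=None):
    """Join a processor unit spec with optional colon-separated keywords."""
    parts = [pu]
    if cpubind is not None:
        if cpubind not in CPUBIND_LEVELS:
            raise ValueError(f"invalid cpubind level: {cpubind!r}")
        parts.append(f"cpubind={cpubind}")
    if membind is not None:
        if membind not in MEMBIND_POLICIES:
            raise ValueError(f"invalid membind policy: {membind!r}")
        parts.append(f"membind={membind}")
    if distribute is not None:
        parts.append(f"distribute={distribute}")
    return "affinity[" + ":".join(parts) + "]"


print(affinity_string("core(5,same=numa)", cpubind="numa", membind="localonly"))
# affinity[core(5,same=numa):cpubind=numa:membind=localonly]
print(affinity_string("core(3,same=numa)", cpubind="numa", membind="localprefer"))
# affinity[core(3,same=numa):cpubind=numa:membind=localprefer]
```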
Affinity string in application profiles and queues
A job-level affinity string section overwrites an application-level section, which overwrites a queue-level section (if a given level is present).
See Resource requirements for information about how resource requirements in application profiles are resolved with queue-level and job-level resource requirements.
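The precedence rule for affinity sections (job level over application level over queue level) can be expressed as a simple lookup. This is a minimal sketch with a hypothetical function name; it models only the affinity section and not how LSF merges the other resource requirement sections.

```python
def effective_affinity(job=None, application=None, queue=None):
    """Return the affinity section that applies, or None if no level defines one."""
    # Highest precedence first: job, then application profile, then queue.
    for section in (job, application, queue):
        if section is not None:
            return section
    return None


print(effective_affinity(
    job="affinity[core(2)]",
    queue="affinity[core(1):distribute=pack]",
))  # affinity[core(2)]
print(effective_affinity(
    application="affinity[thread(2,same=core)]",
    queue="affinity[core(1)]",
))  # affinity[thread(2,same=core)]
```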