Placing jobs based on available job slots of hosts

LSF has built-in host-based resource slots to support the placement of jobs based on available job slots. Learn how to configure and use the slots resource with existing LSF resource requirements according to a packing or spreading policy.

About this task

  • Packing always places jobs on the hosts with the fewest available slots first. Packing jobs can leave room for larger parallel jobs.
  • Spreading places jobs on the hosts with the most available slots first, so that jobs are spread out across hosts. Spreading jobs maximizes the performance of individual jobs.

The slots keyword represents the available job slots on each host. It is a built-in numeric decreasing resource: when a job occupies job slots on a host, the slots value for that host decreases accordingly. For example, if the MXJ of an LSF host is defined as 8, the slots value is 8 when the host is empty. When 6 job slots are occupied, slots becomes 2.

The slots resource can be used only in the select[] and order[] sections of a job resource requirement string. To apply a job packing or spreading policy, use the order[] section in the job resource requirement. For example, -R "order[-slots]" orders candidate hosts with the fewest available slots first, while -R "order[slots]" orders candidate hosts with the most available slots first.
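The arithmetic and the two orderings can be sketched in a few lines of Python. This is an illustrative model only, not LSF code: the host names, the MXJ values for hostB and hostC, and the helper available_slots are hypothetical. Only the hostA numbers (MXJ of 8 with 6 slots occupied) come from the example above.

```python
# Illustrative model only; this is not how LSF is implemented.
# Hypothetical host table: host name -> (MXJ, job slots currently occupied).
hosts = {"hostA": (8, 6), "hostB": (8, 0), "hostC": (4, 1)}


def available_slots(name):
    """Model of the slots resource: MXJ minus occupied job slots."""
    mxj, used = hosts[name]
    return mxj - used


# order[-slots]: packing, fewest available slots first.
packing = sorted(hosts, key=available_slots)

# order[slots]: spreading, most available slots first.
spreading = sorted(hosts, key=available_slots, reverse=True)

print(available_slots("hostA"))  # 2
print(packing)                   # ['hostA', 'hostC', 'hostB']
print(spreading)                 # ['hostB', 'hostC', 'hostA']
```

In this model, packing fills hostA before touching the empty hostB, which keeps hostB free for a larger parallel job.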

As part of a resource requirement, the order[] section can be used in the following contexts:

  • In a queue RES_REQ in the lsb.queues file
  • In an application profile level RES_REQ in the lsb.applications file
  • In a job-level resource requirement: bsub -R or bmod -R

A job-level order[] clause overrides an application-level section, which overrides a queue-level section.
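The precedence rule can be pictured as a first-match lookup. The following Python sketch is a simplified model with a hypothetical helper name, effective_order. It covers only the order[] section as described above and makes no claim about how LSF combines other sections of a resource requirement.

```python
# Illustrative model only; effective_order is a hypothetical helper, not an LSF API.
def effective_order(job=None, app=None, queue=None):
    """Return the order[] section that applies: job first, then application, then queue."""
    for section in (job, app, queue):
        if section is not None:
            return section
    return None  # no order[] section defined at any level


# The application-level section overrides the queue-level section.
print(effective_order(app="order[slots]", queue="order[-slots]"))  # order[slots]

# A job-level section overrides both.
print(effective_order(job="order[!slots]", app="order[slots]", queue="order[-slots]"))  # order[!slots]
```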

During scheduling, by default, candidate hosts for jobs that share a common resource requirement are selected and sorted only once in each scheduling cycle. This speeds up scheduling in a high-throughput computing environment where a large number of jobs are pending. However, when many jobs are dispatched in a single scheduling cycle, ordering candidate hosts once per cycle might not give accurate scheduling results, because the host order stays fixed even though the slots value changes as new jobs are dispatched. Use an exclamation mark (!) in the order[] section to sort candidate hosts per job, so that changes in the slots value within a single scheduling cycle are recognized.
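The difference between sorting once per cycle and sorting per job can be seen in a small simulation. The Python sketch below is a deliberately simplified model, not the LSF scheduler: it assumes one-slot jobs, a spreading policy, and two hypothetical hosts, and the function name dispatch is invented for illustration.

```python
# Illustrative model only; this is not the LSF scheduling algorithm.
def dispatch(free, jobs, per_job_sort):
    """Place one-slot jobs using a spreading policy (most available slots first)."""
    free = dict(free)  # work on a copy so the caller's table is unchanged
    # Default behavior: candidate hosts are sorted once for the whole cycle.
    order = sorted(free, key=free.get, reverse=True)
    placement = []
    for _ in range(jobs):
        if per_job_sort:
            # Behavior modeled for "!": re-sort using the current slots values.
            order = sorted(free, key=free.get, reverse=True)
        # Take the first host in the order that still has a free slot
        # (this sketch assumes enough free slots exist for all jobs).
        host = next(h for h in order if free[h] > 0)
        free[host] -= 1
        placement.append(host)
    return placement


free_slots = {"hostA": 4, "hostB": 3}  # hypothetical available slots per host
print(dispatch(free_slots, 3, per_job_sort=False))  # ['hostA', 'hostA', 'hostA']
print(dispatch(free_slots, 3, per_job_sort=True))   # ['hostA', 'hostA', 'hostB']
```

With a single sort, all three jobs land on hostA even after hostB has more free slots. With a per-job sort, the third job moves to hostB.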

To use ! in an order[] clause, you must set SCHED_PER_JOB_SORT=Y in the lsb.params file. To make the parameter take effect, run badmin mbdrestart or badmin reconfig on the management host to reconfigure mbatchd.

The following is an example of using the slots resource.

Procedure

  1. Configure RES_REQ in a Queue section in the lsb.queues file.
    Begin Queue
    QUEUE_NAME = myqueue
    …
    RES_REQ    = order[-slots]
    …
    End Queue
  2. Configure RES_REQ in an application profile in the lsb.applications file.
    Begin Application
    NAME       = myapp
    …
    RES_REQ    = order[slots]
    …
    End Application
  3. Run the badmin reconfig command to make the configurations take effect for the queues and applications. The bqueues and bapp commands show the results.
    # bqueues -l myqueue
    QUEUE: myqueue
    …
    RES_REQ:  order[-slots]
    …
    
    # bapp -l myapp
    APPLICATION NAME: myapp
    …
    RES_REQ:  order[slots]
    …

Results

You can now use the new queue and application profile for your job submissions. For example:

bsub -q myqueue myjob

The job is restricted by the queue to use the host with the least available slots first.

bsub -app myapp myjob

The job is restricted by the application profile to use the host with the most available slots first.

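bsub -R "order[!slots]" -J "array[1-3]" myjob

A job array with 3 elements asks for the hosts with the most available slots first, and candidate hosts are sorted again for each job. This requires SCHED_PER_JOB_SORT=Y in the lsb.params file.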