Job scheduling and execution

The following new features affect LSF job scheduling and execution.

Plan-based scheduling and reservations

When enabled, LSF's plan-based scheduling makes allocation plans for jobs based on anticipated future cluster states. LSF reserves resources as needed in order to carry out its plan. This helps to avoid starvation of jobs with special resource requirements.

Plan-based scheduling and reservations addresses a number of issues with the older reservation features in LSF. For example:
  • It ensures that reserved resources can actually be used by the reserving jobs.
  • It predicts start times for reserving jobs more accurately, which leads to better backfill decisions.

Plan-based scheduling aims to replace the legacy LSF reservation policies. When ALLOCATION_PLANNER is enabled in the lsb.params configuration file, parameters related to the old reservation features (that is, SLOT_RESERVE and RESOURCE_RESERVE in the lsb.queues file) are ignored with a warning.
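As a minimal sketch, enabling the planner might look like the following lsb.params fragment. The value Y is an assumption based on the usual LSF convention for on/off parameters; check the lsb.params reference for the values that your LSF version accepts.

```
# lsb.params (sketch; the value Y is assumed, not taken from the text above)
Begin Parameters
ALLOCATION_PLANNER = Y
End Parameters
```

With this setting in place, any SLOT_RESERVE or RESOURCE_RESERVE entries that remain in lsb.queues are ignored with a warning, as described above.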

Automatically extend job run limits

You can now configure the LSF allocation planner to extend the run limit for jobs when the resources that are occupied by the job are not needed by other jobs in queues with the same or higher priority. The allocation planner looks at job plans to determine if there are any other jobs that require the current job's resources.

Enable extendable run limits for jobs submitted to a queue by specifying the EXTENDABLE_RUNLIMIT parameter in the lsb.queues file. Because the allocation planner decides whether to extend the run limit of jobs, you must also enable plan-based scheduling by enabling the ALLOCATION_PLANNER parameter in the lsb.params file.
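The following lsb.queues fragment sketches what such a queue might look like. The queue name is hypothetical, and the BASE, INCREMENT, and GRACE keywords with their minute values are assumptions about the parameter syntax that should be verified against the lsb.queues reference for your LSF version.

```
# lsb.queues (sketch; queue name and all values are illustrative assumptions)
Begin Queue
QUEUE_NAME          = extendable
# Assumed syntax: initial run limit, extension step, and grace period, in minutes
EXTENDABLE_RUNLIMIT = BASE[60] INCREMENT[30] GRACE[10]
End Queue
```

This queue setting has no effect unless ALLOCATION_PLANNER is also enabled in the lsb.params file.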

Default epsub executable files

As with esub programs, you can now define a default epsub program that runs even if you do not define mandatory epsub programs with the LSB_ESUB_METHOD parameter in the lsf.conf file. To define a default epsub program, create an executable file named epsub (with no application name in the file name) in the LSF_SERVERDIR directory.

After the job is submitted, LSF runs the default epsub executable file if it exists in the LSF_SERVERDIR directory, followed by any mandatory epsub executable files that are defined by LSB_ESUB_METHOD, followed by the epsub executable files that are specified by the -a option.
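A default epsub can be any executable file, so a short shell script is enough. The sketch below logs each submission and assumes that LSF exports the LSB_SUB_JOB_ID and LSB_SUB_JOB_QUEUE environment variables to epsub programs; the EPSUB_LOG variable and the describe_job function are hypothetical names used only for this illustration.

```shell
#!/bin/sh
# Sketch of a default epsub, to be saved as $LSF_SERVERDIR/epsub and made executable.
# Assumption: LSF exports LSB_SUB_JOB_ID and LSB_SUB_JOB_QUEUE to epsub programs.
# EPSUB_LOG is a hypothetical variable used only by this sketch.

# Print one line describing the submitted job, with placeholders for unset values.
describe_job() {
    printf 'job=%s queue=%s\n' "${LSB_SUB_JOB_ID:-unknown}" "${LSB_SUB_JOB_QUEUE:-unknown}"
}

# Append the line to a log file (defaults to epsub.log under TMPDIR or /tmp).
describe_job >> "${EPSUB_LOG:-${TMPDIR:-/tmp}/epsub.log}"
```

Because the file is named epsub with no application suffix, it runs for every submission, before any mandatory epsub programs and before any that are requested with the -a option.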

Restrict users and user groups from forwarding jobs to remote clusters

You can now specify a list of users or user groups that can forward jobs to remote clusters when using the LSF multicluster capability. This allows you to prevent jobs from certain users or user groups from being forwarded to an execution cluster, and to set limits on the submission cluster.

These limits are defined at the queue level in LSF. For jobs that are intended to be forwarded to a remote cluster, users must submit these jobs to queues that have the SNDJOBS_TO parameter configured in the lsb.queues file. To restrict these queues to specific users or user groups, define the FWD_USERS parameter in the lsb.queues file for these queues.
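For illustration, a send queue that only certain users may use for forwarding might be sketched as follows. The queue, cluster, user, and user group names are hypothetical.

```
# lsb.queues on the submission cluster (sketch; all names are hypothetical)
Begin Queue
QUEUE_NAME = send_q
# Forward jobs to queue recv_q in the remote cluster cluster2
SNDJOBS_TO = recv_q@cluster2
# Only user1 and members of ugroup1 can have jobs forwarded from this queue
FWD_USERS  = user1 ugroup1
End Queue
```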

Advance reservations now support the "same" section in resource requirement strings

When you use the brsvadd -R and brsvmod -R options to specify resource requirements for advance reservations, the "same" string now takes effect in addition to the "select" string. Previous versions of LSF only allowed the "select" string to take effect.

This addition allows you to select hosts with the same resources for your advance reservation.
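As a sketch, a reservation that combines both sections might be created like this. The user name, slot count, memory threshold, and time window are illustrative assumptions.

```
# Reserve 4 slots for user1 from 10:00 to 12:00 on hosts that have more than
# 4000 MB of memory (select) and that all share the same host model (same).
brsvadd -n 4 -R "select[mem>4000] same[model]" -u user1 -b 10:00 -e 12:00
```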

Priority factors for absolute priority scheduling

You can now set additional priority factors for LSF to calculate the job priority for absolute priority scheduling (APS). These additional priority factors allow you to modify the priority for the application profile, submission user, or user group, which are all used as factors in the APS calculation. You can also view the APS and fair share user priority values for pending jobs.

To set the priority factor for an application profile, define the PRIORITY parameter in the lsb.applications file. To set the priority factor for a user or user group, define the PRIORITY parameter in the User or UserGroup section of the lsb.users file.
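The fragments below sketch where these parameters might go. All names and values are hypothetical, and the column layout shown for the lsb.users sections is an assumption to check against the lsb.users reference for your LSF version.

```
# lsb.applications (sketch; profile name and value are hypothetical)
Begin Application
NAME     = app1
PRIORITY = 50
End Application

# lsb.users (sketch; column layout assumed)
Begin User
USER_NAME    PRIORITY
user1        100
End User

Begin UserGroup
GROUP_NAME   GROUP_MEMBER     PRIORITY
ugroup1      (user1 user2)    200
End UserGroup
```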

The new bjobs -prio option displays the APS and fair share user priority values for all pending jobs. In addition, the busers and bugroup commands display the APS priority factor for the specified users or user groups.

Job dispatch limits for users, user groups, and queues

You can now limit the maximum number of jobs that are dispatched in a single scheduling cycle for users, user groups, and queues. When the number of dispatched jobs reaches this limit, any other pending jobs that belong to that user, user group, or queue remain pending for the rest of the scheduling cycle, even if they could otherwise have been dispatched.

To set or update the job dispatch limit, run the bconf command on the limit object (that is, run bconf action_type limit=limit_name) to define the JOBS_PER_SCHED_CYCLE parameter for the specific limit. You can set job dispatch limits only if the limit consumer types are USERS, PER_USER, QUEUES, or PER_QUEUE.

For example, bconf update limit=L1 "JOBS_PER_SCHED_CYCLE=10"

You can also define the job dispatch limit by defining the JOBS_PER_SCHED_CYCLE parameter in the Limit section of the lsb.resources file.
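A static equivalent of the bconf example might be sketched as follows. The limit name L1 and the value 10 come from that example, while the user names are hypothetical.

```
# lsb.resources (sketch; user names are hypothetical)
Begin Limit
NAME                 = L1
USERS                = user1 user2
# Dispatch at most 10 jobs per scheduling cycle for this limit
JOBS_PER_SCHED_CYCLE = 10
End Limit
```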