Understand fair share scheduling

By default, LSF considers jobs for dispatch in the same order as they appear in the queue (which is not necessarily the order in which they are submitted to the queue). This is called first-come, first-served (FCFS) scheduling.

Fair share scheduling divides the processing power of the LSF cluster among users and queues to provide fair access to resources, so that no user or queue can monopolize the resources of the cluster and no queue will be starved.

If your cluster has many users competing for limited resources, the FCFS policy might not be enough. For example, one user could submit many long jobs at once and monopolize the cluster’s resources for a long time, while other users submit urgent jobs that must wait in queues until all of the first user’s jobs are done. To prevent this, use fair share scheduling to control how resources are shared among competing users.

Fair sharing is not necessarily equal sharing: you can assign a higher priority to the most important users. If there are two users competing for resources, you can:
  • Give all the resources to the most important user
  • Share the resources so the most important user gets the most resources
  • Share the resources so that all users have equal importance

Queue-level vs. host partition fair share

You can configure fair share at either the queue level or the host level (host partition fair share). These two types of fair share scheduling are mutually exclusive: you cannot configure queue-level fair share and host partition fair share in the same cluster.

If you want a user’s priority in one queue to depend on their activity in another queue, you must use cross-queue fair share or host-level fair share.

Fair share policies

A fair share policy defines the order in which LSF attempts to place jobs that are in a queue or a host partition. You can have multiple fair share policies in a cluster, one for each queue or host partition. You can also configure some queues or host partitions with fair share scheduling and leave the rest on FCFS scheduling.

How fair share scheduling works

Each fair share policy assigns a fixed number of shares to each user or group. These shares represent a fraction of the resources that are available in the cluster. The most important users or groups are the ones with the most shares. Users who have no shares cannot run jobs in the queue or host partition.
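
To make the relationship between shares and resource fractions concrete, the following minimal Python sketch converts share counts into nominal fractions. It is an illustration only and is not part of LSF: the function name share_fractions and the user names are hypothetical, and the sketch assumes that a user's nominal fraction is simply that user's shares divided by the total shares in the policy.

```python
from fractions import Fraction


def share_fractions(shares):
    """Return each user's nominal fraction of resources.

    shares maps a user name to a non-negative integer share count.
    Assumption: fraction = user's shares / total shares in the policy.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one user must hold a positive number of shares")
    return {user: Fraction(count, total) for user, count in shares.items()}


# Hypothetical share assignments for two competing users.
scenarios = {
    "all resources to user_a": {"user_a": 1, "user_b": 0},
    "user_a favored": {"user_a": 3, "user_b": 1},
    "equal importance": {"user_a": 1, "user_b": 1},
}

for label, shares in scenarios.items():
    fractions = share_fractions(shares)
    summary = ", ".join(f"{user}={value}" for user, value in fractions.items())
    print(f"{label}: {summary}")
```

Only the ratio between share counts matters in this sketch: assigning 3 and 1 shares gives the same split as assigning 30 and 10.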

A user’s dynamic priority depends on their share assignment, the dynamic priority formula, and the resources their jobs have already consumed.

The order of jobs in the queue is secondary; what matters most is the dynamic priority of the user who submitted the job. When fair share scheduling is used, LSF tries to place the first job in the queue that belongs to the user with the highest dynamic priority.
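
The selection rule described here can be sketched in a few lines of Python. This is a simplified model, not LSF code: the names dynamic_priority and next_job are hypothetical, and the priority formula (shares divided by one plus consumed resources) is a stand-in chosen only so that more shares raise priority and more usage lowers it. The formula LSF actually uses is configurable and is not reproduced here.

```python
def dynamic_priority(shares, usage):
    """Illustrative stand-in for a dynamic priority formula (an assumption).

    More shares raise the priority; more consumed resources lower it.
    """
    return shares / (1.0 + usage)


def next_job(queue, shares, usage):
    """Pick the job to place next under the simplified fair share rule.

    queue  -- list of (job_id, user) tuples in queue order
    shares -- dict mapping user to assigned shares
    usage  -- dict mapping user to resources already consumed
    """
    # Users without shares cannot run jobs in this queue.
    eligible = [(job_id, user) for job_id, user in queue if shares.get(user, 0) > 0]
    if not eligible:
        return None

    # Unique users in queue order, so ties go to the user who appears first.
    users = list(dict.fromkeys(user for _, user in eligible))
    best_user = max(
        users,
        key=lambda u: dynamic_priority(shares[u], usage.get(u, 0.0)),
    )

    # Place the first queued job that belongs to the highest-priority user.
    for job_id, user in eligible:
        if user == best_user:
            return job_id, user


# Hypothetical queue: user_a submitted first but has already consumed resources.
queue = [(101, "user_a"), (102, "user_a"), (103, "user_b")]
shares = {"user_a": 1, "user_b": 1}
usage = {"user_a": 50.0, "user_b": 0.0}

print(next_job(queue, shares, usage))
```

In this example, job 103 is selected ahead of jobs 101 and 102 even though it is last in the queue, because user_b has the higher priority under the stand-in formula.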