Dynamic user priority
LSF calculates a dynamic user priority for individual users or for a group, depending on how the shares are assigned. The priority is dynamic because it changes as soon as any variable in the formula changes. By default, a user’s dynamic priority gradually decreases after a job starts, and increases immediately when the job finishes.
How LSF calculates dynamic priority
The dynamic user priority calculation takes the following into account:
- The number of shares assigned to the user
- The resources used by jobs belonging to the user:
- Number of job slots reserved and in use
- Run time of running jobs
- Cumulative actual CPU time (not normalized), adjusted so that recently used CPU time is weighted more heavily than CPU time used in the distant past
- Decayed run time of running jobs
- Historical run time of finished jobs
- Committed run time, specified at job submission with the -W option of bsub, or in the queue with the RUNLIMIT parameter in lsb.queues
- Memory usage adjustment made by the fair share plug-in (libfairshareadjust.*).
How LSF measures fair share resource usage
- For user-based fair share:
- For queue-level fair share, LSF measures the resource consumption of all the user’s jobs in the queue. This means a user’s dynamic priority can be different in every queue.
- For host partition fair share, LSF measures resource consumption for all the user’s jobs that run on hosts in the host partition. This means a user’s dynamic priority is the same in every queue that uses hosts in the same partition.
- For queue-based fair share, LSF measures the resource consumption of all jobs in each queue.
Default dynamic priority formula
By default, LSF calculates dynamic priority according to the following formula:
dynamic priority = number_shares / (cpu_time * CPU_TIME_FACTOR + run_time * RUN_TIME_FACTOR + (1 + job_slots) * RUN_JOB_FACTOR + (1 + fwd_job_slots) * FWD_JOB_FACTOR + fairshare_adjustment * FAIRSHARE_ADJUSTMENT_FACTOR + (historical_gpu_run_time + gpu_run_time) * ngpus_physical * GPU_RUN_TIME_FACTOR)
For cpu_time, run_time, and job_slots, LSF uses the total resource consumption of all the jobs in the queue or host partition that belong to the user or group.
number_shares
The number of shares assigned to the user.
cpu_time
The cumulative CPU time used by the user (measured in hours). LSF calculates the cumulative CPU time using the actual (not normalized) CPU time and a decay factor such that 1 hour of recently-used CPU time decays to 0.1 hours after an interval of time specified by HIST_HOURS in lsb.params (5 hours by default).
run_time
The total run time of running jobs (measured in hours).
job_slots
The number of job slots reserved and in use.
fairshare_adjustment
The adjustment calculated by the fair share adjustment plug-in (libfairshareadjust.*).
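The default formula can be sketched in Python as follows. The function and argument names here are illustrative, not part of LSF; the keyword defaults mirror the documented lsb.params defaults (for example, CPU_TIME_FACTOR and RUN_TIME_FACTOR of 0.7, RUN_JOB_FACTOR of 3).

```python
# Illustrative sketch of LSF's default dynamic priority formula.
# Function and variable names are hypothetical; only the arithmetic
# follows the documented formula.

def dynamic_priority(number_shares,
                     cpu_time=0.0,            # decayed cumulative CPU time (hours)
                     run_time=0.0,            # total run time of running jobs (hours)
                     job_slots=0,             # job slots reserved and in use
                     fwd_job_slots=0,         # forwarded slots (multicluster)
                     fairshare_adjustment=0.0,
                     gpu_run_time=0.0,
                     historical_gpu_run_time=0.0,
                     ngpus_physical=0,
                     CPU_TIME_FACTOR=0.7,
                     RUN_TIME_FACTOR=0.7,
                     RUN_JOB_FACTOR=3.0,
                     FWD_JOB_FACTOR=0.0,      # "not defined" treated as 0 here
                     FAIRSHARE_ADJUSTMENT_FACTOR=0.0,
                     GPU_RUN_TIME_FACTOR=0.0):
    # The denominator grows with resource usage, so priority shrinks
    # as a user's jobs consume more resources.
    usage = (cpu_time * CPU_TIME_FACTOR
             + run_time * RUN_TIME_FACTOR
             + (1 + job_slots) * RUN_JOB_FACTOR
             + (1 + fwd_job_slots) * FWD_JOB_FACTOR
             + fairshare_adjustment * FAIRSHARE_ADJUSTMENT_FACTOR
             + (historical_gpu_run_time + gpu_run_time)
               * ngpus_physical * GPU_RUN_TIME_FACTOR)
    return number_shares / usage

# A user with 9 shares and no usage at all: the (1 + job_slots) term
# keeps the denominator nonzero, so usage = 3 and priority = 9 / 3.
print(dynamic_priority(number_shares=9))
# → 3.0
```

Note that the `(1 + job_slots)` term guarantees a nonzero denominator even for a user with no running jobs, so every user always has a finite priority.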
Configure the default dynamic priority
You can give additional weight to the various factors in the priority calculation by setting the following parameters for the queue in lsb.queues or for the cluster in lsb.params. When the queue value is not defined, the cluster-wide value from lsb.params is used.
- CPU_TIME_FACTOR
- RUN_TIME_FACTOR
- RUN_JOB_FACTOR
- FWD_JOB_FACTOR
- FAIRSHARE_ADJUSTMENT_FACTOR
- HIST_HOURS
- GPU_RUN_TIME_FACTOR
If you modify any of these parameters, the change affects every fair share policy in the cluster. The parameters are defined as follows:
- CPU_TIME_FACTOR
- The CPU time weighting factor.
Default: 0.7
- FWD_JOB_FACTOR
- The forwarded job slots weighting factor, used with the LSF multicluster capability.
Default: Not defined
- RUN_TIME_FACTOR
- The run time weighting factor.
Default: 0.7
- RUN_JOB_FACTOR
- The job slots weighting factor.
Default: 3
- FAIRSHARE_ADJUSTMENT_FACTOR
- The fair share adjustment plug-in (libfairshareadjust.*) weighting factor.
Default: 0
- HIST_HOURS
- The interval, in hours, for collecting resource consumption history.
Default: 5
- GPU_RUN_TIME_FACTOR
- The GPU run time weighting factor.
Default: 0
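For example, to weight run time more heavily than CPU time cluster-wide, you might set values such as the following in the Parameters section of lsb.params (the specific values are illustrative, not recommendations):

```
Begin Parameters
CPU_TIME_FACTOR = 0.5
RUN_TIME_FACTOR = 1.0
HIST_HOURS = 10
End Parameters
```

Setting the same parameters in an individual queue in lsb.queues overrides these cluster-wide values for that queue only.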
Customize the dynamic priority
In some cases, the dynamic priority equation may require adjustments beyond the run time, CPU time, and job slot dependencies provided by default. The fair share adjustment plug-in is open source and can be customized after you identify specific requirements for dynamic priority.
All information used by the default priority equation (except the user shares) is passed to the fair share plug-in. In addition, the fair share plug-in is provided with current memory use over the entire cluster and the average memory that is allocated to a slot in the cluster.
Example
Jobs assigned to a single slot on a host can consume so much host memory that the other slots on the host are left unusable. The default dynamic priority calculation considers the job slots used, but does not account for unused job slots that are effectively blocked by another job. An adjustment such as the following penalizes users whose jobs consume a disproportionate amount of memory per slot:
fair share adjustment = (1 + slots) * ((used_memory / used_slots) / (slot_memory * THRESHOLD))
used_slots
The number of job slots in use by started jobs.
used_memory
The total memory in use by started jobs.
slot_memory
The average amount of memory that exists per slot in the cluster.
THRESHOLD
The memory threshold set in the fair share adjustment plug-in.
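The arithmetic of this example adjustment can be sketched as follows. A real implementation would live in the libfairshareadjust.* plug-in; this Python version, with hypothetical function and argument names, only illustrates the calculation.

```python
# Illustrative sketch of the memory-based fair share adjustment example.
# Names are hypothetical; the arithmetic follows the formula above.

def fairshare_adjustment(slots, used_slots, used_memory,
                         slot_memory, threshold):
    """Grows as a user's per-slot memory use exceeds the cluster's
    average slot memory times the threshold, lowering the dynamic
    priority of memory-heavy users."""
    if used_slots == 0:
        # No started jobs, so no memory-based penalty.
        return 0.0
    per_slot_use = used_memory / used_slots
    return (1 + slots) * (per_slot_use / (slot_memory * threshold))

# Example: 4 slots in use consuming 32 GB total (8 GB per slot),
# cluster average of 4 GB per slot, threshold of 2.0. Per-slot use
# exactly matches the 8 GB allowance, so the ratio is 1.0.
print(fairshare_adjustment(slots=4, used_slots=4, used_memory=32.0,
                           slot_memory=4.0, threshold=2.0))
# → 5.0
```

The adjustment is then multiplied by FAIRSHARE_ADJUSTMENT_FACTOR in the dynamic priority formula, so it has no effect unless that factor is set to a nonzero value.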