Package guarantees
A package comprises some number of slots and some amount of memory all on a single host.
Administrators can configure a service class with a number of packages for jobs of a particular class. A package has all the slot and memory resources that a single job of that class needs to run. Each job running in a guarantee pool must occupy a whole multiple of packages.
Configuring guarantee package policies
Guarantee policies (pools) are configured in lsb.resources. For package guarantees, these policies specify:
- A set (pool) of hosts
- The resources in a package
- How many packages to reserve for each set of service classes
- Policies for loaning out reserved resources that are not immediately needed
Configuration is done the same way as for a slot or host guarantee policy, with a
GuaranteedResourcePool section in lsb.resources. The main
difference is that the TYPE parameter is used to express the package
resources. The following example is a guarantee package pool defined in
lsb.resources:
Begin GuaranteedResourcePool
NAME = example_pool
TYPE = package[slots=1:mem=1000]
HOSTS = hgroup1
RES_SELECT = mem > 16000
DISTRIBUTION = ([sc1, 25%] [sc2, 25%] [sc3, 30%])
End GuaranteedResourcePool
A package does not necessarily require both slots and memory. Setting TYPE=package[slots=1] gives essentially the same result as a slot pool. It may be useful to have only slots in a package (and not mem) in order to provide guarantees for parallel jobs that require multiple CPUs on a single host, where memory is not an important resource. It is likely not useful to configure guarantees of only memory without slots, although the feature supports this.
Each host can belong to at most one slot/host/package guarantee pool. At startup, mbatchd goes through the hosts one by one. For each host, mbatchd goes through the list of guarantee pools in configuration order and assigns the host to the first pool for which the host meets the RES_SELECT and HOSTS criteria.
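The first-match assignment described above can be sketched as follows. This is only an illustration, not the mbatchd implementation: the Host and Pool classes, the host names, and the second pool (fallback_pool) are hypothetical, and RES_SELECT is modeled as a plain Python predicate instead of an LSF resource requirement string.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Host:
    name: str
    group: str   # host group the host belongs to (hypothetical model)
    mem: int     # memory, in the same unit that RES_SELECT compares against


@dataclass
class Pool:
    name: str
    hosts_group: str                     # stands in for the HOSTS parameter
    res_select: Callable[[Host], bool]   # stands in for the RES_SELECT parameter


def assign_pool(host: Host, pools: List[Pool]) -> Optional[str]:
    """Return the first pool, in configuration order, whose criteria the host meets."""
    for pool in pools:
        if host.group == pool.hosts_group and pool.res_select(host):
            return pool.name
    return None  # the host belongs to no guarantee pool


# Pools listed in configuration order; fallback_pool is a made-up second pool.
pools = [
    Pool("example_pool", "hgroup1", lambda h: h.mem > 16000),
    Pool("fallback_pool", "hgroup1", lambda h: True),
]
hosts = [
    Host("hostA", "hgroup1", 32000),
    Host("hostB", "hgroup1", 8000),
    Host("hostC", "hgroup2", 64000),
]

assignment = {h.name: assign_pool(h, pools) for h in hosts}
print(assignment)
```

Because a host is assigned to the first matching pool only, the order of the pool definitions matters when the criteria of several pools overlap.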
Total packages of a pool
The total packages of a pool is intended to represent the number of packages that can be supplied by the pool if there are no jobs running in the pool. This total is used for:
- Display purposes – bresources displays the total for each pool, as well as showing the pool status as overcommitted when the number guaranteed in the pool exceeds the total.
- Determining the actual number of packages to reserve when guarantees are given as percentages instead of absolute numbers.
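Both uses of the total can be sketched as follows, taking the pool total as a given input. This is a rough illustration only: the function names are hypothetical, the pool total of 40 is made up, and rounding a percentage down to a whole number of packages is an assumption, since the rounding rule is not stated here.

```python
from typing import Dict, Tuple

# Each guarantee is (amount, is_percent), keyed by service class name.
Guarantees = Dict[str, Tuple[int, bool]]


def packages_to_reserve(total: int, guarantees: Guarantees) -> Dict[str, int]:
    """Turn percentage guarantees into package counts; absolute counts pass through."""
    reserved = {}
    for service_class, (amount, is_percent) in guarantees.items():
        # ASSUMPTION: percentages are rounded down to whole packages.
        reserved[service_class] = total * amount // 100 if is_percent else amount
    return reserved


def is_overcommitted(total: int, reserved: Dict[str, int]) -> bool:
    """A pool is overcommitted when the number guaranteed exceeds the total."""
    return sum(reserved.values()) > total


pool_total = 40  # hypothetical total packages of the pool

by_percent = packages_to_reserve(
    pool_total, {"sc1": (25, True), "sc2": (25, True), "sc3": (30, True)}
)
by_count = packages_to_reserve(pool_total, {"sc1": (30, False), "sc2": (20, False)})

print(by_percent, is_overcommitted(pool_total, by_percent))
print(by_count, is_overcommitted(pool_total, by_count))
```

Percentage guarantees that sum to 100% or less cannot overcommit a pool under this model, whereas absolute guarantees can, for example when hosts become unavailable and the total shrinks.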
LSF calculates the total packages of a pool by summing the total packages of each host in the pool. Hosts that are currently unavailable are not considered to be part of a pool. On each host in a pool, the total contributed by the host is the number of packages that fit into the MXJ and total memory of the host. For the purposes of computing the total packages of a host, mbschd takes the smaller of the number of packages that fit into:
- The total slots of the host (MXJ), and
- The maximum memory of the host; that is, maxmem as reported by the lshosts command.
The total packages on a host is the number of packages that can fit into the total slots and maxmem of the host. Because the calculation uses maxmem rather than the memory that is currently free, memory occupied by processes that do not belong to LSF jobs does not reduce the total packages for the host. As a result, the total can be optimistic: even if all jobs on the host are killed and their memory is released, LSF jobs might not be able to use memory all the way up to maxmem.
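The calculation of total packages can be sketched as follows. This is a minimal illustration under stated assumptions, not the mbschd code: the function name, the host names, and the MXJ and maxmem values are hypothetical, and memory is simply assumed to be in the same unit as the mem value of the package.

```python
def packages_that_fit(slots: int, mem: int, pkg_slots: int, pkg_mem: int) -> int:
    """Whole packages that fit into the given slots and memory.

    A package resource set to 0 is treated as "not part of the package"
    (for example, a slots-only package), so it does not limit the count.
    """
    limits = []
    if pkg_slots > 0:
        limits.append(slots // pkg_slots)
    if pkg_mem > 0:
        limits.append(mem // pkg_mem)
    return min(limits) if limits else 0


# Hypothetical available hosts in the pool: name -> (MXJ, maxmem).
pool_hosts = {"hostA": (16, 32000), "hostB": (8, 64000)}
pkg_slots, pkg_mem = 2, 8000  # hypothetical package: 2 slots and 8000 units of memory

host_totals = {
    name: packages_that_fit(mxj, maxmem, pkg_slots, pkg_mem)
    for name, (mxj, maxmem) in pool_hosts.items()
}
pool_total = sum(host_totals.values())

print(host_totals, pool_total)
```

In this example hostA is limited by memory and hostB is limited by slots, so each host contributes 4 packages even though their resources differ.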
Memory on a host can be used by processes outside of LSF jobs. Therefore, even when no jobs are running on a host, the number of free packages on the host can be less than the total packages of the host. The free packages are computed from the available slots and available memory.
Currently available packages in a pool
So that LSF knows how many packages to reserve during scheduling, LSF must track the number of available packages in each package pool. The number of packages available on a host in the pool is equal to the number of packages that fit into the free resources on the host. The available packages of a pool is simply this amount summed over all hosts in the pool.
For example, suppose there are 5 slots and 5 GB of memory free on the host. Each package contains 2 slots and 2 GB of memory. Therefore, 2 packages are currently available on the host.
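The arithmetic in this example can be checked with a short sketch. The function name is hypothetical, and the second and third hosts in the pool are made-up values added only to show the summation over a pool.

```python
def available_packages(free_slots: int, free_mem_gb: int,
                       pkg_slots: int, pkg_mem_gb: int) -> int:
    """Whole packages that fit into the free slots and free memory of one host."""
    return min(free_slots // pkg_slots, free_mem_gb // pkg_mem_gb)


# The host from the example: 5 free slots, 5 GB free, package = 2 slots + 2 GB.
host_available = available_packages(5, 5, 2, 2)

# Hypothetical pool: (free slots, free memory in GB) for each host.
free_resources = [(5, 5), (4, 10), (1, 8)]
pool_available = sum(available_packages(s, m, 2, 2) for s, m in free_resources)

print(host_available, pool_available)
```

The leftover slot and 1 GB of memory on the example host do not form a package, and the third hypothetical host contributes nothing because a single free slot cannot hold a 2-slot package, however much memory is free.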
Hosts that are currently unavailable are temporarily excluded from the pool, and any SLA jobs running on those hosts are not counted towards the guarantee.