lsb.hosts

The lsb.hosts file contains host-related configuration information for the server hosts in the cluster. It is also used to define host groups, host partitions, and compute units.

This file is optional. All sections are optional.

By default, this file is installed in LSB_CONFDIR/cluster_name/configdir.

Changing lsb.hosts configuration

After making any changes to lsb.hosts, run badmin reconfig to reconfigure mbatchd.

#INCLUDE

Syntax

#INCLUDE "path-to-file"

Description

Inserts a configuration setting from another file to the current location. Use this directive to dedicate control of a portion of the configuration to other users or user groups by providing write access for the included file to specific users or user groups, and to ensure consistency of configuration file settings in different clusters (if you are using the LSF multicluster capability).

For more information, see Shared configuration file content.

#INCLUDE can be inserted anywhere in the local configuration file.
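
For example, a minimal sketch (the file path and the contents of the included file are hypothetical) that pulls externally maintained host group definitions into the local lsb.hosts:

# Hypothetical example: include host group definitions maintained by another team
#INCLUDE "/shared/lsf/conf/lsb.hosts.hostgroups"

Begin Host
HOST_NAME   MXJ    # Keywords
default     !
End Host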

Default

Not defined.

Host section

Description

Optional. Defines the hosts, host types, and host models used as server hosts, and contains per-host configuration information. If this section is not configured, LSF uses all hosts in the cluster (the hosts listed in lsf.cluster.cluster_name) as server hosts.

Each host, host model or host type can be configured to do the following:
  • Limit the maximum number of jobs run in total
  • Limit the maximum number of jobs run by each user
  • Run jobs only under specific load conditions
  • Run jobs only under specific time windows

The entries in a line for a host override the entries in a line for its model or type.

When you modify the cluster by adding or removing hosts, no changes are made to lsb.hosts. This does not affect the default configuration, but if hosts, host models, or host types are specified in this file, you should check this file whenever you make changes to the cluster and update it manually if necessary.

Host section structure

The first line consists of keywords identifying the load indices that you wish to configure on a per-host basis. The keyword HOST_NAME must be used; the others are optional. Load indices not listed on the keyword line do not affect scheduling decisions.

Each subsequent line describes the configuration information for one host, host model or host type. Each line must contain one entry for each keyword. Use empty parentheses ( ) or a dash (-) to specify the default value for an entry.
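
For example, a minimal sketch of a Host section (host names and values are illustrative) that uses a dash and empty parentheses as default entries:

Begin Host
HOST_NAME   MXJ   JL/U   r1m       DISPATCH_WINDOW   # Keywords
hostA       2     -      0.8/1.8   ()
default     !     -      ()        ()
End Host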

HOST_NAME

Required. Specify the name, model, or type of a host, or the keyword default.

Pattern definition

You can use string literals and special characters when defining host names. Each entry cannot contain any spaces, as the list itself is space delimited.

You can use the following special characters to specify hosts:
  • Use square brackets with a hyphen ([integer1-integer2]) or a colon ([integer1:integer2]) to define a range of non-negative integers at the end of a host name. The first integer must be less than the second integer.
  • Use square brackets with commas ([integer1, integer2 ...]) to define individual non-negative integers anywhere in the host name.
  • Use square brackets with commas and hyphens or colons (for example, [integer1-integer2, integer3, integer4:integer5, integer6:integer7]) to define different ranges of non-negative integers anywhere in the host name.
  • Use multiple sets of square brackets (with the supported special characters) to define multiple sets of non-negative integers anywhere in the host name. For example, hostA[1,3]B[1-3] includes hostA1B1, hostA1B2, hostA1B3, hostA3B1, hostA3B2, and hostA3B3.
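
As a minimal sketch (host names and values are illustrative), a single entry that uses one of these patterns to configure twenty hosts at once:

HOST_NAME     MXJ   r1m      # Keywords
hostA[1-20]   !     0.8/1.8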

host name

The name of a host defined in lsf.cluster.cluster_name.

host model

A host model defined in lsf.shared.

host type

A host type defined in lsf.shared.

default

The reserved host name default indicates all hosts in the cluster not otherwise referenced in the section (by name or by listing its model or type).

CHKPNT

Description

If this column is set to C, checkpoint copy is enabled. With checkpoint copy, the operating system automatically copies all open files to the checkpoint directory when a process is checkpointed.

Example

HOST_NAME  CHKPNT
hostA      C

Compatibility

Checkpoint copy is only supported on Cray systems.

Default

No checkpoint copy

DISPATCH_WINDOW

Description

The time windows in which jobs can be dispatched to this host, host model, or host type. Once dispatched, jobs are no longer affected by the dispatch window.
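
For example, a minimal sketch (host name and window are illustrative) that dispatches new jobs to hostA only between 18:00 and 08:00 each day:

HOST_NAME   MXJ   DISPATCH_WINDOW   # Keywords
hostA       !     (18:00-8:00)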

Default

Not defined (always open)

EXIT_RATE

Description

Specifies a threshold for exited jobs. Specify a number of jobs. If the number of jobs that exit over a period of time specified by JOB_EXIT_RATE_DURATION in lsb.params (five minutes by default) exceeds the number of jobs you specify as the threshold in this parameter, LSF invokes LSF_SERVERDIR/eadmin to trigger a host exception.

EXIT_RATE for a specific host overrides a default GLOBAL_EXIT_RATE specified in lsb.params.

Example

The following Host section defines a job exit rate of 20 jobs for all hosts, and an exit rate of 10 jobs on hostA.

Begin Host 
HOST_NAME    MXJ      EXIT_RATE  # Keywords 
Default      !        20 
hostA        !        10 
End Host

Default

Not defined

JL/U

Description

Per-user job slot limit for the host. Maximum number of job slots that each user can use on this host.

Example

HOST_NAME  JL/U
hostA         2

Default

Unlimited

MIG

Syntax

MIG=minutes

Description

Enables automatic job migration and specifies the migration threshold, in minutes, for checkpointable or rerunnable jobs.

LSF automatically migrates jobs that have been in the SSUSP state for more than the specified number of minutes. Specify a value of 0 to migrate jobs immediately upon suspension. The migration threshold applies to all jobs running on the host.

A job-level command-line migration threshold overrides the threshold configured in the application profile and the queue, and the application profile configuration overrides the queue-level configuration. When a host migration threshold is specified and is lower than the value for the job, the queue, or the application, the host value is used.

Does not affect multicluster jobs that are forwarded to a remote cluster.
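
For example, a minimal sketch (host name and value are illustrative) that migrates suspended checkpointable or rerunnable jobs on hostA after they have spent 30 minutes in the SSUSP state:

HOST_NAME   MXJ   MIG   # Keywords
hostA       !     30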

Default

Not defined. LSF does not migrate checkpointable or rerunnable jobs automatically.

MXJ

Description

The number of job slots on the host.

With the multicluster resource leasing model, this is the number of job slots on the host that are available to the local cluster.

Use ! to make the number of job slots equal to the number of CPUs on a host.

For the reserved host name default, ! makes the number of job slots equal to the number of CPUs on all hosts in the cluster not otherwise referenced in the section.

By default, the number of running and suspended jobs on a host cannot exceed the number of job slots. If preemptive scheduling is used, the suspended jobs are not counted as using a job slot.

On multiprocessor hosts, to fully use the CPU resource, make the number of job slots equal to or greater than the number of processors.
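
For example, a minimal sketch (host name and values are illustrative) that matches job slots to CPUs on most hosts but caps hostA at 8 slots:

HOST_NAME   MXJ   # Keywords
default     !
hostA       8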

Default

Unlimited

load_index

Syntax

load_index loadSched[/loadStop]

Specify io, it, ls, mem, pg, r15s, r1m, r15m, swp, tmp, ut, or a non-shared (host based) dynamic custom external load index as a column. Specify multiple columns to configure thresholds for multiple load indices.

Description

Scheduling and suspending thresholds for dynamic load indices supported by LIM, including external load indices.

Each load index column must contain either the default entry or two numbers separated by a slash (/), with no white space. The first number is the scheduling threshold for the load index; the second number is the suspending threshold.

Queue-level scheduling and suspending thresholds are defined in lsb.queues. If both files specify thresholds for an index, the most restrictive thresholds apply.

Example

HOST_NAME    mem     swp
hostA        100/10  200/30
This example translates into a loadSched condition of
mem>=100 && swp>=200 
and a loadStop condition of
mem < 10 || swp < 30

Default

Not defined

AFFINITY

Syntax

AFFINITY=Y | y | N | n | cpu_list

Description

Specifies whether the host can be used to run affinity jobs and, if so, which CPUs are eligible. Specify Y or y to enable affinity scheduling on all CPUs of the host, N or n to disable affinity scheduling, or a CPU list (which can include ranges) to enable affinity scheduling on specific CPUs only.

Examples

The following configuration enables affinity scheduling and tells LSF to use all CPUs on hostA for affinity jobs:
HOST_NAME MXJ r1m AFFINITY
hostA      !  ()   (Y)
The following configuration specifies a CPU list for affinity scheduling:
HOST_NAME MXJ r1m  AFFINITY
hostA      !  ()   (CPU_LIST="1,3,5,7-10")

This configuration enables affinity scheduling on hostA and tells LSF to use only CPUs 1, 3, 5, and 7-10 to run affinity jobs.

The following configuration disables affinity scheduling:
HOST_NAME MXJ r1m AFFINITY
hostA      !  ()   (N)

Default

Not defined. Affinity scheduling is not enabled.

Example of a Host section

Begin Host 
HOST_NAME   MXJ   JL/U   r1m       pg      DISPATCH_WINDOW 
hostA       1     -      0.6/1.6   10/20   (5:19:00-1:8:30 20:00-8:30)
Linux       1     -      0.5/2.5   -       23:00-8:00 
default     2     1      0.6/1.6   20/40   ()
End Host

Linux is a host type defined in lsf.shared. This example Host section configures one host and one host type explicitly and configures default values for all other load-sharing hosts.

HostA runs one batch job at a time. A job will only be started on hostA if the r1m index is below 0.6 and the pg index is below ten; the running job is stopped if the r1m index goes above 1.6 or the pg index goes above 20. HostA only accepts batch jobs from 19:00 on Friday evening until 8:30 Monday morning and overnight from 20:00 to 8:30 on all other days.

For hosts of type Linux, the pg index does not have host-specific thresholds and such hosts are only available overnight from 23:00 to 8:00.

The entry with host name default applies to each of the other hosts in the cluster. Each host can run up to two jobs at the same time, with at most one job from each user. These hosts are available to run jobs at all times. Jobs may be started if the r1m index is below 0.6 and the pg index is below 20.

HostGroup section

Description

Optional. Defines host groups.

The name of the host group can then be used in other host group, host partition, and queue definitions, as well as on the command line. Specifying the name of a host group has exactly the same effect as listing the names of all the hosts in the group.

Structure

Host groups are specified in the same format as user groups in lsb.users.

The first line consists of two mandatory keywords, GROUP_NAME and GROUP_MEMBER, as well as optional keywords, CONDENSE and GROUP_ADMIN. Subsequent lines name a group and list its membership.

The sum of all host groups, compute units, and host partitions cannot be more than 1024.

GROUP_NAME

Description

An alphanumeric string representing the name of the host group.

You cannot use the reserved name all, and group names must not conflict with host names.

CONDENSE

Description

Optional. Defines condensed host groups.

Condensed host groups are displayed in a condensed output format for the bhosts and bjobs commands.

If you configure a host to belong to more than one condensed host group, bjobs can display any of those host groups as the execution host name.

Valid values

Y or N.

Default

N (the specified host group is not condensed)

GROUP_MEMBER

Description

A space-delimited list of hostnames or previously defined host group names, enclosed in one pair of parentheses.

You cannot use more than one pair of parentheses to define the list.

The names of hosts and host groups can appear on multiple lines because hosts can belong to multiple groups. The reserved name all specifies all hosts in the cluster.

You can use string literals and special characters when defining host group members. Each entry cannot contain any spaces, as the list itself is space delimited.

When a leased-in host joins the cluster, the host name is in the form of host@cluster. For these hosts, only the host part of the host name is subject to pattern definitions.

Valid values

You can use the following special characters to specify host group members:
  • Use an exclamation mark (!) to indicate an externally-defined host group, which an egroup executable retrieves.
  • Use a tilde (~) to exclude specified hosts or host groups from the list.
  • Use an asterisk (*) as a wildcard character to represent any number of characters.
  • Use square brackets with a hyphen ([integer1-integer2]) or a colon ([integer1:integer2]) to define a range of non-negative integers anywhere in the host name. The first integer must be less than the second integer.
  • Use square brackets with commas ([integer1, integer2 ...]) to define individual non-negative integers anywhere in the host name.
  • Use square brackets with commas and hyphens or colons (for example, [integer1-integer2, integer3, integer4:integer5, integer6:integer7]) to define different ranges of non-negative integers anywhere in the host name.
  • Use multiple sets of square brackets (with the supported special characters) to define multiple sets of non-negative integers anywhere in the host name. For example, hostA[1,3]B[1-3] includes hostA1B1, hostA1B2, hostA1B3, hostA3B1, hostA3B2, and hostA3B3.

Restrictions

You cannot define subgroups that contain wildcard and special characters.

Honoring the preferred host for host group members

To specify preferred hosts and host groups (that is, to indicate your preference for dispatching a job to a certain host or host group), use a plus sign (+) and a positive number after the names of the hosts or host groups that you prefer. A higher number indicates a higher preference; for example, (hostA groupB+2) indicates that groupB is the most preferred and hostA is the least preferred. If a preference is not given, it is assumed to be zero. If there are multiple candidates, LSF schedules jobs to hosts in order of preference; hosts with the same level of preference are ordered by load.

A host group can include another host group (for example, groupA is a subgroup of groupB). LSF does not support multi-level host group preferences; for example, because groupA is a subgroup nested inside groupB, any preference set for groupA is ignored.

You cannot set the preference level if a host group member:
  • Is part of an externally-defined host group (egroup), indicated by an exclamation mark (!).
  • Is part of an excluded host or host group, indicated by a tilde (~).
  • Uses patterns:
    • Indicated by an asterisk (*) as a wildcard character to represent any number of characters.
    • Indicated by square brackets with a hyphen ([integer1-integer2]), with a colon ([integer1:integer2]), or with a comma ([integer1, integer2 ...]).
  • Is part of a lease-in host, such as ALLREMOTE.

GROUP_ADMIN

Description

Host group administrators have the ability to open or close the member hosts for the group they are administering.

The GROUP_ADMIN field is a space-delimited list of user names or previously defined user group names, enclosed in one pair of parentheses.

You cannot use more than one pair of parentheses to define the list.

The names of users and user groups can appear on multiple lines because users can belong to and administer multiple groups.

Host group administrator rights are inherited. For example, if the user admin2 is an administrator for host group groupA and host group groupB is a member of groupA, admin2 is also an administrator for host group groupB.

When host group administrators (who are not also cluster administrators) open or close a host, they must specify a comment with the -C option.

Valid values

Any existing user or user group can be specified. A user group that specifies an external list is also allowed; however, in this location, specify the user group name that was defined with an exclamation mark (!), rather than the exclamation mark itself.

Restrictions

  • You cannot specify any wildcard or special characters (for example: *, !, $, #, &, ~).
  • You cannot specify an external group (egroup).
  • You cannot use the keyword ALL and you cannot administer any group that has ALL as its members.
  • User names and user group names cannot have spaces.

Example HostGroup sections

Example 1

Begin HostGroup 
GROUP_NAME  GROUP_MEMBER GROUP_ADMIN
groupA      (hostA+2 hostD+1) (user1 user10)
groupB      (hostF groupA hostK) ()
groupC      (!) ()
End HostGroup
This example defines three host groups:
  • groupA includes hostA and hostD and can be administered by user1 and user10. Additionally, hostA is the more preferred host (preference level 2) and hostD is the less preferred host (preference level 1).
  • groupB includes hostF and hostK, along with all hosts in groupA. It has no administrators (only the cluster administrator can control the member hosts).
  • The group membership of groupC is defined externally and retrieved by the egroup executable.

Example 2

Begin HostGroup 
GROUP_NAME   GROUP_MEMBER GROUP_ADMIN
groupA       (all) ()
groupB       (groupA ~hostA ~hostB) (user11 user14)
groupC       (hostX hostY hostZ) ()
groupD       (groupC ~hostX) usergroupB
groupE       (all ~groupC ~hostB) ()
groupF       (hostF groupC hostK) ()
End HostGroup
This example defines the following host groups:
  • groupA contains all hosts in the cluster and is administered by the cluster administrator.
  • groupB contains all the hosts in the cluster except for hostA and hostB and is administered by user11 and user14.
  • groupC contains only hostX, hostY, and hostZ and is administered by the cluster administrator.
  • groupD contains the hosts in groupC except for hostX. Note that hostX must be a member of host group groupC to be excluded from groupD. usergroupB is the administrator for groupD.
  • groupE contains all hosts in the cluster excluding the hosts in groupC and hostB and is administered by the cluster administrator.
  • groupF contains hostF, hostK, and the 3 hosts in groupC and is administered by the cluster administrator.

Example 3

Begin HostGroup 
GROUP_NAME   CONDENSE   GROUP_MEMBER GROUP_ADMIN
groupA          N       (all) ()
groupB          N       (hostA hostB) (usergroupC user1)
groupC          Y       (all) ()
End HostGroup
This example defines the following host groups:
  • groupA shows noncondensed output and contains all hosts in the cluster and is administered by the cluster administrator.
  • groupB shows noncondensed output, and contains hostA and hostB. It is administered by all members of usergroupC and user1.
  • groupC shows condensed output and contains all hosts in the cluster and is administered by the cluster administrator.

Example 4

Begin HostGroup 
GROUP_NAME CONDENSE GROUP_MEMBER GROUP_ADMIN
groupA          Y (host*) (user7)
groupB          N (*A) ()
groupC          N (hostB* ~hostB[1-50]) ()
groupD          Y (hostC[1:50] hostC[101:150]) (usergroupJ)
groupE          N (hostC[51-100] hostC[151-200]) ()
groupF          Y (hostD[1,3] hostD[5-10]) ()
groupG          N (hostD[11-50] ~hostD[15,20,25] hostD2) ()
groupH          Y (hostX[1:10]Y[1:10]) ()
End HostGroup
This example defines the following host groups:
  • groupA shows condensed output, and contains all hosts starting with the string host. It is administered by user7.
  • groupB shows noncondensed output, and contains all hosts ending with the string A, such as hostA, and is administered by the cluster administrator.
  • groupC shows noncondensed output, and contains all hosts starting with the string hostB except for the hosts from hostB1 to hostB50 and is administered by the cluster administrator.
  • groupD shows condensed output, and contains all hosts from hostC1 to hostC50 and all hosts from hostC101 to hostC150 and is administered by the members of usergroupJ.
  • groupE shows noncondensed output, and contains all hosts from hostC51 to hostC100 and all hosts from hostC151 to hostC200 and is administered by the cluster administrator.
  • groupF shows condensed output, and contains hostD1, hostD3, and all hosts from hostD5 to hostD10 and is administered by the cluster administrator.
  • groupG shows noncondensed output, and contains all hosts from hostD11 to hostD50 except for hostD15, hostD20, and hostD25. groupG also includes hostD2. It is administered by the cluster administrator.
  • groupH shows condensed output, and contains every combination of the ranges in hostX[1:10]Y[1:10], that is, hostX1Y1 through hostX10Y10. It is administered by the cluster administrator.

HostPartition section

Description

Optional. Used for host partition user-based fair share scheduling. Defines a host partition, which applies a user-based fair share policy at the host level.

Configure multiple sections to define multiple partitions.

The members of a host partition form a host group with the same name as the host partition.

Restriction: You cannot use host partitions and host preference simultaneously.

Limitations on queue configuration

  • If you configure a host partition, you cannot configure fair share at the queue level.
  • If a queue uses a host that belongs to a host partition, it should not use any hosts that don’t belong to that partition. All the hosts in the queue should belong to the same partition. Otherwise, you might notice unpredictable scheduling behavior:
    • Jobs in the queue sometimes may be dispatched to the host partition even though hosts not belonging to any host partition have a lighter load.
    • If some hosts belong to one host partition and some hosts belong to another, only the priorities of one host partition are used when dispatching a parallel job to hosts from more than one host partition.

Shared resources and host partitions

  • If a resource is shared among hosts included in host partitions and hosts that are not included in any host partition, jobs in queues that use the host partitions will always get the shared resource first, regardless of queue priority.
  • If a resource is shared among host partitions, jobs in queues that use the host partitions listed first in the HostPartition section of lsb.hosts will always have priority to get the shared resource first. To allocate shared resources among host partitions, LSF considers host partitions in the order they are listed in lsb.hosts.

Structure

Each host partition always consists of 3 lines, defining the name of the partition, the hosts included in the partition, and the user share assignments.

HPART_NAME

Syntax

HPART_NAME=partition_name

Description

Specifies the name of the partition. The name must be 59 characters or less.

HOSTS

Syntax

HOSTS=[[~]host_name | [~]host_group | all]...

Description

Specifies the hosts in the partition, in a space-separated list.

A host cannot belong to multiple partitions.

A host group cannot be empty.

Hosts that are not included in any host partition are controlled by the FCFS scheduling policy instead of the fair share scheduling policy.

Optionally, use the reserved host name all to configure a single partition that applies to all hosts in a cluster.

Optionally, use the not operator (~) to exclude hosts or host groups from the list of hosts in the host partition.

Examples

HOSTS=all ~hostK ~hostM

The partition includes all the hosts in the cluster, except for hostK and hostM.
HOSTS=groupA ~hostL

The partition includes all the hosts in host group groupA except for hostL.

USER_SHARES

Syntax

USER_SHARES=[user, number_shares]...

Description

Specifies user share assignments.
  • Specify at least one user share assignment.
  • Enclose each user share assignment in square brackets, as shown.
  • Separate a list of multiple share assignments with a space between the square brackets.
  • user: Specify users who are also configured to use the host partition. You can assign the shares:
    • To a single user (specify user_name). To specify a Windows user account, include the domain name in uppercase letters (DOMAIN_NAME\user_name).
    • To users in a group, individually (specify group_name@) or collectively (specify group_name). To specify a Windows user group, include the domain name in uppercase letters (DOMAIN_NAME\group_name).
    • To users not included in any other share assignment, individually (specify the keyword default) or collectively (specify the keyword others).

By default, when resources are assigned collectively to a group, the group members compete for the resources according to FCFS scheduling. You can use hierarchical fair share to further divide the shares among the group members.

When resources are assigned to members of a group individually, the share assignment is recursive. Members of the group and of all subgroups always compete for the resources according to FCFS scheduling, regardless of hierarchical fair share policies.
  • number_shares
    • Specify a positive integer representing the number of shares of the cluster resources assigned to the user.
    • The number of shares assigned to each user is only meaningful when you compare it to the shares assigned to other users or to the total number of shares. The total number of shares is just the sum of all the shares assigned in each share assignment.

Example of a HostPartition section

Begin HostPartition
HPART_NAME = Partition1
HOSTS = hostA hostB
USER_SHARES = [groupA@, 3] [groupB, 7] [default, 1]
End HostPartition

ComputeUnit section

Description

Optional. Defines compute units.

Once defined, the compute unit can be used in other compute unit and queue definitions, as well as in the command line. Specifying the name of a compute unit has the same effect as listing the names of all the hosts in the compute unit.

Compute units are similar to host groups, with the added feature of granularity allowing the construction of structures that mimic the network architecture. Job scheduling using compute unit resource requirements effectively spreads jobs over the cluster based on the configured compute units.

To enforce consistency, compute unit configuration has the following requirements:

  • Hosts and host groups appear in the finest granularity compute unit type, and nowhere else.
  • Hosts appear in only one compute unit of the finest granularity.
  • All compute units of the same type have the same type of compute units (or hosts) as members.

Structure

Compute units are specified in the same format as host groups in lsb.hosts.

The first line consists of three mandatory keywords, NAME, MEMBER, and TYPE, as well as optional keywords CONDENSE and ADMIN. Subsequent lines name a compute unit and list its membership.

The sum of all host groups, compute units, and host partitions cannot be more than 1024.

NAME

Description

An alphanumeric string representing the name of the compute unit.

You cannot use the reserved names all, allremote, others, and default. Compute unit names must not conflict with host names, host partitions, or host group names.

CONDENSE

Description

Optional. Defines condensed compute units.

Condensed compute units are displayed in a condensed output format for the bhosts and bjobs commands. The condensed compute unit format includes the slot usage for each compute unit.

Valid values

Y or N.

Default

N (the specified compute unit is not condensed)

MEMBER

Description

A space-delimited list of host names or previously defined compute unit names, enclosed in one pair of parentheses.

You cannot use more than one pair of parentheses to define the list.

The names of hosts and host groups can appear only once, and only in a compute unit type of the finest granularity.

An exclamation mark (!) indicates an externally-defined host group, which the egroup executable retrieves.

Pattern definition

You can use string literals and special characters when defining compute unit members. Each entry cannot contain any spaces, as the list itself is space delimited.

You can use the following special characters to specify host and host group compute unit members:
  • Use a tilde (~) to exclude specified hosts or host groups from the list.
  • Use an asterisk (*) as a wildcard character to represent any number of characters.
  • Use square brackets with a hyphen ([integer1-integer2]) or a colon ([integer1:integer2]) to define a range of non-negative integers anywhere in the host name. The first integer must be less than the second integer.
  • Use square brackets with commas ([integer1, integer2...]) to define individual non-negative integers anywhere in the host name.
  • Use square brackets with commas and hyphens or colons (for example, [integer1-integer2, integer3, integer4:integer5, integer6:integer7]) to define different ranges of non-negative integers anywhere in the host name.
  • Use multiple sets of square brackets (with the supported special characters) to define multiple sets of non-negative integers anywhere in the host name. For example, hostA[1,3]B[1-3] includes hostA1B1, hostA1B2, hostA1B3, hostA3B1, hostA3B2, and hostA3B3.

Restrictions

  • Compute unit names cannot be used in compute units of the finest granularity.
  • You cannot include host or host group names except in compute units of the finest granularity.
  • You must not skip levels of granularity. For example:

    If lsb.params contains COMPUTE_UNIT_TYPES=enclosure rack cabinet then a compute unit of type cabinet can contain compute units of type rack, but not of type enclosure.

  • The keywords all, allremote, all@cluster, others, and default cannot be used when defining compute units.

TYPE

Description

The type of the compute unit, as defined in the COMPUTE_UNIT_TYPES parameter of lsb.params.

ADMIN

Description

Compute unit administrators have the ability to open or close the member hosts for the compute unit they are administering.

The ADMIN field is a space-delimited list of user names or previously defined user group names, enclosed in one pair of parentheses.

You cannot use more than one pair of parentheses to define the list.

The names of users and user groups can appear on multiple lines because users can belong to and administer multiple compute units.

Compute unit administrator rights are inherited. For example, if the user admin2 is an administrator for compute unit cu1 and compute unit cu2 is a member of cu1, admin2 is also an administrator for compute unit cu2.

When compute unit administrators (who are not also cluster administrators) open or close a host, they must specify a comment with the -C option.

Valid values

Any existing user or user group can be specified. A user group that specifies an external list is also allowed; however, in this location, specify the user group name that was defined with an exclamation mark (!), rather than the exclamation mark itself.

Restrictions

  • You cannot specify any wildcard or special characters (for example: *, !, $, #, &, ~).
  • You cannot specify an external group (egroup).
  • You cannot use the keyword ALL and you cannot administer any group that has ALL as its members.
  • User names and user group names cannot have spaces.

Example ComputeUnit sections

Example 1

(For the lsb.params entry COMPUTE_UNIT_TYPES=enclosure rack cabinet)
Begin ComputeUnit 
NAME   MEMBER        TYPE
encl1  (host1 host2) enclosure
encl2  (host3 host4) enclosure
encl3  (host5 host6) enclosure
encl4  (host7 host8) enclosure
rack1  (encl1 encl2) rack
rack2  (encl3 encl4) rack
cbnt1  (rack1 rack2) cabinet
End ComputeUnit
This example defines seven compute units:
  • encl1, encl2, encl3 and encl4 are the finest granularity, and each contain two hosts.
  • rack1 is of coarser granularity and contains two levels. At the enclosure level rack1 contains encl1 and encl2. At the lowest level rack1 contains host1, host2, host3, and host4.
  • rack2 has the same structure as rack1, and contains encl3 and encl4.
  • cbnt1 contains two racks (rack1 and rack2), four enclosures (encl1, encl2, encl3, and encl4) and all eight hosts. Compute unit cbnt1 is the coarsest granularity in this example.

Example 2

(For the lsb.params entry COMPUTE_UNIT_TYPES=enclosure rack cabinet)
Begin ComputeUnit 
NAME  CONDENSE MEMBER                   TYPE      ADMIN
encl1 Y        (hg123 ~hostA ~hostB)    enclosure (user11 user14)
encl2 Y        (hg456)                  enclosure ()
encl3 N        (hostA hostB)            enclosure usergroupB
encl4 N        (hgroupX ~hostB)         enclosure ()
encl5 Y        (hostC* ~hostC[101-150]) enclosure usergroupJ
encl6 N        (hostC[101-150])         enclosure ()
rack1 Y        (encl1 encl2 encl3)      rack      ()
rack2 N        (encl4 encl5)            rack      usergroupJ
rack3 N        (encl6)                  rack      ()
cbnt1 Y        (rack1 rack2)            cabinet   ()
cbnt2 N        (rack3)                  cabinet   user14
End ComputeUnit
This example defines 11 compute units:
  • All six enclosures (finest granularity) contain only hosts and host groups. All three racks contain only enclosures. Both cabinets (coarsest granularity) contain only racks.
  • encl1 contains all the hosts in host group hg123 except for hostA and hostB and is administered by user11 and user14. Note that hostA and hostB must be members of host group hg123 to be excluded from encl1. encl1 shows condensed output.
  • encl2 contains host group hg456 and is administered by the cluster administrator. encl2 shows condensed output.
  • encl3 contains hostA and hostB. usergroupB is the administrator for encl3. encl3 shows noncondensed output.
  • encl4 contains host group hgroupX except for hostB. Since each host can appear in only one enclosure and hostB is already in encl3, it cannot be in encl4. encl4 is administered by the cluster administrator. encl4 shows noncondensed output.
  • encl5 contains all hosts starting with the string hostC except for hosts hostC101 to hostC150, and is administered by usergroupJ. encl5 shows condensed output.
  • rack1 contains encl1, encl2, and encl3. rack1 shows condensed output.
  • rack2 contains encl4 and encl5. rack2 shows noncondensed output.
  • rack3 contains encl6. rack3 shows noncondensed output.
  • cbnt1 contains rack1 and rack2. cbnt1 shows condensed output.
  • cbnt2 contains rack3. Even though rack3 only contains encl6, cbnt2 cannot contain encl6 directly because that would mean skipping the level associated with compute unit type rack. cbnt2 shows noncondensed output.

Automatic time-based configuration

Variable configuration is used to automatically change LSF configuration based on time windows. You define automatic configuration changes in lsb.hosts by using if-else constructs and time expressions. After you change the files, reconfigure the cluster with the badmin reconfig command.

The expressions are evaluated by LSF every 10 minutes based on mbatchd start time. When an expression evaluates true, LSF dynamically changes the configuration based on the associated configuration statements. Reconfiguration is done in real time without restarting mbatchd, providing continuous system availability.

Example

In the following example, the #if, #else, #endif are not interpreted as comments by LSF but as if-else constructs.
Begin Host
HOST_NAME   r15s   r1m   pg
host1       3/5    3/5   12/20
#if time(5:16:30-1:8:30 EDT 20:00-8:30 EDT)
host2       3/5    3/5   12/20
#else
host2       2/3    2/3   10/12
#endif
host3       3/5    3/5   12/20
End Host

Specifying the time zone is optional. If you do not specify a time zone, LSF uses the local system time zone. LSF supports all standard time zone abbreviations.