-hostfile

Submits a job with a user-specified host file.

Categories

resource

Synopsis

bsub -hostfile file_path

Conflicting options

Do not use with the following options: -ext, -n, -m, -R res-req.
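
The conflict rule above can be checked before a job is submitted. The following is a minimal sketch, not part of LSF: the helper name build_bsub_command and its arguments are hypothetical, and it assumes the other bsub options are passed as a flat list of tokens. It only builds the argument list and does not run bsub.

```python
# Options that the text above says must not be combined with -hostfile.
CONFLICTING_OPTIONS = {"-ext", "-n", "-m", "-R"}


def build_bsub_command(host_file, job_command, other_options=()):
    """Return an argv list for 'bsub -hostfile', rejecting conflicting options.

    host_file     -- path to the user-specified host file
    job_command   -- list of tokens for the job itself (hypothetical example)
    other_options -- flat list of additional bsub option tokens
    """
    clashes = [opt for opt in other_options if opt in CONFLICTING_OPTIONS]
    if clashes:
        raise ValueError(
            "cannot combine -hostfile with: " + ", ".join(sorted(set(clashes)))
        )
    return ["bsub", "-hostfile", host_file, *other_options, *job_command]


# ./my_app is a placeholder job command used only for illustration.
print(build_bsub_command("./user1_host_file", ["./my_app"]))
```

The resulting list could be passed to a process launcher such as subprocess.run on a submission host. The check is deliberately simple and compares tokens literally.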

Description

When submitting a job, you can point the job to a file that specifies hosts and number of slots for job processing.

For example, some applications (typically when benchmarking) run best with a very specific geometry. For repeatability (again, typically when benchmarking), you may want the job to always run on the same hosts, using the same number of slots.

The user-specified host file specifies a host and number of slots to use per task, resulting in a rank file.

The -hostfile option allows a user to submit a job, specifying the path of the user-specified host file:

bsub -hostfile "spec_host_file"

Important:
  • Do not use a user-specified host file if you have enabled task geometry, because the two can conflict and jobs may fail.
  • If resources are not available when a task is ready, the job may not run smoothly. Consider using advance reservation instead of a user-specified host file to ensure that reserved slots are available.

Any user can create a user-specified host file. It must be accessible by the user from the submission host. It lists one host per line. The format is as follows:

# This is a user-specified host file
<host_name1>   [<# slots>]
<host_name2>   [<# slots>]
<host_name1>   [<# slots>]
<host_name2>   [<# slots>]
<host_name3>   [<# slots>]
<host_name4>   [<# slots>]
The following rules apply to the user-specified host file:
  • Insert comments starting with the # character.
  • Specifying the number of slots for a host is optional. If no slot number is indicated, the default is 1.
  • A host name can be either a host in a local cluster or a host leased-in from a remote cluster (host_name@cluster_name).
  • A user-specified host file should contain hosts from the same cluster only.
  • A host name can be entered with or without the domain name.
  • Host names may be used multiple times and the order entered represents the placement of tasks. For example:
    #first three tasks
    host01                      3
    #fourth task
    host02
    #next three tasks
    host03                      3
    

The resulting rank file is made available to other applications (such as MPI).
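
The format rules above can be sketched as a small parser. This is an illustration only, not LSF's own implementation: the function names parse_host_file and task_placement are hypothetical, and the sketch assumes that a comment occupies a whole line whose first non-blank character is #. It does not validate host names or cluster membership.

```python
def parse_host_file(text):
    """Return a list of (host, slots) pairs in the order they appear."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and whole-line comments (assumption: no inline comments).
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) > 2:
            raise ValueError(f"expected '<host> [slots]', got: {raw_line!r}")
        host = fields[0]
        # The slot count is optional and defaults to 1.
        slots = int(fields[1]) if len(fields) == 2 else 1
        if slots < 1:
            raise ValueError(f"slot count must be positive: {raw_line!r}")
        entries.append((host, slots))
    return entries


def task_placement(entries):
    """Expand (host, slots) pairs into one host name per task, in file order."""
    return [host for host, slots in entries for _ in range(slots)]


sample = """\
#first three tasks
host01                      3
#fourth task
host02
#next three tasks
host03                      3
"""
entries = parse_host_file(sample)
print(entries)                  # [('host01', 3), ('host02', 1), ('host03', 3)]
print(task_placement(entries))  # host01 three times, host02 once, host03 three times
```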

The LSB_DJOB_RANKFILE environment variable is generated from the user-specified host file. If a job is not submitted with a user-specified host file then LSB_DJOB_RANKFILE points to the same file as LSB_DJOB_HOSTFILE.
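
An application running inside the job could read the rank order from these variables. The following is a minimal sketch under two assumptions: that the rank file lists one host name per line, one line per task, and that falling back to LSB_DJOB_HOSTFILE is acceptable when LSB_DJOB_RANKFILE is not set. The function name read_rank_hosts is hypothetical.

```python
import os


def read_rank_hosts(environ=None):
    """Return the host name for each task rank, in rank order.

    environ -- mapping of environment variables; defaults to os.environ.
    """
    if environ is None:
        environ = os.environ
    # Prefer the rank file; fall back to the host file if it is not set.
    path = environ.get("LSB_DJOB_RANKFILE") or environ.get("LSB_DJOB_HOSTFILE")
    if not path:
        raise RuntimeError(
            "neither LSB_DJOB_RANKFILE nor LSB_DJOB_HOSTFILE is set"
        )
    with open(path) as handle:
        # One host name per non-blank line (assumed file layout).
        return [line.strip() for line in handle if line.strip()]
```

Called without arguments inside a job, read_rank_hosts() returns a list in which index i holds the host for task i. Passing a plain dictionary instead of os.environ makes the function easy to exercise outside a cluster.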

In an esub, use the LSB_SUB4_HOST_FILE parameter to read and modify the value of the -hostfile option.

The following is an example of a user-specified host file that includes duplicate host names:

user1: cat ./user1_host_file
# This is my user-specified host file for job242
host01   3
host02    
host03   3
host01    
host02   2

This user-specified host file tells LSF to allocate 10 slots in total (4 slots on host01, 3 slots on host02, and 3 slots on host03). The order of the lines determines the order of task placement.

For scheduling, entries with the same host name are combined and their slot counts are added together. The same combined totals are reported in LSB_MCPU_HOSTS, which represents the job allocation. LSB_DJOB_HOSTFILE, by contrast, groups the entries for each host together.

The result is the following:

LSB_DJOB_RANKFILE:
host01
host01
host01
host02
host03
host03
host03
host01
host02
host02

LSB_DJOB_HOSTFILE:
host01
host01
host01
host01
host02
host02
host02
host03
host03
host03

LSB_MCPU_HOSTS = host01 4 host02 3 host03 3
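
The three results above can be reproduced from the example host file with a short sketch. It is illustrative only: the function name summarize_allocation is hypothetical, and ordering hosts by first appearance is an assumption that happens to match this example, not a documented LSF guarantee.

```python
from collections import Counter


def summarize_allocation(entries):
    """Derive rank order, grouped host list, and an MCPU-style summary.

    entries -- list of (host, slots) pairs in host-file order
    """
    # Rank order: one host name per task, following the file line by line.
    rank_order = [host for host, slots in entries for _ in range(slots)]
    # Total slots per host; Counter keeps first-appearance order (Python 3.7+).
    totals = Counter(rank_order)
    # Grouped order: all slots of each host listed together.
    grouped = [host for host, count in totals.items() for _ in range(count)]
    # Summary string in "host count host count ..." form.
    mcpu_hosts = " ".join(f"{host} {count}" for host, count in totals.items())
    return rank_order, grouped, mcpu_hosts


# The entries from user1_host_file in the example above.
example = [("host01", 3), ("host02", 1), ("host03", 3), ("host01", 1), ("host02", 2)]
rank_order, grouped, mcpu_hosts = summarize_allocation(example)
print(len(rank_order))  # 10
print(mcpu_hosts)       # host01 4 host02 3 host03 3
```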