Global job IDs allow an LSF multicluster environment to use the same job IDs in the forwarding
and forwarded clusters, keeping the IDs uniform. To guarantee that these global job IDs are unique,
starting in Fix Pack 14, LSF introduces indexes for clusters: each job submitted from a cluster has
that cluster's index appended as the ending digits of its job ID (for example, job ID 100 with an
index value of 22 has a global job ID of 10022).
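The arithmetic in this example can be sketched in shell. This is an illustration only, not how LSF itself computes IDs: the function name `global_job_id` is hypothetical, and the sketch assumes that the index always occupies exactly the last two digits of the ID, which the example implies but the text does not state outright.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF): compose a global job ID from a
# local job ID and a cluster index.
# Assumption: the index fills the last two digits of the global job ID.
global_job_id() {
    job_id=$1
    index=$2
    # Multiply by 100 to make room for two index digits, then add the index.
    echo $(( job_id * 100 + index ))
}

global_job_id 100 22   # prints 10022
```

Pass the values without leading zeros, because shell arithmetic treats a leading zero as an octal prefix.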
About this task
To configure global job IDs for your LSF multicluster environment, add an index column to the
lsf.shared configuration file. Then, as a best practice, increase both the MAX_JOBID value in the
lsb.params file (to 99999999) and the LSB_JOBID_DISP_LENGTH value in the lsf.conf file (to 8).
Procedure
- Log on to any host in the cluster as the LSF administrator.
- Edit the lsf.shared configuration file:
- Add a new column called Index to the Cluster section and assign an index to each cluster.
The index can be a number from 1 to 99 and must be unique among the clusters within the same
cluster group.
For example:
Begin Cluster
ClusterName   Servers         Index
cluster1      (hostA hostB)   1
cluster2      (hostD)         2
End Cluster
Tip: Typically, the lsf.shared file is centrally located so that
any changes can be accessed by all clusters; however, if you do not have a shared file system,
ensure that you update the lsf.shared file on each cluster.
With the Index column set, the cluster that reads the configuration generates
global job IDs ending with the index. Also, when jobs are forwarded to the cluster, the cluster
tries to match the job ID with its configured index. If the index does not match the job ID, the
cluster rejects the job. If the index matches the job ID, the cluster accepts the job with the
same job ID that the forwarding cluster assigned. If the job ID has already been used, the job ID
is changed to end in 00.
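The matching rule can be sketched in shell as follows. This is a simplified illustration rather than LSF's actual logic: the function name `check_forwarded_job` is hypothetical, the sketch assumes the index is the last two digits of the job ID, and it does not model the case where an already-used job ID is changed to end in 00.

```shell
#!/bin/sh
# Hypothetical sketch (not LSF code) of the index check on a forwarded job.
# Assumption: the cluster index is the last two digits of the job ID.
check_forwarded_job() {
    job_id=$1
    cluster_index=$2
    # Extract the last two digits of the forwarded job ID.
    suffix=$(( job_id % 100 ))
    if [ "$suffix" -eq "$cluster_index" ]; then
        echo accept
    else
        echo reject
    fi
}

check_forwarded_job 10022 22   # prints accept
check_forwarded_job 10022 2    # prints reject
```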
Starting in Fix Pack 15, to support a hybrid of global job IDs and non-global job IDs, you can
specify a hyphen (-) in the Index column instead of a number from 1 to 99. In this
case, also ensure that the MC_STRICT_JOBID_CHECKING parameter is set to
N in the lsb.params file. By default,
MC_STRICT_JOBID_CHECKING is set to N, so forwarded jobs that do not match the cluster
index set in the lsf.shared file are allowed to run on the
execution host. Typically this parameter is left as N; however, you can set it
to Y to enable the previous behavior, in which forwarded jobs that do not match the index are
rejected.
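As an illustration of the hybrid setup, the following fragments show one cluster with a numeric index and one with a hyphen. The cluster and host names are reused from the earlier example, the choice of which cluster gets the hyphen is arbitrary, and any other parameters in the Parameters section are omitted.

```
# lsf.shared
Begin Cluster
ClusterName   Servers         Index
cluster1      (hostA hostB)   1
cluster2      (hostD)         -
End Cluster

# lsb.params
Begin Parameters
MC_STRICT_JOBID_CHECKING = N
End Parameters
```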
- Save the changes to the lsf.shared file.
- Run lsadmin reconfig to reconfigure LIM.
- Run badmin reconfig to reconfigure the mbatchd daemon.
- To maximize the number of available global job IDs that LSF can assign, as a best practice,
increase both the MAX_JOBID value in the lsb.params file (to 99999999) and the
LSB_JOBID_DISP_LENGTH value in the lsf.conf file (to 8).
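A minimal sketch of the two settings, using the values recommended in this step. Only the relevant lines are shown; existing parameters in either file would stay in place.

```
# lsb.params
Begin Parameters
MAX_JOBID = 99999999
End Parameters

# lsf.conf
LSB_JOBID_DISP_LENGTH=8
```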
- To view the new index column, run the lsclusters
cluster_name command to see all configuration information for your cluster
(including the new column).
Here is example output from running
lsclusters cluster1:
CLUSTER_NAME   STATUS   MASTER_HOST   ADMIN    HOSTS   SERVERS   INDEX
cluster1       ok       hostA         jsmith       1         1       1