Set common ports
Participating clusters must use the same port numbers for the daemons LIM, RES, MBD, and SBD.
LSF_LIM_PORT=7869
LSF_RES_PORT=6878
LSB_MBD_PORT=6881
LSB_SBD_PORT=6882
- LSF_LIM_PORT change
- The default for LSF_LIM_PORT changed in LSF version 7.0 to accommodate the IBM® EGO default port configuration. On EGO, default ports start with lim at 7869, and are numbered consecutively for the EGO pem, vemkd, and egosc daemons.
This is different from previous LSF releases where the default LSF_LIM_PORT was 6879. LSF res, sbatchd, and mbatchd continue to use the default pre-version 7.0 ports 6878, 6881, and 6882.
Upgrade installation preserves existing port settings for lim, res, sbatchd, and mbatchd. EGO pem, vemkd, and egosc use default EGO ports starting at 7870, if they do not conflict with existing lim, res, sbatchd, and mbatchd ports.
- Troubleshooting
- To check your port numbers, check the LSF_TOP/conf/lsf.conf file in each cluster. (LSF_TOP is the LSF installation directory. On UNIX, this is defined in the install.config file.) Make sure you have identical settings in each cluster for the following parameters:
- LSF_LIM_PORT
- LSF_RES_PORT
- LSB_MBD_PORT
- LSB_SBD_PORT
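The check described above can be scripted. The following is a minimal sketch, not an LSF tool: the function names read_ports and port_mismatches are hypothetical, and the parser assumes that lsf.conf uses simple NAME=value lines with # starting a comment.

```python
import re

# The four daemon port parameters that must match across clusters.
PORT_PARAMS = ("LSF_LIM_PORT", "LSF_RES_PORT", "LSB_MBD_PORT", "LSB_SBD_PORT")


def read_ports(conf_text):
    """Return {parameter: port} for the daemon ports found in lsf.conf text.

    Assumption: settings are NAME=value lines and '#' starts a comment.
    """
    ports = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        match = re.match(r'^(\w+)\s*=\s*"?(\d+)"?$', line)
        if match and match.group(1) in PORT_PARAMS:
            ports[match.group(1)] = int(match.group(2))
    return ports


def port_mismatches(ports_by_cluster):
    """Return {parameter: {cluster: port or None}} for parameters that differ.

    A parameter missing from one cluster's file is reported as None.
    """
    mismatches = {}
    for param in PORT_PARAMS:
        values = {name: ports.get(param) for name, ports in ports_by_cluster.items()}
        if len(set(values.values())) > 1:
            mismatches[param] = values
    return mismatches
```

To use it, read each cluster's LSF_TOP/conf/lsf.conf into a string, pass the text to read_ports, and give port_mismatches a dictionary keyed by cluster name. An empty result means the four parameters are identical in every cluster.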
Set common resource definitions
For resource sharing to work between clusters, the clusters should have common definitions of host types, host models, and resources. Each cluster finds this information in lsf.shared, so the best way to configure multicluster is to make sure lsf.shared is identical for each cluster. If you do not have a shared file system, replicate lsf.shared across all clusters.
- Local cluster information overrides remote cluster information (host type, host model, or resource attributes and order of specification in configuration files).
- The local cluster ignores remote cluster configuration if the remote host type, host model, or resource does not exist in the local cluster.
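If lsf.shared is replicated rather than shared, the copies can drift apart. One way to detect this is to compare a digest of each copy. The following is a minimal sketch under stated assumptions: the function names normalized_digest and differing_clusters are hypothetical, and the normalization assumes that # starts a comment and that spacing between columns is not significant.

```python
import hashlib


def normalized_digest(shared_text):
    """Return a SHA-256 digest of lsf.shared text, ignoring cosmetic differences.

    Assumption: '#' starts a comment, and blank lines and column spacing
    do not change the meaning of the file.
    """
    kept = []
    for raw_line in shared_text.splitlines():
        tokens = raw_line.split("#", 1)[0].split()
        if tokens:
            kept.append(" ".join(tokens))
    return hashlib.sha256("\n".join(kept).encode("utf-8")).hexdigest()


def differing_clusters(shared_by_cluster, reference):
    """Return the sorted names of clusters whose copy differs from the reference."""
    expected = normalized_digest(shared_by_cluster[reference])
    return sorted(
        name
        for name, text in shared_by_cluster.items()
        if normalized_digest(text) != expected
    )
```

Pass differing_clusters a dictionary that maps each cluster name to the text of its lsf.shared copy, plus the name of the cluster whose copy is the reference. An empty list means every copy matches the reference after normalization.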
Define participating clusters and valid management hosts
Define all participating clusters in the Cluster section of the lsf.shared file.
- For ClusterName, specify the name of each participating cluster. On UNIX, each cluster name is defined by LSF_CLUSTER_NAME in the install.config file.
- For Servers, specify the management host and optionally candidate management hosts for the cluster. A cluster will not participate in multicluster resource sharing unless its current management host is listed here.
Example
Begin Cluster
ClusterName Servers
Cluster1 (hostA hostB)
Cluster2 (hostD)
End Cluster
In this example, hostA should be the management host of Cluster1 with hostB as the backup, and hostD should be the management host of Cluster2. If the management host fails in Cluster1, multicluster resource sharing still works because the backup management host is also listed here. However, if the management host fails in Cluster2, no other host is recognized as the management host, so Cluster2 no longer participates in multicluster resource sharing.
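The rule in this example can be checked mechanically. The following is a minimal sketch, not an LSF utility: the function names parse_cluster_section and can_participate are hypothetical, and the parser handles only rows where the Servers column is written in parentheses, as in the example above.

```python
import re


def parse_cluster_section(shared_text):
    """Return {cluster_name: [hosts]} from the Begin Cluster ... End Cluster block.

    Assumption: '#' starts a comment and each cluster row has the form
    'ClusterName (host1 host2 ...)'. The header row has no parentheses,
    so it is skipped.
    """
    clusters = {}
    in_section = False
    for raw_line in shared_text.splitlines():
        line = " ".join(raw_line.split("#", 1)[0].split())
        if not line:
            continue
        if line.lower() == "begin cluster":
            in_section = True
        elif line.lower() == "end cluster":
            in_section = False
        elif in_section:
            match = re.match(r"^(\S+)\s*\(([^)]*)\)", line)
            if match:
                clusters[match.group(1)] = match.group(2).split()
    return clusters


def can_participate(clusters, cluster_name, current_management_host):
    """True if the cluster's current management host is listed under Servers."""
    return current_management_host in clusters.get(cluster_name, [])
```

With the Cluster section shown above, can_participate returns True for Cluster1 whether hostA or hostB is the current management host. For Cluster2 it returns True only for hostD, so a failover to any other host takes Cluster2 out of resource sharing.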