View host information

About this task

LSF uses some or all of the hosts in a cluster as execution hosts. The host list is configured by the LSF administrator.

Procedure

  • Use the bhosts command to view host information.
  • Use the lsload command to view host load information.

    To view...                                        Run...
    All hosts in the cluster and their status         bhosts
    Condensed host groups in an uncondensed format    bhosts -X
    Detailed server host information                  bhosts -l and lshosts -l
    Host load by host                                 lsload
    Host architecture information                     lshosts
    Host history                                      badmin hhist
    Host model and type information                   lsinfo
    Job exit rate and load for hosts                  bhosts -l and bhosts -x
    Dynamic host information                          lshosts


View all hosts in the cluster and their status

Procedure

Run bhosts to display information about all hosts and their status.

bhosts displays condensed information for hosts that belong to condensed host groups. When displaying members of a condensed host group, bhosts lists the host group name instead of the name of the individual host. For example, in a cluster with a condensed host group (groupA), an uncondensed host group (groupB containing hostC and hostE), and a host that is not in any host group (hostF), bhosts displays the following:

bhosts
HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV
groupA             ok              5      8     4       2      0      1      1
hostC              ok              -      3     0       0      0      0      0
hostE              ok              2      4     2       1      0      0      1
hostF              ok              -      2     2       1      0      1      0

Define condensed host groups in the HostGroups section of lsb.hosts.
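
The default bhosts listing is whitespace-separated, so it is easy to post-process. The following is a minimal Python sketch, not part of LSF, that parses the sample output above and estimates free job slots per entry. The names parse_bhosts and free_slots are hypothetical, and the sketch assumes that free slots can be approximated as MAX minus NJOBS.

```python
# Sample text copied from the bhosts example above; no LSF command is run.
SAMPLE = """\
HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV
groupA             ok              5      8     4       2      0      1      1
hostC              ok              -      3     0       0      0      0      0
hostE              ok              2      4     2       1      0      0      1
hostF              ok              -      2     2       1      0      1      0
"""


def parse_bhosts(text):
    """Turn whitespace-separated bhosts output into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def free_slots(row):
    """Assumption: free slots = MAX - NJOBS; None when MAX is '-'."""
    if row["MAX"] == "-":
        return None
    return int(row["MAX"]) - int(row["NJOBS"])


rows = parse_bhosts(SAMPLE)
available = {r["HOST_NAME"]: free_slots(r) for r in rows if r["STATUS"] == "ok"}
print(available)
```

Note that the groupA entry is a condensed host group, so its figure covers the whole group rather than a single host.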

View uncondensed host information

Procedure

Run bhosts -X to display all hosts in an uncondensed format, including those belonging to condensed host groups:
bhosts -X
HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV
hostA              ok              2      2      0      0      0      0      0
hostD              ok              2      4      2      1      0      0      1
hostB              ok              1      2      2      1      0      1      0
hostC              ok              -      3      0      0      0      0      0
hostE              ok              2      4      2      1      0      0      1
hostF              ok              -      2      2      1      0      1      0
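
Comparing the two sample listings suggests that the condensed groupA line aggregates its member hosts. The Python sketch below checks this against the sample text only. It assumes that groupA consists of hostA, hostD, and hostB (the hosts that appear only in the uncondensed listing) and that the condensed counters are simple sums; neither assumption is stated explicitly in the examples.

```python
# Sample text copied from the bhosts and bhosts -X examples; no LSF command is run.
HEADER = "HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV".split()
CONDENSED_GROUPA = "groupA ok 5 8 4 2 0 1 1"
UNCONDENSED = """\
hostA              ok              2      2      0      0      0      0      0
hostD              ok              2      4      2      1      0      0      1
hostB              ok              1      2      2      1      0      1      0
hostC              ok              -      3      0      0      0      0      0
hostE              ok              2      4      2      1      0      0      1
hostF              ok              -      2      2      1      0      1      0
"""
GROUPA_MEMBERS = {"hostA", "hostD", "hostB"}  # assumption, see lead-in
COUNTERS = HEADER[2:]  # every column after HOST_NAME and STATUS


def sum_counters(text, members):
    """Add up the numeric columns for the given hosts, skipping '-' values."""
    totals = dict.fromkeys(COUNTERS, 0)
    for line in text.splitlines():
        row = dict(zip(HEADER, line.split()))
        if row.get("HOST_NAME") in members:
            for name in COUNTERS:
                if row[name] != "-":
                    totals[name] += int(row[name])
    return totals


totals = sum_counters(UNCONDENSED, GROUPA_MEMBERS)
condensed = dict(zip(HEADER, CONDENSED_GROUPA.split()))
matches = all(totals[name] == int(condensed[name]) for name in COUNTERS)
print(totals, matches)
```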

View detailed server host information

Procedure

Run bhosts -l host_name and lshosts -l host_name to display all information about a server host, such as the CPU factor and the load thresholds for starting, suspending, and resuming jobs:
bhosts -l hostB
HOST  hostB
STATUS   CPUF   JL/U   MAX   NJOBS   RUN   SSUSP   USUSP  RSV  DISPATCH_WINDOWS
ok       20.20   -      -     0       0      0      0      0    -
CURRENT LOAD USED FOR SCHEDULING:
         r15s  r1m   r15m   ut   pg   io   ls   it   tmp   swp   mem  slots
Total    0.1   0.1    0.1   9%  0.7  24   17    0    394M  396M  12M      8
Reserved 0.0   0.0    0.0   0%  0.0   0    0    0      0M    0M   0M      8
LOAD THRESHOLD USED FOR SCHEDULING:
            r15s  r1m   r15m  ut   pg   io  ls   it   tmp  swp  mem
loadSched   -     -     -     -    -     -   -    -     -    -    -
loadStop    -     -     -     -    -     -   -    -     -    -    -
 
                cpuspeed    bandwidth
 loadSched          -            -
 loadStop           -            -
lshosts -l hostB
HOST_NAME:  hostB
type model cpuf ncpus ndisks maxmem maxswp maxtmp rexpri server nprocs ncores nthreads
LINUX86 PC6000 116.1       2      1  2016M  1983M 72917M      0    Yes   1    2        2
 
RESOURCES: Not defined
RUN_WINDOWS:  (always open)
 
LOAD_THRESHOLDS:
  r15s   r1m   r15m   ut   pg   io   ls   it   tmp   swp   mem
     -   1.0      -    -    -    -    -    -     -     -    4M
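
In the LOAD_THRESHOLDS section of the lshosts -l output, a dash means that no threshold is configured for that load index. As an illustration, the following Python sketch extracts only the configured thresholds from the sample text above. The function name configured_thresholds is hypothetical, and the sketch assumes the section always consists of one line of index names followed by one line of values.

```python
# Sample text copied from the lshosts -l example above; no LSF command is run.
SAMPLE = """\
LOAD_THRESHOLDS:
  r15s   r1m   r15m   ut   pg   io   ls   it   tmp   swp   mem
     -   1.0      -    -    -    -    -    -     -     -    4M
"""


def configured_thresholds(text):
    """Return {load_index: value} for every threshold that is not '-'."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "LOAD_THRESHOLDS:":
            names = lines[i + 1].split()
            values = lines[i + 2].split()
            return {n: v for n, v in zip(names, values) if v != "-"}
    return {}


thresholds = configured_thresholds(SAMPLE)
print(thresholds)
```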

View host load by host

About this task

The lsload command reports the current status and load levels of hosts in a cluster. The lshosts -l command shows the load thresholds.

Procedure

Run lsload to see load levels for each host:
lsload
HOST_NAME status r15s r1m  r15m ut  pg  ls it tmp swp  mem
hostD     ok     1.3  1.2  0.9  92% 0.0 2  20 5M  148M 88M
hostB     -ok    0.1  0.3  0.7  0%  0.0 1  67 45M 25M  34M
hostA     busy   8.0  *7.0 4.9  84% 4.6 6  17 1M  81M  27M

The first line lists the load index names, and each following line gives the load levels for one host.
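
In the sample output, hostA is busy and its r1m value carries an asterisk, while hostB has the status -ok. In lsload output, an asterisk marks a load index that exceeds its threshold, and -ok indicates that LIM is running but RES is unreachable. The Python sketch below, which is only an illustration with a hypothetical parse_lsload helper, collects the status and the flagged indices for each host from the sample text.

```python
# Sample text copied from the lsload example above; no LSF command is run.
SAMPLE = """\
HOST_NAME status r15s r1m  r15m ut  pg  ls it tmp swp  mem
hostD     ok     1.3  1.2  0.9  92% 0.0 2  20 5M  148M 88M
hostB     -ok    0.1  0.3  0.7  0%  0.0 1  67 45M 25M  34M
hostA     busy   8.0  *7.0 4.9  84% 4.6 6  17 1M  81M  27M
"""


def parse_lsload(text):
    """Map each host to its status and the load indices marked with '*'."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    hosts = {}
    for line in lines[1:]:
        fields = dict(zip(header, line.split()))
        name = fields.pop("HOST_NAME")
        status = fields.pop("status")
        flagged = [index for index, value in fields.items() if value.startswith("*")]
        hosts[name] = {"status": status, "flagged": flagged}
    return hosts


hosts = parse_lsload(SAMPLE)
for name, info in hosts.items():
    print(name, info["status"], info["flagged"])
```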

View host architecture (type and model) information

About this task

The lshosts command displays configuration information about hosts. All these parameters are defined by the LSF administrator in the LSF configuration files, or determined by the LIM directly from the system.

Host types represent binary compatible hosts; all hosts of the same type can run the same executable. Host models give the relative CPU performance of different processors.

Procedure

Run lshosts to see configuration information about hosts:
lshosts
HOST_NAME   type    model cpuf ncpus maxmem maxswp server  RESOURCES
hostD     SUNSOL SunSparc  6.0     1    64M   112M    Yes  (solaris cserver)
hostM       RS6K   IBM350  7.0     1    64M   124M    Yes  (cserver aix)
hostC       RS6K     R10K 14.0    16  1024M  1896M    Yes  (cserver aix)
hostA       HPPA    HP715  6.0     1    98M   200M    Yes  (hpux fserver)

In the preceding example, the host type SUNSOL represents Sun SPARC systems running Solaris. The lshosts command also displays the resources available on each host.

type

The host CPU architecture. Hosts that can run the same binary programs should have the same type.

An UNKNOWN type or model indicates that the host is down, or LIM on the host is down.

When automatic detection of host type or model fails (the host type configured in lsf.shared cannot be found), the type or model is set to DEFAULT. LSF still works on the host, but a DEFAULT model might be inefficient because of incorrect CPU factors. A DEFAULT type can also cause binary incompatibility, because a job from a DEFAULT host type can be migrated to another DEFAULT host type.
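
Because hosts of the same type can run the same executables, it can be useful to group the lshosts listing by type and to flag hosts that report UNKNOWN or DEFAULT. The following Python sketch does this for the sample output above. The helper parse_lshosts is hypothetical, and the sketch assumes that RESOURCES is always the last column, since it is the only one that may contain spaces.

```python
from collections import defaultdict

# Sample text copied from the lshosts example above; no LSF command is run.
SAMPLE = """\
HOST_NAME   type    model cpuf ncpus maxmem maxswp server  RESOURCES
hostD     SUNSOL SunSparc  6.0     1    64M   112M    Yes  (solaris cserver)
hostM       RS6K   IBM350  7.0     1    64M   124M    Yes  (cserver aix)
hostC       RS6K     R10K 14.0    16  1024M  1896M    Yes  (cserver aix)
hostA       HPPA    HP715  6.0     1    98M   200M    Yes  (hpux fserver)
"""


def parse_lshosts(text):
    """Parse lshosts output; the last column keeps its embedded spaces."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split(None, len(header) - 1))) for ln in lines[1:]]


by_type = defaultdict(list)
suspect = []  # hosts whose type or model is UNKNOWN or DEFAULT
for row in parse_lshosts(SAMPLE):
    by_type[row["type"]].append(row["HOST_NAME"])
    if {row["type"], row["model"]} & {"UNKNOWN", "DEFAULT"}:
        suspect.append(row["HOST_NAME"])

print(dict(by_type), suspect)
```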

View host history

Procedure

Run badmin hhist to view the history of a host, such as when it was opened or closed:
badmin hhist hostB
Wed Nov 20 14:41:58: Host <hostB> closed by administrator <lsf>.
Wed Nov 20 15:23:39: Host <hostB> opened by administrator <lsf>.
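
History lines of the form shown above can be split into fields with a regular expression. The Python sketch below is an illustration based only on the two sample lines; other event types that badmin hhist may print are not covered, and the pattern is an assumption rather than a documented format. The timestamp is kept as text because the sample lines carry no year.

```python
import re

# Sample text copied from the badmin hhist example above; no LSF command is run.
SAMPLE = """\
Wed Nov 20 14:41:58: Host <hostB> closed by administrator <lsf>.
Wed Nov 20 15:23:39: Host <hostB> opened by administrator <lsf>.
"""

# Assumed line layout: "<timestamp>: Host <name> <action> by administrator <user>."
EVENT = re.compile(
    r"^(?P<when>.+?): Host <(?P<host>[^>]+)> (?P<action>\w+) "
    r"by administrator <(?P<admin>[^>]+)>\.$"
)


def parse_history(text):
    """Return (when, host, action, admin) tuples for lines that match EVENT."""
    events = []
    for line in text.splitlines():
        match = EVENT.match(line.strip())
        if match:
            events.append(match.group("when", "host", "action", "admin"))
    return events


events = parse_history(SAMPLE)
last_action = events[-1][2] if events else None
print(events, last_action)
```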

View host model and type information

Procedure

  1. Run lsinfo -m to display information about host models that exist in the cluster:
    lsinfo -m
    MODEL_NAME      CPU_FACTOR      ARCHITECTURE
    PC1133               23.10      x6_1189_PentiumIIICoppermine
    HP9K735               4.50      HP9000735_125
    HP9K778               5.50      HP9000778
    Ultra5S              10.30      SUNWUltra510_270_sparcv9
    Ultra2               20.20      SUNWUltra2_300_sparc
    Enterprise3000       20.00      SUNWUltraEnterprise_167_sparc
    
  2. Run lsinfo -M to display all host models that are defined in lsf.shared:
    lsinfo -M
    MODEL_NAME      CPU_FACTOR      ARCHITECTURE
    UNKNOWN_AUTO_DETECT      1.00      UNKNOWN_AUTO_DETECT
    DEFAULT               1.00      
    LINUX133              2.50      x586_53_Pentium75
    PC200                 4.50      i86pc_200
    Intel_IA64           12.00      ia64
    Ultra5S              10.30      SUNWUltra5_270_sparcv9
    PowerPC_G4           12.00      x7400G4
    HP300                 1.00      
    SunSparc             12.00 
    
  3. Run lim -t to display the type, model, and matched type of the current host. You must be the LSF administrator to use this command:
    lim -t
    Host Type             : NTX64
    Host Architecture     : EM64T_1596
    Total NUMA Nodes      : 1
    Total Processors      : 2
    Total Cores           : 4
    Total Threads         : 2
    Matched Type          : NTX64
    Matched Architecture  : EM64T_3000
    Matched Model         : Intel_EM64T
    CPU Factor            : 60.0
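
The lsinfo -M listing in step 2 can be turned into a lookup table of CPU factors. The Python sketch below parses the sample text and notes which models have no architecture string. The helper parse_models is hypothetical, and the sketch assumes that model names and architecture strings never contain spaces.

```python
# Sample text copied from the lsinfo -M example above; no LSF command is run.
SAMPLE = """\
MODEL_NAME      CPU_FACTOR      ARCHITECTURE
UNKNOWN_AUTO_DETECT      1.00      UNKNOWN_AUTO_DETECT
DEFAULT               1.00
LINUX133              2.50      x586_53_Pentium75
PC200                 4.50      i86pc_200
Intel_IA64           12.00      ia64
Ultra5S              10.30      SUNWUltra5_270_sparcv9
PowerPC_G4           12.00      x7400G4
HP300                 1.00
SunSparc             12.00
"""


def parse_models(text):
    """Map model name to (cpu_factor, architecture or None)."""
    models = {}
    for line in text.splitlines()[1:]:  # skip the header line
        parts = line.split()
        if len(parts) < 2:
            continue
        arch = parts[2] if len(parts) > 2 else None
        models[parts[0]] = (float(parts[1]), arch)
    return models


models = parse_models(SAMPLE)
no_arch = [name for name, (_, arch) in models.items() if arch is None]
print(models["PC200"], no_arch)
```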
    

View job exit rate and load for hosts

Procedure

  1. Run bhosts -l to display the exception threshold for job exit rate and the current load value for hosts.

    In the following example, EXIT_RATE for hostA is configured as four jobs per minute. hostA does not currently exceed this rate.

    bhosts -l hostA
    HOST  hostA
    STATUS           CPUF  JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV DISPATCH_WINDOW
    ok              18.60     -      1      0      0      0      0      0      -
     
    CURRENT LOAD USED FOR SCHEDULING:
                 r15s   r1m  r15m    ut    pg    io   ls    it   tmp   swp   mem   slots
    Total         0.0   0.0   0.0    0%   0.0     0    1     2  646M  648M  115M       8
    Reserved      0.0   0.0   0.0    0%   0.0     0    0     0    0M    0M    0M       8

                share_rsrc host_rsrc
    Total              3.0       2.0
    Reserved           0.0       0.0

    LOAD THRESHOLD USED FOR SCHEDULING:
              r15s   r1m  r15m   ut      pg    io   ls    it    tmp    swp    mem
    loadSched   -     -     -     -       -     -    -     -     -      -      -
    loadStop    -     -     -     -       -     -    -     -     -      -      -

                   cpuspeed    bandwidth
    loadSched          -            -
    loadStop           -            -

    THRESHOLD AND LOAD USED FOR EXCEPTIONS:
               JOB_EXIT_RATE
    Threshold    4.00
    Load         0.00
    
  2. Run bhosts -x to see hosts whose job exit rate has exceeded the threshold for longer than JOB_EXIT_RATE_DURATION and is still above it. By default, these hosts are closed the next time LSF checks host exceptions and invokes eadmin.

    If no hosts exceed the job exit rate, bhosts -x displays:

    There is no exceptional host found
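
The comparison in step 1 can also be made programmatically. The following Python sketch reads the THRESHOLD AND LOAD USED FOR EXCEPTIONS section of the bhosts -l sample and compares the two values. The function name exit_rate_status is hypothetical. The check is instantaneous only; it does not model JOB_EXIT_RATE_DURATION, so it is not a substitute for bhosts -x.

```python
# Sample text copied from the bhosts -l hostA example above; no LSF command is run.
SAMPLE = """\
THRESHOLD AND LOAD USED FOR EXCEPTIONS:
           JOB_EXIT_RATE
Threshold    4.00
Load         0.00
"""


def exit_rate_status(text):
    """Return (threshold, load) from the exceptions section of bhosts -l."""
    lines = [ln.strip() for ln in text.splitlines()]
    start = lines.index("THRESHOLD AND LOAD USED FOR EXCEPTIONS:")
    values = {}
    for line in lines[start + 1:]:
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("Threshold", "Load"):
            values[parts[0]] = float(parts[1])
    return values["Threshold"], values["Load"]


threshold, load = exit_rate_status(SAMPLE)
exceeded = load > threshold  # instantaneous comparison only
print(threshold, load, exceeded)
```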
    

View dynamic host information

Procedure

Use lshosts to display information about dynamically added hosts.

An LSF cluster may consist of static and dynamic hosts. The lshosts command displays configuration information about hosts. All these parameters are defined by the LSF administrator in the LSF configuration files, or determined by the LIM directly from the system.

Host types represent binary compatible hosts; all hosts of the same type can run the same executable. Host models give the relative CPU performance of different processors. The server column shows the role of the host in the cluster: “Yes” is displayed for LSF servers, “No” is displayed for LSF clients, and “Dyn” is displayed for dynamic hosts.

For example:

lshosts
HOST_NAME   type    model cpuf ncpus maxmem maxswp server  RESOURCES
hostA      SOL64 Ultra60F 23.5     1    64M   112M    Yes  ()
hostB    LINUX86 Opteron8 60.0     1    94M   168M    Dyn  ()

In the preceding example, hostA is a static host while hostB is a dynamic host.
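
The server column can be mapped to a host role when post-processing the listing. The Python sketch below applies the three values described above to the sample output. The ROLE table and its labels are illustrative names chosen for this sketch, and the parsing assumes that RESOURCES is the last column.

```python
# Sample text copied from the lshosts example above; no LSF command is run.
SAMPLE = """\
HOST_NAME   type    model cpuf ncpus maxmem maxswp server  RESOURCES
hostA      SOL64 Ultra60F 23.5     1    64M   112M    Yes  ()
hostB    LINUX86 Opteron8 60.0     1    94M   168M    Dyn  ()
"""

# Labels are illustrative; the keys are the values shown in the server column.
ROLE = {"Yes": "server", "No": "client", "Dyn": "dynamic host"}


def host_roles(text):
    """Map each host name to a role label based on the server column."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    roles = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split(None, len(header) - 1)))
        roles[row["HOST_NAME"]] = ROLE.get(row["server"], "unrecognized")
    return roles


roles = host_roles(SAMPLE)
print(roles)
```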