Description

By default, displays information about your own pending, running, and suspended jobs.

bjobs displays output for condensed host groups and compute units. These host groups and compute units are defined by CONDENSE in the HostGroup or ComputeUnit section of lsb.hosts. These groups are displayed as a single entry with the name as defined by GROUP_NAME or NAME in lsb.hosts. The -l and -X options display noncondensed output.

If you defined the parameter LSB_SHORT_HOSTLIST=1 in the lsf.conf file, parallel jobs running in the same condensed host group or compute unit are displayed as an abbreviated list.

For resizable jobs, bjobs displays the automatically resizable attribute and the resize notification command.

To display older historical information, use bhist.

Output: Default display

Pending jobs are displayed in the order in which they are considered for dispatch. Jobs in higher priority queues are displayed before those in lower priority queues. Pending jobs in queues of the same priority are displayed in the order in which they were submitted, but you can change this order with the btop or bbot commands. If more than one job is dispatched to a host, the jobs on that host are listed in the order in which they are considered for scheduling on that host, according to their queue priorities and dispatch times. Finished jobs are displayed in the order in which they were completed.
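
The default ordering of pending jobs can be pictured as a two-key sort. The following Python sketch is illustrative only: the PendingJob record and its field names are hypothetical, it assumes that a larger number means a higher priority queue, and it ignores any reordering done with btop or bbot.

```python
from dataclasses import dataclass


@dataclass
class PendingJob:
    job_id: int
    queue_priority: int  # assumption: larger value = higher priority queue
    submit_time: float   # seconds since the epoch


def display_order(jobs):
    """Higher priority queues first, then earlier submissions first."""
    return sorted(jobs, key=lambda j: (-j.queue_priority, j.submit_time))


jobs = [
    PendingJob(101, queue_priority=30, submit_time=200.0),
    PendingJob(102, queue_priority=40, submit_time=300.0),
    PendingJob(103, queue_priority=30, submit_time=100.0),
]
print([j.job_id for j in display_order(jobs)])
```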

A listing of jobs is displayed with the following fields:

JOBID
The job ID that LSF assigned to the job.
USER
The user who submitted the job.
STAT
The current status of the job (see JOB STATUS for details).
QUEUE
The name of the job queue to which the job belongs. If the queue to which the job belongs has been removed from the configuration, the queue name is displayed as lost_and_found. Use bhist to get the original queue name. Jobs in the lost_and_found queue remain pending until they are switched with the bswitch command into another queue.

In an LSF multicluster capability resource leasing environment, jobs scheduled by the consumer cluster display the remote queue name in the format queue_name@cluster_name. By default, this field truncates at 10 characters, so you might not see the cluster name unless you use -w or -l.

FROM_HOST
The name of the host from which the job was submitted.

With the LSF multicluster capability, if the host is in a remote cluster, the cluster name and remote job ID are appended to the host name, in the format host_name@cluster_name:job_ID. By default, this field truncates at 11 characters; you might not see the cluster name and job ID unless you use -w or -l.
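
A script that reads the untruncated FROM_HOST value (for example, from -w output) might split it as follows. This is a minimal sketch: the function name parse_from_host is hypothetical, and it assumes the field follows the host_name@cluster_name:job_ID format exactly.

```python
def parse_from_host(field):
    """Split 'host_name@cluster_name:job_ID' into (host, cluster, job_id).

    For a local host with no '@' part, cluster and job_id are None.
    """
    host, at, rest = field.partition("@")
    if not at:
        return host, None, None
    cluster, colon, job_id = rest.partition(":")
    remote_id = int(job_id) if colon and job_id.isdigit() else None
    return host, cluster, remote_id


print(parse_from_host("hostA@cluster2:1234"))
print(parse_from_host("hostA"))
```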

EXEC_HOST
The name of one or more hosts on which the job is executing (this field is empty if the job has not been dispatched). If the host on which the job is running has been removed from the configuration, the host name is displayed as lost_and_found. Use bhist to get the original host name.

If the host is part of a condensed host group or compute unit, the host name is displayed as the name of the condensed group.

If you use wildcards to configure a host to belong to more than one condensed host group, bjobs can display any of the host groups as the execution host name.

JOB_NAME
The job name assigned by the user, or the command string assigned by default at job submission with bsub. If the job name is too long to fit in this field, then only the latter part of the job name is displayed.

The displayed job name or job command can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.

SUBMIT_TIME
The submission time of the job.

Output: Long format (-l)

The -l option displays a long format listing with the following additional fields:

Job
The job ID that LSF assigned to the job.
User
The ID of the user who submitted the job.
Project
The project the job was submitted from.
Application Profile
The application profile the job was submitted to.
Command
The job command.
CWD
The current working directory on the submission host.
Data requirement requested
Indicates that the job has data requirements.
Execution CWD
The actual CWD used when the job runs.
Host file
The path to a user-specified host file used when submitting or modifying a job.
Initial checkpoint period
The initial checkpoint period specified at the job level, by bsub -k, or in an application profile with CHKPNT_INITPERIOD.
Checkpoint period
The checkpoint period specified at the job level, by bsub -k, in the queue with CHKPNT, or in an application profile with CHKPNT_PERIOD.
Checkpoint directory
The checkpoint directory specified at the job level, by bsub -k, in the queue with CHKPNT, or in an application profile with CHKPNT_DIR.
Migration threshold
The migration threshold specified at the job level, by bsub -mig.
Post-execute Command
The post-execution command specified at the job level, by bsub -Ep.
PENDING REASONS
The reason the job is in the PEND or PSUSP state. The names of the hosts associated with each reason are displayed when both -p and -l options are specified.
SUSPENDING REASONS
The reason the job is in the USUSP or SSUSP state.
loadSched
The load scheduling thresholds for the job.
loadStop
The load suspending thresholds for the job.
JOB STATUS
Possible values for the status of a job include:
PEND
The job is pending. That is, it has not yet been started.
PROV
The job has been dispatched to a power-saved host that is waking up. Before the job can be sent to the sbatchd, it is in a PROV state.
PSUSP
The job has been suspended, either by its owner or the LSF administrator, while pending.
RUN
The job is currently running.
USUSP
The job has been suspended, either by its owner or the LSF administrator, while running.
SSUSP
The job has been suspended by LSF. The following are examples of why LSF suspended the job:
  • The load conditions on the execution host or hosts have exceeded a threshold according to the loadStop vector defined for the host or queue.
  • The run window of the job's queue is closed. See bqueues(1), bhosts(1), and lsb.queues(5).
DONE
The job has terminated with status of 0.
EXIT
The job has terminated with a non-zero status – it may have been aborted due to an error in its execution, or killed by its owner or the LSF administrator.

For example, exit code 131 means that the job exceeded a configured resource usage limit and LSF killed the job.

UNKWN
mbatchd has lost contact with the sbatchd on the host on which the job runs.
WAIT
For jobs submitted to a chunk job queue, members of a chunk job that are waiting to run.
ZOMBI
A job becomes ZOMBI if:
  • A non-rerunnable job is killed by bkill while the sbatchd on the execution host is unreachable and the job is shown as UNKWN.
  • After the execution host becomes available, LSF tries to kill the ZOMBI job. Upon successful termination of the ZOMBI job, the job's status is changed to EXIT.

    With the LSF multicluster capability, when a job running on a remote execution cluster becomes a ZOMBI job, the execution cluster treats the job the same way as local ZOMBI jobs. In addition, it notifies the submission cluster that the job is in ZOMBI state and the submission cluster requeues the job.

RUNTIME
Estimated run time for the job, specified by bsub -We or bmod -We, -We+, -Wep.
The following information is displayed when running bjobs -WL, -WF, or -WP.
TIME_LEFT
The estimated run time that the job has remaining. Along with the time if applicable, one of the following symbols may also display.
  • E: The job has an estimated run time that has not been exceeded.
  • L: The job has a hard run time limit specified but either has no estimated run time or the estimated run time is more than the hard run time limit.
  • X: The job has exceeded its estimated run time and the time displayed is the time remaining until the job reaches its hard run time limit.
  • A dash indicates that the job has no estimated run time and no run limit, or that it has exceeded its run time but does not have a hard limit and therefore runs until completion.

If there is less than a minute remaining, 0:0 displays.
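
The symbol rules for TIME_LEFT can be read as a small decision procedure. The following Python sketch shows one possible reading of those rules; the function name time_left and its arguments are hypothetical, all times are in minutes, and the actual bjobs logic may differ in edge cases.

```python
def time_left(elapsed, estimate=None, hard_limit=None):
    """Return (symbol, remaining_minutes) for a running job.

    estimate and hard_limit are None when not set for the job.
    """
    # L: a hard limit applies and there is no usable (smaller) estimate.
    if hard_limit is not None and (estimate is None or estimate > hard_limit):
        return "L", max(hard_limit - elapsed, 0)
    # E: the estimate exists and has not been exceeded.
    if estimate is not None and elapsed <= estimate:
        return "E", estimate - elapsed
    # X: the estimate was exceeded; count down to the hard limit instead.
    if estimate is not None and hard_limit is not None:
        return "X", max(hard_limit - elapsed, 0)
    # Dash: nothing to count down to.
    return "-", None


print(time_left(30, estimate=60))
print(time_left(90, estimate=60, hard_limit=120))
```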

FINISH_TIME
The estimated finish time of the job. For done/exited jobs, this is the actual finish time. For running jobs, the finish time is the start time plus the estimated run time (where set and not exceeded) or the start time plus the hard run limit.
  • E: The job has an estimated run time that has not been exceeded.
  • L: The job has a hard run time limit specified but either has no estimated run time or the estimated run time is more than the hard run time limit.
  • X: The job has exceeded its estimated run time and had no hard run time limit set. The finish time displayed is the estimated run time remaining plus the start time.
  • A dash indicates that the job is pending, is suspended, or has no run limit, and therefore has no estimated finish time.
%COMPLETE
The estimated completion percentage of the job.
  • E: The job has an estimated run time that has not been exceeded.
  • L: The job has a hard run time limit specified but either has no estimated run time or the estimated run time is more than the hard run time limit.
  • X: The job has exceeded its estimated run time and had no hard run time limit set.
  • A dash indicates that the job is pending, or that it is running or suspended, but has no run time limit specified.
Note: For jobs in the state UNKNOWN, the job run time estimate is based on internal counting by the job's mbatchd.
RESOURCE USAGE
For the LSF multicluster capability job forwarding model, this information is not shown if the LSF multicluster capability resource usage updating is disabled. Use LSF_HPC_EXTENSIONS="HOST_RUSAGE" in lsf.conf to specify host-based resource usage.
The values for the current usage of a job include:
HOST
For host-based resource usage, specifies the host.
CPU time
Cumulative total CPU time in seconds of all processes in a job. For host-based resource usage, the cumulative total CPU time in seconds of all processes in a job running on a host.
IDLE_FACTOR
Job idle information (CPU time/runtime) if JOB_IDLE is configured in the queue, and the job has triggered an idle exception.
MEM
Total resident memory usage of all processes in a job. For host-based resource usage, the total resident memory usage of all processes in a job running on a host. The sum of host-based rusage may not equal the total job rusage, since total job rusage is the maximum historical value.

Memory usage unit is scaled automatically based on the value. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify the smallest unit for display (KB, MB, GB, TB, PB, or EB).

SWAP
Total virtual memory and swap usage of all processes in a job. For host-based resource usage, the total virtual memory usage of all processes in a job running on a host. The sum of host-based usage may not equal the total job usage, since total job usage is the maximum historical value.

Swap usage unit is scaled automatically based on the value. Use the LSF_UNIT_FOR_LIMITS parameter in the lsf.conf file to specify the smallest unit for display (KB, MB, GB, TB, PB, or EB).

By default, LSF collects both memory and swap usage through PIM:
  • If the EGO_PIM_SWAP_REPORT=n parameter is set in the lsf.conf file (this is the default), swap usage is virtual memory (VSZ) of the entire job process.
  • If the EGO_PIM_SWAP_REPORT=y parameter is set in the lsf.conf file, the resident set size (RSS) is subtracted from the virtual memory usage. RSS is the portion of memory occupied by a process that is held in main memory. Swap usage is collected as the VSZ - RSS.
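
The two PIM reporting modes differ only in whether RSS is subtracted. As a minimal illustration (the function name reported_swap and its arguments are hypothetical, and the clamp at zero is an assumption of this sketch):

```python
def reported_swap(vsz_kb, rss_kb, pim_swap_report=False):
    """Swap value under the two EGO_PIM_SWAP_REPORT settings.

    pim_swap_report=False models EGO_PIM_SWAP_REPORT=n (the default),
    pim_swap_report=True models EGO_PIM_SWAP_REPORT=y.
    """
    if pim_swap_report:
        return max(vsz_kb - rss_kb, 0)  # VSZ - RSS
    return vsz_kb                       # VSZ of the entire job process


print(reported_swap(1000, 300))
print(reported_swap(1000, 300, pim_swap_report=True))
```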

If memory enforcement through the Linux cgroup memory subsystem is enabled with the LSF_LINUX_CGROUP_ACCT=y parameter in the lsf.conf file, LSF uses the cgroup memory subsystem to collect memory and swap usage of all processes in a job.

NTHREAD
Number of currently active threads of a job.
PGID
Currently active process group ID in a job. For host-based resource usage, the currently active process group ID in a job running on a host.
PIDs
Currently active processes in a job. For host-based resource usage, the currently active processes in a job running on a host.
RESOURCE LIMITS
The hard resource usage limits that are imposed on the jobs in the queue (see getrlimit(2) and lsb.queues(5)). These limits are imposed on a per-job and a per-process basis.
The possible per-job resource usage limits are:
  • CPULIMIT
  • TASKLIMIT
  • MEMLIMIT
  • SWAPLIMIT
  • PROCESSLIMIT
  • THREADLIMIT
  • OPENFILELIMIT
  • HOSTLIMIT_PER_JOB
The possible UNIX per-process resource usage limits are:
  • RUNLIMIT
  • FILELIMIT
  • DATALIMIT
  • STACKLIMIT
  • CORELIMIT

If a job submitted to the queue has any of these limits specified (see bsub(1)), then the lower of the corresponding job limits and queue limits are used for the job.

If no resource limit is specified, the resource is assumed to be unlimited. User shell limits that are unlimited are not displayed.
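
The rule for combining job and queue limits can be sketched as follows. The function name effective_limit is hypothetical, and None stands for "no limit specified" (unlimited):

```python
def effective_limit(job_limit=None, queue_limit=None):
    """Return the lower of the job and queue limits; None means unlimited."""
    specified = [v for v in (job_limit, queue_limit) if v is not None]
    return min(specified) if specified else None


print(effective_limit(job_limit=500, queue_limit=200))
print(effective_limit(queue_limit=200))
print(effective_limit())
```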

EXCEPTION STATUS
Possible values for the exception status of a job include:
idle
The job is consuming less CPU time than expected. The job idle factor (CPU time/runtime) is less than the configured JOB_IDLE threshold for the queue and a job exception has been triggered.
overrun
The job is running longer than the number of minutes specified by the JOB_OVERRUN threshold for the queue and a job exception has been triggered.
underrun
The job finished sooner than the number of minutes specified by the JOB_UNDERRUN threshold for the queue and a job exception has been triggered.
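
The three exception conditions can be expressed as simple comparisons. The following Python sketch is an approximation for illustration: the function name job_exceptions and its arguments are hypothetical, the thresholds stand in for the queue's JOB_IDLE, JOB_OVERRUN, and JOB_UNDERRUN values, and it does not model when or how often LSF evaluates exceptions.

```python
def job_exceptions(cpu_time_s, run_time_s, finished,
                   job_idle=None, job_overrun=None, job_underrun=None):
    """Return the exception names that apply to a job.

    cpu_time_s and run_time_s are in seconds; job_overrun and
    job_underrun are in minutes; job_idle is a CPU time/runtime ratio.
    """
    found = []
    run_minutes = run_time_s / 60
    if job_idle is not None and run_time_s > 0:
        if cpu_time_s / run_time_s < job_idle:
            found.append("idle")
    if job_overrun is not None and not finished and run_minutes > job_overrun:
        found.append("overrun")
    if job_underrun is not None and finished and run_minutes < job_underrun:
        found.append("underrun")
    return found


print(job_exceptions(60, 3600, finished=False, job_idle=0.1, job_overrun=30))
```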
Requested resources
Shows all the resource requirement strings you specified in the bsub command.
Execution rusage
This is shown if the combined RES_REQ has an rusage or || construct. The chosen alternative will be denoted here.
Synchronous Execution
Job was submitted with the -K option. LSF submits the job and waits for the job to complete.
JOB_DESCRIPTION
The job description assigned by the user. This field is omitted if no job description has been assigned.

The displayed job description can contain up to 4094 characters.

MEMORY USAGE
Displays peak memory usage and average memory usage. For example:
MEMORY USAGE:
MAX MEM:11 Mbytes; AVG MEM:6 Mbytes
Starting in Fix Pack 14, displays peak memory usage, average memory usage, and memory usage efficiency. For example:
MEMORY USAGE:
MAX MEM:11 Mbytes; AVG MEM:6 Mbytes; MEM Efficiency: 10.00%
where MEM Efficiency is calculated using the following formula:
MEM Efficiency = (MAX MEM / MEM requested in bsub -R "rusage[mem=]") * 100%
If no memory is requested in the bsub command, the MEM Efficiency value is 0.

If the consumed memory is larger or smaller than the current rusage amount, you can adjust the rusage value the next time that you submit the same job.
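
As a worked instance of the MEM Efficiency formula, the following Python sketch reproduces the 10.00% figure from the sample output, assuming (hypothetically) that the job requested 110 MB with rusage[mem=110]. The function name mem_efficiency is illustrative only.

```python
def mem_efficiency(max_mem_mb, requested_mem_mb=None):
    """MEM Efficiency = (MAX MEM / requested mem) * 100; 0 if none requested."""
    if not requested_mem_mb:
        return 0.0
    return max_mem_mb / requested_mem_mb * 100.0


# MAX MEM of 11 MB against an assumed request of 110 MB.
print(f"{mem_efficiency(11, 110):.2f}%")
```

A value well below 100% suggests that the rusage request could be lowered; a value above 100% suggests that it should be raised.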

CPU USAGE
Available starting in Fix Pack 14: displays the maximum number of CPUs used while running the job (CPU peak), duration for CPU to peak (in seconds), CPU average efficiency, and CPU peak efficiency. For example:
CPU USAGE:
 CPU PEAK: 4.24; CPU PEAK DURATION: 54 second(s)
 CPU AVERAGE EFFICIENCY: 99.55%; CPU PEAK EFFICIENCY: 106.02%
  • CPU PEAK is the maximum number of CPUs used for running the job.
  • CPU PEAK DURATION is the duration, in seconds, to reach the CPU peak for the job.
  • CPU AVERAGE EFFICIENCY is calculated using the following formula:
    CPU AVERAGE EFFICIENCY = (CPU_TIME / (JOB_RUN_TIME * CPU_REQUESTED)) * 100%

    CPU AVERAGE EFFICIENCY is calculated periodically every time the CPU_PEAK_SAMPLE_DURATION value (defined in the lsb.params file) is reached during a job's run. The CPU_TIME and JOB_RUN_TIME values are used only since the last calculation; the job's CPU AVERAGE EFFICIENCY value is the average of all calculated CPU AVERAGE EFFICIENCY values in each cycle.

  • CPU PEAK EFFICIENCY is calculated using the following formula:
    CPU PEAK EFFICIENCY = (CPU PEAK / CPU_REQUESTED) * 100%
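
The two CPU efficiency formulas can be sketched in Python as follows. The function names and the per-cycle sample list are hypothetical; each sample is a (CPU_TIME, JOB_RUN_TIME) pair measured since the previous calculation, and the job-level average is taken over the per-cycle values as described above.

```python
def cpu_peak_efficiency(cpu_peak, cpu_requested):
    """CPU PEAK EFFICIENCY = (CPU PEAK / CPU_REQUESTED) * 100."""
    return cpu_peak / cpu_requested * 100.0


def cpu_average_efficiency(samples, cpu_requested):
    """Average of the per-cycle CPU AVERAGE EFFICIENCY values.

    samples: list of (cpu_time_s, run_time_s) pairs, one per sampling cycle.
    """
    per_cycle = [
        cpu / (run * cpu_requested) * 100.0
        for cpu, run in samples
        if run > 0
    ]
    return sum(per_cycle) / len(per_cycle) if per_cycle else 0.0


print(cpu_peak_efficiency(4.24, 4))
print(cpu_average_efficiency([(240, 60), (228, 60)], 4))
```

A peak efficiency above 100% means that the job briefly used more CPUs than it requested.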
RESOURCE REQUIREMENT DETAILS
Displays the configured level of resource requirement details. The BJOBS_RES_REQ_DISPLAY parameter in lsb.params controls the level of detail that this column displays, which can be as follows:
  • none - no resource requirements are displayed (this column is not displayed in the -l output).
  • brief - displays the combined and effective resource requirements.
  • full - displays the job, app, queue, combined and effective resource requirements.
Requested Network
Displays network resource information for IBM Parallel Edition (PE) jobs submitted with the bsub -network option. It does not display network resource information from the NETWORK_REQ parameter in lsb.queues or lsb.applications.
For example:
bjobs -l
Job <2106>, User <user1>, Project <default>, Status <RUN>, Queue <normal>,
                     Command <my_pe_job>
Fri Jun  1 20:44:42: Submitted from host <hostA>, CWD <$HOME>, Requested Network
                      <protocol=mpi: mode=US: type=sn_all: instance=1: usage=dedicated>

If mode=IP is specified for the PE job, instance is not displayed.
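
If you need the Requested Network values in a script, the colon-separated key=value string can be split into a dictionary. This is a minimal sketch: the function name parse_requested_network is hypothetical, and it assumes that no value contains a colon.

```python
def parse_requested_network(text):
    """Turn '<protocol=mpi: mode=US: ...>' into a dict of strings."""
    fields = {}
    for part in text.strip("<> \n").split(":"):
        key, eq, value = part.strip().partition("=")
        if eq:
            fields[key] = value
    return fields


net = parse_requested_network(
    "<protocol=mpi: mode=US: type=sn_all: instance=1: usage=dedicated>"
)
print(net["mode"], net.get("instance"))
```

Using net.get("instance") rather than net["instance"] allows for mode=IP jobs, where instance is not displayed.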

DATA REQUIREMENTS
When you use -data, displays a list of requested files for jobs with data requirements.

Output: Forwarded job information

The -fwd option filters output to display information on forwarded jobs in the LSF multicluster capability job forwarding model. The following additional fields are displayed:
CLUSTER
The name of the cluster to which the job was forwarded.
FORWARD_TIME
The time that the job was forwarded.

Output: Job array summary information

Use -A to display summary information about job arrays. The following fields are displayed:

JOBID
Job ID of the job array.
ARRAY_SPEC
Array specification in the format of name[index]. The array specification may be truncated; use the -w option together with -A to show the full array specification.
OWNER
Owner of the job array.
NJOBS
Number of jobs in the job array.
PEND
Number of pending jobs of the job array.
RUN
Number of running jobs of the job array.
DONE
Number of successfully completed jobs of the job array.
EXIT
Number of unsuccessfully completed jobs of the job array.
SSUSP
Number of LSF system suspended jobs of the job array.
USUSP
Number of user suspended jobs of the job array.
PSUSP
Number of held jobs of the job array.

Output: LSF Session Scheduler job summary information

JOBID
Job ID of the Session Scheduler job.
OWNER
Owner of the Session Scheduler job.
JOB_NAME
The job name assigned by the user, or the command string assigned by default at job submission with bsub. If the job name is too long to fit in this field, then only the latter part of the job name is displayed.

The displayed job name or job command can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.

NTASKS
The total number of tasks for this Session Scheduler job.
PEND
Number of pending tasks of the Session Scheduler job.
RUN
Number of running tasks of the Session Scheduler job.
DONE
Number of successfully completed tasks of the Session Scheduler job.
EXIT
Number of unsuccessfully completed tasks of the Session Scheduler job.

Output: Unfinished job summary information

Use -sum to display summary information about unfinished jobs. The count of job slots for the following job states is displayed:
RUN
The job is running.
SSUSP
The job has been suspended by LSF.
USUSP
The job has been suspended, either by its owner or the LSF administrator, while running.
UNKNOWN
mbatchd has lost contact with the sbatchd on the host where the job was running.
PEND
The job is pending, which may include PSUSP and chunk job WAIT. When -sum is used with -p in the LSF multicluster capability, WAIT jobs are not counted as PEND or FWD_PEND. When -sum is used with -r, WAIT jobs are counted as PEND or FWD_PEND.
FWD_PEND
The job is pending and forwarded to a remote cluster. The job has not yet started in the remote cluster.

Output: Affinity resource requirements information (-l -aff)

Use -l -aff to display information about CPU and memory affinity resource requirements for job tasks. A table with the heading AFFINITY is displayed containing the detailed affinity information for each task, one line for each allocated processor unit. CPU binding and memory binding information are shown in separate columns in the display.
HOST
The host that the task is running on.
TYPE
Requested processor unit type for CPU binding. One of numa, socket, core, or thread.
LEVEL
Requested processor unit binding level for CPU binding. One of numa, socket, core, or thread. If no CPU binding level is requested, a dash (-) is displayed.
EXCL
Requested processor unit binding level for exclusive CPU binding. One of numa, socket, or core. If no exclusive binding level is requested, a dash (-) is displayed.
IDS
List of physical or logical IDs of the CPU allocation for the task.

The list consists of a set of paths, represented as a sequence of integers separated by slash characters (/), through the topology tree of the host. Each path identifies a unique processing unit allocated to the task. For example, a string of the form 3/0/5/12 represents an allocation to thread 12 in core 5 of socket 0 in NUMA node 3. A string of the form 2/1/4 represents an allocation to core 4 of socket 1 in NUMA node 2. The integers correspond to the node ID numbers displayed in the topology tree from bhosts -aff.
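
The path notation maps positionally onto the topology levels. The following Python sketch decodes the two example strings; the function name parse_affinity_path is hypothetical, and it assumes the level order NUMA node, socket, core, thread described above.

```python
LEVELS = ("numa", "socket", "core", "thread")


def parse_affinity_path(path):
    """Decode an IDS path such as '3/0/5/12' into a level -> ID mapping."""
    ids = [int(part) for part in path.strip("/").split("/")]
    if len(ids) > len(LEVELS):
        raise ValueError(f"too many levels in path: {path!r}")
    return dict(zip(LEVELS, ids))


print(parse_affinity_path("3/0/5/12"))
print(parse_affinity_path("2/1/4"))
```

The strip("/") call also accepts paths written with a leading slash, such as /0/0/0.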

POL
Requested memory binding policy. Either local or pref. If no memory binding is requested, a dash (-) is displayed.
NUMA
ID of the NUMA node that the task memory is bound to. If no memory binding is requested, a dash (-) is displayed.
SIZE
Amount of memory allocated for the task on the NUMA node.
For example, the following job starts 6 tasks with the following affinity resource requirements:
bsub -n 6 -R"span[hosts=1] rusage[mem=100]affinity[core(1,same=socket,
exclusive=(socket,injob)):cpubind=socket:membind=localonly:distribute=pack]" myjob
Job <6> is submitted to default queue <normal>.
bjobs -l -aff 6

Job <6>, User <user1>, Project <default>, Status <RUN>, Queue <normal>, Comman
                     d <myjob1>
Thu Feb 14 14:13:46: Submitted from host <hostA>, CWD <$HOME>, 6 Task(s), 
                     Requested Resources <span[hosts=1] rusage[mem=10
                     0]affinity[core(1,same=socket,exclusive=(socket,injob)):cp
                     ubind=socket:membind=localonly:distribute=pack]>;
Thu Feb 14 14:15:07: Started 6 Task(s) on Hosts <hostA> <hostA> <hostA> <hostA>
                     <hostA> <hostA>, Allocated 6 Slot(s) on Hosts <hostA>
                     <hostA> <hostA> <hostA> <hostA> <hostA>, Execution Home 
                     </home/user1>, Execution CWD </home/user1>;

 SCHEDULING PARAMETERS:
           r15s   r1m  r15m   ut      pg    io   ls    it    tmp    swp    mem
 loadSched   -     -     -     -       -     -    -     -     -      -      -
 loadStop    -     -     -     -       -     -    -     -     -      -      -

 RESOURCE REQUIREMENT DETAILS:
 Combined: select[type == local] order[r15s:pg] rusage[mem=100.00] span[hosts=1
                     ] affinity[core(1,same=socket,exclusive=(socket,injob))*1:
                     cpubind=socket:membind=localonly:distribute=pack]
 Effective: select[type == local] order[r15s:pg] rusage[mem=100.00] span[hosts=
                     1] affinity[core(1,same=socket,exclusive=(socket,injob))*1
                     :cpubind=socket:membind=localonly:distribute=pack]

 AFFINITY:
                     CPU BINDING                          MEMORY BINDING
                     ------------------------             --------------------
 HOST                TYPE   LEVEL  EXCL   IDS             POL   NUMA SIZE
 hostA               core   socket socket /0/0/0          local 0    16.7MB
 hostA               core   socket socket /0/1/0          local 0    16.7MB
 hostA               core   socket socket /0/2/0          local 0    16.7MB
 hostA               core   socket socket /0/3/0          local 0    16.7MB
 hostA               core   socket socket /0/4/0          local 0    16.7MB
 hostA               core   socket socket /0/5/0          local 0    16.7MB

  

Output: Data requirements information (-l -data)

Use -l -data to display detailed information about jobs with data requirements. The heading DATA REQUIREMENTS is displayed followed by a list of the files requested by the job.

For example:
bjobs -l -data 1962
Job <1962>, User <user1>, Project <default>, Status <PEND>, Queue
        <normal>, Command <my_data_job>
Fri Sep 20 16:31:17: Submitted from host <hb05b10>, CWD 
        <$HOME/source/user1/work>, Data requirement requested;
 PENDING REASONS:
 Job is waiting for its data requirement to be satisfied;

 SCHEDULING PARAMETERS:
           r15s   r1m  r15m   ut      pg    io   ls    it    tmp    swp    mem
 loadSched   -     -     -     -       -     -    -     -     -      -      -
 loadStop    -     -     -     -       -     -    -     -     -      -      -

 RESOURCE REQUIREMENT DETAILS:
 Combined: select[type == local] order[r15s:pg]
 Effective: -

 DATA REQUIREMENTS: 
 FILE: hostA:/home/user1/data2
 SIZE: 40 MB
 MODIFIED: Thu Aug 14 17:01:57

 FILE: hostA:/home/user1/data3
 SIZE: 40 MB
 MODIFIED: Fri Aug 15 16:32:45

 FILE: hostA:/home/user1/data4
 SIZE: 500 MB
 MODIFIED: Mon Apr 14 17:15:56
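
The DATA REQUIREMENTS block is regular enough to parse line by line. The following Python sketch collects each FILE, SIZE, and MODIFIED group into a record; the function name parse_data_requirements is hypothetical, and it assumes the layout shown in the sample output above.

```python
def parse_data_requirements(text):
    """Return a list of {'file', 'size', 'modified'} dicts."""
    records, current = [], {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if not sep:
            continue  # blank lines and the section heading
        if key == "FILE" and current:
            records.append(current)  # a new FILE line starts a new record
            current = {}
        if key in ("FILE", "SIZE", "MODIFIED"):
            current[key.lower()] = value
    if current:
        records.append(current)
    return records


sample = """
 DATA REQUIREMENTS:
 FILE: hostA:/home/user1/data2
 SIZE: 40 MB
 MODIFIED: Thu Aug 14 17:01:57

 FILE: hostA:/home/user1/data4
 SIZE: 500 MB
 MODIFIED: Mon Apr 14 17:15:56
"""
for record in parse_data_requirements(sample):
    print(record["file"], record["size"])
```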

See also

bsub, bkill, bhosts, bmgroup, bclusters, bqueues, bhist, bresume, bsla, bstop, lsb.params, lsb.serviceclasses, mbatchd