Configuration to enable job submission and execution controls

Enable job submission and execution controls with at least one esub, epsub, or eexec executable file in the directory specified by the parameter LSF_SERVERDIR in the lsf.conf file. LSF does not include a default esub, epsub, or eexec; write your own executable files to meet the job requirements of your site.


Executable file   UNIX naming convention            Windows naming convention
esub              LSF_SERVERDIR/esub.application    LSF_SERVERDIR\esub.application.exe
                                                    LSF_SERVERDIR\esub.application.bat
epsub             LSF_SERVERDIR/epsub.application   LSF_SERVERDIR\epsub.application.exe
                                                    LSF_SERVERDIR\epsub.application.bat
eexec             LSF_SERVERDIR/eexec               LSF_SERVERDIR\eexec.exe
                                                    LSF_SERVERDIR\eexec.bat


The name of your esub/epsub indicates the application with which it runs. For example: esub.fluent or epsub.fluent.

Restriction: The names esub.user and epsub.user are reserved. Do not use esub.user and epsub.user for application-specific esub and epsub executable files.

Valid file names contain only alphanumeric characters, underscores (_), and hyphens (-).
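
The naming rules can be checked before an executable file is installed. The following is a minimal sketch, not part of LSF: the function name is_valid_esub_app is hypothetical, and it assumes that the character rule applies to the application suffix that follows esub. or epsub.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the given application suffix can be used
# in an esub.<application> or epsub.<application> file name.
is_valid_esub_app() {
    case "$1" in
        "")               return 1 ;;  # empty suffix
        user)             return 1 ;;  # esub.user and epsub.user are reserved
        *[!A-Za-z0-9_-]*) return 1 ;;  # a character outside the allowed set
        *)                return 0 ;;
    esac
}

# Try a few candidate suffixes.
for app in fluent my-app_2 user "bad name"; do
    if is_valid_esub_app "$app"; then
        echo "esub.$app: ok"
    else
        echo "esub.$app: not allowed"
    fi
done
```

A check like this could run in a site's deployment script before files are copied into LSF_SERVERDIR.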

Once the LSF_SERVERDIR contains one or more esub/epsub executable files, users can specify the esub/epsub executable files that are associated with each job they submit. If an eexec exists in LSF_SERVERDIR, LSF invokes that eexec for all jobs that are submitted to the cluster.

The following esub executable files are provided as separate packages, available from IBM upon request:
  • esub.afs or esub.dce: for installing LSF onto an AFS or DCE filesystem
  • esub.bproc: Beowulf Distributed Process Space (BProc) job submission
  • esub.checkcmd: Check bsub option arguments.
  • esub.dprov: Data provenance options for job submission
  • esub.fluent: FLUENT job submission
  • esub.intelmpi: Intel® MPI job submission
  • esub.lammpi: LAM/MPI job submission
  • esub.ls_dyna: LS-Dyna job submission
  • esub.mpich_gm: MPICH-GM job submission
  • esub.mpich2: MPICH2 job submission
  • esub.mpichp4: MPICH-P4 job submission
  • esub.mvapich: MVAPICH job submission
  • esub.openmpi: OpenMPI job submission
  • esub.p8aff: POWER8 affinity job submission
  • esub.poe: POE job submission
  • esub.pvm: PVM job submission
  • esub.tv, esub.tvlammpi, esub.tvmpich_gm, esub.tvpoe: TotalView® debugging for various MPI applications.

Environment variables used by esub

When you write an esub, you can use the following environment variables that are provided by LSF for the esub execution environment:

LSB_SUB_PARM_FILE
Points to a temporary file that LSF uses to store the bsub options that are entered in the command line. An esub reads this file at job submission and either accepts the values, changes the values, or rejects the job. Job submission options are stored as name-value pairs on separate lines with the format option_name=value.

For example, if a user submits the following job:

bsub -q normal -x -P myproject -R "r1m rusage[mem=100]" -n 90 myjob

The LSB_SUB_PARM_FILE contains the following lines:
LSB_SUB_QUEUE="normal"
LSB_SUB_EXCLUSIVE=Y
LSB_SUB_RES_REQ="r1m rusage[mem=100]"
LSB_SUB_PROJECT_NAME="myproject"
LSB_SUB_COMMAND_LINE="myjob"
LSB_SUB_NUM_PROCESSORS=90
LSB_SUB_MAX_NUM_PROCESSORS=90
LSB_SUB_MEM_USAGE=100
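
Because every line has the form option_name=value, a shell esub can load the options by sourcing the file. The following is a minimal sketch under that assumption. The first part only fabricates a parameter file so that the sketch also runs outside LSF; in a real esub, LSF sets LSB_SUB_PARM_FILE and that part is not needed.

```shell
#!/bin/sh
# Demo scaffolding (assumption): create a stand-in parameter file when the
# script is not started by LSF.
if [ -z "${LSB_SUB_PARM_FILE:-}" ]; then
    LSB_SUB_PARM_FILE=$(mktemp)
    cat > "$LSB_SUB_PARM_FILE" <<'EOF'
LSB_SUB_QUEUE="normal"
LSB_SUB_EXCLUSIVE=Y
LSB_SUB_PROJECT_NAME="myproject"
LSB_SUB_NUM_PROCESSORS=90
EOF
fi

# Each line is a valid shell assignment, so sourcing the file turns every
# submission option into a shell variable.
. "$LSB_SUB_PARM_FILE"

# Options that the user did not specify are simply not set.
echo "queue:      ${LSB_SUB_QUEUE:-not set}"
echo "project:    ${LSB_SUB_PROJECT_NAME:-not set}"
echo "exclusive:  ${LSB_SUB_EXCLUSIVE:-not set}"
echo "processors: ${LSB_SUB_NUM_PROCESSORS:-not set}"
```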

An esub can change any or all of the job options by writing to the file specified by the environment variable LSB_SUB_MODIFY_FILE.

The temporary file pointed to by LSB_SUB_PARM_FILE stores the following information:
Option bsub or bmod option Description
LSB_SUB_ADDITIONAL -a String that contains the application name or names of the esub executable files that are requested by the user.
Restriction: The -a option is the only option that an esub cannot change or add at job submission.
LSB_SUB_BEGIN_TIME -b Begin time, in seconds since 00:00:00 GMT, 1 January 1970.
LSB_SUB_CHKPNT_DIR -k Checkpoint directory

The file path of the checkpoint directory can contain up to 4000 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.

LSB_SUB_COMMAND_LINE bsub job command argument The LSB_SUB_COMMANDNAME parameter must be set in the lsf.conf file to enable esub to use this variable.
LSB_SUB_CHKPNT_PERIOD -k Checkpoint period in seconds
LSB_SUB3_CWD -cwd Current working directory
LSB_SUB_DEPEND_COND -w Dependency condition
LSB_SUB_ERR_FILE -e, -eo Standard error file name
LSB_SUB_EXCLUSIVE -x Exclusive execution, which is specified by Y.
LSB_SUB_HOLD -H Hold job.
LSB_SUB_HOST_SPEC -c or -W Host specifier, which limits the CPU time or run time.
LSB_SUB_HOSTS -m List of requested execution host names
LSB_SUB_IN_FILE -i, -io Standard input file name
LSB_SUB_INTERACTIVE -I Interactive job, which is specified by Y.
LSB_SUB6_JOBAFF -jobaff Job's affinity preferences
LSB_SUB_JOB_DESCRIPTION -Jd Job description
LSB_SUB_JOB_NAME -J Job name
LSB_SUB_LOGIN_SHELL -L Login shell
LSB_SUB_MAIL_USER -u Email address to which LSF sends job-related messages.
LSB_SUB_MEM_USAGE -R "rusage[mem=value]" Specifies the mem value in the rusage[] string.
LSB_SUB_MAX_NUM_PROCESSORS -n Maximum number of processors requested
LSB_SUB_SWP_USAGE -R "rusage[swp=value]" Specifies the swp value in the rusage[] string.
LSB_MC_SUB_CLUSTERS -clusters Cluster names
LSB_SUB_MODIFY bmod Indicates that bmod invoked esub, specified by Y.
LSB_SUB_MODIFY_ONCE bmod Indicates that the job options that are specified at job submission are already modified by bmod, and that bmod is invoking esub again. This is specified by Y.
LSB_SUB4_NETWORK -network Defines network requirements before job submission
LSB_SUB4_ORPHAN_TERM_NO_WAIT -ti Tells LSF to terminate an orphaned job immediately (ignores the grace period).
LSB_SUB4_ELIGIBLE_PEND_TIME_LIMIT -eptl The eligible pending time limit for the job.

LSB_SUB4_ELIGIBLE_PEND_TIME_LIMIT=[hour:]minute if bsub -eptl or bmod -eptl is specified.

LSB_SUB4_ELIGIBLE_PEND_TIME_LIMIT=SUB_RESET if bmod -eptln is specified.

LSB_SUB4_PEND_TIME_LIMIT -ptl The pending time limit for the job.

LSB_SUB4_PEND_TIME_LIMIT=[hour:]minute if bsub -ptl or bmod -ptl is specified.

LSB_SUB4_PEND_TIME_LIMIT=SUB_RESET if bmod -ptln is specified.

LSB_SUB_NOTIFY_BEGIN -B LSF sends an email notification when the job begins, specified by Y.
LSB_SUB_NOTIFY_END -N LSF sends an email notification when the job ends, specified by Y.
LSB_SUB_NUM_PROCESSORS -n Minimum number of processors requested.
LSB_SUB_OTHER_FILES bmod -f Indicates the number of files to be transferred. The value is SUB_RESET if bmod is being used to reset the number of files to be transferred.

The file path of the directory can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.

LSB_SUB_OTHER_FILES_number bsub -f The number indicates the particular file transfer value in the specified file transfer expression.

For example, for bsub -f "a > b" -f "c < d", the following parameters are defined:

LSB_SUB_OTHER_FILES=2

LSB_SUB_OTHER_FILES_0="a > b"

LSB_SUB_OTHER_FILES_1="c < d"

LSB_SUB4_OUTDIR -outdir Output directory
LSB_SUB_OUT_FILE -o, -oo Standard output file name.
LSB_SUB_PRE_EXEC -E Pre-execution command.

The file path of the directory can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.

LSB_SUB_PROJECT_NAME -P Project name.
LSB_SUB_PTY -Ip An interactive job with PTY support, which is specified by "Y"
LSB_SUB_PTY_SHELL -Is An interactive job with PTY shell support, which is specified by "Y"
LSB_SUB_QUEUE -q Submission queue name
LSB_SUB_RERUNNABLE -r Y specifies a rerunnable job.

N specifies a non-rerunnable job (specified with bsub -rn). The job is not rerunnable even if it was submitted to a rerunnable queue or application profile.

For bmod -rn, the value is SUB_RESET.

LSB_SUB_RES_REQ -R Resource requirement string. Multiple resource requirement strings are not supported.
LSB_SUB_RESTART brestart Y indicates to esub that the job options are associated with a restarted job.
LSB_SUB_RESTART_FORCE brestart -f Y indicates to esub that the job options are associated with a forced restarted job.
LSB_SUB_RLIMIT_CORE -C Core file size limit
LSB_SUB_RLIMIT_CPU -c CPU limit
LSB_SUB_RLIMIT_DATA -D Data size limit

For AIX, if the XPG_SUS_ENV=ON environment variable is set in the user's environment before the process is executed and a process attempts to set the limit lower than current usage, the operation fails with errno set to EINVAL. If the XPG_SUS_ENV environment variable is not set, the operation fails with errno set to EFAULT.

LSB_SUB_RLIMIT_FSIZE -F File size limit
LSB_SUB_RLIMIT_PROCESS -p Process limit
LSB_SUB_RLIMIT_RSS -M Resident size limit
LSB_SUB_RLIMIT_RUN -W Wall-clock run limit in seconds. (Note this value is not in minutes, unlike the run limit specified by bsub -W).
LSB_SUB_RLIMIT_STACK -S Stack size limit
LSB_SUB_RLIMIT_SWAP -v Process virtual memory limit
LSB_SUB_RLIMIT_THREAD -T Thread limit
LSB_SUB_TERM_TIME -t Termination time, in seconds, since 00:00:00 GMT, Jan. 1, 1970
LSB_SUB_TIME_EVENT -wt Time event expression

LSB_SUB_USER_GROUP -G User group name
LSB_SUB_JOB_WARNING_ACTION -wa Job warning action
LSB_SUB_JOB_ACTION_WARNING_TIME -wt Job warning time period
LSB_SUB_WINDOW_SIG -s Window signal number
LSB_SUB2_JOB_GROUP -g Submits a job to a job group
LSB_SUB2_LICENSE_PROJECT -Lp License Scheduler project name
LSB_SUB2_IN_FILE_SPOOL -is Spooled input file name
LSB_SUB2_JOB_CMD_SPOOL -Zs Spooled job command file name
LSB_SUB2_JOB_PRIORITY -sp Job priority

For bmod -spn, the value is SUB_RESET.

LSB_SUB2_SLA -sla SLA scheduling options
LSB_SUB2_USE_RSV -U Advance reservation ID
LSB_SUB3_ABSOLUTE_PRIORITY bmod -aps, bmod -apsn For bmod -aps, the value is equal to the APS string given. For bmod -apsn, the value is SUB_RESET.

LSB_SUB3_AUTO_RESIZABLE -ar Job autoresizable attribute.

LSB_SUB3_AUTO_RESIZABLE=Y if bsub -ar -app or bmod -ar is specified.

LSB_SUB3_AUTO_RESIZABLE=SUB_RESET if bmod -arn is used.

LSB_SUB3_APP -app Application profile name

For bmod -appn, the value is SUB_RESET.

LSB_SUB3_INIT_CHKPNT_PERIOD -k init Initial checkpoint period
LSB_SUB_INTERACTIVE, LSB_SUB3_INTERACTIVE_SSH bsub -IS The session of the interactive job is encrypted with SSH.
LSB_SUB_INTERACTIVE, LSB_SUB_PTY, LSB_SUB3_INTERACTIVE_SSH bsub -ISp If LSB_SUB_INTERACTIVE, LSB_SUB_PTY, and LSB_SUB3_INTERACTIVE_SSH are all specified by "Y", the session of the interactive job with PTY support is encrypted with SSH.
LSB_SUB_INTERACTIVE, LSB_SUB_PTY, LSB_SUB_PTY_SHELL, LSB_SUB3_INTERACTIVE_SSH bsub -ISs If LSB_SUB_INTERACTIVE, LSB_SUB_PTY, LSB_SUB_PTY_SHELL, and LSB_SUB3_INTERACTIVE_SSH are all specified by "Y", the session of the interactive job with PTY shell support is encrypted with SSH.
LSB_SUB3_JOB_REQUEUE -Q String format parameter that contains the job requeue exit values

For bmod -Qn, the value is SUB_RESET.

LSB_SUB3_MIG -mig, -mign Migration threshold
LSB_SUB3_POST_EXEC -Ep Run the specified post-execution command on the execution host after the job finishes (you must specify the first execution host).

The file path of the directory can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.

LSB_SUB6_RC_ACCOUNT -rcacct LSF resource connector account name that is assigned to a job, which is then tagged to the resource connector host that runs the job.
LSB_SUB3_RESIZE_NOTIFY_CMD -rnc Job resize notification command.

LSB_SUB3_RESIZE_NOTIFY_CMD=<cmd> if bsub -rnc or bmod -rnc is specified.

LSB_SUB3_RESIZE_NOTIFY_CMD=SUB_RESET if bmod -rncn is used.

LSB_SUB3_RUNTIME_ESTIMATION -We Runtime estimate in seconds. (Note this runtime is not in minutes, unlike the runtime estimate specified by bsub -We).
LSB_SUB3_RUNTIME_ESTIMATION_ACC -We+ Runtime estimate that is the accumulated run time plus the runtime estimate.
LSB_SUB3_RUNTIME_ESTIMATION_PERC -Wep Runtime estimate in percentage of completion
LSB_SUB3_USER_SHELL_LIMITS -ul Pass user shell limits to execution host.
LSB_SUB_INTERACTIVE, LSB_SUB3_XJOB_SSH bsub -IX If both are set to "Y", the session between the X-client and X-server, as well as the session between the execution host and submission host, are encrypted with SSH.
LSF_SUB4_SUB_ENV_VARS -env Controls the propagation of job submission environment variables to the execution hosts. If any environment variables in LSF_SUB4_SUB_ENV_VARS conflict with the contents of the LSB_SUB_MODIFY_ENVFILE file, the conflicting environment variables in LSB_SUB_MODIFY_ENVFILE take effect.
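
The numbered LSB_SUB_OTHER_FILES_number entries from the table can be walked in a loop. The following is a minimal sketch that reuses the values from the bsub -f "a > b" -f "c < d" example. The variable names count, expr_i, and transfers are illustrative only; in a real esub the LSB_SUB_OTHER_FILES values come from the file that LSB_SUB_PARM_FILE points to.

```shell
#!/bin/sh
# Demo values copied from the bsub -f "a > b" -f "c < d" example.
LSB_SUB_OTHER_FILES=2
LSB_SUB_OTHER_FILES_0="a > b"
LSB_SUB_OTHER_FILES_1="c < d"

# Treat an unset value or a non-numeric value such as SUB_RESET as
# "no transfer expressions to list".
case "${LSB_SUB_OTHER_FILES:-0}" in
    ''|*[!0-9]*) count=0 ;;
    *)           count=$LSB_SUB_OTHER_FILES ;;
esac

transfers=""
i=0
while [ "$i" -lt "$count" ]; do
    # Indirect lookup of LSB_SUB_OTHER_FILES_<i>. The variable name is built
    # from a fixed prefix and a counter only, so eval sees no user input.
    eval "expr_i=\${LSB_SUB_OTHER_FILES_$i:-}"
    echo "transfer $i: $expr_i"
    transfers="$transfers[$expr_i]"
    i=$((i + 1))
done
```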

LSB_SUB_MODIFY_FILE
Points to the file that esub uses to modify the bsub job option values that are stored in the LSB_SUB_PARM_FILE. You can change the job options by having your esub write the new values to the LSB_SUB_MODIFY_FILE in any order by using the same format shown for the LSB_SUB_PARM_FILE. The value SUB_RESET, integers, and boolean values do not require quotes. String parameters must be entered with quotes around each string, or space-separated series of strings.

When your esub runs at job submission, LSF checks the LSB_SUB_MODIFY_FILE and applies changes so that the job runs with the revised option values.

Restriction: LSB_SUB_ADDITIONAL is the only option that an esub cannot change or add at job submission.
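
As an illustration of the quoting rules, the following sketch appends three changes to the modify file. The queue name night and the limit of 16 are hypothetical site values, and the first line is demo scaffolding that points LSB_SUB_MODIFY_FILE at a scratch file when the script is not started by LSF.

```shell
#!/bin/sh
# Demo scaffolding (assumption): use a scratch file outside LSF.
: "${LSB_SUB_MODIFY_FILE:=$(mktemp)}"

# Hypothetical site policy: route the job to a queue named "night", cap the
# request at 16 processors, and force exclusive execution.
{
    echo 'LSB_SUB_QUEUE="night"'          # string values are quoted
    echo 'LSB_SUB_MAX_NUM_PROCESSORS=16'  # integers are not quoted
    echo 'LSB_SUB_EXCLUSIVE=Y'            # boolean values are not quoted
} >> "$LSB_SUB_MODIFY_FILE"

# Show what LSF would read back after the esub finishes.
cat "$LSB_SUB_MODIFY_FILE"
```

The lines can be written in any order; only options that the esub wants to change need to appear.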

LSB_SUB_MODIFY_ENVFILE
Points to the file that esub uses to modify the user environment variables with which the job is submitted (not specified by bsub options). You can change these environment variables by having your esub write the values to the LSB_SUB_MODIFY_ENVFILE in any order by using the format variable_name=value, or variable_name="string".

LSF uses the LSB_SUB_MODIFY_ENVFILE to change the environment variables on the submission host. When your esub runs at job submission, LSF checks the LSB_SUB_MODIFY_ENVFILE and applies changes so that the job is submitted with the new environment variable values. LSF associates the new user environment with the job so that the job runs on the execution host with the new user environment.
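
A small helper keeps the variable_name="string" format consistent. This is a sketch only: set_job_env is a hypothetical function, SCRATCH_DIR and its path are made-up examples, and the helper assumes that values contain no double quotation marks.

```shell
#!/bin/sh
# Demo scaffolding (assumption): use a scratch file outside LSF.
: "${LSB_SUB_MODIFY_ENVFILE:=$(mktemp)}"

# Hypothetical helper: append one variable in variable_name="string" form.
# Values that contain double quotation marks are not handled.
set_job_env() {
    printf '%s="%s"\n' "$1" "$2" >> "$LSB_SUB_MODIFY_ENVFILE"
}

set_job_env SCRATCH_DIR "/scratch/demo"   # hypothetical variable and path
set_job_env OMP_NUM_THREADS "4"

cat "$LSB_SUB_MODIFY_ENVFILE"
```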

LSB_SUB_ABORT_VALUE
Indicates to LSF that a job is rejected. For example, if you want LSF to reject a job, make sure that your esub contains the following line:
exit $LSB_SUB_ABORT_VALUE
Restriction: When an esub exits with the LSB_SUB_ABORT_VALUE, esub must not write to LSB_SUB_MODIFY_FILE or to LSB_SUB_MODIFY_ENVFILE.

If multiple esubs are specified and one of the esubs exits with a value of LSB_SUB_ABORT_VALUE, LSF rejects the job without running the remaining esubs and returns a value of LSB_SUB_ABORT_VALUE.
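
The following sketch shows one way to structure a rejecting esub, assuming a hypothetical site rule that every job must name a project. LSF normally provides LSB_SUB_ABORT_VALUE; the fallback of 97 is a placeholder so that the sketch runs outside LSF. Writing the message to standard error is also an assumption about how a site wants to report the rejection.

```shell
#!/bin/sh
# Demo scaffolding (assumptions): placeholder abort value and an empty
# project name, as if the user had omitted bsub -P.
: "${LSB_SUB_ABORT_VALUE:=97}"
LSB_SUB_PROJECT_NAME=""

esub_main() {
    if [ -z "${LSB_SUB_PROJECT_NAME:-}" ]; then
        echo "Job rejected: specify a project with bsub -P" >&2
        # Reject before anything is written to LSB_SUB_MODIFY_FILE or
        # LSB_SUB_MODIFY_ENVFILE.
        exit "$LSB_SUB_ABORT_VALUE"
    fi
    exit 0
}

# The subshell lets the demo inspect the exit status; a real esub would
# call esub_main directly and exit with its status.
status=0
( esub_main ) || status=$?
echo "esub exit status: $status"
```

Checking all rejection conditions before any write keeps the esub within the restriction on LSB_SUB_MODIFY_FILE and LSB_SUB_MODIFY_ENVFILE.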

LSF_INVOKE_CMD
Specifies the name of the LSF command that most recently invoked an external executable.
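
An executable file can branch on this variable to treat submissions, modifications, and restarts differently. The sketch below assumes that the value is the bare command name (bsub, bmod, or brestart); verify the exact values in your cluster before relying on them. The function describe_invocation is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: map the invoking command to a short description.
# The command names matched here are assumed values of LSF_INVOKE_CMD.
describe_invocation() {
    case "${1:-}" in
        bsub)     echo "new submission" ;;
        bmod)     echo "modification of an existing job" ;;
        brestart) echo "restart of a checkpointed job" ;;
        *)        echo "other or unknown command" ;;
    esac
}

# Demo default so that the sketch runs outside LSF.
LSF_INVOKE_CMD=${LSF_INVOKE_CMD:-bmod}
action=$(describe_invocation "$LSF_INVOKE_CMD")
echo "invoked by $LSF_INVOKE_CMD: $action"
```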

The length of environment variables that are used by esub must be less than 4096.

Environment variables used by epsub

When you write an epsub, you can use the following environment variables that are provided by LSF for the epsub execution environment:

LSB_SUB_JOB_ERR
Indicates the error number for an externally submitted job that is defined by mbatchd if the job submission failed. This variable is available to the external post-submission scripts (epsub) to determine the reason for the job submission failure.

If the job submission is successful, this value is LSB_NO_ERROR (or 0).

LSB_SUB_JOB_ID
Indicates the ID of a submitted job that is assigned by LSF, as shown by the bjobs command. A value of -1 indicates that mbatchd rejected the job submission.
LSB_SUB_JOB_QUEUE
Indicates the name of the final queue from which the job is dispatched, which includes any queue modifications that are made by esub.
LSB_SUB_PARM_FILE
Points to a temporary file that LSF uses to store the bsub options that are entered in the command line. Job submission options are stored as name-value pairs on separate lines in the format option_name=value. The file that this environment variable specifies is a different file from the one that is initially created by esub before the job submission.

In addition to the environment variables available to epsub, you can also use the environment variables that are provided by LSF for the esub execution environment, except for LSB_SUB_MODIFY_FILE and LSB_SUB_MODIFY_ENVFILE.
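
An epsub typically reports or records the outcome of the submission. The following sketch combines the three job variables; epsub_report is a hypothetical function, the default values are demo placeholders, and the check accepts both 0 and the literal LSB_NO_ERROR because the exact representation of success is not certain here.

```shell
#!/bin/sh
# Hypothetical helper: summarize a submission from the job ID, the error
# number, and the final queue name.
epsub_report() {
    job_id=$1
    job_err=$2
    job_queue=$3
    case "$job_err" in
        0|LSB_NO_ERROR) failed=no ;;
        *)              failed=yes ;;
    esac
    # A job ID of -1 means that mbatchd rejected the submission.
    [ "$job_id" = "-1" ] && failed=yes
    if [ "$failed" = yes ]; then
        echo "submission failed (mbatchd error: $job_err)"
    else
        echo "job $job_id accepted in queue $job_queue"
    fi
}

# Demo placeholders are used when the script is not started by LSF.
epsub_report "${LSB_SUB_JOB_ID:-1234}" "${LSB_SUB_JOB_ERR:-0}" "${LSB_SUB_JOB_QUEUE:-normal}"
```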

Environment variables used by eexec

When you write an eexec, you can use the following environment variables in addition to all user-environment or application-specific variables.
LS_EXEC_T
Indicates the stage or type of job execution. LSF sets LS_EXEC_T to:
  • START at the beginning of job execution
  • END at job completion
  • CHKPNT at job checkpoint start
LS_JOBPID
Stores the process ID of the LSF process that invoked eexec. If eexec is intended to monitor job execution, eexec must spawn a child and then have the parent eexec process exit. The eexec child can periodically test that the job process is still alive by using the LS_JOBPID variable.
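
The spawn-and-exit pattern can be sketched as follows. This is an illustration under assumptions, not a reference implementation: job_alive, watch_job, and eexec_main are hypothetical names, the 30-second polling interval is arbitrary, and kill -0 is only one possible liveness test (it also fails if the eexec account is not permitted to signal the process). The demo at the end substitutes an already finished process for LS_JOBPID so that the sketch terminates immediately.

```shell
#!/bin/sh
# Succeeds while a process with the given ID exists; signal 0 sends nothing.
job_alive() {
    kill -0 "$1" 2>/dev/null
}

# Polls until the process is gone. $2 is the polling interval in seconds.
watch_job() {
    while job_alive "$1"; do
        sleep "${2:-30}"
    done
    echo "process $1 is gone"
}

eexec_main() {
    case "${LS_EXEC_T:-}" in
        START)
            # The child keeps watching in the background; the parent
            # returns immediately.
            watch_job "$LS_JOBPID" >/dev/null 2>&1 &
            ;;
        END)    : ;;  # site-specific cleanup would go here
        CHKPNT) : ;;  # site-specific checkpoint handling would go here
    esac
}

# Demo: a process that has already finished stands in for LS_JOBPID.
true &
LS_JOBPID=$!
wait "$LS_JOBPID"
LS_EXEC_T=START
eexec_main
wait   # demo only: let the background watcher finish
```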