bkill

Sends signals to kill, suspend, or resume unfinished jobs

Synopsis

bkill [-l] [-b | -stat run | pend | susp] [-app application_profile_name] [-g job_group_name] [-sla service_class_name] [-J job_name] [-m host_name | -m host_group] [-q queue_name] [-r | -s signal_value | signal_name | -d] [-u user_name | -u user_group | -u all] [0] [-C kill_reason] [job_ID ... | 0 | "job_ID[index]" ...] [-h | -V]

Description

By default, sends a set of signals to kill the specified jobs. On UNIX, SIGINT and SIGTERM signals are sent to give the job a chance to clean up before termination, then the SIGKILL signal is sent to kill the job. The time interval between sending each signal is defined by the JOB_TERMINATE_INTERVAL parameter in the lsb.params file.
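
A job can use the interval between signals to clean up after itself. The following is a minimal sketch of a job script fragment, assuming that the job is a POSIX shell script and that the signals reach the script's shell. The SCRATCH variable and the cleanup function are hypothetical names, not part of LSF.

```shell
#!/bin/sh
# Hypothetical job script fragment; SCRATCH and cleanup are illustrative names.
SCRATCH=$(mktemp -d)

cleanup() {
    # Remove temporary files before the job is terminated.
    rm -rf "$SCRATCH"
}

# Clean up on SIGINT or SIGTERM, then exit with the conventional
# 128 + signal number status (SIGINT is 2, SIGTERM is 15).
trap 'cleanup; exit 130' INT
trap 'cleanup; exit 143' TERM

# The real workload would follow here. A shell runs a trap only after the
# current foreground command returns, so a long-running child is usually
# started in the background and waited on with the wait built-in.
```

The SIGKILL signal cannot be trapped, so any cleanup must finish before the final signal is sent.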

By default, kills the last job that was submitted by the user who ran the command. You must specify a job ID or the -app, -g, -J, -m, -u, or -q option. If you specify the -app, -g, -J, -m, -u, or -q option without a job ID, the bkill command kills the last job that was submitted by the user who ran the command. Specify job ID 0 to kill multiple jobs.

On Windows, job control messages replace the SIGINT and SIGTERM signals, but only customized applications can process them. The TerminateProcess() system call is sent to kill the job.

The bkill command sends the signals INT, TERM, and KILL in sequence. The exit code that is returned when a dispatched job is killed with the bkill command depends on which signal killed the job.
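
By shell convention, a process that is terminated by a signal reports an exit status of 128 plus the signal number. The following sketch decodes such a status. The signal_from_status function is a hypothetical helper, and it assumes that the reported exit code follows this convention.

```shell
# Hypothetical helper: print the signal number encoded in an exit status,
# or 0 if the status does not indicate termination by a signal.
# Assumes the 128 + signal number convention.
signal_from_status() {
    if [ "$1" -gt 128 ]; then
        echo $(( $1 - 128 ))
    else
        echo 0
    fi
}
```

For example, signal_from_status 143 prints 15 (SIGTERM), and signal_from_status 137 prints 9 (SIGKILL).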

If the PRIVILEGED_USER_FORCE_BKILL=y parameter is set in the lsb.params file, only root and LSF administrators can run the bkill -r command. The -r option is ignored for other users.

Users can operate only on their own jobs. Only root and LSF administrators can operate on jobs that are submitted by other users.

If a signal request fails to reach the job execution host, LSF tries the operation later when the host becomes reachable. LSF retries the most recent signal request.

Options

0
Kills all the jobs that satisfy other options (-app, -g, -m, -q, -u, and -J options).
-b
Kills large numbers of jobs as soon as possible. Local pending jobs are killed immediately and cleaned up as soon as possible, ignoring the time interval specified by the CLEAN_PERIOD parameter in the lsb.params file. Jobs that are killed in this manner are not logged to the lsb.acct file.

Other jobs, such as running jobs, are killed as soon as possible and cleaned up normally.

If the -b option is used with the 0 subcommand, the bkill command kills all applicable jobs and silently skips the jobs that cannot be killed.
bkill -b 0
Operation is in progress

The -b option is ignored if used with the -r or -s options.

-d
Kills the jobs, then records the jobs as DONE after the jobs exit.

Use the -d option when working with remote clusters.

The -d option is ignored if used with the -r or -s options.

The -d option only takes effect for started jobs that are in the RUN, USUSP, or SSUSP state. Otherwise, the option is ignored.

-l
Displays the signal names that are supported by the bkill command. The supported signals are a subset of the signals that are supported by the /bin/kill command and are operating system-dependent.
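
A script can check the list before it sends a signal. The following sketch assumes that the bkill -l command prints signal names separated by whitespace; the exact output format is not described here, so treat the parsing as an assumption. The signal_supported function is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the given signal name (without the SIG
# prefix) appears in the output of bkill -l.
# Assumes bkill -l prints signal names separated by spaces, tabs, or newlines.
signal_supported() {
    bkill -l | tr ' \t' '\n\n' | grep -qxF -- "$1"
}
```

For example, signal_supported TSTP && bkill -s TSTP 1234 sends the signal only if the name is listed.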
-r
Removes a job from the LSF system without waiting for the job to terminate in the operating system.

If the PRIVILEGED_USER_FORCE_BKILL=y parameter is set in the lsb.params file, only root and LSF administrators can run the bkill -r command. The -r option is ignored for other users.

Sends the same series of signals as the bkill command without the -r option, except that the job is removed from the system immediately. If the job is in UNKNWN state, the bkill -r command marks the job as ZOMBIE state. The bkill -r command changes jobs in ZOMBIE state to EXIT state. The job resources that LSF monitors are released as soon as LSF receives the first signal.

Use the bkill -r command only on jobs that cannot be killed in the operating system, or on jobs that cannot be otherwise removed by using the bkill command.

The -r option cannot be used with the -s option.

-app application_profile_name
Operates only on jobs that are associated with the specified application profile. You must specify an existing application profile. If job_ID or 0 is not specified, only the most recently submitted qualifying job is operated on.
-C kill_reason
Specifies a reason for killing the job. The reason can be up to 4095 characters long.
Note: This option cannot be used in combination with the -s option unless the signal is SIGSTOP, SIGCONT, or SIGKILL.
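
The following sketch wraps the -C option and checks the length limit before it calls the bkill command. The kill_with_reason function and its error message are hypothetical; only the -C option and the 4095-character limit come from this reference.

```shell
# Hypothetical wrapper: kill a job and record a reason.
# $1: job ID, $2: reason text (at most 4095 characters).
kill_with_reason() {
    job_id=$1
    reason=$2
    if [ "${#reason}" -gt 4095 ]; then
        echo "kill reason exceeds 4095 characters" >&2
        return 1
    fi
    # Quote the reason so that it is passed as a single argument.
    bkill -C "$reason" "$job_id"
}
```

For example, kill_with_reason 1234 "scratch disk full" runs bkill -C "scratch disk full" 1234.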
-g job_group_name
Operates only on jobs in the job group that is specified by job_group_name.

Use the -g option with the -sla option to kill jobs in job groups that are attached to a service class.

The bkill command does not kill jobs in lower-level job groups in the path. For example, jobs are attached to job groups /risk_group and /risk_group/consolidate:
bsub -g /risk_group  myjob
Job <115> is submitted to default queue <normal>.
bsub -g /risk_group/consolidate myjob2
Job <116> is submitted to default queue <normal>.
The following bkill command kills only jobs in the /risk_group job group, not the subgroup /risk_group/consolidate:
bkill -g /risk_group 0
Job <115> is being terminated
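To kill jobs in the /risk_group/consolidate job group, specify the path to that job group explicitly:
bkill -g /risk_group/consolidate 0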
Job <116> is being terminated
-J job_name
Operates only on jobs with the specified job name. The -J option is ignored if a job ID other than 0 is specified in the job_ID option.

The job name can be up to 4094 characters long. Job names are not unique.

The wildcard character (*) can be used anywhere within a job name, but it cannot appear within an array index. For example, the pattern job* returns jobA and jobarray[1]. The *AAA*[1] pattern returns the first element in job arrays with names that contain AAA. However, the pattern job1[*] does not return anything since the wildcard is within the array index.
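
In a UNIX shell, quote the pattern so that the shell does not expand the wildcard before the bkill command sees it. The following sketch also rejects a wildcard inside an array index, because such a pattern matches nothing. The kill_jobs_by_name function is a hypothetical helper, and its check is a simple approximation of the rule above.

```shell
# Hypothetical helper: kill all jobs whose name matches a pattern.
# $1: job name pattern, for example 'job*' or '*AAA*[1]'.
kill_jobs_by_name() {
    case $1 in
        *\[*\**\]*)
            # A * between [ and ] is a wildcard inside an array index.
            echo "wildcard is not allowed inside an array index: $1" >&2
            return 1
            ;;
    esac
    # Job ID 0 selects every job that matches, not only the most recent one.
    bkill -J "$1" 0
}
```

For example, kill_jobs_by_name 'job*' runs bkill -J "job*" 0, and kill_jobs_by_name 'job1[*]' returns an error without calling the bkill command.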

-m host_name | -m host_group
Operates only on jobs that are dispatched to the specified host or host group.

If job_ID is not specified, only the most recently submitted qualifying job is operated on. The -m option is ignored if a job ID other than 0 is specified in the job_ID option. Use the bhosts and bmgroup commands to see information about hosts and host groups.

-q queue_name
Operates only on jobs in the specified queue.

If job_ID is not specified, only the most recently submitted qualifying job is operated on.

The -q option is ignored if a job ID other than 0 is specified in the job_ID option.

Use the bqueues command to see information about queues.

-s signal_value | signal_name
Sends the specified signal to specified jobs. You can specify either a name, stripped of the SIG prefix (such as KILL), or a number (such as 9).

Eligible UNIX signal names are listed by the bkill -l command.

The -s option cannot be used with the -r option.

Use the bkill -s command to suspend and resume jobs by using the appropriate signal instead of using the bstop or bresume command. Sending the SIGCONT signal is the same as using the bresume command.

Sending the SIGSTOP signal to sequential jobs or the SIGTSTP signal to parallel jobs is the same as using the bstop command.

You cannot suspend a job that is already suspended, or resume a job that is not suspended. Using the SIGSTOP or SIGTSTP signal on a job that is in the USUSP state has no effect. Using the SIGCONT signal on a job that is not in either the PSUSP or the USUSP state has no effect. Use the bjobs command to see information about job states.
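
The following sketch shows the suspend and resume signals as small wrappers. The function names are hypothetical; the signal names are the ones described above, written without the SIG prefix.

```shell
# Hypothetical wrappers around bkill -s. Each takes a job ID as $1.

# Suspend a sequential job (same effect as bstop).
suspend_job() {
    bkill -s STOP "$1"
}

# Suspend a parallel job (same effect as bstop).
suspend_parallel_job() {
    bkill -s TSTP "$1"
}

# Resume a suspended job (same effect as bresume).
resume_job() {
    bkill -s CONT "$1"
}
```

For example, suspend_job 1234 followed later by resume_job 1234 suspends and then resumes job 1234.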

Limited Windows signals are supported:
  • bkill -s 7 or bkill -s SIGKILL to terminate a job
  • bkill -s 16 or bkill -s SIGSTOP to suspend a job
  • bkill -s 15 to resume a job
-sla service_class_name
Operates on jobs that belong to the specified service class.

If a job ID is not specified, only the most recently submitted job is operated on.

Use the -sla option with the -g option to kill jobs in job groups that are attached to a service class.

The -sla option is ignored if a job ID other than 0 is specified in the job_ID option.

Use the bsla command to display the configuration properties of service classes that are configured in the lsb.serviceclasses file. The bsla command also shows the default SLA configured with the ENABLE_DEFAULT_EGO_SLA parameter in the lsb.params file, and dynamic information about the state of each service class.

-stat run | pend | susp
Kills large numbers of jobs in the specified status as soon as possible. Local pending jobs are killed immediately and cleaned up as soon as possible, ignoring the time interval specified by the CLEAN_PERIOD parameter in the lsb.params file. Jobs that are killed in this manner are not logged to the lsb.acct file.

Other jobs, such as running jobs, are killed as soon as possible and cleaned up normally.

The -stat option kills all applicable jobs and silently skips the jobs that LSF cannot kill.

When you use the -stat option, you do not need to specify a job ID or any of the -m, -q, -J, -g, -sla, or -app options.

The -stat run option kills all running jobs that you can kill.

The -stat pend option only works with three signals that are specified by the -s option: INT, KILL, or TERM.

The -stat option cannot be used with the -b option.

-u user_name | -u user_group | -u all
Operates only on jobs that are submitted by the specified user or user group, or by all users if the reserved user name all is specified. To specify a Windows user account, include the domain name in uppercase letters and use a single backslash (DOMAIN_NAME\user_name) in a Windows command line or a double backslash (DOMAIN_NAME\\user_name) in a UNIX command line.
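
A UNIX shell reduces an unquoted double backslash to a single backslash before the bkill command receives the argument, which is why the UNIX form doubles it. The following sketch builds the account name inside a function. The kill_windows_user_jobs function is a hypothetical helper.

```shell
# Hypothetical helper: kill all jobs of a Windows user account.
# $1: domain name (any case), $2: user name.
kill_windows_user_jobs() {
    # Convert the domain name to uppercase, as the -u option requires.
    domain=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    # Inside double quotes, \\ yields one literal backslash.
    bkill -u "$domain\\$2" 0
}
```

For example, kill_windows_user_jobs mydomain jdoe passes the single argument MYDOMAIN\jdoe to the -u option.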

If a job ID is not specified, only the most recently submitted qualifying job is operated on. The -u option is ignored if a job ID other than 0 is specified in the job_ID option.

job_ID ... | 0 | "job_ID[index]" ...
Operates only on jobs that are specified by job_ID or "job_ID[index]", where "job_ID[index]" specifies selected job array elements. Use the bjobs command to see the array elements for the job. For job arrays, quotation marks must enclose the job ID and index, and index must be enclosed in square brackets.

Kill an entire job array by specifying the job array ID instead of the job ID.
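
The quotation marks matter in a UNIX shell because square brackets are pattern characters. The following sketch shows both forms; the function names are hypothetical.

```shell
# Hypothetical helper: kill one element of a job array.
# $1: job array ID, $2: index.
kill_array_element() {
    # Quoting keeps the shell from treating [index] as a glob pattern.
    bkill "${1}[${2}]"
}

# Hypothetical helper: kill every element of a job array.
# $1: job array ID.
kill_array() {
    bkill "$1"
}
```

For example, kill_array_element 1234 5 runs bkill "1234[5]", and kill_array 1234 runs bkill 1234.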

Jobs that are submitted by any user can be specified here without using the -u option. If you use the reserved job ID 0, all the jobs that satisfy other options (that is, -m, -q, -u, and -J options) are operated on. All other job IDs are ignored.

The options -u, -q, -m, and -J have no effect if a job ID other than 0 is specified. Job IDs are returned at job submission time. Use the bjobs command to find the job IDs.

Any jobs or job arrays that are killed are logged in the lsb.acct file.

-h
Prints command usage to stderr and exits.
-V
Prints LSF release version to stderr and exits.

Examples

bkill -s 17 -q night
Sends signal 17 to the last job that was submitted by the invoker to queue night.
bkill -q short -u all 0
Kills all the jobs that are in the queue short.
bkill -r 1045
Forces the removal of unkillable job 1045.
bkill -sla Tofino 0
Kills all jobs that belong to the service class named Tofino.
bkill -g /risk_group 0
Kills all jobs in the job group /risk_group.
bkill -app fluent
Kills the most recently submitted job that is associated with the application profile fluent for the current user.
bkill -app fluent 0
Kills all jobs that are associated with the application profile fluent for the current user.
bkill -stat run
Kills all jobs that are in RUN status.

See also

bsub, bjobs, bqueues, bhosts, bresume, bapp, bsla, bstop, bgadd, bgdel, bjgroup, bparams, lsb.params, lsb.serviceclasses, mbatchd, kill, signal