brestart

Restarts checkpointed jobs.

Synopsis

brestart [bsub_options] [-f] checkpoint_dir [job_ID | "job_ID[index]"]
brestart [-h | -V]

Option List

-B
-f
-N | -Ne
-x
-a "esub_application[([argument[,argument...]])]..."
-b begin_time
-C core_limit
-c [hour:]minute[/host_name | /host_model]
-D data_limit
-E "pre_exec_command [argument ...]"
-F file_limit
-m "host_name[+[pref_level]] | host_group[+[pref_level]] ..."
-G user_group
-M mem_limit
-q "queue_name ..."
-R "res_req" [-R "res_req" ...]
-S stack_limit
-t term_time
-w 'dependency_expression' [-ti]
-W run_limit[/host_name| /host_model]
checkpoint_dir [job_ID | "job_ID[index]"]

Description

Restarts a checkpointed job by using the checkpoint files that are saved in the directory checkpoint_dir/last_job_ID/. Only jobs that are successfully checkpointed can be restarted.

Jobs are resubmitted and assigned a new job ID. The checkpoint directory is renamed by using the new job ID, checkpoint_dir/new_job_ID/.
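
Because the restarted job gets a new job ID, a wrapper script usually needs to capture that ID to find the renamed checkpoint directory. The following is a minimal sketch, not part of LSF. It assumes that brestart prints a bsub-style message such as "Job <456> is submitted to queue <normal>."; the helper names new_job_id and new_checkpoint_path and the path /share/chkpnt are hypothetical.

```shell
#!/bin/sh
# Read submission output on stdin and print the first job ID found.
# Assumption: the message has the bsub-style form
#   Job <456> is submitted to queue <normal>.
new_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p' | head -n 1
}

# Print the directory that holds the checkpoint files after the restart.
# $1 = checkpoint_dir, $2 = new job ID
new_checkpoint_path() {
    # Strip one trailing slash from checkpoint_dir, then append the job ID.
    printf '%s/%s\n' "${1%/}" "$2"
}
```

A possible use is id=$(brestart /share/chkpnt 123 | new_job_id), followed by new_checkpoint_path /share/chkpnt "$id". If the message format differs on your cluster, adjust the pattern.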

The file path of the checkpoint directory can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.
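
The length limit can be checked before the job is restarted. This sketch covers only the UNIX and Linux limit stated above; the function name chkpnt_path_ok and the variable MAX_CHKPNT_PATH are hypothetical.

```shell
#!/bin/sh
# Maximum path length for UNIX and Linux, as stated in this reference.
MAX_CHKPNT_PATH=4094

# Succeeds if the given path (directory plus file name) fits the limit.
# $1 = full path to check
chkpnt_path_ok() {
    [ "${#1}" -le "$MAX_CHKPNT_PATH" ]
}
```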

By default, jobs are restarted with the same output file and file transfer specifications, job name, window signal value, checkpoint directory and period, and rerun options as the original job.

A job can be restarted on another host if both hosts meet the following conditions:
  • Are binary compatible
  • Run the same OS version
  • Have access to the executable files
  • Have access to all open files (LSF must locate them with an absolute path name)
  • Have access to the checkpoint directory
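
Some of these conditions can be verified on a candidate host before the restart. The sketch below checks only two of them: access to the checkpoint directory and access to the executable file. It does not test binary compatibility, OS version, or open files. The function name restart_precheck is hypothetical.

```shell
#!/bin/sh
# Run on the host where the job is expected to restart.
# $1 = checkpoint_dir, $2 = last job ID, $3 = absolute path of the executable
restart_precheck() {
    chk="$1/$2"
    # The checkpoint files live in checkpoint_dir/last_job_ID.
    if [ ! -d "$chk" ] || [ ! -r "$chk" ]; then
        echo "checkpoint directory not accessible: $chk" >&2
        return 1
    fi
    # The job's executable file must be reachable from this host.
    if [ ! -x "$3" ]; then
        echo "executable not accessible: $3" >&2
        return 1
    fi
    return 0
}
```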

The environment variable LSB_RESTART is set to Y when a job is restarted.
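
A job script can test this variable to skip work that should run only on the first start. A minimal sketch follows; the function name startup_mode and the two mode strings are hypothetical.

```shell
#!/bin/sh
# Print "restart" when LSF restarted the job from a checkpoint,
# and "initial" otherwise (LSB_RESTART unset or not equal to Y).
startup_mode() {
    if [ "${LSB_RESTART:-}" = "Y" ]; then
        echo "restart"
    else
        echo "initial"
    fi
}
```

In a job script, a test such as [ "$(startup_mode)" = "initial" ] can guard one-time setup steps, for example creating scratch directories.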

LSF invokes the erestart executable file in the LSF_SERVERDIR directory to restart the job.

Only the bsub options that are listed here can be used with the brestart command.

Like the bsub command, the brestart command calls the parent esub file (the mesub file), which calls the executable file named esub (without .application) if it exists in the LSF_SERVERDIR directory. The mesub file also calls any mandatory esub executable files that are configured by an LSF administrator. Only esub executable files that are called by the bsub command can change the job environment on the submission host. An esub file that is called by the brestart command cannot change the job environment. Arguments for the esub executable files can also be modified.

You can use the brestart -R command to specify new resource requirements when you restart a checkpointable job. The new resource requirements must be for mem or swap. With the brestart command, you can also specify multiple -R options for multiple resource requirement strings, compound resource requirements, and alternative resource requirements.
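
The following sketch shows one way to restart a job with a new memory requirement. It assumes that the rusage[mem=...] string syntax of bsub -R is accepted here; the unit of the value depends on the cluster configuration. The function name restart_with_mem, the DRY_RUN variable, and the path /share/chkpnt are hypothetical. With DRY_RUN=1 the command line is printed instead of being run.

```shell
#!/bin/sh
# $1 = memory amount for rusage[mem=...], $2 = checkpoint_dir, $3 = job ID
restart_with_mem() {
    # Build the argument list once; the old $1..$3 are expanded before
    # set replaces the positional parameters.
    set -- brestart -R "rusage[mem=$1]" "$2" "$3"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show what would be run (shell quoting is not reproduced).
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

For example, DRY_RUN=1 restart_with_mem 2048 /share/chkpnt 123 prints brestart -R rusage[mem=2048] /share/chkpnt 123. When you type the command by hand, keep the quotation marks around the resource requirement string so that the shell does not interpret the square brackets.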

Options

The following option applies only to the brestart command.

-f
Forces the job to be restarted even if non-restartable conditions exist (these conditions are operating system specific).

See also

bsub, bjobs, bmod, bqueues, bhosts, bchkpnt, lsb.queues, echkpnt, erestart, mbatchd