brestart

Restarts checkpointed jobs.

Synopsis

brestart [bsub_options] [-f] checkpoint_dir [job_ID | "job_ID[index]"]
brestart [-h | -V]

Option List

-B
-f
-N | -Ne
-x
-a "esub_application[([argument[,argument...]])]..."
-b begin_time
-C core_limit
-c [hour:]minute[/host_name | /host_model]
-D data_limit
-E "pre_exec_command [argument ...]"
-F file_limit
-m "host_name[+[pref_level]] | host_group[+[pref_level]] ..."
-G user_group
-M mem_limit
-q "queue_name ..."
-R "res_req" [-R "res_req" ...]
-S stack_limit
-t term_time
-w 'dependency_expression' [-ti]
-W run_limit[/host_name | /host_model]
checkpoint_dir [job_ID | "job_ID[index]"]

Description

Restarts a checkpointed job by using the checkpoint files that are saved in the directory checkpoint_dir/last_job_ID/. Only jobs that are successfully checkpointed can be restarted.

The restarted job is resubmitted and assigned a new job ID. The checkpoint directory is renamed to match the new job ID: checkpoint_dir/new_job_ID/.
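
As a minimal sketch of the restart step, the following shell function checks that the checkpoint files for a job are present under checkpoint_dir/job_ID/ and then calls brestart with the two positional arguments from the synopsis. The function name restart_from_ckpt and the DRY_RUN variable are hypothetical and are not part of LSF.

```shell
# Hypothetical helper (not part of LSF):
#   restart_from_ckpt CHECKPOINT_DIR JOB_ID
# Set DRY_RUN=1 to print the brestart command instead of running it.
restart_from_ckpt() {
    ckpt_dir=$1
    job_id=$2

    # brestart reads the checkpoint files from checkpoint_dir/job_ID/.
    if [ ! -d "$ckpt_dir/$job_id" ]; then
        echo "no checkpoint files in $ckpt_dir/$job_id" >&2
        return 1
    fi

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "brestart $ckpt_dir $job_id"
    else
        brestart "$ckpt_dir" "$job_id"
    fi
}
```

Because the restarted job receives a new job ID and the checkpoint directory is renamed accordingly, any later restart has to pass the new job ID, not the original one.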

The file path of the checkpoint directory can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.
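
The length limit can be checked before a job is restarted. The sketch below assumes a hypothetical helper named ckpt_path_ok; it only compares the length of the path string against the documented limit and does not contact LSF.

```shell
# Hypothetical helper (not part of LSF):
#   ckpt_path_ok PATH_NAME [LIMIT]
# Succeeds if PATH_NAME is no longer than LIMIT characters.
# LIMIT defaults to 4094 (UNIX and Linux); pass 255 for Windows.
ckpt_path_ok() {
    path=$1
    limit=${2:-4094}

    # ${#path} expands to the length of the string in characters.
    [ "${#path}" -le "$limit" ]
}
```

For example, `ckpt_path_ok "$checkpoint_dir/$job_id/$file" || echo "path too long"` reports a path that exceeds the UNIX and Linux limit, where the three variables are placeholders for your own values.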

By default, jobs are restarted with the same output file and file transfer specifications, job name, window signal value, checkpoint directory and period, and rerun options as the original job.

A job can be restarted on another host if both hosts meet the following conditions:
  • Are binary compatible
  • Run the same OS version
  • Have access to the executable files
  • Have access to all open files (LSF must locate them with an absolute path name)
  • Have access to the checkpoint directory
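
Some of these conditions can be checked by hand on a candidate host. The following is a rough sketch only: the function host_can_restart is hypothetical, it covers just the three access conditions, and it does not verify binary compatibility or the OS version.

```shell
# Hypothetical pre-flight check (not part of LSF), run on the candidate host:
#   host_can_restart EXECUTABLE CHECKPOINT_DIR [OPEN_FILE ...]
host_can_restart() {
    exe=$1
    ckpt_dir=$2
    shift 2

    # The host needs access to the executable file.
    if [ ! -x "$exe" ]; then
        echo "cannot execute $exe" >&2
        return 1
    fi

    # The host needs access to the checkpoint directory.
    if [ ! -d "$ckpt_dir" ] || [ ! -r "$ckpt_dir" ]; then
        echo "cannot read checkpoint directory $ckpt_dir" >&2
        return 1
    fi

    # Open files must be given as absolute path names and be readable.
    for f in "$@"; do
        case $f in
            /*) ;;
            *) echo "not an absolute path: $f" >&2; return 1 ;;
        esac
        if [ ! -r "$f" ]; then
            echo "cannot read $f" >&2
            return 1
        fi
    done
    return 0
}
```

Running the same check on the original host and on the target host gives a quick indication of whether the access conditions hold on both.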

See also

bsub, bjobs, bmod, bqueues, bhosts, bchkpnt, lsb.queues, echkpnt, erestart, mbatchd