bmig

Migrates checkpointable or rerunnable jobs.

Synopsis

bmig [-f] [job_ID | "job_ID[index_list]"] ...
bmig [-f] [-J job_name]
[-m "host_name ..." | -m "host_group ..."] [-u user_name | -u user_group | -u all] [0]
bmig [-h | -V]

Description

Migrates one or more of your checkpointable or rerunnable jobs to a different host. You can migrate only running or suspended jobs. You cannot migrate pending jobs. Members of a chunk job in the WAIT state can be migrated. LSF removes waiting jobs from the job chunk and changes their original dispatch sequence.

By default, migrates the most recently submitted job, or the most recently submitted job that also satisfies other specified options (-u and -J). Specify 0 (zero) to migrate multiple jobs. Only LSF administrators and root can migrate jobs that are submitted by other users.

Both the original and the new hosts must have the following characteristics:
  • Be binary compatible
  • Run the same version of the operating system for predictable results
  • Have network connectivity and read/execute permissions to the checkpoint and restart executable files (in the LSF_SERVERDIR directory by default)
  • Have network connectivity and read/write permissions to the checkpoint directory and the checkpoint file
  • Have access to all files open during job execution so that LSF can locate them using an absolute path name

When you migrate a checkpointable job, LSF checkpoints and kills the job and then restarts the job on the next available host. If the checkpoint fails, the job continues to run on the original host. If you use the bmig command while a job is being checkpointed (for example, with periodic checkpointing enabled), LSF ignores the migration request.

When you migrate a rerunnable job, LSF kills the job and restarts it from the beginning on the next available host. LSF sets the environment variable LSB_RESTART to Y when a migrating job restarts or reruns.
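A job script can use LSB_RESTART to tell a first start from a restart or rerun after migration. The following is a minimal sketch; the function name start_mode and the printed strings are hypothetical and not part of LSF.

```shell
#!/bin/sh
# Hypothetical job-script helper: report whether this run follows a migration.
# LSF sets LSB_RESTART to Y when a migrating job restarts or reruns;
# the function name and the messages are illustrative only.
start_mode() {
    if [ "${LSB_RESTART:-}" = "Y" ]; then
        echo "restarted"
    else
        echo "fresh"
    fi
}

echo "Job start mode: $(start_mode)"
```

A rerunnable job might use such a check to clean up partial output before it starts again from the beginning.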

Note: The job owner does not receive notification when LSF kills a checkpointable or rerunnable job as part of job migration.

In LSF multicluster capability, you must use the brun command rather than the bmig command to move a job to another host.

When absolute job priority scheduling (APS) is configured in the queue, LSF always schedules migrated jobs before pending jobs. For migrated jobs, LSF keeps the existing job priority. If the LSB_REQUEUE_TO_BOTTOM and LSB_MIG2PEND parameters are configured in the lsf.conf file, the migrated jobs keep their APS information. The migrated jobs compete with other pending jobs based on the APS value. If you want to reset the APS value, you must use the brequeue command instead of the bmig command.

Options

-f
Forces a checkpointable job to be checkpointed and migrated, even if non-checkpointable conditions exist within the operating system environment.

job_ID | "job_ID[index_list]" | 0
Migrates jobs with the specified job IDs. When you specify job IDs other than 0 (zero), LSF ignores the -J and -u options.

If you specify a job ID of 0 (zero), LSF ignores all other job IDs and migrates all jobs that satisfy the -J and -u options.

If you do not specify a job ID, LSF migrates the most recently submitted job that satisfies the -J and -u options.
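The three cases for job ID operands can be summarized in a small sketch. The function selection_mode is a hypothetical model of the rules described here, not an LSF command, and the job IDs shown are made up.

```shell
#!/bin/sh
# Illustrative model of the documented selection rules; not part of LSF.
# Arguments: the job ID operands that would be passed to bmig (possibly none).
selection_mode() {
    if [ "$#" -eq 0 ]; then
        echo "most recent job matching -J/-u"
        return 0
    fi
    for id in "$@"; do
        if [ "$id" = "0" ]; then
            echo "all jobs matching -J/-u"
            return 0
        fi
    done
    echo "listed job IDs only; -J/-u ignored"
}

selection_mode                  # no job ID given
selection_mode 1234 "5678[2]"   # explicit IDs (hypothetical values)
selection_mode 1234 0           # 0 overrides the other job IDs
```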

-J job_name
Migrates the job with the specified name. Ignored if a job ID other than 0 (zero) is specified.

The job name can be up to 4094 characters long. Job names are not unique.

The wildcard character (*) can be used anywhere within a job name, but it cannot appear within an array index. For example, the pattern job* returns jobA and jobarray[1]. The *AAA*[1] pattern returns the first element in job arrays with names that contain AAA. However, the pattern job1[*] does not return anything since the wildcard is within the array index.
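When a job name pattern is typed in a UNIX shell, it is usually safest to quote it so that the shell passes the wildcard to bmig unchanged instead of expanding it against file names. This is a general shell consideration rather than something specific to LSF. The sketch below only prints the command lines; the show helper is hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: print each command line instead of executing it.
# On a real LSF host, the commands would be run directly without show.
show() { printf '%s\n' "$*"; }

show bmig -J "job*"        # matches names such as jobA and jobarray[1]
show bmig -J "*AAA*[1]"    # first element of arrays whose names contain AAA
```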

-m "host_name ..." | -m "host_group ..."
Migrates jobs to the specified hosts.

This option cannot be used on an LSF multicluster capability job. For such a job, the bmig command can restart or rerun the job only on the original host.

-u "user_name" | -u "user_group" | -u all
Migrates only those jobs that are submitted by the specified users. To specify a Windows user account, include the domain name in uppercase letters and use a single backslash (DOMAIN_NAME\user_name) in a Windows command line or a double backslash (DOMAIN_NAME\\user_name) in a UNIX command line.

If you specify the reserved user name all, LSF migrates jobs that are submitted by all users. Ignored if a job ID other than 0 (zero) is specified.
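The double backslash is needed on UNIX because the shell treats an unquoted backslash as an escape character. The following sketch shows what a POSIX shell actually passes to a command; DOMAIN_NAME and user_name are placeholders, and single-quoting is shown as an equivalent alternative based on general shell behavior.

```shell
#!/bin/sh
# Show the argument a command receives after the shell processes backslashes.
# DOMAIN_NAME and user_name are placeholders, not real account names.
unquoted=$(printf '%s' DOMAIN_NAME\\user_name)   # double backslash, unquoted
quoted=$(printf '%s' 'DOMAIN_NAME\user_name')    # single backslash, single-quoted

printf '%s\n' "$unquoted"
if [ "$unquoted" = "$quoted" ]; then
    echo "both forms pass the same argument"
fi

# On an LSF host, the corresponding command would look like:
#   bmig -u DOMAIN_NAME\\user_name 0
```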

-h
Prints command usage to stderr and exits.

-V
Prints LSF release version to stderr and exits.

See also

bsub, brestart, bchkpnt, bjobs, bqueues, bhosts, bugroup, mbatchd, lsb.queues, kill