Job migration

Procedure

You can migrate a MultiCluster job to another host if the job is rerunnable (submitted with bsub -r, or sent through a send-jobs queue that sets RERUNNABLE=yes) and is not checkpointable. You cannot choose the destination host. A migrated job returns to the submission cluster, where it is dispatched again with a new job ID.
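As a sketch of the two ways to make a job rerunnable, the fragment below shows a submission with bsub -r and a send-jobs queue definition that sets RERUNNABLE=yes. The queue name send_q, the remote queue recv_q@cluster2, and the command my_job are placeholders chosen for illustration, not values from this document.

```text
# Option 1: mark a single job rerunnable at submission time.
# send_q and my_job are placeholder names.
bsub -r -q send_q my_job

# Option 2: make every job in the send-jobs queue rerunnable.
# Fragment of lsb.queues in the submission cluster (placeholder names).
Begin Queue
QUEUE_NAME   = send_q
SNDJOBS_TO   = recv_q@cluster2
RERUNNABLE   = yes
End Queue
```

Either option is enough on its own. The job must also not be checkpointable for migration to another host to be possible.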

For more information on job migration, see Administering IBM Platform LSF.

User-specified job migration

Procedure

To migrate a job manually, run bmig in either the submission cluster or the execution cluster, and specify the job ID that identifies the job in the cluster where you run the command.

You cannot use bmig -m to specify a host.

Tip:

Running bmig in the execution cluster is more efficient than sending the bmig command through the submission cluster.
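The following is a minimal sketch of a manual migration, not a prescribed procedure. migrate_job is a hypothetical helper (it is not part of LSF) that accepts exactly one job ID, so a host cannot be passed along with it, and the DRY_RUN variable is likewise an assumption added so that the command can be previewed without an LSF environment. The job ID 1234 is a placeholder.

```shell
# Hypothetical wrapper around bmig for MultiCluster jobs.
# It takes exactly one argument (the job ID) so that no
# destination host can be supplied.
migrate_job() {
    if [ "$#" -ne 1 ]; then
        echo "usage: migrate_job <job_id>" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Preview only: print the command instead of running it.
        printf 'bmig %s\n' "$1"
    else
        # Ask LSF to migrate the job; LSF chooses the new host.
        bmig "$1"
    fi
}
```

For example, DRY_RUN=1 migrate_job 1234 prints the bmig command that would run, and migrate_job 1234 runs it. Use the job ID reported in the cluster where you run the helper.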

Automatic job migration

Procedure

  1. To enable automatic job migration, set the migration threshold (MIG in lsb.queues) in the receive-jobs queue.
  2. You can also set a migration threshold at the host level on the execution host (MIG in lsb.hosts).

    If both thresholds are set, the lower one applies to the job.

    Tip:

    Automatic job migration configured in the send-jobs queue does not affect MultiCluster jobs.
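The two thresholds described in the steps above might be configured as in the following sketch. The queue name recv_q, the cluster name cluster1, the host name hostA, and the threshold values are placeholders chosen for illustration. MIG is specified in minutes. The reduced Host section (HOST_NAME and MIG columns only) is an assumption made to keep the example short, so adapt it to the columns already present in your lsb.hosts file.

```text
# lsb.queues in the execution cluster: receive-jobs queue (placeholder names)
Begin Queue
QUEUE_NAME   = recv_q
RCVJOBS_FROM = cluster1
MIG          = 30
End Queue

# lsb.hosts in the execution cluster: host-level threshold (placeholder names)
Begin Host
HOST_NAME    MIG    # Keywords
hostA        10
End Host
```

With these placeholder values, a job from recv_q that runs on hostA is subject to the 10-minute threshold, because it is the lower of the two.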

Migration of checkpointable jobs

Procedure

Checkpointable MultiCluster jobs cannot be migrated to another host.

For these jobs, the migration action stops and checkpoints the job, then schedules it on the same host again.