Controlling mbatchd

Use the badmin reconfig, badmin mbdrestart, badmin mbdrestart -C, and bctrld stop sbd commands to control the mbatchd daemon.

Procedure

Use the badmin command to control the mbatchd daemon.

Reconfiguring mbatchd

About this task

If you add a host to a host group or to a queue, or change the resource configuration in the Hosts section of the lsf.cluster.cluster_name file, jobs that were submitted before you reconfigured do not recognize the change.

If you want the new host to be recognized, you must restart the mbatchd daemon (or, if you are using live reconfiguration, add the host by using the bconf command).

Procedure

Run the badmin reconfig command.

Results

When you reconfigure the cluster, mbatchd does not restart. Only configuration files are reloaded.

Restarting mbatchd

Procedure

Run the badmin mbdrestart command.

LSF checks configuration files for errors and prints the results to stderr. If no errors are found, LSF runs the following tasks:

  • Reload configuration files
  • Restart the mbatchd daemon
  • Reread events in the lsb.events file and replay them to recover the running state of the previous instance of the mbatchd daemon

Results

Tip: Whenever LSF restarts the mbatchd daemon, mbatchd is not available for service requests. In large clusters with many events in the lsb.events file, restarting the mbatchd daemon can take some time. To avoid replaying events in the lsb.events file, use the badmin reconfig command.
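
The choice that the tip describes can be captured in a small wrapper. The following is a minimal sketch, not a tool that LSF provides: the function name apply_config_change and its reload and restart arguments are hypothetical, and only the badmin reconfig and badmin mbdrestart commands come from the procedures above.

```shell
#!/bin/sh
# Hypothetical helper: choose between reloading and restarting mbatchd.
#   reload  -> badmin reconfig    (mbatchd keeps running, no event replay)
#   restart -> badmin mbdrestart  (mbatchd restarts and replays lsb.events)
apply_config_change() {
    mode="${1:-reload}"   # default to the cheaper operation
    case "$mode" in
        reload)  badmin reconfig ;;
        restart) badmin mbdrestart ;;
        *)
            echo "usage: apply_config_change [reload|restart]" >&2
            return 2
            ;;
    esac
}
```

For example, apply_config_change restart runs badmin mbdrestart, while calling the function with no argument runs badmin reconfig. Defaulting to reload reflects the tip: a restart is requested only when it is explicitly needed.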

Logging a comment when you restart mbatchd

Procedure

  1. Use the -C option of the badmin mbdrestart command to log an administrator comment in the lsb.events file.

    For example, to add "Configuration change" as a comment to the lsb.events file, run the following command:

    badmin mbdrestart -C "Configuration change"

    The comment text "Configuration change" is recorded in the lsb.events file.

  2. Run the badmin hist or badmin mbdhist command to display administrator comments for the mbatchd daemon restart.
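
The two steps can be combined in one function. This is a sketch under stated assumptions: restart_with_comment is a hypothetical name, and the sketch assumes that badmin mbdhist can be run immediately after the restart request. Because mbatchd is unavailable for service requests while it restarts, the history command might need to be rerun later.

```shell
#!/bin/sh
# Hypothetical helper: restart mbatchd with an administrator comment,
# then display the mbatchd history so that the comment can be checked.
restart_with_comment() {
    comment="$1"
    if [ -z "$comment" ]; then
        echo "usage: restart_with_comment \"comment text\"" >&2
        return 2
    fi
    # Quote the comment so that multi-word text is passed as one argument.
    badmin mbdrestart -C "$comment" || return 1
    badmin mbdhist
}
```

For example, restart_with_comment "Configuration change" runs the same badmin mbdrestart -C command as step 1 and then runs badmin mbdhist. The function refuses an empty comment so that a restart is never logged without an explanation.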

Shutting down mbatchd

Procedure

  1. Run the bctrld stop sbd command to shut down the sbatchd daemon on the management host.

    For example, to shut down the sbatchd daemon on the hostD host, run the following command:

    bctrld stop sbd hostD

  2. Run the badmin mbdrestart command:

    badmin mbdrestart

    Running this command causes the mbatchd and mbschd daemons to exit. The mbatchd daemon is not restarted because the sbatchd daemon, which starts it, is shut down. All LSF services are temporarily unavailable, but existing jobs are not affected. When the sbatchd daemon later starts up the mbatchd daemon, the previous status of the mbatchd daemon is restored from the event log file and job scheduling continues.
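
The order of the two steps matters, which a short wrapper can enforce. The following is an illustrative sketch only: shutdown_mbatchd is a hypothetical function name, and the host argument is assumed to be the management host, as in the hostD example.

```shell
#!/bin/sh
# Hypothetical helper: shut down mbatchd by stopping sbatchd on the
# management host first, then making mbatchd exit.
shutdown_mbatchd() {
    mgmt_host="$1"
    if [ -z "$mgmt_host" ]; then
        echo "usage: shutdown_mbatchd management_host" >&2
        return 2
    fi
    # Step 1: stop sbatchd so that nothing starts mbatchd again afterward.
    # If this fails, stop here: step 2 alone would only restart mbatchd.
    bctrld stop sbd "$mgmt_host" || return 1
    # Step 2: mbatchd and mbschd exit and stay down until sbatchd starts.
    badmin mbdrestart
}
```

For example, shutdown_mbatchd hostD runs bctrld stop sbd hostD followed by badmin mbdrestart. Stopping at a failed first step avoids an ordinary restart when a shutdown was intended.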