Performance enhancements

The following enhancements affect performance.

File sanity check for LSF Data Manager jobs moved to the transfer job

The bsub and bmod commands no longer perform the file sanity check for jobs with data requirements. This work has moved to the transfer job. The sanity check verifies that the files or folders exist and that the user can access them, discovers the size and modification time of the files or folders, and generates the hash. This change equalizes the performance of submitting and modifying jobs with and without data requirements. If needed, use the new -datachk option with the bsub or bmod command to perform full checking for jobs with a data requirement. The -datachk option can be specified only together with the -data option. If the data requirement is for a tag, the -datachk option has no effect.
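The option rule above can be sketched as a small argument builder. This is only an illustration: build_bsub_args is a hypothetical helper, not part of LSF, and the file path and job command are made up. The sketch assembles an argument list and does not run bsub.

```python
from typing import List, Optional


def build_bsub_args(command: List[str],
                    data_req: Optional[str] = None,
                    datachk: bool = False) -> List[str]:
    """Assemble a bsub argument list (hypothetical helper, not part of LSF)."""
    if datachk and data_req is None:
        # Per the text above, -datachk is valid only together with -data.
        raise ValueError("-datachk requires a -data requirement")
    args = ["bsub"]
    if data_req is not None:
        args += ["-data", data_req]
    if datachk:
        # Ask bsub to perform the full file check at submission time.
        args.append("-datachk")
    return args + command


if __name__ == "__main__":
    # Illustrative path and job command; both are assumptions for this sketch.
    print(build_bsub_args(["./myjob.sh"],
                          data_req="/proj/input/data.bin",
                          datachk=True))
```

Without -datachk, the same builder produces a submission that defers all file checking to the transfer job.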

Regardless of whether -datachk is specified, the bsub and bmod commands no longer provide the file or folder size, modification time, and hash to the mbatchd daemon. As a result, a new transfer job might need to be started for each file or folder that is requested by each new job with a data requirement.

To limit the number of transfer jobs, LSF introduces a new parameter, CACHE_REFRESH_INTERVAL, in the lsf.datamanager configuration file. If multiple requests for the same file or folder reach LSF Data Manager within the configured interval, only the first request results in a new transfer job. This behavior assumes that the requested file or folder did not change during the interval and was not cleaned from the staging area.
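The effect of the interval can be pictured with a toy model. This is a minimal sketch and not LSF code: the TransferScheduler class, its method names, the use of seconds as the time unit, and the treatment of a request that arrives exactly at the interval boundary are all assumptions made for illustration.

```python
from typing import Dict


class TransferScheduler:
    """Toy model of interval-based request coalescing (not LSF code)."""

    def __init__(self, refresh_interval: float) -> None:
        # Stand-in for CACHE_REFRESH_INTERVAL; the unit here is an assumption.
        self.refresh_interval = refresh_interval
        # Time of the request that last started a transfer, keyed by path.
        self._last_transfer: Dict[str, float] = {}
        self.transfers_started = 0

    def request(self, path: str, now: float) -> bool:
        """Return True if this request starts a new transfer job."""
        last = self._last_transfer.get(path)
        if last is not None and now - last < self.refresh_interval:
            # Within the interval: assume the file is unchanged and reuse it.
            return False
        self._last_transfer[path] = now
        self.transfers_started += 1
        return True


if __name__ == "__main__":
    sched = TransferScheduler(refresh_interval=900)
    for t in (0, 10, 899, 900):
        print(t, sched.request("/proj/input/data.bin", now=t))
```

In this model, a longer interval means fewer transfer jobs but a longer window in which a changed source file goes unnoticed, which is the trade-off that the assumption in the text describes.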

LSF Data Manager transfer script directory

The LSF Data Manager transfer scripts are now located in the LSF_SERVERDIR directory. You can further modify these transfer scripts for your environment.

  • dm_stagein_transfer.sh: The stage-in transfer script.
  • dm_stagein_helper.sh: The helper script, which the stage-in transfer script invokes on the source host by using ssh. The helper script contains most of the operations that must be executed on the remote host, which reduces the number of times that ssh is used.
  • dm_stageout_transfer.sh: The stage-out transfer script.

Previously, LSF Data Manager created the transfer scripts on demand.