Manage I/O performance of the info directory
In large clusters, the large number of jobs results in a large number of job files stored in the LSB_SHAREDIR/cluster_name/logdir/info directory at any time. When the total size of the job files reaches a certain point, you will notice a significant delay when performing I/O operations in the info directory because of file server directory limits, which depend on the file system implementation.
About this task
By dividing the total file size of the info directory among subdirectories, your cluster can process more job operations before reaching the total size limit of the job files.
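Procedure
Set MAX_INFO_DIRS in lsb.params to the number of subdirectories that mbatchd should create under the info directory.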
Example
MAX_INFO_DIRS=10
mbatchd creates ten subdirectories from LSB_SHAREDIR/cluster_name/logdir/info/0 to LSB_SHAREDIR/cluster_name/logdir/info/9.
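The naming scheme in the example above can be sketched in a few lines of Python. This is only an illustration of the resulting directory layout, not of how mbatchd is implemented. The function name info_subdirs, the share directory /shared/lsf/work, and the cluster name cluster1 are hypothetical values chosen for the example.

```python
from pathlib import PurePosixPath


def info_subdirs(sharedir, cluster_name, max_info_dirs):
    """List the info subdirectories numbered 0 .. max_info_dirs - 1.

    Mirrors the layout LSB_SHAREDIR/cluster_name/logdir/info/<n>
    described in the example; it does not touch the file system.
    """
    base = PurePosixPath(sharedir) / cluster_name / "logdir" / "info"
    return [str(base / str(n)) for n in range(max_info_dirs)]


# Hypothetical values: the share directory and cluster name are made up.
paths = info_subdirs("/shared/lsf/work", "cluster1", 10)
print(paths[0])   # first subdirectory, ends in info/0
print(paths[-1])  # last subdirectory, ends in info/9
```

A listing like this can be useful for checking that every expected subdirectory is present after you change the setting.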
Configure a job information directory
Job file I/O operations can affect cluster performance when there are millions of jobs in an LSF cluster. To improve cluster performance, you can set LSB_JOBINFO_DIR to a directory on a high-performance I/O file system. This directory is separate from the LSB_SHAREDIR directory in lsf.conf. LSF accesses the directory to get the job information files. If the directory does not exist, mbatchd tries to create it. If that fails, mbatchd exits.
The LSB_JOBINFO_DIR directory must be:
- Owned by the primary LSF administrator
- Accessible from all hosts that can potentially become the management host
- Accessible from the management host with read and write permission
- Set to permission 700
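Some of the requirements above can be checked on a single host before you point LSB_JOBINFO_DIR at a directory. The following is a minimal sketch, not an LSF tool. The function check_jobinfo_dir is hypothetical, and the script assumes that it runs as the primary LSF administrator on the management host. It cannot confirm that the directory is reachable from other management host candidates, so run it on each candidate host or verify that separately.

```python
import os
import stat
import tempfile


def check_jobinfo_dir(path, admin_uid):
    """Return a list of problems found with a candidate job info directory."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return [f"{path} does not exist"]
    if not stat.S_ISDIR(st.st_mode):
        return [f"{path} is not a directory"]

    problems = []
    # Ownership: must belong to the primary LSF administrator.
    if st.st_uid != admin_uid:
        problems.append(f"owner uid is {st.st_uid}, expected {admin_uid}")
    # Permission bits: exactly 700 (owner rwx, nothing for group or others).
    mode = stat.S_IMODE(st.st_mode)
    if mode != 0o700:
        problems.append(f"mode is {mode:o}, expected 700")
    # Read and write access for the user running this check on this host.
    if not os.access(path, os.R_OK | os.W_OK):
        problems.append("not readable and writable from this host")
    return problems


# Demonstration on a throwaway directory instead of a real LSF path.
demo_dir = tempfile.mkdtemp()
os.chmod(demo_dir, 0o700)
ok_result = check_jobinfo_dir(demo_dir, os.getuid())

os.chmod(demo_dir, 0o755)
bad_result = check_jobinfo_dir(demo_dir, os.getuid())
print(ok_result, bad_result)

os.rmdir(demo_dir)
```

An empty list means that none of the local checks failed. The check compares against the uid passed in, so supply the uid of the primary LSF administrator when you run it as a different user.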