Achieve efficient event switching
About this task
Periodic switching of the events file can degrade the performance of mbatchd, which automatically backs up and rewrites the events file after every 1000 batch job completions. The old lsb.events file is moved to lsb.events.1, and each old lsb.events.n file is moved to lsb.events.n+1.
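The renaming scheme described above can be sketched in Python. This is only an illustration of the naming pattern, not mbatchd's implementation: the function name rotation_plan is hypothetical, and the ordering (highest suffix first, so that no file is overwritten) is an assumption.

```python
import re


def rotation_plan(existing):
    """Return (old_name, new_name) pairs for one events file switch.

    Assumption: renames run from the highest suffix down, so that
    lsb.events.n never overwrites an lsb.events.n+1 that still exists.
    """
    pattern = re.compile(r"^lsb\.events\.(\d+)$")
    # Collect the numeric suffixes of the backed-up files, highest first.
    suffixes = sorted(
        (int(m.group(1)) for name in existing if (m := pattern.match(name))),
        reverse=True,
    )
    plan = [(f"lsb.events.{n}", f"lsb.events.{n + 1}") for n in suffixes]
    # Finally, the current events file becomes lsb.events.1.
    if "lsb.events" in existing:
        plan.append(("lsb.events", "lsb.events.1"))
    return plan


if __name__ == "__main__":
    for old, new in rotation_plan(["lsb.events", "lsb.events.1", "lsb.events.2"]):
        print(f"{old} -> {new}")
```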
Procedure
MAX_JOB_NUM specifies the number of batch jobs to complete before lsb.events is backed up and moved to lsb.events.1. The default value is 1000.
MIN_SWITCH_PERIOD controls how frequently mbatchd checks the number of completed batch jobs.
The two parameters work together. Specify the MIN_SWITCH_PERIOD value in seconds.
For large clusters, set MIN_SWITCH_PERIOD to a value equal to or greater than 600. This causes mbatchd to fork a child process that handles event switching, thereby reducing the load on mbatchd. mbatchd terminates the child process and appends delta events to the new events file after the MIN_SWITCH_PERIOD has elapsed. If you define a value less than 600 seconds, mbatchd does not fork a child process for event switching.
Example
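A minimal sketch of the settings discussed in this example, assuming they are defined in the Parameters section of lsb.params (this section does not state the file name or the Begin/End layout); the values are the ones the example refers to:

```
Begin Parameters
MAX_JOB_NUM = 1000
MIN_SWITCH_PERIOD = 7200
End Parameters
```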
With MAX_JOB_NUM=1000 and MIN_SWITCH_PERIOD=7200, mbatchd checks every two hours whether the events file has logged 1000 batch job completions. Together, the two parameters control the frequency of events file switching as follows:
After two hours, mbatchd checks the number of completed batch jobs. If 1000 completed jobs have been logged (MAX_JOB_NUM=1000), it starts a new event log file. The old event log file is saved as lsb.events.n, with subsequent sequence number suffixes incremented by 1 each time a new log file is started. Event logging continues in the new lsb.events file.
If 1000 jobs complete after five minutes, mbatchd does not switch the events file until the end of the two-hour period (MIN_SWITCH_PERIOD=7200).
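The interaction of the two parameters can be modeled with a short Python sketch. It only mirrors the rules stated in this topic and is not mbatchd's actual logic; the function and argument names (should_switch, uses_child_process, seconds_since_last_check) are hypothetical.

```python
# Threshold from the text: below 600 seconds, mbatchd does not fork
# a child process for event switching.
FORK_THRESHOLD_SECONDS = 600


def uses_child_process(min_switch_period):
    """True if event switching would be handled by a forked child process."""
    return min_switch_period >= FORK_THRESHOLD_SECONDS


def should_switch(completed_jobs, seconds_since_last_check,
                  max_job_num=1000, min_switch_period=7200):
    """Model the switch decision described in the example.

    The completed-job count is only examined once MIN_SWITCH_PERIOD
    has elapsed; the switch happens if MAX_JOB_NUM has been reached.
    """
    period_elapsed = seconds_since_last_check >= min_switch_period
    return period_elapsed and completed_jobs >= max_job_num


if __name__ == "__main__":
    # 1000 jobs completed after five minutes: no switch yet.
    print(should_switch(1000, 5 * 60))
    # Same count at the end of the two-hour period: switch.
    print(should_switch(1000, 2 * 60 * 60))
```

In this model, a high job completion rate alone never triggers a switch; the period acts as a lower bound on the interval between switches.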