LSB_RC_EXTERNAL_HOST_ABNORMAL_TIME
Allows the LSF resource connector to control the number of minutes that LSF waits before timing out resource connector hosts that are in an abnormal status (that is, closed_LIM, unavail, or unreach status).
Syntax
LSB_RC_EXTERNAL_HOST_ABNORMAL_TIME=soft_timeout_integer:hard_timeout_integer
Description
The LSB_RC_EXTERNAL_HOST_ABNORMAL_TIME parameter sets the timeout values (in minutes) and LSF behavior when an LSF resource connector host reaches its soft and hard timeout limits. Specify integers for the values, and separate the values with a colon (:).
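As a rough illustration of the soft:hard format described above, the following Python sketch splits a value into its two integer parts. The function name parse_abnormal_time is hypothetical and is not part of LSF; the sketch assumes both values are present and does not reproduce whatever validation LSF itself performs.

```python
from typing import Tuple


def parse_abnormal_time(value: str) -> Tuple[int, int]:
    """Split a 'soft:hard' string into (soft_minutes, hard_minutes).

    Hypothetical helper for illustration only; it is not an LSF API.
    Assumes both values are given and separated by a single colon.
    """
    soft_text, sep, hard_text = value.partition(":")
    if not sep:
        raise ValueError("expected the form soft_timeout:hard_timeout")
    # int() raises ValueError if either part is not an integer.
    return int(soft_text), int(hard_text)


soft, hard = parse_abnormal_time("15:30")
print(soft, hard)
```

With the value 15:30, the soft timeout is 15 minutes and the hard timeout is 30 minutes.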
The value before the colon is the soft timeout. When a host reaches its soft timeout and there are still jobs on it, the mbatchd daemon logs a warning message. Cluster administrators can monitor for this log message and take immediate action, such as checking and restarting the LSF daemons on the host. If there are no jobs on the host when the host reaches the soft timeout, LSF relinquishes the host. The default soft timeout is 10 minutes.
The value after the colon is the hard timeout. When a host reaches its hard timeout, LSF relinquishes the host, and jobs on the host are requeued or exited. The default hard timeout is 60 minutes.
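The soft and hard timeout behavior described above can be summarized as a small decision function. This is a simplified model under stated assumptions, not the way LSF implements it: the function name, the action labels, and the idea of checking the elapsed abnormal time at a single instant are all hypothetical.

```python
def action_for_abnormal_host(minutes_abnormal: int,
                             jobs_on_host: int,
                             soft: int = 10,
                             hard: int = 60) -> str:
    """Return an illustrative action label for a host in abnormal status.

    Hypothetical model of the documented behavior; the labels are
    invented for this sketch and do not appear in LSF.
    """
    if minutes_abnormal >= hard:
        # Hard timeout: the host is relinquished and any jobs on it
        # are requeued or exited.
        return "relinquish_host_and_requeue_or_exit_jobs"
    if minutes_abnormal >= soft:
        if jobs_on_host > 0:
            # Soft timeout with jobs still present: mbatchd logs a warning.
            return "log_warning"
        # Soft timeout with no jobs: the host is relinquished.
        return "relinquish_host"
    # Neither timeout reached yet.
    return "keep_waiting"


# With the defaults (10:60), a host with running jobs that has been
# abnormal for 20 minutes only triggers the warning.
print(action_for_abnormal_host(20, jobs_on_host=3))
```

The model checks the hard timeout first so that a host past both limits is always relinquished, regardless of whether jobs remain on it.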
Example
LSB_RC_EXTERNAL_HOST_ABNORMAL_TIME=15:30
Default
10:60 (soft timeout of 10 minutes and hard timeout of 60 minutes)