Additional configuration settings
This topic describes additional configuration settings that you can use as an example for your own cluster. Some of the following parameters are explicitly set to the default value, but you can edit these to suit your own cluster.
lsf.conf settings
The following is a summary of example lsf.conf configuration settings. Any settings that are commented out with a hash (#) require further adjustment or consideration before you enable them.
EGO_DEFINE_NCPUS=cores # Default value set explicitly
EGO_PIM_SWAP_REPORT=Y
LSB_DISPLAY_YEAR=Y
LSB_ENABLE_PERF_METRICS_LOG=Y
LSB_SHARE_LOCATION_ENH=Y
LSB_SUB_COMMANDNAME=Y
LSF_ACCEPT_NUMCLIENTS=6 # Default value set explicitly
LSF_DISCARD_LOG=Y # Default value set explicitly
#LSF_GPU_RESOURCE_IGNORE=Y
LSF_INTELLIGENT_CPU_BIND=Y
#LSF_LINUX_CGROUP_ACCT=Y
LSF_LOG_QUEUE_SIZE=100000
LSF_LSLOGIN_SSH=Y
LSF_PROCESS_TRACKING=Y
LSF_RSH="ssh -o 'PasswordAuthentication no' -o 'StrictHostKeyChecking no'"
LSF_STRICT_CHECKING=Y # Only change when the cluster is completely down
#LSF_STRIP_DOMAIN=.example.com:.domain.example.com
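The convention used in the listing above (active settings, settings disabled with a leading hash, and trailing comments) can be sorted mechanically before you review a file. The following Python sketch is only an illustration: it is not LSF's own parser, and the function names and the matching pattern are hypothetical.

```python
import re

# Hypothetical pattern: NAME=value, optionally disabled with a leading '#'.
SETTING = re.compile(r"^(#?)\s*([A-Z][A-Z0-9_]*)=(.*)$")


def strip_trailing_comment(value):
    """Drop a trailing '# comment' unless the '#' sits inside quotes."""
    quote = None
    for index, char in enumerate(value):
        if quote:
            if char == quote:
                quote = None  # closing quote of the outermost quoted span
        elif char in "\"'":
            quote = char
        elif char == "#" and (index == 0 or value[index - 1].isspace()):
            return value[:index].rstrip()
    return value.rstrip()


def split_settings(listing):
    """Return (active, needs_review) dicts from a listing of NAME=value lines."""
    active, needs_review = {}, {}
    for raw in listing.splitlines():
        match = SETTING.match(raw.strip())
        if not match:
            continue  # blank line or free-text comment
        disabled, name, value = match.groups()
        target = needs_review if disabled else active
        target[name] = strip_trailing_comment(value)
    return active, needs_review
```

Settings returned in the second dictionary are the ones that were commented out and still need adjustment or consideration before they are enabled.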
Take note before configuring the following parameters:
- LSF_LINUX_CGROUP_ACCT=Y
- Linux only. The cgroups functions are supported on x86_64 and PowerPC Linux platforms with kernel version 2.6.24 or later (for example, Red Hat 6.2 and later, or SUSE 11 patch 1 and later). You must also set up the freezer, cpuacct, and memory subsystems on each host in the cluster that supports cgroups.
For RHEL 6.2 or later, see Process tracking through cgroups for more details. For example, you must update /cgroup/* and then /etc/fstab.
For RHEL 7, see the currently mounted resource controllers in /proc/cgroups.
- LSF_STRICT_CHECKING=Y
- Important: Do not change this parameter unless you shut down the entire cluster.
- LSF_STRIP_DOMAIN=.example.com:.domain.example.com
- Optional. If all of the hosts in your cluster are reachable by using short host names, you can configure LSF to use the short host names by specifying the portion of the domain name to remove. If your hosts are in more than one domain or have more than one domain name, you can specify more than one domain suffix to remove, separated by a colon (:).
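Before you enable LSF_LINUX_CGROUP_ACCT, it can help to confirm that a host reports the freezer, cpuacct, and memory subsystems that the note above requires. The following Python sketch parses text in the /proc/cgroups layout (subsystem name, hierarchy, number of cgroups, enabled flag). It is a minimal illustration, not the check that LSF itself performs, and the function names are hypothetical.

```python
REQUIRED_SUBSYSTEMS = ("freezer", "cpuacct", "memory")


def enabled_subsystems(proc_cgroups_text):
    """Return the set of subsystem names whose 'enabled' column is 1."""
    enabled = set()
    for line in proc_cgroups_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header row
        fields = line.split()
        # Expected columns: subsys_name hierarchy num_cgroups enabled
        if len(fields) >= 4 and fields[3] == "1":
            enabled.add(fields[0])
    return enabled


def missing_subsystems(proc_cgroups_text, required=REQUIRED_SUBSYSTEMS):
    """Return the required subsystems that are not reported as enabled."""
    enabled = enabled_subsystems(proc_cgroups_text)
    return [name for name in required if name not in enabled]
```

On a Linux host, the input can be the contents of /proc/cgroups, for example `missing_subsystems(open("/proc/cgroups").read())`. An empty list means that all three subsystems are reported as enabled. It does not confirm that they are mounted or configured in the way that LSF expects.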
lsb.params settings
The following is a summary of lsb.params configuration settings. Any settings that are commented out with a hash (#) require further adjustment or consideration before you enable them.
# Verify the existing setting
JOB_SCHEDULING_INTERVAL=0
#JOB_SCHEDULING_INTERVAL=250ms
ACCT_ARCHIVE_AGE=30 # 30 days
ACCT_ARCHIVE_SIZE=10000000 # 10 GB
#ACCT_ARCHIVE_TIME=23:55 # Enables archiving, performed @ 11:55PM
DEFAULT_PROJECT=default # Default value set explicitly
DEFAULT_RESREQ_ORDER=r15s:pg # Default value set explicitly
#DEFAULT_USER_GROUP=staff
ENABLE_DIAGNOSE=query
ENABLE_HOST_INTERSECTION=Y
ENFORCE_UG_TREE=Y
#EVALUATE_JOB_DEPENDENCY=100
EXTEND_JOB_EXCEPTION_NOTIFY=Y
JOB_DISTRIBUTE_ON_HOST=any # Default value set explicitly
JOB_INFO_EVENT_DUMP_INTERVAL=15 # Default value set explicitly
JOB_INFO_MEMORY_CACHE_SIZE=1024 # Default value set explicitly
LOCAL_MAX_PREEXEC_RETRY=5
LOCAL_MAX_PREEXEC_RETRY_ACTION=EXIT
MAX_ACCT_ARCHIVE_FILE=10
MAX_JOB_REQUEUE=5
ORPHAN_JOB_TERM_GRACE_PERIOD=60
#PREEMPT_FOR=LEAST_RUN_TIME
RELAX_JOB_DISPATCH_ORDER=ALLOC_REUSE_DURATION[0 30]
SCHED_METRIC_ENABLE=Y
SIMPLIFIED_GUARANTEE=Y
Take note before configuring the following parameters:
- DEFAULT_PROJECT
- If a user submits a job without specifying a project name and the LSB_DEFAULTPROJECT environment variable is not set, LSF automatically assigns the job to the project that this parameter specifies.
- DEFAULT_USER_GROUP=staff
- When DEFAULT_USER_GROUP is defined, all submitted jobs must be associated with a user group. Jobs that do not have a specified user group are associated with default_user_group, where default_user_group is a group that is configured in the lsb.users file and contains all as a direct member. DEFAULT_USER_GROUP can only contain one user group.
- EVALUATE_JOB_DEPENDENCY=100
- Sets the maximum number of job dependencies that mbatchd evaluates in one scheduling cycle. LSF evaluates all dependent jobs every 10 minutes regardless of this configuration.
- JOB_SCHEDULING_INTERVAL=0
- Minimum interval between scheduling cycles. Specify the value in seconds or in milliseconds with the ms keyword.
- SCHED_METRIC_ENABLE=Y
- Enables scheduling metric collection. You can also specify a value for the SCHED_METRIC_SAMPLE_PERIOD parameter, or leave it undefined to use the default value of 60 seconds.
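The JOB_SCHEDULING_INTERVAL note above says that the value is given in seconds, or in milliseconds with the ms keyword. The following Python sketch shows one way to read such a value as seconds, for example to compare the active setting with the commented-out 250ms alternative. The function name is hypothetical, and this is not how LSF parses the parameter internally.

```python
def scheduling_interval_seconds(value):
    """Convert a JOB_SCHEDULING_INTERVAL-style value to seconds.

    A bare number is treated as seconds; a number followed by 'ms'
    is treated as milliseconds.
    """
    text = value.strip().lower()
    if text.endswith("ms"):
        seconds = float(text[:-2].strip()) / 1000.0
    else:
        seconds = float(text)
    if seconds < 0:
        raise ValueError("interval must not be negative: %r" % value)
    return seconds
```

With this helper, `scheduling_interval_seconds("250ms")` returns 0.25 and `scheduling_interval_seconds("0")` returns 0.0.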