New and changed configuration parameters and environment variables

The following configuration parameters and environment variables are new or changed for LSF 10.1.

Note: Starting in Fix Pack 14, instead of reading this topic, refer to the What's new information for each fix pack (which contains new and changed details, including direct references to affected topics).

Files that use automatic time-based configuration

All files in which you specify automatic time-based configurations with if-else constructs now allow you to specify time zones whenever you specify time windows, including the lsb.applications, lsb.hosts, lsb.params, lsb.queues, and lsb.resources files. LSF supports all standard time zone abbreviations. If you do not specify the time zone, LSF uses the local system time zone.
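
To see why an explicit time zone matters for a time window, the following Python sketch evaluates the same window in two zones. It is an illustration only and not how LSF evaluates windows: the in_window function, the small TZ_OFFSETS table of fixed offsets, and the half-open window boundaries are all assumptions made for this example.

```python
from datetime import datetime, time, timedelta, timezone

# Hypothetical table mapping a few abbreviations to fixed UTC offsets in hours.
TZ_OFFSETS = {"UTC": 0, "EST": -5, "EDT": -4, "CET": 1}

def in_window(now_utc, start, end, tz_abbrev):
    """Return True if now_utc falls inside [start, end) in the named zone."""
    zone = timezone(timedelta(hours=TZ_OFFSETS[tz_abbrev]))
    local = now_utc.astimezone(zone).time()
    if start <= end:
        return start <= local < end
    # The window wraps past midnight, for example 19:00-07:00.
    return local >= start or local < end

# 20:00 UTC is 16:00 in EDT but 21:00 in CET.
now = datetime(2024, 7, 1, 20, 0, tzinfo=timezone.utc)
print(in_window(now, time(8, 30), time(18, 30), "EDT"))
print(in_window(now, time(8, 30), time(18, 30), "CET"))
```

The same clock-time window is active in one zone and inactive in the other, which is the ambiguity that naming the time zone in the window removes.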

install.config

  • ENABLE_CGROUP: In LSF Version 10.1 Fix Pack 2, this new parameter enables LSF to track processes' CPU and memory accounting based on Linux cgroup memory and cpuacct subsystems. If set to Y, the installer sets parameters in the lsf.conf file to enable these functions in LSF.
  • ENABLE_GPU: In LSF Version 10.1 Fix Pack 2, this new parameter enables LSF to support GPUs so that applications can use GPU resources in a Linux environment. LSF supports parallel jobs that require GPUs based on availability. If set to Y, the installer sets parameters in the lsf.conf file to enable these functions in LSF, and adds resource definitions to the lsf.cluster.cluster_name and lsf.shared files for GPU resources.

lsb.applications

  • ELIGIBLE_PEND_TIME_LIMIT: Specifies the eligible pending time limit for jobs in the application profile.
  • PEND_TIME_LIMIT: Specifies the pending time limit for jobs in the application profile.
  • CONTAINER: In LSF Version 10.1 Fix Pack 1, this new parameter enables Docker container jobs to run in the application profile by using the docker[] keyword.

    In LSF Version 10.1 Fix Pack 2, enables Shifter and Singularity container jobs to run in the application profile by using the shifter[] and singularity[] keywords.

    In LSF Version 10.1 Fix Pack 3, allows you to specify a pre-execution script to run before the container job runs by specifying an at sign (@) and a full file path to the script in the options() keyword. The output of this script is used as container startup options.

    In LSF Version 10.1 Fix Pack 4, the starter[] keyword is obsolete.

  • EXEC_DRIVER: In LSF Version 10.1 Fix Pack 4, this new parameter specifies the execution driver framework for the application profile.
  • #INCLUDE: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, you can now use this directive in any place in this file. Previously, you could only use this directive in the beginning of the lsb.applications file.
  • PRIORITY: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this new parameter specifies a priority that is used as a factor when calculating the job priority for absolute priority scheduling (APS).
  • ESTIMATED_RUNTIME: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this parameter is introduced to configure estimated runtimes for jobs in an application, and is meant to replace the existing RUNTIME parameter from the lsb.applications file. Can also be configured at the queue level (lsb.queues file) and cluster level (lsb.params file).
  • PLAN: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this parameter is introduced for use when plan-based scheduling is enabled with the parameter ALLOCATION_PLANNER=Y. This parameter controls whether or not jobs are candidates for plan-based scheduling. Can also be configured at the queue level (lsb.queues file), cluster level (lsb.params file), and job level.
  • GPU_REQ: In LSF Version 10.1 Fix Pack 6, this new parameter specifies the GPU requirements for the application profile.

    In LSF Version 10.1 Fix Pack 8, GPU_REQ has the following changes:

    • You can now specify aff=no in the GPU requirements to relax GPU affinity while maintaining CPU affinity. By default, aff=yes is set to maintain strict GPU-CPU affinity binding.
    • You can now specify mps=yes,share in the GPU requirements to enable LSF to share one MPS daemon per host among all jobs that are submitted by the same user with the same resource requirements.
    • You can now specify mps=per_socket in the GPU requirements to enable LSF to start one MPS daemon per socket per job. You can also specify mps=per_socket,share to enable LSF to share one MPS daemon per socket among all jobs that are submitted by the same user with the same resource requirements.
    • You can now specify mps=per_gpu in the GPU requirements to enable LSF to start one MPS daemon per GPU per job. You can also specify mps=per_gpu,share to enable LSF to share one MPS daemon per GPU among all jobs that are submitted by the same user with the same resource requirements.
  • WATCHDOG: In LSF Version 10.1 Fix Pack 8, this new parameter enables LSF to use the watchdog feature to regularly run external scripts that check application data, logs, and other information. LSF can use these scripts to pass on the job information.
  • In LSF Version 10.1 Fix Pack 8, the maximum integer that you can specify for the following resource limits is increased from 32 bit (2³¹) to 64 bit (2⁶³):
    • memory limit (MEMLIMIT parameter)
    • swap limit (SWAPLIMIT parameter)
    • core file size limit (CORELIMIT parameter)
    • stack limit (STACKLIMIT parameter)
    • data segment size limit (DATALIMIT parameter)
    • file size limit (FILELIMIT parameter)
  • GPU_REQ: In LSF Version 10.1 Fix Pack 9, you can now specify mps=yes,share, mps=per_socket,share, and mps=per_gpu,share in the GPU requirements to enable LSF to share the MPS daemon on the host, socket, or GPU for jobs that are submitted by the same user with the same resource requirements.

    That is, you can now add ",share" to the mps value to enable MPS daemon sharing for the host, socket, or GPU.

    In addition, you can now assign the number of GPUs per task or per host by specifying num=number/task or num=number/host. By default, the number of GPUs is still assigned per host.

  • RES_REQ: In LSF Version 10.1 Fix Pack 9, you can now specify the resource reservation method (by task, by job, or by host) in the rusage string by using the /task, /job, or /host keyword after the numeric value in the rusage string. You can only specify resource reservation methods for consumable resources.

    In addition, you can now enable LSF to stripe tasks of a parallel job across the free resources of the candidate hosts by using the stripe keyword in the span string of the resource requirements.

  • DOCKER_IMAGE_AFFINITY: In LSF Version 10.1 Fix Pack 9, this new parameter enables LSF to give preference for execution hosts that already have the requested Docker image when submitting or scheduling Docker jobs.
  • GPU_REQ has the following changes in LSF Version 10.1 Fix Pack 10:
    • You can now add the ",nocvd" keyword to the existing mps value in the GPU resource requirements string to disable the CUDA_VISIBLE_DEVICES environment variable for MPS jobs.
    • You can now specify block=yes in the GPU resource requirements string to enable block distribution of allocated GPUs.
    • You can now specify gpack=yes in the GPU resource requirements string to enable pack scheduling for shared mode GPU jobs.
  • USE_PAM_CREDS has the following changes in LSF Version 10.1 Fix Pack 10:
    • You can now specify the session keyword to enable LSF to open a PAM session when submitting jobs to Linux hosts using PAM.
    • You can now specify the limits keyword to apply the limits that are specified in the PAM configuration file to an application. This is functionally identical to enabling USE_PAM_CREDS=y except that you can define limits together with the session keyword.
  • CONTAINER: LSF Version 10.1 Fix Pack 11 enables the following new containers with this parameter:
    • Pod Manager (Podman) container jobs run in the application profile by using the docker[] keyword to run Podman jobs with the Docker execution driver. The options[] keyword specifies Podman job run options for the podman run command, which are passed to the job container. Because Podman uses the Docker execution driver to run Podman container jobs, podman and docker command options are not compatible, and execution driver permissions are not the same, you cannot use LSF to run Docker container jobs if you are using LSF to run Podman container jobs.
    • Enroot container jobs run in the application profile by using the new enroot[] keyword to run Enroot jobs with the Enroot execution driver. The options[] keyword specifies Enroot job run options for the enroot start command, which are passed to the job container.
  • EXEC_DRIVER: In LSF Version 10.1 Fix Pack 11, this parameter specifies the execution driver framework for the following container jobs:
    • Pod Manager (Podman) container jobs if LSF is configured to run Podman container jobs instead of Docker container jobs. The user keyword must be set to default for Podman jobs, and the starter and controller file permissions must be set to 0755. The monitor file is not required for Podman jobs, but if it is used, the monitor file permission must also be set to 0755. Because Podman uses the Docker execution driver to run Podman container jobs, the podman and docker command options are not compatible, and the execution driver permissions are not the same, you cannot use LSF to run Docker container jobs if you are using LSF to run Podman container jobs.
    • Enroot container jobs. The starter file permission must be set to 0755. The monitor and controller files are ignored. This parameter is optional for Enroot container jobs, and is context[user(default)] starter[/path/to/serverdir/enroot-starter.py] by default.
  • GPU_REQ has the following changes in LSF Version 10.1 Fix Pack 11:
    • The new gvendor keyword in the GPU requirements strings enables LSF to allocate GPUs with the specified vendor type. Specify gvendor=nvidia to request Nvidia GPUs and gvendor=amd to request AMD GPUs.
    • The nvlink=yes keyword in the GPU requirements string is deprecated. Replace nvlink=yes in the GPU requirements string with glink=nvlink instead.
    • The new glink keyword in the GPU requirements strings specifies the connections among GPUs. Specify glink=nvlink for the NVLink connection for Nvidia GPUs or glink=xgmi for the xGMI connection for AMD GPUs. Do not use glink with the nvlink keyword, which is now deprecated.
  • In LSF Version 10.1 Fix Pack 12, the following parameters are deprecated and will be removed in a future version:
    • CHUNK_JOB_SIZE
    • NETWORK_REQ
  • In LSF Version 10.1 Fix Pack 13, the supported Podman version is 3.3.1. For both the lsb.applications and lsb.queues files, the CONTAINER parameter now supports podman configuration and requires the EXEC_DRIVER parameter, with mandatory [user(default)] configuration for the Podman jobs to start and controller[] configuration (such as controller[/path/to/serverdir/docker-control.py]) for Podman to work.
  • In LSF Version 10.1 Fix Pack 13, the CONTAINER parameter supports Apptainer container jobs to run in the application profile by using the apptainer[] keyword. Apptainer is the rebranded product name for Singularity.
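
The GPU_REQ keywords introduced across these fix packs (the ",share" and ",nocvd" suffixes on mps, the /task and /host suffixes on num, and the replacement of nvlink=yes by glink=nvlink) can be easier to follow in a small Python sketch. This is not LSF code: the parse_gpu_req function and the dictionary layout are hypothetical, and the sketch assumes a GPU requirement string is a colon-separated list of key=value pairs.

```python
def parse_gpu_req(req):
    """Split a colon-separated GPU requirement string into a dictionary."""
    parsed = {}
    for field in req.split(":"):
        key, _, value = field.partition("=")
        parsed[key.strip()] = value.strip()

    # The mps value may carry extra comma-separated keywords (share, nocvd).
    if "mps" in parsed:
        mode, *flags = parsed["mps"].split(",")
        parsed["mps"] = {
            "mode": mode,
            "share": "share" in flags,
            "nocvd": "nocvd" in flags,
        }

    # num may carry a /task or /host suffix; per host is the stated default.
    if "num" in parsed:
        count, _, scope = parsed["num"].partition("/")
        parsed["num"] = {"count": int(count), "per": scope or "host"}

    # nvlink=yes is deprecated in Fix Pack 11; rewrite it as glink=nvlink.
    if parsed.get("nvlink") == "yes":
        del parsed["nvlink"]
        parsed.setdefault("glink", "nvlink")

    return parsed

print(parse_gpu_req("num=2/task:mps=per_socket,share:aff=no:nvlink=yes"))
```

Keeping the suffixes as separate fields makes it clear that ",share" and ",nocvd" modify the mps mode rather than replace it.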

lsb.globalpolicies (New)

The lsb.globalpolicies file defines global policies for multiple clusters. This file is optional, but is required to enable global fair share scheduling. It is installed by default in LSB_CONFDIR/cluster_name/configdir.

Global fair share policies are defined in the GlobalFairshare section.

After you change the lsb.globalpolicies file, use the badmin gpdrestart command to reconfigure the global policy daemon (gpolicyd).

  • Limits section: In LSF Version 10.1 Fix Pack 10, you can now specify global resource allocations in the Limits sections.

    Specify global resource allocations the same way you would specify local resource allocation limits in the Limit sections of the lsb.resources file, using the following parameters: APPS, ELIGIBLE_PEND_JOBS, INELIGIBLE, JOBS, JOBS_PER_SCHED_CYCLE, LIC_PROJECTS, MEM, NAME, PROJECTS, QUEUES, RESOURCE, SLOTS, SWP, TMP, and USERS.

    When using the LSF multicluster capability, global resource allocation limits apply to all clusters.

  • Per-consumer limits: In LSF Version 10.1 Fix Pack 11, you can now specify global per-consumer resource allocations in the Limits sections by specifying the following new parameters: PER_APP, PER_LIC_PROJECT, PER_PROJECT, PER_QUEUE, and PER_USER.
  • Resource section: In LSF 10.1 Fix Pack 13, this new section defines global resources that are shared between all clusters.
  • ResourceMap section: In LSF 10.1 Fix Pack 13, this new section defines the mapping between shared resources and their sharing clusters.
  • DistributePolicy section: In LSF 10.1 Fix Pack 13, this new section defines the distribution policies for global resources and global limits.
  • ReservationUsage section: In LSF 10.1 Fix Pack 13, this new section defines the method of reserving global resources.

lsb.hosts

  • In the ComputeUnit section, the MEMBER parameter now allows colons (:) to specify a range of numbers when you specify condensed notation for host names. Colons work the same way as hyphens (-) for specifying ranges, and the two can be used interchangeably in condensed notation. You can also use leading zeros to specify host names.
  • #INCLUDE: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, you can use this new directive to insert the contents of a specified file into this configuration file.
  • In the Host section, the HOST_NAME parameter now supports condensed notation for host names.

    Use square brackets ([]) to enclose the multiple numbers, and use a hyphen (-) or colon (:) to specify a range of numbers. Use a comma (,) to separate multiple ranges of numbers or to separate individual numbers. You can also use leading zeros to specify host names.

    Use multiple sets of square brackets (with the supported special characters) to define multiple sets of non-negative integers anywhere in the host name. For example, hostA[1,3]B[1-3] includes hostA1B1, hostA1B2, hostA1B3, hostA3B1, hostA3B2, and hostA3B3.

  • In the HostGroup section, the GROUP_MEMBER parameter now allows colons (:) to specify a range of numbers when you specify condensed notation for host names. Colons work the same way as hyphens (-) for specifying ranges, and the two can be used interchangeably in condensed notation. You can also use leading zeros to specify host names.

    Use multiple sets of square brackets (with the supported special characters) to define multiple sets of non-negative integers anywhere in the host name. For example, hostA[1,3]B[1-3] includes hostA1B1, hostA1B2, hostA1B3, hostA3B1, hostA3B2, and hostA3B3.

  • In LSF 10.1 Fix Pack 13, in the HostGroup section, the GROUP_MEMBER parameter now supports preferred host groups to indicate your preference for dispatching a job to a certain host group. LSF supports using a plus sign (+) and a positive number after the names of the host groups that you would prefer to use. A higher number indicates a higher preference. For example, (hostA groupB+2 hostC+1) indicates that groupB is the most preferred and hostA is the least preferred.
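
The condensed host name notation described in the Host and HostGroup bullets above can be illustrated with a short Python sketch. It is not LSF's own parser: the expand_group and expand_hosts functions are hypothetical, and the handling of leading zeros (padding each number to the width of the lower bound) is an assumption made for this example.

```python
import itertools
import re

def expand_group(spec):
    """Expand one bracket body such as '1,3', '1-3', or '01:03' into strings."""
    values = []
    for part in spec.split(","):
        bounds = re.split(r"[-:]", part.strip())
        if len(bounds) == 1:
            values.append(bounds[0])
            continue
        low, high = bounds
        # Assumption: a leading zero on the lower bound sets the padding width.
        width = len(low) if low.startswith("0") else 0
        values.extend(str(n).zfill(width) for n in range(int(low), int(high) + 1))
    return values

def expand_hosts(pattern):
    """Expand condensed notation such as hostA[1,3]B[1-3] into host names."""
    pieces = re.split(r"\[([^\]]*)\]", pattern)
    # Even positions hold literal text; odd positions hold bracket bodies.
    choices = [expand_group(p) if i % 2 else [p] for i, p in enumerate(pieces)]
    return ["".join(combo) for combo in itertools.product(*choices)]

print(expand_hosts("hostA[1,3]B[1-3]"))
print(expand_hosts("host[01:03]"))
```

For hostA[1,3]B[1-3], the sketch yields the same six names that are listed in the example above, and the colon form behaves like the hyphen form.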

lsb.params

  • JOB_ARRAY_EVENTS_COMBINE: This parameter is introduced to improve performance with large array jobs. When enabled (set to Y), more events for operations on array jobs or elements are generated, specific to the array job. For job arrays with large array sizes, mbatchd daemon performance is improved because operations are used by events specific to the array jobs. When this parameter is enabled, the following events are modified in the lsb.events file to accommodate array index ranges: JOB_CLEAN, JOB_MODIFY2, JOB_MOVE, JOB_SIGNAL, JOB_STATUS, and JOB_SWITCH. In the lsb.acct file, the JOB_FINISH event is modified. In the lsb.stream file, the JOB_FINISH2 event is modified. In the lsb.status file, the JOB_STATUS2 event is modified.
  • JOB_INFO_EVENT_DUMP_INTERVAL: Controls how frequently the job information events file is rewritten. If the dump interval is too frequent, it means a greater load on I/O operations. If the dump interval is too infrequent, events replay will take longer to finish. The parameter specifies the interval in number of minutes. Specify any positive integer between 1 and 2147483646. The default interval is 15 minutes.
  • JOB_INFO_MEMORY_CACHE_SIZE: Configures how much memory to use for the job information cache. The job information cache can reduce the load on the work directory file server by caching job information such as the job's environment variables, command-line and eexec data in memory, in a compressed format. Set this parameter to the amount of memory in MB allocated to cached job information. The cache is enabled by default and the default cache size is 1024 MB (1 GB). The cache can be disabled by setting this parameter to 0. The minimum recommended cache size is 500 MB. Valid values are greater than or equal to zero, and less than MAX_INT. The value of the JOB_INFO_MEMORY_CACHE_SIZE parameter can be viewed with the command bparams -a or bparams -l. Real cache memory that is used can affect mbatchd fork performance.
  • JOB_SWITCH2_EVENT: Obsolete in LSF 10.1. Replaced with the JOB_ARRAY_EVENTS_COMBINE parameter.
  • RELAX_JOB_DISPATCH_ORDER: Allows LSF to deviate from standard job prioritization policies to improve cluster utilization by allowing multiple jobs with common resource requirements to run consecutively on the same allocation.

    By default, the same allocation can be reused for up to 30 minutes. You can also specify a custom allocation reuse time by specifying the ALLOC_REUSE_DURATION keyword with a maximum value and an optional minimum value.

  • DIAGNOSE_LOGDIR: This parameter no longer requires ENABLE_DIAGNOSE to be enabled. The DIAGNOSE_LOGDIR parameter is also the default location for the snapshot of the scheduling job buckets (badmin diagnose -c jobreq) in addition to default location of the log file for query source information (badmin diagnose -c query). The ENABLE_DIAGNOSE parameter is required for the query source information log file to be saved to this location.
  • CONDENSE_PENDING_REASONS: Previously, this parameter was set to Y at installation for the HIGH_THROUGHPUT configuration template, and defaulted to N if otherwise undefined. In this release, it is removed from the HIGH_THROUGHPUT configuration template, so the default is always N.

    For this release, when the condensed pending reason feature is enabled, the single key pending reason feature and the categorized pending reason feature will be overridden by the main reason, if there is one.

  • MC_SORT_BY_SUBMIT_TIME: When set to Y or y, allows forwarded jobs on the execution cluster to be sorted and run based on their original submission time (instead of their forwarded time). Available in the IBM Spectrum LSF multicluster capability only.
  • PEND_REASON_MAX_JOBS: Obsolete in LSF 10.1.
  • TRACK_ELIGIBLE_PENDINFO: Set to Y to enable LSF to determine whether a pending job is eligible for scheduling, and to use eligible pending time instead of total pending time to determine job priorities for automatic job priority escalation and absolute priority scheduling.
  • ELIGIBLE_PENDINFO_SNAPSHOT_INTERVAL: Specifies the time interval, in minutes, for mbschd to dump eligible and ineligible pending information to disk. The eligible and ineligible pending information is saved when mbatchd or mbschd restarts. The default value is 5 minutes.
  • JOB_SCHEDULING_INTERVAL: Now specifies the minimal interval between subsequent job scheduling sessions. Specify in seconds, or include the keyword ms to specify in milliseconds. A value of 0 means no minimum interval between subsequent sessions. Previously, this parameter specified the amount of time that mbschd sleeps before the next scheduling session starts.
  • ESTIMATOR_MAX_JOBS_PREDICTION: Specifies the number of pending jobs that the estimator predicts, which is 1000 by default.
  • ESTIMATOR_MAX_TIME_PREDICTION: Specifies the amount of time into the future, in minutes, that a job is predicted to start before the estimator stops the current round of estimation. By default, the estimator stops after a job is predicted to start in one week (10080 minutes).
  • ESTIMATOR_MAX_RUNTIME_PREDICTION: Specifies the amount of time that the estimator runs, up to the value of the ESTIMATOR_SIM_START_INTERVAL parameter. By default, the estimator stops after it runs for 30 minutes or the amount of time as specified by the ESTIMATOR_SIM_START_INTERVAL parameter, whichever is smaller.
  • EVALUATE_JOB_DEPENDENCY_TIMEOUT: In LSF Version 10.1 Fix Pack 2, this new parameter sets the maximum amount of time, in seconds or milliseconds, that the mbatchd daemon takes to evaluate job dependencies in one scheduling cycle. This parameter limits the amount of time that mbatchd spends on evaluating job dependencies in a scheduling cycle, which limits the amount of time the job dependency evaluation blocks services. If the EVALUATE_JOB_DEPENDENCY parameter is also defined, the EVALUATE_JOB_DEPENDENCY_TIMEOUT parameter takes effect.
  • EVALUATE_WAIT_CONDITION_TIMEOUT: In LSF Version 10.1 Fix Pack 2, this new parameter specifies a limit to the amount of time that the mbatchd daemon spends on evaluating the bwait wait conditions in a scheduling session.
  • DEFAULT_BWAIT_TIMEOUT: In LSF Version 10.1 Fix Pack 2, this new parameter specifies the default timeout interval to evaluate the wait conditions in a scheduling session, in minutes.
  • MAX_PEND_JOBS: In LSF Version 10.1 Fix Pack 3, this parameter has been changed to specify pending "jobs" instead of pending "job slots" as in previous versions of LSF.
  • MAX_PEND_SLOTS: In LSF Version 10.1 Fix Pack 3, this new parameter has been added to specify "job slots" and replaces the previous role of MAX_PEND_JOBS.
  • FWD_JOB_FACTOR: In LSF Version 10.1 Fix Pack 4, this new parameter defines the forwarded job slots factor, which accounts for forwarded jobs when making the user priority calculation for the fair share policies.
  • JOB_GROUP_CLEAN: In LSF Version 10.1 Fix Pack 4, a new option "all" has been provided for JOB_GROUP_CLEAN, to delete empty implicit job groups automatically even if they have limits.
  • EADMIN_TRIGGER_INTERVAL: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this parameter is introduced to invoke an eadmin script at a set interval even if there is no job exception. The default is 0, which disables this feature and only triggers an eadmin script when there is a job exception.
  • PERSIST_LIVE_CONFIG: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this parameter is introduced to allow update of configuration files for a live reconfiguration. This allows for job submission during a policy update or cluster restart. The default is Y, which enables this feature.

    If PERSIST_LIVE_CONFIG=Y, LSF persists all live configuration requests so that they take effect after mbatchd restarts.

    If PERSIST_LIVE_CONFIG=N, LSF does not persist live configuration requests, and they do not take effect after mbatchd restarts.

  • ALLOCATION_PLANNER: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this parameter is introduced to enable the plan-based scheduling and reservation feature.
  • ESTIMATED_RUNTIME: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this parameter is introduced to configure cluster-wide estimated runtimes for jobs and is meant to replace the existing RUNTIME parameter from the lsb.applications file. Can also be configured at the application level (lsb.applications file) and queue level (lsb.queues file).
  • PLAN: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this parameter is introduced for use when plan-based scheduling is enabled with the parameter ALLOCATION_PLANNER=Y. This parameter controls whether or not jobs are candidates for plan-based scheduling. Can also be configured at the application level (lsb.applications file) and queue level (lsb.queues file).
  • DEFAULT_PROJECT: In LSF Version 10.1 Fix Pack 6, the project name can now be up to 511 characters long (previously, this limit was 59 characters).
  • EGROUP_UPDATE_INTERVAL: In LSF Version 10.1 Fix Pack 7, this parameter also controls the time interval for which dynamic host group information is updated automatically, in addition to dynamic user group information. You can also specify the time interval in minutes by using the m keyword after the time interval.
  • GPU_RUN_TIME_FACTOR: In LSF Version 10.1 Fix Pack 7, this new parameter defines the GPU run time factor, which accounts for the total GPU run time of a user's running GPU jobs when calculating fair share scheduling policy.
  • ENABLE_GPU_HIST_RUN_TIME: In LSF Version 10.1 Fix Pack 7, this new parameter enables the use of historical GPU run time in the calculation of fair share scheduling policy.
  • KILL_JOBS_OVER_RUNLIMIT: In LSF Version 10.1 Fix Pack 7, this new parameter enables the mbatchd daemon to kill jobs that are running over the defined RUNLIMIT value for a long period of time.
  • CSM_VALID_SMT: In LSF Version 10.1 Fix Pack 8, this new parameter defines a space-delimited list of valid SMT mode values for CSM jobs. The first value in the list is the default value for CSM jobs if the SMT mode is not specified at the queue or job level.
  • SECURE_INFODIR_USER_ACCESS: In LSF Version 10.1 Fix Pack 9, this parameter now has the new keyword G to provide full granularity over what information the bhist and bacct commands display for jobs for other users. Enable this feature by defining SECURE_INFODIR_USER_ACCESS=G.
  • SECURE_JOB_INFO_LEVEL: In LSF Version 10.1 Fix Pack 9, this parameter now has an additional information level 5 to display summary information for the jobs that belong to other users. Enable this information level by defining SECURE_JOB_INFO_LEVEL=5.
  • DOCKER_IMAGE_AFFINITY: In LSF Version 10.1 Fix Pack 9, this new parameter enables LSF to give preference for execution hosts that already have the requested Docker image when submitting or scheduling Docker jobs.
  • GPU_REQ_MERGE: In LSF Version 10.1 Fix Pack 9, this new parameter enables all individual options in the GPU requirement string to be merged separately. Any specified options override any options that are specified at the lower levels of precedence. If an individual option is not specified, but is explicitly specified at a lower level, then the highest level for which the option is specified takes precedence.
  • SIMPLIFIED_GUARANTEE: In LSF Version 10.1 Fix Pack 10, this new parameter enables simplified scheduling algorithms for package and slot pools that are used by jobs with guarantee policies.
  • ATTR_CREATE_USERS: In LSF Version 10.1 Fix Pack 10, this new parameter specifies the users who can create host attributes for attribute affinity scheduling.
  • ATTR_MAX_NUM: In LSF Version 10.1 Fix Pack 10, this new parameter specifies the maximum number of host attributes that can exist simultaneously in the cluster.
  • ATTR_TTL: In LSF Version 10.1 Fix Pack 10, this new parameter specifies the time-to-live (TTL) for newly-created host attributes.
  • SAME_JOB_AFFINITY: In LSF Version 10.1 Fix Pack 10, this new parameter enables users to specify affinity preferences for jobs to run on the same host or compute unit as another job. That is, users can use the samehost and samecu keywords with the bsub -jobaff command option.
  • GLOBAL_LIMITS: In LSF Version 10.1 Fix Pack 10, this new parameter enables global limit scheduling, which allows you to specify global resource allocation limits in the lsb.globalpolicies file. When using the LSF multicluster capability, global resource allocation limits apply to all clusters.
  • RELAX_JOB_DISPATCH_ORDER: In LSF Version 10.1 Fix Pack 10, this parameter now has the SHARE[] keyword to relax additional constraints on the pending jobs that can reuse resource allocations for finished jobs.
  • JOB_DISPATCH_PACK_SIZE: In LSF Version 10.1 Fix Pack 10, this new parameter specifies the maximum number of job decisions that can accumulate before LSF publishes the decisions in a decision package before the end of the job scheduling cycle.
  • JOB_SCHEDULING_INTERVAL: In LSF Version 10.1 Fix Pack 10, you can now specify a maximum time for the job scheduling cycle. mbschd skips job scheduling if the scheduling cycle exceeds this time. To specify the maximum time, add a second number, in seconds.
  • RESCHED_UPON_CSM_SETUP_ERROR: In LSF Version 10.1 Fix Pack 10, this new parameter enables LSF to reschedule IBM CSM jobs that are stage-in or non-transfer jobs that fail during CSM setup if they fail with the specified CSM API error codes.
  • DEFAULT_RC_ACCOUNT_PER_PROJECT: In LSF Version 10.1 Fix Pack 11, this new parameter enables LSF to set the project name as the default account name on hosts that are borrowed through LSF resource connector.
  • ENABLE_RC_ACCOUNT_REQUEST_BY_USER: In LSF Version 10.1 Fix Pack 11, this new parameter enables users to assign a specific account name at the job level on hosts that are borrowed through LSF resource connector. This allows users to use the bsub -rcacct "rc_account_name" command option to assign an account name.
  • In LSF Version 10.1 Fix Pack 12, the following parameters are deprecated and will be removed in a future version:
    • CHUNK_JOB_DURATION
    • ENABLE_DEFAULT_EGO_SLA
    • MAX_PROTOCOL_INSTANCES
    • NETWORK_REQ
    • SIMPLIFIED_GUARANTEE: Now fixed to Y.
    • STRIPING_WITH_MINIMUM_NETWORK
  • FAIRSHARE_JOB_COUNT parameter: In LSF 10.1 Fix Pack 13, this new parameter enables LSF to use the number of jobs instead of job slots in the fair share scheduling algorithm.
  • JOB_GROUP_IDLE_TTL parameter: In LSF 10.1 Fix Pack 13, this new parameter defines the job group's time-to-live (TTL) when all jobs leave the job group.
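
The per-option merging that the GPU_REQ_MERGE bullet describes can be sketched in Python as follows. This is an illustration of the stated rule only, not LSF's implementation: the merge_gpu_req function is hypothetical, and the precedence order used in the example (job, then application profile, then queue) is an assumption.

```python
def merge_gpu_req(*levels):
    """Merge option dictionaries given from highest to lowest precedence."""
    merged = {}
    for level in levels:
        for key, value in level.items():
            # Keep the value from the highest level that specifies the option.
            merged.setdefault(key, value)
    return merged

# Assumed precedence for this example: job > application profile > queue.
job = {"num": "2"}
application = {"num": "1", "mps": "yes"}
queue = {"mode": "shared", "mps": "no"}

print(merge_gpu_req(job, application, queue))
```

Here num comes from the job, mps from the application profile, and mode from the queue, because each option is resolved separately rather than the whole string being taken from a single level.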

lsb.queues

  • RELAX_JOB_DISPATCH_ORDER: Allows LSF to deviate from standard job prioritization policies to improve cluster utilization by allowing multiple jobs with common resource requirements to run consecutively on the same allocation.

    By default, the same allocation can be reused for up to 30 minutes. You can also specify a custom allocation reuse time by specifying the ALLOC_REUSE_DURATION keyword with a maximum value and an optional minimum value.

  • ELIGIBLE_PEND_TIME_LIMIT: Specifies the eligible pending time limit for jobs in the queue.
  • PEND_TIME_LIMIT: Specifies the pending time limit for jobs in the queue.
  • FWD_JOB_FACTOR: In LSF Version 10.1 Fix Pack 4, this new parameter defines the forwarded job slots factor, which accounts for forwarded jobs when making the user priority calculation for the fair share policies.
  • #INCLUDE: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, you can use this new directive to insert the contents of a specified file into this configuration file.
  • FWD_USERS: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, you can use this new parameter to specify a list of users or user groups that can forward jobs to remote clusters when using the LSF multicluster capability. This allows you to prevent jobs from certain users or user groups from being forwarded to an execution cluster, and to set limits on the submission cluster.
  • EXTENDABLE_RUNLIMIT: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, you can use this new parameter to enable jobs to continue running past the original run limit if resources are not needed by other jobs.
  • ESTIMATED_RUNTIME: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this parameter is introduced to configure estimated runtimes for jobs in a queue, and is meant to replace the existing RUNTIME parameter from the lsb.applications file. Can also be configured at the application level (lsb.applications file) and cluster level (lsb.params file).
  • PLAN: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this parameter is introduced for use when plan-based scheduling is enabled with the parameter ALLOCATION_PLANNER=Y. This parameter controls whether or not jobs are candidates for plan-based scheduling. Can also be configured at the application level (lsb.applications file), cluster level (lsb.params file), and job level.
  • CSM_REQ: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this parameter specifies the required values for the IBM Cluster Systems Manager (CSM) bsub job submission command options. These settings override job level CSM options and append system level allocation flags to the job level allocation flags.
    • In LSF Version 10.1 Fix Pack 8, you can now use the smt keyword to specify the SMT mode.
  • GPU_REQ: In LSF Version 10.1 Fix Pack 6, this new parameter specifies the GPU requirements for the queue.
  • GPU_RUN_TIME_FACTOR: In LSF Version 10.1 Fix Pack 7, this new parameter defines the GPU run time factor, which accounts for the total GPU run time of a user's running GPU jobs when calculating fair share scheduling policy.
  • ENABLE_GPU_HIST_RUN_TIME: In LSF Version 10.1 Fix Pack 7, this new parameter enables the use of historical GPU run time in the calculation of fair share scheduling policy.
  • In LSF Version 10.1 Fix Pack 8, the maximum integer that you can specify for the following resource limits is increased from 32 bit (2³¹) to 64 bit (2⁶³):
    • memory limit (MEMLIMIT parameter)
    • swap limit (SWAPLIMIT parameter)
    • core file size limit (CORELIMIT parameter)
    • stack limit (STACKLIMIT parameter)
    • data segment size limit (DATALIMIT parameter)
    • file size limit (FILELIMIT parameter)
  • GPU_REQ: In LSF Version 10.1 Fix Pack 8, GPU_REQ has the following changes:
    • You can now specify aff=no in the GPU requirements to relax GPU affinity while maintaining CPU affinity. By default, aff=yes is set to maintain strict GPU-CPU affinity binding.
    • You can now specify mps=per_socket in the GPU requirements to enable LSF to start one MPS daemon per socket per job on each GPU host.
    • You can now specify mps=per_gpu in the GPU requirements to enable LSF to start one MPS daemon per GPU per job on each GPU host.
  • RES_REQ: In LSF Version 10.1 Fix Pack 9, you can now specify the resource reservation method (by task, by job, or by host) in the rusage string by using the /task, /job, or /host keyword after the numeric value in the rusage string. You can only specify resource reservation methods for consumable resources.

    In addition, you can now enable LSF to stripe tasks of a parallel job across the free resources of the candidate hosts by using the stripe keyword in the span string of the resource requirements.

  • GPU_REQ: In LSF Version 10.1 Fix Pack 9, you can now specify mps=yes,share, mps=per_socket,share, and mps=per_gpu,share in the GPU requirements to enable LSF to share the MPS daemon on the host, socket, or GPU for jobs that are submitted by the same user with the same resource requirements.

    That is, you can now add ",share" to the mps value to enable MPS daemon sharing for the host, socket, or GPU.

    In addition, you can now assign the number of GPUs per task or per host by specifying num=number/task or num=number/host. By default, the number of GPUs is still assigned per host.

  • RUN_WINDOW: In LSF Version 10.1 Fix Pack 9, this parameter now allows you to specify supported time zones when specifying the time window. You can specify multiple time windows, but all time window entries must be consistent in whether they set the time zones. That is, either all entries must set a time zone, or all entries must not set a time zone.
  • DISPATCH_WINDOW: In LSF Version 10.1 Fix Pack 9, this parameter now allows you to specify supported time zones when specifying the time window. You can specify multiple time windows, but all time window entries must be consistent in whether they set the time zones. That is, either all entries must set a time zone, or all entries must not set a time zone.
  • CONTAINER: In LSF Version 10.1 Fix Pack 9, this new parameter enables container jobs to run in the queue. The usage of this parameter is the same as the CONTAINER parameter in the lsb.applications file.
  • EXEC_DRIVER: In LSF Version 10.1 Fix Pack 9, this new parameter specifies the execution driver framework for the queue. The usage of this parameter is the same as the EXEC_DRIVER parameter in the lsb.applications file.
  • DOCKER_IMAGE_AFFINITY: In LSF Version 10.1 Fix Pack 9, this new parameter enables LSF to give preference for execution hosts that already have the requested Docker image when submitting or scheduling Docker jobs.
  • GPU_REQ has the following changes in LSF Version 10.1 Fix Pack 10:
    • You can now add the ",nocvd" keyword to the existing mps value in the GPU resource requirements string to disable the CUDA_VISIBLE_DEVICES environment variable for MPS jobs.
    • You can now specify block=yes in the GPU resource requirements string to enable block distribution of allocated GPUs.
    • You can now specify gpack=yes in the GPU resource requirements string to enable pack scheduling for shared mode GPU jobs.
  • USE_PAM_CREDS has the following changes in LSF Version 10.1 Fix Pack 10:
    • You can now specify the session keyword to enable LSF to open a PAM session when submitting jobs to Linux hosts using PAM.
    • You can now specify the limits keyword to apply the limits that are specified in the PAM configuration file to a queue. This is functionally identical to enabling USE_PAM_CREDS=y except that you can define limits together with the session keyword.
  • MC_FORWARD_DELAY: In LSF Version 10.1 Fix Pack 10, this new parameter specifies the job forwarding behavior and the amount of time after job submission and scheduling for LSF to revert to the default job forwarding behavior.
  • RELAX_JOB_DISPATCH_ORDER: In LSF Version 10.1 Fix Pack 10, this parameter now has the SHARE[] keyword to relax additional constraints on the pending jobs that can reuse resource allocations for finished jobs.
  • MAX_SBD_CONNS: In LSF Version 10.1 Fix Pack 10, the default value of this parameter is changed to 2 * numOfHosts + 300.
  • DISPATCH_BY_QUEUE: In LSF Version 10.1 Fix Pack 10, this parameter is obsolete and replaced by the JOB_DISPATCH_PACK_SIZE parameter in the lsb.params file.
  • CONTAINER: LSF Version 10.1 Fix Pack 11 enables the following new containers with this parameter:
    • Pod Manager (Podman) container jobs run in the application profile by using the docker[] keyword to run Podman jobs with the Docker execution driver. The options[] keyword specifies Podman job run options for the podman run command, which are passed to the job container. Podman container jobs run through the Docker execution driver, but podman and docker command options are not compatible and the execution driver permissions are not the same. Therefore, you cannot use LSF to run Docker container jobs if you are using LSF to run Podman container jobs.
    • Enroot container jobs run in the application profile by using the new enroot[] keyword to run Enroot jobs with the Enroot execution driver. The options[] keyword specifies Enroot job run options for the enroot start command, which are passed to the job container.
  • EXEC_DRIVER: In LSF Version 10.1 Fix Pack 11, this parameter specifies the execution driver framework for the following container jobs:
    • Pod Manager (Podman) container jobs if LSF is configured to run Podman container jobs instead of Docker container jobs. The user keyword must be set to default for Podman jobs, and the starter and controller file permissions must be set to 0755. The monitor file is not required for Podman jobs, but if it is used, the monitor file permission must also be set to 0755. Podman container jobs run through the Docker execution driver, but podman and docker command options are not compatible and the execution driver permissions are not the same. Therefore, you cannot use LSF to run Docker container jobs if you are using LSF to run Podman container jobs.
    • Enroot container jobs. The starter file permission must be set to 0755. The context, monitor, and controller settings are ignored. This parameter is optional for Enroot container jobs, and is context[user(default)] starter[/path/to/serverdir/enroot-starter.py] by default.
  • GPU_REQ has the following changes in LSF Version 10.1 Fix Pack 11:
    • The new gvendor keyword in the GPU requirements strings enables LSF to allocate GPUs with the specified vendor type. Specify gvendor=nvidia to request Nvidia GPUs and gvendor=amd to request AMD GPUs.
    • The nvlink=yes keyword in the GPU requirements string is deprecated. Replace nvlink=yes in the GPU requirements string with glink=nvlink instead.
    • The new glink keyword in the GPU requirements strings specifies the connections among GPUs. Specify glink=nvlink for the NVLink connection for Nvidia GPUs or glink=xgmi for the xGMI connection for AMD GPUs. Do not use glink with the nvlink keyword, which is now deprecated.
  • In LSF Version 10.1 Fix Pack 12, the following parameters are deprecated and will be removed in a future version:
    • CHUNK_JOB_SIZE
    • HOSTS (allremote and all@cluster_name keywords only)
    • MAX_PROTOCOL_INSTANCES
    • MAX_SLOTS_IN_POOL
    • NETWORK_REQ
    • SLOT_POOL
    • SLOT_SHARE
    • STRIPING_WITH_MINIMUM_NETWORK
    • USE_PRIORITY_IN_POOL
  • In LSF Version 10.1 Fix Pack 13, the new IMPT_JOBLIMIT and IMPT_TASKLIMIT parameters allow you to specify how many MultiCluster jobs or tasks from remote clusters can be configured at the receive-jobs queue.
  • In LSF Version 10.1 Fix Pack 13, the supported Podman version is 3.3.1. For both the lsb.applications and lsb.queues files, the CONTAINER parameter now supports podman configuration, and requires the EXEC_DRIVER parameter, with the mandatory [user(default)] configuration for the Podman jobs to start and the controller[] configuration (such as controller[/path/to/serverdir/docker-control.py]) for Podman to work.
  • In LSF Version 10.1 Fix Pack 13, the CONTAINER parameter supports Apptainer container jobs to run in the queue. Apptainer is the rebranded product name for Singularity. The usage of this parameter is the same as the CONTAINER parameter in the lsb.applications file.
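
The Fix Pack 11 change that deprecates nvlink=yes in favor of glink=nvlink can be illustrated with a small helper. This is a minimal sketch and not an LSF tool: the function name modernize_gpu_req is hypothetical, and the sketch assumes that the GPU requirements string is a colon-separated list of keyword=value fields, as in strings passed to the bsub -gpu option.

```python
def modernize_gpu_req(gpu_req: str) -> str:
    """Rewrite the deprecated nvlink=yes keyword as glink=nvlink.

    Assumption: gpu_req is a colon-separated list of keyword=value fields.
    """
    fields = [field.strip() for field in gpu_req.split(":") if field.strip()]
    keywords = {field.split("=", 1)[0] for field in fields}
    # The list above says not to use glink together with the nvlink keyword.
    if "nvlink" in keywords and "glink" in keywords:
        raise ValueError("do not combine glink with the deprecated nvlink keyword")
    return ":".join("glink=nvlink" if field == "nvlink=yes" else field
                    for field in fields)


print(modernize_gpu_req("num=2:mode=shared:nvlink=yes"))
# prints: num=2:mode=shared:glink=nvlink
```

Fields other than nvlink=yes pass through unchanged, so values that contain commas, such as mps=per_socket,share, are preserved.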

lsb.reasons

lsb.reasons allows for individual configuration of pending reason messages. Administrators can make messages clear and can inform users about which action they can take to allow the job to run. Messages can be customized for one or more pending reasons and the priority that is given to particular resources.

This file is optional. It is installed by default in config/lsbatch/<cluster_name>/configdir/lsb.reasons.

After you change the lsb.reasons file, run badmin reconfig.

  • #INCLUDE: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, you can use this new directive to insert the contents of a specified file into this configuration file.

lsb.resources

  • LOAN_POLICIES: In LSF Version 10.1 Fix Pack 1, you can now enable queues to ignore the RETAIN and DURATION loan policies when LSF determines whether jobs in those queues can borrow unused guaranteed resources. To enable the queue to ignore the RETAIN and DURATION loan policies, specify an exclamation point (!) before the queue name in the LOAN_POLICIES parameter definition.
  • #INCLUDE: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, you can use this new directive to insert the contents of a specified file into this configuration file.
  • JOBS_PER_SCHED_CYCLE: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, you can use this new parameter to set limits on the maximum number of jobs that are dispatched in a scheduling cycle for users, user groups, and queues. You can only set job dispatch limits if the limit consumer types are USERS, PER_USER, QUEUES, or PER_QUEUE.
  • PER_PROJECT: In LSF Version 10.1 Fix Pack 6, each project name can now be up to 511 characters long (previously, this limit was 59 characters).
  • PROJECTS: In LSF Version 10.1 Fix Pack 6, each project name can now be up to 511 characters long (previously, this limit was 59 characters).
  • In LSF Version 10.1 Fix Pack 7, the JOBS_PER_SCHED_CYCLE parameter is renamed to ELIGIBLE_PEND_JOBS. The old JOBS_PER_SCHED_CYCLE parameter is still kept for backwards compatibility.
  • APPS: In LSF Version 10.1 Fix Pack 9, this new parameter specifies one or more application profiles on which limits are enforced. Limits are enforced on all application profiles listed.
  • PER_APP: In LSF Version 10.1 Fix Pack 9, this new parameter specifies one or more application profiles on which limits are enforced. Limits are enforced on each application profile listed.
  • LOAN_POLICIES: In LSF Version 10.1 Fix Pack 10, the RETAIN keyword is now deprecated and replaced with IDLE_BUFFER.
  • HostExport section: In LSF Version 10.1 Fix Pack 12, the HostExport section is deprecated and will be removed in a future version.
  • SharedResourceExport section: In LSF Version 10.1 Fix Pack 12, the SharedResourceExport section is deprecated and will be removed in a future version.
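
The renamed and deprecated names in this list can be checked mechanically. The following is a minimal sketch, not an LSF utility: the find_deprecated function and the sample fragment are hypothetical, and the sketch assumes that text after a # character on a line is a comment.

```python
import re

# Renamed or deprecated lsb.resources names, taken from the list above.
# A replacement of None means that the list names no replacement.
REPLACEMENTS = {
    "JOBS_PER_SCHED_CYCLE": "ELIGIBLE_PEND_JOBS",  # renamed in Fix Pack 7
    "RETAIN": "IDLE_BUFFER",                       # LOAN_POLICIES keyword, Fix Pack 10
    "HostExport": None,                            # section deprecated in Fix Pack 12
    "SharedResourceExport": None,                  # section deprecated in Fix Pack 12
}


def find_deprecated(config_text):
    """Return (line_number, name, replacement) for each deprecated name found."""
    findings = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # assumption: '#' starts a comment
        for name, replacement in REPLACEMENTS.items():
            if re.search(rf"\b{re.escape(name)}\b", code):
                findings.append((number, name, replacement))
    return findings


# Illustrative fragment only, not a complete section definition.
sample = """Begin GuaranteedResourcePool
NAME = pool1
LOAN_POLICIES = QUEUES[all] RETAIN[2]
End GuaranteedResourcePool
"""
print(find_deprecated(sample))
# prints: [(3, 'RETAIN', 'IDLE_BUFFER')]
```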

lsb.users

  • FS_POLICY in the UserGroup section: This new parameter enables global fair share policy for the defined user group. FS_POLICY specifies which global fair share policy the share account will participate in.
  • MAX_PEND_JOBS in the User section: In LSF Version 10.1 Fix Pack 3, this parameter has been changed to specify pending "jobs" instead of pending "job slots" as in previous versions of LSF.
  • MAX_PEND_SLOTS in the User section: In LSF Version 10.1 Fix Pack 3, this new parameter has been added to specify "job slots" and replaces the previous role of MAX_PEND_JOBS.
  • #INCLUDE: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, you can use this new directive to insert the contents of a specified file into this configuration file.
  • PRIORITY in the User and UserGroup sections: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this new parameter specifies a priority that is used as a factor when calculating the job priority for absolute priority scheduling (APS).
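
The Fix Pack 3 distinction between MAX_PEND_JOBS (pending jobs) and MAX_PEND_SLOTS (pending job slots) comes down to what is counted. The following minimal sketch uses a hypothetical PendingJob type and made-up job data. It shows only the two counts, not how LSF enforces the limits.

```python
from dataclasses import dataclass


@dataclass
class PendingJob:
    job_id: int
    slots: int  # job slots requested by this pending job


def pending_usage(jobs):
    """Return (pending job count, pending job slot count)."""
    return len(jobs), sum(job.slots for job in jobs)


pending = [PendingJob(101, 4), PendingJob(102, 1), PendingJob(103, 8)]
jobs, slots = pending_usage(pending)
print(f"pending jobs: {jobs}, pending job slots: {slots}")
# prints: pending jobs: 3, pending job slots: 13
```

In this example the user counts as 3 against a pending jobs limit but as 13 against a pending job slots limit.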

lsf.cluster.cluster_name

  • HOSTNAME in the Host section now supports condensed notation for host names.

    Use square brackets ([]) to enclose the multiple numbers, and use a hyphen (-) or a colon (:) to specify a range of numbers. Use a comma (,) to separate multiple ranges of numbers or to separate individual numbers. You can also use leading zeros to specify host names.

    Use multiple sets of square brackets (with the supported special characters) to define multiple sets of non-negative integers anywhere in the host name. For example, hostA[1,3]B[1-3] includes hostA1B1, hostA1B2, hostA1B3, hostA3B1, hostA3B2, and hostA3B3.
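
The condensed notation can be sketched as a small expansion routine. This is an illustration of the rules described above, not the parser that LSF uses. The function names are hypothetical, and the handling of leading zeros (keeping the width of the first number in a range) is an assumption.

```python
import itertools
import re


def _expand_set(spec):
    """Expand the contents of one pair of square brackets, such as '1,3' or '1-3'."""
    values = []
    for part in spec.split(","):
        match = re.fullmatch(r"(\d+)(?:[-:](\d+))?", part.strip())
        if match is None:
            raise ValueError(f"unsupported range: {part!r}")
        start = match.group(1)
        end = match.group(2) or start
        # Assumption: a leading zero on the first number fixes the field width.
        width = len(start) if start.startswith("0") else 0
        values.extend(str(n).zfill(width) for n in range(int(start), int(end) + 1))
    return values


def expand_hostname(pattern):
    """Return every host name that a condensed host name pattern stands for."""
    # re.split with a capturing group alternates literal text and bracket contents.
    pieces = re.split(r"\[([^\]]*)\]", pattern)
    choices = [_expand_set(piece) if index % 2 else [piece]
               for index, piece in enumerate(pieces)]
    return ["".join(combo) for combo in itertools.product(*choices)]


print(expand_hostname("hostA[1,3]B[1-3]"))
# prints: ['hostA1B1', 'hostA1B2', 'hostA1B3', 'hostA3B1', 'hostA3B2', 'hostA3B3']
```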

lsf.conf

  • The LSB_BJOBS_FORMAT parameter now has the following fields:
    • effective_plimit, plimit_remain, effective_eplimit, and eplimit_remain to display the job's pending time limit, remaining pending time, eligible pending time limit, and remaining eligible pending time. You can use the -p option to show only information for pending jobs.
    • "pend_reason" shows the pending reason field of a job. If a job has no pending reason (for example, the job is running), then the pend_reason field of the job is NULL and shows a hyphen (-).
  • LSB_BCONF_PROJECT_LIMITS: By default, LSF does not allow the bconf command to create project-based limits because LSF schedules jobs faster if the cluster has no project-based limits. If you need to use the bconf command to dynamically create project-based limits when the cluster is running, set the LSB_BCONF_PROJECT_LIMITS parameter to Y.
  • LSB_BJOBS_PENDREASON_LEVEL: This new parameter sets the default behavior when a user enters the command bjobs -p, without specifying a level of 0 to 3. For upgraded clusters, if the LSB_BJOBS_PENDREASON_LEVEL parameter is not configured, the level for the bjobs -p command is 0 by default. For new clusters, the LSB_BJOBS_PENDREASON_LEVEL parameter is set to 1 in the installation template, showing the single key reason by default.
  • LSB_BMGROUP_ALLREMOTE_EXPAND: When set to N or n in the IBM® Spectrum LSF multicluster capability resource leasing model, the bmgroup command displays leased-in hosts with a single keyword allremote instead of being displayed as a list. Otherwise, a list of leased-in hosts is displayed in the HOSTS column in the form host_name@cluster_name by default.
  • LSB_DEBUG_GPD: This new parameter sets the debugging log class for gpolicyd. Only messages that belong to the specified log class are recorded.
  • LSB_EXCLUDE_HOST_PERIOD: Specifies the amount of time, in the number of mbatchd sleep time units (MBD_SLEEP_TIME), that a host remains excluded from a job. When this time expires, the hosts are no longer excluded and the job can run on the host again.

    This parameter does not apply to the IBM Spectrum LSF multicluster capability job lease model.

  • LSB_ESUB_SINGLE_QUOTE: When set to Y or y, the values in the LSB_SUB_PROJECT_NAME parameter that are written to the $LSB_SUB_PARM_FILE file for esub processes are enclosed in single quotation marks ('). The shell is prevented from processing certain meta characters in parameter values such as $.

    Otherwise, the LSB_SUB_PROJECT_NAME parameter values written to the $LSB_SUB_PARM_FILE file for esub processes are enclosed in double quotation marks (") by default.

  • LSB_GSLA_PREFER_ADRSV_HOST: When set to Y or y, the guaranteed SLA first tries to reserve hosts without advance reservation. The LSB_GSLA_PREFER_ADRSV_HOST parameter ensures that advance reservation does not interfere with guaranteed SLA job scheduling.
  • LSB_GPD_CLUSTER: This new parameter defines the names of clusters whose master hosts start gpolicyd for global fair share policy among multiple clusters. The LSB_GPD_CLUSTER parameter must be configured for every cluster that participates in global fair share.
  • LSB_GPD_PORT: This new parameter defines the TCP service port for communication with gpolicyd. The LSB_GPD_PORT parameter must be configured for every cluster that participates in global fair share.
  • LSB_MBD_MAX_SIG_COUNT: Obsolete in LSF 10.1.
  • LSB_SUPPRESS_CUSTOM_REASONS: This new parameter allows individual users to disable display of customized pending reasons for the new Single key reason feature (bjobs -p1) and Categorized Pending Reasons feature (bjobs -p2 and bjobs -p3).

    By default, the value of the LSB_SUPPRESS_CUSTOM_REASONS parameter is set to N. This parameter applies to all bjobs -p levels except -p0. The bjobs -p0 command displays pending reasons in the style previous to version 10.1, without using the single key reason or the categorized pending reason features.

  • LSB_TERMINAL_SERVICE_PORT: Specifies the terminal service port number for Remote Desktop Protocol (RDP). This port is used in tssub jobs.
  • LSB_TIME_GPD: This new parameter sets a timing level for checking how long gpolicyd routines run. Time usage is logged in milliseconds.
  • LSF_AUTH: You can now specify LSF_AUTH=none to disable authentication. Use the LSF_AUTH=none parameter only for performance benchmarking.
  • LSF_CONNECTION_CHANGE: Windows only. When set to Y or y, enables lsreghost to register with LSF servers whenever it detects a change in the total number of connections (IP addresses) that are associated with the local host. This parameter is only valid if registration handling is enabled for LSF hosts (that is, LSF_REG_FLOAT_HOSTS=Y is set in the lsf.conf file on the LSF server).
  • LSF_CRAY_RUR_ACCOUNTING: For LSF on Cray. Specify N to disable RUR job accounting if RUR is not enabled in your Cray environment, or to increase performance. Default value is Y (enabled).
  • LSF_CRAY_RUR_DIR: Location of the Cray RUR data files, which is a shared file system that is accessible from any potential first execution host. Default value is LSF_SHARED_DIR/<cluster_name>/craylinux/<cray_machine_name>/rur.
  • LSF_CRAY_RUR_PROLOG_PATH: File path to the RUR prolog script file. Default value is /opt/cray/rur/default/bin/rur_prologue.py.
  • LSF_CRAY_RUR_EPILOG_PATH: File path to the RUR epilog script file. Default value is /opt/cray/rur/default/bin/rur_epilogue.py.
  • LSF_DISCARD_LOG: Specifies the behavior of the mbatchd and mbschd logging threads if the logging queue is full.

    If set to Y, the logging thread discards all new messages at a level lower than LOG_WARNING when the logging queue is full. LSF logs a summary of the discarded messages later.

    If set to N, LSF automatically extends the size of the logging queue if the logging queue is full.

  • LSF_LOG_QUEUE_SIZE: Specifies the maximum number of entries in the logging queues that the mbatchd and mbschd logging threads use before the logging queue is full.
  • LSF_LOG_THREAD: If set to N, mbatchd and mbschd do not create dedicated threads to write messages to the log files.
  • LSF_PLATFORM_COMPATIBILITY: Allows for compatibility with the earlier IBM Platform product name in LSF 10.1 and later. Set it to y or Y in the lsf.conf file to enable the lsid command and the -V option of LSF commands to display "IBM Platform LSF" instead of "IBM Spectrum LSF". The LSF_PLATFORM_COMPATIBILITY parameter solves compatibility issues between LSF 10.1 and older versions of IBM Platform Process Manager.
  • LSF_REG_FLOAT_HOSTS: When set to Y or y, enables registration handling for LSF hosts so that LSF servers can resolve these hosts without requiring the use of DNS servers.
  • LSF_REG_HOST_INTERVAL: Windows only. Specifies the interval, in minutes, in which lsreghost sends more registration messages to LSF servers. This parameter is only valid if registration handling is enabled for LSF hosts (that is, LSF_REG_FLOAT_HOSTS=Y is set in the lsf.conf file on the LSF server).
  • LSF_REPLACE_PIM_WITH_LINUX_CGROUP: Minimizes the impact of PIM daemon processing load for parallel jobs. PIM collects job processes, the relationship among all processes, the memory usage of each process, and the CPU time of each process periodically. Those actions can influence the execution of parallel jobs (so-called OS jitter). To minimize OS jitter, you can configure the LSF cgroup feature. This parameter is only supported on Linux. The parameter is ignored on other operating systems. The LSF cgroup feature does not support PAM jobs, so you cannot disable PIM if you run PAM jobs.
  • In LSF Version 10.1 Fix Pack 2, the LSB_BJOBS_FORMAT parameter now has the following fields:
    • jobindex shows the job array index.
    • estimated_run_time shows estimated run time of the job.
    • ru_utime and ru_stime show the user time used and the system time used from the resource usage information for the job.
    • nthreads shows the number of threads that the job used.
    • hrusage shows the per-host resource usage information.
    • plimit and eplimit show the pending time limit and eligible time limit.
    • licproject shows the license project information.
    • srcjobid, dstjobid, and source_cluster show the submission cluster job ID, execution cluster job ID, and the name of the submission cluster when using the LSF multicluster capability.
  • LSF_INTELLIGENT_CPU_BIND: In LSF Version 10.1 Fix Pack 2, this new parameter enables LSF to bind a defined set of LSF daemons to CPUs.
  • LSB_BWAIT_REREG_INTERVAL: In LSF Version 10.1 Fix Pack 2, this new parameter specifies the default time interval to reregister the wait condition from the bwait command to the mbatchd daemon, in minutes.
  • LSF_HOST_CACHE_NTTL: In LSF Version 10.1 Fix Pack 2, the default value of this parameter is increased from 20s to 60s, which is the maximum valid value.
  • LSB_QUERY_PORT: In LSF Version 10.1 Fix Pack 2, the value of this parameter is now set to 6891 at the time of installation, which enables the multithread mbatchd job query daemon and specifies the port number that the mbatchd daemon uses for LSF query requests.
  • LSB_QUERY_ENH: In LSF Version 10.1 Fix Pack 2, the value of this parameter is now set to Y at the time of installation, which extends multithreaded query support to batch query requests (in addition to bjobs query requests).
  • LSF_DCGM_PORT: In LSF Version 10.1 Fix Pack 2, this new parameter enables the NVIDIA Data Center GPU Manager (DCGM) features and specifies the port number that LSF uses to communicate with the DCGM daemon.
  • LSF_ENABLE_TMP_UNIT: In LSF Version 10.1 Fix Pack 2, this new parameter allows units that are defined by the LSF_UNIT_FOR_LIMITS parameter to also apply cluster-wide to the tmp resource.
  • LSB_RC_MAX_INSTANCES_PER_TEMPLATE: In LSF Version 10.1 Fix Pack 2, this new parameter for the LSF resource connector specifies the maximum number of resource instances that can be launched for any template for any resource provider in the cluster. The default value is 50.
  • LSB_BHOSTS_FORMAT: In LSF Version 10.1 Fix Pack 2, this new parameter customizes specific fields that the bhosts command displays.
  • LSB_BQUEUES_FORMAT: In LSF Version 10.1 Fix Pack 2, this new parameter customizes specific fields that the bqueues command displays.
  • LSB_HMS_TIME_FORMAT: In LSF Version 10.1 Fix Pack 2, this new parameter displays times from the customized bjobs -o command output in hh:mm:ss format. This parameter setting only applies to bjobs -o or bjobs -o -json command output.
  • LSB_PROFILE_MBD: In LSF Version 10.1 Fix Pack 3, this new parameter configures the mbatchd daemon profiler to track the time that mbatchd spends on key functions.
  • LSB_PROFILE_SCH: In LSF Version 10.1 Fix Pack 3, this new parameter configures the mbschd daemon profiler to track the time that mbschd spends on key functions.
  • In LSF Version 10.1 Fix Pack 3, the LSB_BJOBS_FORMAT option now has the following fields:
    • rsvid shows the reservation ID, if the job is associated with an advance reservation.
  • LSF_LSLOAD_FORMAT: In LSF Version 10.1 Fix Pack 3, this new parameter customizes specific fields that the lsload command displays.
  • LSB_GPU_NEW_SYNTAX: In LSF Version 10.1 Fix Pack 3, this new parameter enables the bsub -gpu option to submit jobs that require GPU resources.
  • LSF_ENABLE_BEAT_SERVICE: In LSF Version 10.1 Fix Pack 4, this new parameter enables the lsfbeat tool, which integrates energy accounting into LSF. IBM Spectrum LSF Explorer uses Elasticsearch in the collection of the energy data of each host using Beats. With this tool enabled, LSF can query the data from IBM Spectrum LSF Explorer, and the bjobs and bhosts commands display the job level or host level energy to users.
  • LSF_QUERY_ES_SERVERS: In LSF Version 10.1 Fix Pack 4, this new parameter specifies LSF Explorer servers to retrieve log records. Use this parameter to enable the supported commands (as defined by the LSF_QUERY_ES_FUNCTIONS parameter) to use LSF Explorer to get log records instead of parsing the log files to get data.
  • LSF_QUERY_ES_FUNCTIONS: In LSF Version 10.1 Fix Pack 4, this new parameter specifies the commands and functions that use LSF Explorer to retrieve job records.
  • LSF_LSLOAD_FORMAT: In LSF Version 10.1 Fix Pack 4, this parameter now has the following fields:
    • gpu_status* shows the status of the GPU (ok, error, or warning). If more than 1 GPU is reported, an index is appended to the resource name, starting at 0. For example, gpu_status0 and gpu_status1.
    • gpu_error* shows the detailed error or warning message if the gpu_status* field is not ok. If more than 1 GPU is reported, an index is appended to the resource name, starting at 0. For example, gpu_error0 and gpu_error1.
  • LSB_GPU_AUTOBOOST: In LSF Version 10.1 Fix Pack 4, this parameter is now obsolete because LSF synchronizes the GPU auto-boost to resolve any issues that previously required disabling the auto-boost.
  • LSF_HWLOC_DYNAMIC: In LSF Version 10.1 Fix Pack 4, this new parameter enables LSF to dynamically load the hardware locality (hwloc) library from system library paths whenever it is needed. If LSF fails to load the library, LSF defaults to using the hwloc functions in the static library.
  • LSB_ESWITCH_METHOD: In LSF Version 10.1 Fix Pack 4, this new parameter specifies a mandatory eswitch executable file that applies to all job switch requests.
  • LSF_MC_FORWARD_FAIRSHARE_CHARGE_DURATION: In LSF Version 10.1 Fix Pack 4, this new parameter specifies the duration of time after which LSF removes the forwarded jobs from the user priority calculation for fair share scheduling. This parameter is used if global fair share scheduling is enabled for the LSF multicluster capability job forwarding model.
  • LSB_START_EBROKERD: In LSF Version 10.1 Fix Pack 4, this new parameter enables the mbatchd daemon to start the ebrokerd daemon whenever mbatchd starts up, is reconfigured, or when it detects that the old ebrokerd daemon exits. This is required to use advance reservation prescripts and post-scripts. The ebrokerd daemon also starts automatically if the LSF resource connector is configured and in use.
  • LSF_MQ_BROKER_HOSTS: In LSF Version 10.1 Fix Pack 4, this new parameter for LSF resource connector enables support for the bhosts -rc and bhosts -rconly command options to get LSF resource connector provider host information.
  • MQTT_BROKER_HOST: In LSF Version 10.1 Fix Pack 4, this new parameter for LSF resource connector specifies the host name that the MQTT message broker daemon (mosquitto) runs on, if you do not use the mosquitto daemon that is provided with LSF. The MQTT message broker receives provider host information from ebrokerd and publishes that information for the bhosts -rc and bhosts -rconly command options to display.
  • MQTT_BROKER_PORT: In LSF Version 10.1 Fix Pack 4, this new parameter for LSF resource connector specifies an optional TCP port for the MQTT message broker daemon (mosquitto), if you do not use the mosquitto daemon that is provided with LSF. The MQTT message broker receives provider host information from ebrokerd and publishes that information for the bhosts -rc and bhosts -rconly command options to display.
  • EBROKERD_HOST_CLEAN_DELAY: In LSF Version 10.1 Fix Pack 4, this new parameter for LSF resource connector specifies the delay, in minutes, after which the ebrokerd daemon removes information about relinquished or reclaimed hosts. This parameter allows the bhosts -rc and bhosts -rconly command options to get LSF resource connector provider host information for some time after they are deprovisioned.
  • LSF_UGROUP_TRANSFER: In LSF Version 10.1 Fix Pack 5, this new parameter transfers secondary user group IDs from the submission host to the execution host for job execution, thereby overcoming an NFS limitation of 16 user groups.
  • LSF_UDP_PORT_RANGE: In LSF Version 10.1 Fix Pack 5 and Fix Pack 6, this new parameter defines the UDP port range to be used by the LSF daemons. If defined, the UDP socket for the LSF daemons binds to one port in the specified range.
  • LSF Version 10.1 Fix Pack 5 and Fix Pack 6 now have the following parameters for running jobs with IBM Cluster Systems Manager (CSM):
    • LSB_JSM_DEFAULT specifies the default value for the bsub -jsm option for CSM jobs.
    • LSB_STAGE_IN_EXEC specifies the stage in script for direct data staging (for example, IBM CAST burst buffer).
    • LSB_STAGE_OUT_EXEC specifies the stage out script for direct data staging.
    • LSB_STAGE_STORAGE specifies the resource name to report available storage space for direct data staging.
    • LSB_STAGE_TRANSFER_RATE specifies the estimated data transfer rate for the burst buffer. LSF uses this value to calculate the predicted duration for data stage in.
    • LSB_STEP_CGROUP_DEFAULT specifies the default value for the bsub -step_cgroup option for CSM jobs.
  • In LSF Version 10.1 Fix Pack 6, the LSB_BJOBS_FORMAT parameter now has the following fields:
    • gpfsio shows job usage (I/O) data on IBM Spectrum Scale if IBM Spectrum Scale I/O accounting with IBM Spectrum LSF Explorer is enabled by setting LSF_QUERY_ES_FUNCTIONS="gpfsio" or "all" in the lsf.conf file.
  • LSF_QUERY_ES_FUNCTIONS: In LSF Version 10.1 Fix Pack 6, this parameter now allows you to specify the gpfsio function, which enables IBM Spectrum Scale I/O accounting with IBM Spectrum LSF Explorer.
  • LSF_GPU_AUTOCONFIG: In LSF Version 10.1 Fix Pack 6, this new parameter controls whether LSF enables use of GPU resources automatically.
  • LSB_GPU_NEW_SYNTAX: In LSF Version 10.1 Fix Pack 6, this parameter now has extend as a new keyword. If LSB_GPU_NEW_SYNTAX=extend is set, you can specify the gmem, gmodel, gtile, and nvlink GPU requirements with the bsub -gpu option, the GPU_REQ parameter in the lsb.queues file, the GPU_REQ parameter in the lsb.applications file, or the LSB_GPU_REQ parameter in the lsf.conf file.
  • LSB_GPU_REQ: In LSF Version 10.1 Fix Pack 6, this new parameter specifies the default GPU requirements for the cluster.
  • LSB_GSLA_DISPLAY_ALLOC_HOSTS: In LSF Version 10.1 Fix Pack 7, this new parameter enables the bsla command to display information on guarantee hosts that are being used (allocated) from each guarantee pool for the guarantee SLA.
  • LSB_BSUB_PARSE_SCRIPT: In LSF Version 10.1 Fix Pack 7, this new parameter enables the bsub command to load, parse, and run job scripts from the command line.
  • LSF_LSHOSTS_FORMAT: In LSF Version 10.1 Fix Pack 7, this new parameter customizes specific fields that the lshosts command displays.
  • LSB_STAGE_MAX_STAGE_IN: In LSF Version 10.1 Fix Pack 7, this new parameter specifies the maximum number of concurrent stage-in processes that run on a host, which prevents LSF from launching too many stage-in processes to transfer files to the host.
  • LSB_STAGE_STORAGE: In LSF Version 10.1 Fix Pack 7, this parameter now allows you to specify a resource that reports the total storage space in addition to a resource that reports the available storage space. This prevents LSF from assigning more storage than is available because the resource information might be out of date. This can occur for direct data staging jobs where the job handles the file transfer instead of LSF because LSF cannot reliably predict the storage usage for these jobs.
  • LSB_PLAN_KEEP_RESERVE: In LSF Version 10.1 Fix Pack 7, this new parameter enables LSF to keep the resource reservations for jobs with plans, even if the plan is no longer valid, until LSF creates new plans based on updated resource availability.
  • In LSF Version 10.1 Fix Pack 7, the LSB_BJOBS_FORMAT parameter now has the following fields:
    • nreq_slot shows the calculated number of requested slots for jobs.
    • gpu_num shows the number of physical GPUs that the job is using.
    • gpu_mode shows the GPU compute mode that the job is using (shared or exclusive_process).
    • gpu_alloc shows the job-based GPU allocation information.
    • j_exclusive shows whether the job requested exclusive allocated GPUs (that is, if the GPUs cannot be shared with other jobs).
    • kill_reason shows the user-specified reason for killing the job.
  • LSF_IMAGE_INFO_PUBLISH_INTERVAL: In LSF Version 10.1 Fix Pack 7, this new parameter specifies the interval for the lim process to fork a new process to collect host Docker container image information.
  • LSF_IMAGE_INFO_EXPIRE_INTERVAL: In LSF Version 10.1 Fix Pack 7, this new parameter specifies how long the host image information is available in mosquitto before the information expires.
  • LSF_EXT_SERVERDIR: In LSF Version 10.1 Fix Pack 7, this new parameter specifies a secure directory in which the eauth and esub.application binary files are located.
  • LSF_ENV_OVERRIDE: In LSF Version 10.1 Fix Pack 7, this new parameter specifies whether environment variable values and the $LSF_ENVDIR/lsf.conf file parameters can override the parameter settings in the /etc/lsf.conf file.
  • LSB_GPU_REQ has the following changes in LSF Version 10.1 Fix Pack 8:
    • You can now specify aff=no in the GPU requirements to relax GPU affinity while maintaining CPU affinity. By default, aff=yes is set to maintain strict GPU-CPU affinity binding.
    • You can now specify mps=per_socket in the GPU requirements to enable LSF to start one MPS daemon per socket per job.
    • You can now specify mps=per_gpu in the GPU requirements to enable LSF to start one MPS daemon per GPU per job.
  • LSF_AC_PNC_URL: In LSF Version 10.1 Fix Pack 8, this new parameter specifies the URL and listen port of the LSF Application Center Notifications server for sending notifications. If the listen port is not specified, the default port number is 80.
  • LSB_RC_TEMPLATE_REQUEST_DELAY: In LSF Version 10.1 Fix Pack 8, this new parameter for LSF resource connector specifies the amount of time that LSF waits before repeating a request for a template, in minutes, if the ebrokerd daemon encountered certain provider errors.
  • LSB_RC_MQTT_ERROR_LIMIT: In LSF Version 10.1 Fix Pack 8, this new parameter for LSF resource connector specifies the maximum number of API error messages that are stored in Mosquitto per host provider. This parameter specifies the maximum number of messages that the badmin rc error command displays for each host provider.
  • LSB_GPU_REQ: In LSF Version 10.1 Fix Pack 9, you can now specify mps=yes,share, mps=per_socket,share, and mps=per_gpu,share in the GPU requirements to enable LSF to share the MPS daemon on the host, socket, or GPU for jobs that are submitted by the same user with the same resource requirements.

    That is, you can now add ",share" to the mps value to enable MPS daemon sharing for the host, socket, or GPU.

    In addition, you can now assign the number of GPUs per task or per host by specifying num=number/task or num=number/host. By default, the number of GPUs is still assigned per host.

  • LSB_BUSERS_FORMAT: In LSF Version 10.1 Fix Pack 9, this new parameter customizes specific fields that the busers command displays.
  • LSF_DATA_BSUB_CHKSUM: In LSF Version 10.1 Fix Pack 9, this new parameter enables the bsub and bmod commands to perform a full sanity check on the files and folders for jobs with a data requirement, and to generate the hash for each file and folder. If not specified, these operations occur at the transfer job.
  • LSB_JOB_REPORT_MAIL: In LSF Version 10.1 Fix Pack 9, you can now specify ERROR for the sbatchd daemon to send mail only when the job exits (that is, when the job is under Exit status). This ensures that an email notification is sent only on a job error.
  • In LSF Version 10.1 Fix Pack 9, the LSB_BJOBS_FORMAT parameter now has the ask_hosts field, which shows the list of requested hosts as specified by the bsub -m command option.
  • LSB_MEMLIMIT_ENF_CONTROL: In LSF Version 10.1 Fix Pack 9, you can now exclude the swap threshold from memory limit enforcement and specify only the memory threshold. To exclude swap threshold, specify a value of 0 for the swap threshold.
  • LSF_DATA_NO_SSH_CHK_HOSTS: In LSF Version 10.1 Fix Pack 9, this new parameter specifies a list of data hosts for which ssh is not needed. If the host specified with the data specification of a submitted job matches one of the hosts in this list, LSF assumes that the submission host can directly access the file within the data specification.
  • In LSF Version 10.1 Fix Pack 10, the LSB_BJOBS_FORMAT parameter now has the following fields:
    • suspend_reason shows the user-specified reason for suspending (stopping) the job.
    • resume_reason shows the user-specified reason for resuming the job.
    • kill_issue_host shows the host that issued the job kill request.
    • suspend_issue_host shows the host that issued the job suspend (stop) request.
    • resume_issue_host shows the host that issued the job resume request.
  • LSB_SUBK_SHOW_JOBID: In LSF Version 10.1 Fix Pack 10, this new parameter enables the bsub -K command option to display the job ID of a job after it is finished.
  • LSB_GPU_REQ has the following changes in LSF Version 10.1 Fix Pack 10:
    • You can now add the ",nocvd" keyword to the existing mps value in the GPU resource requirements string to disable the CUDA_VISIBLE_DEVICES environment variable for MPS jobs.
    • You can now specify block=yes in the GPU resource requirements string to enable block distribution of allocated GPUs.
    • You can now specify gpack=yes in the GPU resource requirements string to enable pack scheduling for shared mode GPU jobs.
  • LSB_NCPU_ENFORCE: In LSF Version 10.1 Fix Pack 10, this parameter is now enabled (that is, set to 1) at the time of installation for new LSF installations.
  • LSB_MAX_JOB_DISPATCH_PER_SESSION: In LSF Version 10.1 Fix Pack 10, the default value of this parameter is changed to 15000.
  • LSF_ACCEPT_NUMCLIENTS: In LSF Version 10.1 Fix Pack 10, this new parameter specifies the maximum number of new client connections to the mbatchd port that mbatchd accepts during each scheduling cycle. Previously, this value was fixed at 1.
  • LSF_GPU_RESOURCE_IGNORE: In LSF Version 10.1 Fix Pack 10, this new parameter enables the mbatchd and mbschd daemons to ignore GPU resources. This means that the lsload -s, lsload -l, and bhosts -l commands, which display LSF resources, no longer display information about GPU resources. That is, these options do not display gpu_<num>n resources.
  • LSF_ROOT_REX: In LSF Version 10.1 Fix Pack 10, this parameter is obsolete and no longer allows root execution privileges for jobs from local and remote hosts. Any actions that were performed as root must instead be performed as the LSF administrator.
  • LSF_ROOT_USER: In LSF Version 10.1 Fix Pack 10, this new parameter enables the root user to perform actions as a valid user from the LSF command line.
    Important: Only enable LSF_ROOT_USER=Y as a temporary configuration setting. When you are done, you must disable this parameter to ensure that your cluster remains secure.
  • LS_ROOT_USER: In LSF Version 10.1 Fix Pack 10, this new parameter enables the root user to run LSF License Scheduler commands (bladmin, blkill, globauth, and taskman) as a valid user from the LSF command line.
    Important: Only enable LS_ROOT_USER=Y as a temporary configuration setting. When you are done, you must disable this parameter to ensure that your cluster remains secure.
  • LSF_ADDON_HOSTS: In LSF Version 10.1 Fix Pack 10, this new parameter specifies a list of hosts for LSF Application Center, LSF RTM, or LSF Explorer that require root privileges to remotely execute commands.
  • LSB_BJOBS_FORMAT: In LSF Version 10.1 Fix Pack 11, this parameter now allows you to display a unit prefix for the following resource fields: mem, max_mem, avg_mem, memlimit, swap, swaplimit, corelimit, stacklimit, and hrusage (for hrusage, the unit prefix is for mem and swap resources only).

    In addition, the default width for these resource fields, except hrusage, is increased from 10 to 15. That is, the following output fields now have a default width of 15 instead of 10: mem, max_mem, avg_mem, memlimit, swap, swaplimit, corelimit, and stacklimit.

  • LSF_DISABLE_LSRUN: In LSF Version 10.1 Fix Pack 11, this parameter now also enables RES to refuse remote connections from the lsmake command, in addition to refusing remote connections from the lsrun and lsgrun commands.
  • LSF_GPU_RESOURCE_IGNORE: In LSF Version 10.1 Fix Pack 11, if LSF_GPU_AUTOCONFIG is set to Y and LSB_GPU_NEW_SYNTAX is set to Y or extend, setting LSF_GPU_RESOURCE_IGNORE to Y also enables LSF to remove all built-in GPU resources (gpu_<num>n) from the management host LIM. LSF uses a different method for the management host LIM to collect GPU information in the cluster.
  • LSB_GPU_NEW_SYNTAX: In LSF Version 10.1 Fix Pack 11, if GPU preemption is enabled (that is, PREEMPTABLE_RESOURCES parameter in the lsb.params file includes the ngpus_physical resource), setting LSB_GPU_NEW_SYNTAX=extend removes several restrictions to GPU preemption:
    • Non-GPU jobs can now preempt lower priority GPU jobs.
    • GPU jobs no longer have to be configured for automatic job migration and rerun to be preemptable by higher priority jobs. That is, the MIG parameter no longer has to be defined and the RERUNNABLE parameter no longer has to be set to yes in the lsb.queues or lsb.applications file. Ensure that you properly configure the MIG, RERUNNABLE, or REQUEUE parameters to ensure that GPU resources are properly released after the job is preempted.
    • GPU jobs no longer have to have either mode=exclusive_process or j_exclusive=yes set to be preempted by other GPU jobs. GPU jobs can also use mode=shared if the GPU is used by only one shared-mode job.

      Higher priority GPU jobs cannot preempt shared-mode GPU jobs if there are multiple jobs running on the GPU.

    Setting LSB_GPU_NEW_SYNTAX=Y enables GPU preemption with the previous restrictions (as introduced in LSF Version 10.1 Fix Pack 7).

  • LSB_KRB_IMPERSONATE: In LSF Version 10.1 Fix Pack 11, this new parameter enables Kerberos user impersonation if external authentication is enabled (LSF_AUTH=eauth in the file lsf.conf).
  • LSF_STRICT_CHECKING: In LSF Version 10.1 Fix Pack 11, you can now set this parameter to ENHANCED, which enables LSF to add a checksum to each authorization request in addition to enabling stricter checking of communications between LSF daemons and between LSF commands and daemons.
  • LSF_AUTH_QUERY_COMMANDS: In LSF Version 10.1 Fix Pack 11, this new parameter enables query command authentication.
  • LSF_MANAGE_MIG: In LSF Version 10.1 Fix Pack 11, this new parameter enables dynamic MIG scheduling.
  • LSB_BHOSTS_FORMAT: In LSF Version 10.1 Fix Pack 11, this parameter has a new mig_alloc keyword to display the MIG allocation information in the bhosts customized output.
  • LSF_ENV_OVERRIDE: In LSF Version 10.1 Fix Pack 12, the default value of this parameter is changed to N.
  • In LSF Version 10.1 Fix Pack 12, the following parameters are deprecated and will be removed in a future version:
    • LSB_CHUNK_RUSAGE
    • LSB_CPUSET_BESTCPUS
    • LSB_CPUSET_DISPLAY_CPULIST
    • LSB_GPU_NEW_SYNTAX: Now fixed to extend.
    • LSF_CPUSETLIB
    • LSF_GPU_AUTOCONFIG: Now fixed to Y.
    • LSF_GPU_RESOURCE_IGNORE: Now fixed to Y.
    • LSF_PAM_APPL_CHKPNT
    • LSF_PAM_CLEAN_JOB_DELAY
    • LSF_PAM_HOSTLIST_USE
    • LSF_PAM_PLUGINDIR
    • LSF_PAM_USE_ASH
    • LSF_PE_NETWORK_NUM
    • LSF_PE_NETWORK_UPDATE_INTERVAL
    • LSF_SHELL_AT_USERS
    • LSF_STRICT_RESREQ: Now fixed to Y.
    • LSF_TOPD_PORT
    • LSF_TOPD_TIMEOUT
  • LSF_STRICT_CHECKING: In LSF Version 10.1 Fix Pack 12, the default value of this parameter is now ENHANCED.
  • LSF_AUTH_QUERY_COMMANDS: In LSF Version 10.1 Fix Pack 12, the default value of this parameter is now Y.
  • LSF_ADDON_HOSTS: In LSF Version 10.1 Fix Pack 12, this parameter is now required if you are running LSF Application Center, LSF Explorer, LSF Process Manager, or LSF RTM.
  • The LSB_BQUEUES_FORMAT parameter now has the following fields for limits and resources:
    • max_corelimit, max_cpulimit, default_cpulimit, max_datalimit, default_datalimit, max_filelimit, max_memlimit, default_memlimit, max_processlimit, max_runlimit, default_runlimit, max_stacklimit, max_swaplimit, max_tasklimit, min_tasklimit, default_tasklimit, max_threadlimit, default_threadlimit, res_req, hosts.
    • The following resource limit fields show the same content as their corresponding maximum resource limit fields: corelimit, cpulimit, datalimit, filelimit, memlimit, processlimit, runlimit, stacklimit, swaplimit, tasklimit, threadlimit.

      For example, corelimit is the same as max_corelimit.

  • LSB_BWAIT_IN_JOBS: In LSF 10.1 Fix Pack 13, this new parameter specifies whether LSF can use the bwait command within a job.
  • LSF_GPU_AUTOCONFIG: In LSF 10.1 Fix Pack 13, the default value of this parameter changed from N to Y.
  • LSB_GPU_NEW_SYNTAX: In LSF 10.1 Fix Pack 13, the default value of this parameter changed from not defined to extend.
  • LSF_GPU_RESOURCE_IGNORE: In LSF 10.1 Fix Pack 13, the default value of this parameter changed from N to Y.
  • Starting in LSF Version 10.1 Fix Pack 13, existing LSF commands that support host names as a parameter option now also accept host groups. For details on the affected commands, see battr, bresume, brvs, lshosts, and lsload.
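The GPU requirement keywords that were added to LSB_GPU_REQ across Fix Pack 8, 9, and 10 can be combined in one string. The following Python sketch is illustrative only and is not part of LSF. The helper name build_gpu_req is hypothetical, and the colon separator between key=value pairs, the ordering of the keywords, and the ability to combine ",share" with ",nocvd" are assumptions that you should verify against the LSB_GPU_REQ reference before use.

```python
from typing import Optional


def build_gpu_req(num: int = 1,
                  per: Optional[str] = None,    # "task" or "host" (Fix Pack 9)
                  mode: Optional[str] = None,   # "shared" or "exclusive_process"
                  mps: Optional[str] = None,    # "yes", "per_socket", or "per_gpu"
                  share: bool = False,          # appends ",share" (Fix Pack 9)
                  nocvd: bool = False,          # appends ",nocvd" (Fix Pack 10)
                  aff: Optional[bool] = None,   # aff=yes or aff=no (Fix Pack 8)
                  block: bool = False,          # block=yes (Fix Pack 10)
                  gpack: bool = False) -> str:  # gpack=yes (Fix Pack 10)
    """Compose a GPU requirement string (hypothetical helper)."""
    if per not in (None, "task", "host"):
        raise ValueError("per must be 'task' or 'host'")
    if mps not in (None, "yes", "per_socket", "per_gpu"):
        raise ValueError("mps must be 'yes', 'per_socket', or 'per_gpu'")
    if (share or nocvd) and mps is None:
        raise ValueError("share and nocvd only modify an existing mps value")

    parts = [f"num={num}" + (f"/{per}" if per else "")]
    if mode:
        parts.append(f"mode={mode}")
    if mps:
        value = mps + (",share" if share else "") + (",nocvd" if nocvd else "")
        parts.append(f"mps={value}")
    if aff is not None:
        parts.append("aff=" + ("yes" if aff else "no"))
    if block:
        parts.append("block=yes")
    if gpack:
        parts.append("gpack=yes")
    # Assumption: key=value pairs are joined with colons.
    return ":".join(parts)


req = build_gpu_req(num=2, per="task", mode="shared",
                    mps="per_gpu", share=True, aff=False)
print(f'LSB_GPU_REQ="{req}"')
```

The helper only assembles text. It does not check whether a particular combination of keywords is accepted by your fix pack level.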

lsf.datamanager

  • CACHE_REFRESH_INTERVAL: In LSF Version 10.1 Fix Pack 9, this parameter is added to the Parameters section to limit the number of transfer jobs to the data manager by setting a refresh interval for the file cache. This is due to a change to the default behavior of jobs submitted to the data manager. The sanity check for the existence of files or folders and whether the user can access them, discovery of the size and modification of the files or folders, and generation of the hash from the bsub and bmod commands is moved to the transfer job. This equalizes the performance of submitting and modifying jobs with and without data requirements.

lsf.licensescheduler

  • LM_RESERVATION in the Parameters and Feature section: In LSF Version 10.1 Fix Pack 3, this new parameter enables LSF License Scheduler to support the FlexNet Manager reservation keyword (RESERVE). LSF License Scheduler treats the RESERVE value in the FlexNet Manager license option file as OTHERS tokens instead of FREE tokens. The RESERVE value is now included in the OTHERS value in the blstat command output and is no longer included in the FREE value.
  • In LSF Version 10.1 Fix Pack 12, the following parameters are deprecated and will be removed in a future version:
    • ACCINUSE_INCLUDES_OWNERSHIP
    • FAST_DISPATCH: Now fixed to Y.
    • GROUP
    • LOCAL_TO
    • LS_ACTIVE_PERCENTAGE
  • Clusters section: In LSF 10.1 Fix Pack 13, this new section specifies the licenses to share as global resources and the clusters that will share these licenses.

lsf.sudoers

  • In LSF Version 10.1 Fix Pack 10, you must enable the setuid bit for the LSF administration commands to use the lsf.sudoers file. Run the hostsetup --setuid command option on the LSF master and candidate hosts. Since this allows LSF administration commands to run with root privileges, do not enable the setuid bit if you do not want these LSF commands to run with root privileges.
  • LSF_EAUTH_OLDKEY: In LSF Version 10.1 Fix Pack 12, this parameter specifies the previous key that eauth used to encrypt and decrypt user authentication data after you specify a new eauth key. To use this parameter, you must also define the LSF_EAUTH_OLDKEY_EXPIRY parameter to specify an expiry date for the old key.
  • LSF_EAUTH_OLDKEY_EXPIRY: In LSF Version 10.1 Fix Pack 12, this parameter specifies the expiry date for the previous eauth key (LSF_EAUTH_OLDKEY parameter), after which the previous key no longer works and only the new LSF_EAUTH_KEY parameter works.
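The relationship between LSF_EAUTH_KEY, LSF_EAUTH_OLDKEY, and LSF_EAUTH_OLDKEY_EXPIRY can be pictured with the following Python sketch. It is a model of the documented behavior only, not the eauth implementation. The function name acceptable_keys is hypothetical, and treating the exact expiry instant as already expired is an assumption.

```python
from datetime import datetime
from typing import List, Optional


def acceptable_keys(now: datetime,
                    new_key: str,
                    old_key: Optional[str] = None,
                    old_key_expiry: Optional[datetime] = None) -> List[str]:
    """Return the keys that would be honored at time `now` (illustrative model)."""
    keys = [new_key]  # the current LSF_EAUTH_KEY value always works
    # The previous key is honored only if an expiry date is also defined
    # and that date has not yet been reached.
    if old_key is not None and old_key_expiry is not None and now < old_key_expiry:
        keys.append(old_key)
    return keys


expiry = datetime(2030, 1, 31)
print(acceptable_keys(datetime(2030, 1, 15), "new-secret", "old-secret", expiry))
print(acceptable_keys(datetime(2030, 2, 1), "new-secret", "old-secret", expiry))
```

The first call falls inside the rotation window, so both keys are returned. The second call is after the expiry date, so only the new key remains.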

lsf.task

In LSF Version 10.1 Fix Pack 12, this file is deprecated and will be removed in a future version.

lsf.usermapping (New)

In LSF Version 10.1 Fix Pack 11, the lsf.usermapping file defines the user mapping policy for the new bsubmit command. The lsf.usermapping file allows you to map several job execution users and user groups to a single submission user or user group. Create the lsf.usermapping file in the $LSF_ENVDIR directory.

awsprov_templates.json

  • interfaceType: In LSF Version 10.1 Fix Pack 10, this new parameter specifies whether the Elastic Fabric Adapter (EFA) network interface is attached to the instance.
  • launchTemplateId: In LSF Version 10.1 Fix Pack 10, this new parameter specifies the AWS launch template.
  • launchTemplateVersion: In LSF Version 10.1 Fix Pack 10, this new parameter specifies the version of the AWS launch template to select.

azureccprov_templates.json (New)

In LSF Version 10.1 Fix Pack 9, the azureccprov_templates.json file defines the mapping between LSF resource demand requests and Microsoft Azure CycleCloud instances for LSF resource connector.

  • imageName: In LSF Version 10.1 Fix Pack 10, this new parameter specifies that a cluster node uses a private Custom Azure image or a Marketplace image. You can find this ID for custom images in the Azure portal as the Resource ID for the image.
  • interruptible: In LSF Version 10.1 Fix Pack 11, this new parameter enables the use of spot VMs.
  • maxPrice: In LSF Version 10.1 Fix Pack 11, this new parameter defines the maximum allowed price of a spot VM before Azure reclaims the VM.

googleprov_config.json

  • GCLOUD_REGION: In LSF Version 10.1 Fix Pack 12, this new parameter specifies the default region for LSF resource connector to use with the bulk API endpoint. The region that is defined in the googleprov_templates.json file overrides the region that is defined here.

googleprov_templates.json

  • hostProject: In LSF Version 10.1 Fix Pack 10, this new parameter specifies the host project ID to generate the VPC and subnet values instead of the Google Cloud Project ID (that is, the GCLOUD_PROJECT_ID parameter in the googleprov_config.json file). If not specified, the LSF resource connector uses the Google Cloud Project ID to generate the VPC and subnet.
  • launchTemplateId: In LSF Version 10.1 Fix Pack 12, this new parameter specifies the launch template ID. Specify this parameter and the zone parameter to enable launch instance templates.

hostProviders.json

  • preProvPath: In LSF Version 10.1 Fix Pack 2, this new parameter specifies the absolute file path to the pre-provisioning script that the LSF resource connector runs after the instance is created and started successfully but before it is marked allocated to the LSF cluster.
  • postProvPath: In LSF Version 10.1 Fix Pack 2, this new parameter specifies the absolute file path to the post-provisioning script that the LSF resource connector runs after the instance is terminated successfully but before it is removed from the LSF cluster.
  • provTimeOut: In LSF Version 10.1 Fix Pack 2, this new parameter specifies the maximum amount of time, in minutes, for a pre- or post-provisioning script to run before it ends. Use this parameter to prevent the pre- or post-provisioning program from running for an unlimited time. The default value is 10 minutes. If set to 0, pre- and post-provisioning is disabled for the LSF resource connector.
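The constraints on these three parameters can be checked before you edit the hostProviders.json file. The following Python sketch is illustrative and is not shipped with LSF. The function name check_provisioning is hypothetical, the entry is assumed to be one provider object that has already been loaded from JSON, and provTimeOut is assumed to be stored as an integer number of minutes.

```python
import posixpath
from typing import Any, Dict, List

DEFAULT_PROV_TIMEOUT_MIN = 10  # default value stated for provTimeOut


def check_provisioning(entry: Dict[str, Any]) -> List[str]:
    """Return human-readable notes about one provider entry (hypothetical helper)."""
    notes: List[str] = []

    # Both script parameters must be absolute file paths when they are present.
    for key in ("preProvPath", "postProvPath"):
        path = entry.get(key)
        if path is not None and not posixpath.isabs(str(path)):
            notes.append(f"{key} must be an absolute file path: {path!r}")

    # Assumption: provTimeOut is an integer number of minutes.
    timeout = entry.get("provTimeOut", DEFAULT_PROV_TIMEOUT_MIN)
    if isinstance(timeout, bool) or not isinstance(timeout, int) or timeout < 0:
        notes.append(f"provTimeOut must be a non-negative integer: {timeout!r}")
    elif timeout == 0:
        notes.append("provTimeOut is 0: pre- and post-provisioning is disabled")
    return notes


example = {
    "preProvPath": "/opt/lsf/scripts/pre_prov.sh",   # hypothetical path
    "postProvPath": "scripts/post_prov.sh",          # relative, so it is flagged
    "provTimeOut": 0,
}
for note in check_provisioning(example):
    print(note)
```

An empty list means that the entry satisfies the constraints described above. The check does not confirm that the scripts exist or are executable.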

ibmcloudgen2_config.json (New)

In LSF Version 10.1 Fix Pack 11, the ibmcloudgen2_config.json file manages remote administrative functions that the resource connector must perform on IBM Cloud Virtual Servers for Virtual Private Cloud Gen 2 (Cloud VPC Gen 2).

ibmcloudgen2_templates.json (New)

In LSF Version 10.1 Fix Pack 11, the ibmcloudgen2_templates.json file defines the mapping between LSF resource demand requests and Cloud VPC Gen 2 instances for LSF resource connector.

policy_config.json (New)

In LSF Version 10.1 Fix Pack 2, the policy_config.json file configures custom policies for resource providers for the LSF resource connector. The resource policy plug-in reads this file.

The default location for the file is <LSF_TOP>/conf/resource_connector/policy_config.json.

The policy_config.json file contains a JSON list of named policies. Each policy contains a name, a consumer, a maximum number of instances that can be launched for the consumer, and a maximum number of instances that can be launched in a specified period.
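The effect of the two limits in a policy can be sketched in Python as follows. This is an interpretation for illustration, not the resource policy plug-in's logic. The class and field names are hypothetical and do not reflect the actual JSON keys in policy_config.json, and the sketch assumes that the overall maximum counts instances that are already running for the consumer.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    name: str
    consumer: str
    max_instances: int    # overall cap on instances for the consumer
    max_per_period: int   # cap on instances launched within one period


def allowed_launches(policy: Policy, running: int,
                     launched_this_period: int, requested: int) -> int:
    """Return how many of `requested` instances both limits would permit."""
    room_total = max(policy.max_instances - running, 0)
    room_period = max(policy.max_per_period - launched_this_period, 0)
    # The tighter of the two limits decides the result.
    return max(0, min(requested, room_total, room_period))


policy = Policy(name="example_policy", consumer="all",
                max_instances=100, max_per_period=5)
print(allowed_launches(policy, running=98, launched_this_period=0, requested=10))
```

In this example, the overall limit leaves room for two instances and the per-period limit leaves room for five, so only two of the ten requested instances would be allowed.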

Environment variables

  • LSB_BMGROUP_ALLREMOTE_EXPAND: In the IBM Spectrum LSF multicluster capability resource leasing model, the bmgroup command now displays a list of leased-in hosts in the HOSTS column in the form host_name@cluster_name by default.

    If LSB_BMGROUP_ALLREMOTE_EXPAND=N is configured, leased-in hosts are represented by a single keyword allremote instead of being displayed as a list.

  • The epsub environment variable LSB_SUB_JOB_ID indicates the ID of a submitted job that is assigned by LSF, as shown by bjobs. A value of -1 indicates that mbatchd rejected the job submission.
  • The epsub environment variable LSB_SUB_JOB_QUEUE indicates the name of the final queue from which the submitted job is dispatched, which includes any queue modifications that are made by esub.
  • The epsub environment variable LSB_SUB_JOB_ERR indicates the error number of a submitted job if the job submission failed, and is used by the epsub to determine the reason for job submission failure. If the job was submitted or modified successfully, the value of this environment variable is LSB_NO_ERROR (or 0).
  • LSB_BHOSTS_FORMAT: In LSF Version 10.1 Fix Pack 2, this new environment variable customizes specific fields that the bhosts command displays.
  • LSB_BQUEUES_FORMAT: In LSF Version 10.1 Fix Pack 2, this new environment variable customizes specific fields that the bqueues command displays.
  • LSB_HMS_TIME_FORMAT: In LSF Version 10.1 Fix Pack 2, this new environment variable displays times from the customized bjobs -o command output in hh:mm:ss format. This environment variable setting only applies to bjobs -o or bjobs -o -json command output.
  • NOCHECKVIEW_POSTEXEC: In LSF Version 10.1 Fix Pack 3, this environment variable for the LSF Integration for Rational ClearCase is obsolete because the daemon wrappers no longer run the checkView function to check the ClearCase view.
  • LSB_DATA_PROVENANCE: In LSF Version 10.1 Fix Pack 4, this new environment variable enables data provenance tools for tracing data output files for jobs.
  • LSB_DEFAULTPROJECT: In LSF Version 10.1 Fix Pack 6, the project name can now be up to 511 characters long (previously, this limit was 59 characters).
  • LSB_PROJECT_NAME: In LSF Version 10.1 Fix Pack 6, the project name can now be up to 511 characters long (previously, this limit was 59 characters).
  • LSB_DOCKER_IMAGE_AFFINITY: In LSF Version 10.1 Fix Pack 9, this new environment variable enables LSF to give preference for execution hosts that already have the requested Docker image when submitting or scheduling Docker jobs.
  • LSB_BUSERS_FORMAT: In LSF Version 10.1 Fix Pack 9, this new environment variable customizes specific fields that the busers command displays.
  • LSF_AC_JOB_NOTIFICATION: In LSF Version 10.1 Fix Pack 9, this new environment variable requests that the user be notified when the job reaches any of the specified states.
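An epsub program can combine the LSB_SUB_JOB_ID, LSB_SUB_JOB_QUEUE, and LSB_SUB_JOB_ERR environment variables to report the outcome of a submission. The following Python sketch shows one possible way to interpret them. The function name describe_submission and the message wording are hypothetical, and the sketch assumes that all three values arrive as strings.

```python
import os
from typing import Mapping


def describe_submission(env: Mapping[str, str]) -> str:
    """Summarize a submission from the epsub environment (illustrative only)."""
    job_id = env.get("LSB_SUB_JOB_ID", "")
    queue = env.get("LSB_SUB_JOB_QUEUE", "")
    err = env.get("LSB_SUB_JOB_ERR", "0")

    # A job ID of -1 means that mbatchd rejected the submission.
    if job_id == "-1":
        return f"submission rejected by mbatchd (error {err})"
    # LSB_NO_ERROR (or 0) means that the submission or modification succeeded.
    if err not in ("0", "LSB_NO_ERROR"):
        return f"submission failed (error {err})"
    return f"job {job_id} accepted in queue {queue}"


if __name__ == "__main__":
    # When run as an epsub, the values come from the process environment.
    print(describe_submission(os.environ))
```

Passing the environment as a mapping keeps the function easy to exercise outside of LSF, for example with a plain dictionary.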