Enabling jobs to use GPU resources
LSF jobs can specify GPU resource requirements in one statement.
You can specify all GPU requirements for your job together with the bsub -gpu option, or configure them in a queue, in an application profile, or as a default GPU requirement. The resource requirements of your job submission cannot use the legacy GPU resources (ngpus_shared, ngpus_excl_t, ngpus_excl_p) as job resource requirements.
In addition, if the PREEMPTABLE_RESOURCES parameter in the lsb.params file includes the ngpus_physical resource, GPU preemption is enabled with only one restriction: higher priority GPU jobs cannot preempt GPU jobs that specify mode=shared in their GPU resource requirements if multiple jobs are running on the GPU. As of Fix Pack 14, this restriction is removed: higher priority GPU jobs with j_exclusive=yes or mode=exclusive_process settings can preempt shared-mode GPU jobs even when multiple jobs are running on the GPU. Configure the MIG, RERUNNABLE, or REQUEUE parameters properly so that GPU resources are released after the job is preempted.
If any option of the GPU requirement is not defined, the default value for that option is used. The default values are "num=1:mode=shared:mps=no:j_exclusive=no". Use the LSB_GPU_REQ parameter in the lsf.conf file to specify a different default GPU resource requirement.
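The way undefined options fall back to their defaults can be illustrated with a small Python sketch. This is a model of the behavior described above, not LSF code: the parse_gpu_req and with_defaults function names are hypothetical, and the sketch assumes a requirement string made only of colon-separated key=value options.

```python
# Default values quoted from the text above; in a real cluster,
# LSB_GPU_REQ in lsf.conf can replace them.
DEFAULT_GPU_REQ = "num=1:mode=shared:mps=no:j_exclusive=no"


def parse_gpu_req(spec):
    """Split a 'key=value:key=value' string into a dict (hypothetical helper)."""
    options = {}
    for item in spec.split(":"):
        if not item:
            continue  # tolerate empty segments such as a trailing colon
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        options[key.strip()] = value.strip()
    return options


def with_defaults(spec, default_spec=DEFAULT_GPU_REQ):
    """Return the effective options: defaults first, then the given options."""
    effective = parse_gpu_req(default_spec)
    effective.update(parse_gpu_req(spec))
    return effective


# Only num and mode are given; mps and j_exclusive come from the defaults.
effective = with_defaults("num=2:mode=exclusive_process")
```

In this sketch, a requirement that sets only num and mode still ends up with values for mps and j_exclusive, which mirrors how each undefined option takes its default value.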
You can also specify GPU resource requirements with the GPU_REQ parameter in a queue (lsb.queues file) or application profile (lsb.applications file).
If a GPU requirement is specified at the cluster level (lsf.conf file), queue level, or application profile level, and also at the job level, each option (num, mode, mps, and j_exclusive) of the GPU requirement is merged separately. Job level overrides application level, which overrides queue level, which overrides cluster level configuration. For example, if the mode option of the GPU requirement is defined in the -gpu option, and the mps option is defined in the queue, the mode value from the job level and the mps value from the queue are used.
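The per-option merge can be sketched in Python as follows. This is an illustration of the precedence order described above, not the LSF implementation: the merge_gpu_req function and the dictionary layout are assumptions made for the example, and the option values are sample data.

```python
# Levels listed from lowest to highest precedence, as described in the text.
PRECEDENCE = ("cluster", "queue", "application", "job")


def merge_gpu_req(levels):
    """Merge per-level option dicts; a higher level wins for each option."""
    merged = {}
    for level in PRECEDENCE:
        # A level that defines nothing contributes no options.
        merged.update(levels.get(level, {}))
    return merged


# Sample data: the queue sets mps, the job (bsub -gpu) sets mode,
# and the cluster level supplies a value for every option.
levels = {
    "cluster": {"num": "1", "mode": "shared", "mps": "no", "j_exclusive": "no"},
    "queue": {"mps": "yes"},
    "job": {"mode": "exclusive_process"},
}
effective = merge_gpu_req(levels)
```

Here the effective requirement takes mode from the job level and mps from the queue, while num and j_exclusive fall through to the cluster level values. Each option is resolved independently, so a level that sets only one option does not reset the others.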