Example GPU job submissions
The following example job submissions demonstrate ways to submit jobs that use GPU resources.
- The following job requests the default GPU resource requirement num=1:mode=shared:mps=no:j_exclusive=no. The job requests one GPU in DEFAULT mode, without starting MPS, and the GPU can be used by other jobs since j_exclusive is set to no.
bsub -gpu - ./app
- The following job requires two EXCLUSIVE_PROCESS mode GPUs and starts MPS
before the job runs:
bsub -gpu "num=2:mode=exclusive_process:mps=yes" ./app
- The following job requires two EXCLUSIVE_PROCESS
mode GPUs, starts MPS before the job runs, and allows multiple jobs in the host to share the same MPS daemon if those jobs are submitted by the same user with the same GPU requirements:
bsub -gpu "num=2:mode=exclusive_process:mps=yes,share" ./app
- The following job requires two EXCLUSIVE_PROCESS
mode GPUs and starts multiple MPS daemons (one MPS daemon per socket):
bsub -gpu "num=2:mode=exclusive_process:mps=per_socket" ./app
- The following job requires two EXCLUSIVE_PROCESS
mode GPUs and starts multiple MPS daemons (one MPS daemon per socket), and allows multiple jobs in the socket to share the same MPS daemon if those jobs are submitted by the same user with the same GPU requirements:
bsub -gpu "num=2:mode=exclusive_process:mps=per_socket,share" ./app
- The following job requires two EXCLUSIVE_PROCESS
mode GPUs and starts multiple MPS daemons (one MPS daemon per GPU):
bsub -gpu "num=2:mode=exclusive_process:mps=per_gpu" ./app
- The following job requires two EXCLUSIVE_PROCESS
mode GPUs and starts multiple MPS daemons (one MPS daemon per GPU), and allows multiple jobs in the GPU to share the same MPS daemon if those jobs are submitted by the same user with the same GPU requirements:
bsub -gpu "num=2:mode=exclusive_process:mps=per_gpu,share" ./app
- The following job requires two DEFAULT mode GPUs and uses them
exclusively. The two GPUs cannot be used by other jobs even though the mode is shared:
bsub -gpu "num=2:mode=shared:j_exclusive=yes" ./app
- The following job uses three DEFAULT mode GPUs and shares them with other
jobs:
bsub -gpu "num=3:mode=shared:j_exclusive=no" ./app
- The following job requests two AMD
GPUs:
bsub -gpu "num=2:gvendor=amd" ./app
- The following job requests two Vega GPUs with xGMI
connections:
bsub -gpu "num=2:gmodel=Vega:glink=yes" ./app
- The following job requests two NVIDIA
GPUs:
bsub -gpu "num=2:gvendor=nvidia" ./app
- The following job requests two Tesla C2050 or C2070
GPUs:
bsub -gpu "num=2:gmodel=C2050_C2070"
- The following job requests two Tesla GPUs of any model with a total memory size of 12 GB on each
GPU:
bsub -gpu "num=2:gmodel=Tesla-12G"
- The following job requests two Tesla GPUs of any model with a total memory
size of 12 GB on each GPU, but with relaxed GPU affinity enforcement:
bsub -gpu "num=2:gmodel=Tesla-12G:aff=no"
- The following job requests two Tesla GPUs of any model with a total memory size of 12 GB on each
GPU and reserves 8 GB of GPU memory on each GPU:
bsub -gpu "num=2:gmodel=Tesla-12G:gmem=8G"
- The following job requests four Tesla K80 GPUs per host and 2 GPUs on each
socket:
bsub -gpu "num=4:gmodel=K80:gtile=2"
- The following job requests four Tesla K80 GPUs per host and the GPUs are spread evenly on each
socket:
bsub -gpu "num=4:gmodel=K80:gtile='!'"
- The following job requests four Tesla P100 GPUs per host with NVLink connections and the GPUs
are spread evenly on each socket:
bsub -gpu "num=4:gmodel=TeslaP100:gtile='!':glink=yes"
- The following job uses two NVIDIA MIG devices with a GPU instance size of
3 and a compute instance size of 2.
bsub -gpu "num=2:mig=3/2" ./app
- The following job uses four EXCLUSIVE_PROCESS GPUs that cannot be used by
other jobs. The j_exclusive option defaults to yes for this job.
bsub -gpu "num=4:mode=exclusive_process" ./app
- The following job requires two tasks. Each task requires two
EXCLUSIVE_PROCESS GPUs on two hosts. The GPUs are allocated in the same NUMA node as the allocated CPU.
bsub -gpu "num=2:mode=exclusive_process" -n2 -R "span[ptile=1] affinity[core(1)]" ./app
- The following job ignores the simple GPU resource requirements that are specified in the
-gpu option because the -R option specifies the ngpus_physical GPU resource:
bsub -gpu "num=2:mode=exclusive_process" -n2 -R "span[ptile=1] rusage[ngpus_physical=2:gmodel=TeslaP100:glink=yes]" ./app
Because you can only request EXCLUSIVE_PROCESS GPUs with the -gpu option, move the rusage[] string contents to the -gpu option arguments. The following corrected job submission requires two tasks, and each task requires two EXCLUSIVE_PROCESS Tesla P100 GPUs with NVLink connections on two hosts:
bsub -gpu "num=2:mode=exclusive_process:gmodel=TeslaP100:glink=yes" -n2 -R "span[ptile=1]" ./app
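
In the examples above, the -gpu argument is a colon-separated list of keyword=value pairs. The following is a minimal sketch, not part of LSF, of a shell helper that assembles such a string and prints the resulting bsub command instead of submitting it. The function name build_gpu_req is hypothetical, and the helper only joins text: it does not check that the keywords or values are valid for your LSF version.

```shell
#!/bin/sh
# Hypothetical helper (not an LSF command): joins its arguments with ":"
# to form a -gpu option string such as "num=2:mode=exclusive_process".
# It performs no validation of the keywords or their values.
build_gpu_req() {
    req=""
    for pair in "$@"; do
        if [ -z "$req" ]; then
            req="$pair"
        else
            req="$req:$pair"
        fi
    done
    printf '%s\n' "$req"
}

# Dry run: print the bsub command line rather than submitting a job.
gpu_req=$(build_gpu_req num=2 mode=exclusive_process mps=yes)
printf 'bsub -gpu "%s" ./app\n' "$gpu_req"
```

Printing the command first makes it easy to confirm the quoting before you submit the job. To submit, run bsub directly with the assembled string, for example bsub -gpu "$gpu_req" ./app.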