-gpu

bjobs -l -gpu shows the following information on GPU job allocation:

Categories

filter

Synopsis

bjobs -l | -UF [-gpu]

Conflicting options

Use only with the -l or -UF option.

Description

HOST
The name of the host.
TASK
List of job task IDs using the GPU (comma-separated if the GPU is used by multiple tasks).
GPU_ID
The GPU IDs on the host. Each GPU is shown as a separate line.
GI_PLACEMENT/SIZE
(Available as of Fix Pack 14) The location and size of the GPU instance within the GPU device, separated by a forward slash (/). For more information about profile placements, see https://docs.nvidia.com/datacenter/tesla/mig-user-guide/.
CI_PLACEMENT/SIZE
(Available as of Fix Pack 14) The location and size of the compute instance within its parent GPU instance, separated by a forward slash (/). For more information about profile placements, see https://docs.nvidia.com/datacenter/tesla/mig-user-guide/.
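Since both placement fields share the "location/size" layout described above, a small sketch can show how such a field might be split. The sample value "0/4" is an assumption for illustration, not captured bjobs output:

```python
# Hypothetical sketch: splitting a GI/CI placement field of the form
# "<location>/<size>" into its two parts. The sample value is assumed.

def parse_placement(field: str) -> tuple[int, int]:
    """Return (location, size) from a placement field such as "0/4"."""
    location, size = field.split("/")
    return int(location), int(size)

print(parse_placement("0/4"))
```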
MODEL
The GPU brand name and model type name.
MTOTAL
The total GPU memory size.
FACTOR
The GPU compute capability factor.
MRSV
GPU memory reserved by the job.
SOCKET
Socket ID of the GPU.
NVLINK/XGMI
Indicates whether the GPU has NVLink or XGMI connections with the other GPUs allocated to the job (ranked by GPU ID and including itself). The connection flag for each GPU is a single character, and the flags are separated by forward slashes (/):
A Y indicates a direct connection between the two GPUs.
An N indicates no direct connection with that GPU.
A hyphen (-) indicates the GPU itself.
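As a hedged illustration of the flag format above: for a job allocated three GPUs, GPU 0 might show a flag string of "-/Y/N", meaning a direct link to GPU 1 but not to GPU 2. The sample string is an assumption for illustration, not captured bjobs output. A minimal sketch of interpreting such a string:

```python
# Hypothetical sketch: interpreting one GPU's NVLINK/XGMI flag string.
# Each position corresponds to a GPU in the allocation, ranked by GPU ID:
# "Y" = direct connection, "N" = no direct connection, "-" = this GPU.

def direct_peers(flags: str) -> list[int]:
    """Return the positions (by rank) of GPUs directly connected to this one."""
    return [i for i, flag in enumerate(flags.split("/")) if flag == "Y"]

print(direct_peers("-/Y/N"))
```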

If the job exited abnormally due to a GPU-related error or warning, the error or warning message displays. If LSF could not get GPU usage information from DCGM, a hyphen (-) displays.