-gpu

bjobs -l -gpu shows the following information on GPU job allocation:

Categories

filter

Synopsis

bjobs -l | -UF [-gpu]

Conflicting options

Use only with the -l or -UF option.

Description

HOST
The name of the host.
TASK
List of job task IDs using the GPU (comma-separated if the GPU is used by multiple tasks).
GPU_ID
The GPU IDs on the host. Each GPU is shown as a separate line.
GI_PLACEMENT/SIZE
(Available as of Fix Pack 14) The location and size of the GPU instance within the GPU device, separated by a forward slash (/). For more information about profile placements, see https://docs.nvidia.com/datacenter/tesla/mig-user-guide/.
CI_PLACEMENT/SIZE
(Available as of Fix Pack 14) The location and size of the compute instance within its parent GPU instance, separated by a forward slash (/). For more information about profile placements, see https://docs.nvidia.com/datacenter/tesla/mig-user-guide/.
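Since both placement fields share the "location/size" layout described above, a small sketch can show how such a field might be split. The sample value "0/4" is an assumption for illustration, not captured bjobs output:

```python
# Hypothetical sketch: splitting a GI/CI placement field of the form
# "<location>/<size>" into its two parts. The sample value is assumed.

def parse_placement(field: str) -> tuple[int, int]:
    """Return (location, size) from a placement field such as "0/4"."""
    location, size = field.split("/")
    return int(location), int(size)

print(parse_placement("0/4"))
```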
MODEL
The GPU brand name and model type name.
MTOTAL
The total GPU memory size.
FACTOR
The GPU compute capability factor.
MRSV
GPU memory reserved by the job.
SOCKET
Socket ID of the GPU.
NVLINK/XGMI
Indicates whether the GPU has NVLink or XGMI connections with the other GPUs allocated to the job (ranked by GPU ID and including itself). The connection flag for each GPU is a single character, and the flags are separated by forward slashes (/):
A Y indicates a direct connection between the two GPUs.
An N indicates no direct connection with that GPU.
A hyphen (-) indicates the GPU itself.
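As a hedged illustration of the flag format above: for a job allocated three GPUs, GPU 0 might show a flag string of "-/Y/N", meaning a direct link to GPU 1 but not to GPU 2. The sample string is an assumption for illustration, not captured bjobs output. A minimal sketch of interpreting such a string:

```python
# Hypothetical sketch: interpreting one GPU's NVLINK/XGMI flag string.
# Each position corresponds to a GPU in the allocation, ranked by GPU ID:
# "Y" = direct connection, "N" = no direct connection, "-" = this GPU.

def direct_peers(flags: str) -> list[int]:
    """Return the positions (by rank) of GPUs directly connected to this one."""
    return [i for i, flag in enumerate(flags.split("/")) if flag == "Y"]

print(direct_peers("-/Y/N"))
```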

If the job exited abnormally due to a GPU-related error or warning, the error or warning message displays. If LSF could not get GPU usage information from DCGM, a hyphen (-) displays.