Configuring GPU access with NVIDIA vGPU
NVIDIA vGPU technology enables a single physical GPU to be divided into multiple virtual instances, allowing multiple virtual machines to share GPU resources. This improves GPU utilization, reduces infrastructure costs, and maintains performance isolation for AI, data analytics, and graphics workloads.
- To enable sandbox workloads for vGPU, you must configure the following additional parameters in the ClusterPolicy for the GPU Operator:

    sandboxWorkloads.enabled=true
    sandboxDevicePlugin.enabled=true
    sandboxWorkloads.defaultWorkload=vm-vgpu
    vfioManager.enabled=false

- After completing the step for creating a ClusterPolicy for the GPU Operator in the NVIDIA documentation, you must label the GPU node with the specific vGPU device profile:

    oc label node <gpu-node-name> \
      nvidia.com/vgpu.config=<vgpu-device-name> \
      --overwrite

  Replace <gpu-node-name> with the name of your GPU-enabled node, and <vgpu-device-name> with the vGPU profile that you want to enable. For example, on NVIDIA RTX Pro 6000 Blackwell GPUs, a valid vGPU profile is DC-48Q. You can find the appropriate vgpu-device-name in the default-vgpu-devices-config ConfigMap, or in a custom ConfigMap if you created one.

  To identify the correct profile, search the ConfigMap for the device_id that matches your GPU, using the last four characters of the device ID as a reference.
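
The steps above can be sketched as a small set of shell helpers. This is a minimal illustration, not the documented procedure. The ClusterPolicy name (gpu-cluster-policy), the namespace (nvidia-gpu-operator), and all function names are assumptions introduced here, so adjust them to match your cluster. The device ID in the usage comment is a made-up placeholder.

```shell
#!/bin/sh
# Sketch only: the resource name and namespace below are assumptions.
CLUSTER_POLICY="${CLUSTER_POLICY:-gpu-cluster-policy}"   # assumed ClusterPolicy name
GPU_NAMESPACE="${GPU_NAMESPACE:-nvidia-gpu-operator}"    # assumed GPU Operator namespace

# Print a JSON merge patch that sets the four parameters listed above.
cluster_policy_patch() {
  printf '%s' '{"spec":{"sandboxWorkloads":{"enabled":true,"defaultWorkload":"vm-vgpu"},"sandboxDevicePlugin":{"enabled":true},"vfioManager":{"enabled":false}}}'
}

# Apply the patch to the ClusterPolicy (hypothetical helper, needs cluster access).
apply_cluster_policy_patch() {
  oc patch clusterpolicy "$CLUSTER_POLICY" --type merge -p "$(cluster_policy_patch)"
}

# Print the last four characters of a device ID, lowercased.
last_four() {
  printf '%s' "$1" | tail -c 4 | tr 'A-F' 'a-f'
}

# Show the ConfigMap lines around any entry that mentions the device ID suffix.
find_vgpu_profiles() {
  oc get configmap default-vgpu-devices-config -n "$GPU_NAMESPACE" -o yaml \
    | grep -i -C 5 "$(last_four "$1")"
}

# Example usage (10de:ABCD is a placeholder vendor:device ID, not a real GPU):
#   apply_cluster_policy_patch
#   find_vgpu_profiles 10de:ABCD
```

The patch body is generated by a separate function so that you can inspect it before applying it. On a node, a command such as lspci -nn typically reports the vendor and device ID pair, from which the last four characters can be taken.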