For Red Hat® OpenShift® clusters that are provisioned on
hypervisors, such as vSphere, Citrix, and KVM, you can use GPU passthrough to assign GPU
resources directly to specific virtual machines (VMs).
About this task
For Red Hat OpenShift clusters that are provisioned on hypervisors,
such as vSphere, Citrix, and KVM, two methods can be applied to expose the underlying GPU hardware
to the Red Hat OpenShift cluster. The first is to install a GPU virtualization driver on the
hypervisor so that the physical GPUs are virtualized and their resources are shared among the VMs
through the hypervisor. The second method is to enable GPU passthrough and assign GPU resources
directly to specific VMs, thus bypassing the hypervisor. The second method, as it applies to the
vSphere hypervisor, is described in this task.
To enable GPU passthrough and assign GPU resources, see the NVIDIA documentation
that corresponds to your specific VM.
Procedure
- Enable GPU passthrough in the vSphere hypervisor.
- From the Navigator on the vSphere browser-based console, click
.
- On the Hardware tab, click PCI Devices and search for the GPU devices.
- Select the checkbox next to each GPU device that you want to enable for passthrough, and then click Toggle passthrough.
Note:
Now that GPU passthrough is enabled on the GPU devices, the next steps assign the physical GPU
devices to VMs that become GPU-enabled compute nodes in the OpenShift Container Platform cluster.
- Add a compute node VM or modify an existing one, and assign the GPU to the compute node.
- Edit the VM settings and expand Memory. Ensure that the Reserve all guest memory checkbox is selected.
- From the VM settings, select .
- From the New PCI device setting, select the GPU devices from the
drop-down menu to assign to the VM. Click Save.
- Click VM Options on the Edit settings page.
- Expand Boot Options and ensure that EFI is
selected from the Firmware drop-down menu.
Note: Booting from EFI is a requirement. If booting from BIOS is configured for the VM, the GPU
fails to pass through.
- From VM Options, click Advanced. Click
Edit Configuration.
- Add the following configuration parameters to the compute node VM, and then click Save:
  - pciPassthru.64bitMMIOSizeGB set to 64
  - pciPassthru.use64bitMMIO set to TRUE
For example:
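As an illustration only, and assuming the advanced parameters end up in the VM's .vmx configuration file in the usual key = "value" form, the two entries might look like the following sketch. The values are taken from the step above; the file layout is an assumption and is not shown in the original procedure.

```
pciPassthru.use64bitMMIO = "TRUE"
pciPassthru.64bitMMIOSizeGB = "64"
```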
- The VM is now assigned the specified GPU resources. Power on the VM and run the following command to confirm that the GPU device is visible to the guest OS:
[core@worker4 ~]$ lspci | grep NVIDIA
13:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 PCIe 16GB] (rev a1)
[core@worker4 ~]$
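The check above can also be scripted. The following is a minimal sketch, not part of the original procedure: the function name count_nvidia_devices is hypothetical, and it assumes that lspci is available on the compute node and that GPU entries contain the string NVIDIA, as in the sample output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original procedure):
# reads lspci-style output on stdin and prints the number of lines
# that mention NVIDIA.
count_nvidia_devices() {
    # grep -c prints the match count; it exits non-zero when the count is 0,
    # so "|| true" keeps the function's exit status at 0 in that case.
    grep -c 'NVIDIA' || true
}

# Example use on the compute node (requires lspci to be installed):
#   if [ "$(lspci | count_nvidia_devices)" -gt 0 ]; then
#       echo "GPU is visible to the guest OS"
#   else
#       echo "No NVIDIA device found; recheck the passthrough settings" >&2
#   fi
```

A count of zero suggests that the passthrough, memory reservation, or EFI firmware settings described earlier should be rechecked before continuing.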
What to do next
Next, install the NVIDIA operator to automate the management of all NVIDIA software
components. For more information about installing the NVIDIA operator, see Installing the NVIDIA operator.