Configuring NVIDIA Multi-Instance GPU (MIG)

NVIDIA Multi-Instance GPU (MIG) enables you to partition your GPUs. MIG is useful when a service does not require all of the resources that are available on the GPU. When you configure MIG, each partition is treated as an independent, CUDA-enabled GPU.

Who needs to complete this task?

Cluster administrator: A cluster administrator must complete this task.

When do you need to complete this task?

One-time setup: Complete this task if you want to partition your GPUs.

About this task

NVIDIA Multi-Instance GPU provides single (homogeneous) and mixed (heterogeneous) advertisement strategies. However, IBM Software Hub supports only the single advertisement strategy, in which all of the GPUs on a node are configured with the same amount of compute and memory.

Review MIG support in OpenShift® Container Platform in the NVIDIA documentation to choose the profile or profiles that you want to use.

You have the following options when configuring MIG:

You can use the same profile on all of the GPU worker nodes on your cluster.
If you use the same profile on all of the worker nodes, you do not need to create custom runtime definitions.
You can use different profiles on each worker node in your cluster.
If you use different profiles, you must create custom runtime definitions.

Procedure

To configure MIG:

Confirm which MIG profiles are available for your GPU.
The supported profiles are listed in the mig-parted-config ConfigMap in the GPU operator project.
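One way to list the profile names is to read the ConfigMap and print the keys under `mig-configs`. The following is a minimal sketch, not a definitive procedure. It assumes that the GPU operator project is named `nvidia-gpu-operator`, that the ConfigMap is named `default-mig-parted-config`, and that the profiles are stored under the `config.yaml` key. These names are assumptions, and the `list_mig_profiles` helper is hypothetical. Adjust the values to match your cluster.

```shell
# Assumption: project and ConfigMap names. Override them if your cluster differs.
GPU_OPERATOR_PROJECT="${GPU_OPERATOR_PROJECT:-nvidia-gpu-operator}"
MIG_CONFIGMAP="${MIG_CONFIGMAP:-default-mig-parted-config}"

# Hypothetical helper: print the profile names defined under "mig-configs:".
# Assumption: the profiles are stored in the ConfigMap under the config.yaml key.
list_mig_profiles() {
  oc get configmap "${MIG_CONFIGMAP}" -n "${GPU_OPERATOR_PROJECT}" \
    -o jsonpath='{.data.config\.yaml}' |
    awk '
      /^mig-configs:/ { in_configs = 1; next }
      # Profile names are the keys indented by exactly two spaces.
      in_configs && /^  [^ #-][^ ]*:/ {
        sub(/^  /, ""); sub(/:.*$/, ""); print
      }'
}
```

After you log in to the cluster, run `list_mig_profiles` to print one profile name per line.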
Set the MIG_LABEL environment variable to the name of the profile that you want to use:
```
export MIG_LABEL=<label>
```
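Before you label any nodes, it can help to confirm that the value of `MIG_LABEL` matches a profile name in the ConfigMap. The following sketch assumes the same project name (`nvidia-gpu-operator`), ConfigMap name (`default-mig-parted-config`), and `config.yaml` key as a default GPU operator installation. These names are assumptions, and the `mig_label_exists` helper is hypothetical.

```shell
# Assumption: project and ConfigMap names. Override them if your cluster differs.
GPU_OPERATOR_PROJECT="${GPU_OPERATOR_PROJECT:-nvidia-gpu-operator}"
MIG_CONFIGMAP="${MIG_CONFIGMAP:-default-mig-parted-config}"

# Hypothetical helper: succeed only if the given profile name is a key
# in the ConfigMap (a line that consists of two spaces, the name, and a colon).
mig_label_exists() {
  label="$1"
  oc get configmap "${MIG_CONFIGMAP}" -n "${GPU_OPERATOR_PROJECT}" \
    -o jsonpath='{.data.config\.yaml}' |
    grep -Fxq -- "  ${label}:"
}
```

For example, `mig_label_exists "${MIG_LABEL}" || echo "Profile not found: ${MIG_LABEL}" >&2` prints a warning if the profile is not defined.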
Label each GPU node where you want to enable MIG.
Replace <node name> with the name of the node that you want to label.
```
oc label nodes <node name> nvidia.com/mig.config=${MIG_LABEL} --overwrite=true
```
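If you need to label several nodes with the same profile, you can wrap the command in a loop and then check the state that the MIG manager reports for each node. The following is a minimal sketch. The `label_mig_nodes` and `mig_state` helpers are hypothetical, and the sketch assumes that the GPU operator records progress in the `nvidia.com/mig.config.state` node label.

```shell
# Hypothetical helper: apply the selected MIG profile to every node that is
# passed as an argument. Requires MIG_LABEL to be set.
label_mig_nodes() {
  : "${MIG_LABEL:?Set MIG_LABEL to a profile name first}"
  for node in "$@"; do
    oc label nodes "${node}" "nvidia.com/mig.config=${MIG_LABEL}" --overwrite=true
  done
}

# Hypothetical helper: print the MIG configuration state label of a node.
# Assumption: the state is reported in the nvidia.com/mig.config.state label.
mig_state() {
  oc get node "$1" -o jsonpath='{.metadata.labels.nvidia\.com/mig\.config\.state}'
}
```

For example, `label_mig_nodes <node name 1> <node name 2>` labels both nodes, and `mig_state <node name 1>` prints the reported state for the first node.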

What to do next

Now that you've configured MIG, you're ready to complete Installing Red Hat OpenShift AI.