Configuring MIG support in Red Hat OpenShift
Beginning with Cloud Pak for Data version 4.8.4, Watson Machine Learning supports GPU inferencing for single model deployments. This feature is not available in Cloud Pak for Data versions 4.8.3 and earlier.
Red Hat OpenShift Container Platform provides the platform on which GPU resources are configured and used. The NVIDIA GPU Operator provides the software components that are needed to provision GPUs, such as the NVIDIA drivers that enable CUDA. To use CUDA software specifications for your deployment, you must configure NVIDIA Multi-Instance GPU (MIG) in a Red Hat OpenShift® cluster. To configure MIG support, see the NVIDIA guide for configuring MIG support.
Configuring MIG profiles within a cluster
To enable different MIG profiles, assign a MIG profile to each node and update the runtime definition. This applies to CUDA-enabled runtime definitions only.
Assigning MIG profile to nodes
To assign a MIG profile to a node, label the node by using the following command:
oc label nodes node1 nvidia.com/mig.config=all-1g.10gb --overwrite=true
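The `oc label` command above is equivalent to patching the node's metadata. The following sketch builds the same label as a patch body; the helper name is an assumption for illustration, and the commented-out lines show how the patch might be applied with the Kubernetes Python client if it is installed.

```python
# Sketch: the label that `oc label` applies can also be set by patching
# the node's metadata. This helper only builds the patch body; the helper
# name (mig_label_patch) is illustrative, not part of any product API.

def mig_label_patch(profile: str) -> dict:
    """Build a strategic-merge patch that sets the MIG profile label."""
    return {"metadata": {"labels": {"nvidia.com/mig.config": profile}}}

patch = mig_label_patch("all-1g.10gb")
print(patch["metadata"]["labels"]["nvidia.com/mig.config"])  # all-1g.10gb

# With the kubernetes client installed and a kubeconfig available:
# from kubernetes import client, config
# config.load_kube_config()
# client.CoreV1Api().patch_node("node1", patch)
```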
Accessing supported MIG profiles
To find the list of MIG profiles that are supported for your GPU, see the mig-parted-config configmap in the GPU Operator namespace.
The standard setup uses a single MIG profile across the entire Cloud Pak for Data cluster and does not require any custom runtime definitions to be configured. To use a standard setup, label all nodes with the same MIG profile.
Following the setup, you can start a GPU runtime and select a single GPU to get a MIG device assigned.
Updating runtime definition for nodes
To update the runtime definition, follow these steps:

1. Download the runtime definition for the GPU runtime (for example, runtime-23.1-py3.10-cuda). For more information, see Downloading the runtime configuration.

2. In the runtime definition, add the nodeAffinity property to specify the MIG profile:

   "nodeAffinity": {
       "requiredDuringSchedulingIgnoredDuringExecution": {
           "nodeSelectorTerms": [
               {
                   "matchExpressions": [
                       {
                           "key": "nvidia.com/mig.config",
                           "operator": "In",
                           "values": ["all-1g.10gb"]
                       }
                   ]
               }
           ]
       }
   }
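This step can be sketched programmatically: load the downloaded runtime definition as JSON and add the nodeAffinity rule. The helper name, the stand-in runtime definition, and the top-level placement of nodeAffinity are illustrative assumptions, not the exact runtime definition schema.

```python
import json

# Sketch: add a MIG nodeAffinity rule to a runtime definition loaded as a
# dict. The stand-in definition below and the top-level placement of the
# nodeAffinity key are assumptions for illustration.

def add_mig_affinity(runtime_def: dict, profile: str) -> dict:
    """Return a copy of the runtime definition with a MIG nodeAffinity rule."""
    updated = dict(runtime_def)
    updated["nodeAffinity"] = {
        "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
                {
                    "matchExpressions": [
                        {
                            "key": "nvidia.com/mig.config",
                            "operator": "In",
                            "values": [profile],
                        }
                    ]
                }
            ]
        }
    }
    return updated

# Minimal stand-in for a downloaded runtime definition:
rd = {"name": "runtime-23.1-py3.10-cuda"}
new_rd = add_mig_affinity(rd, "all-1g.10gb")
print(json.dumps(new_rd, indent=2))
```

The profile value must match the label assigned to the nodes (here, all-1g.10gb), or the scheduler will not find a matching node.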
3. Update the runtime definition by using the service ID credentials:

   a. To get the service ID credentials, find the namespace that contains the secret:

      oc get secret -A | grep wdp-service-id

   b. Get the required service-id-credentials token:

      oc get secret -n <NAMESPACE> wdp-service-id -o jsonpath='{.data.service-id-credentials}' | base64 --decode
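Kubernetes stores secret data base64-encoded, which is why the command above pipes through `base64 --decode`. The same decoding step can be reproduced in Python; the sample value below is a placeholder, not a real credential.

```python
import base64

# Sketch: secret data retrieved from the cluster is base64-encoded and
# must be decoded before use. The sample value is a placeholder only.

encoded = base64.b64encode(b"sample-service-id-credentials").decode("ascii")
token = base64.b64decode(encoded).decode("utf-8")
print(token)  # sample-service-id-credentials
```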
   c. Update the runtime definition with a PUT request to /v2/runtime_definitions/<runtime_id>.

      The following code shows how to update the runtime definition in Python. The <runtime_id> is the ID of the runtime definition that is being updated, and new_rd is the updated JSON.

      import requests

      headers = {
          'Authorization': 'Basic <service-id-credentials>',
          'Content-Type': 'application/json'
      }
      response = requests.put(
          f"{CPD_URL}/v2/runtime_definitions/<runtime_id>",
          json=new_rd,
          headers=headers,
          verify=False
      )
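The PUT call can be wrapped with basic status checking so that a failed update surfaces immediately. A minimal sketch follows; the function name and parameters are illustrative assumptions, and `session` is expected to behave like the `requests` module or a `requests.Session`.

```python
# Sketch: wrap the runtime-definition update with status checking. The
# helper name and parameters are illustrative; `session` should behave
# like the `requests` module or a requests.Session instance.

def update_runtime_definition(session, cpd_url, runtime_id, new_rd, credentials):
    headers = {
        "Authorization": f"Basic {credentials}",
        "Content-Type": "application/json",
    }
    response = session.put(
        f"{cpd_url}/v2/runtime_definitions/{runtime_id}",
        json=new_rd,
        headers=headers,
        verify=False,  # only for clusters with self-signed certificates
    )
    response.raise_for_status()  # surface 4xx/5xx errors immediately
    return response
```

Passing the session in makes the helper easy to test with a stub and lets callers reuse a configured `requests.Session` (for example, with a custom CA bundle instead of `verify=False`).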
After the custom runtime definition is updated, you can create deployments that select the nodes that offer the MIG profile specified in the runtime definition.
Parent topic: Frameworks and software specifications in Watson Machine Learning