Creating deployments with GPU hardware specifications

You can deploy machine learning and deep learning models on GPUs by using CUDA software specifications and GPU hardware specifications.

Limitations:

To deploy models on GPUs, the cluster setup must be homogeneous:
- All GPU nodes on the cluster must have the same GPU type.
- All MIG nodes must use the same MIG configuration or partition size.
You cannot use custom hardware specifications for GPU deployments.
CUDA software specifications include the NVIDIA CUDA drivers. However, no GPU is allocated to a deployment unless a GPUx hardware specification is also specified for that deployment.
Dedicated GPUs and MIG partitions cannot be used at the same time.
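
The cluster limitations above can be checked mechanically before you deploy. The following is a minimal sketch under stated assumptions, not part of any IBM or NVIDIA API: the node records and the keys `gpu_type` and `mig_profile` are hypothetical stand-ins for whatever inventory data you collect from your own cluster.

```python
def check_gpu_cluster(nodes):
    """Return a list of limitation violations; an empty list means none were found.

    Each node is a dict with hypothetical keys (not a real API schema):
      "gpu_type":    GPU model name, for example "A100"
      "mig_profile": MIG partition size, for example "1g.5gb",
                     or None when the node exposes dedicated GPUs
    """
    problems = []

    # All GPU nodes must be the same GPU type.
    gpu_types = {node["gpu_type"] for node in nodes}
    if len(gpu_types) > 1:
        problems.append(f"mixed GPU types: {sorted(gpu_types)}")

    # All MIG nodes must share one MIG configuration or partition size.
    mig_profiles = {
        node["mig_profile"] for node in nodes if node["mig_profile"] is not None
    }
    if len(mig_profiles) > 1:
        problems.append(f"mixed MIG profiles: {sorted(mig_profiles)}")

    # Dedicated GPUs and MIG partitions cannot be used at the same time.
    has_dedicated = any(node["mig_profile"] is None for node in nodes)
    if mig_profiles and has_dedicated:
        problems.append("dedicated GPUs and MIG partitions are mixed")

    return problems


# Example inventory (made-up values for illustration only).
example_nodes = [
    {"gpu_type": "A100", "mig_profile": "1g.5gb"},
    {"gpu_type": "A100", "mig_profile": None},
]
for problem in check_gpu_cluster(example_nodes):
    print(problem)
```

Returning a list of messages rather than raising on the first violation lets you report every problem with the cluster in one pass.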

Important: If you want GPU resources to be allocated to your deployment, you must specify both a CUDA software specification and a GPU hardware specification when you create the deployment.
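
A small pre-flight check can catch a request that names only one of the two specifications. This is a minimal sketch, not an official validation routine. It assumes that CUDA software specification names contain the string "cuda" and that GPU hardware specification names start with "GPUx"; both naming patterns are assumptions made for illustration, and the software specification name used in the example is hypothetical.

```python
def gpu_request_is_complete(software_spec_name, hardware_spec_name):
    """Return True only when both specification names look GPU-capable.

    Assumptions (not confirmed by the product documentation):
      - CUDA software specification names contain "cuda"
      - GPU hardware specification names start with "GPUx"
    """
    uses_cuda = "cuda" in software_spec_name.lower()
    uses_gpu_hardware = hardware_spec_name.upper().startswith("GPUX")
    return uses_cuda and uses_gpu_hardware


# "example-runtime-cuda" is a made-up software specification name.
if not gpu_request_is_complete("example-runtime-cuda", "GPUx2"):
    print("No GPU will be allocated: specify both CUDA and GPU specifications.")
```

Check the names against the lists of CUDA software specifications and GPU hardware specifications for your release before relying on the string patterns.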

For a list of CUDA software specifications, see Software specifications. For a list of GPU hardware specifications, see GPU hardware specifications.

You can also enable MIG support for GPUs when you want to deploy an application that does not require the full power of an entire GPU. If you are configuring MIG for GPU-accelerated workloads, all GPU-enabled nodes should adhere to a single strategy determined in the prior configuration steps. This ensures consistent behavior across all GPU-enabled nodes in the cluster. To configure MIG support, see the NVIDIA guide for configuring MIG support.

The following code example shows how to create a deployment that uses a GPU hardware specification:

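from ibm_watsonx_ai import APIClient

# `credentials` is assumed to be defined earlier with your service credentials.
wx_ai_client = APIClient(credentials)

# Use the same client instance to build the deployment metadata.
meta_props = {
    wx_ai_client.deployments.ConfigurationMetaNames.NAME: "GPU deployment",
    wx_ai_client.deployments.ConfigurationMetaNames.ONLINE: {},
    # The GPU hardware specification is required for a GPU to be allocated.
    wx_ai_client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "GPUx2"},
}

# Replace the placeholder with the ID of the stored asset that you want to deploy.
asset_id = "<asset_id>"
deployment_details = wx_ai_client.deployments.create(asset_id, meta_props)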