Creating custom hardware specifications for deployment on dedicated GPUs

Some foundation model types and hardware configurations might require a custom hardware specification. Learn about the requirements for creating a custom hardware specification.

Your model's resource utilization might also require a custom hardware specification. For guidelines, see Hardware requirements.

Additionally, you cannot use predefined hardware specifications with quantized models. For quantized models and in other non-standard cases, use a custom hardware specification.

Supported hardware specifications

For a list of standard hardware specifications that you can use to deploy your custom foundation models on dedicated GPUs, see Predefined hardware specifications.

Note:

Custom hardware specifications are not checked for compliance with the installed hardware. If you use a hardware specification with insufficient resources for your deployment, the deployment will fail with the following message:

Failed to deploy the custom foundation model due to an internal error. The runtime failed to start due to 'insufficient resources'. Retry the operation. Contact IBM support if the problem persists.

Resource utilization guidelines for custom hardware specifications

If you want to create a custom hardware specification for your model, follow these guidelines.

Non-quantized models:

Guidelines for custom hardware specifications: non-quantized models

Resource          Calculation
GPU memory (GB)   (Number of billion parameters * 2) + 50% additional memory
Number of GPUs    Depends on GPU memory requirements: 1 GPU = 80 GB
Number of CPUs    Number of GPUs + 1
CPU memory        Equal to GPU memory

Quantized models:

4-bit quantized models:

Guidelines for custom hardware specifications: 4-bit quantized models

Resource          Calculation
GPU memory (GB)   (Number of billion parameters * 0.5) + 50% additional memory
Number of GPUs    Depends on GPU memory requirements: 1 GPU = 80 GB
Number of CPUs    Number of GPUs + 1
CPU memory        Equal to GPU memory

8-bit quantized models:

Guidelines for custom hardware specifications: 8-bit quantized models

Resource          Calculation
GPU memory (GB)   Number of billion parameters + 50% additional memory
Number of GPUs    Depends on GPU memory requirements: 1 GPU = 80 GB
Number of CPUs    Number of GPUs + 1
CPU memory        Equal to GPU memory

Note: If you do not follow these formulas, the model might behave unexpectedly.
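
The calculations in these tables can be sketched as a small shell function. This is an illustration only, not part of the product: the function name hw_spec_sizing is hypothetical, non-quantized models are treated as 16 bits per parameter (which matches the factor of 2 in the first table), and rounding up to whole gigabytes and whole GPUs is an assumption.

```shell
#!/bin/sh
# Sketch of the sizing guidelines (hypothetical helper, not a product command).
# Usage: hw_spec_sizing <billions_of_parameters> <bits_per_parameter: 16, 8, or 4>
hw_spec_sizing() {
  awk -v params="$1" -v bits="$2" 'BEGIN {
    # Bytes per parameter: 2 for non-quantized (16-bit), 1 for 8-bit, 0.5 for 4-bit.
    bytes = bits / 8
    # Base memory plus 50% additional memory, in GB.
    mem = params * bytes * 1.5
    # Assumption: round up to a whole number of GB.
    gpu_mem = (mem == int(mem)) ? mem : int(mem) + 1
    # 1 GPU = 80 GB. Assumption: round up to a whole number of GPUs.
    gpus = int((gpu_mem + 79) / 80)
    # Number of CPUs is the number of GPUs + 1; CPU memory equals GPU memory.
    printf "gpu_mem_gb=%d num_gpu=%d cpu_units=%d cpu_mem_gb=%d\n", gpu_mem, gpus, gpus + 1, gpu_mem
  }'
}

hw_spec_sizing 13 16   # 13 billion parameters, non-quantized
hw_spec_sizing 70 4    # 70 billion parameters, 4-bit quantized
```

Under these assumptions, a non-quantized model with 13 billion parameters needs 39 GB of GPU memory, which fits on one GPU with two CPUs, and a 4-bit model with 70 billion parameters needs 53 GB.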

Creating custom hardware specifications in projects

Use the following code sample to create a custom hardware specification for your model in a project:

curl -ik -X POST -H "Authorization: Bearer $TOKEN" "https://<cluster_url>/v2/hardware_specifications?project_id=$project_id" \
-H "Content-Type:application/json" \
--data '{
  "name": "custom_hw_spec",
  "description": "Custom hardware specification for foundation models",
  "nodes": {
    "cpu": {
      "units": "2"
    },
    "mem": {
      "size": "128Gi"
    },
    "gpu": {
      "num_gpu": 1
    }
  }
}'

Creating custom hardware specifications in deployment spaces

Use the following code sample to create a custom hardware specification for your model in a deployment space:

curl -ik -X POST -H "Authorization: Bearer $TOKEN" "https://<cluster_url>/v2/hardware_specifications?space_id=$space_id" \
-H "Content-Type:application/json" \
--data '{
  "name": "custom_hw_spec",
  "description": "Custom hardware specification for foundation models",
  "nodes": {
    "cpu": {
      "units": "2"
    },
    "mem": {
      "size": "128Gi"
    },
    "gpu": {
      "num_gpu": 1
    }
  }
}'
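
The project and deployment space requests send the same body and differ only in the query parameter, so both can be wrapped in one parameterized helper. The following is a minimal sketch under stated assumptions: the function names hw_spec_payload and create_hw_spec are hypothetical, and CLUSTER_URL and TOKEN are environment variables that you set yourself. The endpoint, headers, and JSON fields are taken from the samples above.

```shell
#!/bin/sh
# Hypothetical helper: build the request body used in the samples above.
# $1 = name, $2 = CPU units, $3 = memory size (for example 128Gi), $4 = number of GPUs
hw_spec_payload() {
  printf '{"name":"%s","description":"Custom hardware specification for foundation models","nodes":{"cpu":{"units":"%s"},"mem":{"size":"%s"},"gpu":{"num_gpu":%d}}}\n' \
    "$1" "$2" "$3" "$4"
}

# Hypothetical wrapper: $1 is the scope, either "project_id=<id>" or "space_id=<id>".
# The remaining arguments are passed to hw_spec_payload.
# Assumes that CLUSTER_URL and TOKEN are set in the environment.
create_hw_spec() {
  scope="$1"; shift
  curl -ik -X POST "https://${CLUSTER_URL}/v2/hardware_specifications?${scope}" \
    -H "Authorization: Bearer ${TOKEN}" \
    -H "Content-Type: application/json" \
    --data "$(hw_spec_payload "$@")"
}

# Print the body without sending it, to check it before you call the API.
hw_spec_payload custom_hw_spec 2 128Gi 1
```

For example, create_hw_spec "space_id=$space_id" custom_hw_spec 2 128Gi 1 would send the request for a deployment space, and passing "project_id=$project_id" instead would target a project.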

Creating custom hardware specifications for selected GPU nodes

To create a custom hardware specification for specific GPU nodes, you can use the node_selector field and specify the label_name and label_value of the GPU node that you want to use. You can also create custom node labels for the GPU node and provide the node label name and value in the node_selector field when you create the custom hardware specification.

For example, suppose that your cluster has two NVIDIA A100 GPUs with the labels nvidia.com/gpu.product=NVIDIA-A100-SXM4-80GB and nvidia.com/gpu.product=NVIDIA-A100-80GB-PCIe. If you do not provide values for node_selector when you create the custom hardware specification, watsonx.ai automatically chooses one of the available GPUs to deploy your custom foundation model. To deploy your custom foundation model on a specific GPU node, you must specify the label_name and label_value in the node_selector field of the custom hardware specification that you use to deploy the model.

The following code sample shows how to create a custom hardware specification for the node selector nvidia.com/gpu.product=NVIDIA-A100-SXM4-80GB by adding the label_name and label_value in the node_selector field:

curl -ik -X POST -H "Authorization: Bearer $TOKEN" "https://<replace with your CPD hostname>/v2/hardware_specifications?project_id=$project_id" \
   -H "Content-Type:application/json" \
   --data '{
      "name": "custom_hw_spec",
      "description": "Custom hardware specification for foundation models",
      "nodes": {
         "cpu": {
             "units": "2"
         },
         "mem": {
             "size": "128Gi"
         },
         "gpu": {
             "num_gpu": 1
         },
         "node_selector": [
             {
                 "label_name": "nvidia.com/gpu.product",
                 "label_value": "NVIDIA-A100-SXM4-80GB"
             }
         ]
      }
}'

As a result, the new hardware specification appears in the Select a hardware specification drop-down menu when you deploy the model in the UI.
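
The mapping from a node label to the node_selector fields can be sketched as follows. The helper name node_selector_json is hypothetical. The sketch assumes that you already know the label of the target node, for example from the output of kubectl get nodes --show-labels on a cluster where you have command-line access.

```shell
#!/bin/sh
# Hypothetical helper: turn a node label written as name=value into the
# node_selector fragment that is used in the request body.
node_selector_json() {
  label_name=${1%%=*}   # text before the first "="
  label_value=${1#*=}   # text after the first "="
  printf '"node_selector":[{"label_name":"%s","label_value":"%s"}]\n' \
    "$label_name" "$label_value"
}

node_selector_json "nvidia.com/gpu.product=NVIDIA-A100-SXM4-80GB"
```

The printed fragment goes inside the nodes object of the request body, next to the cpu, mem, and gpu entries.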
