Configuring Milvus to run on GPU
In an existing Milvus service, you can configure index nodes and query nodes to run on GPU hardware.
watsonx.data on IBM Software Hub
Before you begin
- For software prerequisites, see Software requirements.
- For hardware prerequisites, see Hardware requirements.
About this task
Enabling GPU support in Milvus can provide significant performance gains and lets you use GPU indexes such as GPU_CAGRA, GPU_IVF_FLAT, or GPU_BRUTE_FORCE. Of these, GPU_CAGRA is the recommended index.
Procedure
Troubleshooting
If the query node or index node status remains Pending for a long time, you can check the reason by running:
oc describe po <pod-name> -n ${PROJECT_CPD_INST_OPERANDS}
If the cause is insufficient resources, such as not enough memory or not enough GPUs to satisfy the request, you can:
- Adjust the patch to lower the number of replicas.
- Lower the memory.
- Fix any invalid syntax.
After you correct the patch for a Pending pod, you can restart the controller so that the new patch is applied to the pods sooner:
export OPERATOR_POD=$(oc get po -o name -n $PROJECT_CPD_INST_OPERATORS | grep "ibm-lakehouse-controller-manager-") && oc delete $OPERATOR_POD -n $PROJECT_CPD_INST_OPERATORS
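The Pending-pod checks in this section can be wrapped in two small helpers. This is a minimal sketch, not part of the product: the function names `list_pending_pods` and `show_node_gpus` are hypothetical, and the second helper assumes that the cluster advertises whole GPUs under the `nvidia.com/gpu` resource name (MIG partitions such as `nvidia.com/mig-2g.20gb` would need a different column).

```shell
# Hypothetical helpers for diagnosing Pending Milvus pods.
# Both rely on PROJECT_CPD_INST_OPERANDS being set, as in the commands above.

# List the pods in the operands project that are still in the Pending phase.
list_pending_pods() {
  oc get pods -n "${PROJECT_CPD_INST_OPERANDS}" \
    --field-selector=status.phase=Pending
}

# Show how many GPUs each node reports as allocatable.
# Assumption: GPUs are exposed as the nvidia.com/gpu extended resource.
show_node_gpus() {
  oc get nodes \
    -o 'custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}
```

Running `list_pending_pods` first gives the pod names to pass to `oc describe po`, and `show_node_gpus` helps judge whether the requested replica count can fit on the available hardware.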
Patch examples
The following examples show different ways to patch Milvus. Resource requests, resource limits, replicas, and pod tolerations are fully customizable, so you can accommodate different kinds of hardware and apply advanced tuning.
- Dedicate an entire GPU for each query and index node
  Use the existing replica counts, memory, and CPU settings:

  oc patch wxdengine/lakehouse-milvus496 \
    --type=merge \
    -n ${PROJECT_CPD_INST_OPERANDS} \
    -p '{
      "spec": {
        "milvus_indexnode": {
          "resources": {
            "requests": { "nvidia.com/gpu": "1" },
            "limits": { "nvidia.com/gpu": "1" }
          }
        },
        "milvus_querynode": {
          "resources": {
            "requests": { "nvidia.com/gpu": "1" },
            "limits": { "nvidia.com/gpu": "1" }
          }
        }
      }
    }'
- Dedicate a Multi-Instance GPU (MIG) partition to one query node and one index node
  Update memory to match the MIG partition size:

  oc patch wxdengine/lakehouse-milvus496 \
    --type=merge \
    -n ${PROJECT_CPD_INST_OPERANDS} \
    -p '{
      "spec": {
        "milvus_indexnode": {
          "replicas": 1,
          "resources": {
            "requests": { "nvidia.com/mig-2g.20gb": "1", "memory": "19G" },
            "limits": { "nvidia.com/mig-2g.20gb": "1", "memory": "19G" }
          }
        },
        "milvus_querynode": {
          "replicas": 1,
          "resources": {
            "requests": { "nvidia.com/mig-2g.20gb": "1", "memory": "19G" },
            "limits": { "nvidia.com/mig-2g.20gb": "1", "memory": "19G" }
          }
        }
      }
    }'
- Add toleration to query and index nodes
  Allow the nodes to schedule on tainted Red Hat OpenShift nodes that have NoSchedule and key=nvidia.com/gpu:

  oc patch wxdengine/lakehouse-milvus496 \
    --type=merge \
    -n ${PROJECT_CPD_INST_OPERANDS} \
    -p '{
      "spec": {
        "milvus_indexnode": {
          "replicas": 1,
          "resources": {
            "requests": { "nvidia.com/gpu": "1" },
            "limits": { "nvidia.com/gpu": "1" }
          },
          "tolerations": [
            {
              "effect": "NoSchedule",
              "key": "nvidia.com/gpu",
              "operator": "Equal",
              "value": "present"
            }
          ]
        },
        "milvus_querynode": {
          "replicas": 1,
          "resources": {
            "requests": { "nvidia.com/gpu": "1" },
            "limits": { "nvidia.com/gpu": "1" }
          },
          "tolerations": [
            {
              "effect": "NoSchedule",
              "key": "nvidia.com/gpu",
              "operator": "Equal",
              "value": "present"
            }
          ]
        }
      }
    }'
- Run multiple replicas and customize CPU, memory, and ephemeral storage
  oc patch wxdengine/lakehouse-milvus496 \
    --type=merge \
    -n ${PROJECT_CPD_INST_OPERANDS} \
    -p '{
      "spec": {
        "milvus_indexnode": {
          "replicas": 4,
          "resources": {
            "requests": {
              "nvidia.com/mig-2g.20gb": "1",
              "memory": "19G",
              "cpu": "16",
              "ephemeral-storage": "10G"
            },
            "limits": {
              "nvidia.com/mig-2g.20gb": "1",
              "memory": "19G",
              "cpu": "16",
              "ephemeral-storage": "10G"
            }
          },
          "tolerations": [
            {
              "effect": "NoSchedule",
              "key": "nvidia.com/gpu",
              "operator": "Equal",
              "value": "present"
            }
          ]
        },
        "milvus_querynode": {
          "replicas": 4,
          "resources": {
            "requests": {
              "nvidia.com/mig-2g.20gb": "1",
              "memory": "19G",
              "cpu": "16",
              "ephemeral-storage": "10G"
            },
            "limits": {
              "nvidia.com/mig-2g.20gb": "1",
              "memory": "19G",
              "cpu": "16",
              "ephemeral-storage": "10G"
            }
          },
          "tolerations": [
            {
              "effect": "NoSchedule",
              "key": "nvidia.com/gpu",
              "operator": "Equal",
              "value": "present"
            }
          ]
        }
      }
    }'
- Reset Milvus to clear all customized settings and GPU requests
  oc patch wxdengine/lakehouse-milvus496 \
    --type json \
    -n ${PROJECT_CPD_INST_OPERANDS} \
    -p '[
      { "op": "remove", "path": "/spec/milvus_indexnode" },
      { "op": "remove", "path": "/spec/milvus_querynode" }
    ]'
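Because the merge patches above repeat the same structure for the index node and the query node, the JSON can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: `build_gpu_patch` is a hypothetical helper name, and it only covers the simplest case (a GPU request and limit on both node types, with no replicas, memory, or tolerations).

```shell
# Hypothetical helper: print a merge patch that requests a GPU resource
# for both the Milvus index node and the query node.
#   $1: extended resource name, for example nvidia.com/gpu or nvidia.com/mig-2g.20gb
#   $2: number of units per pod (defaults to 1)
build_gpu_patch() {
  resource="$1"
  count="${2:-1}"
  # The same resources block is used for requests and limits, as in the examples above.
  node_spec=$(printf '{"resources":{"requests":{"%s":"%s"},"limits":{"%s":"%s"}}}' \
    "$resource" "$count" "$resource" "$count")
  printf '{"spec":{"milvus_indexnode":%s,"milvus_querynode":%s}}\n' \
    "$node_spec" "$node_spec"
}

# Example usage (requires a logged-in oc session; not run here):
#   oc patch wxdengine/lakehouse-milvus496 --type=merge \
#     -n "${PROJECT_CPD_INST_OPERANDS}" \
#     -p "$(build_gpu_patch nvidia.com/gpu)"
#
# To confirm what is stored on the resource afterwards:
#   oc get wxdengine/lakehouse-milvus496 -n "${PROJECT_CPD_INST_OPERANDS}" \
#     -o jsonpath='{.spec.milvus_querynode.resources}'
```

Generating the patch keeps the index node and query node settings identical and avoids the invalid-syntax failures mentioned under Troubleshooting. Settings that differ between the two node types are still easier to express with a hand-written patch.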