Configuring Docling Multimodal GPUs
This section describes how to configure Docling Multimodal GPUs.
By default, the Docling services request the nvidia.com/gpu GPU resource.
If the GPUs are configured to use Multi-Instance GPU
(MIG), a corresponding MIG profile must be assigned to each service for it to start
properly. The IBM Docling Multimodal GPU resources can be overridden in the CasInstall
Custom Resource (CR). Each deployment that requires a GPU has its own override flag for use with
alternative resource configurations.
To change the Docling Multimodal GPU targets from the command line, edit the
CasInstall CR and add the following flags under the spec section:
spec:
  flags:
    - VLLM_VISION_GPU=<gpu target>
    - VLLM_EMBEDDING_GPU=<gpu target>
    - DOCLING_GPU=<gpu target>
For example, to specify the same alternative MIG resource for all three deployments, add the
following flags under the spec section:
spec:
  flags:
    - VLLM_VISION_GPU=nvidia.com/mig-7g.80gb
    - VLLM_EMBEDDING_GPU=nvidia.com/mig-7g.80gb
    - DOCLING_GPU=nvidia.com/mig-7g.80gb
Result: The new GPU resource indicator is detected automatically, and a new deployment is rolled out.
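If the flags are generated by a script rather than typed by hand, a small helper can catch typos in the resource name before the CR is edited. The following Python sketch is illustrative only: the three flag names come from this section, but the helper name build_gpu_flags and the validation pattern are assumptions and are not part of the product. The pattern accepts only nvidia.com/gpu and MIG names of the form nvidia.com/mig-<N>g.<M>gb, so it might need to be relaxed for other profile names.

```python
import json
import re

# The three flag names come from this section; everything else is illustrative.
GPU_FLAGS = ("VLLM_VISION_GPU", "VLLM_EMBEDDING_GPU", "DOCLING_GPU")

# Assumed pattern: the default resource, or a MIG resource such as
# nvidia.com/mig-7g.80gb. Adjust it if your cluster exposes other names.
_TARGET_RE = re.compile(r"^nvidia\.com/(gpu|mig-\d+g\.\d+gb)$")


def build_gpu_flags(target, overrides=None):
    """Return spec.flags entries for a GPU target (hypothetical helper).

    `overrides` optionally maps a flag name to a different target for
    that one deployment.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(GPU_FLAGS)
    if unknown:
        raise ValueError(f"unknown flag(s): {sorted(unknown)}")
    flags = []
    for name in GPU_FLAGS:
        value = overrides.get(name, target)
        if not _TARGET_RE.match(value):
            raise ValueError(f"unexpected GPU target for {name}: {value!r}")
        flags.append(f"{name}={value}")
    return flags


if __name__ == "__main__":
    # JSON is also valid YAML, so the printed fragment can be compared
    # against the spec section of the CR.
    fragment = {"spec": {"flags": build_gpu_flags("nvidia.com/mig-7g.80gb")}}
    print(json.dumps(fragment, indent=2))
```

The helper only builds and checks the flag strings. It does not contact the cluster, and it does not verify that the requested MIG profile is actually available on any node.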