IBM Docling Multimodal processing engine
IBM Docling Multimodal is an open-source document processing engine that converts unstructured documents into structured, machine-readable formats for use in AI applications. It enables you to extract, parse, embed, and store document vectors by using Docling with Granite embedding models.
The supported document types for text-based extraction are .pdf, .txt, .md, .asciidoc, .csv, .html, .xhtml, .docx, .pptx, and .xlsx. File types that contain no text, such as image and audio files, are not currently supported.
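Before queuing files for ingestion, it can help to filter out unsupported types. The following is a minimal sketch of such a check; the function name is_docling_supported is hypothetical, and the case-insensitive comparison is an assumption rather than documented product behavior.

```shell
# Hypothetical helper: succeeds (exit 0) when the file name ends in one of
# the extensions listed above, and fails (exit 1) otherwise.
is_docling_supported() {
  local name ext
  name="${1##*/}"                     # drop any leading directory path
  case "$name" in
    *.*) ext="${name##*.}" ;;         # text after the last dot
    *)   return 1 ;;                  # no extension at all
  esac
  # Assumption: extensions are compared case-insensitively.
  ext="$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    pdf|txt|md|asciidoc|csv|html|xhtml|docx|pptx|xlsx) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_docling_supported report.pdf && echo "ok to ingest"` prints the message, while a file such as photo.png makes the function return a nonzero status.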
Enabling Docling Multimodal document processor engine
To enable the Docling Multimodal document processor engine from the command line, edit the CasInstall Custom Resource (CR) and add the following configuration under the spec section:
documentProcessingEngine:
  docling_multimodal:
    enabled: true
Alternatively, you can patch the CasInstall CR instance to enable docling_multimodal, as shown in the following example:
oc patch casinstalls.cas.isf.ibm.com ibm-cas-service-instance -n ibm-cas --type=merge -p '{"spec":{"documentProcessingEngine":{"docling_multimodal":{"enabled":true}}}}'
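To confirm that the setting was applied, you can read the field back from the CR. The following sketch wraps a standard oc get call with a JSONPath output expression; the function name docling_multimodal_state is hypothetical, and the resource, instance, and namespace names are copied from the patch example above. It reports only the configured value in the spec section, not whether the Docling services are running.

```shell
# Hypothetical helper: prints the configured value of
# spec.documentProcessingEngine.docling_multimodal.enabled.
# Output is empty if the field has never been set.
docling_multimodal_state() {
  oc get casinstalls.cas.isf.ibm.com ibm-cas-service-instance -n ibm-cas \
    -o jsonpath='{.spec.documentProcessingEngine.docling_multimodal.enabled}'
}
```

Calling `docling_multimodal_state` after the patch should print the value that you set, provided that you are logged in to the cluster and the instance name matches your installation.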
Disabling Docling Multimodal document processor engine
Disabling the Docling Multimodal document processor engine removes the Docling services. Ingestion and queries are then no longer possible for any domains that are configured to use Docling Multimodal.
To disable the Docling Multimodal document processor engine from the command line, edit the CasInstall CR and set the following configuration under the spec section:
documentProcessingEngine:
  docling_multimodal:
    enabled: false
Alternatively, you can patch the CasInstall CR instance to disable docling_multimodal, as shown in the following example:
oc patch casinstalls.cas.isf.ibm.com ibm-cas-service-instance -n ibm-cas --type=merge -p '{"spec":{"documentProcessingEngine":{"docling_multimodal":{"enabled":false}}}}'
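Because the enable and disable commands differ only in the boolean value, the two patches can be wrapped in one small helper. The following is a sketch under stated assumptions: the function names docling_patch_payload and set_docling_multimodal are hypothetical, while the resource, instance, namespace, and JSON structure are taken from the examples above.

```shell
# Hypothetical helper: prints the merge-patch JSON for the given state.
# Accepts only the literal values "true" or "false".
docling_patch_payload() {
  case "$1" in
    true|false) ;;
    *) echo "usage: docling_patch_payload true|false" >&2; return 2 ;;
  esac
  printf '{"spec":{"documentProcessingEngine":{"docling_multimodal":{"enabled":%s}}}}' "$1"
}

# Hypothetical helper: applies the patch to the CasInstall CR instance.
set_docling_multimodal() {
  local payload
  payload="$(docling_patch_payload "$1")" || return
  oc patch casinstalls.cas.isf.ibm.com ibm-cas-service-instance \
    -n ibm-cas --type=merge -p "$payload"
}
```

With these definitions, `set_docling_multimodal false` issues the same patch as the disable example, and `set_docling_multimodal true` issues the enable patch. Rejecting any other argument keeps a typo from producing an invalid patch.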