NVIDIA Multimodal Processing Engine settings

This section describes how to configure the NVIDIA Multimodal pipeline.

The DocumentProcessor custom resource (CR) supports the following configuration options for controlling the NVIDIA Multimodal file processing engine. Each option affects content chunking, processing performance, and search quality. When you select a processing approach, consider the tradeoffs among these factors.
Table: NVIDIA Multimodal DocumentProcessor CR configuration options and considerations

spec > tasks > name > nvidiaExtract > extractText
  Valid option values: True/False (Default: True)
  Description: Extracts the actual text from a document, including the text in tables, charts, and infographics.
  Considerations: Text extraction is the most performant option for ingestion.

spec > tasks > name > nvidiaExtract > extractTables
  Valid option values: True/False (Default: True)
  Description: Extracts text by summarizing each table.
  Considerations: A second chunk is created if extractText is also selected.

spec > tasks > name > nvidiaExtract > extractCharts
  Valid option values: True/False (Default: True)
  Description: Extracts text by summarizing each chart.
  Considerations: A second chunk is created if extractText is also selected.

spec > tasks > name > nvidiaExtract > extract > infographics
  Valid option values: True/False (Default: True)
  Description: Extracts text by summarizing each infographic.
  Considerations: A second chunk is created if extractText is also selected.

spec > tasks > name > nvidiaExtract > extractImages
  Valid option values: True/False (Default: True)
  Description: Extracts a caption by summarizing each image.
  Considerations: Increases ingestion time.

spec > tasks > name > nvidiaExtract > tableOutputFormat
  Valid option values: Enumeration: markdown, pseudo_markdown, simple (Default: pseudo_markdown)
  Description: Specifies the format in which extracted tables are returned as the text that is vectorized.
  Considerations: N/A

spec > tasks > name > nvidiaExtract > extractTextDepth
  Valid option values: Enumeration: page, document (Default: page)
  Description: Sets the level of granularity (page or document) at which text is extracted.
  Considerations: Impacts the number of chunks that are stored and the size of the text that is returned.

spec > tasks > name > nvidiaSplit > chunkSize
  Valid option values: Numeric (Default: 1024)
  Description: Size of each chunk in number of bytes.
  Considerations: For more information on all splitting options, see Split Documents.

spec > tasks > name > nvidiaSplit > chunkOverlap
  Valid option values: Numeric (Default: 150)
  Description: Number of bytes to overlap between consecutive chunks.

spec > tasks > name > nvidiaSplit > tokenizer
  Valid option values: String (Default: meta-llama/Llama-3.2-1B)
  Description: Identifies the embedding model with which to split the content.

spec > tasks > name > nvidiaSplit > splitSourceTypes
  Valid option values: Array of Strings (Default: empty)
  Description: Identifies the file types to which the preceding nvidiaSplit configuration applies.
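
To get a feel for how chunkSize and chunkOverlap interact, the following shell sketch estimates how many chunks a document of a given size might produce. It assumes a plain sliding window in which each chunk starts chunkSize minus chunkOverlap bytes after the previous one. That is an illustrative assumption only: the actual splitter relies on the configured tokenizer and may behave differently. The function name estimate_chunks is hypothetical and is not part of the product.

```shell
# Estimate the chunk count for a document under a plain sliding window.
# ASSUMPTION: each chunk starts (chunkSize - chunkOverlap) bytes after the
# previous one; the actual tokenizer-driven splitter may behave differently.
estimate_chunks() {
  doc_bytes=$1
  chunk_size=${2:-1024}     # default chunkSize from the options table
  chunk_overlap=${3:-150}   # default chunkOverlap from the options table
  stride=$(( chunk_size - chunk_overlap ))
  if [ "$stride" -le 0 ]; then
    echo "chunkOverlap must be smaller than chunkSize" >&2
    return 1
  fi
  if [ "$doc_bytes" -le "$chunk_size" ]; then
    echo 1
    return 0
  fi
  # One chunk for the first chunkSize bytes, then one more chunk per stride
  # (rounded up) for the remaining bytes.
  echo $(( 1 + (doc_bytes - chunk_size + stride - 1) / stride ))
}

estimate_chunks 10000   # 10,000-byte document with the default settings
```

With the defaults, the stride is 874 bytes, so the 10,000-byte example prints 12. Under this model, a larger chunkOverlap shrinks the stride and raises the number of stored chunks for the same document.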

During the Creating a domain and connecting with data source procedure, the CAS user interface creates a Domain custom resource (CR) and an associated NVIDIA Multimodal DocumentProcessor CR. The DocumentProcessor CR configures the processing engine to use the default settings, which include extracting text only and processing content on a per‑page basis. To apply more configuration options, you must create or modify the NVIDIA Multimodal DocumentProcessor CR by using the OpenShift console or the CLI.

Configuring NVIDIA Multimodal pipeline

After you create a Domain CR in the CAS user interface, a corresponding DocumentProcessor CR is generated for that domain. Update the spec section of the DocumentProcessor CR with the required configuration options.
Note: Changes to processing options affect only newly ingested or updated documents. Previously processed documents are reingested with the new settings only when the original document is modified.
  1. Find the DocumentProcessor CR that is associated with the Domain.
    Tip: The DocumentProcessor CR has the same name as the Domain.
    Example:
    • oc get domains.cas.isf.ibm.com -n ibm-cas
    • oc get documentprocessors.cas.isf.ibm.com -n ibm-cas
  2. View the current configuration of the DocumentProcessor CR.
    Example:
    oc get documentprocessors.cas.isf.ibm.com mydomain -n ibm-cas -o yaml
  3. Ensure that the type is nvidia_multimodal. Example: type: nvidia_multimodal
  4. Modify the DocumentProcessor CR according to the options listed in the table.
  5. Save the DocumentProcessor CR.
  6. For the changes to take effect, the associated processing engine pods must be restarted. To restart the processing engine pods, follow these steps:
    1. List the Deployments in the ibm-cas namespace.
      Example:
      oc get deployment -n ibm-cas
    2. Find the Deployment that has the same name as the Domain. Example: mydomain
    3. Delete the Deployment.
      Example:
      oc delete deployment/mydomain -n ibm-cas
    4. List the Pods in the ibm-cas namespace.
      Example:
      oc get pods -n ibm-cas
    5. Find the CAS Operator Controller Pod with a name that begins with ibm-isf-cas-operator-controller-manager. Example: ibm-isf-cas-operator-controller-manager-6dd8f5dc86-pjdrs
    6. Delete the Pod.
      Example:
      oc delete pod/ibm-isf-cas-operator-controller-manager-6dd8f5dc86-pjdrs -n ibm-cas
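
The type check and the restart in the preceding steps can be collected into a few shell functions. This is a minimal sketch, not a supported script: it assumes the example Domain name mydomain and the ibm-cas namespace from the steps, and the function names check_type, operator_pod, and restart_engine are hypothetical helpers. It uses only the oc get and oc delete commands shown in the procedure, plus the standard -o name output option.

```shell
NS=ibm-cas
DOMAIN=mydomain   # example Domain name used in the steps above
CR=documentprocessors.cas.isf.ibm.com

# Steps 2-3: succeed only if the CR reports the nvidia_multimodal type.
check_type() {
  oc get "$CR" "$DOMAIN" -n "$NS" -o yaml | grep -q 'type: nvidia_multimodal'
}

# Step 6.5: read `oc get pods -o name` output on stdin and print the first
# CAS Operator Controller pod (lines look like pod/<name>).
operator_pod() {
  grep -m 1 '^pod/ibm-isf-cas-operator-controller-manager-'
}

# Step 6: delete the Domain's Deployment, then the operator pod, so that the
# operator reconciles the CR and re-creates the Deployment.
restart_engine() {
  oc delete "deployment/$DOMAIN" -n "$NS" || return 1
  pod=$(oc get pods -n "$NS" -o name | operator_pod) || {
    echo "CAS Operator Controller pod not found in $NS" >&2
    return 1
  }
  oc delete "$pod" -n "$NS"
}
```

A typical session would run check_type, edit the CR interactively with oc edit documentprocessors.cas.isf.ibm.com mydomain -n ibm-cas, and then call restart_engine. Matching the operator pod by its name prefix avoids copying the generated pod suffix by hand.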
Results
  • The CAS Operator Controller Pod restarts and reconciles the updated DocumentProcessor CR.
  • During reconciliation, the operator re-creates the Deployment associated with the Domain and DocumentProcessor CRs by using the updated configuration. The corresponding processing engine pods are then started with the new settings.
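
One way to confirm these results from the CLI is to poll until the Deployment exists again and then wait for its rollout to finish. The sketch below assumes the example Domain name mydomain and the ibm-cas namespace. The function name wait_for_engine, the retry count, the delay, and the 5-minute rollout timeout are illustrative choices, not product defaults.

```shell
NS=ibm-cas
DOMAIN=mydomain   # example Domain name; the Deployment has the same name

# Poll until the operator has re-created the Deployment, then wait for the
# rollout of the processing engine pods to complete.
wait_for_engine() {
  tries=${1:-30}   # illustrative values, not product defaults
  delay=${2:-10}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if oc get "deployment/$DOMAIN" -n "$NS" >/dev/null 2>&1; then
      # Returns the exit status of the rollout check.
      oc rollout status "deployment/$DOMAIN" -n "$NS" --timeout=5m
      return
    fi
    i=$(( i + 1 ))
    sleep "$delay"
  done
  echo "Deployment $DOMAIN was not re-created after $tries checks" >&2
  return 1
}
```

Calling wait_for_engine with no arguments checks every 10 seconds for up to 30 attempts. Polling first matters because oc rollout status fails immediately if the Deployment has not been re-created yet.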