GenaiMultiDocumentQuery class

The GenaiMultiDocumentQuery class performs vector queries across multiple documents and operates in four modes: RAG, entire document, adaptive entire document, and adaptive RAG.

Processing modes

The GenaiMultiDocumentQuery API supports multiple processing modes that determine how document content is supplied to the LLM for processing. The API caller controls the mode by specifying a value in the GenaiMaxDocumentChunks parameter.
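The mapping from GenaiMaxDocumentChunks values to processing modes, as listed in Table 1, can be modeled with a short sketch. This is an illustration only and not part of the Content Platform Engine API: the names ProcessingMode and resolve_mode are hypothetical, and rejecting other negative values is an assumption because the documentation does not describe how they are handled.

```python
from enum import Enum


class ProcessingMode(Enum):
    """Hypothetical labels for the four documented processing modes."""
    RAG = "RAG mode"
    ENTIRE_DOCUMENT = "Entire document mode"
    ADAPTIVE_ENTIRE_DOCUMENT = "Adaptive entire document mode"
    ADAPTIVE_RAG = "Adaptive RAG mode"


def resolve_mode(max_document_chunks):
    """Map a GenaiMaxDocumentChunks value to the mode it selects."""
    # null (None here) or any positive integer selects RAG mode.
    if max_document_chunks is None or max_document_chunks > 0:
        return ProcessingMode.RAG
    if max_document_chunks == -1:
        return ProcessingMode.ENTIRE_DOCUMENT
    if max_document_chunks == 0:
        return ProcessingMode.ADAPTIVE_ENTIRE_DOCUMENT
    if max_document_chunks == -2:
        return ProcessingMode.ADAPTIVE_RAG
    # Assumption: values other than those documented are treated as invalid.
    raise ValueError(f"Unsupported GenaiMaxDocumentChunks value: {max_document_chunks}")
```

For example, resolve_mode(None) and resolve_mode(12) both yield ProcessingMode.RAG, while resolve_mode(-2) yields ProcessingMode.ADAPTIVE_RAG.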

Important: The GenaiMultiDocumentQuery API processing modes (RAG mode, entire document mode, adaptive entire document mode, and adaptive RAG mode) are available in Content Platform Engine 5.7 IF04 and later.

RAG mode

By default, GenaiMultiDocumentQuery uses a retrieval-augmented generation (RAG) approach to process document content. In this mode, the API performs a vector search to identify the most relevant portions of the selected documents and sends only those chunks to the generative model. Full document text is not sent.

The caller controls chunk selection through the GenaiMaxDocumentChunks parameter:

  • If the caller passes null, the API uses a default value of 8 document chunks.
  • If the caller specifies a positive integer, the API uses that value as the maximum number of chunks to retrieve.

Only the selected chunks are sent to the generative model for response generation. The GenaiVectorChunks output parameter contains the document chunks that were used. At least one of the selected documents must be vector indexed.
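The chunk limit described above can be illustrated with a minimal sketch. It assumes that the vector search yields (chunk text, relevance score) pairs and that higher scores indicate better matches. Both are assumptions, as is the function name select_chunks; the actual ranking that the API uses is not documented here.

```python
DEFAULT_MAX_CHUNKS = 8  # documented default when GenaiMaxDocumentChunks is null


def select_chunks(scored_chunks, max_document_chunks=None):
    """Return the text of the highest-scoring chunks, up to the chunk limit.

    scored_chunks is a list of (text, score) pairs (hypothetical shape).
    """
    if max_document_chunks is None:
        limit = DEFAULT_MAX_CHUNKS
    elif max_document_chunks > 0:
        limit = max_document_chunks
    else:
        # Zero and negative values select other modes, not RAG mode.
        raise ValueError("RAG mode requires null or a positive integer")
    # Assumption: a higher score means a more relevant chunk.
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    return [text for text, _score in ranked[:limit]]
```

With ten candidate chunks and no explicit limit, the sketch keeps eight of them; with max_document_chunks=2, it keeps only the two highest-scoring chunks.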

Use RAG mode when you need selective retrieval to limit the amount of content that is sent to the model. This mode is appropriate when document citations are sufficient without full document context.

Entire document mode

Entire document mode bypasses retrieval-augmented generation (RAG) and sends full document content to the generative model. To enable this mode, the caller passes -1 in the GenaiMaxDocumentChunks parameter. In entire document mode, the API does not perform a vector search. Instead, it sends the entire extracted text of the selected documents to the generative model for processing.

The API enforces a maximum combined document size of 400,000 characters. If the total extracted text of all selected documents exceeds this limit, the API truncates the text as follows:

  • All text from smaller documents is preserved.
  • The remaining, larger documents are truncated to a common maximum length.

The GenaiVectorChunks output parameter is returned as an empty string if no truncation occurs, or set to TRUNCATED if truncation occurs. Documents do not need to be vector indexed for this mode.
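One way to read the truncation behavior is as a fair-share allocation: short documents keep all of their text, and the leftover budget is split evenly among the longer ones. The following sketch implements that interpretation. The exact algorithm that the API uses is not documented, so the function names and the allocation strategy are assumptions.

```python
MAX_COMBINED_CHARS = 400_000  # documented combined size limit


def common_cap(lengths, limit):
    """Return the shared per-document cap, or None if everything fits."""
    if sum(lengths) <= limit:
        return None
    remaining = limit
    ordered = sorted(lengths)
    for index, length in enumerate(ordered):
        # Split the remaining budget evenly among documents not yet placed.
        share = remaining // (len(ordered) - index)
        if length > share:
            # This document and all longer ones are cut to the same cap.
            return share
        remaining -= length  # a smaller document is preserved in full
    return None  # not reached when the total exceeds the limit


def truncate_documents(texts, limit=MAX_COMBINED_CHARS):
    """Return (texts after truncation, GenaiVectorChunks-style marker)."""
    cap = common_cap([len(text) for text in texts], limit)
    if cap is None:
        return list(texts), ""
    # Slicing leaves documents shorter than the cap unchanged.
    return [text[:cap] for text in texts], "TRUNCATED"
```

For three documents of 100, 500, and 500 characters and a limit of 700, the sketch preserves the 100-character document and cuts the other two to 300 characters each.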

Use entire document mode when full document context is required and the overall document size is predictable. This mode never returns document citations.

Adaptive entire document mode

Adaptive entire document mode automatically selects between entire document processing and RAG based on the total document size. To enable this mode, the caller passes 0 in the GenaiMaxDocumentChunks parameter.

The API measures the total extracted text of all selected documents and applies the following logic:

  • If the total text is less than 400,000 characters, the API sends the entire document text to the model without performing a RAG query, and GenaiVectorChunks is not populated.
  • If the total text exceeds 400,000 characters, the API reverts to RAG processing, selects relevant document chunks using vector search, and GenaiVectorChunks contains the chunks that were used.

At least one of the selected documents must be vector indexed. Use this mode when document size varies or is unknown and you want the API to choose the optimal strategy. This mode returns document citations only when the total extracted text exceeds the 400,000-character threshold.
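The size-based decision can be sketched as follows. The function name choose_strategy and the returned labels are hypothetical. The documentation does not state what happens when the total is exactly 400,000 characters; this sketch assumes that case is treated as fitting within the limit.

```python
ADAPTIVE_THRESHOLD_CHARS = 400_000  # documented threshold


def choose_strategy(document_texts, threshold=ADAPTIVE_THRESHOLD_CHARS):
    """Pick entire-document or RAG processing from the total text size."""
    total_chars = sum(len(text) for text in document_texts)
    # Assumption: a total exactly equal to the threshold still fits.
    if total_chars <= threshold:
        return "entire_document"  # GenaiVectorChunks is not populated
    return "rag"  # GenaiVectorChunks contains the chunks that were used
```

Because the result depends only on the combined length, adding one large document to a set of small ones can switch the whole request from entire document processing to RAG.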

Adaptive RAG mode

Adaptive RAG mode combines the benefits of RAG mode (document citations) with those of entire document mode (full context). Unlike standard RAG mode, which sends only document chunks, this mode sends both chunks and full documents to the generative model. To enable this mode, the caller passes -2 in the GenaiMaxDocumentChunks parameter.

In this mode, the API combines vector search with full document processing:

  • The API performs a vector search based on the list of document IDs to find the top N document chunks that match the user query.
  • Both the selected chunks and the full extracted text of the documents are sent to the generative model. The chunks are sent first, followed by the complete document text. If necessary, the full document text is truncated to ensure that the combined content does not exceed the LLM context window.

The GenaiVectorChunks output parameter always contains the document chunks that were used, ensuring document citations are always available. All selected documents must be vector indexed.
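The ordering and truncation described for this mode can be sketched as shown below. The sketch measures the budget in characters for simplicity, whereas an LLM context window is normally measured in tokens. The function name, the newline separators, and the max_chars parameter are all assumptions. The documentation does not say what happens if the chunks alone exceed the budget, so the sketch leaves the chunks untouched in that case.

```python
def build_adaptive_rag_context(chunks, documents, max_chars):
    """Return (chunk section, full-text section) for the model context.

    The chunk section comes first and is kept whole; only the full
    document text is shortened to fit the remaining budget.
    """
    chunk_section = "\n".join(chunks)
    # Budget left for full document text after the chunks are placed.
    remaining = max(max_chars - len(chunk_section), 0)
    full_text_section = "\n".join(documents)[:remaining]
    return chunk_section, full_text_section
```

Keeping the chunks intact and trimming only the full text matches the documented guarantee that GenaiVectorChunks always contains the chunks that were used.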

Use this mode when you need document citations while also providing full document context to the model for comprehensive analysis. This mode addresses the limitations of standard RAG mode (chunks only) and entire document mode (no citations).

Table 1. GenaiMultiDocumentQuery class properties
Property | Data type | Description
GenaiLLMPrompt | String | The input prompt from the user. The maximum length for the value is 4000 characters.
GenaiLLMModelName | String | An optional watsonx LLM model name. The maximum length for the value is 256 characters.
GenaiLLMResponse | String | The response from the watsonx LLM.
GenaiLLMMaxOutputTokens | Integer | If you set this parameter, it overrides the LLM maximum output tokens parameter, which has a default value of 4096. You cannot set a value less than 10 or greater than 8192.
GenaiPromptTemplate | String | An optional parameter that specifies the prompt template for your query operation. If you set this parameter, the template that you pass overrides all the default prompt templates that are already configured in Content Platform Engine.
GenaiVectorChunks | String | A JSON value that contains the document chunks that the vector search returns. If GenaiPerformLLMQuery is false, this property holds all the vector query results. If GenaiPerformLLMQuery is true, this property holds the chunks that were submitted to the LLM as context for the LLM query.
GenaiPerformLLMQuery | Boolean | If the value is true, the prompt is submitted to an LLM with the vector chunks as context.
GenaiMaxDocumentChunks | Integer | Controls the processing mode and the maximum number of document chunks. Valid values: null or a positive integer (RAG mode; defaults to 8 chunks if null), -1 (entire document mode), 0 (adaptive entire document mode), -2 (adaptive RAG mode).
GenaiRelevancyFilterLevel | Float | When you set this parameter, it overrides the LLM relevancy score filter level, which has a default value of 0.5. You can set a value between 0.0 and 1.0.
GenaiContextDocuments | List of Id | The list of document IDs that are used as context for the query.
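
The limits in Table 1 can be checked on the client side before a query is submitted. The following sketch is one possible pre-check and is not part of the Content Platform Engine API: the function name and parameter names are hypothetical, and treating the 0.0 and 1.0 bounds of GenaiRelevancyFilterLevel as inclusive is an assumption.

```python
def validate_query_properties(prompt, model_name=None,
                              max_output_tokens=None,
                              relevancy_filter_level=None):
    """Return a list of messages for values outside the documented limits."""
    problems = []
    if len(prompt) > 4000:
        problems.append("GenaiLLMPrompt exceeds 4000 characters")
    if model_name is not None and len(model_name) > 256:
        problems.append("GenaiLLMModelName exceeds 256 characters")
    if max_output_tokens is not None and not 10 <= max_output_tokens <= 8192:
        problems.append("GenaiLLMMaxOutputTokens must be from 10 to 8192")
    # Assumption: both ends of the 0.0 to 1.0 range are allowed.
    if relevancy_filter_level is not None and not 0.0 <= relevancy_filter_level <= 1.0:
        problems.append("GenaiRelevancyFilterLevel must be from 0.0 to 1.0")
    return problems
```

An empty list means that every supplied value is within the documented range; optional properties that are left unset are skipped so that the server defaults (4096 output tokens and a 0.5 relevancy filter level) apply.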