Uploading files as a knowledge source

You can enhance your agent’s knowledge by uploading documents directly to its internal knowledge base. This method is ideal for quick setup and static content.

When to use file uploads

Use file uploads in the following scenarios:

  • Static, reviewed content that rarely changes
  • Domain‑specific information that you want the agent to reference
  • A need for quick knowledge integration without external systems

For frequently updated or large datasets, consider connecting to an external knowledge repository instead.

File requirements

To ensure successful uploads, verify that your files meet the following criteria.

Supported file types and size limits

File extension               Maximum size
.docx, .pdf, .pptx, .xlsx    25 MB
.csv, .html, .txt            5 MB

  • .csv files must be UTF-8 encoded.
  • Each file must have a unique file name.
  • You can upload up to 20 files in a single batch.
  • The total size of all files in a batch must not exceed 30 MB.
  • Each file can contain a maximum of 600 pages.
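
The limits above can be checked locally before you start an upload. The following Python sketch is illustrative only: check_batch and the constant names are hypothetical helpers, not part of any watsonx Orchestrate API. It assumes that 1 MB means 1,048,576 bytes and that file names are compared exactly (case-sensitively), neither of which the requirements state. It does not check the 600-page limit, because counting pages needs format-specific parsing.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes (not stated in the requirements)

# Per-file size limits by extension, copied from the requirements above.
MAX_SIZE_BY_EXT = {
    ".docx": 25 * MB, ".pdf": 25 * MB, ".pptx": 25 * MB, ".xlsx": 25 * MB,
    ".csv": 5 * MB, ".html": 5 * MB, ".txt": 5 * MB,
}
MAX_FILES_PER_BATCH = 20
MAX_BATCH_BYTES = 30 * MB


def check_batch(paths):
    """Return a list of problems found; an empty list means no limit was violated."""
    problems = []
    paths = [Path(p) for p in paths]

    if len(paths) > MAX_FILES_PER_BATCH:
        problems.append(
            f"batch has {len(paths)} files; the limit is {MAX_FILES_PER_BATCH}"
        )

    seen_names = set()
    total_bytes = 0
    for path in paths:
        ext = path.suffix.lower()
        limit = MAX_SIZE_BY_EXT.get(ext)
        if limit is None:
            problems.append(f"{path.name}: unsupported file type {ext!r}")
            continue

        size = path.stat().st_size
        total_bytes += size
        if size > limit:
            problems.append(
                f"{path.name}: {size} bytes exceeds the {limit}-byte limit for {ext} files"
            )

        # Assumption: names are compared exactly as written.
        if path.name in seen_names:
            problems.append(f"{path.name}: duplicate file name in batch")
        seen_names.add(path.name)

        if ext == ".csv":
            try:
                path.read_bytes().decode("utf-8")
            except UnicodeDecodeError:
                problems.append(f"{path.name}: not valid UTF-8")

    if total_bytes > MAX_BATCH_BYTES:
        problems.append(
            f"batch totals {total_bytes} bytes; the limit is {MAX_BATCH_BYTES}"
        )
    return problems
```

For example, check_batch(["handbook.pdf", "prices.csv"]) returns an empty list when both files pass, or one message per violated limit otherwise. The file names here are placeholders.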

Steps to upload files

Follow these steps to upload documents to the knowledge base:

  1. Navigate to Knowledge and click Choose knowledge +.
  2. In Select source, click Upload files.
  3. Click the upload box or drag your files in, then click Next.
  4. In the Knowledge details section,
  5. Click Save.
  6. In the files list, review the uploaded files.
    • To add more files, click Upload files.
    • To remove a file, click the Delete file icon.

Support for document URLs

You can associate external URLs with uploaded files so that users can view the original source.

To add a URL

  1. In the uploaded files list, hover over the file name.
  2. Click the Edit URL source (pencil) icon.
  3. Enter a new URL, or update the existing one.

Using URLs in citations

  • When the agent cites content from that file in the Orchestrate Chat, the citation includes a View source button.
  • View source opens the URL that you assigned.
  • Users can verify or explore the original document by using that link.

Figure 1. View source in citation.

Enhanced document processing capabilities

When you upload files, whether to an existing knowledge source or while creating a new one, the documents are processed with extraction technologies that improve the accuracy and quality of the extracted content.

Key enhancements:

  • Improved table handling: Preserves the structure of tables in documents containing tabular data for more accurate representation.
  • OCR integration: Extracts text from images embedded in documents, so that information that appears only in images is also captured.
  • Markdown output: Stores extracted text in Markdown format to maintain structure, readability, and context.

Known limitations:

  • Page limit: Each file can contain a maximum of 600 pages.
  • Processing time: Large files may require more time to process.

Storing your data

When you upload files or documents directly to the agent, your data is stored securely in an IBM Cloud data center that is located in a specific region. A few exceptions apply depending on where and how you're using the service:

  • If you're using watsonx Orchestrate in IBM Cloud, your data stays in the same IBM Cloud data center where your environment is hosted.
  • If you're using watsonx Orchestrate on AWS, your data is stored in an IBM Cloud region that is geographically closest to your AWS region.
Note: If your environment is hosted in AWS Mumbai, your data remains in AWS Mumbai and is not transferred to an IBM Cloud region.

The following table maps watsonx Orchestrate AWS regions to the IBM Cloud regions where your uploaded data is stored:

watsonx Orchestrate on AWS      Your uploaded data on IBM Cloud
us-east-1 (North Virginia)      us-east (Washington D.C.)
eu-central-1 (Frankfurt)        eu-de (Frankfurt)
ap-southeast-1 (Singapore)      jp-tok (Tokyo)
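
If you need this mapping in a script, for example in a data-residency check, it can be written as a small lookup. The following Python sketch is illustrative only: uploaded_data_location and the constant names are hypothetical, and the code ap-south-1 for Mumbai is the standard AWS region identifier rather than something stated on this page.

```python
# Mapping copied from the reference table above.
AWS_TO_IBM_CLOUD_REGION = {
    "us-east-1": "us-east",        # North Virginia -> Washington D.C.
    "eu-central-1": "eu-de",       # Frankfurt -> Frankfurt
    "ap-southeast-1": "jp-tok",    # Singapore -> Tokyo
}

# Assumption: "ap-south-1" is the AWS region code for Mumbai.
# This page names only the city, not the code.
AWS_MUMBAI = "ap-south-1"


def uploaded_data_location(aws_region):
    """Return (provider, region) for uploaded files of an AWS-hosted environment."""
    if aws_region == AWS_MUMBAI:
        # Per the note above, Mumbai data is not transferred to IBM Cloud.
        return ("AWS", aws_region)
    ibm_region = AWS_TO_IBM_CLOUD_REGION.get(aws_region)
    if ibm_region is None:
        raise KeyError(f"no documented mapping for AWS region {aws_region!r}")
    return ("IBM Cloud", ibm_region)
```

The function raises an error for any region that the table does not list instead of guessing a nearest region, because only these mappings are documented here.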