Q&A with RAG Accelerator

Try the Q&A with RAG Accelerator sample project to set up retrieval-augmented generation (RAG) that generates factually accurate output grounded in information from provided documents.

Obtaining and running the accelerator

Log in to the Resource hub and then create the Q&A with RAG Accelerator sample project. This sample project might not be available in all regions or cloud platforms.

Required services

You need the following services to run the Q&A with RAG Accelerator:

  • watsonx.ai
  • watsonx.data

Running the accelerator

To run the accelerator, open the Readme on the project Overview page and follow the instructions.

What's new in the Q&A with RAG Accelerator

The Q&A with RAG Accelerator is updated separately from watsonx product releases. Check for updates periodically.

Version 2.2.2, February 2026

The following updates are included in this release:

  • Knowledge coverage insight report to help knowledge owners quickly spot topics that need better or expanded content.

  • Taxonomy generation and visualization to view the hierarchical relationships in your content.

  • Synthetic log data generator notebook to create synthetic log records and query feedback data to test analytics, dashboards, and retrieval evaluation. The notebook exports a CSV test file with generated records that you can input to the user feedback analytics notebook.

Version 2.2, December 2025

The following updates are included in this release:

  • New vector database option: OpenSearch vector database (private preview)
  • The Evaluation Metrics Notebook is now integrated with watsonx.governance
  • Provide your own vector database search as the input context for generating answers

Version 2.1, November 2025

The following updates are included in this release:

  • New vector database option: DataStax Astra
  • More document file format options: HTML, MD, DOCX, and PPTX, in addition to PDFs
  • Bulk ingestion of document files
  • Share insights by exporting user feedback charts as HTML
  • Performance improvements
  • Integration with watsonx.data as a Service: The Answer generation function can perform vector searches for watsonx.data Milvus by using the watsonx.data as a Service Retrieval API. Available in the Toronto region only.

Overview

The Q&A with RAG Accelerator provides an advanced RAG pattern and implementation that includes the following processes:

  • Document processing: The conversion, processing, and indexing of documents to generate a vector index.
  • User interaction: User conversations in a UI-based chatbot application.
  • Answer generation: Question answering by RAG based on vector search results or by using watsonx.data as a Service document retrieval APIs.
  • Input/output logging: The logging of questions, retrieved chunks and metadata, and answers in a secondary log index.
  • User feedback collection: User feedback is appended to the matching input/output log entries.
  • Content analysis: A report of the specific content that needs to be enhanced to improve answers that received negative user feedback.
  • Knowledge coverage insights: A report of the topics that need better or expanded content.
  • Taxonomy generation and visualization: A visualization of the hierarchical relationships in your content.
  • Human intervention: The identification of the best experts to respond to unsatisfactory answers by using a vector index with expert profiles.

The following graphic summarizes the processes of the Q&A with RAG Accelerator.


Document processing

A notebook for document processing and indexing automates document conversion, splitting, and indexing in a vector index in one of these vector databases:

  • DataStax Astra
  • watsonx Discovery with Elasticsearch Enterprise or IBM Cloud Databases for Elasticsearch Platinum
  • watsonx.data Milvus
  • watsonx.data OpenSearch

The example document collection to vectorize is a version of the watsonx as a Service documentation set, provided as a ZIP file.

You can customize the notebook to use existing vector indexes that you created outside of watsonx with other tools, such as Elastic connectors and pipelines, Spark pipelines, or your own processes.
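
As an illustration of the splitting step, the following minimal Python sketch breaks a document into overlapping character chunks before indexing. This is not the accelerator's actual notebook code, and the `chunk_size` and `overlap` values are arbitrary assumptions:

```python
def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split text into overlapping character chunks for vector indexing."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back so adjacent chunks share context
    return chunks
```

The overlap keeps sentence fragments that straddle a chunk boundary retrievable from either chunk; production splitters typically also respect sentence and section boundaries.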

User interaction

Sample chatbot applications provide a UI for conversations that can include the following types of interactions:

  • Users ask questions
  • Generated answers and reference links are returned
  • Users provide feedback about the answers
  • Experts provide better answers to address negative user feedback

You can try one of the following sample chatbot applications:

  • watsonx Orchestrate AI assistant
  • Streamlit application

Answer generation

The notebook for the Q&A Python function defines the Python function code and automates its deployment with a well-defined URL suffix in a deployment space. The Q&A with RAG Python function code is configured by parameter sets. The function takes a question as input and queries the vector index to retrieve the chunks that are most relevant to answering the question, along with their metadata, including links to source documents. The function adds those chunks to the configured prompt template and returns the generated answer, the calculated faithfulness score, and the retrieved chunks with metadata. If you store your documents in watsonx.data as a Service, the Answer generation function can perform vector searches for watsonx.data Milvus by using the watsonx.data as a Service Retrieval API.
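
The following hypothetical sketch outlines the retrieve-then-generate flow described above. The `search` and `generate` callables stand in for the vector database query and the model inference call, which the accelerator configures through parameter sets; all names here are illustrative, not the accelerator's API:

```python
def answer_question(question, search, generate, prompt_template, top_k=3):
    """Sketch of a RAG Q&A function: retrieve chunks, build a prompt, generate.

    search(question, top_k) -> list of {"text": ..., "source": ...} dicts.
    generate(prompt) -> generated answer string.
    """
    chunks = search(question, top_k)
    context = "\n\n".join(chunk["text"] for chunk in chunks)
    prompt = prompt_template.format(context=context, question=question)
    answer = generate(prompt)
    return {
        "answer": answer,
        "chunks": chunks,  # retrieved chunks with metadata
        "sources": [chunk["source"] for chunk in chunks],
    }
```

The real function additionally computes a faithfulness score for the answer against the retrieved context before returning it.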

Input/output logging

For each call of the Q&A with RAG Python function, you can enable logging of the prompt input and output text. Any personally identifiable information (PII) is stripped from the strings before logging. The log index is separate from the vector index for the documents in the vector database.

If input/output logging is active, you can enable type-ahead question completion suggestions when users type their questions. If the user accepts a completion based on a recently answered question, the answer is retrieved from the log index, which saves time and reduces GPU inference and retrieval costs.
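
A simple way to picture type-ahead suggestion is prefix matching against recently answered questions. The real feature queries the log index in the vector database; this in-memory sketch is only an illustrative stand-in:

```python
def suggest_completions(prefix, log_entries, limit=5):
    """Suggest recently answered questions that start with the typed prefix.

    log_entries: Q&A log records, oldest first, each with a "question" key.
    """
    prefix = prefix.lower()
    seen, suggestions = set(), []
    for entry in reversed(log_entries):  # most recent first
        question = entry["question"]
        if question.lower().startswith(prefix) and question not in seen:
            seen.add(question)
            suggestions.append(question)
            if len(suggestions) == limit:
                break
    return suggestions
```

When the user accepts a suggestion, the matching log record already contains the answer, so no new retrieval or model inference is needed.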

User feedback collection

User feedback helps stakeholders understand how well the solution works for their users, which document topics users are interested in, and how well the solution answers questions based on the content. You can configure the application to call the Q&A with RAG Python function again to collect any user feedback on the answer. The user feedback of a satisfaction score and an optional comment is then appended to the Q&A log record for subsequent analysis.

To test your solution, you can generate synthetic log records and user feedback data by running the Synthetic Log Data Generator notebook. The notebook exports a CSV test file with generated records. You can then use the generated records to test analytics, dashboards, and retrieval evaluation with the user feedback analytics notebook.
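
As a rough picture of what such a generator produces, the following sketch creates synthetic log records with satisfaction scores and comments and can export them as a CSV test file. The field names and value ranges are assumptions for illustration, not the notebook's actual schema:

```python
import csv
import random
from datetime import datetime, timedelta

def generate_synthetic_logs(n_records, topics, seed=42):
    """Generate synthetic Q&A log records with a satisfaction score per record."""
    rng = random.Random(seed)
    start = datetime(2026, 1, 1)
    rows = []
    for _ in range(n_records):
        topic = rng.choice(topics)
        rows.append({
            "timestamp": (start + timedelta(minutes=rng.randint(0, 10000))).isoformat(),
            "question": f"How do I configure {topic}?",
            "answer": f"Synthetic answer about {topic}.",
            "satisfaction": rng.randint(1, 5),  # 1 = poor, 5 = excellent
            "comment": "" if rng.random() < 0.7 else f"Needs more detail on {topic}.",
        })
    return rows

def export_csv(rows, path):
    """Write the generated records to a CSV test file."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
```

A fixed seed makes the test data reproducible across runs of the generator.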

Content analysis

You can configure the user feedback analytics notebook and run it directly or as a job. The notebook queries the log data for the specified time interval or start and end dates. The notebook loads the log data into a dataframe and uses unsupervised topic detection from BERTopic, Watson Natural Language Processing, or the Top2Vec method to determine which document topics were retrieved most frequently to generate answers. The notebook analyzes and visualizes user satisfaction by topic, and includes the questions, answers, and user feedback comments for low-rated answers. Based on these insights, stakeholders and knowledge content owners can drive content improvements that result in better answers. You can export the content insights charts from the notebook as HTML.
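
The core of satisfaction-by-topic analysis can be pictured in a few lines of Python. This sketch assumes each log row already carries a detected topic (in the notebook, topics come from BERTopic, Watson Natural Language Processing, or Top2Vec) and returns topics sorted worst-first so content owners see the biggest problems at the top:

```python
from collections import defaultdict

def satisfaction_by_topic(log_rows):
    """Average user satisfaction per topic, sorted from lowest to highest.

    log_rows: dicts with "topic" and numeric "satisfaction" keys.
    """
    totals = defaultdict(lambda: [0, 0])  # topic -> [score sum, record count]
    for row in log_rows:
        entry = totals[row["topic"]]
        entry[0] += row["satisfaction"]
        entry[1] += 1
    return sorted(((topic, total / count) for topic, (total, count) in totals.items()),
                  key=lambda pair: pair[1])
```

The notebook goes further, attaching the questions, answers, and feedback comments for the low-rated answers in each topic.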

Knowledge coverage reports (Tech preview)

You can run the knowledge coverage insight report to identify topics that need better or expanded content.

The report performs the following actions:

  1. Compiles reference questions from test data, the RAG log index, and the corpus index (OpenSearch or Watson Discovery). User feedback data is not required.
  2. Evaluates how well your corpus answers user questions and highlights where content needs improvement.
  3. Returns information on auto-detected topics, topic popularity, and knowledge gap analysis.
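
A simplified picture of the gap analysis in step 3 is to compare how often a topic is asked about with how much corpus content covers it. The counting logic and the `min_docs` threshold below are illustrative assumptions, not the report's actual method:

```python
from collections import Counter

def knowledge_gaps(question_topics, corpus_topics, min_docs=3):
    """Flag topics that users ask about but that the corpus covers thinly.

    question_topics: one topic label per reference question.
    corpus_topics: one topic label per indexed corpus chunk.
    Returns (topic, question count, corpus chunk count) tuples for gaps.
    """
    asked = Counter(question_topics)    # topic -> number of user questions
    covered = Counter(corpus_topics)    # topic -> number of corpus chunks
    return [(topic, asked[topic], covered.get(topic, 0))
            for topic in asked if covered.get(topic, 0) < min_docs]
```

Topics with many questions but few covering chunks are the strongest candidates for new or expanded content.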

Taxonomy generation and visualization (Tech preview)

You can run taxonomy generation and visualization to view the hierarchical relationships in your content:

  • Taxonomy generation automatically identifies and extracts key themes from a document corpus, then organizes them into a clear hierarchical taxonomy of parent–child relationships. The taxonomy shows how structured knowledge can be derived from unstructured content.

  • Taxonomy visualization produces interactive visualizations that display the taxonomy structure for easy exploration, delivering a well-organized, intuitive view of the underlying document corpus.
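
A generated taxonomy can be represented as nested parent-child mappings. The following sketch renders such a structure as an indented outline; it is only an illustration of the hierarchical output, not the accelerator's visualization code:

```python
def render_taxonomy(node, name="root", depth=0, lines=None):
    """Render a nested-dict taxonomy as an indented text outline.

    node: {child name: subtree dict}, where a leaf is an empty dict.
    """
    if lines is None:
        lines = []
    lines.append("  " * depth + name)
    for child, subtree in node.items():
        render_taxonomy(subtree, child, depth + 1, lines)
    return lines
```

The interactive visualization renders the same parent-child structure graphically instead of as text.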

Human intervention

When a user is not satisfied with an answer, you can configure the application to retrieve an expert contact who can provide a better answer. You can configure the expert profiling notebook to process expert profile documents and build an index based on that information. For example, the application can route the question to the expert, send the expert's answer to the user, and alert the knowledge base owner to a possible content enhancement.
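
Expert matching against a vector index can be pictured as nearest-neighbor search over expert profile embeddings. This sketch uses cosine similarity over plain Python lists; the embedding format and the `expert_index` structure are assumptions for illustration only:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_expert(question_embedding, expert_index):
    """Return the expert whose profile embedding is closest to the question."""
    return max(expert_index,
               key=lambda expert: cosine(question_embedding, expert["embedding"]))
```

In practice the question is embedded with the same model that was used to embed the expert profile documents, and the vector database performs the similarity search.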

Learn more