Setting up retrieval augmented generation (RAG)
Retrieval augmented generation (RAG) improves large language model (LLM) output by augmenting the prompt with additional context. When you submit a query, watsonx Code Assistant uses RAG tools to retrieve information from your code bases or documentation.
Before you begin
Before you set up RAG, ensure that you meet the following requirements:
- You have cluster administrator access to enable RAG on your IBM® Software Hub instance.
- You have access to the code repositories or documentation that you want to index.
- You have a GitHub personal access token for accessing private repositories.
About this task
watsonx Code Assistant uses RAG to enhance response quality by retrieving relevant, up-to-date context from your code bases and documentation. This relevant context is appended to your query before it is sent to the large language model (LLM), which reduces model hallucinations and improves the accuracy of generated responses.
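The augmentation step described above can be pictured with a minimal Python sketch. This is only an illustration of appending retrieved context to a query. The function name `augment_prompt` and the "Context:" layout are hypothetical and do not reflect the actual prompt format that watsonx Code Assistant uses.

```python
from typing import List


def augment_prompt(query: str, retrieved_chunks: List[str]) -> str:
    """Append retrieved context to the user query.

    Hypothetical layout: the real prompt template is not documented here.
    """
    if not retrieved_chunks:
        # Nothing was retrieved, so the query is sent unchanged.
        return query
    # Join the retrieved passages and place them after the query.
    context = "\n\n".join(retrieved_chunks)
    return f"{query}\n\nContext:\n{context}"
```

In this sketch, a query with no retrieved passages is passed through unchanged, so the model receives additional context only when the retriever finds something relevant.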
You can configure watsonx Code Assistant to use specific code repositories and project documentation that are stored in Git repositories. Supported documentation formats include API documents, readme files, technical and design documents, Markdown files, PDFs, Word documents, and PowerPoint presentations.
The RAG system determines which sources to include or exclude to generate responses with the most useful information. The following figure illustrates the RAG configuration workflow:
[Figure: RAG configuration workflow]
Procedure
Results
watsonx Code Assistant uses the indexed repositories based on the following conditions:
- If one repository is opened in Visual Studio Code, watsonx Code Assistant searches for context in the opened repository by default.
- If multiple repositories are opened in Visual Studio Code, watsonx Code Assistant searches for context from the repository that is associated with the most recently accessed file.
- When you use the @repo command, watsonx Code Assistant checks for a repo.yaml file in the indexed repository. If one or more YAML configuration files are configured, watsonx Code Assistant uses all the configured repositories to generate a response. If no YAML configuration is found, watsonx Code Assistant uses the currently selected repository.
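The repository selection conditions above can be summarized as a small decision function. The following Python sketch is only a model of the documented behavior, not the product's implementation. All names (`default_repository`, `repositories_for_query`, and their parameters) are hypothetical, and the sketch assumes that the "currently selected repository" is the same repository that would be searched by default.

```python
from typing import List, Optional


def default_repository(open_repos: List[str],
                       recent_repo: Optional[str]) -> Optional[str]:
    """Pick the repository that is searched when @repo is not used."""
    if not open_repos:
        return None
    if len(open_repos) == 1:
        # One repository open: search it by default.
        return open_repos[0]
    # Several repositories open: use the one that is associated
    # with the most recently accessed file.
    return recent_repo


def repositories_for_query(open_repos: List[str],
                           recent_repo: Optional[str],
                           uses_repo_command: bool = False,
                           yaml_repos: Optional[List[str]] = None) -> List[str]:
    """Return the repositories to search for one query."""
    if uses_repo_command and yaml_repos:
        # @repo with a YAML configuration: use every configured repository.
        return list(yaml_repos)
    # No @repo, or @repo without YAML: fall back to the selected repository
    # (assumed here to be the default repository).
    current = default_repository(open_repos, recent_repo)
    return [current] if current is not None else []
```

For example, `repositories_for_query(["app", "lib"], "lib")` yields only `["lib"]`, while adding `uses_repo_command=True` and `yaml_repos=["app", "lib", "shared"]` yields all three configured repositories.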
watsonx Code Assistant uses the indexed document collections based on the following conditions:
- If a document is opened in Visual Studio Code, watsonx Code Assistant searches for context in the opened document collection by default.
- If multiple document collections are opened in Visual Studio Code, watsonx Code Assistant searches for context from the most recently accessed document collection.
- When you use the @docs command, watsonx Code Assistant checks for a docs.yaml file in the indexed repository. If one or more YAML configuration files are configured, watsonx Code Assistant uses all the configured document collections to generate a response. If no YAML configuration is found, watsonx Code Assistant uses all documents with the docs_name prefix in your deployment space.
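The @docs fallback differs from the @repo fallback: without a YAML configuration, the search widens to every document that carries the prefix. A minimal Python sketch of that rule follows. The function name `collections_for_docs_command` and its parameters are hypothetical, and the sketch assumes that the prefix is a plain string that is matched against the start of each document name.

```python
from typing import List


def collections_for_docs_command(yaml_collections: List[str],
                                 deployed_documents: List[str],
                                 docs_name_prefix: str) -> List[str]:
    """Return the document collections to search when @docs is used."""
    if yaml_collections:
        # A YAML configuration exists: use every configured collection.
        return list(yaml_collections)
    # No YAML configuration: use all documents in the deployment space
    # whose name starts with the prefix (assumed plain string match).
    return [name for name in deployed_documents
            if name.startswith(docs_name_prefix)]
```

A YAML configuration therefore narrows the search to named collections, whereas omitting it searches everything under the prefix.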
What to do next
- Review the use case scenarios to understand how to implement RAG for different team structures and access requirements. For more information, see Use case scenarios for RAG.
- Optionally, set up a YAML configuration to allow watsonx Code Assistant to search multiple repositories simultaneously or use specific indexed code repositories or documents in the vector store. For more information, see Setting up YAML configuration for RAG.