Date: 2025-03-03

Retrieval augmented generation (RAG) is an artificial intelligence (AI) application that connects a generative AI model with an external knowledge base. The data in the knowledge base augments user queries with more context so the large language model (LLM) can generate more accurate responses. RAG enables LLMs to be more accurate in domain-specific contexts without needing fine-tuning.

Rather than rely solely on training data, RAG-enabled AI models can access current data in real time through APIs and other connections to data sources. A standard RAG pipeline comprises two AI models:

The information retrieval component, typically an embedding model paired with a vector database containing the data to be retrieved.

The generative AI component, usually an LLM.

In response to a natural language user query, the embedding model converts the query to a vector embedding, and the retrieval component uses that embedding to find similar data in the knowledge base. The AI system then combines the retrieved data with the user query so the LLM can generate a context-aware response.
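The query flow described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `KNOWLEDGE_BASE` is a hypothetical three-sentence corpus, `embed` is a toy bag-of-words stand-in for a real embedding model, a plain list takes the place of a vector database, and `generate` is a placeholder where a real LLM call would go. Similarity is measured with cosine similarity, one common choice among several.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base; a real system would keep these in a vector database.
KNOWLEDGE_BASE = [
    "The return window for online orders is 30 days.",
    "Support is available by phone on weekdays from 9 to 5.",
    "Gift cards cannot be exchanged for cash.",
]


def embed(text):
    """Toy stand-in for an embedding model: a sparse bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=1):
    """Embed the query and return the top_k most similar documents."""
    query_vector = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, context_docs):
    """Combine the retrieved data with the user query into one augmented prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


def generate(prompt):
    """Placeholder for the generative component; a real pipeline would call an LLM here."""
    return f"[LLM response to a prompt of {len(prompt)} characters]"


if __name__ == "__main__":
    user_query = "What is the return window for orders?"
    retrieved = retrieve(user_query, KNOWLEDGE_BASE, top_k=1)
    print(generate(build_prompt(user_query, retrieved)))
```

In a production pipeline the documents would typically be embedded once and indexed ahead of time, so only the query needs to be embedded when a request arrives; the sketch re-embeds every document per query purely to stay short.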