Date: 2024-06-19

According to research from IBM®, about 42% of enterprises surveyed have AI in use in their businesses. Among the use cases, many of us are now familiar with natural language processing chatbots that can answer our questions and assist with tasks such as composing emails or essays. Yet even with widespread adoption of these chatbots, enterprises still run into challenges. For example, these chatbots can produce inconsistent results because they pull from large data stores that might not be relevant to the query at hand.

Thankfully, retrieval-augmented generation (RAG) has emerged as a promising way to ground large language models (LLMs) in the most accurate, up-to-date information. As an AI framework, RAG improves the quality of LLM-generated responses by grounding the model on sources of knowledge that supplement the LLM’s internal representation of information. IBM unveiled its AI and data platform, watsonx™, which offers RAG, in May 2023.

In simple terms, using RAG is like having the model take an open-book exam: you ask the chatbot to respond to a question with all the relevant information readily available. But how does RAG operate at an infrastructure level? With a mixture of platform-as-a-service (PaaS) offerings, RAG can run successfully and with ease, enabling generative AI outcomes for organizations across industries using LLMs.
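
To make the open-book analogy concrete, here is a minimal sketch of the retrieve-then-prompt step in Python. It is not how watsonx or any particular product implements RAG: the sample documents, the word-overlap scoring, and the function names (`retrieve`, `build_prompt`) are all illustrative assumptions, and the final call to an LLM is left out. A production system would typically replace the toy scoring with a search index or vector store.

```python
import re
from collections import Counter

# Hypothetical in-memory knowledge base (assumption for illustration only);
# a real deployment would query a document store or vector database.
DOCUMENTS = [
    "Employees accrue 20 days of paid vacation per year.",
    "The VPN client must be updated every quarter.",
    "Expense reports are due by the fifth business day of each month.",
]


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def score(query: str, document: str) -> int:
    """Toy relevance measure: number of word occurrences shared by both texts."""
    return sum((tokenize(query) & tokenize(document)).values())


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents that share the most words with the query."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Place the retrieved passages in front of the question (the 'open book')."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )


question = "How many vacation days do employees get?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS))
print(prompt)  # this grounded prompt is what would be sent to the LLM
```

The point of the sketch is the ordering: relevant passages are looked up first, then handed to the model alongside the question, so the answer can draw on current source material and not only on what the model memorized during training.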