In this recipe, you’ll learn how to build an AI-powered multimodal RAG pipeline with Docling, Granite, and LangChain. This tutorial will guide you through the following processes:
- Document preprocessing: Learn how to handle documents from various sources, parse and transform them into usable formats with Docling, and store them in vector databases. You will use a Granite vision language model to generate descriptions of the images in the documents.
- RAG: Understand how to connect LLMs such as Granite with external knowledge bases to enhance query responses and generate valuable insights.
- LangChain for workflow integration: Discover how to use LangChain to streamline and orchestrate document processing and retrieval workflows, enabling seamless interaction between different components of the system.

This tutorial uses three technologies:

- Docling: An open-source toolkit used to parse and convert documents.
- Granite: A state-of-the-art LLM that provides robust natural language capabilities, along with a vision language model that provides image-to-text generation.
- LangChain: A powerful framework used to build applications powered by language models, designed to simplify complex workflows and integrate external tools seamlessly.
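
To make the overall flow concrete before you work with the real tools, here is a minimal sketch of the retrieve-then-prompt pattern that a RAG pipeline follows. It is an illustration only and uses the Python standard library: the `CHUNKS` list is a hypothetical stand-in for the text and image descriptions that Docling and Granite would produce, `embed` is a toy bag-of-words function rather than a real embedding model, and the in-memory `INDEX` stands in for a vector database. None of these names come from Docling, Granite, or LangChain.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for parsed document chunks. In the tutorial, these would
# come from Docling (text and tables) and from model-generated image descriptions.
CHUNKS = [
    "Quarterly revenue grew 12 percent, driven by cloud services.",
    "Image description: a bar chart comparing revenue across four regions.",
    "The company opened two new data centers in Europe.",
]


def embed(text):
    """Toy embedding: a bag-of-words term-count vector (not a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, index, k=2):
    """Return the k chunks whose vectors are most similar to the query vector."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, context):
    """Assemble the augmented prompt that would be sent to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Stand-in for a vector database: each chunk paired with its vector.
INDEX = [(chunk, embed(chunk)) for chunk in CHUNKS]

if __name__ == "__main__":
    question = "What does the bar chart show about revenue?"
    print(build_prompt(question, retrieve(question, INDEX)))
```

In the actual pipeline, each stand-in is replaced by a real component: a document converter for parsing, an embedding model and vector store for indexing and retrieval, and a Granite model for answering the assembled prompt. Treating image descriptions as ordinary text chunks is what lets a text-only retriever surface visual content.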