Quick start: Prompt a foundation model with the retrieval-augmented generation pattern

Take this tutorial to learn how to use foundation models in IBM watsonx.ai to generate factually accurate output grounded in information in a knowledge base by applying the retrieval-augmented generation pattern. Foundation models can generate output that is factually inaccurate for a variety of reasons. One way to improve the accuracy of generated output is to provide the needed facts as context in your prompt text. This tutorial uses a sample notebook that applies the retrieval-augmented generation pattern to improve the accuracy of the generated output.

Required services: Watson Studio; Watson Machine Learning; watsonx.ai

Your basic workflow includes these tasks:

Open a project. Projects are where you can collaborate with others to work with data.
Add a notebook to your project. You can create your own notebook, or add a sample notebook to your project.
Add and edit code, then run the notebook.
Review the notebook output.

Read about the retrieval-augmented generation pattern

You can scale out the technique of including context in your prompts by leveraging information in a knowledge base. The retrieval-augmented generation pattern involves three basic steps:

Search for relevant content in your knowledge base
Pull the most relevant content into your prompt as context
Send the combined prompt text to the model to generate output
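
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the sample notebook's code: the `retrieve`, `build_prompt`, and `answer` helpers, the word-overlap scoring, and the `fake_model` stand-in are all assumptions made for the example. In a real solution the last step would call a foundation model.

```python
from typing import Callable, List


def retrieve(question: str, knowledge_base: List[str]) -> str:
    """Step 1: return the passage that shares the most words with the question."""
    question_words = set(question.lower().split())

    def overlap(passage: str) -> int:
        return len(question_words & set(passage.lower().split()))

    return max(knowledge_base, key=overlap)


def build_prompt(context: str, question: str) -> str:
    """Step 2: place the retrieved passage in the prompt as context."""
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def answer(question: str, knowledge_base: List[str],
           generate: Callable[[str], str]) -> str:
    """Step 3: send the combined prompt to a text-generation callable."""
    context = retrieve(question, knowledge_base)
    return generate(build_prompt(context, question))


# Hypothetical stand-in for a real model call; it only reports the prompt size.
def fake_model(prompt: str) -> str:
    return f"(model output for a {len(prompt)}-character prompt)"


kb = [
    "Tomatoes grow best in full sun with regular watering.",
    "Cucumbers need a trellis and warm soil.",
]
print(answer("How much sun do tomatoes need?", kb, fake_model))
```

Passing the model in as a callable keeps the retrieval and prompt-building steps testable without any credentials.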

Watch a video about using the retrieval-augmented generation pattern

Watch this video to preview the steps in this tutorial. There might be slight differences in the user interface shown in the video. The video is intended to be a companion to the written tutorial.

This video provides a visual method to learn the concepts and tasks in this documentation.

Try a tutorial to prompt a foundation model with the retrieval-augmented generation pattern

In this tutorial, you will complete these tasks:

Task 1: Open a project
Task 2: Add a sample notebook to your project
Task 3: Edit the notebook
Task 4: Run the notebook and review the output

Tips for completing this tutorial

Here are some tips for successfully completing this tutorial.

Get help in the community

If you need help with this tutorial, you can ask a question or find an answer in the Cloud Pak for Data Community discussion forum.

Set up your browser windows

For the optimal experience completing this tutorial, open Cloud Pak for Data in one browser window, and keep this tutorial page open in another browser window to switch easily between the two applications. Consider arranging the two browser windows side-by-side to make it easier to follow along.

Tip: If you encounter a guided tour while completing this tutorial in the user interface, click Maybe later.

Task 1: Open a project

You need a project to store the sample notebook.

Follow these steps to open an existing project or create a new project.

From the Quick navigation, click All projects.
Open an existing project, or create a new project:
1. Click New project on the Projects page.
2. Select Create an empty project.
3. On the Create a project screen, type a name and optional description for the project.
4. Click Create.

For more information or to watch a video, see Creating a project.

Check your progress

The following image shows the empty project. You are now ready to add the sample notebook to your project.

Task 2: Add the sample notebook to your project

The sample notebook uses a small knowledge base and a simple search component to demonstrate the basic pattern. The scenario used in this notebook is for a company that sells seeds for planting in a garden. The website for an online seed catalog has many articles to help customers plan their garden and ultimately select which seeds to purchase. A new widget is being added to the website to answer customer questions about the contents of the articles.

From your project, click the Assets tab.
Click New asset > Jupyter notebook editor.
Select the URL page.
For the Name, type Simple introduction to retrieval-augmented generation with watsonx.ai.

For the Notebook URL, copy and paste the following URL:


https://raw.githubusercontent.com/IBMDataScience/sample-notebooks/master/Cloud/IPYNB/Simple-Introduction-to-retrieval-augmented-generation.ipynb

Click Create. Wait for the notebook editor to load.
From the menu, click Kernel > Restart & Clear Output, then confirm by clicking Restart and Clear All Outputs to clear the output from the last saved run.

Check your progress

The following image shows the notebook open in Edit mode. Now you are ready to set up the prerequisites for running the notebook.

Task 3: Edit the notebook

Before you can run the notebook, you need to set up the environment. Follow these steps to verify the notebook prerequisites:

Scroll to the For IBM watsonx.ai on Cloud Pak for Data section in the notebook to see the prerequisites for running the notebook.
Create a platform API key. You need to pass your credentials using a Cloud Pak for Data platform API key. If you don't already have a saved API key, then follow these steps to create an API key.
1. Click your profile icon.
2. Click Profile and settings.
3. Click API key > Generate new key.
4. Click Generate.
5. Click Copy.
6. Save the API key for future use.
7. Close the dialog box.
Scroll to the Run the cell to provide the platform API key, url (hostname), and username section:
1. Click the Run icon to run the cell.
2. Paste the API key, and press Enter.
Under Run the cell to set the credentials for IBM watsonx.ai on Cloud Pak for Data, click the Run icon to run the cell and set the credentials.
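
The credential cells amount to collecting three values and bundling them for the client library. The following is a minimal sketch, assuming a hypothetical `build_credentials` helper. The dictionary key names and the example hostname and username are illustrative assumptions, so use the names that the notebook cell actually defines. In a notebook, `getpass.getpass` keeps the API key from being echoed into saved cell output.

```python
import getpass


def build_credentials(url: str, username: str, api_key: str) -> dict:
    """Bundle connection details; the key names here are assumptions."""
    for name, value in (("url", url), ("username", username), ("api_key", api_key)):
        if not value.strip():
            raise ValueError(f"{name} must not be empty")
    return {
        "url": url.strip(),
        "username": username.strip(),
        "apikey": api_key.strip(),
    }


def prompt_for_credentials() -> dict:
    """Interactive variant: ask for each value, hiding the API key as it is typed."""
    url = input("Cloud Pak for Data URL (hostname): ")
    username = input("Username: ")
    api_key = getpass.getpass("Platform API key: ")
    return build_credentials(url, username, api_key)


# Non-interactive example with placeholder values (not real credentials).
creds = build_credentials("https://cpd.example.com", "jdoe", "placeholder-key")
print(sorted(creds))
```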

Check your progress

The following image shows the notebook with the prerequisites completed. Now you are ready to run the notebook and review the output.

Task 4: Run the notebook and review the output

The sample notebook includes information about retrieval-augmented generation and how you can adapt the notebook for your specific use case. Follow these steps to run the notebook and review the output:

Scroll to the Step 2: Create a Knowledge Base section in the notebook:
1. Click the Run icon for each of the three cells in that section.
2. Review the output for the three cells in the section. The code in these cells sets up the knowledge base as a collection of two articles. These articles were written as samples for watsonx.ai; they are not real articles published anywhere else. The authors and publication dates are fictional.
Scroll to the Step 3: Build a simple search component section in the notebook:
1. Click the Run icon for each of the two cells in that section.
2. Review the output for the two cells in the section. The code in these cells builds a simple search component. Many articles that discuss retrieval-augmented generation assume the retrieval component uses a vector database. However, to perform the general retrieval-augmented generation pattern, any search-and-retrieve method that can reliably return relevant content from the knowledge base will do. In this notebook, the search component is a trivial search function that returns the index of one or the other of the two articles in the knowledge base, based on a simple regular expression match.
Scroll to the Step 4: Craft prompt text section in the notebook:
1. Click the Run icon for each of the two cells in that section.
2. Review the output for the two cells in the section. The code in these cells crafts the prompt text. There is no single best prompt for any given task. However, models that have been instruction-tuned, such as bigscience/mt0-xxl-13b, google/flan-t5-xxl-11b, or google/flan-ul2-20b, can generally perform this task with a sample prompt. Conservative decoding methods tend towards succinct answers. In the prompt, notice two string placeholders (marked with %s) that will be replaced at generation time:
  - The first placeholder will be replaced with the text of the relevant article from the knowledge base
  - The second placeholder will be replaced with the question to be answered
Scroll to the Step 5: Generate output using the foundation models Python library section in the notebook:
1. Click the Run icon for each of the three cells in that section.
2. Review the output for the three cells in the section. The code in these cells generates output by using the Python library. You can prompt foundation models in watsonx.ai programmatically using the Python library. For more information about the library, see the following topics:
- Introduction to the foundation models Python library
- Foundation models Python library reference
Scroll to the Step 6: Pull everything together to perform retrieval-augmented generation section in the notebook:
1. Click the Run icon for each of the two cells in that section. This code pulls everything together to perform retrieval-augmented generation.
2. Review the output for the first cell in the section. The code in this cell sets up the user input elements.
3. For the second cell in the section, type a question related to tomatoes or cucumbers to see the answer and the source. For example: Do I use mulch with tomatoes?
4. Review the answer to your question.
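
The search and prompt steps above can be approximated as follows. This is a sketch in the spirit of the notebook, not a copy of it: the article text, the `search` and `augment` function names, the regular expressions, and the prompt wording are all assumptions made for illustration. The final call to a foundation model is left out because it requires credentials.

```python
import re

# Stand-in knowledge base: two short, made-up articles.
knowledge_base = [
    {"title": "Growing tomatoes", "txt": "Mulch tomato plants to keep the soil moist."},
    {"title": "Growing cucumbers", "txt": "Cucumbers climb well on a trellis."},
]


def search(question: str) -> int:
    """Return the index of the matching article, or -1 if neither matches."""
    if re.search(r"tomato", question, re.IGNORECASE):
        return 0
    if re.search(r"cucumber", question, re.IGNORECASE):
        return 1
    return -1


# Two %s placeholders: the article text first, then the question.
prompt_template = """Article:
###
%s
###

Answer the following question using only information from the article.
If there is no good answer in the article, say "I don't know".

Question: %s
Answer: """


def augment(question: str) -> str:
    """Fill the template with the retrieved article and the question."""
    index = search(question)
    if index < 0:
        raise LookupError("No article in the knowledge base matches the question")
    return prompt_template % (knowledge_base[index]["txt"], question)


print(augment("Do I use mulch with tomatoes?"))
```

Because `augment` depends only on the index that `search` returns, a different retrieval method could replace the regular expression match without changing the prompt-building code.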

Check your progress

The following image shows the completed notebook.

Next steps

Watch the video to learn about considerations for applying the retrieval-augmented generation pattern to a production solution.
Try the Prompt a foundation model using Prompt Lab tutorial.
Try the other watsonx.ai use case tutorials.
