Scenario: Governing an AI RAG solution

This scenario describes strategies for governing an AI asset. Review the guidance for governing a Retrieval-Augmented Generation (RAG) solution.

Use this scenario approach to learn how you can use the components of watsonx.governance to achieve an integrated solution that ties together asset development with performance monitoring and compliance goals.

In this scenario, your team plans to host a question and answer application grounded with a curated set of documents. In this case, the app supports Q&A related to an article on the history of elevators. Review the steps for how to use watsonx.governance features, including the Governance console, AI factsheets, and evaluations, to control and track this application from request to production. Links to related documentation connect you with instructions for how to perform the tasks described in the scenario.

Scenario overview

This scenario describes these high-level steps:

  1. Setting up a use case from the Governance Console.
  2. Moving the use case through an automated approval workflow.
  3. Completing a Risk Assessment for the proposed app.
  4. Completing an assessment to determine whether the app requires compliance with the EU AI Act.
  5. Synchronizing facts for the app with AI Factsheets.
  6. Evaluating the quality of the generated responses for the question and answer app.
  7. Analyzing and reviewing transactions for explainability of answers.

This scenario assumes the following:

  • Watsonx.governance is installed and the Governance Console is configured.
  • The question and answer app is available as an external model. In this scenario, it is created with Azure OpenAI.
  • A notebook registers the model with watsonx.ai and stores the detached prompt template in a Watson Studio project. A minimal sketch of this registration step follows the note below.
Note: Tracking AI assets, especially using the Governance console, is highly customizable. Your organization can plan for and implement a governance approach that meets your needs. This scenario describes one governance approach.
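
The following sketch of the registration step assumes Python and is illustrative only: the metadata fields describe a detached prompt template for the external Azure OpenAI model, and the commented-out facts_client.register_detached_prompt call is a hypothetical stand-in for whatever AI Factsheets client method your environment provides, not a documented API. See the AI Factsheets documentation for the exact SDK calls.

```python
# Illustrative sketch: describe a detached prompt template for an external
# (Azure OpenAI) model so that it can be registered with watsonx.ai and
# tracked in a Watson Studio project.
detached_prompt_template = {
    "name": "Elevator Q&A Assistant prompt",
    "task": "retrieval_augmented_generation",
    "model_provider": "Azure OpenAI",             # inferencing stays on the remote model
    "model_id": "elevator-qa-deployment",         # placeholder deployment name
    "prompt_variables": ["context", "question"],  # variables filled at request time
    "prompt_text": (
        "Answer the question using only the context below.\n\n"
        "Context: {context}\n\n"
        "Question: {question}\n"
        "Answer:"
    ),
}

# Hypothetical call: the real method name and arguments come from the
# AI Factsheets / watsonx.governance SDK that your organization uses.
# facts_client.register_detached_prompt(project_id=PROJECT_ID, **detached_prompt_template)
print(detached_prompt_template["name"], "->", detached_prompt_template["model_provider"])
```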

Governing a use case from the Governance Console

  • View governance activity from the Governance Console. The console is highly customizable, but this example shows a typical dashboard for tracking use cases and related governance activity. See Managing risk and compliance with Governance console in IBM watsonx.

    Governance console for tracking AI use cases and governance activities

  • The use case for the Elevator Q&A Assistant provides details including a description of the business problem, the party requesting a solution, and the goal of the solution. The use case captures all of the details for activities related to tracking an AI asset to address the business problem. See Creating use cases with Governance console.

    Use case for a Q&A app

  • One of the key features of creating use cases in the Governance console is that you can configure automated workflows to regulate AI lifecycle activities. The most common workflow is an approval flow that prompts stakeholders to review and approve the proposed solution. After you define the basic information for the use case, including details about the goals for the requested AI solution, the use case is sent to the first approver for review. See Reviewing use cases.

    Automated approval workflow for an AI use case

  • The approval workflow can require additional approvers. When all required approvals are complete, the process advances to the next stage.

    Completed approval workflow for an AI use case

  • In this case, the governance policy requires completion of a Risk Assessment questionnaire to determine a risk level for the Q&A app. The Risk Assessment connects to the risks defined in the AI risk atlas, a directory of potential risks associated with AI and suggestions for mitigating them. See AI risk atlas.

    AI risk atlas

  • Upon completion of the Risk Assessment questionnaire, a risk level is assigned to the use case. The risk level can drive a risk mitigation plan, which you can add to the use case. See Completing a risk assessment.

    Risk assessment questionnaire based on the IBM Risk Atlas

  • In this scenario, the next step is to complete another questionnaire to determine whether the use case requires any additional steps to comply with the EU AI Act. If the app requires any additional steps for compliance, the plan can be added to the use case. See Completing an applicability assessment.

    Questionnaire for applicability for EU AI Act compliance

  • Completing the assessments triggers the next approval workflow so that the Legal and Compliance teams can review the use case. After approval, the use case is ready for development. To switch to the development environment, click the Third-party link, which opens the AI factsheet for the use case in Watson Studio. Facts for tracking AI assets are synchronized between the AI use case in Watson Studio and the Governance console use case.

    Opening the associated AI use case for the Q&A app

  • On the Lifecycle tab, you can review an approach for tracking a detached prompt template for the remote model. An approach is one part of a solution for the business problem framed in the use case. A prompt template is a reusable prompt for a generative AI model that contains at least one variable. In this scenario, the prompt template is detached, meaning that the inferencing is done on the remote model, but the results can be evaluated with watsonx.governance tools, as shown in the sketch that follows. See Setting up an AI use case.

    Reviewing the associated AI use case for the Q&A app
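
To make the idea of a detached prompt template concrete, the following sketch fills the template variables and sends the request to the remote Azure OpenAI deployment with the openai Python SDK. The inferencing happens entirely on the remote model; watsonx.governance evaluates only the captured inputs and outputs. The endpoint and API key environment variables, the deployment name, and the template text are placeholder assumptions, not values from this scenario.

```python
import os
from openai import AzureOpenAI  # the remote model is hosted on Azure OpenAI

PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context: {context}\n\n"
    "Question: {question}\n"
    "Answer:"
)

# Placeholder connection details; supply your own Azure OpenAI resource values.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

def answer(question: str, context: str) -> str:
    """Fill the prompt template variables and call the remote (detached) model."""
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)
    response = client.chat.completions.create(
        model="elevator-qa-deployment",  # placeholder Azure OpenAI deployment name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(answer("Who demonstrated the first safety elevator?",
             "Elisha Otis demonstrated a safety elevator at the 1854 New York World's Fair."))
```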

  • Review the detached deployment for an Azure OpenAI model that generates responses for questions about the elevator article. Metrics provide detailed information about the asset that is being evaluated. For a prompt template for a Q&A model, metrics can measure answer quality and test for the presence of personal information or abusive content. An invented example of such metric scores is sketched below. See Tracking prompt templates.

    Reviewing evaluation data for the Q&A app
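
To give a sense of what these metrics look like, the sketch below prints an invented set of scores grouped into answer quality and content safety. The metric names echo metrics that watsonx.governance reports for generative AI quality (answer relevance, faithfulness, ROUGE) and content safety (PII, HAP), but the values and the grouping are illustrative assumptions, not output from this scenario.

```python
# Invented example scores for illustration only: the metric names echo categories
# that are reported for a Q&A prompt template, but the values and groupings here
# are not product output.
evaluation_results = {
    "answer_quality": {
        "answer_relevance": 0.91,   # how well the answer addresses the question
        "faithfulness": 0.88,       # how well the answer is grounded in the context
        "rouge_l": 0.74,            # overlap with the reference answer
    },
    "content_safety": {
        "pii": 0.0,                 # personal identifiable information detected
        "hap": 0.0,                 # hateful, abusive, or profane content detected
    },
}

for category, metrics in evaluation_results.items():
    for metric, score in metrics.items():
        print(f"{category}/{metric}: {score:.2f}")
```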

  • You can open the prompt template in a project to run new evaluations. In this case, the inferencing is done directly with the remote model, but the evaluation details for the model's input and output are captured in the factsheet. When you configure an evaluation, you set acceptable thresholds for the metrics that assess the performance of the prompt template, as in the simplified sketch that follows. See Generative AI quality evaluations.

    Running new evaluations for the Q&A app
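
The following simplified sketch shows what threshold-based assessment amounts to: each metric gets an acceptable threshold, and the prompt template passes only when every score meets its threshold. The metric names mirror those in the previous example, but the threshold values, score dictionary, and comparison logic are local illustrations, not the product's configuration interface or results.

```python
# Simplified illustration of threshold checking for a prompt template evaluation.
THRESHOLDS = {
    "answer_relevance": 0.80,   # higher is better
    "faithfulness": 0.85,       # higher is better
    "pii": 0.0,                 # lower is better: no personal information allowed
    "hap": 0.0,                 # lower is better: no abusive content allowed
}

HIGHER_IS_BETTER = {"answer_relevance", "faithfulness"}

def evaluate_against_thresholds(scores: dict) -> dict:
    """Return a pass/fail flag per metric, based on the configured thresholds."""
    results = {}
    for metric, threshold in THRESHOLDS.items():
        score = scores[metric]
        if metric in HIGHER_IS_BETTER:
            results[metric] = score >= threshold
        else:
            results[metric] = score <= threshold
    return results

print(evaluate_against_thresholds(
    {"answer_relevance": 0.91, "faithfulness": 0.88, "pii": 0.0, "hap": 0.0}
))
```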

  • Analyze transactions for deeper insights into how a response was generated. For example, review the answer relevance metric to understand how well a generated answer addresses the question that was asked. Analyzing transactions contributes to deeper understanding and greater transparency for an AI solution. A simplified sketch of flagging low-relevance transactions for review follows. See Reviewing model transactions.

    Analyzing answers for the Q&A app
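
As a rough illustration of transaction review, the sketch below flags transactions whose answer relevance score falls below a threshold so that they can be examined more closely. The record structure, IDs, and scores are invented for illustration; the actual transaction fields come from your evaluation results.

```python
# Illustrative transaction review: list the transactions whose answer relevance
# falls below the configured threshold so they can be inspected for explainability.
RELEVANCE_THRESHOLD = 0.80

transactions = [
    {"id": "txn-001", "question": "Who invented the safety elevator?", "answer_relevance": 0.93},
    {"id": "txn-002", "question": "When was the first passenger elevator installed?", "answer_relevance": 0.64},
]

needs_review = [t for t in transactions if t["answer_relevance"] < RELEVANCE_THRESHOLD]
for t in needs_review:
    print(f"Review {t['id']}: relevance {t['answer_relevance']:.2f} is below {RELEVANCE_THRESHOLD}")
```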

Parent topic: Watsonx.governance