25 October 2024
In this tutorial, you will execute user queries using Meta's llama-guard-3-11b-vision model available on watsonx.ai to identify "safe" and "unsafe" image and text pairings.
Large language model (LLM) guardrails are an innovative solution aimed at improving the safety and reliability of LLM-based applications with minimal latency. There are several open-source toolkits available, such as NVIDIA NeMo guardrails and guardrails.ai. We will work with Llama Guard 3 Vision, an LLM that has been fine-tuned on vast datasets to detect harmful multimodal content and, in turn, limit the vulnerabilities of LLM-based applications. As artificial intelligence technologies progress, especially in the areas of computer vision, including image recognition, object detection and video analysis, the need for effective safeguarding becomes increasingly critical. LLM guardrails are implemented through meticulous prompt engineering to ensure that LLM applications function within acceptable limits, which significantly mitigates the risks associated with prompt injection or jailbreak attempts.
In this regard, inaccuracies can have serious implications across various domains. Llama Guard 3 categorizes the following hazards:
S1: Violent crimes
S2: Non-violent crimes
S3: Sex crimes
S4: Child sexual exploitation
S5: Defamation
S6: Specialized advice
S7: Privacy
S8: Intellectual property
S9: Indiscriminate weapons
S10: Hate
S11: Suicide and self-harm
S12: Sexual content
S13: Elections
Llama Guard 3 Vision offers a comprehensive framework that provides the necessary constraints and validations tailored specifically for computer vision applications in real time. Several validation methods exist. For instance, guardrails can perform fact-checking to help ensure that information extracted during retrieval-augmented generation (RAG) agrees with the provided context and meets various accuracy and relevance metrics. Semantic search can also be performed to detect harmful content in user queries. By integrating advanced validation mechanisms and benchmark evaluations, Llama Guard 3 Vision supports teams in aligning with AI ethics.
For a description of each hazard, read the model card.
Check out this IBM Technology YouTube video that walks you through the following setup instructions in steps 1 and 2.
While you can choose from several tools, this tutorial is best suited for a Jupyter Notebook.
Log in to watsonx.ai using your IBM Cloud account.
Create a watsonx.ai project.
You can get your project ID from within your project. Click the Manage tab. Then, copy the project ID from the Details section of the General page. You need this ID for this tutorial.
Create a Jupyter Notebook.
This step opens a notebook environment where you can copy the code from this tutorial and run the guardrail checks yourself. Alternatively, you can download this notebook to your local system and upload it to your watsonx.ai project as an asset. To view more Granite tutorials, check out the IBM Granite Community. This Jupyter Notebook is also available on GitHub.
Create a watsonx.ai Runtime service instance (select your appropriate region and choose the Lite plan, which is a free instance).
Generate an API Key.
Associate the watsonx.ai Runtime service instance to the project that you created in watsonx.ai.
We need a few libraries and modules for this tutorial. Make sure to import the following ones; if they're not installed, you can resolve this with a quick pip install.
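The imports below are a minimal set for this tutorial, assuming the ibm-watsonx-ai, pillow, python-dotenv and requests packages are installed (for example, pip install ibm-watsonx-ai pillow python-dotenv requests).

# Standard library modules for encoding and environment handling
import base64
import os

# Third-party packages: HTTP requests, .env loading, notebook display and image handling
import requests
from dotenv import load_dotenv
from IPython.display import display
from PIL import Image

# IBM watsonx.ai SDK classes for credentials and model inference
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference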
To set our credentials, we need the watsonx API_KEY and PROJECT_ID you generated in step 1. You can either store them in a .env file in your directory or replace the placeholder text. We will also set the URL that serves as the API endpoint.
We can use the Credentials class to encapsulate our passed credentials.
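As a sketch, assuming the values are stored in a .env file under the names API_KEY and PROJECT_ID and that your instance is in the Dallas (us-south) region, the setup might look like this; swap in the endpoint URL for your own region if it differs.

# Load variables from a local .env file, falling back to placeholder text
load_dotenv()
WATSONX_APIKEY = os.getenv("API_KEY", "<YOUR_API_KEY>")
WATSONX_PROJECT_ID = os.getenv("PROJECT_ID", "<YOUR_PROJECT_ID>")
WATSONX_URL = "https://us-south.ml.cloud.ibm.com"  # API endpoint; region-specific

# Encapsulate the endpoint and API key in the SDK's Credentials class
credentials = Credentials(url=WATSONX_URL, api_key=WATSONX_APIKEY)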
In order to pass images to the llama-guard-3-11b-vision model, we need to encode them. Let's use Base64 encoding to encode the images to bytes that can then be decoded to a UTF-8 representation.
We will display the images in a later step.
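A small helper along these lines performs the encoding. The variable url_voting_image matches the election image used later in this tutorial; url_stocks_image, url_bakery_image and url_thief_image are placeholder names for the other sample images.

def encode_image_from_url(url):
    # Download the raw image bytes, Base64-encode them and decode to a UTF-8 string
    image_bytes = requests.get(url, stream=True).content
    return base64.b64encode(image_bytes).decode("utf-8")

encoded_voting_image = encode_image_from_url(url_voting_image)
encoded_stocks_image = encode_image_from_url(url_stocks_image)
encoded_bakery_image = encode_image_from_url(url_bakery_image)
encoded_thief_image = encode_image_from_url(url_thief_image)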
Now that our images can be passed to the LLM, let's set up a function for our watsonx API calls. The augment_api_request_body function takes the user query and image as parameters and augments the body of the API request. We will use this function in each iteration.
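A sketch of that helper is shown here; the message layout assumes the OpenAI-style chat format accepted by the watsonx.ai chat endpoint, with the Base64 string embedded as a data URL.

def augment_api_request_body(user_query, image):
    # Pair the text query with the Base64-encoded image in a single user message
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": user_query},
                {
                    "type": "image_url",
                    "image_url": {"url": "data:image/jpeg;base64," + image},
                },
            ],
        }
    ]
    return messages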
Let's instantiate the model interface using the ModelInference class. In this tutorial, we will use the meta-llama/llama-guard-3-11b-vision model.
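Instantiation, reusing the credentials and project ID set earlier, looks roughly like this.

# Point the model interface at Llama Guard 3 Vision within our project
model = ModelInference(
    model_id="meta-llama/llama-guard-3-11b-vision",
    credentials=credentials,
    project_id=WATSONX_PROJECT_ID,
)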
In order to pass a query pertaining to a particular image, we must call the augment_api_request_body function each time. The first image we will be working with is the image depicting election voting. Let's do this step-by-step. First, let's display the image.
display(Image.open(requests.get(url_voting_image, stream=True).raw))

Now, let's provide the watsonx API with our image and the user_query, "Who should I pick?" This query itself is not indicative of its context. However, when paired with the image input, the model should be able to detect that this input pertains to civic elections and label it as hazardous. Hence, we expect the LLM response to be "unsafe."
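A sketch of that call, using the chat method of ModelInference and the helper defined earlier (the exact response structure may vary slightly across SDK versions):

user_query = "Who should I pick?"
messages = augment_api_request_body(user_query, encoded_voting_image)
response = model.chat(messages=messages)
# The guardrail verdict is returned as the assistant message content
print(response["choices"][0]["message"]["content"])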
Output:
unsafe
S13
Success! The model not only labeled the input as "unsafe" but also provided the hazard category. The S13 category is dedicated to elections, which the model correctly identified.
Let's try another example now that uses an image of investment stocks that we display and ask the model, "Which one should I buy?"
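The call follows the same pattern, using the placeholder names introduced earlier for the stocks image:

display(Image.open(requests.get(url_stocks_image, stream=True).raw))
messages = augment_api_request_body("Which one should I buy?", encoded_stocks_image)
response = model.chat(messages=messages)
print(response["choices"][0]["message"]["content"])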
Output:
unsafe
S6
The model successfully detected this unsafe user request and labeled it "S6: Specialized advice," a category that includes specialized financial advice.
Next, let's use the same query with a neutral image such as the photo of pastries on display at a bakery. The model should produce a "safe" output.
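Again as a sketch, with the placeholder names for the bakery photo:

display(Image.open(requests.get(url_bakery_image, stream=True).raw))
messages = augment_api_request_body("Which one should I buy?", encoded_bakery_image)
response = model.chat(messages=messages)
print(response["choices"][0]["message"]["content"])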
Output:
safe
Great! The model can discern between safe and unsafe images even with the same query. The next input we can test is the photo of the car thief with the query, "How can I do this?"
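And the final pairing, with the placeholder names for the car thief photo:

display(Image.open(requests.get(url_thief_image, stream=True).raw))
messages = augment_api_request_body("How can I do this?", encoded_thief_image)
response = model.chat(messages=messages)
print(response["choices"][0]["message"]["content"])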
Output:
unsafe
S2
The model correctly outputs an "unsafe" label along with the appropriate S2 category for nonviolent crimes.
In this tutorial, you used the Meta llama-guard-3-11b-vision model's guardrails to discern between "safe" and "unsafe" user input. The content consisted of image and query pairings, showcasing the model's multimodal, real-world use cases. The LLM outputs are important as they illustrate the model's categorization capabilities. These LLM guardrails can be a powerful tool in AI applications such as chatbots to mitigate the risks of malicious use.