Configuring AI guardrails in watsonx.ai
You can set AI guardrails in watsonx.ai in multiple ways to moderate the input text that is provided to a foundation model and the output that the model generates.
You can remove harmful content when you're working with foundation models in watsonx.ai with the following methods:
- From the Prompt Lab. For details, see Configuring AI guardrails in the Prompt Lab.
- Programmatically, by using the REST API or the watsonx.ai Python SDK. For details, see Configuring AI guardrails programmatically.
Configuring AI guardrails in the Prompt Lab
To remove harmful content when you're working with foundation models in the Prompt Lab, set the AI guardrails switcher to On.
The AI guardrails feature is enabled automatically for all natural language foundation models in English.
To configure AI guardrails in the Prompt Lab, complete the following steps:
1. With AI guardrails enabled, click the AI guardrails settings icon.
2. Configure the filters to apply to the user input and model output, and adjust the filter sensitivity, if applicable:
   - HAP filter: To disable AI guardrails, set the HAP slider to 1. To change the sensitivity of the guardrails, move the HAP sliders.
   - PII filter: To enable the PII filter, set the PII switcher to On.
   - Granite Guardian model as a filter: Granite Guardian moderation is disabled by default. To change the sensitivity of the guardrails, move the Granite Guardian sliders.

   Experiment with adjusting the sliders to find the best settings for your needs.
3. Click Save.
Configuring AI guardrails programmatically
You can set AI guardrails programmatically, by using the REST API or the watsonx.ai Python SDK, to moderate the input text that is provided to a foundation model and the output that the model generates.
REST API
You can use the following watsonx.ai API endpoints to configure and apply AI guardrails to natural language input and output text:
- When you inference a foundation model by using the text generation API, you can use the moderations field to apply filters to the foundation model input and output, as shown in the sketch after this list. For more information, see Text generation in the watsonx.ai API reference documentation.
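The following Python sketch shows one way to call the text generation endpoint with a moderations field by using the requests library. It is a minimal sketch, not a definitive implementation: the region URL, model ID, prompt, threshold values, environment variable names, and the exact shape of the moderations object are assumptions for illustration, so confirm the schema against Text generation in the watsonx.ai API reference.

```python
# Sketch: calling the watsonx.ai text generation REST endpoint with a
# moderations field that filters HAP and PII content in input and output.
# The URL, model ID, thresholds, and environment variable names are assumptions.
import os
import requests

url = "https://us-south.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29"

payload = {
    "model_id": "ibm/granite-13b-instruct-v2",       # assumed model ID
    "input": "Summarize the customer complaint below in one sentence.",
    "project_id": os.environ["WATSONX_PROJECT_ID"],  # assumed environment variable
    "parameters": {"max_new_tokens": 200},
    "moderations": {
        # Filter hateful, abusive, or profane (HAP) content in input and output.
        "hap": {
            "input": {"enabled": True, "threshold": 0.5},
            "output": {"enabled": True, "threshold": 0.5},
        },
        # Filter personally identifiable information (PII).
        "pii": {
            "input": {"enabled": True},
            "output": {"enabled": True},
        },
    },
}

headers = {
    # Assumes an IAM bearer token is available in an environment variable.
    "Authorization": f"Bearer {os.environ['WATSONX_IAM_TOKEN']}",
    "Content-Type": "application/json",
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()
print(response.json()["results"][0]["generated_text"])
```

When a filter is triggered, the response also reports the detected positions and scores in the moderation results, so you can decide how to handle flagged text in your application.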
Python
You can use the watsonx.ai Python SDK to configure and apply AI guardrails to natural language input and output text in the following ways:
- Adjust the AI guardrails filters with the Python library when you inference the foundation model by using the text generation API, as shown in the sketch at the end of this section. For details, see Inferencing a foundation model programmatically (Python).
For more information, see watsonx.ai Python SDK.
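For example, the following sketch uses the ModelInference class from the ibm_watsonx_ai package to generate text with guardrails enabled. It is a minimal sketch under stated assumptions: the region endpoint, model ID, environment variable names, and prompt are placeholders, and the finer-grained guardrails keyword arguments (such as per-filter thresholds) are described in the SDK reference.

```python
# Sketch: enabling AI guardrails when inferencing a foundation model with the
# watsonx.ai Python SDK. Endpoint, model ID, and env var names are assumptions.
import os

from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",       # assumed region endpoint
    api_key=os.environ["WATSONX_API_KEY"],         # assumed environment variable
)

model = ModelInference(
    model_id="ibm/granite-13b-instruct-v2",        # assumed model ID
    credentials=credentials,
    project_id=os.environ["WATSONX_PROJECT_ID"],   # assumed environment variable
)

# guardrails=True applies the default filtering to the model input and output.
# Per-filter settings can be passed through the guardrails-related keyword
# arguments that are documented in the SDK reference.
text = model.generate_text(
    prompt="Write a short, polite reply to this customer review.",
    guardrails=True,
)
print(text)
```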