Validating and monitoring AI models with Watson OpenScale

IBM Watson OpenScale tracks and measures outcomes from your AI models, and helps ensure they remain fair, explainable, and compliant no matter where your models were built or are running. Watson OpenScale also detects and helps correct the drift in accuracy when an AI model is in production.

This service is not available by default. An administrator must install this service on the IBM Cloud Pak for Data platform, and you must be given access to the service. To determine whether the service is installed, open the Services catalog and check whether the service is enabled.

Enterprises use model evaluation to automate and operationalize the AI lifecycle in business applications. This approach helps ensure that AI models are free from bias, can be easily explained and understood by business users, and are auditable in business transactions. Model evaluation supports AI models that are built and run with the tools and model serving frameworks of your choice.

Trustworthy AI in action

To learn more about model evaluation in action, see How AI picks the highlights from Wimbledon fairly and fast.

Components of Watson OpenScale

Watson OpenScale has four main areas:

Insights: The Insights dashboard displays the models that you are monitoring and provides status on the results of model evaluations.
Explain a transaction: Explanations describe how the model determined a prediction. They list some of the most important factors that led to the prediction so that you can be confident in the process.
Configuration: Use the Configuration tab to select a database, set up a machine learning provider, and optionally add integrated services.
Support: The Support tab provides you with resources to get the help you need with Watson OpenScale. Access product documentation or connect with IBM Community on Stack Overflow. To create a service ticket with the IBM Support team, click Manage tickets.

Monitors

Monitors evaluate your deployments against specified metrics. Configure alerts that indicate when a threshold is crossed for a metric. Watson OpenScale evaluates your deployments based on three default monitors:

Quality describes the model’s ability to provide correct outcomes based on labeled test data called Feedback data.
Fairness describes how evenly the model delivers favorable outcomes between groups. The Fairness monitor looks for biased outcomes in your model.
Drift warns you of a drop in accuracy or data consistency.

Note: You can also create Custom monitors for your deployment.
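
To make the idea of a monitor and its alert threshold concrete, the following minimal sketch computes one quality metric (accuracy against labeled feedback data) and one fairness metric (the ratio of favorable-outcome rates between two groups), then flags any metric that falls below its threshold. It is plain Python and does not use the Watson OpenScale API. The names `MetricResult`, `accuracy`, `favorable_rate`, `disparate_impact`, and `evaluate`, the sample records, and the 0.8 thresholds are all assumptions chosen for illustration, and the metrics that Watson OpenScale actually computes may differ.

```python
from dataclasses import dataclass


@dataclass
class MetricResult:
    """Hypothetical container for one metric and its lower threshold."""
    name: str
    value: float
    threshold: float

    @property
    def alert(self) -> bool:
        # An alert is raised when the metric drops below its threshold.
        return self.value < self.threshold


def accuracy(predictions, labels):
    """Fraction of predictions that match the labeled feedback data."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and of equal length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def favorable_rate(predictions, groups, group, favorable):
    """Share of records in `group` that received the favorable outcome."""
    outcomes = [p for p, g in zip(predictions, groups) if g == group]
    if not outcomes:
        raise ValueError(f"no records for group {group!r}")
    return sum(p == favorable for p in outcomes) / len(outcomes)


def disparate_impact(predictions, groups, monitored, reference, favorable):
    """Favorable rate of the monitored group divided by that of the reference group."""
    reference_rate = favorable_rate(predictions, groups, reference, favorable)
    if reference_rate == 0:
        raise ValueError("reference group has no favorable outcomes")
    return favorable_rate(predictions, groups, monitored, favorable) / reference_rate


def evaluate(predictions, labels, groups, monitored, reference, favorable):
    """Run both illustrative monitors with assumed thresholds of 0.8."""
    return [
        MetricResult("quality.accuracy", accuracy(predictions, labels), 0.8),
        MetricResult(
            "fairness.disparate_impact",
            disparate_impact(predictions, groups, monitored, reference, favorable),
            0.8,
        ),
    ]


# Made-up sample data: model outputs, feedback labels, and a group attribute.
sample_predictions = ["approved", "approved", "denied", "approved",
                      "denied", "denied", "approved", "denied"]
sample_labels = ["approved", "approved", "denied", "approved",
                 "denied", "approved", "approved", "denied"]
sample_groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

for result in evaluate(sample_predictions, sample_labels, sample_groups,
                       monitored="B", reference="A", favorable="approved"):
    status = "ALERT" if result.alert else "ok"
    print(f"{result.name}: {result.value:.3f} (threshold {result.threshold}) {status}")
```

With this sample data, the quality metric stays above its threshold while the fairness metric falls below it, which is the kind of condition an alert is meant to surface.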

Get started with Watson OpenScale

Choose a method for setting up Watson OpenScale.
