Configuring fairness evaluations in Watson OpenScale
Watson OpenScale evaluates your models for bias to ensure fair outcomes among different groups.
Evaluating the model for fairness
You can use fairness evaluations to determine whether your model produces biased outcomes. The fairness evaluation checks whether the model shows a tendency to provide a favorable (preferable) outcome more often for one group than for another.
The fairness evaluation generates a set of metrics every hour by default. You can generate these metrics on demand by clicking Evaluate fairness now or by using the Python client.
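For example, you can trigger an on-demand evaluation with the Python client. The following is a minimal sketch based on published samples of the ibm-watson-openscale SDK; the API key and monitor instance ID are placeholders, and details such as authentication vary by platform and SDK version.

```python
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson_openscale import APIClient

# Placeholders: supply your own credentials and fairness monitor instance ID.
authenticator = IAMAuthenticator(apikey="<your IBM Cloud API key>")
wos_client = APIClient(authenticator=authenticator)

# Trigger an on-demand fairness run instead of waiting for the hourly schedule.
run = wos_client.monitor_instances.run(
    monitor_instance_id="<fairness monitor instance id>",
    background_mode=False,  # block until the evaluation finishes
).result
print(run)
```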
When you test an evaluation in a pre-production environment, you evaluate fairness based on test data. Test data must have the same format and schema as the training data you used to train the model.
In a production environment, you monitor feedback data, which is the actual data logged with payload logging. For proper monitoring, you must regularly log feedback data to Watson OpenScale. You can provide feedback data by clicking Upload feedback data on the Evaluations page of the Watson OpenScale Insights dashboard. You can also provide feedback data by using the Python client or REST API.
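For example, you can log feedback records with the Python client. The following sketch assumes the ibm-watson-openscale SDK, an existing wos_client (as in the earlier sketch), and a known feedback data set ID; the record fields are placeholders and must match your model's feedback schema.

```python
# Placeholder: the ID of the feedback data set for your subscription.
feedback_data_set_id = "<feedback data set id>"

# Each record must match the feedback schema of your model
# (feature columns plus the actual, correctly labeled outcome).
feedback_records = [
    {"Sex": "Female", "Age": 23, "LoanAmount": 5000, "Outcome": "Loan Granted"},
    {"Sex": "Male",   "Age": 47, "LoanAmount": 9000, "Outcome": "Loan Denied"},
]

wos_client.data_sets.store_records(
    data_set_id=feedback_data_set_id,
    request_body=feedback_records,
    background_mode=False,  # wait until the records are stored
)
```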
Before you begin
Before configuring the fairness evaluation, you must provide model details and upload payload data to enable fairness evaluations.
To configure fairness evaluations for unstructured text and image models, you must provide payload data that contains meta fields, such as Gender, to calculate disparate impact. To calculate performance metrics for unstructured text and image models, you must also provide feedback data that contains meta fields with the correctly predicted outcomes.
You must complete similar requirements when you configure fairness evaluations for indirect bias. When you configure fairness evaluations for unstructured text and image models, you don't have to provide training data.
Configuring the evaluation
You can configure fairness evaluations manually or you can run a custom notebook to generate a configuration file. You can upload the configuration file to specify the settings for your evaluation.
When you configure fairness evaluations manually, you can specify the reference group (value) that you expect to represent favorable outcomes. You can also select the corresponding model attributes (features) to monitor for bias, for example, Age or Sex, which are compared against the reference group. Depending on your training data, you can also specify the minimum and maximum sample size to evaluate.
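As an illustration, the following sketch configures a fairness monitor with the Python client instead of the manual editor. It is based on published ibm-watson-openscale SDK samples and uses an existing wos_client plus placeholder IDs; note that the SDK refers to the reference group as majority and the monitored group as minority, and import paths and parameter names can differ by SDK version. The individual settings (favorable outcomes, monitored groups, sample size) are described in the sections that follow.

```python
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import Target
from ibm_watson_openscale.supporting_classes.enums import TargetTypes

target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id="<subscription id>",           # placeholder
)

parameters = {
    "features": [
        # "majority" = reference group, "minority" = monitored group
        {"feature": "Sex", "majority": ["Male"], "minority": ["Female", "Non-binary"], "threshold": 0.8},
        {"feature": "Age", "majority": [[26, 100]], "minority": [[18, 25]], "threshold": 0.8},
    ],
    "favourable_class": ["Loan Granted", "Loan Partially Granted"],
    "unfavourable_class": ["Loan Denied"],
    "min_records": 100,                      # minimum sample size
}

fairness_monitor = wos_client.monitor_instances.create(
    data_mart_id="<data mart id>",           # placeholder
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.FAIRNESS.ID,
    target=target,
    parameters=parameters,
    background_mode=False,
).result
```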
Favorable and unfavorable outcomes
The output of the model is categorized as either favorable or unfavorable. For example, if the model recommends whether a person gets a loan, the favorable outcome might be Loan Granted or Loan Partially Granted, and the unfavorable outcome might be Loan Denied.
The values that represent a favorable outcome are derived from the label column in the training data. By default, the predictedLabel column is set as the prediction column. When you upload training data, favorable and unfavorable values must be specified by using the value of the prediction column as a string data type, such as 0 or 1.
Metrics
In the Metrics section, you can select all of the metrics that you want to configure. By default, only the disparate impact metric is computed.
Minimum sample size
The minimum sample size delays the fairness evaluation until a minimum number of records are available in the evaluation data set, which ensures that the sample is not so small that it skews the results. Every time the fairness monitor runs, it uses the minimum sample size to decide the number of records to evaluate.
Features: Reference and monitored groups
The values of the features are specified as either a reference or monitored group. The monitored group represents the values that are most at risk for biased outcomes. For example, for the Sex feature, you can set Female and Non-binary as the monitored groups. For a numeric feature, such as Age, you can set [18-25] as the monitored group. All other values for the feature are then considered the reference group, for example, Sex=Male or Age=[26,100].
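To make the grouping concrete, the following standalone Python sketch (not the OpenScale implementation) splits a handful of hypothetical scored records into monitored and reference groups for the Sex feature and compares their favorable-outcome rates.

```python
# Hypothetical scored records; in OpenScale these come from payload logging.
records = [
    {"Sex": "Female",     "Age": 22, "prediction": "Loan Granted"},
    {"Sex": "Non-binary", "Age": 34, "prediction": "Loan Denied"},
    {"Sex": "Male",       "Age": 45, "prediction": "Loan Granted"},
    {"Sex": "Male",       "Age": 29, "prediction": "Loan Partially Granted"},
    {"Sex": "Female",     "Age": 51, "prediction": "Loan Denied"},
]

FAVORABLE = {"Loan Granted", "Loan Partially Granted"}
MONITORED_SEX = {"Female", "Non-binary"}   # monitored group for the Sex feature

monitored = [r for r in records if r["Sex"] in MONITORED_SEX]
reference = [r for r in records if r["Sex"] not in MONITORED_SEX]  # all other values

def favorable_rate(group):
    """Share of records in the group that received a favorable outcome."""
    return sum(r["prediction"] in FAVORABLE for r in group) / len(group)

print(f"monitored group: {favorable_rate(monitored):.2f}")   # 0.33
print(f"reference group: {favorable_rate(reference):.2f}")   # 1.00
```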
Configuring thresholds for the monitor
To configure the model for fairness, follow these steps:
- Select a model deployment tile and click Configure monitors.
- Select Fairness in the Evaluations section of the Configure tab.
- For each configuration item, click Edit to specify the Favorable outcomes, Sample Size, and the features to evaluate.
Features
The features are the model attributes that are evaluated to detect bias. For example, you can configure the fairness monitor to evaluate features such as Sex or Age for bias. Only features of categorical, numeric (integer), float, or double data type are supported.
Fairness alert threshold
The fairness alert threshold specifies an acceptable difference between the percentage of favorable outcomes for the monitored group and the percentage of favorable outcomes for the reference group. For example, if the favorable-outcome rate for the monitored group is only 70% of the rate for the reference group and the fairness alert threshold is set to 80%, the fairness monitor detects bias in your model.
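As a worked example of the threshold check (plain Python, illustrative numbers only): the favorable-outcome rate of the monitored group is divided by the rate of the reference group to give a fairness score, which is then compared against the alert threshold.

```python
# Illustrative rates; OpenScale computes these from the evaluation data.
monitored_rate = 0.56   # 56% of the monitored group received a favorable outcome
reference_rate = 0.80   # 80% of the reference group received a favorable outcome

fairness_score = monitored_rate / reference_rate * 100   # 70 (percent)
alert_threshold = 80                                     # percent

if fairness_score < alert_threshold:
    print(f"Bias detected: fairness score {fairness_score:.0f}% is below the {alert_threshold}% threshold")
```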
Maximum sample size
By default, the maximum number of records that you can run is 1,000,000. You can change this maximum by adding the BIAS_CHECKING_RECORDS_MAX_LIMIT environment variable to the deployment configuration. Before you increase the maximum number of records, make sure that you have the processing power and resources to run them successfully. In the following example, the BIAS_CHECKING_RECORDS_MAX_LIMIT environment variable is configured to set the maximum number of records to 1,200,000:
"BIAS_CHECKING_RECORDS_MAX_LIMIT": 1200000
Testing for indirect bias
If you select a field that is not a training feature, called an added field, Watson OpenScale looks for indirect bias by finding associated values in the training features. For example, the profession "student" might imply a younger individual even though the Age field was excluded from model training. For details on configuring the Fairness monitor to consider indirect bias, see Configuring the Fairness monitor for indirect bias.
Mitigating bias
Watson OpenScale uses two types of debiasing: passive and active. Passive debiasing reveals bias, while active debiasing prevents you from carrying that bias forward by changing the model in real time for the current application. For details on interpreting results and mitigating bias in a model, see Reviewing results from a Fairness evaluation.
Learn more
Parent topic: Configuring model evaluations