Fairness evaluations

You can configure fairness evaluations to determine whether your model produces biased outcomes. Use fairness evaluations to identify when your model shows a tendency to provide favorable outcomes more often for one group over another.

Configuring fairness evaluations for machine learning models

If you log payload data when you prepare for model evaluations, you can configure fairness evaluations.

You can configure fairness evaluations manually or you can run a custom notebook to generate a configuration file. You can upload the configuration file to specify the settings for your evaluation.

When you configure fairness evaluations manually, you can specify the reference group (value) that you expect to represent favorable outcomes. You can also select the model attributes (features) to monitor for bias, for example, Age or Sex. The monitored values of those features are compared against the reference group. Depending on your training data, you can also specify the minimum and maximum sample size for evaluations.

Select favorable and unfavorable outcomes

You must specify favorable and unfavorable outcomes when you configure fairness evaluations. The values that represent a favorable outcome are derived from the label column in the training data. By default, the predictedLabel column is set as the prediction column. When you upload training data, you must specify favorable and unfavorable values as strings that match the values in the prediction column, such as 0 or 1.

Select features

You must select the features, which are the model attributes that you want to evaluate for bias. For example, you can evaluate features such as Sex or Age for bias. Only features with a categorical, numeric (integer), float, or double fairness data type are supported.

The values of each feature are specified as either a reference group or a monitored group. The monitored group represents the values that are most at risk for biased outcomes. For example, for the Sex feature, you can set Female and Non-binary as the monitored groups. For a numeric feature, such as Age, you can set [18,25] as the monitored group. All other values for the feature are then considered as the reference group, for example, Sex=Male or Age=[26,100].
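The split between monitored and reference groups can be illustrated with a minimal Python sketch. The MONITORED dictionary and the group_for function are hypothetical names used only for this illustration and are not part of the product configuration format. The sketch assumes that categorical features list their monitored values and that numeric features use an inclusive range.

```python
# Hypothetical group definitions for illustration only.
MONITORED = {
    "Sex": {"Female", "Non-binary"},  # categorical feature: set of monitored values
    "Age": (18, 25),                  # numeric feature: inclusive [min, max] range
}


def group_for(feature, value):
    """Return 'monitored' or 'reference' for a single feature value."""
    spec = MONITORED[feature]
    if isinstance(spec, tuple):
        low, high = spec
        in_monitored = low <= value <= high
    else:
        in_monitored = value in spec
    # Every value that is not monitored falls into the reference group.
    return "monitored" if in_monitored else "reference"


print(group_for("Sex", "Male"))
print(group_for("Age", 22))
```

With these definitions, Sex=Male and Age=40 fall into the reference group, while Sex=Female and Age=22 fall into the monitored group.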

Set fairness threshold

You can set the fairness threshold to specify an acceptable difference between the percentage of favorable outcomes for the monitored group and the percentage of favorable outcomes for the reference group. For example, if the percentage of favorable outcomes for a group in your model is 70% and the fairness threshold is set to 80%, then the fairness monitor detects bias in your model.
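As a rough illustration, the comparison can be sketched in Python. This sketch assumes that the comparison is expressed as the ratio of the favorable-outcome rate of the monitored group to that of the reference group, and that bias is flagged when the ratio falls below the threshold. That interpretation is an assumption made for this example, and the function names favorable_rate and check_fairness are hypothetical.

```python
def favorable_rate(outcomes, favorable="1"):
    """Fraction of outcomes that equal the favorable value."""
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    return sum(1 for o in outcomes if o == favorable) / len(outcomes)


def check_fairness(monitored, reference, threshold=0.8, favorable="1"):
    """Return (ratio, biased) under the assumed ratio interpretation."""
    reference_rate = favorable_rate(reference, favorable)
    if reference_rate == 0:
        raise ValueError("reference group has no favorable outcomes")
    ratio = favorable_rate(monitored, favorable) / reference_rate
    # Assumption: bias is flagged when the ratio is below the threshold.
    return ratio, ratio < threshold


# 7 of 20 favorable for the monitored group, 10 of 20 for the reference group.
monitored = ["1"] * 7 + ["0"] * 13
reference = ["1"] * 10 + ["0"] * 10
ratio, biased = check_fairness(monitored, reference, threshold=0.8)
print(f"ratio={ratio:.2f} biased={biased}")
```

In this example the monitored group receives favorable outcomes at 70% of the rate of the reference group, which is below the 80% threshold, so the sketch reports bias.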

Set sample size

Sample sizes specify how many transactions are evaluated. You must set a minimum sample size to indicate the lowest number of transactions that you want to evaluate. You can also set a maximum sample size to indicate the maximum number of transactions that you want to evaluate.
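The effect of the two limits can be sketched in Python. The select_sample function is hypothetical. It assumes that an evaluation is skipped when fewer transactions than the minimum are available, and that the most recent transactions are kept when more than the maximum are available. Both behaviors are assumptions made for this example.

```python
def select_sample(transactions, min_size, max_size=None):
    """Return the transactions to evaluate, or None if there are too few."""
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    if max_size is not None and max_size < min_size:
        raise ValueError("max_size must not be smaller than min_size")
    if len(transactions) < min_size:
        # Assumption: not enough data to evaluate yet.
        return None
    if max_size is not None and len(transactions) > max_size:
        # Assumption: keep only the most recent transactions.
        return transactions[-max_size:]
    return transactions


records = list(range(250))  # stand-in for logged payload transactions
sample = select_sample(records, min_size=100, max_size=200)
print(len(sample))
```

With 250 logged transactions, a minimum of 100, and a maximum of 200, the sketch evaluates the 200 most recent transactions.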

Testing for indirect bias

If you select a field that is not a training feature, called an added field, indirect bias is identified by finding associated values in the training features. For example, the profession “student” may imply a younger individual even though the Age field was excluded from model training. For details on configuring fairness evaluations to consider indirect bias, see Configuring the Fairness monitor for indirect bias.

Mitigating bias

Passive and active debiasing are used for machine learning model evaluations. Passive debiasing reveals bias, while active debiasing prevents you from carrying that bias forward by changing the model in real time for the current application. For details on interpreting results and mitigating bias in a model, see Reviewing results from a Fairness evaluation.

Measure performance with confusion matrix

The confusion matrix measures performance by categorizing positive and negative predictions into four quadrants that represent the actual and predicted values, as shown in the following example:

Actual/Predicted    Negative    Positive
Negative            TN          FP
Positive            FN          TP

The true negative (TN) quadrant represents values that are actually negative and predicted as negative. The true positive (TP) quadrant represents values that are actually positive and predicted as positive. The false positive (FP) quadrant represents values that are actually negative but predicted as positive. The false negative (FN) quadrant represents values that are actually positive but predicted as negative.
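The four quadrants can be counted with a short Python sketch. The confusion_counts function is a hypothetical helper written for this illustration. It assumes that labels are strings and that "1" marks the positive class.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="1"):
    """Count TN, FP, FN, and TP for paired actual and predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive:
            # Actually positive: correct prediction is TP, otherwise FN.
            counts["TP" if p == positive else "FN"] += 1
        else:
            # Actually negative: wrong prediction is FP, otherwise TN.
            counts["FP" if p == positive else "TN"] += 1
    return {key: counts[key] for key in ("TN", "FP", "FN", "TP")}


actual = ["1", "0", "1", "0", "1"]
predicted = ["1", "1", "0", "0", "1"]
print(confusion_counts(actual, predicted))
```

For the sample labels, the sketch counts one true negative, one false positive, one false negative, and two true positives.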

Note: Performance measures are not supported for regression models.
