Accuracy measures the proportion of your model's predictions that are correct.

Accuracy at a glance

  • Description: The proportion of correct predictions
  • Default thresholds: Lower limit = 80%
  • Default recommendation:
    • Upward trend: An upward trend indicates that the metric is improving. This means that model retraining is effective.
    • Downward trend: A downward trend indicates that the metric is deteriorating, which suggests that the feedback data is becoming significantly different from the training data.
    • Erratic or irregular variation: An erratic or irregular variation indicates that the feedback data is not consistent between evaluations. Increase the minimum sample size for the Quality monitor.
  • Problem types: Binary classification and multiclass classification
  • Chart values: Last value in the time frame
  • Metrics details available: Confusion matrix

Understanding Accuracy

Accuracy can mean different things, depending on the type of algorithm (see the sketch after this list):

  • Multi-class classification: Accuracy measures the number of times any class was predicted correctly, normalized by the number of data points. For more details, see Multi-class classification in the Apache Spark documentation.

  • Binary classification: For a binary classification algorithm, accuracy is measured as the area under an ROC curve. See Binary classification in the Apache Spark documentation for more details.

  • Regression: Regression algorithms are measured using the Coefficient of Determination, or R². For more details, see Regression model evaluation in the Apache Spark documentation.
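
These definitions correspond to Apache Spark's built-in evaluators. The following Python sketch shows how each metric might be computed with PySpark; the predictions DataFrame (with label, prediction, and rawPrediction columns, as produced by a fitted model's transform step) is an assumption for illustration.

    # Hedged sketch using Apache Spark's evaluators. `predictions` is assumed
    # to be a DataFrame with "label", "prediction", and "rawPrediction"
    # columns, for example the output of model.transform(test_df).
    from pyspark.ml.evaluation import (
        BinaryClassificationEvaluator,
        MulticlassClassificationEvaluator,
        RegressionEvaluator,
    )

    # Multi-class: fraction of rows where prediction == label
    multi_accuracy = MulticlassClassificationEvaluator(
        metricName="accuracy").evaluate(predictions)

    # Binary: area under the ROC curve
    binary_auc = BinaryClassificationEvaluator(
        metricName="areaUnderROC").evaluate(predictions)

    # Regression: coefficient of determination (R²)
    r2 = RegressionEvaluator(metricName="r2").evaluate(predictions)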

Understanding the display

A chart shows the accuracy metric trending downward over the selected time frame.

Do the math

Accuracy is defined as the number of true positives plus the number of true negatives, divided by the total number of predictions, that is, the sum of true positives, true negatives, false positives, and false negatives.

Accuracy = (number of true positives + number of true negatives) / (number of true positives + number of true negatives + number of false positives + number of false negatives)
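
As a quick check of the formula, the following Python sketch (not part of Watson OpenScale itself) derives accuracy from a confusion matrix with scikit-learn; the label arrays are hypothetical.

    # Minimal sketch: deriving accuracy from a confusion matrix.
    # The y_true / y_pred arrays are hypothetical examples.
    from sklearn.metrics import accuracy_score, confusion_matrix

    y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # manually labeled ground truth
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # model predictions

    # For binary labels, confusion_matrix returns [[TN, FP], [FN, TP]]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    print(accuracy)                         # 0.75
    print(accuracy_score(y_true, y_pred))   # same value, computed directly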

How it works

You must add manually labeled feedback data through the Watson OpenScale UI, a Python client, or the REST API, as shown in the sketch below.
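
As an illustrative, hedged sketch only: the following uses the ibm-watson-openscale Python SDK to store feedback records. The service URL, API key, data set ID, and the fields/values record format are all assumptions that depend on your deployment and model schema.

    # Hedged sketch: storing manually labeled feedback records.
    # All IDs, URLs, and the record schema are placeholders/assumptions.
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
    from ibm_watson_openscale import APIClient

    client = APIClient(
        authenticator=IAMAuthenticator(apikey="YOUR_API_KEY"),
        service_url="https://api.aiopenscale.cloud.ibm.com",  # assumed URL
    )

    # ID of the subscription's feedback data set (placeholder)
    feedback_data_set_id = "FEEDBACK_DATA_SET_ID"

    # Each record pairs feature values with a manually assigned label;
    # the field names below are hypothetical.
    client.data_sets.store_records(
        data_set_id=feedback_data_set_id,
        request_body=[
            {"fields": ["age", "income", "label"],
             "values": [[34, 52000, "approved"]]},
        ],
    )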

De-biased accuracy

When there is data to support it, accuracy is computed for both the original and the de-biased model. IBM Watson OpenScale computes the accuracy of the de-biased output and stores it in the payload logging table as an additional column.

A model visualization shows accuracy calculated for both the original and the de-biased model.
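
To make the comparison concrete, here is a hedged pandas sketch that computes accuracy for both prediction columns; the column names (prediction, debiased_prediction, label) are assumptions modeled on the payload logging table described above.

    # Hedged sketch: comparing original vs. de-biased accuracy.
    # Column names are assumptions modeled on the payload logging table.
    import pandas as pd

    payload = pd.DataFrame({
        "label":               ["yes", "no", "yes", "no", "yes"],
        "prediction":          ["yes", "no", "no",  "no", "yes"],
        "debiased_prediction": ["yes", "no", "yes", "no", "yes"],
    })

    original_accuracy = (payload["prediction"] == payload["label"]).mean()
    debiased_accuracy = (payload["debiased_prediction"] == payload["label"]).mean()

    print(f"original:  {original_accuracy:.2f}")   # 0.80
    print(f"de-biased: {debiased_accuracy:.2f}")   # 1.00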