Understanding metrics

PowerAI Vision provides several metrics to help you measure how effectively your model has been trained.

To understand these metrics, you must understand these terms:
True positive
A true positive result is when PowerAI Vision correctly labels or categorizes an image. For example, categorizing an image of a cat as a cat.
False positive
A false positive result is when PowerAI Vision labels or categorizes an image when it should not have. For example, categorizing an image of a cat as a dog.
True negative
A true negative result is when PowerAI Vision correctly does not label or categorize an image. For example, not categorizing an image of a cat as a dog.
False negative
A false negative result is when PowerAI Vision does not label or categorize an image, but should have. For example, not categorizing an image of a cat as a cat.
For a model in production, the true and false positive and negative counts cannot be known exactly. The values reported here are therefore the expected values for these measurements.
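
For illustration only, the following minimal Python sketch counts these four outcomes for a single category. The example labels and the count_outcomes helper are hypothetical; they are not part of PowerAI Vision.

    # Hypothetical ground-truth and predicted labels for a set of images.
    actual    = ["cat", "cat", "dog", "dog", "cat"]
    predicted = ["cat", "dog", "dog", "cat", "cat"]

    def count_outcomes(actual, predicted, category):
        """Count true/false positives and negatives for one category."""
        tp = fp = tn = fn = 0
        for a, p in zip(actual, predicted):
            if p == category:
                if a == category:
                    tp += 1   # correctly labeled as the category
                else:
                    fp += 1   # labeled as the category, but should not have been
            else:
                if a == category:
                    fn += 1   # belongs to the category, but was not labeled as it
                else:
                    tn += 1   # correctly not labeled as the category
        return tp, fp, tn, fn

    print(count_outcomes(actual, predicted, "cat"))   # (2, 1, 1, 1)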

Metrics for image classification (Trained for accuracy)

Accuracy
Measures the percentage of correctly classified images. It is calculated by (true positives + true negatives) / (true positives + true negatives + false positives + false negatives).
PR curve (Advanced)
The precision-recall (PR) curve plots precision vs. recall (sensitivity). Because precision and recall are typically inversely related, it can help you decide whether the model is appropriate for your needs. That is, do you need a system with high precision (fewer results, but the results are more likely to be accurate), or high recall (more results, but the results are more likely to contain false positives)?
Precision
Precision describes how "clean" the population of hits is. It measures the percentage of images classified into a category that actually belong to that category. That is, when the model classifies an image into a category, how often is it correct? It is calculated by true positives / (true positives + false positives).
Recall
The percentage of images that were classified into a category, out of all images that should have been classified into that category. That is, when an image belongs in a category, how often is it identified? It is calculated as true positives / (true positives + false negatives). A sketch that works through the accuracy, precision, and recall calculations appears after this list.
Confusion matrix (Advanced)
The confusion matrix is used to calculate the other metrics, such as precision and recall. Each column of the matrix represents the instances in a predicted class (those that PowerAI Vision marked as belonging to a category). Each row represents the instances in an actual category. Each cell therefore counts how many instances of an actual category were classified into each predicted category: cells on the diagonal are correct classifications, and cells off the diagonal are misclassifications.

You can view the confusion matrix as a table of values or a heat map. A heat map is a way of visualizing the data, so that the higher values appear more "hot" (closer to red) and lower values appear more "cool" (closer to blue). Higher values on the diagonal show more confidence in the model.

This matrix makes it easy to see if the model is confusing categories, or not identifying certain categories.
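
As an illustration of the accuracy, precision, and recall formulas above, here is a minimal Python sketch with hypothetical outcome counts; these are not values produced by PowerAI Vision.

    # Hypothetical outcome counts for one category.
    tp, tn, fp, fn = 40, 45, 5, 10

    accuracy  = (tp + tn) / (tp + tn + fp + fn)   # 85 / 100 = 0.85
    precision = tp / (tp + fp)                    # 40 / 45  ~ 0.889
    recall    = tp / (tp + fn)                    # 40 / 50  = 0.80
    print(accuracy, precision, recall)

    # The same counts arranged as a 2 x 2 confusion matrix
    # (rows = actual category, columns = predicted category).
    confusion = [[tp, fn],
                 [fp, tn]]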

Metrics for object detection (Trained for accuracy)

Accuracy
Measures the percentage of correct image classifications. It is calculated by (true positives + true negatives) / all cases.
Mean average precision (mAP)
The average over all classes of the maximum precision at each recall value. Precision measures how accurate the model is; that is, the percentage of the classified objects that are correct. Recall measures how well the model returns the correct objects. For example, out of 100 images of dogs, how many of them were classified as dogs?

To calculate this, first, the PR curve is found. Then, the maximum precision for each recall value is determined. This is the maximum precision for any recall value greater than or equal to the current recall value. For example, if the precision values range from .35 to .55 (and then never reach .55 again) for recall values in the interval .3 - .6, then the maximum precision for every recall value in the interval .3 - .6 is set to .55.

The mAP is then calculated as the average of the maximum precision values.
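
The following minimal Python sketch illustrates this interpolation with hypothetical precision-recall points for a single class; PowerAI Vision's internal implementation may differ.

    # Hypothetical PR-curve points for one class, ordered by increasing recall.
    # recall:     0.1   0.2   0.3   0.4   0.5   0.6
    precisions = [0.90, 0.80, 0.35, 0.45, 0.50, 0.55]

    # Interpolate: at each point, take the maximum precision at any recall value
    # greater than or equal to it, so every point in the .3 - .6 interval becomes 0.55.
    interpolated = [max(precisions[i:]) for i in range(len(precisions))]

    # The average precision (AP) for the class is the mean of the interpolated values.
    ap = sum(interpolated) / len(interpolated)

    # The mAP is the mean of the per-class AP values (only one class in this sketch).
    ap_per_class = [ap]
    mAP = sum(ap_per_class) / len(ap_per_class)
    print(round(mAP, 3))   # 0.65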

IoU (Intersection over union)
The accuracy of the location and size of the image label boxes.

It is calculated as the area of the intersection between a ground truth bounding box and a predicted bounding box, divided by the area of the union of both boxes. The ground truth bounding box is the hand-drawn box, and the predicted bounding box is the box drawn by PowerAI Vision.
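
A minimal Python sketch of this calculation follows, assuming boxes are given as (x1, y1, x2, y2) corner coordinates; the example boxes are hypothetical.

    def iou(box_a, box_b):
        """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
        # Area of overlap (the intersection).
        ix1 = max(box_a[0], box_b[0])
        iy1 = max(box_a[1], box_b[1])
        ix2 = min(box_a[2], box_b[2])
        iy2 = min(box_a[3], box_b[3])
        intersection = max(0, ix2 - ix1) * max(0, iy2 - iy1)

        # Area of the union = sum of both areas minus the overlap.
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - intersection

        return intersection / union if union else 0.0

    ground_truth = (10, 10, 50, 50)   # hand-drawn box
    predicted    = (20, 20, 60, 60)   # box drawn by the model
    print(round(iou(ground_truth, predicted), 3))   # 0.391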

Confusion matrix (Advanced)
The confusion matrix is used to calculate the other metrics, such as precision and recall. Each column of the matrix represents the instances in a predicted class (those that PowerAI Vision marked as belonging to a category). Each row represents the instances in an actual category. Each cell therefore counts how many instances of an actual category were classified into each predicted category: cells on the diagonal are correct classifications, and cells off the diagonal are misclassifications.

You can view the confusion matrix as a table of values or a heat map. A heat map is a way of visualizing the data, so that the higher values appear more "hot" (closer to red) and lower values appear more "cool" (closer to blue). Higher values on the diagonal show more confidence in the model.

This matrix makes it easy to see if the model is confusing categories, or not identifying certain categories.
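
Outside of the PowerAI Vision user interface, a comparable heat map can be drawn with matplotlib. In the minimal sketch below, the categories, the counts, and the choice of the coolwarm colormap are assumptions for illustration only.

    import matplotlib.pyplot as plt

    categories = ["cat", "dog", "bird"]         # hypothetical categories
    confusion = [[40,  3,  2],                  # rows: actual category
                 [ 5, 35,  1],                  # columns: predicted category
                 [ 2,  4, 30]]

    fig, ax = plt.subplots()
    im = ax.imshow(confusion, cmap="coolwarm")  # higher values -> red, lower -> blue
    ax.set_xticks(range(len(categories)))
    ax.set_xticklabels(categories)
    ax.set_yticks(range(len(categories)))
    ax.set_yticklabels(categories)
    ax.set_xlabel("Predicted category")
    ax.set_ylabel("Actual category")
    fig.colorbar(im)
    plt.show()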

PR curve (Advanced)
The precision-recall (PR) curve plots precision vs. recall (sensitivity). Because precision and recall are typically inversely related, it can help you decide whether the model is appropriate for your needs. That is, do you need a system with high precision (fewer results, but the results are more likely to be accurate), or high recall (more results, but the results are more likely to contain false positives)? A sketch of how such a curve can be traced appears after this list.
Precision
Precision describes how "clean" the population of hits is. It measures the percentage of objects that are correctly identified. That is, when the model identifies an object, how often is it correct? It is calculated by true positives / (true positives + false positives).
Recall
The percentage of images that were labeled as containing an object, out of all images that actually contain that object. That is, how often is an object correctly identified? It is calculated as true positives / (true positives + false negatives).
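
The following minimal Python sketch illustrates the precision-recall trade-off by sweeping a confidence threshold over hypothetical detection scores; it is not how PowerAI Vision computes the curve.

    # Hypothetical detections: (confidence score, whether it matched a ground-truth object).
    detections = [(0.95, True), (0.90, True), (0.80, False),
                  (0.70, True), (0.60, False), (0.50, True)]
    total_objects = 5   # hypothetical number of ground-truth objects

    for threshold in (0.9, 0.7, 0.5):
        kept = [is_match for score, is_match in detections if score >= threshold]
        tp = sum(kept)
        fp = len(kept) - tp
        precision = tp / (tp + fp) if kept else 0.0
        recall = tp / total_objects
        print(f"threshold={threshold:.1f}  precision={precision:.2f}  recall={recall:.2f}")

As the threshold is lowered, recall rises while precision falls, which is the trade-off that the PR curve visualizes.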

Metrics for object detection using the Tiny Yolo model (Trained for speed)

Accuracy
Measures the percentage of correctly classified objects. It is calculated by (true positives + true negatives) / (true positives + true negatives + false positives + false negatives).

Metrics for custom models

When a custom model is imported and deployed, the following metric is shown:

Accuracy
Measures the percentage of correct categorizations. It is calculated by (true positives + true negatives) / (true positives + true negatives + false positives + false negatives).