Data sets for object detection models
When you are preparing a data set for training an object detection model, ensure that the following requirements are met.
Requirements for accurate training
- The data set has at least five images.
- Every defined object has an object label. Images that do not have object labels are not used to train the model.
Example
You are training an object detection model to recognize cars, and the data set contains the following parameters:
- Five images: Ensure that you define and label a car as an object in at least five images.
- Three images and one video: Ensure that you define and label a car as an object in three images and in at least two frames of the video. Labeling five cars in one image is not adequate.
If your data set does not have many images or a sufficient variety for training, use the Augmentation feature to increase the data set.
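The five-image rule above can be checked programmatically. The following is a minimal sketch, not part of the product; the `meets_minimum` function and the mapping of image or frame names to label counts are hypothetical.

```python
def meets_minimum(labels_per_item, min_labeled=5):
    """Check the five-image rule: at least five images or video frames
    must each contain at least one labeled object.

    labels_per_item is a hypothetical mapping from an image or video
    frame name to the number of object labels it contains.
    """
    labeled_items = sum(1 for count in labels_per_item.values() if count > 0)
    return labeled_items >= min_labeled
```

Note that labeling five cars in a single image still counts as only one labeled item, so `meets_minimum({"img1.jpg": 5})` returns `False`, while five images with one labeled car each passes.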
Validation
For example, consider a data set of 200 images that is used to train an object detection model. With the default configuration for model training, 20% of the images, which is 40 images, are selected for testing the model. If the label LabelA identifies an object in the data set, and the number of images that contain LabelA objects is smaller than the test data set, for example, only 20 images, the following scenarios are possible:
- It is possible that all of the images with LabelA are in the training data set, and none of the images are used for testing the model. This situation results in unknown accuracy for LabelA because the accuracy is never tested.
- Similarly, it is possible that all 20 images with LabelA objects are in the test data set, but no images are used for training. This situation results in low or 0% accuracy for the object because the model was not trained with any images that contain LabelA objects.
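The scenarios above follow from the randomness of the train/test split. The sketch below simulates them under the assumptions stated in the example (200 images, 20 of which contain LabelA objects, a 20% test split); the `split_counts` helper is illustrative, not a product API.

```python
import random

def split_counts(n_images=200, n_labeled=20, test_frac=0.2, seed=0):
    """Randomly split a data set and count how many LabelA images land
    in the test partition versus the training partition.

    Hypothetical helper for illustration: the first n_labeled images
    are assumed to be the ones that contain LabelA objects.
    """
    rng = random.Random(seed)
    images = list(range(n_images))
    labeled = set(images[:n_labeled])
    rng.shuffle(images)
    test = set(images[: int(n_images * test_frac)])
    in_test = len(labeled & test)
    in_train = n_labeled - in_test
    return in_test, in_train
```

Because only 20 labeled images compete for 40 test slots, the count in each partition varies from split to split, and the extremes (all 20 in training, or all 20 in testing) are possible outcomes.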
Special considerations for object detection models
Measuring accuracy for object detection models can be more challenging because the accuracy calculation includes intersection over union (IoU), especially for models that use segmentation instead of bounding boxes.
IoU is calculated as the intersection of a ground truth bounding box and a predicted bounding box, divided by the union of both bounding boxes. The intersection is the area of overlap, the ground truth bounding box is the hand-drawn box, and the predicted bounding box is drawn by IBM® Maximo® Visual Inspection.
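The IoU calculation described above can be sketched for axis-aligned bounding boxes as follows. This is a generic illustration, not code from the product; the box format `(x_min, y_min, x_max, y_max)` is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned bounding boxes.

    Boxes are (x_min, y_min, x_max, y_max) tuples; box_a is the
    hand-drawn ground truth box and box_b is the predicted box.
    """
    # Corners of the overlap rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    # The overlap area is zero when the boxes do not intersect.
    intersection = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union else 0.0
```

An identical prediction yields an IoU of 1.0, a disjoint prediction yields 0.0, and a prediction that is offset from the ground truth falls somewhere in between, which is why imprecise boundaries lower the metric even when the object is correctly identified.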
In the case of object detection, the object might be correctly identified but the overlap of the boundary generated by the model is not accurate, which results in a poor IoU metric. This metric might be improved by more precise object labeling to reduce background noise, by training the model for longer, or both.
Ensure that the data sets that you use to train Anomaly optimized models contain only images of non-anomalous objects. These images are used to train a model that recognizes an object and identifies similar objects that have different characteristics. When you train and test the model, for best results, use images that present the object consistently. That is, make sure that the object's angle, centering, and scale are similar and that the image backgrounds are similar.