Scenario: Detecting objects in images

The goal of this scenario is to create a deep learning model to determine the make and model of a car that is captured on a traffic camera.

The image file that is used in this scenario is available for download here: Download car image.

This scenario follows these steps to create a deep learning model:

  1. Import images and create a data set.
  2. Label objects in an image.
  3. Train a model.
  4. Deploy a trained model.

Step 1: Import images and create a data set

First, create a data set and add images to it.

  1. Log in to Maximo® Visual Inspection.
  2. Click Data Sets in the side bar to open the Data Sets page. You can choose from several ways to create a new data set. For this example, create a new, empty data set.
  3. From the Data Sets page, click the icon and name the data set Traffic camera.
  4. To add an image to the data set, click the Traffic camera data set and click Import file or drag the image to the + area.
Important: Do not leave the Maximo Visual Inspection page, close the tab or window, or refresh until the upload completes. You can go to different pages within Maximo Visual Inspection during the upload.

Step 2: Label objects in an image

The next step is to label objects in the images. For object detection, each object must have at least five labels. Create "Black car" and "White car" objects, then label at least five images as black cars and at least five as white cars.

  1. Select the images from your data set and click Label Objects.
  2. Create new object labels for the data set by clicking Add new by the Objects list. Enter Black car and click Add. Then, enter White car and click OK.
  3. Label the objects in the images:
    1. The first image is open in the data area, with thumbnails of all the selected images on the side. Select the correct object label, for example, "Black car".
    2. Choose Box or Polygon, depending on the shape you want to draw around each object. Boxes are faster to label and train, but less accurate. Only Detectron and High resolution models support polygons. If you label objects with polygons and then use the data set to train a model that does not support polygons, bounding boxes are defined and used instead. Draw a polygon or a bounding box around the object.
    3. Select the thumbnail of the next image to open it. Add the appropriate labels, and continue through the rest of the images. For more information about identifying and drawing objects in video frames, see Guidelines for identifying and drawing objects.
  4. After all objects are labeled in all of the images, click Done editing.
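
The two labeling rules in this step (at least five labels for each object, and polygons falling back to bounding boxes) can be illustrated with a short sketch. It is not part of the product: the annotation dictionaries, the function names, and the idea that the fallback box is the smallest axis-aligned rectangle around the polygon are all assumptions made for illustration.

```python
from collections import Counter

MIN_LABELS_PER_OBJECT = 5  # minimum stated in this step


def polygon_to_box(points):
    """Return the smallest axis-aligned box (xmin, ymin, xmax, ymax) around a polygon.

    Assumption: this is one plausible way to define a box from a polygon;
    the product's exact conversion is not described in this scenario.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def under_labeled(annotations, required_labels):
    """Return {label: count} for each required label below the minimum count."""
    counts = Counter(a["label"] for a in annotations)
    return {
        name: counts[name]
        for name in required_labels
        if counts[name] < MIN_LABELS_PER_OBJECT
    }


# Hypothetical annotations: five black cars, but only three white cars.
annotations = (
    [{"label": "Black car", "points": [(10, 20), (60, 15), (70, 50), (12, 55)]}] * 5
    + [{"label": "White car", "points": [(5, 5), (40, 5), (40, 30), (5, 30)]}] * 3
)

print(under_labeled(annotations, ["Black car", "White car"]))  # {'White car': 3}
print(polygon_to_box(annotations[0]["points"]))  # (10, 15, 70, 55)
```

In this made-up data set, "White car" would need at least two more labels before training.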

Step 3: Train a model

With all the object labels that are identified in your data set, you can now train your deep learning model. To train a model, complete the following steps:

  1. From the Data set page, click Train.
  2. Complete the fields on the Train Data set page, ensuring that you select Object Detection. For Model selection, choose Accuracy (Faster R-CNN).
  3. Click Train.
  4. (Optional - Supported only when training for object detection.) Stop the training process by clicking Stop training > Keep Model > Continue.
    You can wait for the entire training process to complete. However, you can opt to stop the training process when the lines in the training graph start to flatten out, as shown in Figure 1, because improvements in the quality of training might plateau over time. Therefore, the fastest way to deploy a model and refine the data set is to stop the process before quality stops improving.
    Use the early stop functionality carefully when training segmented object detection models (such as models that use the Detectron model type). Larger iteration counts and longer training times can improve accuracy even when the graph indicates that the accuracy is plateauing. The precision of the label can continue to improve even when the accuracy of identifying the object location stops improving.
    Figure 1. Model training graph
    The image shows loss on the vertical axis and iterations on the horizontal axis. As more iterations occur, the lines for loss converge to a flat line.
    Important: If the training graph converges quickly and has 100% accuracy, the data set does not have enough information. The same is true if the accuracy of the training graph fails to rise or the errors in the graph do not decrease at the end of the training process. For example, a model with high accuracy might be able to discover all instances of different race cars. However, the same model might be unable to differentiate between specific race cars or cars that have different colors. In this situation, add more images, video frames, or videos to the data set. Then, label those objects and try the training again.
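
The idea of stopping when the loss lines flatten out can be made concrete with a small sketch. This is not how the product decides anything; it is one simple heuristic, and the function name, the window size, and the 1% threshold are assumptions chosen for illustration. It compares the mean loss of the most recent iterations with the mean of the window before it.

```python
def has_plateaued(losses, window=50, min_improvement=0.01):
    """Return True when loss has stopped improving noticeably.

    Compares the mean loss over the last `window` iterations with the mean
    over the preceding `window` iterations. If the relative improvement is
    below `min_improvement`, the curve is treated as flat.
    """
    if len(losses) < 2 * window:
        return False  # not enough history to judge
    previous = sum(losses[-2 * window:-window]) / window
    recent = sum(losses[-window:]) / window
    if previous <= 0:
        return True  # loss is already at zero, nothing left to improve
    return (previous - recent) / previous < min_improvement


# Hypothetical loss values: a steady decline, then a flat stretch.
still_improving = [1.0 - 0.005 * i for i in range(100)]
flattened = still_improving + [0.5] * 100

print(has_plateaued(still_improving))  # False
print(has_plateaued(flattened))        # True
```

As the step notes, a flat curve is a weaker signal for segmented object detection models, so a check like this would be only a hint there, not a reason to stop.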

Step 4: Deploy a trained model

GPUs are used as follows.

  • Each High resolution, Structured segment network (SSN), Anomaly optimized, or custom deployed model takes one GPU. The GPU group is listed as '-', which indicates that this model uses a full GPU and does not share the resource with any other deployed models.
    Note: Starting in Maximo Visual Inspection 8.7, custom models are not supported. Custom models are still supported in Maximo Visual Inspection 8.6 and earlier versions.
  • Multiple Faster R-CNN, GoogLeNet, SSD, YOLO v3, Tiny YOLO v3, and Detectron2 models are deployed to a single GPU. That is, the model is deployed to the GPU that has the most models deployed on it, if sufficient memory is available on the GPU. The GPU group can be used to determine which deployed models share a GPU resource. To free up a GPU, all deployed models in a GPU group must be deleted or undeployed.
Note: IBM® Maximo Visual Inspection leaves a buffer on the GPU. The size of the buffer varies with the combination of models that are currently deployed.
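
The placement rules above can be sketched as a small simulation. This is an illustration only, not the product's scheduler: the `Gpu` class, the `place` function, and the memory figures are hypothetical, and the GPU buffer mentioned in the note is ignored.

```python
from dataclasses import dataclass, field

# Model types that take a whole GPU, as listed above (custom models omitted).
EXCLUSIVE_TYPES = {"High resolution", "SSN", "Anomaly optimized"}


@dataclass
class Gpu:
    name: str
    memory_mb: int
    models: list = field(default_factory=list)  # (model_type, memory_mb) pairs
    exclusive: bool = False  # True once a full-GPU model is deployed here

    def free_mb(self):
        return self.memory_mb - sum(mem for _, mem in self.models)


def place(gpus, model_type, needed_mb):
    """Deploy a model to a GPU and return that GPU, or None if nothing fits."""
    if model_type in EXCLUSIVE_TYPES:
        # A full-GPU model needs a GPU with no other deployed models.
        candidates = [g for g in gpus if not g.models and g.memory_mb >= needed_mb]
    else:
        candidates = [g for g in gpus if not g.exclusive and g.free_mb() >= needed_mb]
        # Prefer the GPU that already hosts the most models.
        candidates.sort(key=lambda g: len(g.models), reverse=True)
    if not candidates:
        return None
    chosen = candidates[0]
    chosen.models.append((model_type, needed_mb))
    chosen.exclusive = model_type in EXCLUSIVE_TYPES
    return chosen


gpus = [Gpu("gpu0", 16000), Gpu("gpu1", 16000)]
for model_type, mem in [("Faster R-CNN", 4000), ("YOLO v3", 3000),
                        ("High resolution", 8000), ("SSD", 2000),
                        ("High resolution", 8000)]:
    gpu = place(gpus, model_type, mem)
    print(model_type, "->", gpu.name if gpu else "no GPU available")
```

In this run the three shareable models end up together on gpu0, the first High resolution model takes gpu1 for itself, and the second High resolution model cannot be deployed until a GPU is freed.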

To deploy the trained model, complete the following steps:

  1. Click Models from the menu.
  2. Select the model that you created in the previous section and click Deploy.
  3. Specify a name for the model, and click Deploy. The Deployed Models page is displayed, and the model is deployed when the status column displays Ready.
  4. Double-click the deployed model to get the API endpoint and test other videos or images against the model.
    Note: For more information about APIs, see REST APIs.
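
Testing an image against the API endpoint from a script might look like the following sketch. It only builds the HTTP request and does not send it. The endpoint URL is a placeholder to be replaced with the one shown for your deployed model, and the form field name `files`, the function name, and the omission of authentication and TLS settings are assumptions; check the REST APIs documentation for the actual request format.

```python
import urllib.request
import uuid


def build_inference_request(endpoint_url, image_bytes, filename, field_name="files"):
    """Build (but do not send) a multipart/form-data POST that uploads one image.

    Assumption: the endpoint accepts the image as a form file field named
    `field_name`; confirm the real field name in the REST API reference.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=head + image_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Placeholder endpoint and image bytes; copy the real endpoint from the
# deployed model's page and read the bytes from your image file.
request = build_inference_request(
    "https://example.com/deployed-model-endpoint", b"<image bytes>", "car.jpg"
)
print(request.get_method(), request.full_url)
# To send it, pass `request` to urllib.request.urlopen and read the response body.
```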

Next steps

You can continue to refine the data set as much as you want. When you are satisfied with the data set, you can train the model again. This time, you might want to train the model for a longer time to improve its overall accuracy. The goal is for the loss lines in the model training graph to converge to a stable flat line. The lower the values for the loss lines, the better. After the training is completed, you can deploy the model again. You can double-click the deployed model to get the API endpoint and test other videos or images against the model.