Scenario: Detecting objects in high-resolution images

The goal of this scenario is to create a deep learning model to detect small and possibly numerous objects in high-resolution images.

You can create this type of model by labeling images in a data set and training a High resolution model.

In this scenario, you create a model that improves worker safety by identifying helmets in high-resolution images of workers.

To create the model, take the following steps:

  1. Create a data set and add images.
  2. Label objects in images.
  3. Train a model.
  4. Deploy the trained model.
  5. Test the deployed model.

Step 1: Create a data set and add images

Take the following steps to create the data set that you use to train the model:

  1. Log in to Maximo® Visual Inspection.
  2. In the side bar, click Data sets.
  3. In the Data sets page, click Create new data set.
  4. In the Data set page, click the icon and name the data set.
  5. Import high-resolution images of workers and helmets by clicking Import file or dragging the images to the Drop files here area.

Step 2: Label objects in images

Take the following steps to label objects in the images in your data set:

  1. In the data set, select the images and click Label Objects. The Label Objects screen is displayed. The first selected image opens in the primary view, and thumbnails of the other images are displayed.
  2. In the Objects list, click Add new.
  3. Add an object that is called Helmet and click OK.
  4. In the Objects list, click the Helmet label.
  5. From the label toolbar, choose Enable box drawing or Enable polygon drawing, depending on the shape that you want to draw around each object. Boxes are faster to label and train but are less accurate.
  6. Draw a polygon or a bounding box around any visible helmets in the image.
  7. Optional: Use the pan and zoom controls to adjust the zoom level of the image. You can pan, zoom in, zoom out, zoom to a selected area, or use the mini-map to change the portion of the image that is displayed in the primary view.
  8. For each remaining image, click its thumbnail to open it and repeat steps 5 to 7. Label at least five images for the Helmet object.

You can stop labeling at any time by leaving the page. For more information, see Guidelines for identifying and drawing objects.

Step 3: Train a model

Take the following steps to train the model:

  1. In the Data set page, click Train model.
  2. In the Train model page, select the Object detection training type.
  3. Select the High resolution optimization option and perform the following tasks:
    a. If you used bounding boxes to label the images, clear the Enable segmentation checkbox.
    b. If you want to manually decide when to stop the training, clear the Enable auto early stop checkbox.
  4. Optional: Click the Advanced settings toggle to specify advanced training options, such as increasing the number of training iterations to improve labeling accuracy.
  5. Click Train model.
  6. Optional: Stop the training process by clicking Stop training > Keep Model > Continue.

Training considerations

  • Training a model uses one GPU.
  • If you selected the Enable auto early stop checkbox, training stops automatically when either the most accurate model is found or the training process reaches the maximum number of iterations.
  • If you cleared the Enable auto early stop checkbox, you can wait for the training process to complete. Alternatively, you can stop the process after the loss lines plateau and the validation loss line is equal to or climbs higher than the training loss line. This scenario is shown in Figure 1.
    Figure 1. High resolution model training stopping point
    The image shows two loss lines on the vertical axis and iterations on the horizontal axis. As more iterations occur, the loss lines converge to flat lines.
  • When training is completed, the performance metrics for the model are calculated. This process might take several minutes.
  • The following factors can cause the calculated accuracy metric to be lower for a segmented model than for a model that uses bounding boxes:
    • More accurate labeling of objects makes detection within the intersection over union (IoU) threshold more challenging.
    • Accuracy is calculated on the set of objects that are defined in the data set. Some objects might be more difficult to correctly identify.
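The manual stopping heuristic described above can be sketched in Python: stop when both loss curves have plateaued and the validation loss meets or exceeds the training loss. The window size and flatness tolerance below are illustrative assumptions, not product settings:

```python
def should_stop(train_loss, val_loss, window=5, tol=1e-3):
    """Return True when both loss curves have plateaued and the
    validation loss has met or exceeded the training loss.

    train_loss, val_loss: per-iteration loss histories (lists of floats).
    window: number of recent iterations used to judge a plateau (assumed).
    tol: maximum average change per iteration that counts as flat (assumed).
    """
    if len(train_loss) < window + 1 or len(val_loss) < window + 1:
        return False  # not enough history to judge a plateau yet

    def plateaued(losses):
        recent = losses[-(window + 1):]
        deltas = [abs(b - a) for a, b in zip(recent, recent[1:])]
        return sum(deltas) / window < tol

    crossed = val_loss[-1] >= train_loss[-1]
    return plateaued(train_loss) and plateaued(val_loss) and crossed
```

In practice you would read the loss values from the training chart rather than compute them yourself; the sketch only formalizes the visual rule that Figure 1 illustrates.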

Step 4: Deploy the trained model

Take the following steps to deploy the model:

  1. In the side bar, click Models.
  2. Select the trained model and click Deploy model.
  3. In the dialog, specify a deployed model name and click Deploy. The Deployed models page is displayed, and the model is deployed when the Status column displays Ready.

Each deployed High resolution model uses one GPU.

Step 5: Test the deployed model

Take the following steps to test the model:

  1. In the Deployed models page, click the model that you deployed.
  2. In the Test model tile, set the confidence threshold.
  3. Upload an image or video that contains helmets and review the inference data in the Results section.
  4. Optional: Retrieve the Deployed model API endpoint value to test the model by using the API.
    Note: For more information about APIs, see REST APIs.
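The API test in step 4 can be sketched with the third-party `requests` library. The endpoint URL, the multipart field name, and the response keys shown here are assumptions for illustration only; substitute the actual Deployed model API endpoint value and consult the REST APIs documentation for the real request and response formats:

```python
# Hypothetical endpoint: copy the real value from the Deployed model
# API endpoint field on the Deployed models page.
ENDPOINT = "https://mvi.example.com/api/dlapis/<deployed-model-id>"

def filter_detections(result, threshold):
    """Keep only detections at or above the confidence threshold.

    The "classified" and "confidence" keys are assumed response
    fields; adjust them to match the documented response format.
    """
    return [d for d in result.get("classified", [])
            if d.get("confidence", 0) >= threshold]

def detect_helmets(image_path, threshold=0.8):
    """Send one image for inference and return filtered detections."""
    import requests  # third-party HTTP library, assumed available

    with open(image_path, "rb") as f:
        # "files" is the multipart field name assumed here.
        response = requests.post(ENDPOINT, files={"files": f})
    response.raise_for_status()
    return filter_detections(response.json(), threshold)
```

Filtering on the client side mirrors the confidence threshold that you set in the Test model tile, so API results stay comparable to what you see in the user interface.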

Testing considerations

When a High resolution model runs inference on an image, it completes the following steps:

  1. Converts the image into multiple tiles and performs inference on each tile.
  2. Combines the inferred tiles to construct the overall inference for the image.
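The tiling steps above can be sketched as follows. The tile size, overlap, and `detect_tile` callback are illustrative assumptions, not the product's actual implementation:

```python
def tile_inference(width, height, detect_tile, tile=1024, overlap=128):
    """Sketch of tile-based inference on a high-resolution image.

    Slides a tile-sized window across the image (with overlap so that
    objects on tile borders are not missed), runs detection on each
    tile, and shifts each tile's detections back into full-image
    coordinates.

    detect_tile(x, y, w, h) -> list of (x, y, w, h, label, score)
    boxes in tile-local coordinates (hypothetical callback).
    """
    step = tile - overlap
    detections = []
    for top in range(0, max(height - overlap, 1), step):
        for left in range(0, max(width - overlap, 1), step):
            w = min(tile, width - left)
            h = min(tile, height - top)
            for bx, by, bw, bh, label, score in detect_tile(left, top, w, h):
                # Translate a tile-local box to full-image coordinates.
                detections.append((left + bx, top + by, bw, bh, label, score))
    # A real implementation would also merge duplicate detections that
    # appear in overlapping tiles (for example, with non-max suppression).
    return detections
```

Because inference runs once per tile, the cost grows with image size, which accounts for the longer inference times noted below.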

Compared to other model types, this approach brings the following benefits:

  • Improved detection of relatively small objects.
  • Detection of a greater number of objects.

However, compared to other model types, this approach has the following drawbacks:

  • Longer inference times. Inference can take several seconds, and video inference takes several times longer than with other model types.
  • Longer network transfer times because the images that are sent to the model for inference are larger.

The objects that the model detects might be small compared to the overall image size. When you test the model in the user interface, use the pan and zoom features to view the labeled objects.