On average, humans dump around 8 million metric tons (~17.6 billion pounds) of trash into the ocean each year.
At this rate, the total amount of plastic in the ocean will outweigh all of the ocean’s fish by 2050. Ocean and beach pollution is a real issue, with consequences that extend far beyond merely making our beaches and oceans look less “pretty.” Some of the most dangerous consequences include the following:
Depletion of oxygen content in the water
Effect of toxic wastes on marine animals
Failure in the reproductive system of marine animals
Contamination of the food chains
Effect on human health
Disruption to the cycle of coral reefs
Plastics are the most common type of debris found in the ocean today and are especially harmful to the environment, since they do not break down easily and are often mistaken for food by marine animals. In an effort to join the fight against global ocean pollution, the IBM Space Tech team has begun work on an open-source, machine-learning, neural-network object-detection project called PlasticNet.
This project is based on YOLOv4 Darknet detection and TensorFlow Object Detection API models, and it provides an environment where developers can easily prepare, train and test detection models that identify different types of plastic (and non-plastic) trash in the ocean and on the beach.
Develop real-time detection of different types of trash (plastic, in particular) in the ocean by utilizing transfer learning on different machine-learning object-detection architectures.
Build a fully functional PlasticNet machine-learning pipeline that can be easily used to train and test object-detection models based on architectures like YOLOv4 and TensorFlow-based models such as Faster R-CNN, SSD-ResNet and EfficientDet (all accessible inside a command line client).
Provide a set of pretrained PlasticNet models (a PlasticNet Model Zoo) that can be utilized for future development and improvement via transfer learning.
Implement our models to work on real-time satellite and camera footage.
In the long term, we would also like to improve our model to recognize logos/brands on trash, in order to detect and identify which companies different types of ocean/beach trash originate from.
The main goal, and the starting point for this project, was to be able to detect some of the most common types of ocean trash from a camera feed.
To be able to perform transfer learning and adapt a well-trained model to recognize trash, we first had to gather a large dataset of images that we could use for training. At first, we scoured Google Images for viable images of trash that we could label and use as part of the dataset.
At the start, the team labeled every piece of detectable trash within the images, and we ended up with an over-defined set of classes. We then duplicated our annotations project and started building models from our most populated classes. We started small with a two-class model, then four, then six and so on. Starting this way allowed us to see which types of trash appeared more often and were more important to prioritize and detect.
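The "most populated classes" selection described above amounts to a frequency count over annotation labels. A minimal sketch (the record format and label names here are hypothetical, not taken from the actual PlasticNet annotations):

```python
from collections import Counter

def top_classes(annotations, k):
    """Return the k most-populated label classes from a list of
    (image_id, label) annotation records."""
    counts = Counter(label for _, label in annotations)
    return [label for label, _ in counts.most_common(k)]

# Hypothetical annotation records: (image_id, label)
annotations = [
    ("img1", "plastic_bottle"), ("img1", "metal_can"),
    ("img2", "plastic_bottle"), ("img2", "plastic_bag"),
    ("img3", "plastic_bottle"), ("img3", "metal_can"),
]

print(top_classes(annotations, 2))  # ['plastic_bottle', 'metal_can']
```

Starting from the top two classes and re-running this with a larger k mirrors the two-class, four-class, six-class progression described above.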
Using these datasets, we were able to apply transfer learning to various models from the TensorFlow Detection Model Zoo and models based on YOLOv4. From this point, we began a cycle of changing model parameters, testing new models, growing the dataset and improving model performance.
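Testing detection models in a cycle like this typically relies on intersection over union (IoU) between predicted and ground-truth bounding boxes; the post does not specify the team's exact evaluation code, so the following is a generic sketch, assuming boxes in (x_min, y_min, x_max, y_max) format:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes,
    each given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7, about 0.143
```

A prediction is usually counted as a correct detection when its IoU with a ground-truth box of the same class exceeds a threshold such as 0.5.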
Improving model performance meant identifying what our model was failing to detect or, in some cases, was detecting incorrectly. For example, one of our first models would often detect a potato-chip bag as a metal can, since both objects often have colorful pictures on the outside. And sometimes, metal cans were detected as plastic bottles, due to the ridges on the top of a can looking similar to those on a plastic bottle.
Another example of false classification: our early models detected almost anything white as Styrofoam, because they had only learned to associate the Styrofoam class with color.
Once we recognized these issues, many of them could be addressed by applying different image-augmentation techniques. One issue we fixed this way was the over-detection of Styrofoam. By applying hue and saturation changes to our images, the model learned that Styrofoam wasn't defined by color alone; it began to recognize its shape and texture instead.
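The saturation side of this augmentation can be approximated by interpolating each pixel between its grayscale value and its original color. This is a pure-Python sketch of the idea, not the actual augmentation tooling the team used (which the post does not specify); hue shifts are omitted for brevity:

```python
def jitter_saturation(image, factor):
    """Scale the saturation of an RGB image given as nested lists of
    (r, g, b) floats in [0, 1].  factor=0 -> grayscale,
    factor=1 -> unchanged, factor>1 -> more saturated."""
    out = []
    for row in image:
        new_row = []
        for r, g, b in row:
            gray = (r + g + b) / 3.0  # simple luminance proxy
            new_row.append(tuple(
                min(1.0, max(0.0, gray + factor * (c - gray)))
                for c in (r, g, b)
            ))
        out.append(new_row)
    return out

# A 1x1 "image" containing one saturated red pixel
img = [[(1.0, 0.0, 0.0)]]
print(jitter_saturation(img, 0.0))  # fully desaturated -> gray pixel
```

Training on randomly jittered copies of each image forces the model to stop relying on color as the sole cue for a class like Styrofoam.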
The first additions of image augmentation improved model accuracy immensely, and we were able to continue improving the model by adding more and more images. The dataset we graciously received from the Pacific Whale Foundation also helped us add more classes to the model and further improve its accuracy.
While developing the final models, we also found some statistics showing that face masks were expected to be a large portion of new ocean pollutants due to the pandemic. Because of this, we added face masks to the dataset and trained our models on them, so they would be prepared to encounter face masks as an ocean pollutant as well.
Project architecture, PlasticNet command line client and Model Zoo
In order to make our project easily usable and replicable for the public, we developed a PlasticNet command line client. The PlasticNet command line program combines YOLOv4 and TensorFlow Object Detection API technologies into a single, easy-to-use machine-learning pipeline CLI. Collaborators can use the PlasticNet CLI to prepare models for training (via transfer learning from the provided pre-trained PlasticNet models), train custom detection models built upon pre-trained PlasticNet models, export the trained models and, finally, test the trained models. The CLI was created so these steps can all be done with a few simple commands, seen here.

Initially trained via transfer learning from pre-trained YOLO weights (found here) and pre-trained TensorFlow models (from the TensorFlow Detection Model Zoo), our official PlasticNet Model Zoo (found here) can be used by collaborators for the further improvement and development of new PlasticNet object-detection models. For labeling images, we utilized IBM's Cloud Annotations (instructions found here).