Taking action to preserve our oceans
Overcoming data challenges to track and reduce marine litter

Plastic waste is destroying marine ecosystems at a rapid pace—from ruining beaches to killing baby turtles to destroying coral on the sea floor.

With half the world’s plastic produced in the last 13 years and 8.8 million tons washing into the oceans annually, few places on earth have escaped its reach. Plastic litter fouls the remote, icy coves of Antarctica, the beautiful shores of Réunion and Mauritius, and even the unfathomable reaches of the 10,000-meter-deep Mariana Trench.

A problem so pervasive and pernicious requires immediate, global attention.

The United Nations Environment Programme (UNEP) rallies marine experts, environmentalists, nonprofits, academics and citizen scientists from countries around the world to confront the issue of environmental sustainability. In 2015, the United Nations adopted 17 Sustainable Development Goals (SDGs) for the planet, with goal 14 calling for conservation and sustainable use of the oceans. The United Nations Development Programme (UNDP) set a goal of significantly reducing marine pollution by 2025.

While no one would argue against the importance of ridding beaches of single-use plastic and other forms of debris, there’s a big problem: you can’t improve what you can’t measure. There’s no process in place to deliver data on the amount of plastic polluting beaches today—and no one really knows if siloed beach cleanup efforts are even making a dent.

Tons of Plastic

8.8 million tons of plastic washes into oceans annually 

Enhanced Predictability

The model can predict litter volumes 5 years into the future


“AI is a powerful ally for citizen science that can help local to global communities. We are just taking the first steps toward realizing this potential.”
Dr. Anne Bowser, Director of Innovation, the Wilson Center

Conserving life below water

Challenge 1: Uniting the world’s ocean litter data

Estimating the volume of marine litter scattered across all five oceans is harder than it seems. No standard marine litter data collection method exists to guide countries and organizations. The Wilson Center, one of the US’s premier nonpartisan policy organizations, together with UNEP and IBM’s Data Science and AI Elite (DSE) team, needed to harmonize a large number of schemas and metadata formats so that data reported from all corners of the world could be used.

To foster more effective collaboration between all stakeholders, UNEP set a key objective to establish a global platform for marine litter. With IBM Knowledge Catalog in IBM Cloud Pak® for Data, they were able to quickly and automatically clean, crosswalk, classify, conform and make available the right data for data scientists. The solution also allowed citizen scientists to trace origins of the data, collaborate with other scientists, request datasets and share their insights on the datasets using rating and tagging mechanisms.
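The crosswalk step described above can be illustrated with a minimal sketch. It is not the IBM Knowledge Catalog workflow, which the case study does not detail. It only shows the general idea of mapping differently named source fields onto one shared schema. The source names, field names and records are all made up for illustration.

```python
# Hypothetical crosswalk: source-specific field names -> one shared schema.
# Source names, field names and records are invented for this example.
CROSSWALK = {
    "survey_a": {"beach": "site", "date": "date", "plastic_items": "item_count"},
    "survey_b": {"Location": "site", "SurveyDate": "date", "TotalPlastic": "item_count"},
}


def harmonize(record, source):
    """Rename a record's fields to the shared schema and coerce the count to int."""
    mapping = CROSSWALK[source]
    out = {target: record[field] for field, target in mapping.items()}
    out["item_count"] = int(out["item_count"])
    out["source"] = source  # keep provenance so each row's origin stays traceable
    return out


records = [
    harmonize({"beach": "Playa Hermosa", "date": "2020-03-01", "plastic_items": "42"}, "survey_a"),
    harmonize({"Location": "Flic en Flac", "SurveyDate": "2020-03-02", "TotalPlastic": 17}, "survey_b"),
]
```

Once every record shares the same keys, datasets from different organizations can be combined and compared in a single analysis.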

Challenge 2: Conquering conditional datasets to preserve the health of beaches

The second challenge was to calculate the volume of marine beach litter. Statistically randomized surveys help create accurate scientific estimates, but data collection about litter is, by its very nature, unsystematic. Because it relies heavily on volunteer cleanup crews, cleanup data can be shaped by temporal and spatial biases. For example, one volunteer may collect beach litter daily, but what they collect each day will differ from what someone who collects weekly or monthly finds. The resulting samples depend on many variables and are difficult to compare and analyze.

Cleanup efforts are also inconsistent across locations, with some places cleaned very frequently and others rarely or never touched, so the samples are neither independent nor identically distributed (IID). Because typical machine learning methods assume IID samples, such conditional datasets cannot be handled with them directly.

To address these challenges, the DSE team used Bayesian inference with Markov chain Monte Carlo (MCMC) sampling. The Bayesian approach allowed them to account for uncertainties in the problem; MCMC allowed them to generate a chain of dependent samples from which to estimate the parameters of marine litter. The proof of concept showed that this hybrid methodology could be adjusted and modified to strengthen the model.
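To make the idea of Bayesian inference with MCMC concrete, here is a toy sketch. It is not the DSE team's model, which the case study does not describe. It assumes, purely for illustration, that litter counts per survey follow a Poisson distribution with an unknown rate and a Gamma prior, and it uses a random-walk Metropolis sampler. The counts and prior parameters are made up, and this simple model does not address the sampling biases discussed above.

```python
import math
import random


def log_posterior(lam, counts, a=2.0, b=0.1):
    """Unnormalized log posterior of a Poisson rate under a Gamma(a, b) prior (assumed model)."""
    if lam <= 0:
        return float("-inf")
    return (a - 1 + sum(counts)) * math.log(lam) - (b + len(counts)) * lam


def metropolis(counts, n_steps=20000, step=1.0, seed=0):
    """Random-walk Metropolis sampler: each draw depends on the previous one."""
    rng = random.Random(seed)
    lam = max(sum(counts) / len(counts), 1e-3)  # start near the sample mean
    current = log_posterior(lam, counts)
    chain = []
    for _ in range(n_steps):
        proposal = lam + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, counts)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            lam, current = proposal, candidate
        chain.append(lam)
    return chain


counts = [12, 9, 15, 11, 8, 14, 10, 13]  # made-up litter items per survey
chain = metropolis(counts)[5000:]  # discard burn-in draws
posterior_mean = sum(chain) / len(chain)
```

Because the Gamma prior is conjugate to the Poisson likelihood, the exact posterior mean here is (a + sum of counts) / (b + number of surveys), which gives a way to check that the sampler behaves sensibly. The chain also yields a full distribution, so uncertainty intervals come from the same draws.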

The DSE team created a machine learning pipeline in IBM Cloud Pak for Data to establish a streamlined end-to-end AI lifecycle. Once they established a baseline for measuring marine litter, the team could predict the number of volunteers needed for a cleanup effort at a particular beach. Given current trends and policies, the model will help project the amount of expected litter five years into the future.
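The five-year projection mentioned above can be sketched, in its simplest form, as a trend extrapolation. The case study does not say which forecasting model the team used, so the ordinary least-squares line below is only a stand-in, and the yearly density values are made up.

```python
def fit_trend(years, values):
    """Fit values = slope * year + intercept by ordinary least squares."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project(years, values, horizon=5):
    """Extrapolate the fitted trend for `horizon` years past the last observation."""
    slope, intercept = fit_trend(years, values)
    last = years[-1]
    return {last + k: slope * (last + k) + intercept for k in range(1, horizon + 1)}


years = [2015, 2016, 2017, 2018, 2019, 2020]
density = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]  # made-up litter density per year
forecast = project(years, density)
```

A straight line ignores seasonality and policy changes, so a real forecast would need a richer model and an estimate of its uncertainty.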

Challenge 3: Looking ahead to shore up prevention and support

The best way to solve the marine litter problem is to prevent it. Looking forward, how can coastal communities forestall permanent damage to pristine coastlines? The DSE team created a time-series forecast to help track marine plastic and develop more accurate and effective policies to eradicate it. To make the dataset easily consumable, the team created an executive dashboard allowing various stakeholders to:

  • Monitor the progression of marine litter density year over year
  • Slice and dice the data by national location to evaluate litter trends over time
  • Narrow focus to specific beaches for more granular data collection
  • Refine methodology to recommend the best mobile apps to volunteer groups
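The first two dashboard views listed above amount to grouping survey records by country and year and comparing consecutive years. The sketch below shows one way this could be computed. It does not reflect how IBM's dashboard is built, and the field names (`country`, `year`, `items_per_100m`) and all values are hypothetical.

```python
from collections import defaultdict


def yearly_density(rows):
    """Average litter density for each (country, year) pair."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, row count]
    for row in rows:
        key = (row["country"], row["year"])
        totals[key][0] += row["items_per_100m"]
        totals[key][1] += 1
    return {key: total / n for key, (total, n) in totals.items()}


def year_over_year(density, country):
    """Change in average density between consecutive surveyed years for one country."""
    years = sorted(y for c, y in density if c == country)
    return {
        year: density[(country, year)] - density[(country, prev)]
        for prev, year in zip(years, years[1:])
    }


rows = [  # made-up survey records
    {"country": "Costa Rica", "beach": "A", "year": 2019, "items_per_100m": 80.0},
    {"country": "Costa Rica", "beach": "B", "year": 2019, "items_per_100m": 120.0},
    {"country": "Costa Rica", "beach": "A", "year": 2020, "items_per_100m": 90.0},
    {"country": "Mauritius", "beach": "C", "year": 2020, "items_per_100m": 40.0},
]
density = yearly_density(rows)
change = year_over_year(density, "Costa Rica")
```

Filtering `rows` by the `beach` field before aggregating would give the narrower, beach-level view.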

With an end-to-end AI lifecycle in place, scientists and policymakers could extract even more value from the Wilson Center’s datasets, whether to choreograph cleanups or predict a timeline for getting to zero pollution. IBM’s custom digital dashboard makes the work easily accessible and sharable even for those without technical expertise.

These tools empower a UNEP stakeholder like Costa Rica to track its progress toward the nation’s aim of ridding itself of plastics entirely.


Challenge 4: Making more people care about marine litter

UNEP leadership wanted to go even deeper into the data, to create a bond between the public and the issue of marine litter. To achieve this connection, the organization envisioned a digital avatar as the information go-to. And so, a digital human called Sam was born.

“Sam can emotionally connect with users because he’s actually responsive,” explains Richard Darden, Distinguished Engineer and Digital Human Advocate at IBM.

Sam’s emotive responses derive from IBM Watson® Assistant, which works with IBM Watson Speech to Text technology. These programs can interpret the intent of a user and then compose Sam’s reply by drawing on UNEP’s vast repository and other sources.

That information is then filtered through a lifelike avatar built by Soul Machines, a San Francisco-based company that makes what it calls “digital people.”

Proof of concept to production

By harnessing technology to battle plastic pollution, IBM demonstrated to the United Nations Environment Assembly its commitment to preserving the environment, emphasizing that AI can be a vital tool for measuring future progress and can directly influence policy on marine plastic interventions aimed at building a sustainable marine ecosystem. UNEP is now turning its attention to making data collection easier and more impactful.

The Wilson Center is exploring ways to use citizen science in UNEP’s reporting beyond beach cleanups, including with more sophisticated mobile apps featuring object detection and classification, says Dr. Anne Bowser, Director of Innovation at the Wilson Center and project lead.

Based on the early success of her collaboration with IBM, Bowser thinks more UNEP goals could benefit from empowering citizen scientists with AI. “AI is a powerful ally for citizen science that can help local to global communities,” Bowser says. “We are just taking the first steps toward realizing this potential.”

About the Wilson Center

The Wilson Center, chartered by the US Congress in 1968 as the official memorial to President Woodrow Wilson, is the nation’s key nonpartisan policy forum for tackling global issues through independent research and open dialogue to inform actionable ideas for the policy community. The organization helped launch Earth Challenge 2020, a platform for increasing the amount of open and interoperable citizen science data, with a mobile app to engage the public.

The United Nations Environment Programme (UNEP)

UNEP is the leading global environmental authority that sets the global environmental agenda and promotes the coherent implementation of the environmental dimension of sustainable development within the United Nations system. Formed in 1972, it serves as an authoritative advocate for the global environment. For more information, visit: https://www.unep.org.

© Copyright IBM Corporation 2021. IBM Corporation, Global Business Services, New Orchard Road, Armonk, NY 10504

Produced in the United States of America, July 2021.

Global Business Services, IBM Cloud Pak, and IBM Watson are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at ibm.com/trademark.

This document is current as of the initial date of publication and may be changed by IBM at any time. IBM Business Partners set their own prices, which may vary. Not all offerings are available in every country in which IBM operates.

The performance data and client examples cited are presented for illustrative purposes only. Actual performance results may vary depending on specific configurations and operating conditions. THE INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT. IBM products are warranted according to the terms and conditions of the agreements under which they are provided.