Building trustworthy AI for the planet’s habitat recovery
How the U.S. State Department’s Earth Challenge Mobile App uses IBM Cloud Pak® for Data to combat mass extinction
Connecting people with the planet
The stakes for our planet couldn’t be higher. A UN report shows that while the human population doubled in just the past 50 years, human activity has significantly altered about three-quarters of the land-based environment and about 66% of the marine environment. In other words, only about a quarter of the earth’s natural habitat remains as our predecessors knew it. And it’s getting worse every day, with plastics destroying rivers and oceans, chemicals decimating bee populations, and emissions fouling the air. Many scientists argue we are living through the sixth mass extinction in the history of our beloved pale blue dot.
However, an expanding human population need not be only a liability. We can seize the opportunity to connect people with the planet. Earth Challenge 2020 is a citizen science initiative created by the U.S. State Department, UNEP and the Wilson Center with the common goal of harnessing the hidden power of a versatile scientific instrument that everyone carries in their pocket – the mobile phone.
One of the most serious ecological problems the United States faces today is the decimation of the bee population, which declined 40% between 2018 and 2019. The more significant issue is that we can’t pinpoint why, where, or how this is happening, and thus, what action policymakers should take. What we do know is that data is the key to understanding the cause and igniting change. Images taken with mobile phones can provide vital research data for environmental issues like this. With the Earth Challenge 2020 Mobile App, anyone can submit pictures of not only bees, but also plastic pollution and air quality, providing the critical data that researchers, environmental agencies, nonprofits and government bodies need to formulate a meaningful strategy.
Garbage In – Garbage Out
The recent success of image recognition models relies on well-curated images and labels. Crowdsourced datasets, however, with their mixed quality and lack of ground truth, fail to deliver good results: they are poorly and inconsistently labeled and contain an unknown number of bad images.
How can one find reliable images for building AI models when neither the images nor the labels are reliable? Most organizations rely on human labelers to provide ground truth. But with thousands of images collected each day, manually sorting them becomes nearly impossible, and ingesting bad data results in poor AI models. Since both image and label data are independently crowdsourced from the public and unvetted by subject matter experts (SMEs) due to sheer volume, we had to separate truth from misinformation.
The IBM Data Science and AI Elite (DSE) team partnered with the U.S. State Department and the Wilson Center to teach machines to find reliable images in vast collections for AI purposes, in an unbiased manner and without any dependable labels or supervision.
The goal: increase the adoption of AI built on Earth Challenge data for meaningful policy impact.
Data Provenance for AI using AI
A first-of-its-kind AI Governance solution was vital for processing unstructured data like images. Using IBM Cloud Pak® for Data, we built a Machine Learning (ML) pipeline with five steps that successively distill “good” data for AI by finding authentic, usable and relevant images.
1. Cataloging and governing data sources
To ensure data governance, we utilized IBM Watson® Knowledge Catalog to:
- Catalog image metadata, glossary and further trace data lineage for transparency
- Provide capabilities to manage licenses and source attributions for data aggregators
- Automatically handle interoperability issues for metadata harmonization across multiple agency standards
- Combine images and metadata automatically using Watson APIs to find “Fit for Research” datasets
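As an illustration, metadata harmonization across agency standards can be sketched as a mapping from each source’s field names onto one canonical schema. This is a minimal sketch of the idea; the field names and aliases below are hypothetical and are not the actual Watson Knowledge Catalog configuration:

```python
# Hypothetical canonical schema: each canonical field lists the aliases
# it may appear under in different agency metadata standards.
CANONICAL_FIELDS = {
    "timestamp": ["observed_on", "date_taken", "capture_time"],
    "latitude": ["lat", "decimalLatitude", "gps_lat"],
    "longitude": ["lon", "decimalLongitude", "gps_lon"],
    "license": ["license", "usage_rights", "cc_license"],
}

def harmonize(record: dict) -> dict:
    """Map a source record's fields onto the canonical schema."""
    out = {}
    for canonical, aliases in CANONICAL_FIELDS.items():
        for alias in aliases:
            if alias in record:
                out[canonical] = record[alias]
                break  # first matching alias wins
    return out
```

A record using, say, Darwin Core-style names (`decimalLatitude`) and one using `gps_lat` would both harmonize to the same `latitude` field, which is what makes downstream “Fit for Research” filtering possible.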
2. Discovering authentic and high-quality images using Machine Learning
We addressed one image-quality issue: images not captured first-hand, such as a photo of a pre-existing image submitted to the app. Such fake image data skews model accuracy. To determine the authenticity of images, we used a wavelet-decomposition method with a Support Vector Machine (SVM) model to predict whether an image is real or fake. To determine image quality, we built an objective assessment model that performs image forensic analysis to find which images are “optimal.” Considering factors like brightness and blurriness, it outputs a “Data Usability Index” to rank images – the higher the index, the more optimal the image.
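The exact formula behind the Data Usability Index isn’t published; a minimal sketch of a brightness-and-sharpness score might look like the following, using the variance of a discrete Laplacian as a standard blurriness proxy. The weights and scaling constants here are assumptions for illustration:

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    """Blurriness proxy: variance of a discrete Laplacian (sharper images score higher)."""
    lap = (-4 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return float(lap.var())

def usability_index(gray: np.ndarray) -> float:
    """Combine exposure and sharpness into an illustrative 0-100 score."""
    brightness = gray.mean() / 255.0                         # 0..1
    exposure = 1.0 - 2.0 * abs(brightness - 0.5)             # best near mid-gray
    sharpness = min(laplacian_variance(gray) / 1000.0, 1.0)  # clip to 0..1
    return round(100.0 * (0.5 * exposure + 0.5 * sharpness), 1)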
3. Finding Trustworthy Labels
However, finding “real and good” images isn’t enough. For example, someone could submit a perfectly good-quality image of their dog, label it “honeybee” and thereby introduce bias into the AI model. We need to find images that are relevant in the given context while simultaneously ensuring that each has a reliable label for training an AI model.
Determining which images are relevant in a given setting is a complex and previously unsolved machine learning problem. For example, a plastic coffee cup on a desk may be an irrelevant image, but if the same coffee cup appears on a beach, it becomes relevant. Naturally, users do not tag irrelevant images as such, so no reliable label exists. To solve this, we separated the labels from their images and treated the two problems independently.
To first address the challenge of mislabeled objects, we created a user trust graph. Using Graph Theory, we built a social network of all labelers based on their activity and credibility history and calculated a Label Confidence score by identifying reliable labelers within a defined circle of trust, helping to reduce inconsistently labeled datasets.
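The trust-graph construction isn’t described in detail; a minimal sketch of the underlying idea – credibility as a labeler’s agreement with consensus, and label confidence as a credibility-weighted vote – might look like this (the scoring scheme and all names are illustrative, not the team’s actual algorithm):

```python
from collections import defaultdict

def label_confidence(votes):
    """
    votes: list of (labeler_id, image_id, label) tuples.
    Returns (credibility per labeler, (winning_label, confidence) per image).
    """
    by_image = defaultdict(list)
    for labeler, image, label in votes:
        by_image[image].append((labeler, label))

    # Majority label per image serves as a ground-truth proxy
    majority = {}
    for image, pairs in by_image.items():
        counts = defaultdict(int)
        for _, label in pairs:
            counts[label] += 1
        majority[image] = max(counts, key=counts.get)

    # Labeler credibility: rate of agreement with the majority
    agree, total = defaultdict(int), defaultdict(int)
    for labeler, image, label in votes:
        total[labeler] += 1
        agree[labeler] += (label == majority[image])
    credibility = {l: agree[l] / total[l] for l in total}

    # Confidence: credibility-weighted share of votes for the winning label
    confidence = {}
    for image, pairs in by_image.items():
        weight = defaultdict(float)
        for labeler, label in pairs:
            weight[label] += credibility[labeler]
        best = max(weight, key=weight.get)
        confidence[image] = (best, weight[best] / sum(weight.values()))
    return credibility, confidence
```

An image labeled “honeybee” by two consistently reliable labelers and “wasp” by one erratic labeler would keep the “honeybee” label with high confidence, while an image whose votes split among low-credibility labelers would score low and be held back from training.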
4. Finding Relevant Images using Self-Supervised Learning
Once images and labels were separated from the crowdsourced data, only 20% of our dataset retained validated labels. To address the remaining unlabeled images, we used Self-Supervised Contrastive Learning to learn true representations of images and find similar or dissimilar images. Contrastive Learning methods treat each image as its own class and work without any user-provided labels.
This method helped isolate the irrelevant images automatically. By combining a small labeled set (just 1% of the data) with the most confident model predictions, we generated new ML-produced labels for the rest of the unlabeled images.
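A simplified sketch of this pseudo-labeling step, assuming image embeddings have already been produced by a contrastive encoder: each unlabeled image inherits the label of its nearest labeled neighbor, kept only if the cosine similarity clears a confidence threshold. The threshold and data below are illustrative, not the actual pipeline parameters:

```python
import numpy as np

def pseudo_label(emb_unlabeled, emb_labeled, labels, threshold=0.8):
    """
    Assign pseudo-labels to unlabeled embeddings by cosine similarity to a
    small labeled set; keep only assignments above the confidence threshold.
    Returns a list of (unlabeled_index, label, similarity) tuples.
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    u, l = normalize(emb_unlabeled), normalize(emb_labeled)
    sims = u @ l.T              # cosine similarity matrix
    best = sims.argmax(axis=1)  # nearest labeled neighbor per unlabeled image
    out = []
    for i, j in enumerate(best):
        if sims[i, j] >= threshold:
            out.append((i, labels[j], float(sims[i, j])))
    return out
```

Embeddings close to a confidently labeled “bee” image become new “bee” training examples; ambiguous embeddings that sit between classes fall below the threshold and stay unlabeled rather than polluting the training set.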
5. Data and AI Factsheets for AI Governance
Data scientists and analysts expect a holistic view of the entire dataset to get an idea of its underlying fitness. Thus, we built a first-of-its-kind “Data Reliability Index” that combines authenticity, usability, label confidence and image relevance on a scale of 0-100, to help remove invalid image data automatically and track improvements along the way.
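A minimal sketch of how such an index might combine the four per-image scores follows; the equal weighting is an assumption for illustration, not the team’s actual formula:

```python
def data_reliability_index(authenticity, usability, label_confidence, relevance,
                           weights=(0.25, 0.25, 0.25, 0.25)):
    """
    Combine four per-image scores (each in [0, 1]) into a single 0-100 index.
    Equal weights are illustrative; real weights would be tuned with SMEs.
    """
    scores = (authenticity, usability, label_confidence, relevance)
    assert all(0.0 <= s <= 1.0 for s in scores), "scores must be in [0, 1]"
    return round(100.0 * sum(w * s for w, s in zip(weights, scores)), 1)
```

An image that is authentic and relevant but carries a shaky label would land in the middle of the range, letting a simple cutoff remove invalid data automatically while the dashboard tracks how the distribution improves over time.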
To assess overall dataset fitness, we expanded upon IBM’s research work on AI Factsheets to build a Data and AI Factsheet – a bit like a “nutrition label” for AI. We captured the facts from data engineers, scientists and analysts using Jupyter notebooks and made them available on a dashboard.
Tackling noisy, crowdsourced images and labels is still a challenge for the Machine Learning community. With our innovative approach, we can automatically detect reliable images among thousands – a task that is nearly impossible to complete manually. The IBM Data Science and AI Elite team’s 6-week proof-of-concept combined various dimensions of AI fitness with the latest research in machine learning to deliver a simple framework for managing AI Governance. This pilot further provides the U.S. State Department and UNEP with a vital toolkit to help define new relevant facts across various data and AI projects and drive maximum impact for the Earth Challenge initiative.