Tapping Machine Learning to Foster Greater Use of Biomimicry for Innovation

In an example of biomimicry, cicada wings could inspire bacteria-resistant materials.

Knowledge transfer across domains leads to significant breakthroughs in science and technology. For example, through biomimicry, innovators draw inspiration from nature and biology to solve complex engineering problems. An exciting example of biomimicry is the recent creation of artificial materials that imitate the surfaces of cicada wings and gecko skin, which have antibacterial properties due to their physical structure. Materials of this type could be used in hospitals for surfaces that are easily contaminated with bacteria, helping to drastically reduce the number of hospital infections, a leading cause of health complications during hospitalization.

Biomimicry inventions and discoveries are usually highly creative and efficient. However, they happen through serendipity: knowledge transfer between biology and engineering is not straightforward, since the two fields are studied in isolation from each other. There are no systematic ways to incorporate ideas from nature and biology into the design process of engineering solutions.

A knowledge base of biology goals and mechanisms, together with an “intelligent” tool to navigate them and map them to engineering problems, would take serendipity out of the loop and provide a systematic way of connecting engineering challenges to biological inspiration. Last summer we partnered with the Biomimicry Institute as part of the IBM Science for Social Good program to address the data curation and organization needs of such a bioinspired design toolkit using machine learning. Our team consisted of a graduate student from Georgia Tech, Yuanshuo David Zhao, along with several researchers from IBM Research.

We developed a system that uses machine learning techniques to assess whether a scientific article could potentially serve as inspiration for a biomimicry invention. One of the most challenging aspects of our work is collecting the appropriate data to train the machine learning algorithms for this task. As such, we developed a crowdsourcing application based on serverless technologies, which allowed us to collect data for the scientific articles that serve as the source of biomimicry-relevant text documents. The serverless application was implemented using IBM Cloud Functions. Due to the generous free tiers for this service, our application ran for free. Instructions on how to build such an application were presented in a tutorial at IEEE DSAA’17. Tutorial materials can be found here.
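To make the serverless idea concrete, here is a minimal sketch of what one crowdsourcing endpoint could look like as a Python action. IBM Cloud Functions is based on Apache OpenWhisk, where a Python action is a `main` function that receives a dictionary of parameters and returns a dictionary. The field names (`article_id`, `category`, `phrases`) and the validation rules are assumptions made for illustration, not the actual application's schema, and the storage step is left out.

```python
# Hypothetical annotation-submission action; field names are assumptions.
REQUIRED_FIELDS = ("article_id", "category", "phrases")


def main(params):
    """Validate one crowdsourced annotation and return a cleaned record."""
    missing = [f for f in REQUIRED_FIELDS if f not in params]
    if missing:
        return {"error": "missing fields: " + ", ".join(missing)}

    raw_phrases = params["phrases"]
    if not isinstance(raw_phrases, list):
        return {"error": "phrases must be a list of strings"}

    # Keep only non-empty text selections, trimmed of surrounding whitespace.
    phrases = [p.strip() for p in raw_phrases if isinstance(p, str) and p.strip()]
    if not phrases:
        return {"error": "at least one non-empty phrase is required"}

    record = {
        "article_id": str(params["article_id"]),
        "category": str(params["category"]).strip(),
        "phrases": phrases,
    }
    # A real action would persist the record here (for example, in a database).
    return {"ok": True, "record": record}
```

Because each submission is handled by a short, stateless function, a design like this only consumes resources while annotators are actively submitting.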

By consulting biology experts and biomimicry enthusiasts, we learned that for a biology paper to be a potential source of inspiration for a biomimicry solution, it needs to talk about a living organism, and, more specifically, it needs to describe a function that the organism performs and the mechanism through which the function is realized.

Using the collected data, we devised several classifiers that show promising accuracy. We first classify the articles as relevant or irrelevant, then we classify the relevant articles into function classes defined by the Biomimicry Taxonomy, a classification scheme developed by the Biomimicry Institute to organize biological content. The classification algorithms we used include traditional, off-the-shelf methods (e.g., Random Forest and Naïve Bayes) and Convolutional Neural Network-based architectures. The best accuracy achieved was 77 percent for relevance classification and 47 percent for 9-class classification using the Biomimicry Taxonomy.
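The two-stage setup described above can be sketched in a few lines of Python. This is an illustrative toy, not the system from the paper: it uses a tiny hand-written multinomial Naive Bayes with add-one smoothing, and the training sentences and the function labels (`protect`, `move`) are made-up placeholders rather than real data or real Biomimicry Taxonomy classes.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [w for w in words if w]


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def classify_article(text, relevance_model, function_model):
    """Stage 1: relevance filter. Stage 2: function class for relevant text."""
    if relevance_model.predict(text) != "relevant":
        return ("irrelevant", None)
    return ("relevant", function_model.predict(text))


# Toy training data (invented for illustration only).
bio_texts = [
    "gecko skin nanostructure kills bacteria on contact",
    "cicada wing surface ruptures bacterial cells",
    "bird wing shape reduces drag during flight",
    "fish fin motion improves swimming efficiency",
]
other_texts = [
    "quarterly revenue grew in the retail sector",
    "new compiler release improves build times",
]
relevance_model = NaiveBayes().fit(
    bio_texts + other_texts, ["relevant"] * 4 + ["irrelevant"] * 2
)
function_model = NaiveBayes().fit(bio_texts, ["protect", "protect", "move", "move"])

print(classify_article("shark skin surface resists bacteria", relevance_model, function_model))
print(classify_article("retail revenue grew", relevance_model, function_model))
```

Training the function classifier only on relevant articles mirrors the cascade in the text: the second model never has to learn what an irrelevant article looks like.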

We also extract important words and sentences using several post-hoc analyses to develop a biomimicry-relevant index for the articles in the curated database. These keywords and phrases influence the rank of the articles retrieved in response to user queries.
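One simple way to picture how indexed keywords could influence ranking is to score each article by the summed weight of its keywords that appear in the query. The sketch below assumes a hypothetical index that maps article ids to keyword weights; the ids, keywords, weights, and the scoring rule itself are illustrative assumptions, not the method used in our system.

```python
def rank_articles(query, index):
    """Return article ids ordered by summed weight of keywords matching the query."""
    terms = set(query.lower().split())
    scores = {}
    for article_id, keywords in index.items():
        score = sum(weight for kw, weight in keywords.items() if kw in terms)
        if score > 0:  # drop articles with no matching keyword
            scores[article_id] = score
    # Highest score first; ties broken alphabetically by article id.
    return sorted(scores, key=lambda a: (-scores[a], a))


# Hypothetical index: article id -> {keyword: importance weight}.
index = {
    "cicada-wing": {"bacteria": 0.9, "nanopillar": 0.7, "wing": 0.4},
    "gecko-skin": {"bacteria": 0.6, "skin": 0.8},
    "bird-flight": {"wing": 0.9, "drag": 0.7},
}

print(rank_articles("wing bacteria", index))
```

In this toy index, an article that matches both query terms outranks articles that match only one, even when a single-term match carries a high weight.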

Our machine learning-based system is currently most beneficial in scaling up the rate of biomimicry-relevant content generation and high-level organization. With the inclusion of more supervised data, the next step would be to create a summarized snippet of each relevant article that highlights the key information: organism, function, and mechanism. This would enable more sophisticated user queries and allow filtering of results based on functions and mechanisms. While these are only the first few steps towards the automatic discovery of relevant biomimicry resources, they lay the foundation for a scalable system that bridges the domains of biology and engineering to foster more innovations inspired by nature.

Our paper on this work, Data Driven Techniques for Organizing Scientific Articles Relevant to Biomimicry, will be presented at the ACM/AAAI Artificial Intelligence, Ethics and Society (AIES) conference, to be held on February 1-3, 2018, in New Orleans.

The user interface for the crowdsourcing service encourages users to select phrases from the source text that are relevant to the chosen biomimicry category.
