Clarifying the Complex with a ‘Simpler Voice’

As smart as the phones in our pockets are, they’re not built to interpret the written word. Low-literate adults, of whom there are 757 million worldwide, including 32 million in the United States, instead rely on visual cues for recognition and comprehension. So, as part of IBM Research’s Science for Social Good initiative, our team in Austin, Texas, reached out to the Literacy Coalition of Central Texas with an idea for a different approach: use artificial intelligence to turn text into pictures and simple verbal messages.

The mobile app project, called Simpler Voice, can parse complex text from product descriptions, instruction manuals, and even public signage. It then extracts the key information and presents it as simplified images and short spoken messages. Today’s smartphone AI is good at straight text-to-speech and speech-to-text, but not so good at rendering a new way of explaining the world around us.

Simpler Voice weaves together IBM Watson natural language understanding and text-to-speech services with novel image generation code, known in AI research parlance as generative adversarial networks (GANs). These GANs provide alternative visualizations of what the smartphone is looking at.

What better place to pilot Simpler Voice than the grocery store?

Our team’s first test run with the Literacy Coalition’s students will examine a handful of common grocery store items, from shampoo bottles to canned goods. We’ll then have their students pilot the app at a local grocery store – with the ability to scan any product.

Low-literate adults often match images from television ads and newspaper coupons with what they see – and then buy – at the grocery store. This limitation excludes potentially healthier, more economical choices, or quite simply, something they might like better. Simpler Voice opens up the entire store to our students, and we hope soon, to the 19 percent of Texans who are low-literate, and beyond.

Take dishwasher detergent, for example. An LCCT student sees a box of dishwashing tablets next to the liquid he recognizes and usually buys. Curious, but unsure of what the tablets do, he scans the barcode. Simpler Voice reads the company’s product description, and begins to interpret key words and phrases to verbalize “dishwasher detergent” and provide a visualization of a person using a dishwasher. Simpler Voice can also illustrate instructions on where to put the tablet in a dishwasher. And because these tablets look like candy, it may also communicate a safety warning: “Not candy. Do not let children eat this.”
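
The flow in that example (read a product description, pick out key phrases, then produce a short spoken label, a picture prompt, and an optional warning) can be sketched roughly as below. This is only an illustrative sketch, not the Simpler Voice implementation: it uses plain keyword matching instead of IBM Watson services or a GAN, and the names PRODUCT_KEYWORDS, WARNING_CUES, SimpleMessage, and simplify are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword table. The real app relies on IBM Watson natural
# language understanding, which this sketch does not call.
PRODUCT_KEYWORDS = {
    "dishwasher": "dishwasher detergent",
    "shampoo": "shampoo",
    "pain reliever": "pain reliever",
}

# Hypothetical hazard cues mapped to short spoken warnings.
WARNING_CUES = {
    "keep out of reach of children": "Not candy. Do not let children eat this.",
}


@dataclass
class SimpleMessage:
    spoken_label: str        # short phrase to hand to a text-to-speech service
    image_prompt: str        # description to hand to an image generator
    warning: Optional[str]   # spoken safety warning, if a hazard cue was found


def simplify(description: str) -> Optional[SimpleMessage]:
    """Reduce a product description to a label, an image prompt, and a warning."""
    text = description.lower()
    # Take the first known product keyword that appears in the description.
    label = next((v for k, v in PRODUCT_KEYWORDS.items() if k in text), None)
    if label is None:
        return None  # nothing recognized; a real system would need a fallback
    warning = next((w for cue, w in WARNING_CUES.items() if cue in text), None)
    return SimpleMessage(label, f"a person using {label}", warning)


msg = simplify(
    "Powerful dishwasher tablets for sparkling dishes. "
    "Keep out of reach of children."
)
```

Here msg.spoken_label would be passed to a text-to-speech step and msg.image_prompt to an image generation step; both of those stages are outside this sketch.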

Simpler Voice may offer the biggest benefit in visualizing and explaining the fine print on over-the-counter medications. Can anyone read that two-point type? With a quick scan, the app can explain how many pain reliever pills an adult or child should take and how often, and warn of potential allergic reactions.

After piloting Simpler Voice in grocery stores around Austin, we hope to work with LCCT on expanding the app’s capability to legal documents, such as apartment rental agreements, and medical documents, like the pages of paperwork required at the doctor’s office.

Simpler Voice is one of IBM’s 15 global Science for Social Good projects launched this summer. Our team, including intern Minh Nguyen, from the University of Southern California, will work alongside LCCT to further develop and deploy the app for their students. Watch this space for news about Simpler Voice’s availability in your favorite app store.

Manager, Optimized Cloud Environments, IBM Research

Anne Gattiker

Principal Research Staff Member, IBM Research
