December 14, 2017 | Written by: Tim Hwang
Categorized: New Thinking
I’m awesome at commuting. Take your pick: subway, bus, or train—I can navigate a complex fare system, bob-and-weave through a difficult transfer, and make it to my destination on time even after leaving way later than I should.
But, inevitably, when any of those vehicles break down, I’m stuck. It’d be crazy to put me in charge of a public transportation system, and even crazier to make me a repair mechanic. I don’t know the first thing about how a bus or a bus system actually works, and wouldn’t even know where to start to fix them.
It turns out that there’s a parallel in the world of artificial intelligence. On one hand, we can talk about how good or bad an AI system is at accomplishing a particular task, and we have gotten pretty good at using AI to solve problems. But understanding how and why a system makes the decisions it does is another matter entirely. This level of understanding is often referred to by machine learning researchers as interpretability.
Interpretability is an active field of research. However, unlike a subway car or a bus, what is tricky about the most powerful and cutting-edge AI systems we have is that there currently are limits to how interpretable they can be. We can’t just pop the hood of a machine learning model and easily identify precisely where something went wrong. We don’t even have a standardized version of one of those vague, ominous “check engine” lights on a dashboard that’d signal when things are going wrong.
AI technology progresses independently of our ability to understand why AI makes the decisions that it does. Neural networks massively improved the ability of computers to recognize objects and understand images, but the techniques to pick apart specifically how they do so are a more recent development. When people and the press tout the potential benefits of AI, they often forget to discuss this more subtle point (though that’s changing).
A failure to focus on and invest in expanding our ability to interpret AI will limit AI’s potential to live up to the hype. A limited understanding of how an AI system makes a decision not only means that it can be difficult to fix a system when it breaks, but it can also be difficult to know what a system will do when it is deployed “in the wild.” That’s critical, given that AI systems can often learn to solve problems in ways that are quite counterintuitive to how we might approach the same situation.
Even worse, in a hostile environment where people might attempt to break or manipulate these systems, it can be difficult to find and assess vulnerabilities. Ensuring that a self-driving car isn’t fooled by inputs designed to make it think that an innocuous printout of a dog is actually a stop sign, for instance, depends on a fuller understanding of the inner workings of these systems.
In short, lack of interpretability raises the level of risk. Would a hospital integrate a system that suggests medical diagnoses without being able to assess how it arrives at a particular diagnosis? Would a city buy a machine learning solution that helps to route traffic without knowing how it’d perform when targeted by malicious actors? The level of interpretability that we can actually achieve will determine how fast the latest and greatest machine learning breakthroughs will be adopted, and how autonomous we can expect them to be. From a big picture perspective, it will also influence the economic impact we can expect from AI.
It’s true that we don’t demand high levels of interpretability in all the technologies that we rely on. From microwaves and computers to cars and smartphones, we happily use devices daily where we only have a dim idea of their internal processes. That has been the case to date with AI, too. If you are using a machine learning system to quickly find the cutest cat image, not understanding all the specifics of how it does that might not be a big priority, so long as it works. But what about using the same machine learning system to help a self-driving car recognize and avoid a stray cat on a road?
As we deploy AI to solve higher-stakes problems, the necessity of good interpretability will grow, but that’s not to say it isn’t crucial now. Whether or not the public trusts AI in general hinges on the early-stage interpretability of our systems.
One wonky point: it’s important to recognize that not all AI is created equal. There are flavors of AI that are more or less interpretable. For example, decision tree learning can produce a series of binary decisions chained together in a sequence, allowing an easier examination of why a given output emerged from a given input. In contrast, methods such as neural networks don’t follow such an orderly pattern, making it difficult to assess their decision-making process through an orderly look at each “step” in a sequence.
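That contrast can be made concrete with a toy example. The sketch below is a hand-rolled decision tree in Python; the loan-approval features and thresholds are invented purely for illustration, not drawn from any real system. The point is that every prediction comes with the exact chain of binary decisions that produced it, the kind of step-by-step audit trail that a large neural network doesn’t naturally offer.

```python
# A toy decision tree for loan screening. All feature names and
# thresholds here are hypothetical, chosen only to illustrate how a
# tree's output can be traced decision by decision.

def predict_with_trace(features):
    """Classify an applicant and return (decision, list of reasons)."""
    trace = []
    if features["income"] >= 50000:
        trace.append("income >= 50000: yes")
        if features["debt_ratio"] < 0.4:
            trace.append("debt_ratio < 0.4: yes")
            return "approve", trace
        trace.append("debt_ratio < 0.4: no")
        return "review", trace
    trace.append("income >= 50000: no")
    if features["years_employed"] >= 5:
        trace.append("years_employed >= 5: yes")
        return "review", trace
    trace.append("years_employed >= 5: no")
    return "deny", trace

decision, trace = predict_with_trace(
    {"income": 60000, "debt_ratio": 0.25, "years_employed": 3}
)
print(decision)      # approve
for step in trace:   # each step is a human-readable reason
    print(" ", step)
```

A hospital or bank could hand that trace to a regulator or a customer. A deep neural network making the same call would instead distribute its “reasoning” across millions of weights, with no comparably crisp path to point to.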
So, if it’s needed, we can currently architect AI systems that provide the interpretability we want. But, traditionally, the painful trade-off here has been that performance often comes at the expense of interpretability. Simpler machine learning systems have fewer factors determining their decisions and are easier to understand, but they are also generally weaker at solving problems. Conversely, more powerful machine learning systems are better at prediction, but can take into account a massive number of factors that make it more difficult to come up with a clean understanding of why they do what they do. So, ironically, making an AI system more interpretable has generally meant accepting that it will make more mistakes. That’s a dilemma, particularly when the costs of an AI system going wrong are high.
Interpretability isn’t just about solving a technical problem. It is also fundamentally a human challenge. Just as the readings on the control room dashboard of a nuclear power plant would be deeply opaque to the casual bystander but totally understandable to a trained expert, getting interpretability right depends on the audience. That ultimately requires a deep understanding of the social expectations and needs surrounding AI systems, and bridging what is technically feasible with what is richly communicative.
In the past, some technologists have worried we may never be able to break through this tension—that some aspect of the latest generation of neural networks is fundamentally opaque. But, there have been some recent developments that suggest that we might be breaking through that trade-off. Visualization has been a powerful tool for understanding what is going on inside neural networks, and there’s a range of other approaches that are showing progress. There’s still a great deal of work to be done though, in determining what exactly “interpretability” requires and making sure these approaches are robust.
Still, that’s good news: it suggests that there isn’t anything inherently uninterpretable about neural networks; it’s just early days. As machine learning is increasingly applied to a whole range of problems, interpretability will be a linchpin feature that could make or break a product, and define the risks that society bears as these systems go live.