Trustworthy AI is a tool we can all use

Many people find the whole idea of artificial intelligence unnerving. Primed by science fiction movies, they fear power-grabbing robots and computers that act with malicious intent.

Dr. Pin-Yu Chen, a researcher with IBM’s Trusted AI Group, believes that view of AI mischaracterizes a technology that holds tremendous potential to benefit society.

In reality, AI is probably a part of your daily life already, from the online chatbot that helps you with customer service issues to the Roomba that vacuums your floors.

“Some people are afraid that AI will just do everything and people will lie down and do nothing,” said Chen, “but my view is that the best way to use AI is in a collaboration mode, where we make the most of our two different roles, human vs. machine.

“In the movie ‘Iron Man,’ Jarvis is the AI system, and when Iron Man wants to go someplace, he will say, ‘Jarvis, show me the shortest path from place A to place B,’ or ‘Jarvis, please analyze the environment to find the most suitable material that satisfies the constraints I specified.’

“Machines are very good in terms of computation and search in the large space, in doing complicated computations. But humans are better in terms of instinct and creativity, and in knowing what things to solve. I think ‘Iron Man’ shows a good way to picture how humans and machines can work together. Basically, we take the best of those two parts — analysis and search in the machine, and creativity and design ability in humans — and by combining those strengths, we can achieve something greater.”

Chen said the key is recognizing that AI itself isn’t good or bad — it’s simply a new kind of technology, one with enormous potential. And as with any new technology, the outcome of that potential depends on how it is designed and used.

Pin-Yu Chen portrait with cherry blossoms

Chen credits IBM’s approach to research with the pace of progress he and peers have made

Capabilities and limitations

“I think of AI like a car,” said Chen. “Everybody wants to drive a car, but it’s our job as a society to make sure people have the training to understand how a car works and how to drive responsibly. A car is made to drive on the road, not on the ocean. As a driver, I have to understand that. It’s the same with AI.

“Because AI is so new and everybody has such high expectations of it, they think AI can do everything and solve every problem, but that’s not true. We have to understand not only the capabilities, but the limitations of AI.”

Chen, a Taiwanese native, is one of the leading global authorities on the subject. In addition to his position with IBM Research, he is also chief scientist of the Rensselaer Polytechnic Institute (RPI)-IBM AI Research Collaboration. He’s published more than 40 papers on trustworthy machine learning at major AI and machine learning conferences, and given tutorials at many of those same conferences.

When Chen came to IBM from the University of Michigan in 2016, his plan was to continue his work in graph data analytics. He credits a “random” meeting early in his time at IBM with changing the trajectory of his career.

“Shortly after I joined this team, we started to have discussions about trustworthy AI,” he said. “At one of those meetings, my colleague Amit Dhurandhar, who was working on explainable AI, invited Nicholas Carlini, at the time a student at Berkeley and now a researcher at Google Brain, to give a talk to our group about something called ‘adversarial robustness,’ or how well a machine learning model can hold up in less-than-ideal circumstances.”

Constructing the kind of machine learning model needed to operate something as complex as a self-driving car requires “deep learning,” an algorithm-driven technique where a computer learns in the same way people learn — by example. Using an electronic neural network constructed to mimic the human brain, the computer has the ability to analyze data input, draw inferences and form conclusions.
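To make “learning by example” concrete, here is a minimal sketch of a tiny neural network and a single training step, written in PyTorch. The network, layer sizes, and data below are placeholders invented for illustration; they are not taken from any system described in this article.

```python
# Illustrative only: a tiny feed-forward network that "learns by example."
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(28 * 28, 128),  # input layer: a flattened 28x28 image
    nn.ReLU(),                # non-linear "neurons"
    nn.Linear(128, 10),       # output layer: a score for each of 10 classes
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# One training step: show the network some examples, measure its error,
# and nudge the weights so the same mistakes become less likely.
images = torch.randn(32, 28 * 28)      # a batch of placeholder "images"
labels = torch.randint(0, 10, (32,))   # the right answer for each one
loss = loss_fn(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Repeat that step over many real examples and the network gradually encodes the patterns that separate one class from another, which is the sense in which it draws inferences from data.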

Because AI is essentially making its own decisions, it has to be explainable in order to be trustworthy. In other words, you have to be able to track the process by which it reaches any decision. It also has to be robust — able to make those trustworthy decisions in less-than-ideal circumstances.

For instance, a model trained for a self-driving car can learn to recognize and respond appropriately to a stop sign. That’s great, in the lab. But the real test comes on the street, when the self-driving car comes across an imperfect stop sign: maybe one defaced with a sticker or some other damage that changes the sign’s appearance, even slightly.

“Nicholas was working on the problem of challenging AI by generating similar-looking examples that could make a machine learning model make wrong predictions,” said Chen. “Like a slightly modified dog image might be identified as a cat.”

“The original goal was to understand how and why cutting-edge models sometimes fail to give the right prediction, in order for us to improve their explainability. I got lots of inspiration from his talk and decided to work on this new area of adversarial robustness. Without that magical meeting I wouldn’t have been able to identify this interest so early.”
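To make the idea of an adversarial example concrete, here is a minimal sketch using the fast gradient sign method (FGSM), a standard textbook attack that adds a barely visible perturbation to an image in order to flip a model’s prediction. It is offered purely as an illustration of the concept, not as a description of Carlini’s or Chen’s specific methods.

```python
# Illustrative sketch of crafting an adversarial example with FGSM.
# `model`, `images`, and `labels` are assumed inputs: a trained classifier,
# a batch of images with pixel values in [0, 1], and their true class labels.
import torch
import torch.nn.functional as F

def fgsm_example(model, images, labels, epsilon=0.01):
    """Return copies of `images` nudged slightly in the direction that most
    increases the model's error, often enough to change its prediction."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # Each pixel moves by at most `epsilon`, so the change is hard to see.
    perturbed = images + epsilon * images.grad.sign()
    return perturbed.clamp(0, 1).detach()
```

The point is that a perturbation this small can leave the picture looking like the same dog to a person while pushing the model’s output toward “cat.”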

Making sure that deep learning models are reliable in real-world situations is absolutely critical, and a huge focus for scientists and organizations hoping to incorporate more smart machines into our daily lives.

Attacking those models while they’re still in the lab, by presenting them with unexpected obstacles, lets scientists see how the models respond, and fine-tune them so they can withstand the challenges of the real world.

Chen says one of his proudest accomplishments is the development of black-box optimization. “A black-box model means the model’s details are not transparent to the developer,” he explained. “So we can generate these wrong, or adversarial, examples to test the model without knowing the precise details of the model. I believe our team was one of the first in the world to use this technique. I’m very excited about that. It started as a paper, then a patent, and it has evolved into something that’s now being adopted as a core technology in our Watson services. It’s like watching your child grow.”
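The general idea behind testing a model you cannot see inside is often called zeroth-order, or derivative-free, optimization: you estimate how the model’s output changes by probing it with slightly perturbed inputs and comparing the results. The sketch below is a rough illustration of that idea under stated assumptions; `query_model` is a hypothetical function that returns a scalar loss for an input, and this is not the exact algorithm behind Chen’s papers or the Watson services mentioned above.

```python
# Illustrative sketch of black-box (zeroth-order) gradient estimation.
# `query_model` is a hypothetical function: given an input array, it returns a
# scalar loss. We never see weights or gradients, only inputs and outputs.
import numpy as np

def estimate_gradient(query_model, x, h=1e-3):
    """Approximate the gradient of query_model at x with finite differences."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = h
        # Probe the model on either side of x, one coordinate at a time.
        grad.flat[i] = (query_model(x + e) - query_model(x - e)) / (2 * h)
    return grad
```

With an estimated gradient like this, a tester can nudge an input toward an adversarial example using the same small steps as in a white-box attack, even though the model itself remains a black box.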

Two stop signs, one discolored and damaged

A self-driving car needs a high degree of adversarial robustness to recognize an imperfect stop sign

It’s not just the machines that need training

The stronger the adversarial robustness demonstrated by a deep learning model, the more reliable the AI function. But Chen notes that machines aren’t the only part of the equation that needs training.

“Every technology has a positive and negative side. As developers, we should always be responsible and take proactive action to prevent bad outcomes,” he said. “And as users, we need to understand what constitutes a good use of the technology versus a bad use of the technology.

“Look at cell phones. After 20 years or so, people kind of have a notion of how cell phones work. If you are deep under water, or far out in a rural area, you don’t panic if your phone doesn’t work. You understand that you are too far away from a cell phone tower to receive a signal.

“But with AI, because it’s so new, when people hear that this AI technology can do something with 99% accuracy, they assume the model knows everything and can solve every task, but that’s not true. From a robustness perspective, when we refer to 99% accuracy, we’re actually referring to performance using a test set that we design to test the model — it does not mean that the AI will be as great under any circumstances as it is in the test environment.”

Teaching users to have realistic expectations is critical. “For example, it’s common knowledge that you shouldn’t have your hands wet while you’re using any electronics,” he said. “In the same way, with an autonomous car, you still have to have your hands on the wheel, you still have to watch the road and make sure you can make a manual stop at any time. Once everybody has that basic common knowledge, we can create a healthy ecosystem, making sure the technology is being deployed and used in a safe and reliable manner.”

Coming soon? A tune-up for your AI

“The vision I have for my research is to develop something I call the AI model inspector,” said Chen. “So, for example, when you take your car for maintenance service, they will check a lot of things to make sure your car is in good condition, and return you a nice safe car to drive away in. I would like to have something like that for our AI model as well. You continuously update your model, and we have monitoring tools in place to make sure there are no errors or risks in the current state of your system. And if there are any issues, we could quickly find and fix them and make sure the model is used in a safe way.”

Chen credits IBM’s approach to research with the pace of progress he and fellow researchers have made. “In a lot of ways, the research environment here is more like academia. Many tech companies have an 80/20 rule, where even as a researcher, you are obligated to spend 80% of your time on products and just 20% doing research. For me here, the opposite is true.

“The strength of IBM is that it’s 110 years old. IBM has seen the ups and downs of technology, so when something new is evolving, like machine learning, I think IBM is well-positioned to quickly identify the problems that need to be solved, and the milestones necessary to make the technology thrive. Because of our history, at IBM we have this deeply rooted scientific spirit. We have a lot of Nobel Prize winners, and we have the patience to understand that technology takes time to mature.”

Though just 35, Chen finds his focus expanding to thinking about the next generation of AI researchers. In his role as chief scientist with the RPI-IBM collaboration, he supervises the students in the program. “I think I’m moving to a stage where I’m still going to be doing research myself, but also supervising other projects to make sure our work is having a bigger impact. Part of our AI horizon is to make sure not only that one project is successful, but that our AI research overall is successful. I like to cultivate the talent IBM needs by working with the students.”

Dr. Pin-Yu Chen, Dr. Chia-Yu Mu, Chia-Yi Hsu, and Yu-Lin Tsai

Left to right: Dr. Pin-Yu Chen, Dr. Chia-Yu Mu, Chia-Yi Hsu, and Yu-Lin Tsai, on a visit to New York. The team published a paper at NeurIPS 2021.