Expert-in-the-Loop Helps AI Transition from Classroom to Internship

Figure 1. Training curves for our expert-in-the-loop AI model. With just ~1 hour of training, we achieve near 100 percent precision (green line).

At IBM Research we understand that humans and AI can accomplish more together, and our team at IBM Research-Almaden is focused on creating ways for IBM’s consultants to work smarter with new AI technology assets. This approach is known as human-in-the-loop or expert-in-the-loop. Working closely with AI technologies on time-intensive tasks can save experts up to 80 percent of their time. That is the equivalent of having five times as many people on your team, a complete game-changer for your project or business.

We recently used a human-in-the-loop approach to train a neural network to identify adverse drug reactions in social media and reported the results in a paper titled “Recognizing Mentions of Adverse Drug Reaction in Social Media Using Knowledge-Infused Recurrent Models,” published by the Association for Computational Linguistics. The results showed that active interaction between the AI and the human expert strikingly reduced the time it took the AI to learn what an adverse reaction was (Figure 1).

Putting an expert in the loop

To derive the greatest benefit from working with AI systems, we realized it’s best to think of them like very junior colleagues rather than students. AI systems have evolved to the point that they are ready to step out of the classroom and start collaborating in the real world. They will make a fair number of mistakes but will also learn much more rapidly. When training and interacting with them, if we let them ask questions about where they are most confused and identify things they are very certain about, we can be much more productive.

What this means technically is that we want to select AI neural structures that offer a reasonable estimate of their confusion on an example. This can involve flatter neural structures (e.g., recurrent or convolutional neural networks) or GPU-driven solutions that can very rapidly perform a number of training splits to identify confusion. Once this confusion is identified, the particular example can be presented to a human for guidance. That guidance then becomes training data for the AI to learn from.
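One common way to turn "confusion" into a number is the entropy of the model's predicted probability: a prediction near 50 percent is maximally uncertain, while one near 0 or 100 percent is confident. The sketch below is only an illustration under that assumption, not the method from our paper; the function names and the example scores are hypothetical.

```python
import math

def binary_entropy(p):
    """Uncertainty (in bits) of a predicted positive-class probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully confident prediction carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def most_confusing(examples, predict_proba, k=1):
    """Return the k examples whose predictions are closest to a coin flip."""
    ranked = sorted(examples,
                    key=lambda ex: binary_entropy(predict_proba(ex)),
                    reverse=True)
    return ranked[:k]

# Hypothetical model scores for "is this phrase an adverse drug reaction?"
scores = {
    "rash after ibuprofen": 0.93,
    "surgery next week": 0.52,
    "love this weather": 0.04,
}
queries = most_confusing(list(scores), scores.get, k=1)
```

Here the phrase scored at 0.52 would be the one sent to the expert, since the model is least sure about it.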

Making the most of an expert’s time

When presenting examples of maximal uncertainty, it’s important that the AI system phrase them in a way a human expert can rapidly provide feedback. This is often informed by learnings from human-computer interactions. For example, scoring the presence or absence of a single concept is much faster for a human than scoring multiple aspects of a given sample due to the “task switch” cost for the human of trying to consider two things at once. In other words, it takes much longer for an expert to identify a drug and a disease than to identify just one.

Figure 2. Screen shot of the tool’s user interface. Here, the doctor is being asked if “surgery” is a side effect.

Next, we pay a lot of attention to the interface that the expert will use to score the information (Figure 2). What may seem like simple choices and optimizations can result in savings of 50 percent or more in the time it takes an expert to give feedback to the system. Just think about how much longer it takes to buy something on a poorly-thought-out online sales site versus a well-designed one. This includes design choices such as asking an expert “Did I highlight the drug in this sentence correctly?” (a binary classification that the expert can do with two keystrokes in a second or two) rather than “Please highlight the drug in this sentence.” (a task that requires using a mouse and will take at least 10-20 seconds). Using the former question provides slightly less information to the AI but enables the human to provide ten times as many scores in the same amount of time.
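To make the binary-question idea concrete, here is a minimal sketch of how a model-proposed highlight could be turned into a yes/no prompt and a single keystroke into a training label. The function names, the bracket notation, and the example sentence are assumptions made for illustration; they do not describe the actual tool shown in Figure 2.

```python
def confirmation_prompt(sentence, start, end, concept="drug"):
    """Turn a model-proposed span into a yes/no question for the expert."""
    marked = sentence[:start] + "[" + sentence[start:end] + "]" + sentence[end:]
    return f"Did I highlight the {concept} in this sentence correctly? (y/n)\n{marked}"

def record_answer(sentence, start, end, key):
    """Map a single keystroke to a training label for the proposed span."""
    if key not in ("y", "n"):
        raise ValueError("expected 'y' or 'n'")
    return {"sentence": sentence, "span": (start, end), "correct": key == "y"}

# Hypothetical example: the model proposes characters 5..14 as the drug name.
sentence = "Took ibuprofen and got a rash."
prompt = confirmation_prompt(sentence, 5, 14)
label = record_answer(sentence, 5, 14, "y")
```

The expert never has to select text; the answer is one key, and a "no" is still a usable (negative) training signal.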

Looping human input back to the AI system

In our paper, “Exploring the efficiency of batch active learning for human-in-the-loop relation extraction,” presented at the first international workshop on Augmenting Intelligence with Humans-in-the-Loop co-located with TheWebConf (HumL@WWW2018), we showed that the frequency at which the neural net should be retrained based on human input was much higher than we first thought. Basically, as soon as four or five new “facts” are known, it is worth retraining the network and figuring out where the new areas of confusion lie. This makes sense, as in previously undefined areas of the parameter space, even a few examples may be sufficient to reduce error below acceptable thresholds.
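The retrain-after-a-handful-of-labels loop can be sketched with a deliberately tiny stand-in for the neural network: a one-dimensional threshold classifier. Everything here is an assumption for illustration (the toy learner, the seed labels, the simulated expert, and the names `fit_threshold` and `active_learning`); only the batch size of five comes from the finding described above.

```python
def fit_threshold(labeled):
    """Toy 'retraining': put the boundary midway between the two classes.

    Assumes the labeled set contains at least one example of each class.
    """
    negatives = [x for x, is_positive in labeled if not is_positive]
    positives = [x for x, is_positive in labeled if is_positive]
    return (max(negatives) + min(positives)) / 2

def active_learning(pool, seeds, ask_expert, batch_size=5, rounds=4):
    """Query the most uncertain examples and retrain after every small batch."""
    labeled = list(seeds)
    seen = {x for x, _ in seeds}
    unlabeled = [x for x in pool if x not in seen]
    threshold = fit_threshold(labeled)
    for _ in range(rounds):
        # Examples nearest the current boundary are the most confusing ones.
        unlabeled.sort(key=lambda x: abs(x - threshold))
        batch, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        labeled += [(x, ask_expert(x)) for x in batch]
        threshold = fit_threshold(labeled)  # retrain right after the batch
    return threshold, len(labeled)

# Simulated expert whose true (hidden) boundary sits at 0.62.
pool = [i / 100 for i in range(100)]
seeds = [(0.0, False), (0.99, True)]
threshold, n_labels = active_learning(pool, seeds, lambda x: x >= 0.62)
```

Because the boundary is re-estimated after each batch, every new batch is drawn from where the model is currently confused rather than where it was confused at the start, so in this toy setting a couple dozen labels are enough to pin the boundary down.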

Bringing expert-in-the-loop to industry

We created a tool that can identify sections of financial and legal contracts that are likely to cause problems and that human experts should pay more attention to. In this case, AI plays the role of a clever intern, flagging pages where a lawyer may need to spend a little bit more of her time making absolutely sure it’s right. Using the tool can reduce the time it takes to review a complex contract from months down to days.

While AI can be trained to spot potential problems much more quickly than a human, it cannot necessarily understand nuances such as whether a contract clause is valid under GDPR requirements. So the new frontier in AI research is about creating a partnership where humans do what they’re good at and AI does what it’s good at. The more AI can assist human experts, the faster the process becomes and the better the results. Hopefully this approach also lets humans work on the parts of the problem that humans find fun and not the parts that humans find boring, repetitive, and repugnant. The more we can do that, the more humans can enjoy things that are uniquely human. That’s where our research will continue to focus.

Daniel Gruhl

Distinguished Research Staff Member, IBM Research