Answering Complex Questions using Neural Program Induction


Question answering (QA) is a core problem in AI. Many interactive tasks involving natural language can be modeled as QA problems. For example, a typical task in the finance domain is to answer questions like “How many investors purchased restricted stock within a quarter of opening their account?”

In recent years, deep learning has produced significant advances in QA performance. Learning-based methods are advantageous because they reduce the reliance on manual rule-based methods that do not scale. Instead, deep learning produces neural representations of both the query and its transformation to reach the answer. While deep learning provides a new path for QA, such neural models are still prone to making mistakes, especially as queries become more complex. Furthermore, the reasoning by which deep learning arrives at answers is not always explainable to users.

A more intuitive way of approaching a complex task like QA is to use deep learning to deconstruct the task into a sequence of simpler steps, much the way that people do. Motivated by this direction, recent research has created a new learning paradigm called Neural Program Induction. Using this technique, an AI model can be taught to procedurally decompose a complex task into a program, i.e., a sequence of atomic actions that, upon execution, leads to the answer. To date, research on Neural Program Induction has been limited to simpler tasks like sorting, addition of numbers, or simple multi-hop QA over small database tables or knowledge bases. More importantly, it often includes the impractical assumption that an oracle can provide the ground-truth program as supervision during training.

Considering these limitations, we made the first attempt at training a model, Complex Imperative Program Induction from Terminal Rewards (CIPITR), to answer complex questions without either of these constraints. First, the questions can be complex, sometimes requiring 7 to 12 steps of different types of reasoning. The reasoning can be logical, comparative, or based on quantitative analysis. Second, the questions need to be answered based on a large-scale knowledge base of over 50 million facts. Third, and most importantly, the model is trained with only question-answer pairs as supervision and learns to induce a program without needing the oracle program during training.
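To make "terminal reward" concrete, here is a minimal sketch of a reward that looks only at the final answer a program produces, never at the program itself. The function name `terminal_reward`, the use of Jaccard overlap for set-valued answers, and exact match for counts and booleans are assumptions made for illustration; they are not taken from the CIPITR implementation.

```python
from typing import Set, Union

# Hypothetical answer types: a count, a yes/no value, or a set of entities.
Answer = Union[int, bool, Set[str]]


def terminal_reward(predicted: Answer, gold: Answer) -> float:
    """Score a program only by its final answer (illustrative assumption)."""
    if isinstance(gold, (set, frozenset)):
        if not isinstance(predicted, (set, frozenset)):
            return 0.0
        if not gold and not predicted:
            return 1.0  # both empty: treat as a perfect match
        # Jaccard overlap gives partial credit for partially correct sets.
        return len(predicted & gold) / len(predicted | gold)
    # bool is a subclass of int in Python, so compare types explicitly
    # to avoid counting True as a correct answer for the integer 1.
    return 1.0 if type(predicted) is type(gold) and predicted == gold else 0.0


print(terminal_reward(2, 2))
print(terminal_reward({"Indus", "Mekong"}, {"Indus", "Brahmaputra"}))
```

Because the reward arrives only after the whole program has been executed, a model trained this way gets no signal about which individual step was wrong, which is part of what makes the setting hard.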

For a complex question like “How many investors purchased restricted stock within a quarter of opening their account?” the model first has to understand that the query is talking about concepts like “investors,” “restricted stock,” and “quarter,” as well as relations between concepts like “purchased” and “opening.” It then has to map these concepts and relations to facts in the financial knowledge base. Even when this is done perfectly, the number of possible programs can be huge (~10^19), of which only a few lead to the correct answer. For example, CIPITR has to first find out which investors opened an account, and then which of them purchased restricted stock. Next, it has to find the times of purchasing and of opening an account, apply the time constraint, and finally count the number of such investors. Only if this decomposition is correct does the model reach the correct answer and get a positive reward. A similar example from the geography domain is illustrated in the figure with the question “How many rivers originate in China but flow through India?” Here the model needs to find the intersection between the set of rivers that originate in China and the set of rivers that flow through India, and then find the size of this intersection.
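The river question can be written out as a short program of atomic actions. The sketch below is only illustrative: the toy knowledge base, the operator names (`gen_set`, `set_intersect`, `set_count`), and the `$`-prefixed variable convention are all assumptions, not CIPITR's actual operators or data.

```python
from typing import Callable, Dict, List, Set, Tuple

# Toy knowledge base of (subject, relation, object) triples.
# These facts are placeholders for illustration only.
KB: Set[Tuple[str, str, str]] = {
    ("Brahmaputra", "originates_in", "China"),
    ("Indus", "originates_in", "China"),
    ("Mekong", "originates_in", "China"),
    ("Ganges", "originates_in", "India"),
    ("Brahmaputra", "flows_through", "India"),
    ("Indus", "flows_through", "India"),
    ("Ganges", "flows_through", "India"),
    ("Mekong", "flows_through", "Laos"),
}


def gen_set(relation: str, obj: str) -> Set[str]:
    """All subjects s such that (s, relation, obj) is in the KB."""
    return {s for (s, r, o) in KB if r == relation and o == obj}


def set_intersect(a: Set[str], b: Set[str]) -> Set[str]:
    return a & b


def set_count(a: Set[str]) -> int:
    return len(a)


# Hypothetical operator table; a real system would have many more operators.
OPERATORS: Dict[str, Callable] = {
    "gen_set": gen_set,
    "set_intersect": set_intersect,
    "set_count": set_count,
}

# One step = (output variable, operator name, arguments).
Step = Tuple[str, str, Tuple[str, ...]]


def execute(program: List[Step]):
    """Run the steps in order; arguments starting with '$' read earlier results."""
    memory: Dict[str, object] = {}
    result = None
    for out_var, op_name, args in program:
        resolved = [memory[a[1:]] if a.startswith("$") else a for a in args]
        result = OPERATORS[op_name](*resolved)
        memory[out_var] = result
    return result


program: List[Step] = [
    ("A", "gen_set", ("originates_in", "China")),
    ("B", "gen_set", ("flows_through", "India")),
    ("C", "set_intersect", ("$A", "$B")),
    ("n", "set_count", ("$C",)),
]

print(execute(program))  # 2 for the toy facts above
```

Swapping any single step (for example, using a union instead of an intersection) changes the final count, which illustrates why only a few of the many possible programs earn a positive reward.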

Neural Program Induction

Figure 1 as shown in this paper: The steps of assembling a block switching. (a): Sub-models are trained individually. (b): The lower parts of sub-models are used to initialize parallel channels of block switching.

To search this combinatorial space of possible programs pragmatically, the neural model has to incorporate generic programming rules as symbolic constraints in its search, just as human programmers do. It should also be able to incorporate task- or schema-level constraints to ensure that the generated programs are consistent with the background knowledge base and can produce a meaningful answer. Marrying symbolic AI with neural reasoning in this way exploits the best of both worlds: it keeps the representational power of neural models while using symbolic rules to overcome their inherent tendency to make simple mistakes or produce meaningless answers.
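One simple kind of symbolic constraint is type checking: an operator may only be applied if variables of the types it needs have already been produced. The sketch below enumerates operator sequences by brute force and keeps only the well-typed ones. The operator names and signatures in `SIGNATURES` are hypothetical, and exhaustive enumeration stands in for the neural search purely to show how much such a rule prunes.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

# Hypothetical signatures: operator name -> (input types, output type).
# gen_set takes its arguments from the KB, so it needs no earlier variables.
SIGNATURES: Dict[str, Tuple[Tuple[str, ...], str]] = {
    "gen_set": ((), "set"),
    "set_intersect": (("set", "set"), "set"),
    "set_union": (("set", "set"), "set"),
    "set_count": (("set",), "int"),
    "greater_than": (("int", "int"), "bool"),
}


def is_well_typed(op_sequence: Sequence[str], answer_type: str) -> bool:
    """True if every step can find its inputs among earlier outputs
    and the last step produces the expected answer type."""
    available = []  # types of the variables produced so far
    for op in op_sequence:
        inputs, output = SIGNATURES[op]
        pool = list(available)
        for needed in inputs:
            if needed not in pool:
                return False
            pool.remove(needed)  # require a distinct variable per input
        available.append(output)
    return bool(op_sequence) and available[-1] == answer_type


def count_programs(length: int, answer_type: str) -> Tuple[int, int]:
    """Return (all operator sequences, well-typed operator sequences)."""
    sequences = list(product(SIGNATURES, repeat=length))
    valid = [s for s in sequences if is_well_typed(s, answer_type)]
    return len(sequences), len(valid)


total, valid = count_programs(4, "int")
print(f"{valid} of {total} four-step sequences are well-typed")
```

Schema-level constraints would prune further in the same spirit, for instance by rejecting a `gen_set` call whose relation never occurs with the given entity in the knowledge base.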

Overcoming these challenges, CIPITR scored at least three times higher than the competing systems on moderately complex queries requiring two- to five-step programs. On one of the hardest classes of queries (comparative reasoning), requiring five to ten steps, CIPITR outperformed state-of-the-art neural program induction models by a factor of 89.

These results make CIPITR one of the first successes in widening the applicability of complex program induction to real-world settings by eliminating the need for gold-program supervision during training. They also show significant potential for pushing the frontiers of complex reasoning tasks, such as answering questions or conversing strategically over multiple modalities (like text, images, or videos) in a more human-like fashion.

This research on Complex Program Induction for Querying Knowledge Bases in the Absence of Gold Programs has been published in MIT Press Journals, and was presented at the ACL 2019 Conference in Florence, Italy.


Software Engineer, IBM Research-India

Karthik Sankaranarayanan

Research Scientist, IBM Research-India
