Question answering (QA) is a core problem in AI. Many interactive tasks involving natural language can be modeled as QA problems. For example, a typical task in the finance domain is to answer questions like "How many investors purchased restricted stock within a quarter of opening their account?"
In recent years, deep learning has produced significant advances in QA performance. Learning-based methods are advantageous because they reduce the reliance on manual rule-based methods, which do not scale. Instead, deep learning produces neural representations of both the query and its transformation into the answer. While deep learning provides a new path for QA, such neural models are still prone to making mistakes, especially as queries become more complex. Furthermore, the reasoning by which deep learning arrives at answers is not always explainable to users.
A more intuitive way of approaching a complex task like QA is to use deep learning to deconstruct the task into a sequence of simpler steps, much the way that people do. Motivated by this direction, recent research has created a new learning paradigm called Neural Program Induction. Using this technique, an AI model can be taught to procedurally decompose a complex task into a program, i.e., a sequence of atomic actions that, upon execution, leads to the answer. To date, research on Neural Program Induction has been limited to simpler tasks like sorting, addition of numbers, or simple multi-hop QA over small database tables or knowledge bases. More importantly, it often includes an impractical assumption that the ground-truth program can be provided as supervision during training by an oracle.
Considering these limitations, we made the first attempt at training a model, Complex Imperative Program Induction from Terminal Rewards (CIPITR), to answer complex questions without these constraints. First, the questions can be complex, sometimes requiring 7 to 12 steps of different types of reasoning: logical, comparative, or quantitative. Second, the questions must be answered over a large-scale knowledge base of more than 50 million facts. Third, and most importantly, the model is trained with only question-answer pairs as supervision and learns to induce a program without needing the oracle program during training.
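Training on question-answer pairs alone means the model receives a single terminal reward after executing its induced program: how well the program's answer matches the gold answer. The sketch below illustrates the idea with an F1-style overlap reward; the function name and the choice of F1 are illustrative assumptions, not necessarily the paper's exact formulation.

```python
def terminal_reward(predicted, gold):
    """Reward computed only from the final answer, never from the
    program itself. F1 overlap between predicted and gold answer
    sets (an illustrative choice; a 1/0 exact-match reward is an
    equally valid alternative)."""
    predicted, gold = set(predicted), set(gold)
    if not predicted or not gold:
        # Both empty counts as a perfect match; otherwise no reward.
        return 1.0 if predicted == gold else 0.0
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

# A perfect answer earns full reward; a partial one earns partial credit.
print(terminal_reward({"Brahmaputra", "Sutlej"}, {"Brahmaputra", "Sutlej"}))  # 1.0
print(terminal_reward({"Brahmaputra"}, {"Brahmaputra", "Sutlej"}))            # ≈ 0.667
```

Because the reward says nothing about *which* of the program's steps were wrong, the model must learn credit assignment across the whole action sequence from this one scalar signal.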
For a complex question like "How many investors purchased restricted stock within a quarter of opening their account?" the model has to first understand that the query refers to concepts like "investors," "restricted stock," and "quarter," as well as relations between concepts like "purchased" and "opening." It must then map these concepts to facts in the financial knowledge base. Even when this is done perfectly, the number of possible induced programs can be huge (~10^19), out of which only a few lead to the correct answer. For example, CIPITR has to first find out which investors opened an account, and then which of them purchased restricted stock. Next, it has to find the times of the purchase and of the account opening, apply the time constraint, and finally count the number of such investors. Only if this decomposition is correct does the model reach the correct answer and get a positive reward. A similar example from the geography domain is illustrated in the figure by the question "How many rivers originate in China but flow through India?" Here the model needs to find the intersection of the set of rivers that originate in China and the set that flow through India, and then compute the size of that intersection.
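The induced program for the rivers question can be sketched as a short sequence of atomic actions over the knowledge base. In this sketch the operator names (`gen_set`, `set_intersection`, `set_count`) and the toy triple store are illustrative stand-ins, not CIPITR's actual action vocabulary or the 50-million-fact KB.

```python
# Toy knowledge base of (subject, relation, object) facts standing in
# for the full large-scale KB.
KB = {
    ("Brahmaputra", "originates_in", "China"),
    ("Brahmaputra", "flows_through", "India"),
    ("Sutlej", "originates_in", "China"),
    ("Sutlej", "flows_through", "India"),
    ("Yangtze", "originates_in", "China"),
    ("Ganges", "flows_through", "India"),
}

# Atomic actions the induced program is composed of (illustrative names).
def gen_set(relation, obj):
    """All entities e such that (e, relation, obj) is a KB fact."""
    return {s for (s, r, o) in KB if r == relation and o == obj}

def set_intersection(a, b):
    return a & b

def set_count(s):
    return len(s)

# Induced program for "How many rivers originate in China but flow
# through India?": two set generations, an intersection, and a count.
origin_china = gen_set("originates_in", "China")
through_india = gen_set("flows_through", "India")
answer = set_count(set_intersection(origin_china, through_india))
print(answer)  # → 2 (Brahmaputra and Sutlej)
```

Only this particular composition of actions produces the right answer; swapping the intersection for a union, or counting the wrong set, yields a program that executes without error but earns no reward.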
In order to pragmatically search this combinatorial space of possible programs, the neural model has to incorporate generic programming rules, just as human programmers do, as symbolic constraints on its search. Along with that, it should also be able to incorporate task- or schema-level constraints to ensure that the generated programs are consistent with the background knowledge base and can produce a meaningful answer. Marrying symbolic AI with neural reasoning in this way exploits the best of both worlds: the representational power of neural models, with symbolic rules keeping them from making simple mistakes or producing meaningless answers.
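One way to picture such symbolic constraints is as a type check applied before an operator may be added to the partial program: the decoder can only sample operators whose argument types are already available. This is a simplified sketch under assumed operator names and types, not CIPITR's actual grammar.

```python
# Each operator declares the argument types it consumes and the type
# it produces. (Operator names and types are illustrative.)
OPERATOR_SIGNATURES = {
    "gen_set":          (("relation", "entity"), "set"),
    "set_intersection": (("set", "set"), "set"),
    "set_union":        (("set", "set"), "set"),
    "set_count":        (("set",), "int"),
}

def feasible_operators(available_types):
    """Operators whose every argument type is currently available.
    Anything else is pruned from the search before it is ever tried."""
    return [
        op for op, (args, _ret) in OPERATOR_SIGNATURES.items()
        if all(t in available_types for t in args)
    ]

# At the start only the question's entities and relations are available,
# so all set-level operators are pruned:
print(feasible_operators({"relation", "entity"}))  # only gen_set
# Once sets have been produced, intersection/union/count become legal:
print(feasible_operators({"set"}))
```

Pruning infeasible operators at every decoding step shrinks the effective search space dramatically, which is what makes searching a ~10^19-program space with only terminal rewards tractable.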
Overcoming these challenges, CIPITR scored at least three times higher than competing systems on moderately complex queries requiring two- to five-step programs. And on one of the hardest classes of queries (comparative reasoning), with five- to ten-step programs, CIPITR outperformed state-of-the-art neural program induction models by a factor of 89.
These results make CIPITR one of the first successes in widening the applicability of complex program induction to real-world settings by eliminating the need for gold-program supervision during training. It also shows significant potential for pushing the frontiers of complex reasoning tasks in various settings, like answering questions or conversing strategically over multiple modalities (text, images, or videos) in a more human-like fashion.
This research, "Complex Program Induction for Querying Knowledge Bases in the Absence of Gold Programs," was published by MIT Press Journals and presented at the ACL 2019 conference in Florence, Italy.