Getting AI to Reason: Using Neuro-Symbolic AI for Knowledge-Based Question Answering

Language is what makes us human. Asking questions is how we learn.

Building on the foundations of deep learning and symbolic AI, we have developed technology that can answer complex questions with minimal domain-specific training. Initial results are very encouraging – the system outperforms current state-of-the-art techniques on two prominent datasets with no need for specialized end-to-end training.

As this technology matures, it will be possible to use it for better customer support, business intelligence, medical informatics, advanced discovery, and much more.

There are two main innovations behind our results. First, we’ve developed a fundamentally new neuro-symbolic technique called Logical Neural Networks (LNN) where artificial neurons model a notion of weighted real-valued logic. By design, LNNs inherit key properties of both neural nets and symbolic logic and can be used with domain knowledge for reasoning.

Next, we’ve used LNNs to create a new system for knowledge-based question answering (KBQA), a task that requires reasoning to answer complex questions. Our system, called Neuro-Symbolic QA (NSQA), translates a given natural language question into a logical form and then uses our neuro-symbolic reasoner LNN to reason over a knowledge base to produce the answer.

NSQA achieves state-of-the-art accuracy on two prominent KBQA datasets without end-to-end dataset-specific training. Because its reasoning is explicit and formal, NSQA can also explain how it arrived at an answer by laying out each step of the reasoning.

What makes LNNs unique?

LNNs are a modification of today’s neural networks so that they become equivalent to a set of logic statements — yet they also retain the original learning capability of a neural network. Standard neurons are modified so that they precisely model operations in real-valued logic (where variables can take on values in a continuous range between 0 and 1, rather than just binary values of True or False). LNNs are able to model formal logical reasoning by applying a recursive neural computation of truth values that moves both forward and backward (whereas a standard neural network only moves forward). As a result, LNNs are capable of greater understandability, tolerance to incomplete knowledge, and full logical expressivity. Figure 1 illustrates the difference between typical neurons and logical neurons.

Figure 1. The difference between typical neurons and logical neurons
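To make the idea of a logical neuron concrete, here is a minimal sketch of a weighted real-valued conjunction and disjunction. It assumes a Łukasiewicz-style parameterization with a clamped linear activation; the function names, the default bias of 1.0 and the example weights are illustrative choices, not the exact formulation used in the LNN implementation.

```python
def clamp(value: float) -> float:
    """Keep a truth value inside the real-valued logic range [0, 1]."""
    return max(0.0, min(1.0, value))


def weighted_and(inputs, weights, bias=1.0):
    """Weighted Lukasiewicz-style AND (assumed form).

    Each input is penalized by how far it is from True (1 - x), scaled by its weight.
    With all weights and the bias equal to 1 this reduces to classical AND
    on the binary values 0 and 1.
    """
    return clamp(bias - sum(w * (1.0 - x) for x, w in zip(inputs, weights)))


def weighted_or(inputs, weights, bias=1.0):
    """Weighted Lukasiewicz-style OR (assumed form), the dual of weighted_and."""
    return clamp(1.0 - bias + sum(w * x for x, w in zip(inputs, weights)))


# Classical behaviour on binary inputs
print(weighted_and([1.0, 0.0], [1.0, 1.0]))  # 0.0
print(weighted_or([1.0, 0.0], [1.0, 1.0]))   # 1.0

# Real-valued inputs: partial truth propagates through the neuron
print(weighted_and([0.7, 0.8], [1.0, 1.0]))

# A smaller weight makes the second input matter less to the conjunction
print(weighted_and([1.0, 0.0], [1.0, 0.25]))  # 0.75
```

Because the neuron is an ordinary weighted sum followed by a clamp, the weights can be learned by gradient-based training, while the neuron still reads as a logical connective.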


Full logical expressivity means that LNNs support an expressive form of logic called first-order logic. This type of logic allows more kinds of knowledge to be represented understandably, with real values allowing representation of uncertainty. Many other approaches only support simpler forms of logic like propositional logic, or Horn clauses, or only approximate the behavior of first-order logic.

LNNs’ form of real-valued logic also enables representation of the strengths of relationships between logical clauses via neural weights, further improving its predictive accuracy. Another advantage of LNNs is that they are tolerant to incomplete knowledge. Most AI approaches make a closed-world assumption that if a statement doesn’t appear in the knowledge base, it is false. LNNs, on the other hand, maintain upper and lower bounds for each variable, allowing the more realistic open-world assumption and a robust way to accommodate incomplete knowledge.
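The role of upper and lower bounds can be sketched in a few lines. This is an illustrative toy, not the LNN implementation: the `Bounds` class, its method names and the `modus_ponens` helper are assumptions made for this example, and the implication is taken to be the Łukasiewicz one, min(1, 1 - a + b).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    """Truth bounds for one statement. The default [0, 1] means 'unknown'."""
    lower: float = 0.0
    upper: float = 1.0

    def tighten(self, lower: float = 0.0, upper: float = 1.0) -> "Bounds":
        """Combine with new evidence; bounds can only get narrower."""
        return Bounds(max(self.lower, lower), min(self.upper, upper))

    def state(self) -> str:
        if self.lower > self.upper:
            return "contradiction"
        if self.lower == 1.0:
            return "true"
        if self.upper == 0.0:
            return "false"
        if self.lower == 0.0 and self.upper == 1.0:
            return "unknown"
        return "partially known"


def modus_ponens(antecedent: Bounds, implication: Bounds, consequent: Bounds) -> Bounds:
    """Downward inference for a -> b under the Lukasiewicz implication.

    If a >= La and (a -> b) >= Li, then b >= La + Li - 1.
    """
    new_lower = max(0.0, antecedent.lower + implication.lower - 1.0)
    return consequent.tighten(lower=new_lower)


rule = Bounds(1.0, 1.0)  # the rule itself is known to hold

# A fact missing from the knowledge base stays unknown instead of becoming false
missing_fact = Bounds()
print(modus_ponens(missing_fact, rule, Bounds()).state())      # unknown

# Once the antecedent is known to be true, the consequent becomes true
print(modus_ponens(Bounds(1.0, 1.0), rule, Bounds()).state())  # true

# Conflicting evidence shows up as crossed bounds
print(Bounds(1.0, 1.0).tighten(upper=0.0).state())             # contradiction
```

Under a closed-world assumption the missing fact would start at [0, 0]. Keeping it at [0, 1] lets later evidence tighten the bounds in either direction.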

Finally, other well-known neuro-symbolic strategies, including techniques based on Markov random fields (such as Markov logic networks) and many based on embeddings (such as logic tensor networks), are less understandable: their weights are hard to interpret, and they lack the same kind of language-like structure. They also assume complete world knowledge, and they did not perform as well in our initial experiments testing learning and reasoning.

Neuro-Symbolic Question Answering

There are several flavors of question answering (QA) tasks – text-based QA, context-based QA (in the context of interaction or dialog) or knowledge-based QA (KBQA). We chose to focus on KBQA because such tasks truly demand advanced reasoning such as multi-hop, quantitative, geographic, and temporal reasoning.

With our NSQA approach, it is possible to design a KBQA system with very little or no end-to-end training data. Currently popular end-to-end trained systems, on the other hand, require thousands of question-answer or question-query pairs – which is unrealistic in most enterprise scenarios.

In this work, we approach KBQA with the basic premise that if we can correctly translate a natural language question into an abstract form that captures its conceptual meaning, we can reason over existing knowledge to answer complex questions. Table 1 illustrates the kinds of questions NSQA can handle and the form of reasoning each one requires. This approach provides interpretability, generalizability, and robustness, all of which are critical requirements in enterprise NLP settings.

Table 1: Reasoning types for KBQA use cases. Multi-relational reasoning, geographic reasoning and temporal reasoning are challenging for current deep learning end-to-end systems


Figure 2: NSQA system architecture with an example


For instance, say NSQA starts with the question: “Was Albert Einstein born in Switzerland?” (see Figure 2). The system first puts the question into a generic logic form by transforming it to an Abstract Meaning Representation (AMR). Each AMR captures the semantics of the question using a vocabulary independent of the knowledge graph. This is an important quality, because it allows us to apply the technology regardless of the end task and the underlying knowledge base.

The AMR is aligned to the terms used in the knowledge graph by entity linking and relation linking modules, and is then transformed into a logic representation. This logic representation is submitted to the LNN, which performs the necessary reasoning, such as type-based and geographic reasoning, to return the answers to the question. For example, Figure 3 shows the steps of geographic reasoning performed by the LNN, using manually encoded axioms and the DBpedia knowledge graph, to return an answer.
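The hand-off between these stages can be pictured with a toy sketch. Everything here is assumed for illustration: the dictionary-shaped `amr_frame` is a simplification of a real AMR graph, the two lookup tables stand in for the learned entity linking and relation linking modules, and `to_logic` is a hypothetical helper rather than part of NSQA.

```python
# Simplified, hypothetical stand-in for the AMR of
# "Was Albert Einstein born in Switzerland?".
# Its vocabulary (the frame "bear-02" and its roles) is independent of any knowledge graph.
amr_frame = {
    "predicate": "bear-02",
    "args": {"ARG1": "Albert Einstein", "location": "Switzerland"},
    "question_type": "boolean",
}

# Hypothetical lookup tables standing in for the linking modules.
ENTITY_LINKS = {
    "Albert Einstein": "dbr:Albert_Einstein",
    "Switzerland": "dbr:Switzerland",
}
RELATION_LINKS = {
    ("bear-02", "location"): "dbo:birthPlace",
}


def to_logic(frame):
    """Align an AMR-like frame to knowledge-graph terms and emit one logical atom."""
    subject = ENTITY_LINKS[frame["args"]["ARG1"]]
    obj = ENTITY_LINKS[frame["args"]["location"]]
    relation = RELATION_LINKS[(frame["predicate"], "location")]
    return {"type": frame["question_type"], "atom": (relation, subject, obj)}


query = to_logic(amr_frame)
print(query["atom"])  # ('dbo:birthPlace', 'dbr:Albert_Einstein', 'dbr:Switzerland')
```

Only the two lookup tables mention knowledge-graph identifiers, which is the point of keeping the AMR stage vocabulary-independent: swapping the knowledge base means swapping the linkers, not the parser.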

Figure 3. Use of LNN for Geographic Reasoning. The question being answered is “Was Albert Einstein born in Switzerland?”
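The kind of geographic reasoning involved can be sketched with classical (true/false) inference over a handful of triples. The facts and predicate names below are toy assumptions rather than actual DBpedia content or the axioms encoded in NSQA. The single axiom used is: if a person's birthplace is y, and y is part of z, then the person was born in z.

```python
# Toy facts, assumed for illustration (subject, predicate, object).
FACTS = {
    ("Albert_Einstein", "birthPlace", "Ulm"),
    ("Ulm", "partOf", "Germany"),
}


def regions_containing(place, facts):
    """Return the place itself plus everything it is transitively part of."""
    seen = {place}
    frontier = [place]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in facts:
            if predicate == "partOf" and subject == current and obj not in seen:
                seen.add(obj)
                frontier.append(obj)
    return seen


def born_in(person, region, facts):
    """True if born_in(person, region) is derivable from the facts and the axiom."""
    for subject, predicate, obj in facts:
        if subject == person and predicate == "birthPlace":
            if region in regions_containing(obj, facts):
                return True
    return False


print(born_in("Albert_Einstein", "Germany", FACTS))      # True
print(born_in("Albert_Einstein", "Switzerland", FACTS))  # False
```

In this sketch `False` only means "not derivable from the toy facts". Under an open-world assumption, answering "no" with confidence would need additional knowledge, for example that a city lies in only one country, which the sketch does not model.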

Experimental – very promising – results

We’ve tested the effectiveness of our system on two commonly used KBQA datasets: QALD-9, which has 408 training questions and 150 test questions; and LC-QuAD 1.0, which has 5,000 questions. We applied NSQA without any end-to-end training for these tasks. Because NSQA is built from generic components that are independent of the underlying knowledge graph, unlike approaches that are heavily tuned to the data and task at hand, it can generalize to different datasets and tasks without massive training.

We measured accuracy using the official metrics of the two benchmark test sets. NSQA achieved a Macro F1 QALD¹ of 45.3 on QALD-9 and an F1 of 38.3 on LC-QuAD, ahead of the current state-of-the-art systems tuned on the respective datasets (43.0 and 33.0).
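For readers unfamiliar with how answer-set metrics are computed, here is a sketch of a plain macro-averaged F1 over questions. It is not the official evaluation script: the Macro F1 QALD variant differs in details that are not reproduced here, and the function names and the convention for empty answer sets are assumptions for this example.

```python
def precision_recall_f1(gold: set, predicted: set):
    """Per-question precision, recall and F1 over sets of answers."""
    if not gold and not predicted:
        return 1.0, 1.0, 1.0  # assumed convention: both empty counts as correct
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def macro_f1(pairs):
    """Average the per-question F1 so every question counts equally."""
    scores = [precision_recall_f1(gold, predicted)[2] for gold, predicted in pairs]
    return sum(scores) / len(scores)


# Made-up answers for three questions: partly right, fully right, wrong
example = [
    ({"a", "b"}, {"a"}),
    ({"x"}, {"x"}),
    ({"y"}, {"z"}),
]
print(round(macro_f1(example), 4))  # 0.5556
```

Macro averaging gives each question the same weight regardless of how many answers it has, so a question with a single answer matters as much as one with dozens.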

Question-answering is the first major use case for the LNN technology we’ve developed. While achieving state-of-the-art performance on the two KBQA datasets is an advance over other AI approaches, these datasets do not display the full range of complexities that our neuro-symbolic approach can address. In particular, the level of reasoning required by these questions is relatively simple.

The next step for us is to tackle successively more difficult question-answering tasks, for example those that test complex temporal reasoning and handling of incompleteness and inconsistencies in knowledge bases.

IBM broke ground with Jeopardy!, showing that question-answering by a machine was possible well beyond the level expected by both AI researchers and the general public – inspiring a tremendous amount of work in question-answering in the field. We hope this work also inspires a next generation of thinking and capabilities in AI.



  1. Riegel, Ryan, et al. “Logical Neural Networks.” arXiv preprint arXiv:2006.13155 (2020).
  2. Kapanipathi, Pavan, et al. “Question Answering over Knowledge Bases by Leveraging Semantic Parsing and Neuro-Symbolic Reasoning.” arXiv preprint arXiv:2012.01707 (2020).
  3. Fagin, Ronald, Ryan Riegel, and Alexander Gray. “Foundations of Reasoning with Uncertainty via Real-valued Logics.” arXiv preprint arXiv:2008.02429 (2020).
  4. Mihindukulasooriya, Nandana, et al. “Leveraging Semantic Parsing for Relation Linking over Knowledge Bases.” International Semantic Web Conference. Springer, Cham, 2020.
  5. Lee, Young-Suk, et al. “Pushing the Limits of AMR Parsing with Self-Learning.” Findings of EMNLP, 2020.
  6. Usbeck, Ricardo, et al. “9th Challenge on Question Answering over Linked Data (QALD-9).” Semdeep/NLIWoD@ISWC, CEUR Workshop Proceedings, 2018.
  7. Trivedi, Priyansh, et al. “LC-QuAD: A Corpus for Complex Question Answering over Knowledge Graphs.” International Semantic Web Conference. Springer, Cham, 2017.
  8. Logical Neural Networks – Blog
  9. Neuro-Symbolic Question Answering – Blog
  10. AMR Parsing – Blog
  11. Relation Linking – Blog


¹ QALD uses a modified F-measure called Macro F1 QALD.


IBM Fellow & CTO Translation Technologies, IBM Research

Alex Gray

VP of Foundations of AI, IBM Research

Pavan Kapanipathi

Research Staff Member, AI, IBM Research
