A New State-of-the-Art Method for Relation Extraction


In Natural Language Processing (NLP), relation extraction (RE) is an important task that aims to find semantic relationships between pairs of entity mentions. RE is essential for many downstream tasks such as knowledge base completion and question answering.

Figure 1: Model Architecture. Relations for different pairs of entities, e.g., (Iraqi, artillery) and (southern suburbs, Baghdad), are predicted simultaneously.

In many enterprise applications, an input paragraph to an RE system usually contains multiple pairs of entities. For example, the paragraph in Figure 1 contains a PART-WHOLE relation between southern suburbs and Baghdad, and an ART relation between Iraqi and artillery. However, nearly all existing RE approaches treat pairs of entity mentions as independent instances. When deep learning models are used for RE, these methods require the same paragraph to be encoded once for every pair of entities, which is computationally expensive, especially when the input paragraph is long and the model is large.
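To make the cost argument concrete, here is a minimal sketch that counts encoder calls. The `CountingEncoder` class, the toy sentence, and the choice to treat every unordered pair of mentions as a candidate are all assumptions made for illustration; none of them come from the paper.

```python
from itertools import combinations


class CountingEncoder:
    """Hypothetical stand-in for a deep encoder; it only counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def encode(self, tokens):
        self.calls += 1
        # Placeholder "embedding": one number per token, not a real model output.
        return [float(len(token)) for token in tokens]


def encode_per_pair(encoder, tokens, entity_mentions):
    """Multi-pass baseline: re-encode the paragraph once for every candidate pair."""
    encodings = {}
    for pair in combinations(entity_mentions, 2):
        encodings[pair] = encoder.encode(tokens)
    return encodings


# Toy sentence (hypothetical), built around the entity mentions named in Figure 1.
tokens = ["Iraqi", "artillery", "shelled", "the", "southern", "suburbs", "of", "Baghdad"]
entity_mentions = ["Iraqi", "artillery", "southern suburbs", "Baghdad"]

multi_pass_encoder = CountingEncoder()
pair_encodings = encode_per_pair(multi_pass_encoder, tokens, entity_mentions)

shared_encoder = CountingEncoder()
shared_encoding = shared_encoder.encode(tokens)  # one pass, reusable for every pair

print(multi_pass_encoder.calls, shared_encoder.calls)  # prints "6 1"
```

With four mentions there are six candidate pairs, so the per-pair baseline runs the encoder six times while a shared encoding needs a single run. The gap grows quadratically with the number of mentions in the paragraph.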

Recently, IBM Research AI and IBM Watson have worked together to develop a promising approach that provides both high efficiency (encoding the input in one pass) and effectiveness (achieving state-of-the-art performance). This method, published at the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), achieves a new state-of-the-art result on the Automatic Content Extraction (ACE) 2005 benchmark, and shows that the proposed one-pass encoding approach can achieve results comparable to its more time-consuming multi-pass counterparts.

Our proposed solution builds on the existing Transformer-based, pre-trained, general-purpose language encoder known as Bidirectional Encoder Representations from Transformers (BERT). We make two novel modifications to the Transformer architecture to enable the encoding of multiple relations in one pass. First, borrowing an idea from recent advances in dependency parsing, we add a structured prediction layer on top of BERT that predicts multiple relations for different entity pairs, as shown at the top of Figure 1. Second, we make the self-attention layers of the Transformer aware of the positions of all entities in the input paragraph. The key idea is to use the relative distance between words and entities to encode the positional information for each entity. This information is propagated through the layers via the attention computations, so that the embedding vectors at the top layers are aware of all the entities in the paragraph.
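As a rough illustration of the "relative distance between words and entities" idea, the sketch below computes, for each entity span, the signed distance from every token to that span, clipped to a small window. The function name, the inclusive `(start, end)` span format, the clipping window `max_dist`, and the toy sentence are assumptions for illustration; the paper's exact formulation inside the attention layers may differ.

```python
def relative_distances(num_tokens, entity_spans, max_dist=3):
    """Signed, clipped distance from each token to each entity span.

    entity_spans holds (start, end) token indices, both inclusive.
    Tokens inside a span get distance 0; tokens to the left get negative
    values and tokens to the right get positive values.
    """
    table = []
    for start, end in entity_spans:
        row = []
        for i in range(num_tokens):
            if i < start:
                distance = i - start
            elif i > end:
                distance = i - end
            else:
                distance = 0
            # Clip so that far-away tokens share one distance bucket.
            row.append(max(-max_dist, min(max_dist, distance)))
        table.append(row)
    return table


# Toy fragment (hypothetical) containing two of the entities from Figure 1.
tokens = ["in", "the", "southern", "suburbs", "of", "Baghdad"]
entity_spans = [(2, 3), (5, 5)]  # "southern suburbs", "Baghdad"

distances = relative_distances(len(tokens), entity_spans, max_dist=3)
for span, row in zip(entity_spans, distances):
    print(span, row)
# (2, 3) [-2, -1, 0, 0, 1, 2]
# (5, 5) [-3, -3, -3, -2, -1, 0]
```

In a model along these lines, each clipped distance would typically index a learned embedding that is fed into the attention computation, which is how every token could learn where each entity sits without re-encoding the paragraph.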

The proposed approach is the first solution of its kind that can simultaneously extract multiple relations with a one-pass encoding of an input paragraph. Besides achieving state-of-the-art performance on relation extraction, this idea also points to a more accurate and efficient way to achieve entity-centric passage encoding. In the future, we will explore the use of this method in question answering applications.

For more details, check out our ACL 2019 paper, “Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers,” authored by Haoyu Wang, Ming Tan, Mo Yu, Shiyu Chang, Dakuo Wang, Kun Xu, Xiaoxiao Guo, and Saloni Potdar.

IBM Research Staff Member

Saloni Potdar

Senior Software Engineer - Cognitive Analytics and Deep Learning
