A New State-of-the-Art Method for Relation Extraction


In Natural Language Processing (NLP), relation extraction (RE) is an important task that aims to identify semantic relationships between pairs of entity mentions. RE is essential for many downstream tasks such as knowledge base completion and question answering.

Figure 1: Model architecture. Different pairs of entities, e.g., (Iraqi, artillery) and (southern suburbs, Baghdad), are predicted simultaneously.

In many enterprise applications, an input paragraph of an RE system usually contains multiple pairs of entities. For example, the paragraph in Figure 1 contains a PART-WHOLE relation between southern suburbs and Baghdad, and an ART relation between Iraqi and artillery. However, nearly all existing RE approaches treat pairs of entity mentions as independent instances. When deep learning models are used for RE, these methods require the same paragraph to be encoded multiple times, once per entity pair, which is computationally expensive, especially when the input paragraph is long and the model is large.
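The cost gap described above grows quadratically with the number of entity mentions, since every unordered pair is a candidate. A small back-of-the-envelope sketch (the function names are illustrative, not from the paper):

```python
from itertools import combinations

def encoder_calls_multi_pass(num_entities: int) -> int:
    """Conventional pairwise RE: re-encode the paragraph once per entity pair."""
    return len(list(combinations(range(num_entities), 2)))

def encoder_calls_one_pass(num_entities: int) -> int:
    """One-pass RE: encode the paragraph once and score all pairs from that encoding."""
    return 1

# A paragraph with 8 entity mentions yields C(8, 2) = 28 candidate pairs:
# 28 forward passes for a pairwise model vs. 1 for the one-pass approach.
print(encoder_calls_multi_pass(8), encoder_calls_one_pass(8))  # → 28 1
```

With a large Transformer encoder, each saved forward pass over the full paragraph is significant, which is why one-pass encoding matters in practice.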

Recently, IBM Research AI and IBM Watson have worked together to develop a promising approach that provides both high efficiency (encoding the input in one pass) and effectiveness (achieving state-of-the-art performance). This method, published at the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), achieves a new state-of-the-art result on the Automatic Content Extraction (ACE) 2005 benchmark, and shows that the proposed highly efficient one-pass encoding approach can achieve results comparable to more time-consuming multi-pass counterparts.

Our proposed solution builds on the existing Transformer-based, pre-trained, general-purpose language encoder known as Bidirectional Encoder Representations from Transformers (BERT). We make two novel modifications to the Transformer architecture to enable the encoding of multiple relations in one pass. First, borrowing an idea from recent advances in dependency parsing, we introduce a structured prediction layer on top of BERT for predicting multiple relations for different entity pairs, as shown at the top of Figure 1. Second, we make the self-attention layers of the Transformer aware of the positions of all entities in the input paragraph. The key idea is to use the relative distance between words and entities to encode the positional information for each entity. This information is propagated through different layers via the attention computations, so that the top-layer embedding vectors are aware of all the entities in the paragraph.
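To make the two modifications concrete, here is a minimal numpy sketch of the idea: a single attention layer whose logits receive a relative-distance bias toward entity positions, followed by a bilinear scorer that classifies every entity pair at once from the single encoding. All shapes, span positions, and the exact form of the bias and scorer are illustrative assumptions for exposition, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, n_rel = 10, 16, 4                      # tokens, hidden dim, relation types
entity_spans = [(1, 2), (4, 5), (7, 8)]      # hypothetical mention spans

H = rng.normal(size=(T, d))                  # token states from a BERT-like encoder
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

# Entity-aware attention (simplified): each token's relative distance to an
# entity's start position contributes a learned scalar bias to the attention
# logits, so every layer "sees" where all entities are, in one pass.
dist_bias = rng.normal(size=(2 * T + 1,)) * 0.1
B = np.zeros((T, T))
for s, _ in entity_spans:
    for i in range(T):
        B[i, s] += dist_bias[i - s + T]      # index shifted so it is never negative

logits = (H @ Wq) @ (H @ Wk).T / np.sqrt(d) + B
A = np.exp(logits - logits.max(axis=-1, keepdims=True))
A /= A.sum(axis=-1, keepdims=True)           # softmax over keys
Z = A @ (H @ Wv)                             # entity-aware token states

# Structured prediction layer (bilinear, as in dependency-parsing scorers):
# classify every entity pair simultaneously from the single encoded paragraph.
ents = np.stack([Z[s:e + 1].mean(axis=0) for s, e in entity_spans])
U = rng.normal(size=(n_rel, d, d)) / d
scores = np.einsum('id,rde,je->ijr', ents, U, ents)
pred = scores.argmax(axis=-1)                # one predicted relation per pair
print(pred.shape)                            # → (3, 3)
```

The point of the sketch is that the expensive encoder runs once; only the cheap bilinear scorer touches each entity pair.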

This proposed approach is the first-of-its-kind solution that can simultaneously extract multiple relations with one-pass encoding of an input paragraph. Besides achieving state-of-the-art performance on relation extraction, this idea also points to a more accurate and efficient way to achieve entity-centric passage encoding. In the future, we will explore the usage of this method in question answering applications.

For more details, check out our ACL 2019 paper, “Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers,” authored by Haoyu Wang, Ming Tan, Mo Yu, Shiyu Chang, Dakuo Wang, Kun Xu, Xiaoxiao Guo, Saloni Potdar.

Saloni Potdar

Senior Software Engineer - Cognitive Analytics and Deep Learning
