A New State-of-the-Art Method for Relation Extraction


In Natural Language Processing (NLP), relation extraction (RE) is an important task that aims to identify semantic relationships between pairs of entity mentions. RE is essential for many downstream tasks such as knowledge base completion and question answering.

Figure 1: Model Architecture. Different pairs of entities, e.g., (Iraqi and artillery), (southern suburbs, Baghdad) are predicted simultaneously.

In many enterprise applications, the input paragraph to an RE system usually contains multiple pairs of entities. For example, the paragraph in Figure 1 contains a PART-WHOLE relation between southern suburbs and Baghdad, and an ART relation between Iraqi and artillery. However, nearly all existing RE approaches treat pairs of entity mentions as independent instances. When deep learning models are used for RE, these methods require the same paragraph to be encoded multiple times, once for each pair of entities, which is computationally expensive, especially when the input paragraph is long and the model is large.
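To see why per-pair encoding is costly, consider a back-of-the-envelope calculation (the numbers below are illustrative, not from the paper): with n entity mentions, a paragraph contains n(n-1) ordered candidate pairs, and a pair-at-a-time model pays one full encoder forward pass per pair.

```python
# Illustrative cost comparison: encoding once per entity pair vs. once per paragraph.
# All quantities are made-up stand-ins; "cost" is in arbitrary encoder-forward-pass units.
num_entities = 10
num_pairs = num_entities * (num_entities - 1)  # 90 ordered candidate pairs

encode_cost = 1.0                  # cost of one forward pass through a deep encoder
multi_pass_cost = num_pairs * encode_cost   # baseline: re-encode for every pair
one_pass_cost = 1 * encode_cost             # one-pass: encode the paragraph once

print(multi_pass_cost / one_pass_cost)  # 90.0 — the encoder runs 90x more often
```

The gap grows quadratically with the number of mentions, which is exactly the regime (long, entity-dense enterprise documents) the one-pass approach targets.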

Recently, IBM Research AI and IBM Watson have worked together to develop a promising approach that provides both high efficiency (encoding the input in one pass) and effectiveness (achieving state-of-the-art performance). This method, published at the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), achieves a new state-of-the-art result on the Automatic Content Extraction (ACE) 2005 benchmark, and shows that the proposed highly efficient one-pass encoding approach can achieve results comparable to its more time-consuming multi-pass counterparts.

Our proposed solution builds on the existing Transformer-based, pre-trained, general-purpose language encoder known as Bidirectional Encoder Representations from Transformers (BERT). We make two novel modifications to the Transformer architecture to enable the encoding of multiple relations in one pass. First, borrowing an idea from recent advances in dependency parsing, we introduce a structured prediction layer to BERT for predicting multiple relations for different entity pairs, as shown at the top of Figure 1. Second, we make the self-attention layers of the Transformer aware of the positions of all entities in the input paragraph. The key idea is to use the relative distance between words and entities to encode the positional information for each entity. This information is propagated through the layers via the attention computations, so that the embedding vectors at the top layers are aware of all the entities in the paragraph.
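The full model in the paper is more involved, but the two ideas can be sketched in a few dozen lines of NumPy. Everything below is a simplified stand-in: the encoder output, entity spans, weight matrices, relation label count, and distance clipping value are all made up for illustration. The sketch shows (a) injecting clipped entity-relative distance embeddings into the attention keys of a single simplified attention head, and (b) a bilinear structured prediction layer that scores every ordered entity pair from one encoding of the sequence.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, d = 8, 16
num_relations = 4          # hypothetical relation label set size
clip = 3                   # clip relative distances to [-clip, clip]

tokens = rng.normal(size=(seq_len, d))     # stand-in for BERT's token outputs
entity_spans = [(1, 2), (4, 5), (6, 7)]    # hypothetical entity mention spans

# --- entity-aware self-attention (one layer, one head, heavily simplified) ---
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
# one learned embedding per clipped relative distance, shared across entities
dist_emb = rng.normal(size=(2 * clip + 1, d)) * 0.1

def rel_dist(i, span):
    """Signed distance from token i to the nearest token of an entity span."""
    start, end = span
    if i < start:
        dv = i - start
    elif i > end:
        dv = i - end
    else:
        dv = 0
    return int(np.clip(dv, -clip, clip))

# add entity-relative positional information to the attention keys
K = tokens @ Wk
for i in range(seq_len):
    for span in entity_spans:
        K[i] += dist_emb[rel_dist(i, span) + clip]

Q, V = tokens @ Wq, tokens @ Wv
scores = Q @ K.T / np.sqrt(d)
attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)
H = attn @ V                               # entity-aware token representations

# --- structured prediction: score every ordered entity pair in one pass ---
U = rng.normal(size=(d, num_relations, d)) * 0.1   # bilinear scoring tensor
ent_reprs = [H[s:e + 1].mean(axis=0) for (s, e) in entity_spans]

pair_scores = {}
for a, ha in enumerate(ent_reprs):
    for b, hb in enumerate(ent_reprs):
        if a != b:
            # one score per relation label for the ordered pair (a, b)
            pair_scores[(a, b)] = np.einsum("i,irj,j->r", ha, U, hb)

print(len(pair_scores))  # 6 ordered pairs scored from a single encoding
```

The point of the sketch is the control flow: the sequence is encoded once, and all six ordered pairs among the three mentions are scored from that single set of entity-aware representations, rather than re-encoding the paragraph per pair.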

This proposed approach is a first-of-its-kind solution that can simultaneously extract multiple relations with one-pass encoding of an input paragraph. Besides achieving state-of-the-art performance on relation extraction, this idea also points to a more accurate and efficient way to achieve entity-centric passage encoding. In the future, we will explore the use of this method in question answering applications.

For more details, check out our ACL 2019 paper, “Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers,” authored by Haoyu Wang, Ming Tan, Mo Yu, Shiyu Chang, Dakuo Wang, Kun Xu, Xiaoxiao Guo, Saloni Potdar.

IBM Research Staff Member

Saloni Potdar

Senior Software Engineer - Cognitive Analytics and Deep Learning
