IBM’s AI goes multilingual — with single language training


This article was written by Mihaela Bornea, Lin Pan, Sara Rosenthal, Radu Florian, Avirup Sil, members of IBM Research AI.


Can you understand Japanese? If not, then a line of text such as AAAI会議はいつ設立されましたか? wouldn’t make any sense. An AI trained only on English data typically wouldn’t be able to understand the text either.

In a new research paper, Multilingual Transfer Learning for QA Using Translation as Data Augmentation, accepted to AAAI 2021, our team presents two novel techniques that enable an AI to understand different languages while being trained on labeled data in only one. These techniques, built on top of Multilingual BERT (a large pre-trained multilingual language model that provides text representations), use machine translation (MT) to make the representations for different languages look the same to a question answering (QA) system.

Machine reading comprehension is an important AI challenge seeking to demonstrate a computer’s understanding of language. Applying it to QA-related tasks to enable a system to automatically answer questions from natural language text is particularly difficult.

One central challenge involved in building high-performing QA systems is the data-intensive training process. The training data is created manually by presenting annotators with questions and passages that must be labeled with the correct answer. This is a time-consuming and expensive process. And other languages make it even harder. While we want to support all spoken languages, most labeled data for QA currently is in English, and it’s nearly impossible to collect labeled data at the same scale for the thousands of other languages in use.

To deal with the issue, researchers have been developing techniques for cross-lingual transfer. This is the process of training the system with labeled data in one language but enabling it to answer questions in another language.

A recent popular solution for cross-lingual transfer is using large pre-trained multilingual language models such as Multilingual BERT, which can provide text representations, or embeddings, aligned across roughly a hundred languages. Still, while Multilingual BERT is effective at cross-lingual transfer, its performance on low-resource languages is often subpar.

IBM’s latest cross-lingual research

Our team investigated two methods to help tackle the cross-lingual transfer challenge.

One, Language Adversarial Training, aims to make the multilingual embeddings indistinguishable to a Discriminator. The term comes from generative adversarial networks, where the discriminator is a classifier that distinguishes real data from data created by a generator; here, it is a classifier that tries to identify the language of the input. Before training the system, we enrich the original English training set by translating each example into several target languages, including Arabic, Spanish, German, Hindi, Vietnamese, Chinese and Japanese. We use the augmented dataset to train both the QA system and the Discriminator, which share the same underlying BERT network for their predictions.
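The augmentation step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the pipeline from the paper: the `translate` callable, the language codes, and the dictionary field names are all hypothetical, and a real system would also need to locate the answer span inside the translated passage, which this sketch does not attempt.

```python
from typing import Callable, Dict, List, Sequence

# Codes for the target languages named in the article (the codes are an assumption).
TARGET_LANGUAGES = ("ar", "es", "de", "hi", "vi", "zh", "ja")

Example = Dict[str, str]


def augment_with_translations(
    examples: List[Example],
    translate: Callable[[str, str], str],
    languages: Sequence[str] = TARGET_LANGUAGES,
) -> List[Example]:
    """Return the English examples plus one translated copy per target language."""
    augmented: List[Example] = []
    for ex in examples:
        # Keep the original labeled example and tag it as English.
        augmented.append({**ex, "lang": "en"})
        for lang in languages:
            # Translate every text field; `translate` stands in for an MT system.
            augmented.append({
                "question": translate(ex["question"], lang),
                "context": translate(ex["context"], lang),
                "answer": translate(ex["answer"], lang),
                "lang": lang,
            })
    return augmented


def placeholder_translate(text: str, lang: str) -> str:
    """Hypothetical stand-in for a machine translation call; it only tags the text."""
    return f"[{lang}] {text}"


english = [{
    "question": "When was the AAAI Conference founded?",
    "context": "The organization was founded in 1979.",
    "answer": "1979",
}]
augmented = augment_with_translations(english, placeholder_translate)
print(len(augmented))  # 1 original + 7 translations = 8
```

The `lang` tag on each example is what a language classifier such as the Discriminator would use as its training label.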

The two models have adversarial objectives. The role of the Discriminator is to identify the language of the question based on the question representation. The role of the QA model is to find the correct answer while also updating the embeddings so that the Discriminator becomes uncertain about the language of the question.

Specifically, for every input question the Discriminator produces a probability distribution over all training languages. In turn, the QA system is trained with an adversarial loss function and minimizes the divergence between the language probability distribution produced by the Discriminator and the uniform distribution. As training proceeds with questions in different languages, the multilingual representations become close to each other in the semantic space.
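To make the two objectives concrete, here is a small sketch in plain Python. It assumes the divergence is a KL divergence from the Discriminator's output to the uniform distribution and that the Discriminator itself is trained with cross-entropy; the article does not state the exact form, so both choices and all function names are illustrative assumptions.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Turn raw language scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discriminator_loss(lang_probs: List[float], true_lang: int) -> float:
    """Cross-entropy: low when the Discriminator identifies the true language."""
    return -math.log(lang_probs[true_lang])


def adversarial_loss(lang_probs: List[float]) -> float:
    """KL(p || uniform): zero exactly when the Discriminator is maximally unsure."""
    n = len(lang_probs)
    return sum(p * math.log(p * n) for p in lang_probs if p > 0.0)


# Eight training languages: English plus the seven translation targets.
confident = softmax([4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
confused = softmax([0.0] * 8)

print(adversarial_loss(confident) > adversarial_loss(confused))
print(discriminator_loss(confident, 0) < discriminator_loss(confused, 0))
```

Both comparisons hold: the confident distribution is good for the Discriminator and bad for the QA model, and the uniform one is the reverse. In a real system the QA model's gradient from `adversarial_loss` would flow into the shared encoder, which is what pushes the language-specific representations together.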

Our second method for QA cross-lingual transfer is called the Language Arbitration Framework. With language arbitration, we use properties of the translations to bring the multilingual embeddings closer to the training language (English, in our case). During training, the QA model examines the original labeled example together with its translation to one of the target languages.

Just like a human arbitrator, our language arbitration framework ensures agreement between English and the translation using two additional multilingual tasks. The first is the so-called Produce Same Answer (PSA) task, which ensures that the translated question (in Japanese, for example) produces the same answer as the original English question. The second is the Question Similarity (QS) task, which ensures that the representation of the translation is close to that of the original English question by maximizing their cosine similarity.
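The Question Similarity idea can be illustrated with a few lines of Python. Pulling two representations together means driving their cosine similarity toward 1, so one natural loss is 1 minus the cosine similarity. This formulation, the function names, and the toy vectors are assumptions for illustration; the article does not give the exact loss.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def question_similarity_loss(en_repr: Sequence[float], tr_repr: Sequence[float]) -> float:
    """0 when the representations point the same way, 2 when they are opposite."""
    return 1.0 - cosine_similarity(en_repr, tr_repr)


# Toy question representations (real ones would come from the shared encoder).
english_q = [0.2, 0.9, 0.1]
close_translation = [0.25, 0.85, 0.05]
far_translation = [0.9, -0.1, 0.4]

print(question_similarity_loss(english_q, close_translation)
      < question_similarity_loss(english_q, far_translation))
```

Minimizing this loss during training moves the translated question's representation toward the English one, which matches the stated goal of the QS task.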

Both adversarial training and language arbitration are effective cross-lingual QA transfer techniques that improve zero-shot performance (that is, performance with no human-labeled training data in the target language) by a large margin, especially on low-resource languages such as Hindi and Arabic. With these efforts, our model can now understand that the Japanese question AAAI会議はいつ設立されましたか? translates to “When was the AAAI Conference founded?” in English. It can then find the correct answer: “The organization was founded in 1979.”


IBM Research AI is proudly sponsoring AAAI2021 as a Platinum Sponsor. We will present 40 main track papers, in addition to at least 7 workshop papers, 10 demos, 4 IAAI papers and one tutorial. IBM Research AI is also co-organizing 3 workshops. We hope you can join us from February 2-9 to learn more about our research. To view our full presence at AAAI 2021, visit here.

