12 new Project Debater AI technologies available as cloud APIs


Argumentation and debating are fundamental capabilities of human intelligence. Until recently, they were entirely beyond the reach of AI.

In February 2019, after six years of work by natural language processing and machine learning researchers and engineers, an IBM AI dubbed Project Debater became the first AI system able to debate humans on complex topics.

And while it may not have ‘won’ the sparring against debate champion Harish Natarajan in San Francisco that year, Project Debater demonstrated how AI could help people build persuasive arguments and make well-informed decisions. The AI became the third in the series of IBM Research AI’s grand challenges, following Deep Blue and Watson.

In our recent paper “An autonomous debating system” published in Nature, we describe Project Debater’s architecture and evaluate its performance. We also offer free access for academic use to 12 of Project Debater’s underlying technologies as cloud APIs, as well as trial and licensing options for developers.

To debate humans, an AI must be equipped with certain skills. It has to be able to pinpoint relevant arguments for a given debate topic in a massive corpus, detect the stance of arguments, and assess their quality. It also has to identify general, recurring arguments that are relevant to the specific topic, organize the different types of arguments into a compelling narrative, recognize the arguments made by the human opponent, and make a rebuttal. And it has to be able to use competitive debate techniques, such as asking the opponent questions to frame the discussion in a way that favors its position.

This is exactly what we’ve done with Project Debater. It’s been developed as a collection of components, each designed to perform a specific subtask. Over the years, we published more than 50 papers describing these components and released many related datasets for academic use.

Building debating skills

To engage in a debate successfully, a machine requires a high level of accuracy from each component. For example, failing to detect an argument’s stance may result in arguing in favor of your opponent – a dire situation in a debate.

This is why it was crucial for us to collect uniquely large-scale, high-quality labeled training datasets for Project Debater. The evidence detection classifier, for instance, was trained on 200,000 labeled examples, and achieved a remarkable precision of 95 percent for the top 40 candidates.
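Precision over the top of the ranked candidate list is the relevant metric here: the classifier ranks candidate sentences by confidence, and we ask how many of the highest-ranked ones are genuine evidence. A minimal sketch of precision@k (the function name and toy data are illustrative, not from the Project Debater evaluation):

```python
def precision_at_k(ranked_labels, k):
    """Fraction of true positives among the top-k ranked candidates.

    ranked_labels: booleans ordered by classifier confidence (highest first),
    where True marks a candidate that is genuine evidence.
    """
    top_k = ranked_labels[:k]
    return sum(top_k) / len(top_k)

# Toy ranking: 4 of the top 5 candidates are genuine evidence.
labels = [True, True, False, True, True, False, False]
print(precision_at_k(labels, 5))  # 0.8
```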

Another major challenge was scalability. For example, we had to apply “wikification” (identifying mentions of Wikipedia concepts) to our 10 billion-sentence corpus – an impossible task for any existing wikification tool. So, we developed a new, fast wikification algorithm that can be applied to massive corpora while achieving competitive accuracy.
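The shape of the wikification task can be illustrated with a toy dictionary-based matcher that greedily finds the longest mention matching a known concept title. This is purely a sketch of the problem, not the actual Project Debater algorithm; the title set and matching rule are invented for illustration:

```python
# Toy surface-form dictionary mapping mention strings to Wikipedia concepts.
TITLES = {
    "machine learning": "Machine_learning",
    "natural language processing": "Natural_language_processing",
    "debate": "Debate",
}

def wikify(sentence, titles=TITLES):
    """Greedy longest-match scan: return (mention, concept) pairs in order."""
    words = sentence.lower().split()
    mentions, i = [], 0
    while i < len(words):
        # Try the longest candidate span starting at position i first.
        for j in range(len(words), i, -1):
            span = " ".join(words[i:j])
            if span in titles:
                mentions.append((span, titles[span]))
                i = j
                break
        else:
            i += 1
    return mentions

print(wikify("Machine learning powers debate technology"))
# [('machine learning', 'Machine_learning'), ('debate', 'Debate')]
```

Running a matcher like this over billions of sentences is what forces the speed requirement: anything slower than a linear scan with fast dictionary lookups becomes impractical at corpus scale.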

Project Debater’s APIs give access to different capabilities originally developed for the live debating system, as well as related technologies we have developed more recently. The APIs include natural language understanding capabilities that deal with wikification, semantic relatedness between Wikipedia concepts, short text clustering, and common theme extraction for texts.

The core set of APIs relates to services for argument mining and analysis. These services include detection of sentences containing claims and evidence, detecting claim boundaries in a sentence, argument quality assessment and stance classification (Pro/Con).
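The stance classification service, for example, takes sentence–topic pairs and returns Pro/Con scores. The following sketch shows only that input/output shape, with a trivial keyword stub standing in for the real classifier; the function name, cue words, and scoring are illustrative assumptions, not the actual API:

```python
def stance_stub(sentence, topic):
    """Stand-in for a Pro/Con service: positive score = Pro, negative = Con."""
    pro_cues = {"should", "benefits", "improves", "supports"}
    con_cues = {"harms", "dangerous", "fails", "undermines"}
    words = set(sentence.lower().split())
    return len(words & pro_cues) - len(words & con_cues)

pairs = [
    {"sentence": "Remote work improves productivity and supports wellbeing",
     "topic": "We should adopt remote work"},
    {"sentence": "Remote work harms collaboration and undermines team culture",
     "topic": "We should adopt remote work"},
]
for pair in pairs:
    score = stance_stub(pair["sentence"], pair["topic"])
    print("Pro" if score > 0 else "Con", score)
```

The real service replaces the keyword stub with a trained model, but client code is structured the same way: batch up sentence–topic pairs, send them to the service, and branch on the returned polarity scores.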

Then there are APIs for two high-level services that create different kinds of summaries: Narrative Generation and Key Point Analysis. When given a set of arguments, Narrative Generation constructs a well-structured speech that supports or contests a given topic, according to the specified polarity.
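The input and output of such a service can be illustrated with a toy organizer that selects arguments of the requested polarity and joins them into a short speech. This is entirely illustrative; the real service performs much richer clustering, ordering, and phrasing:

```python
def build_narrative(topic, arguments, polarity="pro"):
    """Toy narrative: keep arguments of the requested polarity, join as a speech."""
    selected = [a["text"] for a in arguments if a["polarity"] == polarity]
    opening = (f"We argue in favor of the motion: {topic}."
               if polarity == "pro"
               else f"We argue against the motion: {topic}.")
    body = " ".join(f"{i}. {text}" for i, text in enumerate(selected, 1))
    return f"{opening} {body}"

args_ = [
    {"text": "It broadens access to education.", "polarity": "pro"},
    {"text": "It raises costs.", "polarity": "con"},
    {"text": "It encourages lifelong learning.", "polarity": "pro"},
]
print(build_narrative("We should subsidize online courses", args_))
```

Note that the polarity filter is what makes the speech coherent: arguments on the wrong side of the motion are excluded rather than rebutted in this simplified sketch.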

And Key Point Analysis is a new and promising approach for summarization, with an important quantitative angle. This service summarizes a collection of comments on a given topic as a small set of key points, and the prominence of each key point is given by the number of its matching sentences in the given comments.
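The quantitative angle can be sketched as a match-and-count step: each comment sentence is matched against candidate key points, and a key point’s prominence is the number of sentences matched to it. The matcher below is a toy word-overlap rule invented for illustration, not the actual model:

```python
from collections import Counter

def match_key_point(sentence, key_points, threshold=2):
    """Toy matcher: assign a sentence to the key point sharing the most words,
    if the overlap reaches the threshold; otherwise return None."""
    s_words = set(sentence.lower().split())
    best, best_overlap = None, 0
    for kp in key_points:
        overlap = len(s_words & set(kp.lower().split()))
        if overlap > best_overlap:
            best, best_overlap = kp, overlap
    return best if best_overlap >= threshold else None

key_points = ["remote work saves commuting time", "remote work hurts team culture"]
comments = [
    "remote work saves me two hours of commuting time daily",
    "i love that remote work saves commuting time",
    "remote work really hurts our team culture",
    "the weather was nice today",
]
prominence = Counter(m for m in (match_key_point(c, key_points) for c in comments) if m)
print(prominence.most_common())
# [('remote work saves commuting time', 2), ('remote work hurts team culture', 1)]
```

Unmatched sentences (like the off-topic weather comment) simply drop out, so the prominence counts reflect only comments that actually support a key point.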

Developers are welcome

Key Point Analysis and Narrative Generation have recently been demonstrated in the That’s Debatable television series and in the Grammy Debates with Watson backstage experience, where they summarized pro and con arguments contributed online by thousands of people, discussing debate topics ranging from social questions to pop culture.

Developers can access the Project Debater API documentation on the main documentation site: they can log in as guests, view the documentation, and run online interactive demos of most of the services. They can also browse the code of complete end-to-end examples that use these services.


One example is Mining to Narrative: given a controversial topic, it demonstrates the creation of a narrative by mining content from a Wikipedia corpus. Another uses Debater services to analyze free-text surveys, identifying common themes based on Wikipedia concepts.

Before developers can run the code examples or use the Project Debater APIs in their own projects, they need to obtain an API key and download the SDK. To request an API key, please visit Project Debater for Academic Use or send an e-mail request to . You will receive a username and password to log in to the Early Access website, and can then obtain your personal API key from the API-key tab.


Slonim, N., Bilu, Y., Alzate, C., et al. An autonomous debating system. Nature (2021).



