Help build the next generation of AI-driven dialog systems

Introducing the Sentence Selection Track of DSTC8

Artificial Intelligence (AI)-driven dialog assistants are now part of our day-to-day routines, handling tasks such as checking the weather, finding restaurants, and making calls. In an enterprise setting, these assistants are highly useful in customer support scenarios, where they help reduce organizational costs and provide fast and accurate help to users.

However, many AI-driven dialog assistants today rely heavily on approaches that are explicitly programmed by dialog designers. A key need for scientific advancement is the development of algorithms that can learn goal-oriented dialog interactions effectively from human-to-human chat logs, while grounding them in relevant external information sources. IBM Research AI and the University of Michigan are spearheading this research direction by organizing a public competition to inspire and evaluate novel approaches that will lead to the next generation of AI-driven dialog systems.

The competition, called NOESIS II: Predicting Responses, Identifying Success, and Managing Complexity in Task-Oriented Dialog, is offered as Track 2 of the Dialog Systems Technology Challenge 8 (DSTC 8). The competition is now open, with submissions due by October 6. The winners will be announced on October 20, and participants will be invited to present their work at a workshop held at a top conference to be named later. Top participants will also be invited to submit their work to a special issue of the Computer Speech and Language (CS&L) journal.

Read more and register now!

The competition builds on the success of DSTC 7 Track 1, which IBM Research AI and the University of Michigan organized last year and in which more than 20 academic and industrial teams from around the world participated. This year we extend the task by incorporating new elements that are vital for the creation of deployable task-oriented dialog systems.


This challenge is offered with two goal-oriented dialog datasets:

A. Ubuntu dataset

The dataset consists of multi-party conversations extracted from the Ubuntu Internet Relay Chat (IRC) channel. A typical dialog starts with a question that was asked by participant_1. Then other participants respond with either an answer or follow-up questions that then lead to a back-and-forth conversation. Relevant external information in the form of Linux manual pages and Ubuntu discussion forums is also provided.

An example conversation from the dataset is shown below.

[Figure: example conversation from the Ubuntu dataset]

B. Advising dataset

This dataset contains two-party dialogs that simulate a discussion between a student and an academic advisor. The purpose of the dialogs is to guide the student to pick courses that fit not only their curriculum, but also their personal preferences about time, difficulty, areas of interest, etc. These conversations were collected by having students at the University of Michigan act in the two roles using provided personas. Structured information in the form of a database of course information will be provided, as well as the personas. The data also includes paraphrases of the sentences and of the target responses.

An example conversation from the dataset is shown below.

[Figure: example conversation from the Advising dataset]


This challenge is offered with four subtasks. A participant may participate in one, more than one, or all the subtasks. The primary task, given a dialog context, is to select the next turn in the conversation from a given set of candidate utterances.


The following table shows the full set of subtasks; [x] indicates that the subtask is evaluated on the marked dataset.

Subtask | Description | Ubuntu | Advising
1 | Given the disentangled conversation, select the next utterance from a candidate pool of 100, which might not contain the correct next utterance. | [x] | [x]
2 | Given a section of the IRC channel, select the next utterance from a candidate pool of 100, which might not contain the correct next utterance. | [x] |
3 | Given a conversation, predict where in the conversation the problem is solved (if at all). | | [x]
4 | Given a section of the IRC channel, identify a set of conversations contained within that section. | [x] |
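To make the shape of the next-utterance selection subtasks concrete, here is a minimal sketch of a selector that scores each candidate against the dialog context and abstains when no candidate looks plausible. It is only an illustration and not the organizers' baseline: the function names, the bag-of-words cosine scoring, and the `none_threshold` value are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a deliberately simple assumption)."""
    return text.lower().split()


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words token lists."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_next_utterance(context, candidates, none_threshold=0.1):
    """Pick the candidate most similar to the dialog context.

    Returns (index, score). The index is None when the best score falls
    below none_threshold, modeling the case where the pool does not
    contain the correct next utterance. The threshold is hypothetical.
    """
    if not candidates:
        return None, 0.0
    context_tokens = [tok for turn in context for tok in tokenize(turn)]
    scores = [cosine(context_tokens, tokenize(c)) for c in candidates]
    best_index = max(range(len(candidates)), key=lambda i: scores[i])
    best_score = scores[best_index]
    if best_score < none_threshold:
        return None, best_score
    return best_index, best_score


if __name__ == "__main__":
    context = ["how do i install vim on ubuntu"]
    pool = [
        "the weather is nice today",
        "run sudo apt install vim",
        "i like pizza",
    ]
    print(select_next_utterance(context, pool))
```

A competitive system would replace the word-overlap score with a learned model, but the interface stays the same: a context, a pool of candidates, and the option to answer "none of these".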


Task | Dates
Development phase | Jun 17 – Sep 22, 2019 (14 weeks)
Test data released | Sep 23, 2019
Entry submission deadline | Oct 6, 2019
Objective evaluation completed | Oct 20, 2019
Paper submission deadline | Nov 2019
DSTC8 workshop | Spring 2020


Chulaka Gunasekara, Luis Lastras – IBM Research AI

Jonathan K. Kummerfeld, Walter Lasecki – University of Michigan

We encourage all academic and industrial teams working on dialog or related research to register and participate in DSTC 8 – Track 2.


Research Staff Member - Implicit Learning for Dialog, IBM Research

Luis Lastras

Distinguished Research Staff Member and Senior Manager, IBM Research
