How Watson “sees,” “hears,” and “speaks” to play Jeopardy!

Editor’s note: This guest post from IBM Researcher Dr. David Gondek is the first article in a three-part series about how Watson plays America’s favorite quiz show®.

The buzzer sounds.

Jeopardy! host Alex Trebek: “Watson?”

IBM Watson: “What is …”

This scenario will play out during February’s airing of the Jeopardy! quiz show, when IBM’s question answering system, Watson, challenges two of the game’s greatest champions, Ken Jennings and Brad Rutter.

Watson, however, cannot “see” or “hear” anything – so how can he play a Jeopardy! game?

Chips, not retinas

When host Alex Trebek finishes stating a clue, a human operator (who works for Jeopardy!) turns on a “Buzzer Enable” light on stage to indicate that contestants can “buzz in” and answer. At exactly the moment the “Buzzer Enable” light is activated, Watson’s system receives a signal that the buzzer is open.

Watson’s avatar, which viewers will see behind a standard Jeopardy! podium, is designer Joshua Davis’ artistic representation of the machine. It does not provide eyes or ears for Watson. Instead, Watson receives each clue as a text message sent over TCP/IP. At exactly the moment that the clue is revealed on the game board, the clue text is sent electronically to Watson’s POWER7 chips. So, Watson receives the clue text at the same time it hits Brad Rutter’s and Ken Jennings’ retinas.
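
To make the idea of receiving a clue as text over TCP/IP concrete, here is a minimal sketch in Python. The wire format is an assumption made for illustration only: this post does not describe the actual message format, so the sketch simply supposes one UTF-8 clue per newline-terminated message. The function name `read_clue` and the `max_bytes` limit are hypothetical.

```python
import socket


def read_clue(conn: socket.socket, max_bytes: int = 4096) -> str:
    """Read one newline-terminated UTF-8 message from a connected socket.

    Assumption: one clue per line. The real clue protocol is not
    described in the article.
    """
    buf = bytearray()
    # Read a byte at a time so we never consume data past the newline.
    while not buf.endswith(b"\n") and len(buf) < max_bytes:
        chunk = conn.recv(1)
        if not chunk:  # the sender closed the connection
            break
        buf.extend(chunk)
    return buf.decode("utf-8").rstrip("\n")
```

Reading byte by byte is slow but keeps the sketch short; a real receiver would buffer larger reads.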

Watson uses IBM’s DeepQA technology (running on optimized IBM POWER7 servers) to analyze a Jeopardy! clue and produce a response. Each response comes with an associated confidence, or estimated probability that the answer is correct. If his confidence is high enough, Watson may decide to buzz in. To do this, Watson sends a signal to a mechanical thumb, which is mounted on exactly the same type of Jeopardy! buzzer used by human contestants. Just like Ken and Brad, Watson must physically depress a button to buzz in.
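
The buzz decision described above can be pictured as a simple threshold test on the top-ranked response. The sketch below is illustrative only and is not DeepQA code: the `Candidate` type, the `should_buzz` function, and the 0.5 threshold are all hypothetical, and Watson's real decision may weigh more than a single fixed cutoff.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    response: str
    confidence: float  # estimated probability that the response is correct


BUZZ_THRESHOLD = 0.5  # hypothetical cutoff, not a figure from the article


def should_buzz(candidates: List[Candidate],
                threshold: float = BUZZ_THRESHOLD) -> Optional[Candidate]:
    """Return the best candidate if it clears the threshold, else None."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.confidence)
    return best if best.confidence >= threshold else None
```

A `None` result here corresponds to Watson staying silent on the clue.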

Watson’s buzzing is not instantaneous. For some clues he may not complete the question answering computation in time to make the decision to buzz in. For all clues, even if he does have an answer and confidence ready in time, he still has to respond to the signal and physically depress the button.

The best human contestants don’t wait for Trebek to finish reading a clue; they anticipate when he will. They time their “buzz” for the instant when the last word leaves Trebek’s mouth and the “Buzzer Enable” light turns on. Watson cannot anticipate. He can only react to the enable signal. While Watson reacts at an impressive speed, humans can and do buzz in faster than his best possible reaction time.
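
A small timing model may help show why a purely reactive player can lose the buzz. The sketch below assumes, hypothetically, that Watson's button press lands at the later of the enable signal and the moment his answer is ready, plus a fixed actuation delay for the mechanical thumb. All names and millisecond values are invented for illustration; the article gives no actual timings.

```python
from typing import Optional


def watson_buzz_time_ms(enable_at_ms: int,
                        answer_ready_at_ms: int,
                        actuation_delay_ms: int,
                        window_closes_at_ms: int) -> Optional[int]:
    """Time at which the button is pressed, or None if it is too late.

    Watson can only react: he starts pressing once the buzzer is enabled
    AND his answer and confidence are ready, whichever comes later.
    """
    start = max(enable_at_ms, answer_ready_at_ms)
    pressed = start + actuation_delay_ms
    return pressed if pressed <= window_closes_at_ms else None


# Hypothetical numbers: buzzer enabled at t=10000 ms, answer ready earlier.
watson = watson_buzz_time_ms(10_000, 9_000, 10, 15_000)
# A human who anticipates well might press just after the enable moment.
human = 10_000 + 5
print(watson, human, human < watson)
```

With these made-up values the anticipating human presses first, even though Watson's answer was ready well before the buzzer opened.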

Speaking when signaled

When answering a clue, Watson must convert his answer from text into speech to verbally respond like any other contestant. An operator prompts Watson to speak his answer. The operator has no control over what Watson might say. The operator just ensures that Watson will speak at the right moment and not interrupt the host or others.

The sound of Watson’s voice is synthesized, based on a human’s voice. Since it’s not possible to record someone speaking every possible word and phrase imaginable – all the more so given the vast range of topics and knowledge that even a single game of Jeopardy! demands – an IBM text-to-speech engine (TTS) “speaks” Watson’s answer. And Watson’s speech must be highly accurate, as mispronunciations of an ambiguous response may be judged incorrect.

Categories and clues

Watson autonomously selects categories and clues using algorithms that, just as his human opponents will do, take into consideration the available clues, the score and game position, knowledge of clues previously revealed, and other factors. In the next article of the series, we will take a closer look at how Watson chooses a Jeopardy! category and clue.

Note: As Watson cannot see or hear, he cannot respond to video or audio clues. Jeopardy! has agreed to omit them, just as they have with contestants who are visually or hearing impaired. Watson did take and pass the same Jeopardy! contestant test that humans take to qualify for the show. Find out more about Watson at
