
Writing a Chess Game with IBM Watson and Unity


With the release of the Watson Unity SDK in 2018, Amara Graham (Keller) and I set out to build a chess game that could be completely voice-controlled.

In order to tackle this, we had three major tasks ahead of us, each of which had a clear solution.

First of all, our application would need to be able to hear speech and turn it into something it could process. Enter Watson Speech to Text.

Next, we would need to create a conversational AI that could follow and understand a game of chess. Enter Watson Assistant (formerly Conversation).

Lastly, we would need to be able to respond to players with speech. You guessed it: enter Watson Text to Speech.

Understanding Speech

Using Watson Speech to Text, we were able to capture most of the audio and translate it accurately, even though we were mixing UK and US English accents. Pieces, colors and other common elements were captured really well.

The board positions were a bit trickier, as they are not typical elements of language. We don’t say “D4” or “F8” very often in everyday speech. To deal with this, we used a custom language model, which is covered in depth in Amara Graham (Keller)’s article How to Build a Custom Language Model. We expanded this to load an entire JSON model, which we could add to over time as we gathered more detailed training data.
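
As a rough sketch of what the JSON-driven approach looks like (in Python rather than our Unity C#, and with placeholder URL, credentials, customization ID and file name), a file of chess-specific words can be pushed to a Speech to Text custom language model over its REST API, and the model then retrained:

```python
import json
import requests

# Placeholder values: substitute your own Speech to Text
# service URL, API key, and customization ID.
STT_URL = "https://stream.watsonplatform.net/speech-to-text/api/v1"
CUSTOMIZATION_ID = "your-customization-id"
AUTH = ("apikey", "your-api-key")

# chess_words.json holds the domain vocabulary, for example:
# {"words": [{"word": "D4", "sounds_like": ["dee four"], "display_as": "D4"}]}
with open("chess_words.json") as f:
    words = json.load(f)

# Add the words to the custom language model...
requests.post(
    f"{STT_URL}/customizations/{CUSTOMIZATION_ID}/words",
    json=words,
    auth=AUTH,
).raise_for_status()

# ...then kick off retraining so the new vocabulary takes effect.
requests.post(
    f"{STT_URL}/customizations/{CUSTOMIZATION_ID}/train",
    auth=AUTH,
).raise_for_status()
```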

This caused a massive jump in the reliability of the Speech to Text system, showing how valuable it is to train Watson in your preferred domain language.

Creating a conversational AI

Using Watson Assistant, we were able to construct a dialog tree that gave our AI the knowledge of how to move through a chess game. Below, you can see an example of an “intent”, i.e. how the AI will act when it believes you want to make a move.

You can see that, as well as recognising the intent to move, the assistant knows it requires certain information and will attempt to pull it out. It needs either a full PieceName (i.e. “Queen’s Rook”) or at least a PieceType (such as “Rook”), in which case we can infer the exact piece from the board state. It also needs a square to move to, and it will keep prompting you until it has all of this information.
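
To make that concrete, here is an illustrative sketch of the kind of response Watson Assistant hands back for a phrase like “Move my queen’s rook to D4”, and the slot-filling check on top of it. The shape is simplified, and the BoardPosition entity name is hypothetical:

```python
# A trimmed-down, illustrative Assistant response for
# "Move my queen's rook to D4"; real responses carry more detail.
response = {
    "intents": [{"intent": "MovePiece", "confidence": 0.97}],
    "entities": [
        {"entity": "PieceName", "value": "Queen's Rook"},
        {"entity": "BoardPosition", "value": "D4"},  # hypothetical entity name
    ],
}

def extract_move(response):
    """Pull the move details out, or say what is still missing."""
    entities = {e["entity"]: e["value"] for e in response["entities"]}
    piece = entities.get("PieceName") or entities.get("PieceType")
    target = entities.get("BoardPosition")
    if piece is None:
        return "Which piece would you like to move?"
    if target is None:
        return f"Where should the {piece} move to?"
    return f"Moving {piece} to {target}."

print(extract_move(response))  # Moving Queen's Rook to D4.
```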

Replying with a voice

Using Watson Text to Speech was a breeze, and the only real issue we hit was with our own programming.

We had a slight issue with Watson tripping over its own speech: Speech to Text would hear Text to Speech, and the whole thing would end up talking to itself for a while.

The solution to this was relatively easy: we just kept the microphone off while Text to Speech was playing.
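
Reduced to a sketch, the pattern looks something like the following. The microphone, synthesizer and player objects here are hypothetical stand-ins for the Unity components we actually used:

```python
def speak(text, text_to_speech, microphone, player):
    """Speak a reply without letting Watson hear itself."""
    microphone.stop()                   # stop listening before we talk
    audio = text_to_speech.synthesize(text)
    player.play(audio, blocking=True)   # wait for playback to finish
    microphone.start()                  # only then resume listening
```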

Giving Watson some smarts

The focus of this project was to make chess more accessible via a voice interface, not to build Deep Blue all over again. While we could have looked at Watson Machine Learning, we didn’t want to reinvent the wheel.

The simple solution was to convert our board to Forsyth-Edwards Notation (FEN) and ask an existing chess engine for the answer over the Universal Chess Interface (UCI).
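
The UCI exchange itself is a simple text protocol over stdin/stdout. Here is a minimal Python sketch; the engine path is a placeholder for any UCI-compatible engine, such as Stockfish:

```python
import subprocess

def best_move(fen, engine_path="stockfish", movetime_ms=1000):
    """Ask a UCI engine for its best move from a FEN position."""
    engine = subprocess.Popen(
        [engine_path],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )

    def send(command):
        engine.stdin.write(command + "\n")
        engine.stdin.flush()

    send("uci")
    send(f"position fen {fen}")
    send(f"go movetime {movetime_ms}")

    # The engine streams analysis lines and finally reports
    # "bestmove e2e4 ..." when its search finishes.
    for line in engine.stdout:
        if line.startswith("bestmove"):
            send("quit")
            return line.split()[1]

# The standard starting position in FEN:
start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(best_move(start))  # e.g. "e2e4"
```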

A list of chess engines can be found on Wikipedia:

https://en.wikipedia.org/wiki/Chess_engine

Results

The results so far are really pleasing. We were able to build a powerful conversational interface, with an understanding of domain-specific language, very quickly.

Good future features would include:

  • Having Watson comment on your move in more detail
  • Being able to ask Watson for hints
  • Being able to ask more general chess questions (maybe using Watson Discovery)

We hope to open-source this code very soon. In the meantime, to find out more, please check out the Watson Unity SDK and the Emerging Tech Team.
