Rainbow Octopus is an open-source project created to show how a developer could use IBM’s Watson services with ARKit and Unity to control and manipulate a 3D animated character. Specifically, via the Watson Unity SDK, we capture speech and send it to IBM Speech to Text, then interpret the results using Tone Analyzer and Watson Assistant. The responses from these services, and Watson’s interpretation of the data, then drive the animated figure – an octopus.
Blender – Open-source 3D modelling and animation software.
Watson Assistant – An AI assistant designed for business, enabling conversational interfaces into any application, device, or channel.
Watson Speech to Text – Automatically transcribes audio from 7 languages in real time, with customisation available to improve accuracy for the language and content you care most about.
Watson Tone Analyzer – Analyses emotions and tone in what people write online, or in audio transcripts, and predicts whether they are happy, sad, confident, and more.
The result is an animated octopus that can sit with you (via Augmented Reality) and respond to you – both by carrying out the actions you ask of it, and by reacting and showing emotion, changing its own colour based on the tone of what you say.
Give it a go!
The code is all available on our public GitHub here.
You’ll need a copy of Unity and an IBM Cloud account (available for free) to get it up and running. Unity enables you to deploy the AR environment to an iPhone or iPad using ARKit. You can then talk to the Octopus using IBM Speech to Text, Tone Analyzer, and Watson Assistant, and it will respond in various ways.
As soon as the app is started, ARKit identifies horizontal surfaces within the camera view. When found, the ground is established and the octopus is rendered. As a user, you can then tap on your device’s screen to control the movement of the animated character.
Cognitive understanding in an animated character
The app is constantly listening too. You can try particular commands and “intents” like:
“Hello!” – Octopus will wave
“Go for a walk” – Octopus will start walking
“Stop!” – Octopus will get scared – changing colour to yellow – and stop in his tracks
“Jump for Joy” – Octopus will jump up and down, whilst glowing green
“Turn right / left” – Octopus will turn in the indicated direction
“Get bigger / smaller” – Octopus will grow / shrink
A variety of other emotive sentences that cause the Octopus to change colour. Reinforcing that emotion will make that colour become stronger.
For example, “You look happy” will make the Octopus go light green. Following it with “Jump for Joy” will then turn the Octopus a stronger green.
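The colour reinforcement described above can be sketched as a small piece of state that strengthens when the same tone is heard again. This is an illustrative Python sketch only, not the project's Unity code: the `MoodColour` class, the `step` size, the RGB values, and the use of "fear" for the scared reaction are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

RGB = Tuple[float, float, float]

# Hypothetical tone -> colour table. The post only says that joy shows as
# green and being scared shows as yellow; the exact values are assumptions.
TONE_COLOURS = {
    "joy": (0.0, 1.0, 0.0),   # green
    "fear": (1.0, 1.0, 0.0),  # yellow
}
NEUTRAL: RGB = (1.0, 1.0, 1.0)  # assumed resting colour


@dataclass
class MoodColour:
    """Tracks the current tone and how strongly it is shown."""
    tone: Optional[str] = None
    strength: float = 0.0  # 0.0 = neutral, 1.0 = full colour
    step: float = 0.5      # assumed gain per matching utterance

    def observe(self, tone: str) -> None:
        """Reinforce the current tone, or switch to a new one."""
        if tone == self.tone:
            self.strength = min(1.0, self.strength + self.step)
        else:
            self.tone = tone
            self.strength = self.step

    def rgb(self) -> RGB:
        """Blend from the neutral colour towards the tone's colour."""
        target = TONE_COLOURS.get(self.tone, NEUTRAL)
        return tuple(n + (t - n) * self.strength
                     for n, t in zip(NEUTRAL, target))


mood = MoodColour()
mood.observe("joy")   # e.g. "You look happy"
light_green = mood.rgb()
mood.observe("joy")   # e.g. "Jump for Joy"
strong_green = mood.rgb()
```

In this sketch the first joyful sentence moves the colour halfway towards green, and a second one saturates it; a different tone resets the strength and starts blending towards its own colour.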
The app sends the speech to Watson Speech to Text (via the Watson Unity SDK) and receives raw text in response. The app then passes the raw text to Watson Assistant and, simultaneously, to Watson Tone Analyzer – again using the Watson Unity SDK. The results from the Tone Analyzer indicate which tone was identified in the speech (this may be “joy”, “sadness”, “anger”, etc.), and the octopus is coloured accordingly. The Tone Analyzer only analyses the content of the words said, not the way they were said. Watson Assistant returns an “intent” from the text sent to it. For example, if you say “Go forward”, “Walk”, or “move”, the intent will be “forward” and the Octopus will move forwards.
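The flow above (one transcript fanned out to two services at once, with the intent choosing an action and the tone choosing a colour) might look roughly like the following. It is a hedged Python sketch, not the Watson Unity SDK: `classify_intent` and `analyse_tone` are hypothetical keyword-matching stand-ins for Watson Assistant and Tone Analyzer, and make no network calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Optional, Tuple

# Hypothetical phrase tables standing in for the trained Watson services.
# The "forward" phrases come from the post; the tone keywords are assumed.
INTENT_PHRASES = {"forward": ("go forward", "walk", "move")}
TONE_KEYWORDS = {"joy": ("happy", "joy")}


def _first_match(text: str, table: dict) -> Optional[str]:
    """Return the first label whose phrases appear in the text."""
    lowered = text.lower()
    for label, phrases in table.items():
        if any(phrase in lowered for phrase in phrases):
            return label
    return None


def classify_intent(text: str) -> Optional[str]:
    """Stand-in for Watson Assistant (assumption: simple phrase match)."""
    return _first_match(text, INTENT_PHRASES)


def analyse_tone(text: str) -> Optional[str]:
    """Stand-in for Tone Analyzer; looks only at the words, as in the post."""
    return _first_match(text, TONE_KEYWORDS)


def handle_transcript(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Send the same transcript to both stand-ins concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        intent_future = pool.submit(classify_intent, text)
        tone_future = pool.submit(analyse_tone, text)
        return intent_future.result(), tone_future.result()


intent, tone = handle_transcript("Go forward")
```

Here the caller would use `intent` to trigger an animation and `tone` to pick a colour; either value may be `None` when nothing was recognised, so the two results can be handled independently.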
The project itself was coded by Kevin Brown, modelled on a similar project by Gwilym Newton (https://medium.com/@gwilymnewton/building-the-ibm-emotive-droid-55b2dec26cd8). Joe Pavitt created the 3D octopus and associated animations in Blender, and these were imported into Unity so they could be activated at the right time (via the Animator Controller). Kris Schultz then applied a range of techniques to make the experience slightly smoother.