Using a device controller

Typically, you use the local speaker and microphone of the audio device to send and receive audio. However, you might have a smart speaker in your environment that you want to use for external processing, or you might have additional device controls, for example, for volume or display. Watson Assistant Solutions gives you the option to use your own smart speaker and microphone with the audio client. In this setup, the smart speaker acts as a device controller for the audio client.

Socket interfaces

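The audio client provides socket interfaces that allow a device controller to control the audio client and to send audio to and receive audio from it. There are two socket interfaces: the command socket interface, which carries commands, responses, and status messages, and the audio socket interface, which carries audio data.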

For example, suppose that the device controller wants to send audio to the audio client on the audio socket interface, but the client is not ready to process the audio. The device controller sends a RAS command (a trigger for the client to read from the audio socket) on the command socket interface. The audio client responds with a micWakeUpNotAllowed status message. The device controller waits for a micWakeUpAllowed status message, and then sends the audio data to the audio client on the audio socket interface.
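The exchange in the example above can be sketched as a small state tracker on the controller side. This is a minimal sketch, not part of the product: the class name WakeUpGate and its methods are hypothetical, and it assumes that the controller treats wake-up as not allowed until the client says otherwise.

```python
class WakeUpGate:
    """Tracks whether the audio client currently accepts a wake-up trigger.

    Hypothetical helper: only the two status message names come from the
    documentation; everything else is an illustrative assumption.
    """

    def __init__(self):
        # Assumption: stay conservative until micWakeUpAllowed is seen.
        self.allowed = False

    def observe(self, message):
        """Update the gate from one message read on the command socket."""
        name = message.strip()
        if name == "micWakeUpAllowed":
            self.allowed = True
        elif name == "micWakeUpNotAllowed":
            self.allowed = False
        # Any other message leaves the state unchanged.
        return self.allowed


gate = WakeUpGate()
for incoming in ["micWakeUpNotAllowed", "serverConnecting", "micWakeUpAllowed"]:
    if gate.observe(incoming):
        print("client is ready; audio can be sent on the audio socket")
```

The controller would hold its audio until observe returns True, which mirrors the wait described in the example.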

Command socket interface

You can test the command socket interface using telnet. Complete these steps:

  1. Establish a telnet connection to the audio client on the command port.
  2. Wait for the client to respond with OK.
  3. Send a command and terminate the command with a carriage return.
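The same three steps can also be scripted instead of typed into telnet. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the code assumes that each message from the client ends with a carriage return, a line feed, or both, which this topic does not specify.

```python
import socket


def read_message(sock):
    """Read one message from the command socket.

    Assumption: messages end with CR and/or LF. Returns "" if the
    connection closes before any text arrives.
    """
    buf = bytearray()
    while True:
        byte = sock.recv(1)
        if not byte:
            break  # connection closed
        if byte in (b"\r", b"\n"):
            if buf:
                break  # end of the current message
            continue  # skip a terminator left over from the previous message
        buf += byte
    return buf.decode("ascii")


def wait_for_ok(sock):
    """Step 2: wait for the client to respond with OK after connecting."""
    greeting = read_message(sock)
    if greeting != "OK":
        raise ConnectionError(f"expected OK, got {greeting!r}")


def run_command(sock, command):
    """Step 3: send a command terminated with a carriage return.

    Returns the next message from the client, for example OK or ?.
    """
    sock.sendall(command.encode("ascii") + b"\r")
    return read_message(sock)
```

Step 1 is whatever call opens the connection, for example socket.create_connection((host, command_port)), where host and command_port are values from your own configuration and are not defined in this topic.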

Table 1 displays the commands from the device controller to the client.

Command  Description
OS       Send output to the speaker.
OAS      Send output to the audio socket.
RM       Read the microphone (trigger).
RAS      Read the audio socket (trigger).
EXIT     Disconnect.
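In controller code, the commands in Table 1 can be kept in one place so that a mistyped command never reaches the client. This is a sketch only: the enum member names and the encode function are hypothetical, and only the command strings and the carriage-return terminator come from this topic.

```python
from enum import Enum


class Command(Enum):
    """Commands from the device controller to the client (Table 1)."""

    OUTPUT_TO_SPEAKER = "OS"
    OUTPUT_TO_AUDIO_SOCKET = "OAS"
    READ_MICROPHONE = "RM"
    READ_AUDIO_SOCKET = "RAS"
    EXIT = "EXIT"


def encode(command):
    """Return the bytes to write on the command socket for one command."""
    return command.value.encode("ascii") + b"\r"


print(encode(Command.READ_AUDIO_SOCKET))
```

Looking up an unknown string, for example Command("XYZ"), raises ValueError on the controller side before anything is sent.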

Table 2 displays the responses from the audio client to the device controller.

Response             Description
OK                   Command was received and is acknowledged.
?                    An unknown command was received.
DONE                 The client was told to disconnect.
micWakeUpNotAllowed  The client will not respond to the wake up command trigger.
micWakeUpAllowed     The client will respond to the wake up command trigger.
micOn                The client is expecting audio.
micOff               The client is not expecting audio.

Table 3 displays the status messages that the audio client sends to the device controller to report the status of its connection to the audio gateway.

Status                 Description
serverConnected        The client is connected to the server but is not yet ready to start.
serverConnecting       The client is attempting to connect to the server.
serverConnectionReady  The client is connected and is ready to start.
serverNotConnected     The client is not connected to the server.
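Because responses and connection status messages arrive on the same command socket, a controller might sort incoming messages before acting on them. The following sketch uses the message names from Tables 2 and 3; the function names and the category labels are hypothetical.

```python
# Message names taken from Table 2 (responses) and Table 3 (connection status).
RESPONSES = {
    "OK", "?", "DONE",
    "micWakeUpNotAllowed", "micWakeUpAllowed",
    "micOn", "micOff",
}
CONNECTION_STATUSES = {
    "serverConnected", "serverConnecting",
    "serverConnectionReady", "serverNotConnected",
}


def classify(message):
    """Return a hypothetical category label for one incoming message."""
    name = message.strip()
    if name in RESPONSES:
        return "response"
    if name in CONNECTION_STATUSES:
        return "connection_status"
    return "unrecognized"


def is_ready_to_start(message):
    """True only for the status that means connected and ready to start."""
    return message.strip() == "serverConnectionReady"


print(classify("serverConnecting"), is_ready_to_start("serverConnected"))
```

Note that serverConnected alone is not treated as ready, which follows the distinction that Table 3 draws between serverConnected and serverConnectionReady.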

Audio socket interface

The audio socket interface sends and receives audio data streams in binary format. Currently, the format is fixed as follows:

If the controller sends an OAS command to divert audio output to the audio socket, the controller must be ready to receive the audio data and process it (that is, play it). The client does not send a command to indicate that audio data is about to be sent.
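Because the client gives no advance notice, one way to meet this requirement is to start receiving on the audio socket before the OAS command is sent. The following is a minimal sketch: the function names and the handle_chunk callback are hypothetical, and the audio is treated as opaque bytes because the sketch makes no assumption about the audio format.

```python
import socket
import threading


def start_audio_receiver(audio_sock, handle_chunk, chunk_size=4096):
    """Read from the audio socket in a background thread.

    handle_chunk is a hypothetical callback that plays or stores the bytes.
    """
    def pump():
        while True:
            chunk = audio_sock.recv(chunk_size)
            if not chunk:
                break  # the client closed the audio socket
            handle_chunk(chunk)

    thread = threading.Thread(target=pump, daemon=True)
    thread.start()
    return thread


def divert_output_to_audio_socket(command_sock, audio_sock, handle_chunk):
    """Become ready to receive first, then send OAS on the command socket."""
    receiver = start_audio_receiver(audio_sock, handle_chunk)
    command_sock.sendall(b"OAS\r")
    return receiver
```

Starting the receiver first means that audio which arrives immediately after the command is acknowledged is not missed.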

When the controller sends a RAS command to trigger the reading of audio data from the audio socket, the client responds with a micOn response and starts to read data from the audio socket. The audio data is sent to the Watson server for transcription. When the transcription returns a result with an acceptable confidence level, the client sends a micClose response. Any further data that the client receives on the audio socket is discarded.
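On the controller side, this sequence might look like the following sketch. It is an illustration under stated assumptions, not the product's implementation: the function name stream_audio is hypothetical, the audio chunks are opaque bytes, and the command socket is polled with select between chunks so that the controller can stop when micClose arrives.

```python
import select
import socket


def stream_audio(command_sock, audio_sock, chunks):
    """Send RAS, wait for micOn, then send chunks until micClose is seen.

    Returns the number of chunks that were sent on the audio socket.
    """
    command_sock.sendall(b"RAS\r")

    # Wait for the micOn response on the command socket.
    pending = b""
    while b"micOn" not in pending:
        data = command_sock.recv(1024)
        if not data:
            return 0  # connection closed before micOn
        pending += data
    pending = pending.split(b"micOn", 1)[1]

    sent = 0
    for chunk in chunks:
        # Poll the command socket without blocking.
        readable, _, _ = select.select([command_sock], [], [], 0)
        if readable:
            data = command_sock.recv(1024)
            if not data:
                break  # connection closed
            pending += data
        if b"micClose" in pending:
            break  # the client discards anything sent after this point
        audio_sock.sendall(chunk)
        sent += 1
    return sent
```

Stopping at micClose is a courtesy rather than a requirement, because the client discards any later audio anyway.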
