Reporting transcription events from Voice Gateway

For both self-service agents and agent assistants, IBM® Voice Gateway provides the ability to access the transcription text from the call in real time. Transcribing calls can be valuable in many scenarios, including:

- Monitoring conversations as they occur
- Logging conversations for later analysis, such as to fine-tune your Watson services for a better customer experience

You can configure Voice Gateway to generate transcription messages as REST events and publish them to a configured REST server. This method is available in Version 1.0.0.2 and later. In Version 1.0.0.6 and later, you can also configure Voice Gateway to publish transcription events directly to IBM Cloudant. Each transcription event contains the utterance text, confidence score, and session information. For more information about transcription event contents and format, see Reporting events.

Configuring Voice Gateway to publish transcription events
Configuring Voice Gateway to publish transcription events in an IBM Cloudant database
Configuring Voice Gateway to publish transcription events in a CouchDB
Suppressing transcription text in reporting events
Transcription event format
SMS Gateway transcription events generated by Voice Gateway

Configuring Voice Gateway to publish transcription events

Set up a Splunk HTTP Event Collector (HEC) or a REST server that can store the events, for example in a NoSQL database.
Configure your REST server URL and authorization credentials and enable transcription events in Voice Gateway.
- Single-tenant environment: Set the following environment variables in your configuration.
```
REPORTING_URL=http://myresteventserver.ibm.com/
REPORTING_USERNAME=myRestAdmin
REPORTING_PASSWORD=myRestTokenOrPassword
REPORTING_TRANSCRIPTION_EVENT_INDEX=transcription
REPORTING_TRANSCRIPTION_EVENT_SOURCE_TYPE=vgwSessionID
```
- Multi-tenant JSON configuration: In the multi-tenant JSON configuration file, for each tenant where you want to enable transcription reporting, configure a reporting object that contains the following properties. You can configure indexes for other types of reporting events within the same object.
```
...
"reporting": {
  "url": "http://myresteventserver.ibm.com/",
  "username": "myRestAdmin",
  "password": "myRestTokenOrPassword",
  "transcriptionEventInd": "transcription",
  "transcriptionEventSourceType" : "vgwSessionID"
}
...
```
The transcription event index value is included as the index value in all transcription events so that the REST server that consumes them can differentiate the type of event. The index field is required for Splunk HEC, but it's also useful for anyone who is building their own REST server to handle these events.

After you redeploy Voice Gateway, it will publish event records to the configured REST server every time an utterance is detected.
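
If you build your own REST server, the consuming side can be quite small. The following Python sketch assumes that each event arrives as one JSON object in the body of an HTTP POST and that the index is set to `transcription`, as in the configuration above. The names `summarize_event`, `EventHandler`, and `serve`, the port, and the missing authentication check are illustrative choices for this sketch, not part of Voice Gateway.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_event(body: bytes) -> dict:
    """Pull the fields a consumer usually needs out of one reporting event."""
    record = json.loads(body)
    event = record.get("event", {})
    return {
        "index": record.get("index"),
        "session": event.get("globalSessionID"),
        "text": event.get("transcription"),
        "confidence": event.get("confidence"),
    }


class EventHandler(BaseHTTPRequestHandler):
    """Hypothetical receiver; a real server would also verify credentials."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        summary = summarize_event(self.rfile.read(length))
        # Use the configured index to tell transcription events apart
        # from other reporting event types sent to the same server.
        if summary["index"] == "transcription":
            print(summary)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the receiver; the port is an arbitrary choice for local testing."""
    HTTPServer(("0.0.0.0", port), EventHandler).serve_forever()
```

Calling `serve()` starts the receiver. Because transcription events can contain sensitive text, a real deployment would store the events securely rather than print them.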

Configuring Voice Gateway to publish transcription events in an IBM Cloudant database

Before you configure Voice Gateway to publish to IBM Cloudant, set up an IBM® Cloudant® for IBM Cloud instance. When you generate your service credentials from the IBM Cloudant dashboard, you can choose either IAM or a combination of API key and username and password.

Set the following environment variables in your configuration to publish transcription events in an IBM Cloudant database. You can use both IBM Cloudant and another reporting service simultaneously by adding the IBM Cloudant configuration variables to your configuration.

The following examples show the configuration for publishing transcription events to IBM Cloudant by using user name and password authentication. You can configure authentication for IBM Cloudant either with an API key, by adding the REPORTING_TRANSCRIPTION_CLOUDANT_APIKEY variable, or with a user name and password, by adding REPORTING_TRANSCRIPTION_CLOUDANT_USERNAME and REPORTING_TRANSCRIPTION_CLOUDANT_PASSWORD. For the API key variant, see Using an API key for authentication.

Single-tenant environment: Set the following environment variables in your configuration.
```
REPORTING_TRANSCRIPTION_CLOUDANT_USERNAME=myCloudantUsername
REPORTING_TRANSCRIPTION_CLOUDANT_PASSWORD=myCloudantPassword
REPORTING_TRANSCRIPTION_CLOUDANT_DB_NAME=myCloudantDB
REPORTING_TRANSCRIPTION_CLOUDANT_EVENT_INDEX=transcription
```
If you use a non-default account configuration for IBM Cloudant, you might need to include additional environment variables. See REPORTING_TRANSCRIPTION_CLOUDANT_URL and REPORTING_TRANSCRIPTION_CLOUDANT_ACCOUNT in Reporting event configuration.
Multi-tenant JSON configuration: In the multi-tenant JSON configuration file, for each tenant where you want to enable transcription events, configure a reporting object that contains the following properties. If you use a non-default account configuration for IBM Cloudant, you might need to include additional properties. See transcriptionCloudant in Properties for the reporting object.
```
...
"reporting": {
  "transcriptionCloudant": {
    "username": "myCloudantUsername",
    "password": "myCloudantPassword",
    "dbName": "myCloudantDB",
    "eventInd": "transcription"
  }
}
...
```
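
Because the two authentication styles use different variables, it can help to check a configuration before you deploy it. The following Python sketch inspects a dictionary of environment variables and reports which style it describes. The function name `cloudant_auth_mode` is hypothetical, and treating a mix of both styles as an error is an assumption of this sketch, not documented Voice Gateway behavior.

```python
PREFIX = "REPORTING_TRANSCRIPTION_CLOUDANT_"


def cloudant_auth_mode(env: dict) -> str:
    """Return "apikey" or "password" for a single-tenant Cloudant configuration."""
    has_key = bool(env.get(PREFIX + "APIKEY"))
    has_user = bool(env.get(PREFIX + "USERNAME"))
    has_password = bool(env.get(PREFIX + "PASSWORD"))

    # Assumption: mixing both styles is more likely a mistake than intent.
    if has_key and (has_user or has_password):
        raise ValueError("set either APIKEY or USERNAME and PASSWORD, not both")
    if has_key:
        return "apikey"
    if has_user and has_password:
        return "password"
    raise ValueError("no complete IBM Cloudant credentials found")
```

For example, passing `dict(os.environ)` from a deployment script surfaces a missing password before Voice Gateway is redeployed.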

Using an API key for authentication

As an alternative to using a user name and password combination for your IBM Cloudant credentials, you can use an API key for authentication. Find or generate your service credentials for IBM Cloudant, then copy the URL and API key values into your Voice Gateway reporting event configuration. See IBM Cloudant: API keys for more information about API keys.

Single-tenant environment: Set the following environment variables in your [configuration](https://www.ibm.com/docs/en/voice-gateway?topic=reference-configuration-environment-variables).
```
REPORTING_TRANSCRIPTION_CLOUDANT_URL=http://mytranscriptioncloudant.ibm.com/
REPORTING_TRANSCRIPTION_CLOUDANT_APIKEY=a1b2c3d3fgh1jk1mn0
REPORTING_TRANSCRIPTION_CLOUDANT_DB_NAME=myCloudantDB
REPORTING_TRANSCRIPTION_CLOUDANT_EVENT_INDEX=transcription
```
Multi-tenant JSON configuration: You must configure the `url` and `apikey` properties instead of `username` and `password`.
```
...
"reporting": {
  "transcriptionCloudant": {
    "url": "http://mytranscriptioncloudant.ibm.com/",
    "apikey": "a1b2c3d3fgh1jk1mn0",
    "dbName": "myCloudantDB",
    "eventInd": "transcription"
  }
}
...
```

Configuring Voice Gateway to publish transcription events in a CouchDB

To set up CouchDB, clone the repository and follow the instructions on GitHub. If you use CouchDB, you can't use an API key to authenticate.

Set the following environment variables in your configuration to publish transcription events in a CouchDB database. You can use both CouchDB and another reporting service simultaneously by adding the CouchDB configuration variables to your configuration.

The following examples show the configuration for publishing transcription events to CouchDB by using user name and password authentication.

Single-tenant environment: Set the following environment variables in your configuration.

REPORTING_TRANSCRIPTION_CLOUDANT_URL="http://<svc-couchdb>:5984/"
REPORTING_TRANSCRIPTION_CLOUDANT_USERNAME=myCouchDBUsername
REPORTING_TRANSCRIPTION_CLOUDANT_PASSWORD=myCouchDBPassword
REPORTING_TRANSCRIPTION_CLOUDANT_DB_NAME=myCouchDB
REPORTING_TRANSCRIPTION_CLOUDANT_EVENT_INDEX=transcription

Multi-tenant JSON configuration: In the multi-tenant JSON configuration file, for each tenant where you want to enable transcription events, configure a reporting object that contains the following properties. If you use a non-default account configuration, you might need to include additional properties. See transcriptionCloudant in Properties for the reporting object.

...
"reporting": {
  "transcriptionCloudant": {
    "url": "http://<svc-couchdb>:5984/",
    "username": "myCouchDBUsername",
    "password": "myCouchDBPassword",
    "dbName": "myCouchDB",
    "eventInd": "transcription"
  }
}
...

Suppressing transcription text in reporting events

You can programmatically enable or suppress transcription text in reporting events by using the action tags, vgwActEnableTranscriptionReport and vgwActDisableTranscriptionReport. With these action tags, you can configure Watson Assistant to temporarily enable or suppress text inclusion in reporting events to prevent collection of any personal or sensitive information from callers.

In the following example, the action sequence first disables transcription reporting for the callee (the receiver), then disables it for both the caller and the callee. Finally, it re-enables reporting for the callee only.

"vgwActionSequence": [
    {
      "command": "vgwActDisableTranscriptionReport",
      "parameters": {
        "targets": [
          "callee"   // disable just the callee
        ]
      }
    },
    {
      "command": "vgwActDisableTranscriptionReport",
      "parameters": {
        "targets": []  // disable both callee and caller
      }
    },
    {
      "command": "vgwActEnableTranscriptionReport",
      "parameters": {
        "targets": [
          "callee"    // enable just the callee
        ]
      }
    }]
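
To reason about where a longer sequence ends up, you can model it outside Voice Gateway. The following Python sketch tracks which parties have transcription reporting enabled after each command. It assumes that reporting starts enabled for both parties and that an empty or missing `targets` list means both caller and callee, as the comments in the example suggest. The function `apply_actions` is a hypothetical helper for this sketch, not a Voice Gateway API.

```python
PARTIES = ("caller", "callee")

# Maps each action tag to the reporting state that it sets.
COMMANDS = {
    "vgwActEnableTranscriptionReport": True,
    "vgwActDisableTranscriptionReport": False,
}


def apply_actions(sequence, enabled=None):
    """Return the reporting state per party after applying the sequence."""
    state = dict.fromkeys(PARTIES, True) if enabled is None else dict(enabled)
    for action in sequence:
        value = COMMANDS.get(action["command"])
        if value is None:
            continue  # Not a transcription reporting command.
        # Assumption: an empty or missing targets list addresses both parties.
        targets = action.get("parameters", {}).get("targets") or PARTIES
        for party in targets:
            state[party] = value
    return state
```

Applied to the sequence above, this model ends with reporting enabled for the callee and disabled for the caller.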

Transcription event format

All Voice Gateway reporting events are based on the Splunk HTTP Event Collector JSON format.

Important: Transcription events include text transcriptions and other information that could potentially contain Protected Health Information (PHI), personally identifiable information (PII), or PCI Data Security Standard (PCI DSS) data. Therefore, it's critical that the consuming server properly stores these events to prevent exposure of personal information.

In the following example, the transcription event JSON object shows that the utterance was sent to Watson Assistant in a SIPREC session.

{
  "time": 1558628246.008,
  "host": "9.42.89.143",
  "source": "sip:18883141589@114.589.797.1",
  "sourcetype": "sipURI",
  "index": "transcription",
  "event": {
    "transcription": "Good morning",
    "globalSessionID": "0dzMpLgyJ5",
    "sipCallID": "0dzMpLgyJ5",
    "sipRecCallID": "1232673994_41860201@115.110.121.1",
    "sipFromURI": "sip:18883141589@114.589.797.1",
    "sipToURI": "sip:18882718281@182.845.045.2",
    "conversationID": "a23de67h-d2da-4fc7-8a04-a868d760671e",
    "customSIPInviteHeaders": {
      "Custom-Header1": "123",
      "Custom-Header2": "456"
    },
    "destination": "a23de67h-d2da-4fc7-8a04-a868d760671e",
    "destinationType": "conversationID",
    "sttConfig": {
      "confidenceScoreThreshold": 0.7,
      "credentials": {
        "password": "***",
        "username": "3b36c01c-6dfb-4cb6-9a7f-ea4fafd1b2f1"
      },
      "config": {
        "smart_formatting": true,
        "model": "en-US_NarrowbandModel",
        "profanity_filter": true
      }
    },
    "confidence": 0.94,
    "dtmf": false,
    "sttResponse": {
      "result_index": 0,
      "results": [
        {
          "final": true,
          "alternatives": [
            {
              "transcript": "Good morning",
              "confidence": 0.94
            }
          ]
        }
      ]
    },
    "speechRecognitionLatency": 800,
    "disabled": "false"
  }
}

In the following example, the transcription event JSON object shows that the audio URL was played to the caller in a self-service session.

{
  "time": 1558626870.906,
  "host": "192.168.0.3",
  "source": "e165e846-a2f6-493a-94e5-544647af81a8",
  "sourcetype": "conversationID",
  "index": "trx",
  "event": {
    "audioURL": "https://www.example.com/acm/8k16bitpcm.wav",
    "destination": "sip:alice@192.168.0.4",
    "destinationType": "sipURI",
    "sipCallID": "ujtu6PMqLs",
    "globalSessionID": "ujtu6PMqLs",
    "sipToURI": "sip:watson-conversation@192.168.0.3",
    "sipFromURI": "sip:alice@192.168.0.4",
    "assistantID": "bf5a9423-b0d6-437b-bcc4-01f2bc647ca4",
    "conversationLatency": 538,
    "intents": [
      {
        "intent": "One",
        "confidence": 1
      }
    ],
    "ttsConfig": {
      "credentials": {
        "password": "***",
        "username": "e48b9a83-98e2-4499-8db0-e09b2d8f2629"
      }
    },
    "disabled": false
  }
}

Event metadata

All reporting events begin with the following metadata:

Table 1. Keys for event metadata
Key	Description
`time`	The time of the event. The default format is epoch time, in the format `<seconds>.<milliseconds>`, for example `1558628246.008`.
`host`	Host name of the Voice Gateway instance that generated the event.
`source`	Represents the Voice Gateway tenant that generated the event. Typically, this value is set to the phone number that was called.
`sourcetype`	The source type. By default, defined as `e164` for a telephone number, `conversationID` for a Watson Assistant workspace, `sipURI` for a SIP URI, or `sms` for SMS messages. Use the `REPORTING_TRANSCRIPTION_EVENT_SOURCE_TYPE` environment variable for a single-tenant deployment or `transcriptionEventSourceType` for a multitenant deployment to configure the source type. For more information, see Properties for the reporting object.
`index`	The index of the event as defined on the reporting event index configuration environment variables.
`event`	A CDR event, a Watson Assistant turn event, or a transcription event.
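
As an illustration of how a consumer might read these keys, the following Python sketch converts the metadata of one reporting event into typed values. It assumes that `time` uses the default epoch format. The function name `parse_metadata` is hypothetical.

```python
from datetime import datetime, timezone


def parse_metadata(record: dict) -> dict:
    """Extract the metadata keys that precede the event payload."""
    return {
        # Epoch seconds with a fractional part, interpreted as UTC.
        "timestamp": datetime.fromtimestamp(record["time"], tz=timezone.utc),
        "host": record["host"],
        "source": record["source"],
        "sourcetype": record["sourcetype"],
        "index": record["index"],
    }
```

A server that handles several event types can branch on the returned `index` value before it looks inside `event`.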

Transcription event object

The JSON object for each transcription event contains the following keys:

Table 2. Keys for defining transcriptions
Key	Description
`transcription`	The text from the transcribed utterance.
`sipCallID`	The SIP Call ID, pulled from the SIP INVITE request that is related to the call.
`sipRecCallID`	The SIPREC Call ID, pulled from the SIPREC metadata. This key has a value only when the session is a SIPREC session.
`globalSessionID`	The value of this key depends on how Voice Gateway is configured. By default, the value is identical to the `sipCallID` value. If the `CUSTOM_SIP_SESSION_HEADER` environment variable or `customSIPSessionHeader` JSON property is configured, it maps to that ID. If a session is a SIPREC session, the field is identical to the `gcid` field in the SIPREC metadata. If the session is a SIPREC session and the `CUSTOM_SIPREC_SESSION_FIELD` environment variable or `customSIPRECSessionField` JSON property is configured, it maps to that field.
`conversationID`	The Watson Assistant ID. This key has a value only if the utterance is sent to Watson Assistant.
`customSIPInviteHeader`	The value of the custom SIP INVITE header. For this value to be set, the header field must be defined on the `CUSTOM_SIP_INVITE_HEADER` environment variable, and the INVITE request must contain the specified header. Version 1.0.0.5 and later.
`customSIPInviteHeaders`	The value of the custom SIP INVITE headers. For this value to be set, the header fields must be defined on the `CUSTOM_SIP_INVITE_HEADERS` environment variable, and the INVITE request must contain the specified headers. Version 1.0.1 and later.
`destination`	The destination to which the transcription is sent.
`destinationType`	The destination type defined as `e164` for a telephone number, `conversationID` for a Watson Assistant workspace, or `sipURI` for a SIP URI.
`confidence`	The confidence score for the utterance from the Speech to Text service. Version 1.0.0.3 and later.
`sipFromURI`	SIP URI from the initial SIP INVITE `From` field. In a SIPREC session, a caller identifier is extracted from the SIPREC metadata. Version 1.0.0.6a and later.
`sipToURI`	SIP URI from the initial SIP INVITE `To` field. In a SIPREC session, a caller identifier is extracted from the SIPREC metadata. Version 1.0.0.6a and later.
`speechRecognitionLatency`	The amount of time in milliseconds between when silence is detected in the caller speech to when a final result from Speech to Text is received. For this value to be set, `STT_LATENCY_TRACKING` must be set to `true`. If the Media Relay fails to detect latency, this value is omitted from the transcription event. Version 1.0.0.8 and later.
`disabled`	Indicates whether the text from the transcription event is included in the reporting event. When set to `true`, text is excluded from the reporting event. Set to `false` by default. Version 1.0.0.8 and later.
`audioURL`	The audio URL that was played to the user. Version 1.0.2 and later.
`mediaURLs`	A list of the media URLs that were sent or received over the SMS channel. Version 1.0.2 and later.
`sttConfig`	The Speech to Text service configuration that was used for this transaction. Version 1.0.2 and later.
`bargeinOccurred`	Indicates whether barge-in occurred during the play back transaction. Version 1.0.2 and later.
`dtmf`	Indicates whether the input from the user is DTMF. Version 1.0.2 and later.
`sttResponse`	The final response from the Speech to Text service in JSON format, including the transcript and confidence score for the top hypothesis and any alternatives. Version 1.0.2 and later.
`workspaceID`	Watson Assistant workspace ID. Version 1.0.2 and later.
`assistantID`	Watson Assistant ID. This key has a value only when using the Watson Assistant v2 API. Version 1.0.2 and later.
`conversationLatency`	The amount of time in milliseconds between when a turn was initiated to the conversation service and when a response was received from the conversation service. Version 1.0.2 and later.
`intents`	An array of intents that were recognized in the user input, sorted in descending order of confidence. Version 1.0.2 and later.
`ttsConfig`	Text to Speech service configuration that was used for the play back transaction. Version 1.0.2 and later.
`ttsLatency`	The amount of time in milliseconds between when the text was sent to the Text to Speech service and when the first audio chunk was received. Version 1.0.2 and later.
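
A consumer typically combines several of these keys. The following Python sketch returns the transcription text only when text reporting was not suppressed and the confidence score meets a threshold. Because the examples in this document show `disabled` both as a boolean and as a string, the sketch accepts either form; that tolerance and the function names are assumptions of this sketch.

```python
def is_text_suppressed(event: dict) -> bool:
    """Interpret the disabled key, accepting a boolean or a string value."""
    value = event.get("disabled", False)  # The key defaults to false.
    if isinstance(value, str):
        return value.strip().lower() == "true"
    return bool(value)


def usable_transcription(event: dict, min_confidence: float = 0.0):
    """Return the utterance text, or None if it is suppressed or low confidence."""
    if is_text_suppressed(event):
        return None
    confidence = event.get("confidence")
    if confidence is not None and confidence < min_confidence:
        return None
    return event.get("transcription")
```

For example, `usable_transcription(record["event"], min_confidence=0.7)` skips utterances that Speech to Text scored below 0.7.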

SMS Gateway transcription events generated by Voice Gateway

When an SMS message arrives and transcription events are enabled on Voice Gateway, Voice Gateway generates a transcription event indicating the source where the transcription came from.

In the following example, the `sourcetype` value, `sms`, indicates that the transcription arrived via the SMS channel. The `source` value provides the telephone number that sent the SMS message. The `destinationType` value indicates that the message is sent to Watson Assistant.

{
  "time": 1531231769,
  "host": "9.16.25.144",
  "source": "+14445556666",
  "sourcetype": "sms",
  "index": "transcription",
  "event": {
    "transcription": "Hello.",
    "sipCallID": "AbcdEF2G~",
    "globalSessionID": "AbcdEF2G~",
    "sipToURI": "sip:watson-conversation@9.95.87.495",
    "sipFromURI": "sip:14448675309@9.40.540.440",
    "destination": "6abcde93-5393-53f9-g430-907hi2j7k936",
    "destinationType": "conversationID"
  }
}

When Voice Gateway sends a message to SMS Gateway, Voice Gateway generates a transcription event indicating that an SMS message was sent.

In the following example, the `destinationType` value, `sms`, indicates that a transcription is sent to the caller in an SMS message. The `destination` value provides the telephone number that Voice Gateway sends the SMS message to.

{
  "time": 1531231769,
  "host": "9.16.25.144",
  "source": "6abcde93-5393-53f9-g430-907hi2j7k936",
  "sourcetype": "conversation",
  "index": "transcription",
  "event": {
    "transcription": "Hello.",
    "sipCallID": "AbcdEF2G~",
    "globalSessionID": "AbcdEF2G~",
    "sipToURI": "sip:watson-conversation@9.95.87.495",
    "sipFromURI": "sip:14445556666@9.40.540.440",
    "destination": "+14445556666",
    "destinationType": "sms"
  }
}
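
The two examples differ only in where the value `sms` appears, so a consumer can tell them apart with a simple check. The following Python sketch classifies an event by SMS direction. The labels `inbound` and `outbound` and the function name `sms_direction` are illustrative and are not values that Voice Gateway emits.

```python
def sms_direction(record: dict):
    """Classify a transcription event by how it relates to the SMS channel."""
    if record.get("sourcetype") == "sms":
        return "inbound"  # The text arrived as an SMS message.
    if record.get("event", {}).get("destinationType") == "sms":
        return "outbound"  # The text was sent to the caller as an SMS message.
    return None  # Not an SMS-related event.
```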