copyright: years: 2017, 2023 lastupdated: "2023-01-07"
Reporting transcription events from Voice Gateway
For both self-service agents and agent assistants, IBM® Voice Gateway provides the ability to access the transcription text from the call in real time. Transcribing calls can be valuable in many scenarios, including:
- Monitoring conversations as they occur
- Logging conversations for later analysis, such as to fine-tune your Watson services for a better customer experience
You can configure Voice Gateway to generate transcription messages as REST events and publish them to a configured REST server. This method is available in Version 1.0.0.2 and later. In Version 1.0.0.6 and later, you can also configure Voice Gateway to publish transcription events directly to IBM Cloudant. Each transcription event contains the utterance text, confidence score, and session information. For more information about transcription event contents and format, see Reporting events.
- Configuring Voice Gateway to publish transcription events
- Configuring Voice Gateway to publish transcription events in an IBM Cloudant database
- Configuring Voice Gateway to publish transcription events in a CouchDB
- Suppressing transcription text in reporting events
- Transcription event format
- SMS Gateway transcription events generated by Voice Gateway
Configuring Voice Gateway to publish transcription events
- Set up a Splunk HTTP Event Collector (HEC) or a REST server that can store the events, for example in a NoSQL database.
- Configure your REST server URL and authorization credentials and enable transcription events in Voice Gateway.
- Single-tenant environment: Set the following environment variables in your configuration.
REPORTING_URL=http://myresteventserver.ibm.com/
REPORTING_USERNAME=myRestAdmin
REPORTING_PASSWORD=myRestTokenOrPassword
REPORTING_TRANSCRIPTION_EVENT_INDEX=transcription
REPORTING_TRANSCRIPTION_EVENT_SOURCE_TYPE=vgwSessionID
- Multi-tenant JSON configuration: In the multi-tenant JSON configuration file, for each tenant where you want to enable transcription reporting, configure a reporting object that contains the following properties. You can configure indexes for other types of reporting events within the same object.
...
"reporting": {
  "url": "http://myresteventserver.ibm.com/",
  "username": "myRestAdmin",
  "password": "myRestTokenOrPassword",
  "transcriptionEventInd": "transcription",
  "transcriptionEventSourceType": "vgwSessionID"
}
...
The transcription event index value is included as the index value in all transcription events so that the REST server that consumes them can differentiate the type of event. The index field is required for Splunk HEC, but it's also useful for anyone who is building their own REST server to handle these events.
After you redeploy Voice Gateway, it will publish event records to the configured REST server every time an utterance is detected.
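As an illustration of the consuming side, the following Python sketch files incoming events by their index value. It is a minimal example under stated assumptions, not a reference implementation: it assumes that each event arrives as a JSON body in an HTTP POST request, it does not check the configured credentials, and the names store_event, ReportingHandler, and serve are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory store keyed by the event "index" value. A real consumer would
# persist events securely because they can contain sensitive caller data.
EVENTS_BY_INDEX = {}


def store_event(raw_body, store):
    """Parse one reporting event and file it under its "index" value."""
    record = json.loads(raw_body)
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object")
    index = record.get("index", "unknown")
    store.setdefault(index, []).append(record)
    return index


class ReportingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: assumes each event arrives as a JSON POST body."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            store_event(self.rfile.read(length), EVENTS_BY_INDEX)
            self.send_response(200)
        except ValueError:  # includes json.JSONDecodeError
            self.send_response(400)
        self.end_headers()


def serve(port=8080):
    """Run the sketch locally; the port is an arbitrary choice."""
    HTTPServer(("", port), ReportingHandler).serve_forever()
```

Calling serve() starts the listener. Keeping the parsing in store_event, separate from the HTTP handler, makes the routing logic easy to test without opening a socket.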
Configuring Voice Gateway to publish transcription events in an IBM Cloudant database
Before you configure Voice Gateway to publish to IBM Cloudant, set up an IBM® Cloudant® for IBM Cloud instance. When you generate your service credentials from the IBM Cloudant dashboard, you can choose either IAM or a combination of API key and username and password.
Set the following environment variables in your configuration to publish transcription events in an IBM Cloudant database. You can use both IBM Cloudant and another reporting service simultaneously by adding the IBM Cloudant configuration variables to your configuration.
The following examples show the configuration for publishing transcription events to IBM Cloudant by using username and password authentication. To use an API key instead, see Using an API key for authentication. You can configure authentication for IBM Cloudant either with an API key, by adding the REPORTING_TRANSCRIPTION_CLOUDANT_APIKEY variable, or with a username and password, by adding the REPORTING_TRANSCRIPTION_CLOUDANT_USERNAME and REPORTING_TRANSCRIPTION_CLOUDANT_PASSWORD variables.
- Single-tenant environment: Set the following environment variables in your configuration.
REPORTING_TRANSCRIPTION_CLOUDANT_USERNAME=myCloudantUsername
REPORTING_TRANSCRIPTION_CLOUDANT_PASSWORD=myCloudantPassword
REPORTING_TRANSCRIPTION_CLOUDANT_DB_NAME=myCloudantDB
REPORTING_TRANSCRIPTION_CLOUDANT_EVENT_INDEX=transcription
If you use non-default account configurations for IBM Cloudant, you might need to include additional environment variables. See REPORTING_TRANSCRIPTION_CLOUDANT_URL and REPORTING_TRANSCRIPTION_CLOUDANT_ACCOUNT in Reporting event configuration.
- Multi-tenant JSON configuration: In the multi-tenant JSON configuration file, for each tenant where you want to enable transcription events, configure a reporting object that contains the following properties. If you use non-default account configurations for IBM Cloudant, you might need to include additional properties. See transcriptionCloudant in Properties for the reporting object.
...
"reporting": {
  "transcriptionCloudant": {
    "username": "myCloudantUsername",
    "password": "myCloudantPassword",
    "dbName": "myCloudantDB",
    "eventInd": "transcription"
  }
}
...
Using an API key for authentication
As an alternative to using a username and password combination for your IBM Cloudant credentials, you can use an API key for authentication. Find or generate your service credentials for IBM Cloudant, then copy the URL and API key values into your Voice Gateway reporting event configuration. See IBM Cloudant: API keys for more information about API keys.
* **Single-tenant environment:** Set the following environment variables in your [configuration](https://www.ibm.com/docs/en/voice-gateway?topic=reference-configuration-environment-variables).
```yaml
REPORTING_TRANSCRIPTION_CLOUDANT_URL=http://mytranscriptioncloudant.ibm.com/
REPORTING_TRANSCRIPTION_CLOUDANT_APIKEY=a1b2c3d3fgh1jk1mn0
REPORTING_TRANSCRIPTION_CLOUDANT_DB_NAME=myCloudantDB
REPORTING_TRANSCRIPTION_CLOUDANT_EVENT_INDEX=transcription
```
* **Multi-tenant JSON configuration**: You must configure the `url` and `apikey` configuration variables instead of `username` and `password`.
```json
...
"reporting": {
  "transcriptionCloudant": {
    "url": "http://mytranscriptioncloudant.ibm.com/",
    "apikey": "a1b2c3d3fgh1jk1mn0",
    "dbName": "myCloudantDB",
    "eventInd": "transcription"
  }
}
...
```
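Because the two authentication styles use different variables, it can help to check a configuration before you deploy it. The following Python sketch reports which style a set of single-tenant environment variables selects. The function name cloudant_auth_mode is hypothetical, and the checks only mirror the examples in this topic; they are an assumption, not the validation that Voice Gateway itself performs.

```python
PREFIX = "REPORTING_TRANSCRIPTION_CLOUDANT_"


def cloudant_auth_mode(env):
    """Return "apikey" or "password" for a transcription Cloudant config.

    `env` is any mapping of variable names to values, such as os.environ.
    Raises ValueError when neither credential style is complete.
    """
    if env.get(PREFIX + "APIKEY"):
        # The API key examples pair the key with an explicit URL.
        if not env.get(PREFIX + "URL"):
            raise ValueError(PREFIX + "URL is required with an API key")
        return "apikey"
    if env.get(PREFIX + "USERNAME") and env.get(PREFIX + "PASSWORD"):
        return "password"
    raise ValueError("set an API key, or both a username and a password")


example = {
    PREFIX + "URL": "http://mytranscriptioncloudant.ibm.com/",
    PREFIX + "APIKEY": "a1b2c3d3fgh1jk1mn0",
    PREFIX + "DB_NAME": "myCloudantDB",
    PREFIX + "EVENT_INDEX": "transcription",
}
print(cloudant_auth_mode(example))
```

To check a live environment, pass os.environ instead of the example dictionary.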
Configuring Voice Gateway to publish transcription events in a CouchDB
To set up CouchDB, clone the repository and follow the instructions on GitHub. If you use CouchDB, you can't use an API key to authenticate.
Set the following environment variables in your configuration to publish transcription events in a CouchDB database. You can use both CouchDB and another reporting service simultaneously by adding the CouchDB configuration variables to your configuration.
The following examples show the configuration for publishing transcription events to CouchDB by using username and password authentication.
- Single-tenant environment: Set the following environment variables in your configuration.
REPORTING_TRANSCRIPTION_CLOUDANT_URL="http://<svc-couchdb>:5984/"
REPORTING_TRANSCRIPTION_CLOUDANT_USERNAME=myCouchDBUsername
REPORTING_TRANSCRIPTION_CLOUDANT_PASSWORD=myCouchDBPassword
REPORTING_TRANSCRIPTION_CLOUDANT_DB_NAME=myCouchDB
REPORTING_TRANSCRIPTION_CLOUDANT_EVENT_INDEX=transcription
- Multi-tenant JSON configuration: In the multi-tenant JSON configuration file, for each tenant where you want to enable transcription events, configure a reporting object that contains the following properties. If you use non-default account configurations, you might need to include additional properties. See transcriptionCloudant in Properties for the reporting object.
...
"reporting": {
  "transcriptionCloudant": {
    "url": "http://<svc-couchdb>:5984/",
    "username": "myCouchDBUsername",
    "password": "myCouchDBPassword",
    "dbName": "myCouchDB",
    "eventInd": "transcription"
  }
}
...
Suppressing transcription text in reporting events
You can programmatically enable or suppress transcription text in reporting events by using the action tags vgwActEnableTranscriptionReport and vgwActDisableTranscriptionReport. With these action tags, you can configure Watson Assistant to temporarily enable or suppress text inclusion in reporting events to prevent collection of any personal or sensitive information from callers.
In the following example, the action sequence first disables transcription reporting for the callee (the receiver), then disables it for both the caller and the callee. Finally, it re-enables reporting for the callee only.
"vgwActionSequence": [
  {
    "command": "vgwActDisableTranscriptionReport",
    "parameters": {
      "targets": [
        "callee" // disable just the callee
      ]
    }
  },
  {
    "command": "vgwActDisableTranscriptionReport",
    "parameters": {
      "targets": [] // disable both callee and caller
    }
  },
  {
    "command": "vgwActEnableTranscriptionReport",
    "parameters": {
      "targets": [
        "callee" // enable just the callee
      ]
    }
  }
]
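If you generate Watson Assistant responses from code, a small helper can build these action entries. The following Python sketch reproduces the sequence from the example. The helper name transcription_report_action is hypothetical, and restricting targets to caller and callee is an assumption based on the values shown here.

```python
import json

VALID_TARGETS = {"caller", "callee"}


def transcription_report_action(enable, targets=()):
    """Build one action entry; an empty targets list applies to both parties."""
    unknown = set(targets) - VALID_TARGETS
    if unknown:
        raise ValueError("unknown targets: %s" % sorted(unknown))
    command = ("vgwActEnableTranscriptionReport" if enable
               else "vgwActDisableTranscriptionReport")
    return {"command": command, "parameters": {"targets": list(targets)}}


sequence = [
    transcription_report_action(False, ["callee"]),  # disable just the callee
    transcription_report_action(False),              # disable both parties
    transcription_report_action(True, ["callee"]),   # enable just the callee
]
print(json.dumps({"vgwActionSequence": sequence}, indent=2))
```

The printed object is valid JSON, so it omits the inline comments that the documentation example uses for explanation.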
Transcription event format
All Voice Gateway reporting events are based on the Splunk HTTP Event Collector JSON format.
Important: Transcription events include text transcriptions and other information that could potentially contain Protected Health Information (PHI), personally identifiable information (PII), or PCI Data Security Standard (PCI DSS) data. Therefore, it's critical that the consuming server properly stores these events to prevent exposure of personal information.
In the following example, the transcription event JSON object shows that the utterance was sent to Watson Assistant in a SIPREC session.
{
  "time": 1558628246.008,
  "host": "9.42.89.143",
  "source": "sip:18883141589@114.589.797.1",
  "sourcetype": "sipURI",
  "index": "transcription",
  "event": {
    "transcription": "Good morning",
    "globalSessionID": "0dzMpLgyJ5",
    "sipCallID": "0dzMpLgyJ5",
    "sipRecCallID": "1232673994_41860201@115.110.121.1",
    "sipFromURI": "sip:18883141589@114.589.797.1",
    "sipToURI": "sip:18882718281@182.845.045.2",
    "conversationID": "a23de67h-d2da-4fc7-8a04-a868d760671e",
    "customSIPInviteHeaders": {
      "Custom-Header1": "123",
      "Custom-Header2": "456"
    },
    "destination": "a23de67h-d2da-4fc7-8a04-a868d760671e",
    "destinationType": "conversationID",
    "sttConfig": {
      "confidenceScoreThreshold": 0.7,
      "credentials": {
        "password": "***",
        "username": "3b36c01c-6dfb-4cb6-9a7f-ea4fafd1b2f1"
      },
      "config": {
        "smart_formatting": true,
        "model": "en-US_NarrowbandModel",
        "profanity_filter": true
      }
    },
    "confidence": 0.94,
    "dtmf": false,
    "sttResponse": {
      "result_index": 0,
      "results": [
        {
          "final": true,
          "alternatives": [
            {
              "transcript": "Good morning",
              "confidence": 0.94
            }
          ]
        }
      ]
    },
    "speechRecognitionLatency": 800,
    "disabled": "false"
  }
}
In the following example, the transcription event JSON object shows that the audio URL was played to the caller in a self-service session.
{
  "time": 1558626870.906,
  "host": "192.168.0.3",
  "source": "e165e846-a2f6-493a-94e5-544647af81a8",
  "sourcetype": "conversationID",
  "index": "trx",
  "event": {
    "audioURL": "https://www.example.com/acm/8k16bitpcm.wav",
    "destination": "sip:alice@192.168.0.4",
    "destinationType": "sipURI",
    "sipCallID": "ujtu6PMqLs",
    "globalSessionID": "ujtu6PMqLs",
    "sipToURI": "sip:watson-conversation@192.168.0.3",
    "sipFromURI": "sip:alice@192.168.0.4",
    "assistantID": "bf5a9423-b0d6-437b-bcc4-01f2bc647ca4",
    "conversationLatency": 538,
    "intents": [
      {
        "intent": "One",
        "confidence": 1
      }
    ],
    "ttsConfig": {
      "credentials": {
        "password": "***",
        "username": "e48b9a83-98e2-4499-8db0-e09b2d8f2629"
      }
    },
    "disabled": false
  }
}
Event metadata
All reporting events begin with the following metadata:
Key | Description |
---|---|
time | The time of the event. The default format is epoch time, for example 1558628246.008. |
host | Host name of the Voice Gateway instance that generated the event. |
source | Represents the Voice Gateway tenant that generated the event. Typically, this value is set to the phone number that was called. |
sourcetype | The source type. By default, defined as e164 for a telephone number, conversationID for a Watson Assistant workspace, sipURI for a SIP URI, or sms for SMS messages. Use the REPORTING_TRANSCRIPTION_EVENT_SOURCE_TYPE environment variable for a single-tenant deployment or transcriptionEventSourceType for a multi-tenant deployment to configure the source type. For more information, see Properties for the reporting object. |
index | The index of the event as defined on the reporting event index configuration environment variables. |
event | A CDR event, a Watson Assistant turn event, or a transcription event. |
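A consumer typically reads this metadata before it looks at the event body. The following Python sketch converts the epoch time value to a UTC timestamp and collects the other metadata keys. The function name event_metadata is hypothetical, and the sketch assumes that time is always expressed in seconds, as in the examples in this topic.

```python
from datetime import datetime, timezone


def event_metadata(record):
    """Return the common metadata of a reporting event, with "time" as UTC."""
    return {
        "when": datetime.fromtimestamp(float(record["time"]), tz=timezone.utc),
        "host": record.get("host"),
        "source": record.get("source"),
        "sourcetype": record.get("sourcetype"),
        "index": record.get("index"),
    }


# Metadata copied from the SIPREC example in this topic.
sample = {
    "time": 1558628246.008,
    "host": "9.42.89.143",
    "source": "sip:18883141589@114.589.797.1",
    "sourcetype": "sipURI",
    "index": "transcription",
    "event": {},
}
meta = event_metadata(sample)
print(meta["when"].isoformat(), meta["index"])
```

Using a timezone-aware datetime avoids ambiguity when events from Voice Gateway instances in different regions are stored together.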
Transcription event object
The JSON object for each transcription event contains the following keys:
Key | Description |
---|---|
transcription | The text from the transcribed utterance. |
sipCallID | The SIP Call ID, pulled from the SIP INVITE request that is related to the call. |
sipRecCallID | The SIPREC Call ID, pulled from the SIPREC metadata. This key has a value only when the session is a SIPREC session. |
globalSessionID | The value of this key depends on how Voice Gateway is configured. By default, the value is identical to the sipCallID value. If the CUSTOM_SIP_SESSION_HEADER environment variable or customSIPSessionHeader JSON property is configured, it maps to that ID. If a session is a SIPREC session, the field is identical to the gcid field in the SIPREC metadata. If the session is a SIPREC session and the CUSTOM_SIPREC_SESSION_FIELD environment variable or customSIPRECSessionField JSON property is configured, it maps to that field. |
conversationID | The Watson Assistant ID. This key has a value only if the utterance is sent to Watson Assistant. |
customSIPInviteHeader | The value of the custom SIP INVITE header. For this value to be set, the header field must be defined on the CUSTOM_SIP_INVITE_HEADER environment variable, and the INVITE request must contain the specified header. Version 1.0.0.5 and later. |
customSIPInviteHeaders | The value of the custom SIP INVITE headers. For this value to be set, the header fields must be defined on the CUSTOM_SIP_INVITE_HEADERS environment variable, and the INVITE request must contain the specified headers. Version 1.0.1 and later. |
destination | The destination that the transcription is sent to. |
destinationType | The destination type defined as e164 for a telephone number, conversationID for a Watson Assistant workspace, or sipURI for a SIP URI. |
confidence | The confidence score for the utterance from the Speech to Text service. Version 1.0.0.3 and later. |
sipFromURI | SIP URI from the initial SIP INVITE From field. In a SIPREC session, a caller identifier is extracted from the SIPREC metadata. Version 1.0.0.6a and later. |
sipToURI | SIP URI from the initial SIP INVITE To field. In a SIPREC session, a caller identifier is extracted from the SIPREC metadata. Version 1.0.0.6a and later. |
speechRecognitionLatency | The amount of time in milliseconds between when silence is detected in the caller speech and when a final result from Speech to Text is received. For this value to be set, STT_LATENCY_TRACKING must be set to true. If the Media Relay fails to detect latency, this value is omitted from the transcription event. Version 1.0.0.8 and later. |
disabled | Indicates whether the text from the transcription event is excluded from the reporting event. When set to true, text is excluded from the reporting event. Set to false by default. Version 1.0.0.8 and later. |
audioURL | The audio URL that was played to the user. Version 1.0.2 and later. |
mediaURLs | A list of the media URLs that were sent or received over the SMS channel. Version 1.0.2 and later. |
sttConfig | The Speech to Text service configuration that was used for this transaction. Version 1.0.2 and later. |
bargeinOccurred | Indicates whether barge-in occurred during the playback transaction. Version 1.0.2 and later. |
dtmf | Indicates whether the input from the user is DTMF. Version 1.0.2 and later. |
sttResponse | The final response from the Speech to Text service in JSON format, including the transcript and confidence score for the top hypothesis and any alternatives. Version 1.0.2 and later. |
workspaceID | Watson Assistant workspace ID. Version 1.0.2 and later. |
assistantID | Watson Assistant ID. This key has a value only when using the Watson Assistant v2 API. Version 1.0.2 and later. |
conversationLatency | The amount of time in milliseconds between when a turn was initiated to the conversation service and when a response was received from the conversation service. Version 1.0.2 and later. |
intents | An array of intents that were recognized in the user input, sorted in descending order of confidence. Version 1.0.2 and later. |
ttsConfig | The Text to Speech service configuration that was used for the playback transaction. Version 1.0.2 and later. |
ttsLatency | The amount of time in milliseconds between when the text was sent to the Text to Speech service and when the first audio chunk was received. Version 1.0.2 and later. |
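The following Python sketch shows one way that a consumer might reduce a transcription event to the fields it needs, while honoring the disabled flag. The function name summarize_transcription and the output keys are hypothetical. Because the examples in this topic show disabled both as a boolean and as a string, the sketch accepts either form; that handling is an assumption, not documented behavior.

```python
def summarize_transcription(record):
    """Extract commonly used fields from one transcription event record."""
    body = record.get("event", {})
    disabled = body.get("disabled", False)
    if isinstance(disabled, str):
        # Accept "true"/"false" strings as well as JSON booleans.
        disabled = disabled.strip().lower() == "true"
    return {
        "session": body.get("globalSessionID"),
        # Drop the text when reporting was suppressed for this utterance.
        "text": None if disabled else body.get("transcription"),
        "confidence": body.get("confidence"),
        "latency_ms": body.get("speechRecognitionLatency"),
        # sipRecCallID has a value only in SIPREC sessions.
        "is_siprec": bool(body.get("sipRecCallID")),
    }


sample = {
    "index": "transcription",
    "event": {
        "transcription": "Good morning",
        "globalSessionID": "0dzMpLgyJ5",
        "sipRecCallID": "1232673994_41860201@115.110.121.1",
        "confidence": 0.94,
        "speechRecognitionLatency": 800,
        "disabled": "false",
    },
}
summary = summarize_transcription(sample)
print(summary)
```

Keys that are absent from an event, such as speechRecognitionLatency when latency tracking is off, come back as None rather than raising an error.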
SMS Gateway transcription events generated by Voice Gateway
When an SMS message arrives and transcription events are enabled on Voice Gateway, Voice Gateway generates a transcription event indicating the source where the transcription came from.
In the following example, the sourcetype value, sms, indicates that the transcription arrived via the SMS channel. The source value provides the telephone number that sent the SMS message. The destinationType value indicates that the message is sent to Watson Assistant.
{
  "time": 1531231769,
  "host": "9.16.25.144",
  "source": "+14445556666",
  "sourcetype": "sms",
  "index": "transcription",
  "event": {
    "transcription": "Hello.",
    "sipCallID": "AbcdEF2G~",
    "globalSessionID": "AbcdEF2G~",
    "sipToURI": "sip:watson-conversation@9.95.87.495",
    "sipFromURI": "sip:14448675309@9.40.540.440",
    "destination": "6abcde93-5393-53f9-g430-907hi2j7k936",
    "destinationType": "conversationID"
  }
}
When Voice Gateway sends a message to SMS Gateway, SMS Gateway generates a transcription event indicating that an SMS message was sent.
In the following example, the destinationType value, sms, indicates that a transcription is sent to the caller in an SMS message. The destination value provides the telephone number that Voice Gateway sends the SMS message to.
{
  "time": 1531231769,
  "host": "9.16.25.144",
  "source": "6abcde93-5393-53f9-g430-907hi2j7k936",
  "sourcetype": "conversation",
  "index": "transcription",
  "event": {
    "transcription": "Hello.",
    "sipCallID": "AbcdEF2G~",
    "globalSessionID": "AbcdEF2G~",
    "sipToURI": "sip:watson-conversation@9.95.87.495",
    "sipFromURI": "sip:14445556666@9.40.540.440",
    "destination": "+14445556666",
    "destinationType": "sms"
  }
}
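The two SMS examples differ only in where the sms value appears, so a consumer can infer the message direction from the event itself. The following Python sketch shows that check. The function name sms_direction and the labels inbound and outbound are hypothetical and are not part of the event format.

```python
def sms_direction(record):
    """Classify an event as inbound SMS, outbound SMS, or not SMS (None)."""
    if record.get("sourcetype") == "sms":
        return "inbound"   # the SMS message arrived from the caller
    if record.get("event", {}).get("destinationType") == "sms":
        return "outbound"  # the message was sent to the caller over SMS
    return None


received = {
    "source": "+14445556666",
    "sourcetype": "sms",
    "event": {"transcription": "Hello.", "destinationType": "conversationID"},
}
sent = {
    "sourcetype": "conversation",
    "event": {"destination": "+14445556666", "destinationType": "sms"},
}
print(sms_direction(received), sms_direction(sent))
```

Events from voice calls return None, so the same function can filter a mixed stream of transcription events.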