
copyright: years: 2017, 2023 lastupdated: "2023-01-05"


Integrating third-party speech services

IBM® Voice Gateway supports speech adapters, which let you integrate third-party speech recognition (speech-to-text) and speech synthesis (text-to-speech) services in place of the IBM® Speech to Text and IBM® Text to Speech services. The adapters are separate Docker containers that you deploy together with Voice Gateway. They act as proxies that sit between Voice Gateway and the third-party speech service.

Voice Gateway provides the following options for integrating third-party speech services:

  • Voice Gateway Speech to Text Adapter: The adapter currently supports the Google Cloud Speech API for speech recognition. Using the Google Cloud Speech API adds French, German, and Italian as languages for self-service agents. Available in Version 1.0.0.5 and later.
  • Custom speech adapters: To use a different speech recognition or speech synthesis service, you can create your own speech adapter. To get started, use the speech adapter samples. Available in Version 1.0.0.5 and later.
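
To illustrate the proxy role that a custom speech adapter plays, the following is a minimal conceptual sketch in Python. It is not the Voice Gateway adapter interface or the speech adapter samples. The names `SpeechAdapter`, `Transcript`, `on_audio`, `on_end_of_utterance`, and `fake_recognizer` are hypothetical and exist only for this illustration. The sketch assumes that the adapter receives audio in chunks, forwards the buffered audio to a third-party recognizer, and returns the resulting text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Transcript:
    """Hypothetical result type returned to the caller."""
    text: str
    final: bool


# Hypothetical stand-in for a third-party recognizer client:
# it takes raw audio bytes and returns the recognized text.
Recognizer = Callable[[bytes], str]


class SpeechAdapter:
    """Illustrative proxy between a caller and a speech recognition backend."""

    def __init__(self, recognize: Recognizer) -> None:
        self._recognize = recognize
        self._buffer = bytearray()

    def on_audio(self, chunk: bytes) -> None:
        # Accumulate incoming audio until the utterance ends.
        self._buffer.extend(chunk)

    def on_end_of_utterance(self) -> Transcript:
        # Forward the buffered audio to the backend and reset the buffer.
        audio = bytes(self._buffer)
        self._buffer.clear()
        return Transcript(text=self._recognize(audio), final=True)


def fake_recognizer(audio: bytes) -> str:
    # Placeholder backend so that the sketch runs without a real service.
    return f"<{len(audio)} bytes of audio>"


adapter = SpeechAdapter(fake_recognizer)
for chunk in (b"\x00" * 160, b"\x00" * 160):
    adapter.on_audio(chunk)
result = adapter.on_end_of_utterance()
print(result.text)
```

In a real adapter, the placeholder recognizer would be replaced by a client for the chosen speech service, and the adapter would be packaged as its own Docker container, as described earlier.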

Speech to Text Adapter

Deploying the Speech to Text Adapter

The Voice Gateway Speech to Text Adapter is packaged as a separate Docker image that you configure and deploy along with the core SIP Orchestrator and Media Relay images. Before you deploy the Speech to Text Adapter, deploy a basic Voice Gateway instance as described in Getting started with Voice Gateway. Then, learn how to add the Speech to Text Adapter to your deployment in the following pages:

Configuring the Speech to Text Adapter

To set up the Speech to Text Adapter, you can define the following configurations.

Learn more about topics related to configuring the Speech to Text Adapter: