Watson Speech services
Version: 5.2.2
Experience: API
Description
The Watson Speech services offer speech recognition and speech synthesis capabilities for your applications:
- Watson Speech to Text transcribes spoken audio into written text. The service leverages machine learning to combine knowledge of grammar, language
structure, and the composition of audio and voice signals to accurately transcribe the human voice.
It continuously updates and refines its transcription as it receives more speech audio. The service
is ideal for applications that need to extract high-quality speech transcripts for use cases such as
call centers, customer care, agent assistance, and similar solutions.
For more information about the service, see About Watson Speech to Text.
- Watson Text to Speech synthesizes natural-sounding speech from written text. The service streams the results back to the client with
minimal delay. The service is appropriate for voice-driven and screenless applications, where audio
is the preferred method of output.
For more information about the service, see About Watson Text to Speech.
You can customize the Watson Speech services to suit your language and application needs. Both services offer HTTP and WebSocket programming interfaces that make them suitable for any application that produces or accepts audio.
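As a rough illustration of the HTTP interface, the following Python sketch builds (but does not send) one request for each service. It rests on several assumptions that this page does not state: the endpoint paths `/v1/recognize` and `/v1/synthesize` come from the public Watson API references, bearer-token authentication is assumed for the deployment, and the base URLs, the token, and the helper names `build_recognize_request` and `build_synthesize_request` are hypothetical placeholders. Check the API reference for your installation before relying on any of these details.

```python
import json
import urllib.request


def build_recognize_request(base_url: str, token: str, audio: bytes,
                            content_type: str = "audio/wav") -> urllib.request.Request:
    """Build a POST request that submits audio to Speech to Text.

    Assumption: the service exposes /v1/recognize and accepts a bearer token.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/recognize",
        data=audio,  # raw audio bytes form the request body
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
        },
    )


def build_synthesize_request(base_url: str, token: str, text: str,
                             accept: str = "audio/wav") -> urllib.request.Request:
    """Build a POST request that asks Text to Speech to synthesize `text`.

    Assumption: the service exposes /v1/synthesize and accepts a bearer token.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/synthesize",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": accept,  # requested audio format of the response
        },
    )


# Hypothetical usage; the URL and token below are placeholders, not real values:
# req = build_recognize_request("https://<host>/<stt-instance-path>", "<token>", audio_bytes)
# with urllib.request.urlopen(req) as resp:
#     transcript = json.load(resp)
```

Building the request separately from sending it keeps the sketch free of network calls and makes the URL, headers, and body easy to inspect before anything reaches the service. The WebSocket interface is not shown here.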
Licensing information
This service is included in the IBM Watson® Speech Services Cartridge license. For more information, see Licenses and entitlements.
Integrated services
| Service | Capability |
|---|---|
| Watson Assistant for Voice Interaction | Enable direct voice interactions over a telephone with a cognitive self-service agent or transcribe phone calls between a caller and agent. |
| watsonx Assistant | Build your own branded assistant into any device, application, or channel. Users interact with your application through the user interface that you implement. |