What is Watson Speech to Text?

The Speech to Text service provides an API for adding speech transcription capabilities to applications. To produce a transcript, it combines information about language structure with the composition of the audio signal.

Watson Speech to Text features

Powerful real-time speech recognition

Automatically transcribe audio in 7 languages in real time. Rapidly identify and transcribe what is being discussed, even from lower-quality audio, across a variety of audio formats and programming interfaces (HTTP REST, WebSocket, and asynchronous HTTP).
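As a rough illustration of the HTTP REST interface, the sketch below builds (but does not send) a recognition request using only the Python standard library. It assumes the `/v1/recognize` endpoint, basic authentication with the user name `apikey`, and an optional `model` query parameter; `SERVICE_URL` and the helper name `build_recognize_request` are hypothetical placeholders, so check the real values against your own service credentials and the API reference.

```python
import base64
import urllib.parse
import urllib.request

# Hypothetical placeholder: the real URL comes from your service credentials.
SERVICE_URL = "https://example.invalid/speech-to-text"


def build_recognize_request(audio, api_key, content_type="audio/flac", model=None):
    """Build a POST request for the recognize endpoint without sending it.

    Assumes basic auth with the literal user name "apikey" and the audio
    bytes as the request body. `model` is an optional language model name.
    """
    url = SERVICE_URL + "/v1/recognize"
    if model:
        url += "?" + urllib.parse.urlencode({"model": model})

    # Encode "apikey:<key>" as an HTTP Basic authorization token.
    token = base64.b64encode(("apikey:" + api_key).encode("utf-8")).decode("ascii")

    return urllib.request.Request(
        url,
        data=audio,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Authorization": "Basic " + token,
        },
    )
```

To actually submit audio, you would read a file in binary mode, pass the bytes to `build_recognize_request`, and hand the result to `urllib.request.urlopen`. Keeping request construction separate from sending makes it easy to inspect the URL and headers before any audio leaves your machine.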

Highly accurate speech engine

Customize your model to improve accuracy for the language and content you care most about, such as product names, sensitive subjects, or names of individuals. Recognize different speakers in your audio and spot specified keywords in real time with high accuracy and confidence.
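To make speaker recognition and keyword spotting more concrete, here is a minimal sketch that summarizes a recognition response. The `SAMPLE_RESPONSE` below is hand-written for illustration, not real service output, and the field names (`results`, `alternatives`, `keywords_result`, `speaker_labels`) are assumed to match the service's JSON response; the helper `summarize` is hypothetical.

```python
# Hand-written sample for illustration only; not actual service output.
SAMPLE_RESPONSE = {
    "results": [
        {
            "final": True,
            "alternatives": [
                {"transcript": "thank you for calling acme support ", "confidence": 0.94}
            ],
            "keywords_result": {
                "acme": [
                    {"normalized_text": "acme", "start_time": 1.2,
                     "end_time": 1.6, "confidence": 0.91}
                ]
            },
        },
        {
            "final": True,
            "alternatives": [
                {"transcript": "hi i have a billing question ", "confidence": 0.89}
            ],
        },
    ],
    "speaker_labels": [
        {"from": 0.0, "to": 2.1, "speaker": 0, "confidence": 0.7},
        {"from": 2.4, "to": 4.0, "speaker": 1, "confidence": 0.6},
    ],
}


def summarize(response):
    """Collect the transcript, spotted keywords, and distinct speakers."""
    results = response.get("results", [])

    # Join the top alternative of each result into one transcript.
    transcript = " ".join(
        r["alternatives"][0]["transcript"].strip()
        for r in results
        if r.get("alternatives")
    )

    # Map each spotted keyword to the confidence of every occurrence.
    keywords = {}
    for r in results:
        for keyword, hits in r.get("keywords_result", {}).items():
            keywords.setdefault(keyword, []).extend(h["confidence"] for h in hits)

    # Distinct speaker identifiers, in ascending order.
    speakers = sorted({label["speaker"] for label in response.get("speaker_labels", [])})

    return {"transcript": transcript, "keywords": keywords, "speakers": speakers}
```

Calling `summarize(SAMPLE_RESPONSE)` yields a single joined transcript, a keyword-to-confidence map, and the list of speaker identifiers, which is a reasonable starting shape for downstream analytics.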

Built to support various use cases

Transcribe audio for use cases ranging from real-time transcription of microphone input to analyzing thousands of call center recordings for meaningful analytics.

Get started on Watson Speech to Text in minutes

Get Started with Watson Speech to Text