What is Watson Speech to Text?

The Speech to Text service provides an API to add speech transcription capabilities to applications. It combines information about language structure with the composition of the audio signal.
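As a rough illustration of what calling such an API over HTTP might look like, the sketch below builds (but does not send) a transcription request using only the Python standard library. The `/v1/recognize` path, the `apikey` basic-auth username, and the `audio/flac` content type reflect our understanding of the public REST interface and should be checked against the current API reference. The service URL and key are placeholders, and `build_recognize_request` is a hypothetical helper, not part of any IBM SDK.

```python
import base64
import urllib.request


def build_recognize_request(service_url, api_key, audio_bytes, content_type="audio/flac"):
    """Build (but do not send) a POST request for speech recognition.

    Assumptions: the endpoint path is /v1/recognize and authentication is
    HTTP basic auth with the literal username "apikey".
    """
    token = base64.b64encode(f"apikey:{api_key}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=service_url.rstrip("/") + "/v1/recognize",
        data=audio_bytes,  # raw audio payload
        method="POST",
        headers={
            "Content-Type": content_type,
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder values; substitute the URL and key from your own service instance.
request = build_recognize_request(
    "https://example.invalid/instances/demo", "PLACEHOLDER_KEY", b"\x00\x01"
)
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without credentials.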

Watson Assistant

IBM’s AI product that lets you build, train, and deploy conversational interactions into any application, device or channel.

Watson Speech to Text features

Powerful real-time speech recognition

Automatically transcribe audio in seven languages in real time. Rapidly identify and transcribe what is being discussed, even in lower-quality audio, across a variety of audio formats and programming interfaces (HTTP REST, WebSocket, and asynchronous HTTP).
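For the WebSocket interface, a real-time session is typically framed by small JSON control messages, with binary audio frames sent in between. The sketch below only constructs those messages. The field names (`action`, `content-type`, `interim_results`) are our recollection of the public WebSocket interface and are assumptions to verify against the current documentation. `start_message` and `stop_message` are hypothetical helper names.

```python
import json


def start_message(content_type="audio/l16;rate=16000", interim_results=True):
    """JSON text frame that opens a recognition request (assumed field names)."""
    return json.dumps(
        {
            "action": "start",
            "content-type": content_type,
            "interim_results": interim_results,  # ask for partial hypotheses while audio streams
        }
    )


def stop_message():
    """JSON text frame that signals the end of the audio stream (assumed field names)."""
    return json.dumps({"action": "stop"})
```

A client would send `start_message()` as a text frame, stream audio chunks as binary frames, and finish with `stop_message()`. The WebSocket connection itself needs a third-party client library and is omitted here.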

Highly accurate speech engine

Customize your model to improve accuracy for the language and content you care most about, such as product names, sensitive subjects, or names of individuals. Recognize different speakers in your audio and spot specified keywords in real time with high accuracy and confidence.
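Keyword spotting, speaker recognition, and custom models are usually switched on through request parameters. The sketch below assembles such a query string and pulls transcripts out of a response. The parameter names (`keywords`, `keywords_threshold`, `speaker_labels`, `language_customization_id`) and the response layout (`results` containing `alternatives` with a `transcript`) are assumptions based on our understanding of the public API. Both helper functions are hypothetical, and the sample response is invented purely to exercise the parser.

```python
from urllib.parse import urlencode


def build_query(keywords=None, keywords_threshold=0.5, speaker_labels=False,
                language_customization_id=None):
    """Assemble a query string for optional recognition features (assumed names)."""
    params = {}
    if keywords:
        params["keywords"] = ",".join(keywords)
        params["keywords_threshold"] = keywords_threshold  # minimum confidence for a keyword match
    if speaker_labels:
        params["speaker_labels"] = "true"
    if language_customization_id:
        params["language_customization_id"] = language_customization_id
    return urlencode(params)


def extract_transcripts(response):
    """Return the top transcript of each result in an assumed response layout."""
    return [
        result["alternatives"][0]["transcript"].strip()
        for result in response.get("results", [])
        if result.get("alternatives")
    ]


# Invented sample data, used only to illustrate the parsing step.
sample_response = {
    "results": [
        {"alternatives": [{"transcript": "I would like a refund ", "confidence": 0.91}]},
        {"alternatives": []},
    ]
}
transcripts = extract_transcripts(sample_response)
```

Results with no alternatives are skipped instead of raising an error, which is a design choice for this sketch and not documented service behavior.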

Built to support various use cases

Transcribe audio for use cases ranging from real-time transcription of microphone input to analysis of thousands of call center recordings for meaningful analytics.

Get started with Watson Speech to Text in minutes