Advanced speech recognition and formatting

Everything you need to get started, and the capabilities to continuously improve.

Pre-trained speech models

Activate your voice application with speech models tuned for the customer care domain.

Model training options

Improve speech recognition accuracy for your use case with language and acoustic training options.

Fine-tuning features

Improve speech recognition accuracy for extracting phrases, words, letters, numbers or lists.

Low latency transcription

Use our models optimized for low latency in real-time speech applications.

Audio diagnostics before transcription

Analyze and correct weak audio signals before transcription begins.

Interim transcription before final results

Improve application response times by consuming the transcript as it is generated, without waiting for results to be finalized.
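As an illustration of how an application might consume interim results, here is a minimal sketch. It assumes each recognition message carries a "results" list whose entries have a "final" flag and an "alternatives" list with a "transcript" field. That shape, the `apply_update` helper, and the sample messages are assumptions made for this example, not a definitive client for the service.

```python
from typing import Dict, List


def apply_update(finalized: List[str], message: Dict) -> str:
    """Fold one recognition message into the text shown to the user.

    Final transcripts are appended to `finalized`. An interim hypothesis
    is displayed after them but not stored, so the next message can
    replace it.
    """
    interim = ""
    for result in message.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        text = alternatives[0]["transcript"].strip()
        if result.get("final", False):
            finalized.append(text)
        else:
            interim = text
    return " ".join(finalized + ([interim] if interim else []))


# Hypothetical message sequence for a single utterance (assumed shape).
messages = [
    {"results": [{"final": False,
                  "alternatives": [{"transcript": "i would like "}]}]},
    {"results": [{"final": False,
                  "alternatives": [{"transcript": "i would like to check "}]}]},
    {"results": [{"final": True,
                  "alternatives": [{"transcript": "I would like to check my balance. "}]}]},
]

finalized: List[str] = []
displays = [apply_update(finalized, m) for m in messages]
```

Each entry in `displays` is what the user interface would show after the corresponding message, so the application can react to the first words of an utterance instead of waiting for the final result.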

Smart formatting

Convert dates, times, numbers, currency values, and email and website addresses in your final transcripts into their conventional written forms.

Speaker diarization

Recognize who said what in a multi-participant voice exchange. Diarization is currently optimized for two-way call center conversations but can detect up to 6 different speakers.
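To show what "who said what" can look like in application code, the sketch below merges word timings with per-word speaker labels into speaker turns. It assumes word timestamps arrive as `[word, start, end]` triples and speaker labels as records with "from", "to", and "speaker" fields. That layout, the `words_by_speaker` helper, and the sample data are assumptions for illustration only.

```python
from typing import Dict, List, Tuple


def words_by_speaker(timestamps: List[list],
                     speaker_labels: List[Dict]) -> List[Tuple[int, str]]:
    """Group consecutive words spoken by the same speaker into turns."""
    # Look up the speaker for a word by its (start, end) time span.
    speaker_at = {(lab["from"], lab["to"]): lab["speaker"]
                  for lab in speaker_labels}
    turns: List[Tuple[int, List[str]]] = []
    for word, start, end in timestamps:
        speaker = speaker_at.get((start, end))
        if speaker is None:
            continue  # no label for this word; skip it
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(word)
        else:
            turns.append((speaker, [word]))
    return [(speaker, " ".join(words)) for speaker, words in turns]


# Hypothetical two-party call: an agent (speaker 0) and a caller (speaker 1).
timestamps = [
    ["hello", 0.0, 0.4], ["how", 0.5, 0.7], ["can", 0.7, 0.9],
    ["i", 0.9, 1.0], ["help", 1.0, 1.3],
    ["my", 2.0, 2.2], ["card", 2.2, 2.6], ["is", 2.6, 2.7], ["lost", 2.7, 3.1],
]
# Sample labels: words ending by 1.3 s belong to speaker 0, the rest to speaker 1.
speaker_labels = [
    {"from": start, "to": end, "speaker": 0 if end <= 1.3 else 1}
    for _, start, end in timestamps
]

turns = words_by_speaker(timestamps, speaker_labels)
```

Matching on exact time spans keeps the sketch short. A production client would more likely match by tolerance or by position, since timings from separate fields are not guaranteed to be identical floating-point values.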

Word spotting and filtering

Filter for specific words or inappropriate content by using our keyword spotting and profanity filtering features. (US English only)

Get started now with Watson Speech to Text