IBM Watson Speech to Text
Convert speech into text using AI-powered speech recognition and transcription
What is IBM Watson Speech to Text?

IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including customer self-service, agent assistance and speech analytics. Get started quickly with our advanced machine learning models out of the box, or customize them for your use case.

IBM Watson Speech to Text is now available as a containerized library for IBM partners to embed AI technology in their commercial applications.

Benefits

More accurate AI

Our best-in-class AI, embedded within Watson Speech to Text, truly understands your customers.

Customizable for your business

Train Watson Speech to Text on your unique domain language and specific audio characteristics.

Protects your data

Enjoy the security of IBM’s world-class data governance practices.

Truly runs anywhere

Built to support global languages and deployable on any cloud — public, private, hybrid, multicloud, or on-premises.

Feature highlights

What sets Watson Speech to Text apart?

Automatic speech recognition

Enable your voice applications using neural technologies for speech recognition powered by IBM Watson.


Model training options

Improve speech recognition accuracy for your use case with language and acoustic training options.

Optimized for customer care

Activate your voice application with speech models tuned for the customer care domain.

Pre-trained speech models

Activate your voice application with speech models tuned for the customer care domain.

Fine-tuning features

Improve speech recognition accuracy for extracting phrases, words, letters, numbers or lists.

Low latency transcription

Use our models optimized for low latency in real-time speech applications.

Audio diagnostics before transcription

Analyze and correct weak audio signals before transcription begins.

Interim transcription before final results

Improve application response times by using speech transcription as it is generated and throughout the finalization process.
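As a rough illustration of how an application might consume interim results, the sketch below keeps the latest interim hypothesis for display and commits text only when a result is marked final. The message shape (a `results` list whose items carry a `final` flag and an `alternatives` list with a `transcript`) is an assumption based on the service's JSON responses. `consume_results` and the sample messages are hypothetical, and no network call is made.

```python
from typing import Iterable, List, Tuple


def consume_results(messages: Iterable[dict]) -> Tuple[List[str], str]:
    """Return (finalized segments, latest interim hypothesis)."""
    finalized: List[str] = []
    interim = ""
    for message in messages:
        for result in message.get("results", []):
            alternatives = result.get("alternatives", [])
            if not alternatives:
                continue
            # Use the top-ranked alternative only.
            text = alternatives[0].get("transcript", "").strip()
            if result.get("final", False):
                # A final result replaces whatever interim text was shown.
                finalized.append(text)
                interim = ""
            else:
                # Interim results overwrite each other until finalized.
                interim = text
    return finalized, interim


# Hypothetical messages, shaped like a streaming session's JSON output.
sample_messages = [
    {"results": [{"final": False, "alternatives": [{"transcript": "I would "}]}]},
    {"results": [{"final": False, "alternatives": [{"transcript": "I would like to "}]}]},
    {"results": [{"final": True, "alternatives": [{"transcript": "I would like to check my balance "}]}]},
    {"results": [{"final": False, "alternatives": [{"transcript": "and also "}]}]},
]

finalized, interim = consume_results(sample_messages)
print(finalized, repr(interim))
```

An application that acts on the interim text (for example, to pre-fetch search results) should be prepared for the final transcript to differ from it.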

Smart formatting

Convert dates, times, numbers, currency values, and email and website addresses into their conventional written forms in your final transcripts.

Speaker diarization

Recognize who said what in a multi-participant voice exchange. Speaker diarization is currently optimized for two-way call center conversations but can detect up to six different speakers.
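To make "who said what" concrete, here is a minimal sketch that turns a diarized response into speaker turns. It assumes the response pairs word `timestamps` (`[word, start, end]`) with a `speaker_labels` list whose entries carry `from`, `to` and `speaker` fields. The helper `speaker_turns` and the sample data are hypothetical and stand in for a real service response.

```python
from typing import List, Tuple


def speaker_turns(response: dict) -> List[Tuple[int, str]]:
    """Group consecutive words by speaker into (speaker, text) turns."""
    # Index speaker labels by word start time. This relies on the label and
    # the word timestamp carrying the same start value (an assumption here).
    speaker_at = {
        label["from"]: label["speaker"]
        for label in response.get("speaker_labels", [])
    }
    turns: List[Tuple[int, List[str]]] = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        for word, start, _end in alternatives[0].get("timestamps", []):
            speaker = speaker_at.get(start)
            if speaker is None:
                continue  # No label for this word; skip it.
            if turns and turns[-1][0] == speaker:
                turns[-1][1].append(word)
            else:
                turns.append((speaker, [word]))
    return [(speaker, " ".join(words)) for speaker, words in turns]


# Hypothetical two-speaker exchange.
words = [
    ["hello", 0.0, 0.4], ["how", 0.5, 0.7], ["can", 0.7, 0.9],
    ["I", 0.9, 1.0], ["help", 1.0, 1.3],
    ["my", 1.8, 2.0], ["card", 2.0, 2.4], ["is", 2.4, 2.5], ["lost", 2.5, 2.9],
]
sample_response = {
    "results": [{"final": True, "alternatives": [{"timestamps": words}]}],
    "speaker_labels": [
        {"from": start, "to": end, "speaker": 0 if index < 5 else 1}
        for index, (_word, start, end) in enumerate(words)
    ],
}

turns = speaker_turns(sample_response)
for speaker, text in turns:
    print(f"Speaker {speaker}: {text}")
```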

Word spotting and filtering

Filter for specific words or inappropriate content by using our keyword spotting and profanity filtering features. (US English only)
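The sketch below shows one way an application might post-process keyword spotting output. It assumes each result carries a `keywords_result` mapping from keyword to a list of matches with `start_time`, `end_time` and `confidence` fields. The function `keyword_hits`, the threshold and the sample data are hypothetical.

```python
from typing import List, Tuple


def keyword_hits(response: dict, min_confidence: float = 0.5) -> List[Tuple[float, str, float]]:
    """Return (start_time, keyword, confidence) tuples sorted by start time."""
    hits = []
    for result in response.get("results", []):
        for keyword, matches in result.get("keywords_result", {}).items():
            for match in matches:
                # Drop matches the recognizer was not confident about.
                if match["confidence"] >= min_confidence:
                    hits.append((match["start_time"], keyword, match["confidence"]))
    return sorted(hits)


# Hypothetical response fragment for the keywords "refund" and "cancel".
sample_response = {
    "results": [
        {
            "final": True,
            "keywords_result": {
                "refund": [
                    {"start_time": 3.1, "end_time": 3.6, "confidence": 0.93},
                    {"start_time": 9.4, "end_time": 9.9, "confidence": 0.41},
                ],
                "cancel": [
                    {"start_time": 6.0, "end_time": 6.5, "confidence": 0.77},
                ],
            },
        }
    ]
}

hits = keyword_hits(sample_response)
for start, keyword, confidence in hits:
    print(f"{start:5.1f}s  {keyword}  ({confidence:.2f})")
```

Raising `min_confidence` trades missed mentions for fewer false alarms, which matters when hits trigger alerts or compliance reviews.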

Use cases

Customer self-service

Answer common call center queries using a Watson-powered virtual assistant on the phone.

Call analytics

Improve call center performance by mining conversation logs to quickly and accurately identify emerging call patterns, customer complaints, sentiment, non-compliant behavior and more.

Agent assist

Boost agent productivity and success with real-time assistance during calls using AI-powered document and intranet search. As the agent speaks with a customer, Watson listens to the conversation, transcribes the audio, searches the documentation for relevant content and feeds the answer back to the agent within seconds.

Interactive demo

Experience the difference

Explore the powerful capabilities of advanced AI, neural voices and voice customization in our interactive demo.
Partner with IBM

Accelerate your business growth as an Independent Software Vendor (ISV) by innovating with IBM. Partner with us to deliver enhanced commercial solutions embedded with AI to better address clients’ needs.


Ways to buy

Get started for free or view a demo



500 minutes of free speech recognition a month and 38 pre-trained speech models.

Plus

As low as USD 0.01 per minute

Tune your speech models to improve accuracy in recognition as well as transcription. The Plus version includes unlimited minutes per month and 100 concurrent transcriptions.

Premium

Contact us for pricing

Provides large and security-sensitive firms with more capacity and data protection. Premium includes unlimited minutes per month and unlimited concurrent transcriptions.

Deploy Anywhere

Contact us for pricing

Deploy behind your firewall or on any cloud with the flexibility of IBM Cloud Pak for Data. The Deploy Anywhere version includes unlimited minutes per month and unlimited concurrent transcriptions, along with noise detection, speech customization and data isolation. 

Resources

API reference

Technical API specifications for all of your development needs.

Download SDKs

The Watson SDK repository in GitHub.

Data privacy and security

See documentation about our enhanced security features that ensure your data is isolated and encrypted end-to-end, while in transit and at rest.

Build custom speech recognition models within minutes

Learn how to create custom speech models using IBM Watson quickly — without knowing how to code.

How to train your own speech “dragon”

Read about Watson Speech to Text requirements, the methodology and some best practices inspired by actual clients.

Replacing my old IVR system with IBM Watson

Guidelines on how to add a new or existing virtual assistant to your brand-new Watson IVR.

Related products

Watson Text to Speech

Improve customer engagement by interacting with users in their own language using any written text.

watsonx Assistant

Solve customer issues the first time using an AI virtual assistant across any application, device, or channel.

Watson Speech Libraries for Embed

Infuse powerful natural language AI into commercial applications with a containerized library designed to empower IBM partners with greater flexibility.

Take the next step

See Watson Speech to Text capabilities in action.
