Speech to Text

Convert human voice into written word


Watson Speech to Text converts spoken audio into written text. Use it to transcribe contact center calls so you can identify what is being discussed, decide when to escalate a call, and follow content from multiple speakers. You can also use Speech to Text to build voice-controlled applications, and you can customize the model to improve accuracy for the language and content you care about most, such as product names, sensitive subjects, or names of individuals.



Standard Service


First thousand minutes per month are FREE. Additional minutes are $0.02 per minute.

Includes the ability to use wideband models for all supported languages. Also includes confidence scores per word, time offsets per word, and alternate hypotheses per phrase.

Telephony Add-on


First thousand minutes per month are FREE. Additional minutes are $0.02 per minute, in addition to the cost of using the Standard Service.

Adds the ability to use narrowband models for all supported languages. Narrowband models are required to process any audio that has passed through a telephone line, because telephone lines down-sample audio to 8 kHz.
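
As a rough illustration of how the two price lists above combine, the sketch below estimates a monthly bill. It rests on assumptions the page does not spell out: that the Standard Service and the Telephony Add-on each have their own free allowance of one thousand minutes, and that telephony minutes are also billed as Standard Service minutes. The function names are hypothetical and are not part of any IBM API.

```python
from decimal import Decimal

FREE_MINUTES = 1000                 # free minutes per month, as listed above
RATE_PER_MINUTE = Decimal("0.02")   # USD per additional minute, as listed above


def billable_cost(minutes: int) -> Decimal:
    """Cost of one metered item after its free monthly allowance."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return max(minutes - FREE_MINUTES, 0) * RATE_PER_MINUTE


def monthly_cost(total_minutes: int, telephony_minutes: int = 0) -> Decimal:
    """Estimated monthly cost in USD.

    Assumption: every minute counts toward the Standard Service, and
    telephony (narrowband) minutes are additionally charged under the
    add-on, with a separate free allowance for each.
    """
    if telephony_minutes > total_minutes:
        raise ValueError("telephony minutes cannot exceed total minutes")
    return billable_cost(total_minutes) + billable_cost(telephony_minutes)


if __name__ == "__main__":
    # 5,000 minutes in total, 3,000 of them from telephone audio
    print(monthly_cost(5000, 3000))
```

Under these assumptions, 5,000 minutes of audio of which 3,000 came from phone calls would cost 4,000 × $0.02 for the Standard Service plus 2,000 × $0.02 for the add-on. Actual billing may differ, so confirm the details with sales.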


Let's talk

Watson Premium plans offer a higher level of security and isolation to help customers with sensitive data requirements.


More questions about pricing? Talk to sales

Ready to use?


Bluemix is IBM’s cloud platform where you can access the Watson services.

Follow the getting started guide to use Watson in five steps.



Ready to get down to the details? Each Watson service has full documentation that explains how to get started with the service in Bluemix.



Localized versions of Watson services (Natural Language Classifier, Retrieve and Rank, Speech to Text, Text to Speech) are available in the following places.

Japanese - SoftBank