Watson Speech to Text converts spoken audio into written text. Use Speech to Text to transcribe contact center calls so you can identify what is being discussed, decide when to escalate a call, and follow content from multiple speakers. You can also use Speech to Text to create voice-controlled applications, and customize the model to improve accuracy for the language and content you care about most, such as product names, sensitive subjects, or names of individuals.
The Speech to Text service can be used anywhere voice interactivity is needed. In addition to transcribing audio in multiple languages, the service can detect specific keywords or key phrases in the input stream. Common uses for the Speech to Text service include:
Interactions in mobile experiences
Transcribing media files
Call center transcriptions
Voice control of embedded systems
Converting sound to text to make data searchable
Streamed audio with intelligible speech
Recorded audio with intelligible speech
Text transcriptions of the audio with recognized words
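To illustrate how a transcription makes audio searchable, here is a minimal local sketch in Python. It is not the service's own keyword detection. The transcript shape (word, start time, end time) and the find_phrase helper are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical transcript shape: (word, start_seconds, end_seconds) triples.
Word = Tuple[str, float, float]


def find_phrase(words: List[Word], phrase: str) -> List[Tuple[float, float]]:
    """Return the (start, end) times of every occurrence of phrase."""
    target = phrase.lower().split()
    if not target:
        return []
    tokens = [word.lower() for word, _, _ in words]
    hits = []
    # Slide a window the length of the phrase across the transcript.
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            start = words[i][1]
            end = words[i + len(target) - 1][2]
            hits.append((start, end))
    return hits


# Made-up sample data, for illustration only.
transcript = [
    ("please", 0.0, 0.4), ("cancel", 0.4, 0.9), ("my", 0.9, 1.0),
    ("account", 1.0, 1.6), ("today", 1.6, 2.1),
]
matches = find_phrase(transcript, "cancel my account")
```

Each match gives the time span in the audio where the phrase was spoken, which a call-review tool could use to jump to that moment.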
Try the Speech to Text demo: choose pre-recorded audio, upload a WAV file, or record on the fly in US English, UK English, Japanese, Spanish, Brazilian Portuguese, Modern Standard Arabic, or Mandarin, and watch the service in action. The API returns metadata that includes timestamps, confidence scores, and alternative hypotheses. The demo also includes options to help Watson learn and improve.
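The sketch below shows one way to read that metadata in Python. The response layout and field names (results, alternatives, transcript, confidence, timestamps) are assumptions modeled on a typical JSON response, and the values are made up. Check the API reference for the exact schema.

```python
from typing import Any, Dict, List

# Illustrative response only: field names are assumptions, values are made up.
sample_response: Dict[str, Any] = {
    "results": [
        {
            "final": True,
            "alternatives": [
                {
                    "transcript": "hello world",
                    "confidence": 0.94,
                    "timestamps": [["hello", 0.0, 0.5], ["world", 0.6, 1.1]],
                },
                {"transcript": "hello word"},
            ],
        }
    ]
}


def summarize(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect the best transcript, its word timings, and other hypotheses."""
    summaries = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        best = alternatives[0]  # assumed to be the top-ranked hypothesis
        summaries.append({
            "transcript": best.get("transcript", ""),
            "confidence": best.get("confidence"),
            "words": [
                {"word": word, "start": start, "end": end}
                for word, start, end in best.get("timestamps", [])
            ],
            "other_hypotheses": [
                alt.get("transcript", "") for alt in alternatives[1:]
            ],
        })
    return summaries


summary = summarize(sample_response)
```

Using .get with defaults keeps the helper tolerant of responses that omit optional metadata.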
The first thousand minutes per month are FREE. Additional minutes are $0.02 per minute.
Includes the ability to use wideband models for all supported languages. Also includes confidence scores per word, time offsets per word, and alternate hypotheses per phrase.
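As a worked example of the Standard pricing above, the following Python sketch estimates a monthly charge. The function name is hypothetical, and rounding to whole cents is an assumption. Actual billing rules may differ.

```python
FREE_MINUTES = 1000      # first thousand minutes per month are free
RATE_PER_MINUTE = 0.02   # US dollars per additional minute


def standard_monthly_cost(minutes: float) -> float:
    """Estimate the monthly Standard charge for a number of audio minutes."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    # Only minutes beyond the free allowance are billed.
    billable = max(0.0, minutes - FREE_MINUTES)
    return round(billable * RATE_PER_MINUTE, 2)


light_usage = standard_monthly_cost(800)    # within the free allowance
heavy_usage = standard_monthly_cost(1500)   # 500 billable minutes
```

With these numbers, 800 minutes costs nothing and 1,500 minutes costs $10.00, since 500 minutes are billed at $0.02 each.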
The first thousand minutes per month are FREE. Additional minutes are $0.02 per minute, in addition to the cost of using the Standard Service.
Adds the ability to use narrowband models for all supported languages. Narrowband models are required to process any audio that has passed through a telephone line, because telephone lines down-sample audio to 8 kHz.
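The rule above can be sketched as a small Python helper that picks a model band for a recording. The choose_band function and the 16 kHz cutoff for wideband audio are assumptions made for this example. Only the 8 kHz telephone figure comes from the text.

```python
WIDEBAND_MIN_HZ = 16000  # assumed cutoff; not stated in the text above


def choose_band(sample_rate_hz: int, via_telephone: bool = False) -> str:
    """Return 'narrowband' or 'wideband' for a given audio source."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    # Telephone audio is down-sampled to 8 kHz, so it needs a narrowband model
    # even if the file was later re-saved at a higher sample rate.
    if via_telephone or sample_rate_hz < WIDEBAND_MIN_HZ:
        return "narrowband"
    return "wideband"


call_recording = choose_band(8000)
studio_recording = choose_band(44100)
```

The via_telephone flag exists because a file's sample rate alone does not reveal whether the audio once passed through a telephone line.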