Watson Speech to Text Library for Embed home

The Watson Speech to Text Library for Embed transcribes spoken audio into written text. The service uses machine learning to combine knowledge of grammar, language structure, and the composition of audio and voice signals to accurately transcribe the human voice. It continuously updates and refines its transcription as it receives more speech audio. The service is well suited to applications that need high-quality speech transcripts, for use cases such as call centers, customer care, agent assistance, and similar solutions.

You can customize the Watson Speech to Text service to suit your language and application needs. The service offers HTTP and WebSocket programming interfaces that make it suitable for any application that produces or accepts audio.
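As a rough illustration of the HTTP interface, the following Python sketch builds a speech recognition request using only the standard library. It rests on several assumptions that do not come from this page: that the service runtime is reachable at http://localhost:1080, that the recognition endpoint is /speech-to-text/api/v1/recognize, and that a model named en-US_Multimedia is installed. The helper names build_recognize_request and transcribe are hypothetical. Check the API Docs and Model Catalog for the values that apply to your deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_recognize_request(
    audio: bytes,
    base_url: str = "http://localhost:1080",  # assumed address of a local runtime
    model: str = "en-US_Multimedia",          # assumed model name; see the Model Catalog
    content_type: str = "audio/wav",
) -> Request:
    """Build (but do not send) an HTTP POST carrying raw audio bytes."""
    query = urlencode({"model": model})
    # Assumed endpoint path; confirm it against the API Docs.
    url = f"{base_url.rstrip('/')}/speech-to-text/api/v1/recognize?{query}"
    return Request(
        url,
        data=audio,
        headers={"Content-Type": content_type},
        method="POST",
    )


def transcribe(audio: bytes, **kwargs) -> dict:
    """Send the request and decode the JSON body of the response."""
    with urlopen(build_recognize_request(audio, **kwargs)) as response:
        return json.load(response)
```

With a running service, a call might look like transcribe(open("sample.wav", "rb").read()), where sample.wav is a placeholder file name. Keeping request construction separate from sending makes it possible to inspect the URL and headers without a live service.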

The service also includes a tool or other type of interface that exposes APIs you can run in notebooks.

Quick links

Use: Work with the service
Model Catalog: Pretrained model images
API Docs: Write code and build applications
Release notes: See what's changed in each release