
Watson Speech-to-Text is paying attention to what people are saying (even when you are not)


After a conference call, have you ever had someone say, “The explanation on that call was great. Did anyone write it down?”

Many conference calls, call center conversations and webinars are recorded for replay, but a transcript can help listeners get more from them. Phone conversations are an often underutilized source of insights, mostly because unstructured audio is difficult to analyze.

Transcribing calls can be challenging because they often include acronyms and technical terms. Transcripts can also be difficult to follow without an indication of which speaker is talking.

As the value of phone interactions is being recognized, the demand for apps that can transcribe speech is increasing. You can use the IBM Watson® Speech to Text service to add speech transcription capabilities to your applications. In a new episode of the Building with Watson webinar series, Bhavik Shah, Senior Offering Manager for IBM Watson, talks with Zach Walchuk about some of the newest features of the Speech to Text service, including language model customization and diarization.

Although speech-to-text technology has existed for many years, you still encounter two problems when you write applications that involve speech recognition. First, the accuracy depends on the quality of the input audio. Second, the service can only transcribe words that it knows. The Speech to Text service uses the technology behind Watson to determine the most likely results for words and phrases. With the Speech to Text Language Model Customization capability, you can train the service on your own text data so that it learns the words your application needs.

The process is iterative and follows these high-level steps:

  1. Create a Bluemix account and provision the Speech to Text service.
  2. Run your test audio files through the standard Speech to Text service and store the output.
  3. Gather text data to create a custom language model.
  4. Create the custom language model.
  5. Use the custom language model on your test audio files.
  6. Compare your results. You should see higher accuracy using your custom language model.
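One common way to do the comparison in step 6 is to compute the word error rate (WER) of each transcript against a human-checked reference. The following is a minimal sketch in plain Python, not part of the Speech to Text service; the function name `word_error_rate` and the three sample transcripts are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # previous[j] is the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Hypothetical transcripts, invented for illustration only.
reference = "please restart the kubernetes pod on the staging cluster"
base_output = "please restart the cooper netties pod on the staging cluster"
custom_output = "please restart the kubernetes pod on the staging cluster"

base_wer = word_error_rate(reference, base_output)
custom_wer = word_error_rate(reference, custom_output)
print(f"base model WER:   {base_wer:.2f}")
print(f"custom model WER: {custom_wer:.2f}")
```

A lower WER for the custom model output on the same audio suggests that the customization helped. Because the process is iterative, you can rerun the comparison each time you add text data and retrain.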

By creating your own custom models, you can align the service more closely with your application’s requirements and accommodate specific accents, topics and words.

Speech to Text also supports real-time speaker diarization: it identifies and segments speech by speaker, so Watson can process a conversation between two people as it happens. Because the output includes labels for the speakers, your transcripts are easier to read.

To learn more about the IBM Watson Speech to Text service, be sure to check out the Building with Watson webcast.
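To show how speaker labels make a transcript easier to read, here is a minimal sketch that turns diarized output into one line per speaker turn. It assumes a response in which each word carries a timestamp (`[word, start, end]`) and a separate `speaker_labels` list gives a speaker number for each word's start time. Treat that shape, the helper name `speaker_turns`, and the sample data as assumptions for illustration, and check the service documentation for the actual response format.

```python
from typing import Dict, List, Tuple


def speaker_turns(response: Dict) -> List[Tuple[int, str]]:
    """Group consecutive words from the same speaker into turns."""
    # Assumption: each label's "from" value equals the start time of
    # exactly one word in the timestamps list.
    speaker_at = {
        label["from"]: label["speaker"]
        for label in response.get("speaker_labels", [])
    }

    turns: List[Tuple[int, List[str]]] = []
    for result in response.get("results", []):
        for word, start, _end in result["alternatives"][0]["timestamps"]:
            speaker = speaker_at.get(start)
            if speaker is None:
                continue  # no label for this word yet; skip it
            if turns and turns[-1][0] == speaker:
                turns[-1][1].append(word)
            else:
                turns.append((speaker, [word]))
    return [(speaker, " ".join(words)) for speaker, words in turns]


# Hypothetical response data, invented for illustration only.
sample_response = {
    "results": [{"alternatives": [{"timestamps": [
        ["did", 0.10, 0.30], ["anyone", 0.30, 0.62], ["write", 0.62, 0.90],
        ["it", 0.90, 1.00], ["down", 1.00, 1.35],
        ["yes", 1.80, 2.10], ["i", 2.10, 2.20], ["did", 2.20, 2.50],
    ]}]}],
    "speaker_labels": [
        {"from": 0.10, "to": 0.30, "speaker": 0},
        {"from": 0.30, "to": 0.62, "speaker": 0},
        {"from": 0.62, "to": 0.90, "speaker": 0},
        {"from": 0.90, "to": 1.00, "speaker": 0},
        {"from": 1.00, "to": 1.35, "speaker": 0},
        {"from": 1.80, "to": 2.10, "speaker": 1},
        {"from": 2.10, "to": 2.20, "speaker": 1},
        {"from": 2.20, "to": 2.50, "speaker": 1},
    ],
}

for speaker, text in speaker_turns(sample_response):
    print(f"Speaker {speaker}: {text}")
```

Matching labels to words by start time keeps the sketch short. A more defensive version could match on overlapping time ranges instead.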

Learn more with the “Building with Watson” series

 

Senior Writer, IBM Watson

