Watson Speech-to-Text is paying attention to what people are saying (even when you are not)

After a conference call, have you ever had someone say, “The explanation on that call was great. Did anyone write it down?”

Many conference calls, call center conversations and webinars are recorded for replay, but transcription can help listeners get more from calls. Phone conversations are an often-underutilized source of insights, mostly because the unstructured nature of the data is difficult to analyze.

Transcribing calls can be challenging because they often include acronyms and technical terms. Transcripts can also be difficult to decipher without an indication of which speaker is talking.

As the value of phone interactions is being recognized, the demand for apps that can transcribe speech is increasing. You can use the IBM Watson® Speech to Text service to add speech transcription capabilities to your applications. In a new episode of the Building with Watson webinar series, Bhavik Shah, Senior Offering Manager for IBM Watson, talks with Zach Walchuk about some of the newest features of the Speech to Text service, including language model customization and diarization.

Although speech-to-text technology has existed for many years, when you’re writing applications that involve speech recognition, you still encounter two problems. First, the accuracy depends on the quality of the input audio. Second, the service can only transcribe words that it knows. The Speech to Text service uses the technology behind Watson to determine the most likely results for words and phrases. With the Speech to Text Language Model Customization capability, you can train the service to learn from your input.

The process is iterative and follows these high-level steps:

  1. Create a Bluemix account and provision the Speech to Text service.
  2. Run your test audio files through the standard Speech to Text service and store the output.
  3. Gather text data to create a custom language model.
  4. Create the custom language model.
  5. Use the custom language model on your test audio files.
  6. Compare your results. You should see higher accuracy using your custom language model.
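
One common way to carry out the comparison in step 6 is to compute the word error rate (WER) of each transcript against a hand-corrected reference. The sketch below is a minimal illustration, not part of the Speech to Text service: the `word_error_rate` function and the three sample transcripts are hypothetical and were written for this example only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Hypothetical transcripts for illustration only; not real service output.
reference = "please open a ticket for the kubernetes ingress outage"
base_output = "please open a ticket for the cooper net ease ingress outage"
custom_output = "please open a ticket for the kubernetes ingress outage"

base_wer = word_error_rate(reference, base_output)
custom_wer = word_error_rate(reference, custom_output)
print(f"base model WER:   {base_wer:.2f}")
print(f"custom model WER: {custom_wer:.2f}")
```

A lower WER for the custom model on the same test audio suggests that the customization helped. Running the comparison over several files gives a more reliable picture than a single call.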

By creating your own custom models, you can align the service more closely with your application’s requirements and accommodate specific accents, topics and words. Speech to Text also supports real-time speaker diarization: it identifies and segments speech by speaker, so Watson can process a conversation between two people as it happens. Because the output includes labels for the speakers, your transcripts become easier to read.
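
To show how speaker labels can make a transcript easier to read, here is a minimal sketch that merges per-word timestamps with speaker labels into speaker turns. It assumes a response shape in which each word arrives as `[word, start, end]` and each speaker label carries `from`, `to` and `speaker` fields with matching start times; check the service documentation for the exact format. The `label_transcript` helper and the sample data are hypothetical.

```python
from itertools import groupby


def label_transcript(timestamps, speaker_labels):
    """Merge word timestamps with speaker labels into (speaker, text) turns."""
    # Assumption: a label's "from" time equals the start time of the word it
    # describes, so the start time can serve as the join key.
    speaker_at = {label["from"]: label["speaker"] for label in speaker_labels}
    labeled = [(speaker_at.get(start), word) for word, start, _end in timestamps]

    # Collapse consecutive words from the same speaker into a single turn.
    turns = []
    for speaker, group in groupby(labeled, key=lambda pair: pair[0]):
        turns.append((speaker, " ".join(word for _, word in group)))
    return turns


# Hypothetical data for illustration only; not real service output.
timestamps = [
    ["how", 0.5, 0.7], ["can", 0.7, 0.9], ["i", 0.9, 1.0], ["help", 1.0, 1.3],
    ["my", 2.1, 2.3], ["router", 2.3, 2.8], ["is", 2.8, 2.9], ["down", 2.9, 3.3],
]
speaker_labels = [
    {"from": 0.5, "to": 0.7, "speaker": 0}, {"from": 0.7, "to": 0.9, "speaker": 0},
    {"from": 0.9, "to": 1.0, "speaker": 0}, {"from": 1.0, "to": 1.3, "speaker": 0},
    {"from": 2.1, "to": 2.3, "speaker": 1}, {"from": 2.3, "to": 2.8, "speaker": 1},
    {"from": 2.8, "to": 2.9, "speaker": 1}, {"from": 2.9, "to": 3.3, "speaker": 1},
]

turns = label_transcript(timestamps, speaker_labels)
for speaker, text in turns:
    print(f"Speaker {speaker}: {text}")
```

Words with no matching label are grouped under `None` rather than dropped, which keeps gaps in the labeling visible in the output.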

To learn more about the IBM Watson Speech to Text service, be sure to check out the Building with Watson webcast.

Learn more with the “Building with Watson” series

 
