
How to use Watson Speech to Text utilities to increase accuracy

Key Points:
– Learn how to use Watson Speech to Text utilities to increase your accuracy
– We’ve included links so you can download S2T utilities
– Sample .wav files and Python code are also included


(Note: This content was previously published on the author’s blog and is reposted here with the author’s permission.)

Speech to Text and Conversation

I thought I would take a moment to play with Watson Speech to Text and a utility that was released a few months ago.

The Speech to Text Utils allows you to train S2T using your existing conversational system. To give a quick demo, I got my son to ask about buying a puppy.

I set up some quick Python code to print out results:

—————————————–

import json

from watson_developer_cloud import SpeechToTextV1

# ctx holds the service credentials copied from the S2T service.
s2t = SpeechToTextV1(
    username=ctx.get('username'),
    password=ctx.get('password')
)


def wav(filename, **kwargs):
    # Send a WAV file to S2T and return the top transcript, or '???' if there is none.
    with open(filename, 'rb') as audio:
        response = s2t.recognize(audio, content_type='audio/wav', **kwargs)

    if len(response['results']) > 0:
        return response['results'][0]['alternatives'][0]['transcript']
    else:
        return '???'
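The helper above returns only the first result. For longer recordings the service can return several results, one per stretch of speech, so a variant that joins them may be handy. What follows is a minimal sketch, not part of the original code: full_transcript is a hypothetical name, and it assumes the same response dictionary shape that the helper above reads.

```python
def full_transcript(response):
    """Join the top alternative of every result in a recognize() response.

    Hypothetical helper: assumes `response` is a dict shaped like
    {'results': [{'alternatives': [{'transcript': '...'}]}]}.
    """
    parts = []
    for result in response.get('results', []):
        alternatives = result.get('alternatives', [])
        if alternatives:
            # The first alternative is the one the wav() helper also uses.
            parts.append(alternatives[0]['transcript'].strip())
    # Mirror wav(): fall back to '???' when nothing was recognised.
    return ' '.join(parts) if parts else '???'
```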

 

I tested the audio with the following code:

wav_file = 'p4u-example1.wav'
print('Broadband: {}'.format(wav(wav_file)))
print('NarrowBand: {}'.format(wav(wav_file, model='en-US_NarrowbandModel')))

It returns these results:

Broadband: can I get a puppy

NarrowBand: can I get a puppy

Of course, the recording is crystal clear, which is why the result is so good. So I added some ambient noise from SoundJay to the background, and now it sounds like it was recorded in a subway.
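The post does not say how the noise was mixed in, so the following is only one possible way to do it, sketched with Python's standard wave and array modules. It assumes both files are 16-bit PCM WAV with the same sample rate and channel count. The name mix_wav, the noise_gain parameter and the noise and output file names are all hypothetical.

```python
import array
import sys
import wave


def mix_wav(speech_path, noise_path, out_path, noise_gain=0.5):
    """Overlay background noise on a speech recording and save the result.

    Hypothetical helper: both inputs must be 16-bit PCM WAV files with the
    same sample rate and channel count, otherwise ValueError is raised.
    """
    with wave.open(speech_path, 'rb') as speech, wave.open(noise_path, 'rb') as noise:
        params = speech.getparams()
        same_format = (
            params.sampwidth == 2
            and noise.getsampwidth() == 2
            and noise.getframerate() == params.framerate
            and noise.getnchannels() == params.nchannels
        )
        if not same_format:
            raise ValueError('both files must be 16-bit PCM with matching rate and channels')
        speech_samples = array.array('h', speech.readframes(speech.getnframes()))
        noise_samples = array.array('h', noise.readframes(noise.getnframes()))

    if not noise_samples:
        raise ValueError('noise file contains no audio')

    # WAV data is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == 'big':
        speech_samples.byteswap()
        noise_samples.byteswap()

    mixed = array.array('h')
    for i, sample in enumerate(speech_samples):
        # Loop the noise clip if it is shorter than the speech clip.
        noisy = sample + int(noise_gain * noise_samples[i % len(noise_samples)])
        # Clamp to the signed 16-bit range to avoid wrap-around distortion.
        mixed.append(max(-32768, min(32767, noisy)))

    if sys.byteorder == 'big':
        mixed.byteswap()

    with wave.open(out_path, 'wb') as out:
        out.setparams(params)
        out.writeframes(mixed.tobytes())


# Example call (the noise and output file names are hypothetical):
# mix_wav('p4u-example1.wav', 'subway.wav', 'noisy-example.wav')
```

Raising noise_gain makes the background louder, which is a simple way to see at what point recognition starts to break down.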

Running the code above again gets these results:

Broadband: Greg it appropriate

Narrowband: can I get a phone

Ouch!

Utils to the rescue!

The reason for asking about a puppy is that I have a sample conversation system about buying a dog. Using that conversation file, I did the following.

1. Installed Speech to Text Utils.

2. Set up the connection to my S2T service (using service credentials). You need to do this before you begin.

watson-speech-to-text-utils set-credentials

It will walk you through entering the username and password.

3. Once that was set up, I told it to create a customization.

watson-speech-to-text-utils corpus-from-workspace puppies4you.json

You need to map the customization to a particular model. For testing, I attached it to en-US_NarrowbandModel and en-US_BroadbandModel.

4. Once it had run, I got the ID numbers for the customizations.

watson-speech-to-text-utils customization-list
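To give a feel for what building a corpus from a workspace involves, here is a rough sketch of the idea. It is not the utility's implementation. It assumes the workspace JSON has the usual "intents" (each with "examples") and "entities" (each with "values" and "synonyms") sections. Including entity values in the corpus is also an assumption. The function name and the sample data are hypothetical.

```python
def workspace_to_corpus_lines(workspace):
    """Collect intent examples and entity values/synonyms as corpus lines.

    Hypothetical helper: `workspace` is the parsed workspace JSON (a dict).
    """
    lines = []
    for intent in workspace.get('intents', []):
        for example in intent.get('examples', []):
            lines.append(example.get('text', ''))
    for entity in workspace.get('entities', []):
        for value in entity.get('values', []):
            lines.append(value.get('value', ''))
            lines.extend(value.get('synonyms') or [])

    # Drop blanks and duplicates while keeping the original order.
    seen = set()
    unique = []
    for line in lines:
        line = line.strip()
        if line and line not in seen:
            seen.add(line)
            unique.append(line)
    return unique


# Hypothetical, heavily trimmed stand-in for a file such as puppies4you.json.
sample_workspace = {
    'intents': [
        {'intent': 'buy_dog', 'examples': [{'text': 'can I get a puppy'},
                                           {'text': 'I want to buy a dog'}]},
    ],
    'entities': [
        {'entity': 'breed', 'values': [{'value': 'poodle', 'synonyms': ['poodles']}]},
    ],
}

for line in workspace_to_corpus_lines(sample_workspace):
    print(line)
```

For a real workspace you would load the file with json.load first and pass the resulting dictionary in. The point is that the phrases your conversation system already expects become the text that the speech model is tuned towards.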

Once I had the IDs, I tried the audio again:

wav_file = 'p4u-example2.wav'
print('Broadband: {}'.format(wav(wav_file, customization_id='beeebd80-2420-11e7-8f1c-176db802f8de', timestamps=True)))
print('Narrowband: {}'.format(wav(wav_file, model='en-US_NarrowbandModel', customization_id='a9f80490-241b-11e7-8f1c-176db802f8de')))

This outputs:

Broadband: can I get a puppy

Narrowband: can I get a phone

So the broadband model now works. For narrowband, the audio quality is likely too poor to work with. There are also more specialised language models for children, built by others, to cope with this.

One swallow does not make a summer

So this is one example of one phrase. For real testing, you should test the whole model. In a demonstration from development, the utility increased an S2T model's accuracy from around 50% to over 80%.
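The post does not say how that accuracy was measured. One common choice, assumed here, is word accuracy: one minus the word error rate, computed over a set of recordings with known transcripts. The sketch below uses hypothetical function names and scores the two noisy transcripts shown earlier against what was actually said.

```python
def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Levenshtein distance over words, computed one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1], len(ref)


def word_accuracy(pairs):
    """Word accuracy (1 - word error rate) over (reference, hypothesis) pairs."""
    errors = 0
    words = 0
    for reference, hypothesis in pairs:
        e, n = word_errors(reference, hypothesis)
        errors += e
        words += n
    return 1.0 - errors / float(words) if words else 0.0


# Transcripts of the noisy recording from this post, scored against what was said.
pairs = [
    ('can I get a puppy', 'Greg it appropriate'),
    ('can I get a puppy', 'can I get a phone'),
]
print('Word accuracy: {:.0%}'.format(word_accuracy(pairs)))
```

Running the same set of recordings through the base model and the customized model, and comparing the two scores, gives a fairer picture than a single phrase.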

Interested in trying this out for yourself? Try Watson Speech to Text for free with our 30-day trial.

 


Advisory Software Engineer and Master Inventor, Delivery, IBM Watson and Cloud Platform
