Watson APIs

How Watson text-to-speech AI helped an author bring his book’s main character to life


It’s been said that a picture is worth a thousand words. Great tonality, clarity, diction and enunciation of spoken words can go a long way in creating the best and most memorable pictures. Artificial intelligence has progressed to the point where it can now deliver all of these qualities effectively.

I wanted to find out whether a female artificial intelligence voice could portray the main character in my book, “Miraculous,” so convincingly that the listening audience would believe she is the actual character in the book.

How I used Watson APIs to bring my main character to life

After auditioning many different AI voices from a variety of companies, I settled on IBM Watson’s Text to Speech API, which synthesizes text to audio in various languages, voices and dialects. I chose the “Allison” voice, as she possesses a very sweet, attractive tone that also fits the age range of Hailee Tupper, the main protagonist in my book.

To help her act out scenes in my book, I used the Text to Speech API’s “Expressiveness” feature, which extends SSML with an expressive element that you can use to indicate a speaking style of GoodNews, Apology, or Uncertainty (available only for the U.S. English Allison voice). Learn more about Expressive SSML, IBM Watson’s expressive speech service.

There are thousands upon thousands of different word combinations in literature, and Watson’s Allison voice responds uniquely to each. When the three available expressive styles are applied alone or in combination, spaced at different intervals, an expanded range of emotions is possible.
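For readers curious what this kind of markup looks like, here is a minimal Python sketch that wraps lines of text in the expressive element and joins them into one SSML string. It assumes the element is written as `express-as` with a `type` attribute, as in IBM's Expressive SSML documentation. The helper names and the sample lines are hypothetical and are not taken from the book or from the process described here.

```python
from xml.sax.saxutils import escape

# The three speaking styles named in the post.
STYLES = {"GoodNews", "Apology", "Uncertainty"}


def express(text, style=None):
    """Wrap text in an express-as element, or leave it plain if style is None."""
    safe = escape(text)  # escape &, < and > so the markup stays well-formed
    if style is None:
        return safe
    if style not in STYLES:
        raise ValueError(f"unknown style: {style}")
    return f'<express-as type="{style}">{safe}</express-as>'


def build_ssml(segments):
    """Join (text, style) pairs into a single SSML string."""
    body = " ".join(express(text, style) for text, style in segments)
    return f"<speak>{body}</speak>"


# Hypothetical lines for illustration only.
ssml = build_ssml([
    ("I found it!", "GoodNews"),
    ("I'm sorry I doubted you.", "Apology"),
    ("Then we left.", None),
])
print(ssml)
```

Changing which segments carry which style, and how much plain text sits between them, is one way to experiment with the combinations described above.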

Fictional characters speak differently: some are short-winded, some medium-winded and some long-winded. This influences the number and frequency of breaks and pauses that must be worked out and applied to their sentences. The overall mood of a particular scene, such as suspense, tranquility or jubilance, can also influence how pauses are applied.
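As a rough illustration of mood-dependent pauses, the sketch below inserts a standard SSML `break` element between sentences. The mood names come from the paragraph above, but the pause lengths are made-up starting values to be tuned by ear, and the function name is hypothetical.

```python
import re

# Hypothetical pause lengths in milliseconds per scene mood; tune by ear.
PAUSE_MS = {"suspense": 900, "tranquility": 600, "jubilance": 250}


def add_pauses(text, mood):
    """Insert an SSML break between sentences, with a length chosen by mood.

    Assumes plain text that contains no &, < or > characters.
    """
    pause = f'<break time="{PAUSE_MS[mood]}ms"/>'
    # Split on whitespace that follows a sentence-ending mark.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return f" {pause} ".join(sentences)


print(add_pauses("The door creaked. Nobody moved. Then the lights went out.", "suspense"))
```

A character who speaks in long, unbroken runs might get shorter or fewer breaks than these values suggest, while a terse one might get more.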

Below is an example of what can be accomplished using the above technique.

Do you have to be a computer tech or coder to do a project like this? I don’t think so. By no stretch of the imagination do I fall into either of those categories. What I will say, however, is that it takes patience, practice and a creative drive. It’s like taking on the role of a story director. The process involves a lot of copying and pasting. The key is in learning how and where to paste the code into the text to get the desired effects.

For those who might be interested in doing a similar project, I’m willing to share my knowledge and expertise to help you achieve the highest quality results, perhaps through a free video.

I would like to conclude by saying that working with IBM’s Watson has been a wonderful and fascinating experience. If it were possible, I would like to shake his hand.

To get started with Watson’s Text to Speech API, visit our developer guide page. For more information on how to access a copy of my audiobook, please click here.

Try Watson Text to Speech with a free trial.

Author, Miraculous – A Whale of a Tale
