Nothing is more frustrating than calling a customer support line only to be greeted by a monotone, robotic, automated voice. The voice on the other end takes painfully long to read you the menu options. You’re two seconds away from hanging up, screaming “representative” into the phone, or pounding on the zero button until you reach a human agent. That’s the problem with many IVR (interactive voice response) solutions today: the conversational AI sounds too artificial. Customers feel they’re not being heard, so they just want to speak with a human agent.
IBM Watson Expressive Voices
Luckily, there is a way to fix that problem and make the customer experience more pleasant. With IBM Watson’s newest technology of expressive voices, you will no longer feel like you’re talking to a typical robot; you’ll feel like you’re talking to a live human agent without any of the wait time. These highly natural voices have conversational capabilities like expressive styles, emotions, word emphasis and interjections. Not only do these voices relieve the customer frustration of feeling like they’re talking to a bot, but they also contribute to the goal of call deflection from human agents. It’s a win-win for customers and businesses.
Best suited for the customer care domain, the voices will have a conversational style enabled by default; however, the voices also support a neutral style which may be optimal for other use cases (newscasting, e-learning, audio books, etc.).
As humans, we convey emotion in the words we speak, whether we realize it or not. We tend to sound empathetic when apologizing to one another. We sound uncertain when we don’t know the answer to something, and perhaps cheerful when we finally discover the answer. The ability to convey emotion is what makes us human. IBM Watson’s expressive voices can express emotion in order to better convey the meaning behind the words, ultimately reducing customer frustration when dealing with today’s phone experiences. Your voice bot will sound empathetic when telling the customer their package is delayed or cheerful when they’ve successfully helped the customer book an airline ticket.
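As a rough illustration of how a voice bot might request these emotions, the sketch below builds SSML strings in Python. The `<express-as style="...">` element and the style names are assumptions drawn from the emotions named above; check them against the service documentation before relying on them. The helper names `express_as` and `speak` are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr

# ASSUMPTION: these style names mirror the emotions mentioned in the article
# (plus "neutral"); the exact values the service accepts may differ.
STYLES = {"empathetic", "cheerful", "uncertain", "neutral"}


def express_as(text: str, style: str) -> str:
    """Wrap text in an <express-as> element (assumed markup) with a style."""
    if style not in STYLES:
        raise ValueError(f"unknown style: {style!r}")
    # escape() protects &, < and > so user text cannot break the markup.
    return f"<express-as style={quoteattr(style)}>{escape(text)}</express-as>"


def speak(*parts: str) -> str:
    """Join already-marked-up fragments inside a single <speak> root."""
    return "<speak>" + " ".join(parts) + "</speak>"


ssml = speak(
    express_as("I'm sorry, your package is delayed.", "empathetic"),
    express_as("Your flight is booked!", "cheerful"),
)
print(ssml)
```

The resulting string would be sent as the text to synthesize, so each sentence carries the emotion that matches its meaning.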
Emphasis is another important aspect of human speech. Did you say Austin or August? Did you say you lost the card ending in 4876? IBM expressive voices support word emphasis so that your bot can better convey the desired meaning of the text. Users can indicate the location of the stress with four levels – none, moderate, strong, and reduced.
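The four stress levels match the standard SSML `<emphasis>` element, so a bot could mark the word it wants to stress roughly as follows. This is a minimal sketch, not the service's official client code; the helper name `emphasize` is hypothetical.

```python
from xml.sax.saxutils import escape

# The four levels named in the article, which match the SSML <emphasis> element.
LEVELS = ("none", "moderate", "strong", "reduced")


def emphasize(word: str, level: str = "moderate") -> str:
    """Wrap a word in an SSML <emphasis> element with the given level."""
    if level not in LEVELS:
        raise ValueError(f"level must be one of {LEVELS}, got {level!r}")
    return f'<emphasis level="{level}">{escape(word)}</emphasis>'


# Stress the two words the caller needs to tell apart.
ssml = (
    "<speak>Did you say "
    f"{emphasize('Austin', 'strong')} or {emphasize('August', 'strong')}?"
    "</speak>"
)
print(ssml)
```

Only the ambiguous words are wrapped, so the rest of the sentence keeps its default prosody.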
Interjecting with words like hmm, um, oh, aha, or huh is another feature of human speech that IBM expressive voices now support to enable an interaction that feels more natural and human-like. The new expressive voices automatically detect these interjections in text and treat them as such without any SSML (Speech Synthesis Markup Language) indication. There’s also an option to disable interjection handling when it isn’t appropriate (e.g., ‘oh’ can be an interjection or a way of reading out the digit 0).
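To make the 'oh' example concrete, here is a small sketch of how a bot might switch interjection handling off for a single word. The `<say-as interpret-as="interjection" enabled="false">` markup is an assumption about how the option is expressed and should be verified against the service documentation; the helper name `literal` is hypothetical.

```python
from xml.sax.saxutils import escape


def literal(word: str) -> str:
    """Mark a word so it is NOT read as an interjection.

    ASSUMPTION: the say-as attributes below are illustrative; confirm the
    exact markup in the service documentation.
    """
    return (
        '<say-as interpret-as="interjection" enabled="false">'
        f"{escape(word)}</say-as>"
    )


# No markup needed: "Oh" is detected as an interjection automatically.
as_interjection = "<speak>Oh, I found your order.</speak>"

# Here "oh" stands for the digit 0, so interjection handling is disabled.
as_digit = f"<speak>Your gate is B, four, {literal('oh')}, seven.</speak>"

print(as_interjection)
print(as_digit)
```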
How to Get Started with Expressive Voices
Expressive voices and features will be available in US-English first in September 2022, followed by other languages in early 2023. The US-English expressive voices are Michael, Allison, Lisa, and Emma. For customers using the V3 version of Michael, Allison, or Lisa, switching to the expressive voices shouldn’t cause disruption, as each voice will still sound like the same speaker but with a more natural and conversational style. It’s easy to start using the new voices: simply specify the voice name in your API request, just like any other voice.
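Selecting a voice by name might look like the sketch below, which only builds an HTTP request object and does not send it. The `/v1/synthesize` path, the `voice` query parameter, the `apikey` basic-auth scheme, and the voice identifier `en-US_MichaelExpressive` are assumptions (the article gives only the speaker names); the service URL and key are placeholders, and `build_synthesize_request` is a hypothetical helper.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_synthesize_request(
    service_url: str,
    api_key: str,
    text: str,
    voice: str = "en-US_MichaelExpressive",  # ASSUMED voice identifier
) -> Request:
    """Build (but do not send) a POST request asking for synthesized audio."""
    query = urlencode({"voice": voice})
    # ASSUMPTION: basic auth with the literal user name "apikey".
    token = base64.b64encode(f"apikey:{api_key}".encode()).decode()
    return Request(
        url=f"{service_url.rstrip('/')}/v1/synthesize?{query}",  # assumed path
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "audio/wav",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Placeholder URL and key; substitute the values for your own service instance.
req = build_synthesize_request(
    "https://example.invalid/instances/placeholder",
    "PLACEHOLDER_KEY",
    "Hello, how can I help you today?",
)
print(req.get_method(), req.full_url)
```

Under these assumptions, moving from an existing voice to an expressive one is a change to the `voice` argument only; the rest of the request stays the same.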
In summary, IBM’s new technology of expressive voices is the next level of conversational AI. It checks the box when it comes to an engaging and natural experience that mirrors that of a human agent. The new voices relieve the customer frustration of feeling unheard and drive call deflection from human agents. To learn more about the expressive voices, see the resources below.