When a brand thinks about implementing chatbots, one of the first conversations is usually about voice. What will this AI sound like? Will it be like a professional concierge, a chummy pal, or a quirky robot friend that admits up front that it’s a robot?
Voice in AI is a moving target. The technology is moving fast, and the ability of chatbots to mimic human speech and emotion is advancing rapidly. Conversational AI best practices will change over time too, as bots expand their skill set into the nuances and subtext of human language. There are no hard and fast rules that determine how “human” a chatbot should be, so developers need to understand how customers interact with their bots and keep monitoring the ways in which conversations with AI break down.
With any chatbot designed for a customer service experience, speech can be divided into two categories. The first is informational or transactional. The customer asks a question: “What’s the weather like tonight?” The bot provides an answer: “High of 40, low of 32, 15% chance of precipitation” etc. Just the facts, no room for editorializing.
The second category can be described as “personality,” or anything else that goes beyond pure utility. Humor, sympathy, gratitude: these are emotional qualities that bots can simulate. Natural Language Processing and Natural Language Understanding are the related capabilities that allow a bot to understand complexities in human language and provide similarly complex responses that aren’t scripted by a human developer. However, as of this writing, many of the more “out of the box” or context-specific responses we can expect chatbots to provide will be scripted by devs or writers, rather than arrived at independently by the AI. Either way, bot creators will need to make decisions about how their bot behaves.
It could be helpful to think of a bot’s “humanity” as its ability to provide interactions that go beyond the informational and transactional. But can a bot be too human? In some cases, yes.
One decision that bot creators will want to make at the outset is whether their bot presents itself as such. It’s a frustrating experience to think you’re speaking to a human, only to find later that you’re dealing with a bot that, no matter how advanced, cannot match the ability of a person to handle highly complex queries. People tend to be more forgiving of a bot that is having difficulty grasping the nuance of their query than they are of human agents. They may naturally simplify their queries in response so as to help the bot understand what they’re trying to ask. In most cases, bots should immediately announce themselves for what they are.
Dennis Mortensen of x.ai agrees. He says he used to think mimicking humans and their faults was good design, but having reviewed countless dialogues over the years, he’s changed his mind. “You win very little when you fool people,” he says. “But you lose a lot when your charade is exposed. Design the agent as a piece of software and have it act like a piece of software, knowing that most of the other actors in your universe are humans.”
And as AI moves forward, Mortensen argues, human decision-making will diminish (comparatively) and a lot of decisions will be requested and resolved by computers. There’s no room for human flaws here, though when it’s time for a human to make a decision, the machines will need to “slow down” to the speed of human cognition.
Clara de Soto of reply.ai tells thinkLeaders that these decisions should be based on a bot’s personality, and suggests showing that off in the bot’s error messages.
Users inevitably will probe the bot’s capabilities, trying to find the seams and “trick” the bot, and the reality is no bot’s NLP (Natural Language Processing) is sufficiently advanced to withstand this. So, bot creators should ensure that their bot’s error message is an expression of that persona, and more importantly, should have an arsenal of different error messages so as not to constantly repeat itself.
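The idea of keeping an arsenal of error messages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `FallbackMessages`, its `next` method, and the sample messages are hypothetical and do not come from reply.ai or any particular bot framework.

```python
import random


class FallbackMessages:
    """Serve persona-flavored error messages without repeating the last one.

    Hypothetical helper for illustration; not part of any real bot framework.
    """

    def __init__(self, messages, seed=None):
        # At least two distinct messages are needed to avoid a back-to-back repeat.
        if len(set(messages)) < 2:
            raise ValueError("provide at least two distinct messages")
        self._messages = list(messages)
        self._rng = random.Random(seed)
        self._last = None

    def next(self):
        # Pick at random from every message except the one used most recently.
        choices = [m for m in self._messages if m != self._last]
        self._last = self._rng.choice(choices)
        return self._last


if __name__ == "__main__":
    # Sample messages written in one consistent (assumed) persona.
    fallback = FallbackMessages([
        "Hmm, that one went over my circuits. Could you rephrase?",
        "I'm a bot, and that stumped me. Try asking another way?",
        "I didn't catch that. I'm best at order and shipping questions.",
    ])
    for _ in range(4):
        print(fallback.next())
```

Excluding only the most recent message keeps the selection varied while still allowing every message to come back later in the conversation.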
Context is everything when it comes to voice. If a bot is helping a customer choose a life insurance policy, it’s probably not the time to bust out the puns. Conversely, if a bot is helping a customer pick out a prom dress, it might make sense for the bot to, like, talk like a teenager. When a bot’s voice doesn’t match the context of a customer interaction, it shows, and customers tend to get confused or irritated, sometimes ending the interaction early out of frustration. Mortensen argues that we should design our bots to behave as we would expect a human to behave in the same situation. As an example, he cites a hypothetical support bot at a bank that is designed to be as formal and as efficient as a human agent would be during problem-solving mode. Once a transaction has taken place, or some other problem has been solved, there is more room for potential chit-chat or jokes.
As humans, we are very good at adapting to subtle changes in context and adjusting our vocabulary, style of speaking, body language and overall presentation. We call this “code switching,” and we’re often so good at it that we don’t even realize we’re doing it. We don’t speak to our grandmothers the same way we speak to our peers, and vice versa. Doing so would be awkward.
The most successful bots will also learn to adjust their presentation based on whom they’re interacting with, and make small adjustments on the fly based on the mood of the conversation. In other words, bots will get better at emotional intelligence. For example, if a customer who is extremely upset with her experience tries to communicate this to a chatbot that responds in a cheery or flippant tone, she may well wish she were speaking to a human who could comprehend her frustration.
Mortensen claims this is where the industry should move, with bots aligning toward individual users, as well as the context of the conversation.
“As an example, I tend to be reasonably formal with my external counsel, but upon closing some financing, there is a casual celebratory tone to our emails before we fall back into our more formal communication. I have not seen any convincing [bot] agents in production who deliver on this promise, but I do think we have enough positive research around e.g. sentiment analysis to begin designs that take thoughts like this into consideration.”
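To make the idea of matching tone to mood concrete, here is a minimal Python sketch. It assumes a toy keyword count as a stand-in for real sentiment analysis; the word lists, the function names `estimate_mood` and `choose_reply`, and the reply wording are all hypothetical, and a production bot would rely on a trained sentiment model instead.

```python
# Toy word lists; an assumption standing in for a real sentiment model.
NEGATIVE_WORDS = {"angry", "upset", "terrible", "frustrated", "worst", "unacceptable"}
POSITIVE_WORDS = {"thanks", "great", "love", "perfect", "awesome"}


def estimate_mood(text):
    """Return 'upset', 'happy', or 'neutral' from a crude keyword count."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & POSITIVE_WORDS) - len(words & NEGATIVE_WORDS)
    if score < 0:
        return "upset"
    if score > 0:
        return "happy"
    return "neutral"


def choose_reply(text, replies_by_mood):
    """Pick the reply variant written for the customer's estimated mood."""
    mood = estimate_mood(text)
    # Fall back to the neutral wording if no variant exists for this mood.
    return replies_by_mood.get(mood, replies_by_mood["neutral"])


if __name__ == "__main__":
    # The same status update, written in three tones (hypothetical wording).
    replies = {
        "upset": "I'm sorry about the trouble. Your refund was issued today.",
        "happy": "Good news! Your refund was issued today.",
        "neutral": "Your refund was issued today.",
    }
    print(choose_reply("This is the worst experience, I am so frustrated!", replies))
```

The design choice worth noting is that the informational content stays the same across moods; only the framing around it changes.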
Fortunately, consumers expect bots to behave differently from humans. For example, we know that personal assistants like Siri and Alexa can answer questions like “What was the score of today’s baseball game?” in milliseconds, whereas a human response to such a question would take longer. But does this mean that developers should build time delays into their conversational bots so they appear to be closer to human? Or would it be better to allow customers to enjoy the unique attributes of bots, in this case speedy responses? De Soto says no to artificial delays. “I strongly believe that the objective of a bot is to provide a service, not hoodwink the user into thinking it’s something it’s not.” However, she clarifies that these tonal decisions should be based on the bot’s personality.
Perhaps the most human thing we can imbue into our bots is the knowledge of their own limitations. In other words, bots should know when to ask for help from a human agent when they’ve received a request that extends beyond their capability to respond satisfactorily. Such bots will rescue customer experiences from confusion and frustration, and point toward a future where bots can handle an ever-expanding array of tasks.
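One simple way a bot might recognize its own limits is to watch how confident it has been over recent turns. The Python sketch below assumes the bot's language-understanding layer reports a confidence score between 0 and 1 for each turn; the `Turn` type, the `should_hand_off` function, and the threshold values are illustrative assumptions rather than settings from any specific product.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One customer message as interpreted by the bot (hypothetical type)."""
    intent: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def should_hand_off(turns, min_confidence=0.5, max_low_turns=2):
    """Return True when the last `max_low_turns` turns were all low confidence.

    Both thresholds are illustrative defaults, not recommended values.
    """
    recent = turns[-max_low_turns:]
    if len(recent) < max_low_turns:
        return False  # not enough history to judge yet
    return all(turn.confidence < min_confidence for turn in recent)


if __name__ == "__main__":
    conversation = [
        Turn("check_balance", 0.92),
        Turn("unknown", 0.31),
        Turn("unknown", 0.22),
    ]
    if should_hand_off(conversation):
        print("Let me connect you with a human agent who can help.")
```

Requiring consecutive low-confidence turns, rather than a single one, gives the customer a chance to rephrase before the conversation is escalated.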