December 24, 2018 | Written by: Louis Huang
Chatbots Orchestration and Multilingual Challenges
This post is part 2 of “Chatbots Orchestration and Multilingual Challenges”. If you haven’t read part 1, Chatbots Orchestration, please check it first.
Part 2. Multilingual Chatbots Challenges
Creating multilingual chatbots comes with some challenges. A common practice is to build a chatbot in a baseline language, usually English, and then translate it into a separate chatbot for each other language. Whenever the English chatbot changes, the system admin has to manually replicate the change and have it translated for every other language chatbot. Translation also has a turnaround time, so changes do not reach each language chatbot promptly, not to mention the translation cost.
Let’s take Watson Assistant as an example. The chatbot workspace is stored as a JSON file, and it is easy for a translator to make mistakes when translating it directly. As a result, the development team may need to merge the translations into the JSON file manually. The chart below shows a typical JSON file exported from Watson Assistant.
So the development team sends out the JSON file, and the translators have to work directly in the JSON format to verify their translations in context. Human errors are easily introduced because translators are not trained as programmers. This is a common translation process, but when applied to chatbot translation it takes a lot of time and does not scale.
Leverage Machine Translation
We propose using a normalized English chatbot, which applies machine translation to the input and output and is moderated by domain dictionaries. This saves the translation turnaround time and makes it easier to expand to other languages. Let’s look at the architecture below.
The user’s input is in Chinese. Watson Language Translator translates it to English. The translation does not have to be perfect: as long as the translated utterance matches the intent and entities in the dialog, the chatbot returns a key. The key is sent to an IBM translation management system called Globalization Pipeline. Globalization Pipeline then uses the key to look up the value, which is the actual output of the chatbot. The output can be machine translated and then reviewed by a human to ensure a better-quality response. With this architecture, we can easily expand a chatbot from one language to multiple languages.
You may notice that there is a domain dictionary in this architecture. The domain dictionary handles the translation of special terms (such as synonyms and homonyms), as well as terms that should not be translated at all. By using a domain dictionary, we can normalize the user’s utterance into a more accurate translation so that it matches the dialogs in the English workspace.
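The flow described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the actual Watson or Globalization Pipeline API: `translate_to_english`, `match_response_key`, `BUNDLES`, and the key names are all hypothetical stand-ins for the translation service, the English workspace, and the per-language resource bundles.

```python
from typing import Dict

# Hypothetical stand-in for the machine translation step.
# A real system would call a translation service here.
TOY_MT: Dict[str, str] = {"我想重設密碼": "I want to reset password"}

def translate_to_english(text: str, source_lang: str) -> str:
    if source_lang == "en":
        return text
    return TOY_MT.get(text, text)

# Hypothetical stand-in for the English workspace: it only needs to
# recognize the intent, so an imperfect translation is still good enough.
def match_response_key(english_utterance: str) -> str:
    lowered = english_utterance.lower()
    if "reset" in lowered and "password" in lowered:
        return "PASSWORD_RESET_STEPS"
    return "FALLBACK"

# Hypothetical resource bundles: one key, one reviewed value per language.
BUNDLES: Dict[str, Dict[str, str]] = {
    "en": {
        "PASSWORD_RESET_STEPS": "Open Settings and choose Reset password.",
        "FALLBACK": "Sorry, I did not understand.",
    },
    "zh": {
        "PASSWORD_RESET_STEPS": "請開啟「設定」並選擇「重設密碼」。",
        "FALLBACK": "抱歉，我不太明白。",
    },
}

def lookup_response(key: str, lang: str) -> str:
    # Fall back to the English value if the language or key is missing.
    return BUNDLES.get(lang, {}).get(key, BUNDLES["en"][key])

def answer(user_text: str, user_lang: str) -> str:
    english = translate_to_english(user_text, user_lang)
    key = match_response_key(english)
    return lookup_response(key, user_lang)
```

In this sketch, adding a language only means adding another bundle; the English matching logic stays untouched.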
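One possible way to apply a domain dictionary is to mask special terms with placeholder tokens before machine translation and restore them afterwards. The sketch below is an assumption about how this could be done, not a description of the architecture’s actual implementation; `DOMAIN_TERMS`, `DO_NOT_TRANSLATE`, and the placeholder format are hypothetical, and it assumes the translator leaves the placeholder tokens unchanged.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical domain dictionary: source-language term -> preferred English term.
DOMAIN_TERMS: Dict[str, Dict[str, str]] = {
    "zh": {"工作區": "workspace"},
}

# Hypothetical list of terms that must pass through untranslated.
DO_NOT_TRANSLATE: List[str] = ["Watson Assistant", "Globalization Pipeline"]

def protect_terms(text: str, source_lang: str) -> Tuple[str, Dict[str, str]]:
    """Replace special terms with placeholder tokens before translation."""
    terms = [(t, t) for t in DO_NOT_TRANSLATE]
    terms += list(DOMAIN_TERMS.get(source_lang, {}).items())
    # Longest terms first, so a short term never splits a longer one.
    terms.sort(key=lambda pair: len(pair[0]), reverse=True)

    replacements: Dict[str, str] = {}
    for index, (term, english) in enumerate(terms):
        if term in text:
            token = f"__TERM{index}__"
            text = text.replace(term, token)
            replacements[token] = english
    return text, replacements

def restore_terms(translated: str, replacements: Dict[str, str]) -> str:
    """Put the dictionary-approved English terms back after translation."""
    for token, english in replacements.items():
        translated = translated.replace(token, english)
    return translated

def normalize(text: str, source_lang: str,
              translate: Callable[[str], str]) -> str:
    masked, replacements = protect_terms(text, source_lang)
    return restore_terms(translate(masked), replacements)
```

Here `translate` is whatever machine translation function is in use. If the real translator alters the placeholder tokens, a different masking scheme would be needed.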
Advantages of Normalized Chatbots
We completed several proofs of concept (POCs) and found the following advantages:
- Simple, cost-effective, and quick to add language support to an existing single-language chatbot
- No need to manage a separate workspace for each language
- Better results than a straight translation of the original language model
Multilingual Chatbots and Chatbots Orchestration Integration
We have learned how to orchestrate chatbots in part 1 and how to expand a single-language chatbot into a multilingual chatbot. Let’s combine these two architectures. It is a common scenario that a worldwide company needs many chatbots to serve multiple functions in multiple languages. To integrate chatbot orchestration with multilingual support, we use the high-level architecture below. A language detection service identifies the language of the utterance, which is then normalized to English. A classifier calculates a confidence percentage for each bot. The orchestrator then directs the utterance to the corresponding chatbot. The chatbot returns the answer or a key to Globalization Pipeline, which uses the key to look up the value, the actual output of the chatbot. The output can be machine translated and then reviewed by a human to ensure a better-quality response.
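The combined flow can be sketched as follows. Everything here is a toy stand-in under stated assumptions: the keyword-based `classify`, the character-range `detect_language`, the `TOY_MT` table, the bot names, the keys, and the 50% threshold are all hypothetical and only illustrate the order of the steps, not the real services.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical bots: each one returns a response key, not final text.
BOTS: Dict[str, Callable[[str], str]] = {
    "hr_bot": lambda utterance: "HR_LEAVE_POLICY",
    "it_bot": lambda utterance: "IT_PASSWORD_RESET",
}
BOT_KEYWORDS = {
    "hr_bot": {"vacation", "payroll", "leave"},
    "it_bot": {"password", "laptop", "vpn"},
}
BUNDLES = {
    "en": {"HR_LEAVE_POLICY": "Leave policy: ...",
           "IT_PASSWORD_RESET": "To reset your password: ...",
           "FALLBACK": "Sorry, I did not understand."},
    "zh": {"HR_LEAVE_POLICY": "請假規定：...",
           "IT_PASSWORD_RESET": "重設密碼的步驟：...",
           "FALLBACK": "抱歉，我不太明白。"},
}
TOY_MT = {"我忘記密碼了": "I forgot my password"}

def detect_language(text: str) -> str:
    # Toy heuristic: any CJK ideograph means Chinese, otherwise English.
    return "zh" if any("\u4e00" <= ch <= "\u9fff" for ch in text) else "en"

def normalize_to_english(text: str, lang: str) -> str:
    return text if lang == "en" else TOY_MT.get(text, text)

def classify(english_utterance: str) -> Dict[str, float]:
    """Return a confidence percentage per bot from keyword hits."""
    words = set(re.findall(r"[a-z]+", english_utterance.lower()))
    hits = {bot: len(words & kws) for bot, kws in BOT_KEYWORDS.items()}
    total = sum(hits.values())
    if total == 0:
        return {bot: 0.0 for bot in hits}
    return {bot: 100.0 * n / total for bot, n in hits.items()}

def route(english_utterance: str, threshold: float = 50.0) -> Optional[str]:
    scores = classify(english_utterance)
    best = max(scores, key=lambda bot: scores[bot])
    return best if scores[best] >= threshold else None

def handle(user_text: str) -> str:
    lang = detect_language(user_text)
    english = normalize_to_english(user_text, lang)
    bot = route(english)
    key = BOTS[bot](english) if bot else "FALLBACK"
    # Look up the reviewed value for the key in the user's language.
    return BUNDLES[lang].get(key, BUNDLES["en"][key])
```

Because every bot only sees normalized English and only returns keys, the routing logic and the bots stay the same no matter how many languages are added.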
Ready to globalize your chatbots to reach global customers in their native language? Below are resources to help you: