Anyone who has ever called a company's 800 number can relate to the frustration of dealing with menus. You navigate through layers of choices, some not quite fitting what you're looking for, until you reach the right menu. Or did you? Often the first step that a live agent takes is transferring you to another department, which might have its own menu for you to navigate!
The solution to menu woes is a call router. Using natural language technologies, a call router simply asks "What can I do for you?" and allows callers to state what they want in plain language, as they would to a real person. Based on what the caller said, the call is routed to the appropriate destination within the infrastructure. No menus, no waiting, just accurate routing to the right place.
To accomplish call routing, two statistical models need to be created. The first performs speech recognition and is called a Statistical Language Model (SLM). An SLM tunes the speech recognition engines to the particular things your callers will say. By creating your own SLM based on what your callers are likely to say, your accuracy can be high without the lengthy per-user enrollment process that is typical of desktop dictation systems.
The second model that needs to be created is called the Action Classifier (AC) model. The AC model takes the spoken request obtained by the speech engines and predicts the correct action to take. Actions are the categories into which each request a caller makes can be sorted. Your job as a developer is to define the actions and create a model that predicts which action best matches each caller's request.
When these two models are ready, they are deployed inside a call routing application, a Web project preconfigured to use the call routing models. The two models do all the magic of figuring out what the caller wants to do, and your job as an application developer is simply to define where to route the caller for each action. Typically, each action routes to an extension in your organization's call center or an Interactive Voice Response (IVR) software program.
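Conceptually, the routing step described above is a lookup from a predicted action to a destination. The following Python snippet is only a minimal sketch of that idea: the action names, extensions, and the `route_for` helper are hypothetical and are not part of the toolkit or the generated Web project.

```python
# Hypothetical action names and extensions, for illustration only.
ROUTES = {
    "WARRANTY": "ext. 4100",
    "COMPLAINTS": "ext. 4200",
    "ROADSIDE_ASSISTANCE": "ext. 4300",
}

# Fallback used when a predicted action has no configured route.
DEFAULT_DESTINATION = "operator"


def route_for(action: str) -> str:
    """Return the destination configured for a predicted action."""
    return ROUTES.get(action, DEFAULT_DESTINATION)
```

Keeping a default destination means that a request the classifier cannot place still reaches a person instead of being dropped.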
The first task in building a call router is to collect the initial data that is to be used to train the two statistical models. The right data is critical to achieving the best accuracy possible. Your job is to feed the SLM and AC models with the data that most accurately reflects what actual callers might say.
To get started, you must define the question you're going to ask your callers when they call your system. This is typically "How may I help you?" It's important to get the question you are going to ask defined early because it can dramatically influence how people respond. Deciding now means that the initial data you collect will more accurately reflect what live callers say.
Next, you'll need to categorize your organization in the way that best routes callers' queries. Make a list of all of the places in your organization that a call can be routed to; these are your routing destinations. Use this list to construct a data collection questionnaire that you can give out to internal users or, preferably, real customers. For every destination you identified, ask your audience to think of a few things they would say to get there. See Listing 1 for an example of what a questionnaire for a car company might look like.
Listing 1. A sample data collection questionnaire
Thank you for helping us build a call routing system. Please respond with the real things you would say out loud to a telephone system according to the following scenario:

You call the system and are greeted with the following statement:

Thank you for calling ACME Motors. Please say what you'd like and I will direct your call. How may I help you?

Think of a few ways to say that you want to be transferred to the warranty department:
1. __________________________________________________________
2. __________________________________________________________
3. __________________________________________________________

Now imagine you want to talk to the complaint department. Type some of the things you might imagine saying:
1. __________________________________________________________
2. __________________________________________________________
3. __________________________________________________________

Your car is broken down and you need to speak with a roadside assistance specialist. What kinds of things might you say?
1. __________________________________________________________
2. __________________________________________________________
3. __________________________________________________________
It's important to encourage realistic responses when doing data collection. Your data should reflect the kinds of things real people in your target audience are going to say when they call. If you collected data inside your organization, you might want to go through the results and make sure that acronyms or terminology that are only used internally are replaced with what actual customers say.
To get started, try to get approximately one hundred requests for each destination. For the survey above, which asks for three responses per destination, that would mean getting about 30 people to fill it out. When you've collected approximately that much data, place the requests for each destination in their own text file, one request per line. To keep things consistent in the database, convert all of the requests to lowercase letters, change any numbers to their word versions ("42" to "forty two"), and remove any punctuation except the apostrophe ('). Your data is then ready to be imported into a natural language development environment. Now you can set it up!
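The cleanup rules above are easy to script. The following Python snippet is a minimal sketch, not a toolkit feature: the function names are hypothetical, it assumes English requests written in plain ASCII, and it spells out whole numbers up to 999,999 without hyphens to match the "forty two" style used here.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999,999 without hyphens or 'and'."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + (" " + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + (" " + number_to_words(rest) if rest else "")


def normalize_request(text: str) -> str:
    """Lowercase, spell out digits, and drop punctuation except the apostrophe."""
    text = text.lower()
    # Replace each run of digits with its spelled-out form.
    text = re.sub(r"\d+", lambda m: " " + number_to_words(int(m.group())) + " ", text)
    # Keep letters, apostrophes, and whitespace; everything else becomes a space.
    text = re.sub(r"[^a-z'\s]", " ", text)
    # Collapse the extra whitespace introduced by the substitutions.
    return " ".join(text.split())
```

For example, `normalize_request("I'd like to buy 42 cars!")` yields `i'd like to buy forty two cars`. Review the output by hand before importing it, because a simple rule like this also spells out digits inside identifiers such as part numbers.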
To set up a natural language development environment you'll need:
- Rational® Web Developer (RWD) 6.0 or Rational Application Developer (RAD) 6.0
- DB2® 8.0 or greater
- IBM® WebSphere® Voice Toolkit Preview with the Natural Language Understanding (NLU) Feature installed
Your first task after you've got everything installed is to set up a natural language database. Start the Rational IDE, then open the Natural Language perspective and launch the database wizard by:
- Selecting Window > Open Perspective > Other....
- Choosing Natural Language.
- Selecting File > New > Natural Language Database Wizard. You can either choose to create a local database, which performs better, or a remote database, which lets you and other developers work with your data set simultaneously. For a remote server you'll need to install DB2 in a location that all of your developers will have access to.
After your database has been created by the wizard, you'll need to create a call router to link to it. Select File > New > Other... and choose Natural Language Understanding > Call Router. In the wizard you are prompted to name your call router and optionally specify the destinations that it ultimately routes to. Finally, you are prompted to specify which natural language database to connect the project to.
Now you can go ahead and create the actions in the database that correspond to your routing destinations. Right-click on the project name in the Filter Navigator view and select Database Properties (see Figure 1). Select Actions from the list of database properties to manipulate and enter a short, descriptive name for each destination you identified in the text input field on the right side of the window. Make sure that the names you choose for actions here line up with the names you entered in the second page of the New Call Router Wizard above. After entering each name, press Add to add it to the database.
Figure 1. The Database Properties window

After you've entered all your actions into the window, press Done. You're ready to begin importing your data!
To import your data into the database you'll need to import each text file individually so you can set its Action properly. To do this:
- Select File > Import....
- Choose NLU Sentence Data from the list, and press Next to start the NLU Import Wizard.
- Choose your call router project on the first page of the wizard.
- Select Text File as the import mode.
- Choose the first text file you want to import.
- On the next page, select :MAINMENU :HOWMAYIHELPYOU from the drop-down labeled Context.
- Select the appropriate action you want to import these sentences under from the Actions drop-down list.
Note: There should be an action on the drop-down list for each text file you plan to import; otherwise, quit the wizard and create it in the Database Properties window as described previously.
- On the final page of the wizard press Start to begin the import.
Repeat the procedure for each text file you created from your questionnaire. You now have a database full of requests and are well on your way to having a real natural language call router.
When people call your system they are likely to have lots of different requests. Often, they give certain pieces of information that don't really affect the meaning of the sentence. For example, consider the following three requests:
- I'd like to buy ten cars.
- I'd like to buy four cars.
- I'd like to buy six hundred cars.
In each case the meaning of the request is the same regardless of the amount specified. IBM WebSphere Voice Toolkit allows you to generalize your data by removing these specific words or phrases and replacing them with a Named Entity. A Named Entity (sometimes referred to as a Class or a Tag) is simply a generic placeholder for which you supply a static grammar; the grammar fills in the placeholder at run time. When substituted, the sentences above would all collapse down to:
- I'd like to buy NUMBER cars.
That sentence could then be assigned to an Action corresponding to your car purchasing division. After you supply a grammar file that lists what can be filled into NUMBER, you have taken care of every possible way people would say that sentence.
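To make the generalization concrete, the Python snippet below is a minimal sketch of how several requests collapse into one classed sentence. It is an illustration only: in the toolkit you assign Named Entities by hand in the editor, and the `NUMBER_WORDS` set and `class_numbers` function here are hypothetical. It assumes requests that have already been lowercased and stripped of punctuation.

```python
# Hypothetical word list standing in for a NUMBER grammar.
NUMBER_WORDS = {
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen", "twenty", "thirty",
    "forty", "fifty", "sixty", "seventy", "eighty", "ninety", "hundred",
    "thousand",
}


def class_numbers(request: str) -> str:
    """Replace each run of consecutive number words with one NUMBER token."""
    classed = []
    for word in request.split():
        if word in NUMBER_WORDS:
            # Extend the current NUMBER token instead of emitting a second one.
            if not classed or classed[-1] != "NUMBER":
                classed.append("NUMBER")
        else:
            classed.append(word)
    return " ".join(classed)
```

All three example requests map to `i'd like to buy NUMBER cars`. A word list this naive also rewrites "one" in a request such as "i want the red one", which is one reason to review each sentence individually rather than rely on blanket substitution.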
An important side benefit of Named Entities is that their values can be passed to the IVR applications that the call router sends the caller to. To illustrate the power of this, imagine you had trained your call router to have PARTNUMBER and NUMBER Named Entities. That means that if someone called and asked:
- Hello, can I have 32 of part number XJ428?
Then the variables NUMBER=32 and PARTNUMBER=XJ428 would be passed to your IVR application. Not only are you routing the user automatically to the right place in your telephony environment, but you're saving them time by not asking them for information they have already given you.
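The Python snippet below is a minimal sketch of the idea of pulling entity values out of a request so they can be handed on. It does not reflect how the speech engines or the toolkit extract values: the regular expressions are hypothetical stand-ins for the grammars that would define each Named Entity, and the part number format (two letters followed by three digits) is assumed from the single example above.

```python
import re

# Hypothetical stand-ins for the grammars behind each Named Entity.
ENTITY_PATTERNS = {
    "PARTNUMBER": re.compile(r"\b[A-Z]{2}\d{3}\b"),
    "NUMBER": re.compile(r"\b\d+\b"),
}


def extract_entities(request: str) -> dict:
    """Return the first match for each Named Entity found in the request."""
    values = {}
    for name, pattern in ENTITY_PATTERNS.items():
        match = pattern.search(request)
        if match:
            values[name] = match.group()
    return values
```

Applied to the request above, the function returns `NUMBER` as `32` and `PARTNUMBER` as `XJ428`, which is the kind of name and value pair an IVR application could use to skip questions the caller has already answered.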
To get started assigning named entities, open the Natural Language perspective and expand the tree under your call router project in the Filter Navigator. Double-click on the Named Entity Extraction item to bring up a listing of all of the sentences in your database in a Sentence List view. Note that the column on the Sentence List says "Text." Text is what the database refers to as the incoming requests before they have had Named Entities assigned to them. Classed Text is how the database refers to text that has been generalized by replacing specific words or phrases with Named Entities.
Figure 2. Using the Filter Navigator view to launch a new Sentence List
Take a serious look at your entire list of requests and come up with a reasonable set of Named Entities that you think can be applied. It's important to do this ahead of time so that you assign Named Entities consistently across your entire database. Remember that you'll need to supply a grammar to represent each named entity, so make sure to set aside time to write those grammars when you're finished.
Your first step in assigning Named Entities is to double-click on the first request you want to edit. This brings up the Sentence Properties Editor. As the name implies, the Sentence Properties Editor lets you edit the properties associated with that request. There are many things that can be set, but for now you're focused on setting up the Named Entities, so go ahead and click on the Named Entities tab of the editor.
Figure 3. A request in the Sentence Properties Editor
When you begin, you'll see just the text of the sentence. Drag a rectangle with your mouse around a group of contiguous words or phrases that you want to group into a Named Entity. Because this is your first time, you'll get the Create Named Entity window. Go ahead and press New Named Entity in the window to create the first one. Be careful to name it something appropriate because Named Entities cannot be edited or removed!
Figure 4. New Named Entity window

After you've created a Named Entity, select it from the list in the Create Named Entity window and select OK. Now the editor contents should show the new Named Entity applied to the request.
Figure 5. A request in the Sentence Properties Editor with a Named Entity applied
When you're finished editing a request, you can press Page Down on the keyboard to move to the next one. The current contents are saved automatically. A good practice is to start at the first entry in the sentence list and use the Page Down key to move all the way down the list until you've completed looking at each one.
Your next task is to create a grammar file for each Named Entity. Create a grammar with exactly the same name as the Named Entity anywhere in the project, and inside each, place a few examples of what can be said. For example, in the previous request, place "sun roof" in the PART grammar file.
Congratulations -- you've completed the primary steps of building a natural language data set. Your Statistical Language Model and Action Classification models are now ready to be trained.
- Download the IBM WebSphere Voice Toolkit from AlphaWorks to get started.
- Check out the WebSphere Voice Zone for additional information about IBM WebSphere Voice products.
- Learn more about using the Eclipse platform at the Eclipse Web site.

Brent D. Metz is a Staff Software Engineer at IBM Pervasive Computing in Boca Raton, Florida. He has worked on the Natural Language Understanding (NLU) tooling integrated into the IBM WebSphere Voice Toolkit since its inception. Brent received a B.S. in Computer Science from Virginia Tech where he was the grand prize winner of IBM's Cool Blue VoiceXML Challenge.