Internet searching is taken for granted these days, and numerous search services are available. Web searching has also expanded: many companies with a Web presence now merge their Web data with traditional offline data, such as business directories and map and location information, so that you can search for a variety of businesses and information.
This information is well suited to VoiceXML (VXML): you can speak your searches and listen to the returned results. In this article, you will create an application that does this, and you will also:
- Review a Web searching workflow
- Create a generic class for outputting VXML form elements
- Create VXML grammar that supports a wide range of input
- Use the Yahoo search interface
- Run Web searches using VXML and Yahoo search
- Run local searches using VXML and Yahoo search
Voice, and audio in general, is more and more popular on the Web. Examples include the plethora of music and webcasts currently available online. This series shows several ways to combine voice and XML to develop the following useful applications:
- Part 1: a voice-enabled RSS reader.
- Part 2: a voice-enabled calendar.
- Part 3: a voice-enabled blogging and Twitter application.
- Part 4: a voice-enabled Yahoo search application.
The Web search workflow provides a simple menu that lets you select one of two types of search: the local search or the traditional Web search. The former requires two input values, the search term and the location; the latter requires just a search term. You can see the basic sequence in Figure 1.
Figure 1. Web searching workflow
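As a rough sketch of this workflow, the two search types and the inputs each one needs can be modeled as a small Java enum. The names SearchType, requiredFields(), and namelist() are illustrative assumptions and are not part of the article's sample code; the field names phrase and location are the ones the article's form uses.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the workflow: each search type lists the fields it must collect.
enum SearchType {
    WEB("phrase"),
    LOCAL("phrase", "location");

    private final List<String> fields;

    SearchType(String... fields) {
        this.fields = Arrays.asList(fields);
    }

    // Fields the VoiceXML form has to fill before the search can run.
    List<String> requiredFields() {
        return fields;
    }

    // Space-separated form, suitable for a VoiceXML namelist attribute.
    String namelist() {
        return String.join(" ", fields);
    }
}
```

Keeping the required fields next to the search type means the menu choice alone decides which prompts the caller hears.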
In previous parts of this series, you saw examples of how to support a basic menu selection, so you'll ignore that for this article.
For the search terms, though, you will support a much wider range of input terms for the VoiceXML grammar to recognize, and you'll build a simple VoiceXML class to output this information as part of the application.
Creating a freeform grammar input
As you saw in Part 3 of this series, when you looked at blogging solutions for VoiceXML, no convenient transcription method is available at the moment that will read in any spoken text and convert it into a text format. Instead, you must specify the words and phrases that you expect to receive so that the voice recognition system is more accurate.
Normally, if you specify or expect a specific phrase, such as "I am sad", you define the phrase explicitly in the grammar rules, for example: (eye am sad).
For a Web search, you want text input that is more freeform. Although you cannot blindly accept any text because of the lack of "transcript" support, you can provide a range of terms and then let the user repeat different terms as much as they like.
To achieve this, you specify a group for a given list of terms, and then create a rule that accepts one or more of these terms. For example, to support "java", "apple", or "windows", or any combination of these terms, you specify the term group: Terms [ (java) (apple) (windows) ].
And then you add a rule to the same grammar that expects a repetition of this: .Phrase +Terms.
Now the user can repeat any of those terms, or any combination, any number of times, such that "java apple" or "java apple windows" are valid.
Of course, on its own this is comparatively useless, as you could achieve this through other means. Where it shines is if you combine it with other English words that can join terms or provide tighter control. For example, you can expand the definition with: Terms [ (java) (apple) (windows) (and) (or) ].
The user is then able to speak "java and apple" or "java and windows" and even "java and apple or windows".
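To see which utterances the +Terms rule admits, the following minimal Java sketch mimics the rule in plain code. It is not how a voice browser evaluates GSL; the PhraseRuleSketch class and its accepts() method are hypothetical names introduced only for illustration, and the terms are assumed to be lowercase.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the ".Phrase +Terms" rule: one or more terms, back to back.
final class PhraseRuleSketch {

    // The expanded term group from the text, including the joining words.
    static final List<String> TERMS =
        Arrays.asList("java", "apple", "windows", "and", "or");

    // Returns true when the utterance consists only of listed terms (at least one).
    static boolean accepts(String utterance, List<String> terms) {
        String rest = utterance.trim().toLowerCase().replaceAll("\\s+", " ");
        if (rest.isEmpty()) {
            return false; // "+" means at least one term is required
        }
        return matchFrom(rest, terms);
    }

    // Tries every term as the next word or phrase, then recurses on the remainder.
    private static boolean matchFrom(String rest, List<String> terms) {
        for (String term : terms) {
            if (rest.equals(term)) {
                return true;
            }
            if (rest.startsWith(term + " ")
                    && matchFrom(rest.substring(term.length() + 1), terms)) {
                return true;
            }
        }
        return false;
    }
}
```

With the expanded group, accepts("java and apple or windows", TERMS) returns true, while a word outside the list, such as "linux", makes the whole phrase fail. Because the joining words sit in the same group as the search terms, a phrase like "and and" is accepted too, so the code that receives the phrase should be prepared for such input.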
Now you'll use this to build a system that accepts almost freeform input using a standardized output class.
Creating a generic VXML output class
You can create a standardized VXML output class that generates specific elements or fragments according to a pre-defined format. This simplifies the production of VXML and, especially with a freeform input format, makes the production of the information much easier.
You can adapt the WebSearchVXML class to work with many VXML applications, as it provides some simple methods to output the VXML you will need for your application. This version outputs to standard output (using the System.out.println() method), but you can easily adapt it to work within a Web-based application for the dynamic elements of a VoiceXML application.
The class provides the following information and methods:
- A list of supported Web search terms
- A list of supported local search terms
- A list of supported cities or locations
- VXMLPrompt outputs a '<prompt>message</prompt>' block
- VXMLTerms outputs a field block with an embedded prompt block and a freeform terms grammar output
- VXMLOptions outputs a field block with an embedded prompt block and a list of worded options (suitable when you want the user to select one of a list of possibilities, for example, with the list of cities)
- VXMLHeader outputs a standardized header
- VXMLFooter outputs a simple footer
Listing 1 shows the resulting class.
Listing 1. Standardized VXML class
import java.io.PrintStream;

public class WebSearchVXML {
    // Terms accepted by the Web search grammar
    String webterms[] = { "java",
                          "apple",
                          "windows",
                          "britney spears",
                          "lindsey lohan" };
    // Terms accepted by the local search grammar
    String localterms[] = { "plumber",
                            "restaurant",
                            "electrician",
                            "supermarket" };
    // Locations offered as options
    String cities[] = { "London",
                        "Nottingham",
                        "Manchester",
                        "York" };
    // Destination for the generated VXML (standard output in this version)
    PrintStream out = System.out;

    public WebSearchVXML() {
    }

    // Writes a GSL grammar that accepts one or more of the supplied terms
    private void OutputGSLTerms(String[] terms) {
        this.out.println("<grammar type=\"text/gsl\">" +
                         "<![CDATA[\n" +
                         ".Phrase +Terms\n" +
                         "Terms [ ");
        for (int i = 0; i < terms.length; i++) {
            this.out.println("(" + terms[i] + ")");
        }
        this.out.println("]\n" +
                         "]]>" +
                         "</grammar>");
    }

    // Writes one <option> element per term
    private void OutputOptionTerms(String[] terms) {
        for (int i = 0; i < terms.length; i++) {
            this.out.println("<option>" + terms[i] + "</option>");
        }
    }

    public void VXMLPrompt(String prompt) {
        this.out.println("<prompt>" + prompt + "</prompt>");
    }

    // Field with a prompt and the freeform terms grammar
    public void VXMLTerms(String fieldname,
                          String prompt,
                          String[] terms) {
        this.out.println("<field name=\"" + fieldname + "\">");
        VXMLPrompt(prompt);
        OutputGSLTerms(terms);
        this.out.println("</field>");
    }

    // Field with a prompt and a fixed list of options
    public void VXMLOptions(String fieldname,
                            String prompt,
                            String[] terms) {
        this.out.println("<field name=\"" + fieldname + "\">");
        VXMLPrompt(prompt);
        OutputOptionTerms(terms);
        this.out.println("</field>");
    }

    public void VXMLHeader() {
        this.out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
        this.out.println("<vxml version=\"2.1\">");
    }

    public void VXMLFooter() {
        this.out.println("</vxml>");
    }
}
The result is that you can very quickly output a VXML form to accept input from a user.
For example, you might use the code in Listing 2 to generate the VXML for a "local" search.
Listing 2. Outputting a simple search
WebSearchVXML search = new WebSearchVXML();
search.VXMLHeader();
System.out.println("<form>");
System.out.println("<block>");
search.VXMLPrompt("Welcome to the Internet search service");
System.out.println("</block>");
search.VXMLTerms("phrase",
                 "Enter the search phrase to use when searching the web",
                 search.webterms);
search.VXMLOptions("location",
                   "Enter the location to limit your search to",
                   search.cities);
// Form-level <filled> runs once both fields are set and submits their values
System.out.println("<filled><submit next=\"/VXMLSearch/search\" " +
                   "namelist=\"phrase location\"/></filled>");
System.out.println("</form>");
search.VXMLFooter();
Listing 3 shows the resulting VXML generated by this.
Listing 3. VXML output
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1">
  <form>
    <block>
      <prompt>Welcome to the Internet search service</prompt>
    </block>
    <field name="phrase">
      <prompt>Enter the search phrase to use when searching the web</prompt>
      <grammar type="text/gsl"><![CDATA[
.Phrase +Terms
Terms [
(java)
(apple)
(windows)
(britney spears)
(lindsey lohan)
]
]]></grammar>
    </field>
    <field name="location">
      <prompt>Enter the location to limit your search to</prompt>
      <option>London</option>
      <option>Nottingham</option>
      <option>Manchester</option>
      <option>York</option>
    </field>
    <filled><submit next="/VXMLSearch/search" namelist="phrase location"/></filled>
  </form>
</vxml>
The real benefit here is that to change or expand the list of supported terms or cities you can just update the string array.
The resulting options and supported input values are not quite freeform text, but with a suitably large list of terms, it can be quite extensive.
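The earlier note about adapting the class for a Web-based application can be illustrated with a minimal sketch in which the output stream is passed in rather than fixed to System.out. The StreamVxmlWriter class and its method names are hypothetical and are not part of the article's download; in a servlet you would presumably hand it a PrintStream wrapped around the response's output stream instead of the in-memory buffer used here.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical variant of the output class that writes to any PrintStream.
final class StreamVxmlWriter {
    private final PrintStream out;

    StreamVxmlWriter(PrintStream out) {
        this.out = out;
    }

    void header() {
        out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
        out.println("<vxml version=\"2.1\">");
    }

    void prompt(String message) {
        out.println("<prompt>" + message + "</prompt>");
    }

    void footer() {
        out.println("</vxml>");
    }

    // Renders a one-prompt document into a String instead of standard output.
    static String render(String message) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            PrintStream stream = new PrintStream(buffer, true, "UTF-8");
            StreamVxmlWriter writer = new StreamVxmlWriter(stream);
            writer.header();
            stream.println("<form><block>");
            writer.prompt(message);
            stream.println("</block></form>");
            writer.footer();
            stream.flush();
            return buffer.toString("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```

Passing the stream in through the constructor keeps the generation code unchanged whether the VXML goes to the console, a file, or an HTTP response.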
Using the Yahoo search interface
With the Yahoo search interface, you can search nearly all of the Yahoo search databases. This convenient interface provides methods to create the different search parameters and obtain the details from the results.
Before you start to use the Yahoo search system, you must register for an API key. Once you have the API key, download the Yahoo search SDK (see Resources). The SDK is in the form of a Java™ JAR file that contains everything you need to communicate with the Yahoo search interface.
To compile a Java program that uses the API, you must include the Yahoo JAR in the class path: $ javac -cp yahoo_search-2.0.2.jar LocalSearch.java.
Now, you can put together a simple search function that will take a search term and output the VXML required to list the results found.
To run a simple Web search, create a SearchClient() object instance using the API key that you received from Yahoo when you registered. This creates a client object that will perform the search, and in fact supports any search type.
For the request, you create a new request object using the WebSearchRequest() class, specifying the string to use as the search term or terms.
When you submit the request using the client, Yahoo returns a WebSearchResults object that contains generalized information about the request and an array of individual results. Each result contains the page title, the URL, and other information, much of which is comparatively useless in a VoiceXML application. You can see the entire sequence in Listing 4.
Listing 4. Performing a Web search with Yahoo
public static void DoSearch(String term) {
    // The client handles every search type; the key comes from your Yahoo registration
    SearchClient client = new SearchClient("INSERTKEY");
    WebSearchRequest request = new WebSearchRequest(term);
    try {
        WebSearchResults results = client.webSearch(request);
        // Summary prompt: total hits and how many results follow
        System.out.println("<block><prompt>Found " +
                           results.getTotalResultsAvailable() +
                           " hits for " +
                           term +
                           ". Detailing the top " +
                           results.getTotalResultsReturned() +
                           " results.</prompt></block>");
        // One prompt block per result, reading out the page title
        for (int i = 0; i < results.listResults().length; i++) {
            WebSearchResult result = results.listResults()[i];
            System.out.println("<block><prompt>" +
                               result.getTitle() +
                               ".<break/>" +
                               "</prompt></block>");
        }
    }
    catch (Exception e) {
        // Report the cause instead of a bare message
        System.err.println("Error calling Yahoo! Search Service: " +
                           e.toString());
    }
}
The code in Listing 4 is designed to output prompt blocks of text, one for the overall search information (that is, the number of hits) and then a block for each of the results found in the Yahoo Web search database.
Listing 5 shows the sample VXML generated in the process.
Listing 5. Generated VXML
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1">
  <form>
    <block><prompt>Found 73600000 hits for britney spears. Detailing the top 10 results.</prompt></block>
    <block><prompt>Britney Spears - The Official Site.<break/></prompt></block>
    <block><prompt>WORLDOFBRITNEY.COM.<break/></prompt></block>
    <block><prompt>Britney Spears - britney.com - Jive Records.<break/></prompt></block>
    <block><prompt>Britney Spears Zone | Britney Divorce, Pregnant, Baby Pictures, Pics &amp; News.<break/></prompt></block>
    <block><prompt>britneyspears.org.<break/></prompt></block>
    <block><prompt>WoBPictures.com [100030013].<break/></prompt></block>
    <block><prompt>Britney Spears - Wikipedia, the free encyclopedia.<break/></prompt></block>
    <block><prompt>Britney Spears | Music Artist | Videos, News, Photos &amp; Ringtones | MTV.<break/></prompt></block>
    <block><prompt>Britney Spears.<break/></prompt></block>
    <block><prompt>Britney Spears - Yahoo! Music.<break/></prompt></block>
    <disconnect/>
  </form>
</vxml>
When the voice browser speaks these results, it reads some of the information oddly, because it tries to identify the type of each piece of output. For example, 'WORLDOFBRITNEY.COM' will be spelled out as letters ('W', 'O', 'R', ...), since the browser will assume it is an acronym.
Other values will be spoken according to what the voice browser thinks they are. For example, the hit count will be correctly spoken as "seventy three million, six hundred thousand". Unfortunately, the number in the "WoBPictures.com" title will also be spoken as if it were a bare number.
With the Yahoo search system, the local search relies on both a search term and a location. Because there are a variety of ways in which you can specify a location, the method for specifying the search information is slightly more complex than the Web search interface.
For the local search, therefore, you accept the search term and the location from the VoiceXML select page and then execute a search based on that information. The sequence is:
- Create the Yahoo SearchClient request
- Set the location for the search (using setLocation())
- Set the query for the search (using setQuery())
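The step of accepting the two values before this sequence runs can be sketched without any Yahoo classes. The helper below, a hypothetical SubmitParams class, decodes a query string of the kind a voice browser might send for the fields phrase and location. How your server actually exposes request parameters (for example, through a servlet container) is left open here, and the query string format is an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns "phrase=...&location=..." into a name/value map.
final class SubmitParams {

    static Map<String, String> parse(String query) {
        Map<String, String> values = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return values;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip pairs without a name
            }
            values.put(decode(pair.substring(0, eq)),
                       decode(pair.substring(eq + 1)));
        }
        return values;
    }

    // A local search needs both a non-empty phrase and a non-empty location.
    static boolean isLocalSearch(Map<String, String> values) {
        return present(values.get("phrase")) && present(values.get("location"));
    }

    private static boolean present(String value) {
        return value != null && !value.trim().isEmpty();
    }

    private static String decode(String text) {
        try {
            return URLDecoder.decode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

Checking for both values up front lets the application fall back to a plain Web search, or prompt the caller again, instead of sending an incomplete request.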
Once the search object has been assembled, the basic process of extracting the result data is identical to that in the Web search solution.
You can see a sample of this in action in Listing 6.
Listing 6. Outputting VoiceXML from a local search
// Needs java.util.regex.Pattern and java.util.regex.Matcher imported
public static void DoSearch(String term, String location) {
    SearchClient client = new SearchClient("INSERTKEY");
    // A local search request carries both the location and the query
    LocalSearchRequest request = new LocalSearchRequest();
    request.setLocation(location);
    request.setQuery(term);
    try {
        LocalSearchResults results = client.localSearch(request);
        System.out.println("<block><prompt>Found " +
                           results.getTotalResultsAvailable() +
                           " hits for " +
                           term + " in " + location +
                           ". Detailing the top " +
                           results.getTotalResultsReturned() +
                           " results.</prompt></block>");
        // Compiled once and reused for every title
        Pattern pattern = Pattern.compile("&");
        for (int i = 0; i < results.listResults().length; i++) {
            LocalSearchResult result = results.listResults()[i];
            // Replace bare ampersands so the output stays valid XML
            Matcher matcher = pattern.matcher(result.getTitle());
            System.out.println("<block><prompt>" +
                               matcher.replaceAll("and") +
                               ".<break/> Phone number: " +
                               result.getPhone() +
                               "</prompt></block>");
        }
    }
    catch (Exception e) {
        System.err.println("Error calling Yahoo! Search Service: " +
                           e.toString());
    }
}
One thing to note in this particular output is that you replace any bare ampersand (&) characters with the word "and". Most VoiceXML systems are designed to work with most of the XML/HTML escaped characters (in the case of &, it becomes &amp;), but some voice browsers will refuse to process the document if the information is not properly escaped, because a bare ampersand breaks the XML validity of the VXML document. Be careful to output the information in the correct format. To achieve this in the above example, you use a regular expression to perform the replacement.
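As an alternative to replacing the ampersand with a word, the text can be escaped before it is placed inside a prompt. The following is a minimal sketch under the assumption that the titles arrive as raw, unescaped text; the XmlEscapeSketch class is a hypothetical helper and is not part of the article's sample code or of the Yahoo SDK.

```java
// Hypothetical helper that escapes the five predefined XML entities.
final class XmlEscapeSketch {

    static String escape(String text) {
        StringBuilder escaped = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  escaped.append("&amp;");  break;
                case '<':  escaped.append("&lt;");   break;
                case '>':  escaped.append("&gt;");   break;
                case '"':  escaped.append("&quot;"); break;
                case '\'': escaped.append("&apos;"); break;
                default:   escaped.append(c);
            }
        }
        return escaped.toString();
    }
}
```

Text that is already escaped would be escaped a second time by this helper, so it only suits raw input. Whether a given voice browser then speaks the ampersand as "and" depends on the platform, which is one reason the word replacement in Listing 6 can still be the more predictable choice.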
The second point to note with this example is that you output the phone number along with the title of the company that was found. Many voice browsers will identify the phone number from the number format and correctly output the digits as a phone number, rather than say, for example, "nine hundred" or "six hundred and six".
You've seen many examples of the full VXML. Now look at the basic content output (organized into blocks) that will output each of the search results (see Listing 7).
Listing 7. Basic content output
<block><prompt>Found 14 hits for plumber in New York.
Detailing the top 10 results.</prompt></block>
<block><prompt>State Plumbing Inspector.<break/>
Phone number: (606) 862-1297</prompt></block>
<block><prompt>Hibbitts Brothers Wholesale.<break/>
Phone number: (606) 864-2256</prompt></block>
<block><prompt>State Plumbing Inspector Paul Ray.<break/>
Phone number: (606) 862-1297</prompt></block>
<block><prompt>Willard Neeley Plumbing and Htg.<break/>
Phone number: (606) 864-6203</prompt></block>
<block><prompt>Vanhook Plumbing Htg and Cooling.<break/>
Phone number: (606) 862-8228</prompt></block>
<block><prompt>Rooter Man of South Eastern Ky.<break/>
Phone number: (606) 878-1339</prompt></block>
<block><prompt>Kettry Roaden Plumbing.<break/>
Phone number: (606) 528-3396</prompt></block>
<block><prompt>Prestige Marble.<break/>
Phone number: (606) 523-9186</prompt></block>
<block><prompt>Herb King Plumbing.<break/>
Phone number: (606) 364-4534</prompt></block>
<block><prompt>AAA Plumbing.<break/>
Phone number: (606) 528-0705</prompt></block>
The spoken version of this output will be less than perfect. You will, for example, have "Htg" in "Willard Neeley Plumbing and Htg." spelled out, because it's unlikely the voice browser will know to expand that value to "heating".
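One way to soften this problem is to expand known abbreviations before the title is written into the prompt. The sketch below assumes a small, hand-maintained lookup table; the SpokenTextSketch class and its single sample entry are hypothetical, and a real list would have to be built from the abbreviations that appear in your own results.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that swaps known abbreviations for speakable words.
final class SpokenTextSketch {

    // Sample entry only; extend with abbreviations seen in real results.
    private static final Map<String, String> EXPANSIONS = new LinkedHashMap<>();
    static {
        EXPANSIONS.put("Htg", "Heating");
    }

    static String expand(String title) {
        StringBuilder spoken = new StringBuilder();
        for (String word : title.split(" ")) {
            if (spoken.length() > 0) {
                spoken.append(' ');
            }
            // Look the word up without a trailing period, then put the period back.
            String core = word.endsWith(".")
                    ? word.substring(0, word.length() - 1)
                    : word;
            String suffix = word.substring(core.length());
            String expansion = EXPANSIONS.get(core);
            spoken.append(expansion != null ? expansion : core).append(suffix);
        }
        return spoken.toString();
    }
}
```

The lookup is an exact, case-sensitive match on whole words, so ordinary words that merely contain an abbreviation are left alone.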
This article covered the use of the Yahoo search API to output VXML from searches made both to the Yahoo Web page database, and to the Yahoo Local business directory database. In the resulting solution, a user can call the service and ask the system to list the top ten plumbers in New York or to list the top ten sites that mention "java" and "apple". The key here is the freeform phrase input that enables you to support phrases, words, and unlimited combinations of those words and phrases together to make the search as precise as possible.
You also looked at a standardized method to generate the VXML that might be used in a standard VoiceXML browser selection system.
I hope this series gives you a solid foundation for developing your own VoiceXML applications.
| Description | Name | Size | Download method |
|---|---|---|---|
| Part 4 sample code | x-voicexml4-search.zip | 3KB | HTTP |
Learn
- Yahoo Developer Network: Find downloads and resources for searching and integrating with the Yahoo services.
- Create VoiceXML pages within a Java Web developer framework, Part 1: Generate VoiceXML using Java servlets and JSPs (Brett McLaughlin, developerWorks, January 2006): Learn how Java servlets could easily power a VoiceXML application.
- Create VoiceXML pages within a Java Web developer framework, Part 2: Expanding Java-driven VoiceXML applications (Brett McLaughlin, developerWorks, January 2006): Learn how to use servlets to go beyond single-page applications.
- VoiceXML 2.1 specification: Read about the set of features commonly implemented by Voice Extensible Markup Language platforms.
- IBM XML certification: Find out how you can become an IBM-Certified Developer in XML and related technologies.
- XML technical library: See the developerWorks XML Zone for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks.
- developerWorks technical events and webcasts: Stay current with technology in these sessions.
- The technology bookstore: Browse for books on these and other technical topics.
Get products and technologies
- Rome RSS/Atom syndication: Download these open source Java tools and libraries for parsing, generating and publishing RSS and Atom feeds.
- Apache XML-RPC Client: Get a simplified interface for connecting to XML-RPC sources.
- Voxeo: Find a wealth of information and a hosting solution for VoiceXML applications that provides access through traditional, VoIP and Skype.
- IBM trial software: Build your next development project with trial software available for download directly from developerWorks.
Discuss
- Participate in the discussion forum.
- Ajax forum: Join the discussion on Ajax, its frameworks, and related technologies that speed application development.
- XML zone discussion forums: Participate in any of several XML-related discussion forums.
- developerWorks XML zone: Share your thoughts: After you read this article, post your comments and thoughts in this forum. The XML zone editors moderate the forum and welcome your input.
- developerWorks blogs: Check out these blogs and get involved in the developerWorks community.

Martin Brown has been a professional writer for over eight years. He is the author of numerous books and articles across a range of topics. His expertise spans myriad development languages and platforms -- Perl, Python, Java, JavaScript, Basic, Pascal, Modula-2, C, C++, Rebol, Gawk, Shellscript, Windows, Solaris, Linux, BeOS, Mac OS/X and more -- as well as Web programming, systems management and integration. Martin is a regular contributor to ServerWatch.com, LinuxToday.com and IBM developerWorks, and a regular blogger at Computerworld, The Apple Blog and other sites, as well as a Subject Matter Expert (SME) for Microsoft. He can be contacted through his Web site at http://www.mcslp.com.