
Voice enabling XML, Part 4: Develop a Web search application for VoiceXML

Search the Web and hear the results of a voice-enabled Yahoo search

Martin Brown, Developer and writer, Freelance
Martin Brown has been a professional writer for over eight years. He is the author of numerous books and articles across a range of topics. His expertise spans myriad development languages and platforms -- Perl, Python, Java, JavaScript, Basic, Pascal, Modula-2, C, C++, Rebol, Gawk, Shellscript, Windows, Solaris, Linux, BeOS, Mac OS/X and more -- as well as Web programming, systems management and integration. Martin is a regular contributor to ServerWatch.com, LinuxToday.com and IBM developerWorks, and a regular blogger at Computerworld, The Apple Blog and other sites, as well as a Subject Matter Expert (SME) for Microsoft. He can be contacted through his Web site at http://www.mcslp.com.

Summary:  In this final article of a four-part series, develop an application that takes VoiceXML as input and queries the Yahoo Search API for both basic Web searches and Yahoo local searches. The query returns information about businesses within a specific location and region. The application then reads the results to the caller after submission.

Date:  02 Oct 2007
Level:  Intermediate

Introduction

Internet searching is taken for granted these days, and numerous search services are available. Web searching has also expanded. Now that so many companies have a Web presence, many of them merge their Web data with traditional offline data, such as business directories and map and location information, so that you can search for a wide variety of businesses and information.

This information is a perfect fit for VoiceXML (VXML): you speak your searches and listen to the returned results. In this article, you will create an application that does this, and you will also:

  • Review a Web searching workflow
  • Create a generic class for outputting VXML form elements
  • Create VXML grammar that supports a wide range of input
  • Use the Yahoo search interface
  • Run Web searches using VXML and Yahoo search
  • Run local searches using VXML and Yahoo search

About this series

Voice, and audio in general, is more and more popular on the Web. Examples include the plethora of music and webcasts currently available online. This series shows several ways to combine voice and XML to develop the following useful applications:

  • Part 1: a voice-enabled RSS reader.
  • Part 2: a voice-enabled calendar.
  • Part 3: a voice-enabled blogging and Twitter application.
  • Part 4: a voice-enabled Yahoo search application.

Web searching workflow

The Web search workflow provides a simple menu from which you select one of two types of search: the local search or the traditional Web search. The former requires two input values, the search term and the location; the latter requires just a search term. You can see the basic sequence in Figure 1.


Figure 1. Web searching workflow

In previous parts of this series, you saw examples of how to support a basic menu selection, so you'll ignore that for this article.

For the search terms, though, you will support a much wider choice of input terms for the voice browser to recognize, and you'll build a simple VoiceXML class to output this information as part of the application.


Creating a freeform grammar input

As you saw in Part 3 of this series, which looked at blogging solutions for VoiceXML, no convenient transcription method is currently available that will take arbitrary speech and convert it into text. Instead, you must specify the words and phrases that you expect to receive so that the voice recognition system is more accurate.

Normally, if you specify or expect a specific phrase, such as "I am sad" you define the phrase explicitly in the grammar rules, for example: (eye am sad).

For a Web search, you want text input that is more freeform. Although you cannot blindly accept any text because of the lack of "transcript" support, you can provide a range of terms and then let the user repeat different terms as much as they like.

To achieve this, you specify a group for a given list of terms, and then create a rule that accepts one or more of these terms. For example, to support "java", "apple", or "windows", or any combination of these terms, you specify the term group: Terms [ (java) (apple) (windows) ].

And then you add a rule to the same grammar that expects a repetition of this: .Phrase +Terms.

Now the user can repeat any of those terms, or any combination, any number of times, such that "java apple" or "java apple windows" are valid.

Of course, on its own this is comparatively useless, as you could achieve this through other means. Where it shines is if you combine it with other English words that can join or provide tighter control. For example, if you expand the definition with: Terms [ (java) (apple) (windows) (and) (or) ].

The user is then able to speak "java and apple" or "java and windows" and even "java and apple or windows".
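To make the behavior of the repetition rule concrete, here is a minimal Java sketch that mimics what a rule such as .Phrase +Terms accepts: one or more entries from the term list, in any order. The class and method names (TermPhraseCheck, accepts) are hypothetical and are not part of any VoiceXML platform; a real voice browser performs this matching on audio, not on text.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: approximates, on plain text, what ".Phrase +Terms" accepts.
class TermPhraseCheck {

    // True when the utterance consists of one or more allowed terms in sequence.
    static boolean accepts(String utterance, List<String> terms) {
        String text = utterance.trim().toLowerCase();
        if (text.isEmpty()) {
            return false;               // "+" means at least one term is required
        }
        return consume(text, terms);
    }

    // Try each term as a prefix of the remaining text, then recurse on the rest.
    private static boolean consume(String rest, List<String> terms) {
        for (String term : terms) {
            if (rest.equals(term)) {
                return true;            // the last term used up the whole input
            }
            if (rest.startsWith(term + " ")
                    && consume(rest.substring(term.length() + 1).trim(), terms)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("java", "apple", "windows", "and", "or");
        System.out.println(accepts("java and apple or windows", terms)); // true
        System.out.println(accepts("java and linux", terms));            // false
    }
}
```

Because every term is tried as a prefix, multi-word entries such as "britney spears" work the same way as single words.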

Now you'll use this to build a system that accepts almost freeform input using a standardized output class.


Creating a generic VXML output class

You can create a standardized VXML output class that generates specific elements or fragments according to a pre-defined format. This simplifies the production of VXML and, especially with a freeform input format, makes it much easier to produce the information.

You can adapt the WebSearchVXML class to work with many VXML applications, as it provides some simple methods to output the right VXML you will need for your application. This version outputs to standard output (using the System.out.println() method), but you can easily adapt it to work within a Web-based application for the dynamic elements of a VoiceXML application.

The class provides the following information and methods:

  • A list of supported Web search terms
  • A list of supported local search terms
  • A list of supported cities or locations
  • VXMLPrompt outputs a '<prompt>message</prompt>' block
  • VXMLTerms outputs a field block with an embedded prompt block and a freeform terms grammar output
  • VXMLOptions outputs a field block with an embedded prompt block and a list of worded options (suitable when you want the user to select one of a list of possibilities, for example, with the list of cities)
  • VXMLHeader outputs a standardized header
  • VXMLFooter outputs a simple footer

Listing 1 shows the resulting class.


Listing 1. Standardized VXML class
                

import java.io.PrintStream;

public class WebSearchVXML {

    String webterms[] = { "java",
                          "apple",
                          "windows",
                          "britney spears",
                          "lindsey lohan"};

    String localterms[] = { "plumber",
                            "restaurant",
                            "electrician",
                            "supermarket"};

    String cities[] = { "London",
                        "Nottingham",
                        "Manchester",
                        "York",};

    PrintStream out = System.out;

    // A constructor has no return type; with "void" this would be an ordinary method
    public WebSearchVXML() {
    }

    private void OutputGSLTerms(String[] terms) {

        this.out.println("<grammar type=\"text/gsl\">" +
                           "<![CDATA[\n" +
                           ".Phrase +Terms\n" +
                           "Terms [ ");

        for(int i=0;i<terms.length;i++) {
            this.out.println("(" + terms[i] + ")");
        }

        this.out.println("]\n" +
                           "]]>" +
                           "</grammar>");
    }

    private void OutputOptionTerms(String[] terms) {
        for(int i=0;i<terms.length;i++) {
            this.out.println("<option>" + terms[i] + "</option>");
        }
    }

    public void VXMLPrompt(String prompt) {
        this.out.println("<prompt>" + prompt + "</prompt>");
    }

    public void VXMLTerms(String fieldname,
                          String prompt,
                          String [] terms) {
        this.out.println("<field name=\"" + fieldname + "\">");
        VXMLPrompt(prompt);
        OutputGSLTerms(terms);
        this.out.println("</field>");
    }

    public void VXMLOptions(String fieldname,
                          String prompt,
                          String [] terms) {
        this.out.println("<field name=\"" + fieldname + "\">");
        VXMLPrompt(prompt);
        OutputOptionTerms(terms);
        this.out.println("</field>");
    }

    public void VXMLHeader() {
        this.out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
        this.out.println("<vxml version=\"2.1\">");
    }

    public void VXMLFooter() {
        this.out.println("</vxml>");
    }
}

The result is that you can very quickly output a VXML form to accept input from a user.
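If you want to use the same approach inside a Web-based application, one option is to write to a caller-supplied stream and capture the result in memory. The following is a minimal sketch of that idea, not the article's class: the names VxmlCaptureSketch, writeOptionsField, and renderOptionsField are hypothetical, and only the option-list field is shown.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical variation: the output stream is a parameter rather than System.out.
class VxmlCaptureSketch {

    // Writes an option-list field to whichever stream is supplied.
    static void writeOptionsField(PrintStream out, String fieldName,
                                  String prompt, String[] options) {
        out.println("<field name=\"" + fieldName + "\">");
        out.println("<prompt>" + prompt + "</prompt>");
        for (String option : options) {
            out.println("<option>" + option + "</option>");
        }
        out.println("</field>");
    }

    // Captures the generated VXML in memory instead of printing it.
    static String renderOptionsField(String fieldName, String prompt,
                                     String[] options) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(buffer, true, "UTF-8");
            writeOptionsField(out, fieldName, prompt, options);
            out.flush();
            return buffer.toString("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] cities = { "London", "York" };
        System.out.print(renderOptionsField("location",
                "Enter the location to limit your search to", cities));
    }
}
```

The returned string could then be written to an HTTP response by whatever Web framework you use.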


Using the generic interface

For example, you might use the code in Listing 2 to generate the VXML for a "local" search.


Listing 2. Outputting a simple search
                
WebSearchVXML search = new WebSearchVXML();

search.VXMLHeader();
System.out.println("<form>");
System.out.println("<block>");
search.VXMLPrompt("Welcome to the Internet search service");
System.out.println("</block>");
search.VXMLTerms("phrase",
    "Enter the search phrase to use when searching the web",
    search.webterms);
search.VXMLOptions("location",
    "Enter the location to limit your search to",
    search.cities);
System.out.println("<filled><submit next=\"/VXMLSearch/search\" " +
    "namelist=\"phrase location\"/></filled>");
System.out.println("</form>");
search.VXMLFooter();

Listing 3 shows the resulting VXML generated by this.


Listing 3. VXML output
                
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1">
<form>
<block>
<prompt>Welcome to the Internet search service</prompt>
</block>
<field name="phrase">
<prompt>Enter the search phrase to use when searching the web</prompt>
<grammar type="text/gsl"><![CDATA[
.Phrase +Terms
Terms [ 
(java)
(apple)
(windows)
(britney spears)
(lindsey lohan)
]
]]></grammar>
</field>
<field name="location">
<prompt>Enter the location to limit your search to</prompt>
<option>London</option>
<option>Nottingham</option>
<option>Manchester</option>
<option>York</option>
</field>
<filled><submit next="/VXMLSearch/search" namelist="phrase location"/></filled>
</form>
</vxml>

The real benefit here is that to change or expand the list of supported terms or cities you can just update the string array.

The resulting options and supported input values are not quite freeform text, but with a suitably large list of terms, it can be quite extensive.


Using the Yahoo search interface

In the Yahoo search interface, you can search nearly all of the Yahoo search databases. This convenient interface provides methods to create the different search parameters and obtain the details from the results.

Before you start to use the Yahoo search system, you must register for an API key. Once you have the API key, download the Yahoo search SDK (see Resources). The SDK is in the form of a Java™ JAR file that contains everything you need to communicate with the Yahoo search interface.

To compile a Java program that uses the API, you must include the Yahoo JAR in the class path:

$ javac -cp yahoo_search-2.0.2.jar LocalSearch.java

Now, you can put together a simple search function that will take a search term and output the VXML required to list the results found.


Running a Web search

To run a simple Web search, create a SearchClient() object instance using the API key that you received from Yahoo when you registered. This creates a client object that will perform the search, and in fact supports any search type.

For the request, you create a new request object using the WebSearchRequest() class, specifying the string to use as the search term or terms.

When you submit the request, using the client, Yahoo returns a WebSearchResults object that contains generalized information about the request, and an array of individual results. Each result contains the page title, URL, and other information, much of which is comparatively useless in a VoiceXML application. You can see the entire sequence in Listing 4.


Listing 4. Performing a Web search with Yahoo
                
public static void DoSearch(String term) {
    SearchClient client = new SearchClient("INSERTKEY");

    WebSearchRequest request = new WebSearchRequest(term);

    try {
        WebSearchResults results = client.webSearch(request);

        System.out.println("<block><prompt>Found " +
                           results.getTotalResultsAvailable() +
                           " hits for " +
                           term +
                           ". Detailing the top " +
                           results.getTotalResultsReturned() +
                           " results.</prompt></block>");

        for (int i = 0; i < results.listResults().length; i++) {
            WebSearchResult result = results.listResults()[i];

            System.out.println("<block><prompt>" +
                               result.getTitle() +
                               ".<break/>" +
                               "</prompt></block>");
        }
    }
    catch (Exception e) {
        System.err.println("An error occurred");
    }
}

The code in Listing 4 is designed to output prompt blocks of text, one for the overall search information (that is, the number of hits) and then a block for each of the results found in the Yahoo Web search database.

Listing 5 shows the sample VXML generated in the process.


Listing 5. Generated VXML
                
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1">
<form>
<block><prompt>Found 73600000 hits for britney spears. Detailing the top 10 results.</prompt></block>
<block><prompt>Britney Spears - The Official Site.<break/></prompt></block>
<block><prompt>WORLDOFBRITNEY.COM.<break/></prompt></block>
<block><prompt>Britney Spears - britney.com - Jive Records.<break/></prompt></block>
<block><prompt>Britney Spears Zone | Britney Divorce, Pregnant, Baby Pictures, Pics & News.<break/></prompt></block>
<block><prompt>britneyspears.org.<break/></prompt></block>
<block><prompt>WoBPictures.com [100030013].<break/></prompt></block>
<block><prompt>Britney Spears - Wikipedia, the free encyclopedia.<break/></prompt></block>
<block><prompt>Britney Spears | Music Artist | Videos, News, Photos & Ringtones | MTV.<break/></prompt></block>
<block><prompt>Britney Spears.<break/></prompt></block>
<block><prompt>Britney Spears - Yahoo! Music.<break/></prompt></block>
<disconnect/>
</form>
</vxml>

When the voice browser speaks, it describes some of the information in a funny way, because it tries to identify the type of output. For example, 'WORLDOFBRITNEY.COM' will be output as letters ('W','O', 'R',...), since the browser will assume it is an acronym.

Other values will be output according to what the voice browser thinks they are. For example, the hit count will be correctly spoken as "seventy three million, six hundred thousand". Unfortunately, the number in the "WoBPictures.com" title will also be described as if it was a bare number.


Running a local search

With the Yahoo search system, the local search relies on both a search term and a location. Because there are a variety of ways in which you can specify a location, the method for specifying the search information is slightly more complex than the Web search interface.

For the local search, therefore, you accept the search term and the location from the VoiceXML selection page and then execute a search based on that information. The sequence is:

  • Create the Yahoo SearchClient object and a LocalSearchRequest object
  • Set the location for the search (using setLocation())
  • Set the query for the search (using setQuery())

Once the search object has been assembled, the basic process of extracting the result data is identical to that in the Web search solution.

You can see a sample of this in action in Listing 6.


Listing 6. Outputting VoiceXML from a local search
                
public static void DoSearch (String term,String location) {
    SearchClient client = new SearchClient("INSERTKEY");

    LocalSearchRequest request = new LocalSearchRequest();
    request.setLocation(location);
    request.setQuery(term);

    try {
        LocalSearchResults results = client.localSearch(request);

        System.out.println("<block><prompt>Found " +
                           results.getTotalResultsAvailable() +
                           " hits for " +
                           term + " in " + location +
                           ". Detailing the top " +
                           results.getTotalResultsReturned() +
                           " results.</prompt></block>");

        for (int i = 0; i < results.listResults().length; i++) {
            LocalSearchResult result = results.listResults()[i];

            Pattern pattern = Pattern.compile("&");
            Matcher matcher = pattern.matcher(result.getTitle());

            System.out.println("<block><prompt>" +
                               matcher.replaceAll("and") +
                               ".<break/> Phone number: " +
                               result.getPhone() +
                               "</prompt></block>");
        }
    }
    catch (Exception e) {
        System.err.println("Error calling Yahoo! Search Service: " +
                e.toString());
    }
}

One thing to note in this particular output is that you replace any bare ampersand (&) characters with the word "and". A bare ampersand breaks the XML validity of the VXML document. Most VoiceXML systems are designed to work with most of the XML/HTML escaped characters (in the case of &, it becomes &amp;), but some voice browsers will refuse to process the information if it is not properly escaped. Be careful to output the information in the correct format. In the example above, a regular expression performs the replacement.
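An alternative to substituting the word "and" is to escape the text so that the document stays well-formed whatever a title contains. The following minimal sketch shows that approach; the class and method names (XmlEscapeSketch, escapeXml, promptBlock) are hypothetical, and how a particular voice browser pronounces an escaped ampersand is platform-dependent, so test it on your own platform.

```java
// Hypothetical helper: escapes text before it is placed inside a <prompt> element.
class XmlEscapeSketch {

    // Replaces the characters that are unsafe in XML text content.
    // The ampersand is handled first so that the entities added afterwards
    // are not escaped a second time.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Builds one prompt block in the same shape as the listings in this article.
    static String promptBlock(String title) {
        return "<block><prompt>" + escapeXml(title) + ".<break/></prompt></block>";
    }

    public static void main(String[] args) {
        // prints <block><prompt>Pics &amp; News.<break/></prompt></block>
        System.out.println(promptBlock("Pics & News"));
    }
}
```

Escaping also protects against titles that contain angle brackets, which a replacement of the ampersand alone would not catch.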

The second point to note with this example is that you output the phone number along with the title of the company that was found. Many voice browsers will identify the phone number from the number format and correctly output the digits as a phone number, rather than say, for example, "nine hundred" or "six hundred and six".

You've seen many examples of the full VXML. Now look at the basic content output (organized into blocks) that will output each of the search results (see Listing 7).


Listing 7. Basic content output
                
<block><prompt>Found 14 hits for plumber in New York. 
Detailing the top 10 results.</prompt></block>
<block><prompt>State Plumbing Inspector.<break/> 
Phone number: (606) 862-1297</prompt></block>
<block><prompt>Hibbitts Brothers Wholesale.<break/> 
Phone number: (606) 864-2256</prompt></block>
<block><prompt>State Plumbing Inspector Paul Ray.<break/> 
Phone number: (606) 862-1297</prompt></block>
<block><prompt>Willard Neeley Plumbing and Htg.<break/> 
Phone number: (606) 864-6203</prompt></block>
<block><prompt>Vanhook Plumbing Htg and Cooling.<break/> 
Phone number: (606) 862-8228</prompt></block>
<block><prompt>Rooter Man of South Eastern Ky.<break/> 
Phone number: (606) 878-1339</prompt></block>
<block><prompt>Kettry Roaden Plumbing.<break/> 
Phone number: (606) 528-3396</prompt></block>
<block><prompt>Prestige Marble.<break/> 
Phone number: (606) 523-9186</prompt></block>
<block><prompt>Herb King Plumbing.<break/> 
Phone number: (606) 364-4534</prompt></block>
<block><prompt>AAA Plumbing.<break/> 
Phone number: (606) 528-0705</prompt></block>

The spoken version of this output will be less than perfect. For example, the "Htg" in "Willard Neeley Plumbing and Htg." will probably be spelled out, because the voice browser is unlikely to know that it should expand that value to "heating".
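One way to soften this problem is to expand known abbreviations yourself before you write the title into the prompt. The sketch below assumes a small, hand-maintained lookup table; the class name, the method name, and the table entries are hypothetical illustrations, not part of the sample code for this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: expands abbreviations that a voice browser would spell out.
class AbbreviationSketch {

    // Assumed lookup table; extend it with the abbreviations found in your results.
    private static final Map<String, String> EXPANSIONS = new LinkedHashMap<>();
    static {
        EXPANSIONS.put("Htg", "Heating");
    }

    // Expands whole words only, so a longer word that merely starts with
    // an abbreviation is left untouched.
    static String expand(String title) {
        String result = title;
        for (Map.Entry<String, String> entry : EXPANSIONS.entrySet()) {
            String wholeWord = "\\b" + Pattern.quote(entry.getKey()) + "\\b";
            result = result.replaceAll(wholeWord,
                    Matcher.quoteReplacement(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints Vanhook Plumbing Heating and Cooling
        System.out.println(expand("Vanhook Plumbing Htg and Cooling"));
    }
}
```

A table like this only covers the abbreviations you anticipate, so unusual names will still be read imperfectly.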


Summary

This article covered the use of the Yahoo search API to output VXML from searches made both to the Yahoo Web page database, and to the Yahoo Local business directory database. In the resulting solution, a user can call the service and ask the system to list the top ten plumbers in New York or to list the top ten sites that mention "java" and "apple". The key here is the freeform phrase input that enables you to support phrases, words, and unlimited combinations of those words and phrases together to make the search as precise as possible.

You also looked at a standardized method to generate the VXML that might be used in a standard VoiceXML browser selection system.

I hope this series gives you a solid foundation for developing your own VoiceXML applications.



Download

Description: Part 4 sample code
Name: x-voicexml4-search.zip
Size: 3KB
Download method: HTTP


Resources

Get products and technologies

  • Rome RSS/Atom syndication: Download these open source Java tools and libraries for parsing, generating and publishing RSS and Atom feeds.

  • Apache XML-RPC Client: Get a simplified interface for connecting to XML-RPC sources.

  • Voxeo: Find a wealth of information and a hosting solution for VoiceXML applications that provides access through traditional, VoIP and Skype.

  • IBM trial software: Build your next development project with trial software available for download directly from developerWorks.
