The W3C Multimodal Architecture, Part 3: A multimodal Web service

Multimodal authoring with WSDL, SOAP, XHTML, and JavaScript

Gerald McCobb (mccobb@us.ibm.com), Advisory Software Engineer, IBM, Software Group
Gerald McCobb has worked for IBM for over 15 years. He currently works in WebSphere Business Activity Monitor development. He was IBM's representative to the W3C Multimodal Interaction Working Group from 2002 until May 2007.

Summary:  Gerald McCobb concludes his introduction to the W3C Multimodal Architecture by showing you how to use the architecture as a generic template for developing a multimodal Web service.

Date:  12 Jun 2007
Level:  Intermediate

In the second article in this series I selected a cross-section of the W3C XML specification stack and described the role (or roles) each XML language could potentially play in a W3C Multimodal Architecture. I then presented a speculative example of how four of these languages -- SCXML, XHTML, REX, and XML Events -- could be combined to form a complete, distributed multimodal application. The application was entirely declarative in that I declared all event notification, event processing, and interaction processing as XML.

In this final article in the series, I'll show you how to implement the W3C Multimodal Architecture as a generic template to define a multimodal Web service. As you will see, in this type of implementation the Web service defines the data types and Web service operations for the Multimodal Architecture's life-cycle API.

Note that this article assumes you are familiar with the principles of Web service development and with the example application from Part 2. You can download the code for the example application at any time.

The online coffee service

For the sake of example I'll build on the multimodal online coffee maker introduced in Part 2 of this series. The multimodal coffee maker is a virtual appliance connected to an actual coffee maker. The actual coffee maker has its own Web server, enabling a user to remotely start or stop the coffee maker. In the previous example I used XML Events and REX remote events to communicate interaction between the client browser and the interaction manager running on the server. For this example, the online coffee service, I'll replace the application's XML Events and REX components with a Web service running on the server and SOAP calls to the Web service interface on the browser. This solution relies on the SOAP support in the latest Firefox browser.

I'll still need the interaction manager running on the server for this example. I no longer require a client- or server-side adapter, however. Recall that in the original coffee maker application I used adapters to map XML Events and REX to <data> events consumed by the application's SCXML. In this case, I replace the adapters with Web service operations. The operations perform the mapping instead, with the actual mapping implementation hidden from the client.


Data types for multimodal interaction

The core of a Web service is its WSDL (Web Services Description Language) document, which defines the Web service. The WSDL defines an interface by separating the abstract description of the messages and operations that can be performed by the service from a concrete specification of the data exchanged between the client and service. The WSDL then defines how the abstract operations are bound to the actual end points available to clients. The format of the data to be exchanged between client and service is defined in terms of an XML schema. The XML schema for the online coffee service is shown in Listing 1.


Listing 1. Data types for the online coffee service
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="urn:mycoffeemaker"          
           xmlns="urn:mycoffeemaker"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xs:annotation>
      <xs:documentation>
         Coffee Maker 1.0 WSDL data types schema
      </xs:documentation>
   </xs:annotation>

   <xs:complexType name="tData">
      <xs:sequence>
         <xs:element ref="context"/>
         <xs:choice>
            <xs:element ref="event"/>
            <xs:element ref="coffeeStatus"/>
            <xs:element ref="grammar"/>
            <xs:element ref="start"/>
            <xs:element ref="prepare"/>
            <xs:element ref="done"/>
         </xs:choice>
      </xs:sequence>
   </xs:complexType>
   
   <xs:complexType name="tDone">
      <xs:sequence>
         <xs:element ref="context"/>
         <xs:element ref="status"/>
         <xs:element ref="errorInfo"/>
         <xs:element ref="data" minOccurs="0"/>
      </xs:sequence>
   </xs:complexType>

...
   
   <xs:element name="data" type="tData"/>
   <xs:element name="done" type="tDone"/>
   <xs:element name="startResponse" type="tStartResponse"/>
   <xs:element name="start" type="tStart"/>
   <xs:element name="prepareResponse" type="tPrepareResponse"/>
   <xs:element name="prepare" type="tPrepare"/>
   <xs:element name="newContextResponse" type="tNewContextResponse"/>
   <xs:element name="newContextRequest" type="tNewContextRequest"/>
</xs:schema>
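
To make the tData type more concrete, the following sketch serializes one data message: a context element followed by exactly one of the elements allowed by the choice. Listing 1 elides the definitions of context and of the individual choice elements, so the sketch assumes they carry plain text content. The helper names buildDataXml and escapeXml are hypothetical and are not part of the sample code.

```javascript
// Element names allowed inside <data>, taken from the xs:choice in Listing 1.
const DATA_CHOICES = ["event", "coffeeStatus", "grammar", "start", "prepare", "done"];

// Escape the characters that are special in XML text content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Assumption: <context> and the chosen child element hold plain text.
function buildDataXml(contextId, name, value) {
  if (!DATA_CHOICES.includes(name)) {
    throw new Error("Not a valid <data> child: " + name);
  }
  return '<data xmlns="urn:mycoffeemaker">' +
    "<context>" + escapeXml(contextId) + "</context>" +
    "<" + name + ">" + escapeXml(value) + "</" + name + ">" +
    "</data>";
}

console.log(buildDataXml("ctx-1", "event", "on"));
```

Because the schema references global elements, every element in the message belongs to the urn:mycoffeemaker namespace, which is why a single default namespace declaration on the data element is enough here.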

Describing the Web service

Having defined the format of the XML exchanged between client and service, the next important part of a WSDL is the section that describes the input and output messages and the operations that define the Web service interface. The WSDL is a contract between the client and the Web service: it tells the client both how to call the service and what the service will return.

Here the operations described are implementations of the W3C Multimodal Architecture's life-cycle API. For example, the "DataOp" operation is an implementation of the Multimodal Architecture's <data> event. However, the Multimodal Architecture specifies <data> as an asynchronous notification, while the WSDL specifies "DataOp" as a call that sends and returns a data message.

Note that the data types defined in Listing 1 are imported at the beginning of the WSDL document shown in Listing 2, below. Note also that I've used WSDL Version 1.1 for this implementation because Mozilla's WSDL API currently supports only WSDL 1.1. The latest W3C WSDL recommendation is 2.0.


Listing 2. Initial WSDL for the online coffee service
<?xml version="1.0" encoding="utf-8"?> 
<definitions name="CoffeeMaker"
    targetNamespace="urn:mycoffeemaker" 
    xmlns:tns="urn:mycoffeemaker"
    xmlns:xsdl="urn:mycoffeemaker"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns="http://schemas.xmlsoap.org/wsdl/">

  <documentation>
    This document describes the 'My Coffee Maker' web service.
  </documentation>

  <import namespace="urn:mycoffeemaker"
          location="http://example.com/schemas/coffeemaker.xsd"/>

   <message name="Data">
      <part name="body" element="xsdl:data"/>
   </message>
   <message name="NewContextRequest">
      <part name="body" element="xsdl:newContextRequest"/>
   </message>
   <message name="NewContextResponse">
      <part name="body" element="xsdl:newContextResponse"/>
   </message>
   <message name="PrepareResponse">
      <part name="body" element="xsdl:prepareResponse"/>
   </message>
   <message name="StartResponse">
      <part name="body" element="xsdl:startResponse"/>
   </message>

   <portType name="CoffeeMakerPortType">
      <operation name="DataOp">
         <input message="tns:Data"/>
         <output message="tns:Data"/>
      </operation>
      <operation name="NewContextOp">
         <input message="tns:NewContextRequest"/>
         <output message="tns:NewContextResponse"/>
      </operation>
      <operation name="PrepareResponseOp">
         <input message="tns:PrepareResponse"/>
         <output message="tns:Data"/>
      </operation>
      <operation name="StartResponseOp">
         <input message="tns:StartResponse"/>
         <output message="tns:Data"/>
      </operation>
   </portType>
</definitions>

Listing 2 only describes the input and output messages and operations that define the Web service interface; the complete WSDL is shown below in Listing 3.


The complete Coffee Maker Web service

The messages and port types described in Listing 2 are imported at the beginning of the document shown in Listing 3. Below the import statement are the bindings and service definitions for the Web service. The service definition specifies that SOAP calls for this Web service should be made to the Web address http://example.com/mycoffeemaker.


Listing 3. The complete WSDL for the online coffee service
<?xml version="1.0" encoding="utf-8"?> 
<definitions name="CoffeeMaker"
    targetNamespace="urn:mycoffeemaker" 
    xmlns:tns="urn:mycoffeemaker"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:defs="urn:mycoffeemaker/definitions"
    xmlns="http://schemas.xmlsoap.org/wsdl/">

   <documentation>
     This document describes the 'My Coffee Maker' web service.
   </documentation>

   <import namespace="urn:mycoffeemaker/definitions"
      location="http://example.com/wsdl/11/coffeemaker.wsdl"/>

   <binding name="CoffeeMakerBinding"
      type="defs:CoffeeMakerPortType">
      <soap:binding style="rpc"
            transport="http://schemas.xmlsoap.org/soap/http"/>
      <operation name="DataOp">
         <soap:operation soapAction="DataOp" style="rpc"/>
         <input>
            <soap:body use="encoded" namespace="urn:mycoffeemaker"
               encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
         </input>
         <output>
            <soap:body use="encoded" namespace="urn:mycoffeemaker"
               encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
         </output>
      </operation>
      <operation name="NewContextOp">
         <soap:operation soapAction="NewContextOp" style="rpc"/>
         <input>
            <soap:body use="encoded" namespace="urn:mycoffeemaker"
               encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
         </input>
         <output>
            <soap:body use="encoded" namespace="urn:mycoffeemaker"
               encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
         </output>
      </operation>

...

   </binding>

   <service name="CoffeeMakerService">
      <port name="CoffeeMakerPortType" binding="tns:CoffeeMakerBinding">
         <soap:address location="http://example.com/mycoffeemaker"/>
      </port>
   </service>
</definitions>
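
The binding above uses the rpc style with SOAP encoding over HTTP. As a rough illustration of what that means on the wire, the sketch below builds the kind of SOAP 1.1 envelope and HTTP headers a DataOp call might produce. It is an approximation under stated assumptions: the parameter names context and event follow the client code in Listing 4, the xsi:type annotations that encoded messages often carry are omitted, and buildRpcEnvelope and buildHeaders are hypothetical helpers rather than part of any SOAP library.

```javascript
const SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";
const SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/";

// Escape the characters that are special in XML text content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// params: array of { name, value } pairs, serialized in the order given.
// In rpc style the body holds one wrapper element named after the operation,
// qualified with the namespace from the soap:body element of the binding.
function buildRpcEnvelope(operation, namespace, params) {
  const accessors = params
    .map((p) => "<" + p.name + ">" + escapeXml(p.value) + "</" + p.name + ">")
    .join("");
  return '<?xml version="1.0" encoding="utf-8"?>' +
    '<soap:Envelope xmlns:soap="' + SOAP_ENV + '"' +
    ' soap:encodingStyle="' + SOAP_ENC + '">' +
    "<soap:Body>" +
    "<m:" + operation + ' xmlns:m="' + namespace + '">' +
    accessors +
    "</m:" + operation + ">" +
    "</soap:Body></soap:Envelope>";
}

// HTTP headers for a SOAP 1.1 request; the SOAPAction value is quoted.
function buildHeaders(soapAction) {
  return {
    "Content-Type": "text/xml; charset=utf-8",
    "SOAPAction": '"' + soapAction + '"'
  };
}

const envelope = buildRpcEnvelope("DataOp", "urn:mycoffeemaker", [
  { name: "context", value: "ctx-1" },
  { name: "event", value: "on" }
]);
console.log(envelope);
console.log(buildHeaders("DataOp"));
```

The request would be posted to the address given in the service definition, http://example.com/mycoffeemaker.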


Client-side interaction

With the server side set up, your next step is to establish the client. Web browsers do not support the W3C life-cycle events as asynchronous notifications, but SOAP calls can be invoked asynchronously. This enables the user to continue to interact with the client while the client waits for a response. A client should not poll for a response (because it can't know how often it should poll). Instead, it can keep track of its own interaction state and make a new asynchronous call to the Web service whenever it receives a return message from the Web service. In this way, even without polling, the client is always waiting for a response from the Web service, which simulates an asynchronous notification.
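
The re-request pattern described above can be sketched in a few lines. This is an illustration only: callService is a hypothetical stand-in for an asynchronous SOAP call, and makeFakeService is a test double that replies immediately from a queue, whereas a real service would reply later over the network.

```javascript
// callService(request, callback) is a hypothetical stand-in for an
// asynchronous SOAP call to the Web service.
function listen(callService, onMessage) {
  function requestNext() {
    callService({ name: "event", value: "request" }, function (message) {
      onMessage(message);
      // Re-issue the request unless the service has signalled "done",
      // so that one call is always outstanding.
      if (message.name !== "done") {
        requestNext();
      }
    });
  }
  requestNext();
}

// Fake service for demonstration: replies with queued messages, one per call.
function makeFakeService(messages) {
  const queue = messages.slice();
  return function (request, callback) {
    callback(queue.shift());
  };
}

const received = [];
listen(
  makeFakeService([
    { name: "coffeeStatus", value: "Brewing" },
    { name: "event", value: "on" },
    { name: "done", value: "" }
  ]),
  function (message) { received.push(message.name); }
);
console.log(received.join(", ")); // coffeeStatus, event, done
```

The client never sets a timer. Each response is what triggers the next request, so the service decides when the client hears from it.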

Ajax

Using SOAP as your messaging framework is only one possibility when developing a Web service based on the Multimodal Architecture. Using XMLHttpRequest is another option that has the advantage of being supported by Web browsers other than Firefox (although SOAP libraries are available for Symbian and other mobile platforms). An application serving XML to a client through XMLHttpRequest could be described as a Web service, though in this case you would not use WSDL to define the Web interface. You should still use a data types schema (like the one shown in Listing 1) to define the structure of the XML returned to the client.
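
A minimal sketch of the XMLHttpRequest alternative follows. It assumes the server accepts an XML document in the request body and returns XML text. The function name postXml and its callback parameters are hypothetical, and the XMLHttpRequest object is passed in as an argument so that the function can also be exercised outside a browser.

```javascript
// xhr: an XMLHttpRequest-like object (in a browser, pass new XMLHttpRequest()).
function postXml(xhr, url, xmlBody, onSuccess, onError) {
  xhr.open("POST", url, true); // true = asynchronous
  xhr.setRequestHeader("Content-Type", "text/xml; charset=utf-8");
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) {
      return; // 4 = request finished
    }
    if (xhr.status >= 200 && xhr.status < 300) {
      onSuccess(xhr.responseText);
    } else {
      onError(new Error("HTTP " + xhr.status));
    }
  };
  xhr.send(xmlBody);
}
```

In a browser the call might look like postXml(new XMLHttpRequest(), url, xml, handleData, handleError), where url, xml, handleData, and handleError are placeholders supplied by the application.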

Table 1 shows the interaction states on the client as it interacts with the online coffee service. When the client receives a context response, start response, or data from the Web service, it processes the message and then makes a data request call to request more data. The data may contain "coffee status," or a <start> or <done> life-cycle event. Because the Multimodal Architecture allows <data> to contain anything, the online coffee service includes the <start>, <prepare>, and <done> events as <data> event content.


Table 1. Client interaction states for making SOAP calls
State             Transition                                       Next state
----------------  -----------------------------------------------  ----------------
Begin             Send: context request                            Context Response
Context Response  Receive: context response; Send: data request    Data Request
Data Request      Receive: data/start; Send: start response        Data Request
Data Request      Receive: data/event; Send: data request          Data Request
Data Request      Receive: data/coffee status; Send: data request  Data Request
Data Request      Receive: data/done                               End
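
The states in Table 1 can be expressed as a small transition table. The sketch below is one possible encoding, not the sample application's implementation. The state and message labels are copied from the table, and the names TRANSITIONS, begin, and step are hypothetical.

```javascript
// Transition table transcribed from Table 1.
// Key: current state + "|" + message received.
// Value: the message to send (or null) and the next state.
const TRANSITIONS = {
  "Context Response|context response": { send: "data request", next: "Data Request" },
  "Data Request|data/start": { send: "start response", next: "Data Request" },
  "Data Request|data/event": { send: "data request", next: "Data Request" },
  "Data Request|data/coffee status": { send: "data request", next: "Data Request" },
  "Data Request|data/done": { send: null, next: "End" }
};

// The Begin state has no incoming message: the client sends a context request.
function begin() {
  return { send: "context request", next: "Context Response" };
}

// Look up what to send and where to go after receiving a message.
function step(state, received) {
  const transition = TRANSITIONS[state + "|" + received];
  if (!transition) {
    throw new Error("Unexpected message '" + received + "' in state '" + state + "'");
  }
  return transition;
}

// Walk through one short session.
let state = begin().next;
for (const message of ["context response", "data/coffee status", "data/done"]) {
  const transition = step(state, message);
  console.log(state + " + " + message + " -> " + transition.next);
  state = transition.next;
}
```

Every transition out of Data Request except the last one leads back to Data Request, which is what keeps a call to the Web service outstanding until the done event arrives.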

SOAP and JavaScript

The client implementation uses JavaScript functions to set up, send, and process the Mozilla SOAP calls to the online coffee service. The callCoffeeMaker function creates a new SOAPCall object and asynchronously invokes the Web service method at transport URI http://example.com/mycoffeemaker. The specified call-back function handles the response from the Web service.

One of the call-back functions is dataResponse. The dataResponse function handles on and off events by checking the radio button associated with the coffee maker's current state. The "coffee status" data event is handled by writing the contents of the event to the text area whose ID is "box". If the event is <start> and its contentURL field is not empty, a document cookie is set with the context ID and the application moves to a new XHTML page by setting the href property of the location object to the new URI. After the browser has moved to the new page, the application can read the contextId value from the cookie and, if it exists, send a start response message to the interaction manager.


Listing 4. JavaScript functions for calling the Mozilla SOAP API
function data(name, value) {

   // create the parameters array
   var params = new Array();
   params[0] = new SOAPParameter(contextId, "context");
   params[1] = new SOAPParameter(value, name);
   callCoffeeMaker("DataOp", params, dataResponse);
}

function newContext() {

   var date = new Date();
   requestId = "coffeemaker"+date.getTime();

   // create the parameters array
   var params = new Array();
   params[0] = new SOAPParameter(requestId, "requestID");
   params[1] = new SOAPParameter("text/xhtml", "media");
   callCoffeeMaker("NewContextOp", params, contextResponse);
}

function callCoffeeMaker(method,params,callback)
{
   try {
      netscape.security.PrivilegeManager.enablePrivilege("UniversalBrowserRead");
   } 
   catch (e) {
      alert(e);
      return false;
   }
   var soapCall = new SOAPCall();
   soapCall.transportURI = "http://example.com/mycoffeemaker";
   soapCall.encode(0, method, "urn:mycoffeemaker", 0, null, params.length, params);
   soapCall.asyncInvoke(
      function (response, soapcall, error)
      {
         var r = handleSOAPResponse(response,soapcall,error);
         callback(r);
      }
   );
}

...

function dataResponse(result)
{
   if (result)
   {
      var params = result.getParameters(false,{});
      if (params[1].name == "event") {

         if (params[1].value == "on") {
            document.getElementById("on").checked = true;
         }
         else if (params[1].value == "off") {
            document.getElementById("off").checked = true;
         }
         data("event", "request");
      }
      else if (params[1].name == "start") {

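         var urls = params[1].element.getElementsByTagName("contentURL");
         // getElementsByTagName returns a node list; read the text of the first match
         var url = (urls.length > 0 && urls[0].firstChild) ? urls[0].firstChild.nodeValue : "";
         if (url) {
            document.cookie = "contextId="+contextId;
            location.href = url;
         }
         else {
            startResponse("Failure");
         }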
      }
      else if (params[1].name == "coffeeStatus") {
         document.getElementById("box").value = params[1].value;
         data("event", "request");
      }
      else if (params[1].name != "done") {
         data("event", "request");
      }
   }
}

var contextId; // Global context ID
var requestId; // Global request ID
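
Listing 4 stores the context ID in a cookie before the browser moves to the new page, but the code that reads it back is not shown. The following sketch is one way that lookup could be written. The function name getCookie is hypothetical, and the value is returned exactly as stored because Listing 4 does not encode it.

```javascript
// Return the value of the named cookie from a document.cookie-style string,
// or null if the cookie is not present.
function getCookie(cookieString, name) {
  const pairs = cookieString.split(";");
  for (const pair of pairs) {
    const index = pair.indexOf("=");
    if (index < 0) {
      continue; // not a name=value pair
    }
    const key = pair.slice(0, index).trim();
    if (key === name) {
      return pair.slice(index + 1).trim();
    }
  }
  return null;
}

console.log(getCookie("theme=dark; contextId=ctx-42", "contextId")); // ctx-42
```

On the page named by contentURL, the client could call getCookie(document.cookie, "contextId") and, when the result is not null, restore the global contextId and send the start response.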


XHTML

Listing 5 shows the XHTML document for the online coffee service. The document loads the JavaScript in Listing 4 and then calls the newContext function to initiate interaction with the Web service. When the user clicks on either the On or Off radio button, the data function is called to send the on or off data event to the Web service.


Listing 5. XHTML for the online coffee service
<?xml version="1.0" encoding="ISO-8859-1" ?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
     <title>My Coffee Maker</title>

  <style>
h2 { background: #ffffff;
color : #0000a0;
font-weight: bold;
font-size : 18pt;
font-family : Arial }
body { color : #000000;
background: #ffffff;
font-size : 14pt;
margin: 30px 0px 0px 30px;
font-family : Comic Sans MS }
textarea { width: 400px;
height: 300px;
border: 1px solid #0000a0;
padding: 5px;
margin: 2px 2px 2px 2px;
}
  </style>
  <script type="text/javascript" src="coffeemaker.js"/>
  <script type="text/javascript">
    newContext();
  </script>
  </head>
  <body>
    <h2>My Coffee Maker</h2>
    <p></p>
    <form action=".">
      <input name="OnOff" type="radio" id="on"
             value="On" onfocus="data('event','on')"/> On<br/>
      <input name="OnOff" type="radio" id="off"
             value="Off" onfocus="data('event','off')"/> Off<br/>
      <p></p><p></p>
      <p>Status Messages<br/>
      <textarea id="box" name="box"></textarea>
      </p>
    </form>
  </body>
</html>


It's coffee time!

Figure 1 shows the online coffee service running in Firefox. The screenshot shows the text area updated after Firefox has received and processed a SOAP message from the Web service containing the coffee maker's status.


Figure 1. The coffee service running in Firefox

Because the online coffee service implements the Multimodal Architecture's life-cycle API, its operations are available to other modality components besides XHTML. A voice component could send and receive <data> events, which in turn would contain the life-cycle <prepare> event for pre-fetching resources, and a grammar event for loading a new grammar. Because the voice modality would reside on a server it would support asynchronous notifications and would not need to send the <data> request event to the interaction manager.
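
As a purely illustrative sketch of how a voice component might react to the contents of a data event, the code below routes each message to a handler chosen by element name. Everything in it is an assumption made for the example: the createDispatcher helper, the shape of the message objects, the handler bodies, and the placeholder URLs.

```javascript
// Build a dispatch function from a table of handlers keyed by element name.
function createDispatcher(handlers) {
  return function dispatch(message) {
    const handler = handlers[message.name];
    if (!handler) {
      return false; // content this component does not understand is ignored
    }
    handler(message.value);
    return true;
  };
}

// Hypothetical voice-side handlers that only record what they would do.
const actions = [];
const dispatch = createDispatcher({
  prepare: function (url) { actions.push("prefetch " + url); },
  grammar: function (url) { actions.push("load grammar " + url); }
});

dispatch({ name: "prepare", value: "http://example.com/prompts" });
dispatch({ name: "grammar", value: "http://example.com/coffee.grxml" });
console.log(actions);
```

An XHTML component would supply a different handler table, for example one that handles event and coffeeStatus, while the Web service operations stay the same.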

The voice modality

Modality components in the Multimodal Architecture are loosely coupled, so they cannot know about each other. This leads to the question of how to initiate the SIP (Session Initiation Protocol) call to the voice modality component. One solution would be to use ActiveX components and Java applets (which are available to the Web browser) to manage the SIP call, as well as call the start listening and stop listening functions necessary for Push-to-Talk (PTT). If the SIP component were not accessible from the XHTML, the voice component would have to initiate the call. In this case the voice component would have to assume that the client SIP component was listening at the default SIP port number 5060, which is what is assumed in the sample application.


In conclusion

In this article I conclude my analysis of the W3C Multimodal Architecture by showing how multimodal interaction as specified by the architecture can be implemented today as a Web service. I presented a simple example of an online coffee service, including the WSDL that defines the data types, operations, and service bindings needed by the client. Finally, I showed how a client can call the Web service operations using the Mozilla SOAP API.

In the first article of this series I pointed out several challenges faced by implementations of the W3C Multimodal Architecture. A Web service implementation of the architecture overcomes these challenges by providing a well-defined interface and data exchange format for a distributed application. This is important because the W3C Architecture is best used to develop a distributed multimodal application. In addition, a Web service implementation allows for the definition of additional operations for querying the capabilities of multimodal components. It also provides SOAP and the SOAP client API as the multimodal protocol and JavaScript interface. Implementing a distributed multimodal application as a Web service turns out to be an excellent way to leverage the strengths of the W3C Multimodal Architecture and its generic life-cycle API, while also working around some of its weaknesses.



Download

Description                    Name            Size  Download method
Sample code from this article  webservice.zip  4KB   HTTP


