The concept of a Web service is very abstract. The current definition proposed by the W3C's Web Services Architecture working group roughly boils down to a software application that is accessible via Internet protocols and uses XML for messaging, description, and discovery. Depending on whom you ask, Web services are simply an incarnation, a generalization, or a reinvention of distributed computing. They are the Web's version of CORBA or DCE, but much more than a general means for remote procedure calls. Being so abstract, the concept of Web services extends beyond any single standard or technology. But even though the idea of Web services is very general, software developers have focused on three specific standards as the core of Web services infrastructure.
The first is the Web Services Description Language (WSDL), a format for specifying the operations that a Web service exposes, the concrete transport mechanisms through which the service exposes these operations, and where the service is located. The next is SOAP, a protocol that mandates the structure of a message in XML. The last of these is XML Schema, a format for describing XML datatypes. These give us, respectively, an interface definition language, a wire format, and a type system, all of which are platform neutral. Thus they make up the basic necessities for a platform independent distributed computing framework. However, these standards are more than platform independent. They are also transport independent, language independent and extensible. So we have a distributed computing framework which can potentially be implemented above the transport that is most suited to the task and can be extended with technologies such as new type systems that are suited best to a particular task. This generality does wonders for the shelf-life of the standards but has the unfortunate side effect of adding complexity to the system.
What this all means is that communicating with a Web service does not simply involve understanding WSDL, SOAP, and XML Schema. That is not enough information. Instead it involves, among other things, understanding particular extensions to WSDL, the specific transports over which to send SOAP envelopes, and the data encoding expected by the service. XML is often joked about as being a standard whose sole purpose is to create more standards, and the family of Web services standards seems to share that property. No doubt we will soon see higher level specifications making use of WSDL and SOAP in some specific manner. In Web services terminology, these higher level specifications would often be referred to as bindings. Moreover, each of the main specifications contains ambiguities that can stifle interoperability, so you also need to understand the specific interpretation of the specs that a given implementation has chosen. Each standard has extensibility points, optional sections, and ambiguities, all of which contribute to divergence in the functionality of implementations.
The standard most commonly associated with Web services is also the most general, most often misinterpreted, and thus the most problematic for interoperability. That standard is SOAP. The most commonly used version of SOAP is SOAP version 1.1, as defined by the W3C note that the W3C is refining into an official recommendation. SOAP is often regarded as a framework for RPC, or as a protocol used to access functionality over the Web. But SOAP itself is neither of those things exactly. SOAP is merely a message format. SOAP simply specifies that a message consists of an Envelope element that contains an optional Header element followed by a Body element. It defines some semantics for how to treat headers, such as whether processing of a header is optional. It also defines a format for sending errors within messages.
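As a rough illustration of that structure, the following sketch builds SOAP 1.1 envelopes with Python's standard xml.etree.ElementTree module. The helper names build_envelope and build_fault, the getOfficeTemperature payload, and the requestId header entry in the urn:example:tracking namespace are all hypothetical names chosen for this example, not part of any toolkit discussed here.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("soap", SOAP_ENV)


def build_envelope(payload, headers=()):
    """Wrap an arbitrary XML payload in an Envelope (optional Header, then Body)."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    if headers:
        header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
        for entry, must_understand in headers:
            # mustUnderstand="1" tells the recipient it may not ignore this entry.
            entry.set(f"{{{SOAP_ENV}}}mustUnderstand", "1" if must_understand else "0")
            header.append(entry)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    body.append(payload)
    return envelope


def build_fault(code, message):
    """Build the SOAP 1.1 Fault element used to report errors inside a Body."""
    fault = ET.Element(f"{{{SOAP_ENV}}}Fault")
    ET.SubElement(fault, "faultcode").text = code
    ET.SubElement(fault, "faultstring").text = message
    return fault


# Hypothetical request: one optional header entry and a made-up payload element.
request = build_envelope(
    ET.Element("getOfficeTemperature"),
    headers=[(ET.Element("{urn:example:tracking}requestId"), False)],
)
request_xml = ET.tostring(request, encoding="unicode")

# Hypothetical error reply: the Fault travels as the payload of the Body.
fault_reply = build_envelope(build_fault("soap:Client", "Unknown request"))
```

Note that nothing in this sketch says how the payload is encoded or how the resulting text reaches a server. The envelope structure is all that SOAP itself pins down.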
The main advantage, and arguably the entire purpose, of having a standard message format is that messages can be transmitted via any means and will still look the same to the recipient. So theoretically, a SOAP server should simply take an envelope as input and not worry about how that envelope got to it. Thus, a message can be transmitted via the transport protocol that suits the application. This, however, means that a claim to support SOAP as a means of communication can only refer to the message format, not to the transport protocol or the payload format. So a claim that a client API supports any SOAP service cannot be true in general if the code does anything more than form the SOAP Envelope around arbitrary XML given as the payload. Most such claims involve implicit assumptions about the transport and the data encoding. Commonly, only HTTP is supported as a transport.
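The transport independence described above can be sketched as a handler that accepts only the envelope and knows nothing about its origin. This is an illustration under stated assumptions rather than a real server: handle_envelope, via_http_body, via_mail_message, and the getOfficeTemperature payload are hypothetical names, and the mail message is a simplified stand-in for a non-HTTP transport.

```python
import email
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def handle_envelope(raw):
    """Transport-agnostic entry point: return the tag of the first Body child."""
    root = ET.fromstring(raw)
    if root.tag != f"{{{SOAP_ENV}}}Envelope":
        raise ValueError("not a SOAP 1.1 Envelope")
    body = root.find(f"{{{SOAP_ENV}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("Envelope has no Body payload")
    return body[0].tag


ENVELOPE = (
    f'<soap:Envelope xmlns:soap="{SOAP_ENV}">'
    "<soap:Body><getOfficeTemperature/></soap:Body>"
    "</soap:Envelope>"
)


def via_http_body(body_bytes):
    """Hypothetical adapter: the envelope arrived as an HTTP request body."""
    return handle_envelope(body_bytes)


def via_mail_message(message_text):
    """Hypothetical adapter: the same envelope arrived as the body of a mail message."""
    message = email.message_from_string(message_text)
    return handle_envelope(message.get_payload().encode("utf-8"))


mail_text = "Subject: temperature request\nContent-Type: text/xml\n\n" + ENVELOPE
http_result = via_http_body(ENVELOPE.encode("utf-8"))
mail_result = via_mail_message(mail_text)
```

Both adapters end up calling the same function with the same bytes. Only the adapters know which transport was involved, which is why "supports SOAP" says nothing about which adapters a given toolkit actually ships.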
Higher level SOAP APIs will often simplify the creation of SOAP Envelopes by automatically marshalling native datatypes in a particular language to XML, which is then used as the payload of the envelope. An API that does this is again making an implicit assumption about the way a SOAP service behaves. Since the SOAP standard says nothing about the format of the payload, how can these higher level APIs be assured that, when they convert an "int" into an XML element, the server on the other end will properly recognize it as such? In general, they cannot. Any API that does this is making an implicit assumption about the encoding of the data. To illustrate the arbitrary nature of SOAP payload data, and the need for the client and the service to agree on a data encoding, consider the following example.
Imagine a Web service that provides up-to-the-second reports of the temperature in my office. All it does is return a floating point number representing the temperature. So I return a message as shown in Listing 1. According to the SOAP standard, the message in Listing 1 is a proper SOAP message.
Listing 1: A valid SOAP Envelope with an unusually encoded payload.
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <officeTemperature precision="double" signbit="0"
        exponentfield="1029" significand="1.0203125" />
  </soap:Body>
</soap:Envelope>
However, note that the payload looks very strange for one that simply returns a floating point number. It turns out that I chose to build my hypothetical Web service with a hypothetical toolkit that encodes floating point values in that manner. It is simply a textual representation of the value according to the IEEE floating point standard. You could imagine such a serialization being useful for something like a scientific application where rounding errors are intolerable, so a very specific format for encoding floating point values is needed. Now imagine I attempt to write a client for my Web service using a common, generic Java SOAP toolkit. I try to use the nice high level features of the toolkit so that I can simply call the Web service and get back a plain old Java double primitive. But it turns out the toolkit won't work that way with my service. When it sees the officeTemperature element, it does not recognize it as a floating point value. The problem is that the Java client toolkit I chose supports only a particular kind of data encoding, not the one my Web service is using.
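To make the Listing 1 payload concrete, here is a sketch of what a client would have to do by hand to recover the number. It rests on an assumption about the hypothetical toolkit: that exponentfield holds the IEEE double-precision exponent with its usual bias of 1023 and that significand already includes the leading 1. The function name decode_temperature is likewise made up for this example.

```python
import math
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
DOUBLE_EXPONENT_BIAS = 1023  # assumed: standard bias for IEEE double precision

LISTING_1 = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <officeTemperature precision="double" signbit="0"
        exponentfield="1029" significand="1.0203125" />
  </soap:Body>
</soap:Envelope>"""


def decode_temperature(envelope_xml):
    """Rebuild a float from the sign, exponent, and significand attributes."""
    root = ET.fromstring(envelope_xml)
    element = root.find(f"{{{SOAP_ENV}}}Body/officeTemperature")
    if element is None or element.get("precision") != "double":
        raise ValueError("payload is not in the expected encoding")
    sign = -1.0 if element.get("signbit") == "1" else 1.0
    exponent = int(element.get("exponentfield")) - DOUBLE_EXPONENT_BIAS
    significand = float(element.get("significand"))
    # value = sign * significand * 2**exponent
    return sign * math.ldexp(significand, exponent)


temperature = decode_temperature(LISTING_1)  # 1.0203125 * 2**6, about 65.3
```

A generic toolkit has no way to know these rules, which is exactly why it fails on this service even though the message is valid SOAP.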
When the requirements for the SOAP specification were drawn up, many people saw the need for standard rules for encoding data within SOAP envelopes. The idea was that simple services and clients would agree to follow a set of rules for encoding data as XML, which would make writing services and clients easier. The result was SOAP Encoding, the infamous set of rules often referred to as "Section 5" encoding after its location in the specification. SOAP Encoding is only a suggestion in the specification. Thus, when a product claims SOAP compatibility, it is not explicitly claiming SOAP Encoding compatibility. This is why these higher level APIs cannot be correct in general when they do automatic marshalling of datatypes. Just as the extensibility of SOAP with regard to transports often leads to implicit assumptions that create compatibility problems, so does its extensibility with regard to payload data encoding. So to know whether the client API will generate SOAP envelopes that a specific Web service will understand, you must know explicitly which data encoding the Web service expects and whether the client API supports that encoding. Listing 2 shows one of many ways that my service could have encoded the response payload using SOAP Encoding.
Listing 2: A valid SOAP Envelope with a payload encoded with SOAP "Section 5" Encoding.
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <officeTemperature xsi:type="xsd:double">65.3</officeTemperature>
  </soap:Body>
</soap:Envelope>
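By contrast, a client that expects typed payloads like the one in Listing 2 can unmarshal the value mechanically. The sketch below is a simplified illustration, not a full SOAP Encoding implementation: the function names and the small CONVERTERS table are assumptions for this example, and the prefix lookup ignores the possibility of a prefix being redeclared deeper in the document.

```python
import io
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
XSD = "http://www.w3.org/2001/XMLSchema"
XSI = "http://www.w3.org/2001/XMLSchema-instance"

LISTING_2 = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <officeTemperature xsi:type="xsd:double">65.3</officeTemperature>
  </soap:Body>
</soap:Envelope>"""

# Hypothetical, deliberately tiny mapping from XML Schema types to Python types.
CONVERTERS = {"double": float, "float": float, "int": int, "string": str}


def namespace_prefixes(xml_text):
    """Collect prefix-to-URI declarations (simplified: later ones overwrite earlier)."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    return dict(ns for _event, ns in ET.iterparse(source, events=("start-ns",)))


def decode_typed_payload(envelope_xml):
    """Convert the first Body child to a native value based on its xsi:type."""
    prefixes = namespace_prefixes(envelope_xml)
    root = ET.fromstring(envelope_xml)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("Envelope has no Body payload")
    element = body[0]
    type_name = element.get(f"{{{XSI}}}type")
    if type_name is None:
        raise ValueError("payload carries no xsi:type attribute")
    prefix, _, local = type_name.partition(":")
    if prefixes.get(prefix) != XSD or local not in CONVERTERS:
        raise ValueError(f"unsupported type: {type_name}")
    return CONVERTERS[local](element.text)


temperature = decode_typed_payload(LISTING_2)
```

Feeding the Listing 1 envelope to decode_typed_payload raises ValueError, because that payload has no xsi:type attribute. That failure is a small-scale version of the interoperability problem described above.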
The generality of SOAP means that not all systems that make use of SOAP can communicate easily. Since SOAP is transport independent and does not enforce a data encoding format, you must know more about a service than simply that it supports SOAP. The most important thing to understand is that SOAP is simply a message format. In the next article in this series, I will explain another commonly misunderstood aspect of SOAP that is responsible for many interoperability issues, namely, data encoding. The above example shows that you must be aware of the data encoding to communicate properly with a Web service. But in many cases that is not enough. I will attempt to clarify the purpose of different data encodings and dig deeper into the specific issues surrounding the particularly confusing SOAP Section 5 Encoding. I will also begin to discuss WSDL and its relation to SOAP and other Web services standards.
- Read Advancing SOAP interoperability: A look at community SOAP interoperability efforts for more information on activity around Web services interoperability.
- Look at Web services interoperability between the WebSphere and .Net platforms for a practical discussion of interoperability.
- Break through the misconceptions and hype around SOAP by reading Myths and misunderstandings surrounding SOAP.
- Clear up questions about SOAP from the source, the SOAP version 1.1 specification.
- For a peek at what's coming in the new specs, look at the SOAP version 1.2 part 1 last call working draft and the SOAP version 1.2 part 2 adjuncts last call working draft.
Jordi Albornoz is a Software Engineer in IBM's Advanced Internet Technology group. He has worked on the Sash scripting environment as well as SashXB and is currently developing an advanced Web services API for JavaScript. He is a graduate of Carnegie Mellon University's Computer Science program and an alumnus of IBM's Extreme Blue Program. You can contact Jordi at jordi@us.ibm.com.





