Level: Introductory
Jordi Albornoz (jordi@us.ibm.com), Software Engineer, Advanced Internet Technology, IBM
01 Oct 2002
Web services are defined by a myriad of standards. Each standard is general enough to be independent of the other standards but specific enough to address only a small piece of the Web services puzzle. The interaction between SOAP, WSDL, XML Schema, HTTP, and others can become very complicated. This, along with differing interpretations of the standards and of how they relate, causes interoperability problems. Often, software packages claim to support "Web services" or a specific standard such as SOAP or WSDL, so developers are likely to assume that two products proudly displaying such an acronym will communicate easily with each other. However, the standards were not written such that a simple acronym on the product box will ensure or even suggest compatibility. This series will guide you through the prevailing Web services standards, describing their specific use and explaining what it really means to support each one, how they interact, and most importantly, where compatibility problems are likely to occur. The articles will also discuss the relevant changes to come, as many of these standards are being revised. In this first article in the series, Jordi Albornoz introduces the complex interaction of standards and describes some of the issues around SOAP.
Introduction to the issues
The concept of a Web service is very abstract. The current definition proposed by the W3C's Web Services Architecture working group roughly boils down to a software application accessible via Internet protocols using XML for messaging, description, and discovery. Depending on whom you ask, Web services are simply an incarnation, a generalization, or a reinvention of distributed computing. They are the Web's version of CORBA or DCE, but much more than a general means for remote procedure calls. Being so abstract, the concept of Web services extends beyond any single standard or technology.
But even though the idea of Web services is very general, software developers have
focused on three specific standards as the core of Web services infrastructure. The first is
the Web Services Description Language (WSDL), a format for specifying the operations
that a Web service exposes, the concrete transport mechanisms through which the
service exposes these operations, and where the service is located. The next is SOAP, a
protocol that mandates the structure of a message in XML. The last of these is XML
Schema, a format for describing XML datatypes. These give us, respectively, an
interface definition language, a wire format, and a type system, all of which are platform
neutral. Thus they make up the basic necessities for a platform independent distributed
computing framework. However, these standards are more than platform independent.
They are also transport independent, language independent and extensible. So we have a
distributed computing framework which can potentially be implemented above the
transport that is most suited to the task and can be extended with technologies such as
new type systems that are suited best to a particular task. This generality does wonders
for the shelf-life of the standards but has the unfortunate side effect of adding complexity
to the system.
What this all means is that communicating with a Web service does not simply
involve understanding WSDL, SOAP, and XML Schema. That is not enough
information. Instead it involves, among other things, understanding particular extensions
to WSDL, the specific transports over which to send SOAP envelopes, and the data
encoding expected by the service. XML is often joked about as being a standard
whose sole purpose is to create more standards, and the family of Web services standards
seems to share that property. No doubt we will soon see higher level specifications
making use of WSDL and SOAP in some specific manner. In Web services terminology,
these higher level specifications would often be referred to as bindings. Moreover, each
of the main specifications contains ambiguities that can stifle interoperability, thus
requiring understanding of the specific interpretations of the specs. Each standard has
extensibility points, optional sections, and ambiguities which contribute to divergence in
functionality of implementations.
The implications of SOAP
The standard most commonly associated with Web services is also the most general,
most often misinterpreted, and thus the most problematic for interoperability. That
standard is SOAP. The most commonly used version of SOAP is SOAP version 1.1 as
defined by the W3C note which the W3C is refining into an official recommendation.
SOAP is often regarded as a framework for RPC, a protocol used to access functionality
over the web. But SOAP itself is none of those things exactly. SOAP is merely a
message format. SOAP simply specifies that messages will consist of an Envelope
element that contains an optional Header element followed by a Body element. It defines
some semantics for how to treat headers such as whether processing of the header is
optional. It also defines a format for sending errors within messages. The main
advantage, and arguably the entire purpose, of having a standard message format is that
messages can be transmitted via any means and still look the same to the
recipient. So theoretically, a SOAP server should simply take an envelope as input and
not worry about how that envelope got to it. Thus, a message can be transmitted via the
transport protocol that suits the application. This, however, means that a claim to support
SOAP as a means of communication can only refer to message format, not to transport
protocol or payload format. So claims stating that a client API supports any SOAP
service are never true in general if the code does anything other than form the SOAP
Envelope given arbitrary XML as the payload. Most such claims involve implicit
assumptions as to the transport and the data encoding. Commonly only HTTP will be
supported as a transport.
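As a minimal sketch of what a bare claim of "SOAP support" strictly covers, the Python below does nothing except wrap an arbitrary XML payload in a SOAP 1.1 Envelope. The function name wrap_in_envelope and the sample payload are illustrative assumptions, not part of any real toolkit.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 Envelope, as used in the listings in this article.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def wrap_in_envelope(payload_xml):
    """Wrap an arbitrary XML payload in a SOAP 1.1 Envelope (hypothetical helper).

    The payload is copied into the Body untouched: no transport is chosen
    and no data encoding is applied or assumed.
    """
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    body.append(ET.fromstring(payload_xml))
    return ET.tostring(envelope, encoding="unicode")


# Any well-formed XML will do as the payload; this one is only an example.
message = wrap_in_envelope("<officeTemperature>65.3</officeTemperature>")
print(message)
```

The resulting text could be carried over HTTP, SMTP, or any other transport. That choice, like the meaning of the payload, is something the client and the service must agree on separately.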
Marshalling to XML
Higher level SOAP APIs will often simplify the creation of SOAP Envelopes by
automatically marshalling from native datatypes in a particular language to XML which
is then used for the payload of the envelope. An API which does this is again making an
implicit assumption about the way a SOAP service behaves. Since the SOAP standard
says nothing about the format of the payload, how can these higher level APIs be assured
that, when they convert an "int" into an XML element, the server on the other end will
properly recognize it as such? In general they cannot. Any API which does this is making
an implicit assumption about the encoding of the data. To illustrate the
arbitrary nature of SOAP payload data, and the need for the client and the service to
agree on a data encoding, consider the following example.
Imagine a Web service that provides up-to-the-second reports of the temperature in
my office. All it does is return a floating point number representing the temperature. So
I return a message as shown in Listing 1. According to the SOAP standard, the message in Listing 1 is a proper SOAP message.
Listing 1: A valid SOAP Envelope with an unusually encoded payload.
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <officeTemperature precision="double" signbit="0"
        exponentfield="1029" significand="1.0203125" />
  </soap:Body>
</soap:Envelope>
However, the payload looks very strange for one that simply returns a floating point number. I chose to build my hypothetical Web service with a hypothetical toolkit that encodes floating point values in that manner. It is simply a textual representation of the sign, exponent, and significand of the value according to the IEEE 754 floating point standard (here, 1.0203125 x 2^(1029 - 1023) = 65.3). You could imagine such a serialization to be useful for
something like a scientific application where rounding errors are intolerable, so a very
specific format for encoding floating point values is needed. Now imagine I attempt to
write a client for my Web service using a common, generic Java SOAP toolkit. I try to
use the nice high level features of the toolkit to allow me to simply call the Web service
and get back a plain old Java double. But it turns out the toolkit
won't work in that way with my service. When it sees the officeTemperature
element, it does not recognize it as a floating point value. The problem is that the Java
client toolkit I chose only supports a particular kind of data encoding, not the one my Web
service is using.
When the requirements for the SOAP specification were drawn up, many people saw
the need for standard rules for encoding data within SOAP envelopes. The idea is that
simple services and clients would simply agree to follow a set of rules for encoding data
to XML in order to make writing services and clients easier. So the result was SOAP
Encoding, the infamous set of rules often referred to as "Section 5" encoding after its
location in the specification. SOAP Encoding is only an optional part of the specification.
Thus, a product that claims SOAP compatibility is not explicitly claiming SOAP
Encoding compatibility. This is why these higher level APIs cannot be correct in general
when they do automatic marshalling of datatypes. Just as the extensibility of SOAP with
regard to transports often causes implicit assumptions which create compatibility
problems, so does this extensibility with regard to payload data encoding. So to know
whether the client API will generate SOAP envelopes that a specific Web service will
understand, you must be explicitly aware of the data encoding that the Web service
expects and whether that encoding is supported by the client API. Listing 2 shows one of
many ways that my service could have encoded the response payload using SOAP Encoding.
Listing 2: A valid SOAP Envelope with a payload encoded with SOAP "Section 5" Encoding.
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <officeTemperature xsi:type="xsd:double">65.3</officeTemperature>
  </soap:Body>
</soap:Envelope>
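For comparison, here is a rough Python sketch of how a client that does assume this style of typed payload might read it. The function name decode_typed_double is an illustrative assumption, and the sketch handles only the xsd:double case. It is not a full implementation of the SOAP Encoding rules.

```python
import io
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
XSD = "http://www.w3.org/2001/XMLSchema"
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# The envelope from Listing 2.
LISTING_2 = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <officeTemperature xsi:type="xsd:double">65.3</officeTemperature>
  </soap:Body>
</soap:Envelope>"""


def decode_typed_double(envelope_xml):
    """Read the first Body child as an xsd:double, or raise ValueError."""
    # Collect prefix declarations so the QName inside xsi:type can be resolved.
    prefixes = {}
    root = None
    events = ET.iterparse(io.StringIO(envelope_xml), events=("start-ns", "start"))
    for event, data in events:
        if event == "start-ns":
            prefix, uri = data
            prefixes[prefix] = uri
        elif root is None:
            root = data  # first "start" event is the Envelope element

    payload = root.find(f"{{{SOAP_ENV}}}Body")[0]
    type_name = payload.get(f"{{{XSI}}}type")
    if type_name is None:
        raise ValueError("payload carries no xsi:type attribute")

    prefix, _, local_name = type_name.partition(":")
    if prefixes.get(prefix) != XSD or local_name != "double":
        raise ValueError(f"unsupported type: {type_name}")
    return float(payload.text)


temperature = decode_typed_double(LISTING_2)
```

A payload without the xsi:type attribute, such as the one in Listing 1, makes this decoder raise an error. That is the sense in which the client and the service must share an encoding.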
This is only the beginning
The generality of SOAP means that not all systems that make use of SOAP
can communicate easily. Since SOAP is transport independent and does not
enforce a data encoding format, one must know more about a service
than simply that it supports SOAP. The most important thing to understand
is that SOAP is simply a message format. In the next article in this series,
I will explain another commonly misunderstood aspect of SOAP which is responsible
for many interoperability issues, namely, data encoding. The above example
shows that one must be aware of the data encoding to properly communicate with a
Web service. But that is not enough in many cases. I will attempt to clarify the
purpose of different data encodings and dig deeper into the specific issues surrounding
the particularly confusing SOAP Section 5 Encoding. I will also begin to discuss WSDL
and its relation to SOAP and other Web services standards.
About the author
Jordi Albornoz is a Software Engineer in IBM's Advanced Internet Technology group. He has worked on the Sash scripting environment as well as SashXB and is currently developing an advanced Web services API for JavaScript. He is a graduate of Carnegie Mellon University's Computer Science program and an alumnus of IBM's Extreme Blue Program. You can contact Jordi at jordi@us.ibm.com.