Finding your way through Web service standards, Part 2: More intricacies of SOAP and WSDL

Issues in encoding SOAP data

Jordi Albornoz (jordi@us.ibm.com), Software Engineer, Advanced Internet Technology, IBM
Jordi Albornoz is a Software Engineer in IBM's Advanced Internet Technology group. He has worked on the Sash scripting environment as well as SashXB and is currently developing an advanced Web services API for JavaScript. He is a graduate of Carnegie Mellon University's Computer Science program and an alumnus of IBM's Extreme Blue Program. You can contact Jordi at jordi@us.ibm.com.

Summary:  In the previous article, Jordi explained how each of the Web services standards is designed to be extremely general in the interest of extensibility (see Resources). Yet each standard solves only a very specific problem in creating a distributed computing framework. Thus, knowing that a product supports SOAP is not enough to know that it will interoperate with another product making a similar claim. More details must be known, such as the transport that should be used to exchange SOAP envelopes and the data encoding for the payload of the envelope. SOAP is at its root merely a message format. It is extensible with respect to the transport used to transmit the message and the format of the data contained within each message. The data encoding extensibility is understandably important for the exchange of SOAP messages, so suggested rules for encoding data were created. In this article, Jordi explains how data encodings relate to SOAP and other standards.

Date:  01 Oct 2002
Level:  Introductory
Web services are defined by a myriad of standards. Each standard is general enough to be independent from the other standards, but specific enough to address only a small piece of the Web services puzzle. The interaction between SOAP, WSDL, XML Schema, HTTP, and so on can become very complicated. This, along with differing interpretations of the standards and how they relate, causes interoperability problems. Often, software packages claim to support "Web services" or a specific standard such as SOAP or WSDL, so developers are likely to assume that two products proudly displaying such an acronym will communicate easily with each other. However, the standards were not written in such a way that a simple acronym on the product box will ensure or even suggest compatibility. In this second article in the series, I continue the discussion of SOAP, using the example of a hypothetical service that returns the temperature in my office. Listing 1 shows how SOAP Section 5 Encoding can be used to encode the response envelope for my service.


Listing 1. Valid SOAP Envelope with a payload encoded with SOAP Section 5 Encoding
		
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <soap:Body>
        <officeTemperature xsi:type="xsd:double">65.3</officeTemperature>
    </soap:Body>
</soap:Envelope>

SOAP Section 5 Encoding is only one of the encodings that are commonly in use. The other encoding that is most common is not truly an encoding by SOAP terms. It is often known as the literal encoding. The literal encoding simply means that the payload of a SOAP message is defined completely by a specific schema, often an XML Schema. Much confusion exists in this area, since SOAP Section 5 is often interpreted as making use of XML Schema as well. The SOAP Section 5 encoding spec is considered ambiguous in its requirement of XML Schema datatypes. To add to the confusion, SOAP Section 5 Encoding is not itself fully expressible by an XML Schema document. To clear up this confusion, I will extract the essence of the concept of encodings.

What does your data look like?

An encoding in the context of SOAP is simply a set of rules that specify how to translate a specific data model to an XML Infoset. The idea is that SOAP Envelopes are an XML Infoset, since they are in an XML format. But not all data is stored in a hierarchical data structure like an XML Infoset. For such data, one needs a translation mechanism, an encoding: simply an agreed-upon way to serialize and de-serialize the data.

One example of a data structure that does not fit directly into an XML Infoset is a graph, a set of nodes that are arbitrarily connected to each other via edges. The XML Infoset is very similar to a tree structure and it is easy to construct a graph that is not a tree (just connect the nodes in such a way that there is no obvious root node). SOAP Section 5 Encoding gives rules for serializing a graph into XML. It does this by defining XML constructs such as an id attribute and a ref attribute (href in SOAP 1.1) so that the recipient can create the same graph on the other end. A graph was chosen as the data model for SOAP Section 5 Encoding because a tree structure is often too restrictive for doing RPC and the encoding was designed to facilitate RPC.
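The id/ref idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names graph and node, the Node class, and the n1, n2 identifiers are made up for the example and are not the actual SOAP Section 5 syntax. The sketch writes each node once, gives it an id, and expresses every edge as a ref to that id, so that even a cycle survives the round trip.

```python
import xml.etree.ElementTree as ET


class Node:
    """A graph node with a string value and labeled edges to other nodes."""

    def __init__(self, value):
        self.value = value
        self.edges = {}  # edge label (a valid XML name) -> Node


def serialize(start):
    """Write every node reachable from `start` exactly once, linked by id/ref."""
    ids = {}     # id(node) -> "n1", "n2", ...
    order = []   # nodes in the order they were first visited
    stack = [start]
    while stack:
        node = stack.pop()
        if id(node) in ids:
            continue  # already assigned an id, so a cycle cannot loop forever
        ids[id(node)] = "n%d" % (len(ids) + 1)
        order.append(node)
        stack.extend(node.edges.values())

    root = ET.Element("graph")  # hypothetical wrapper element
    for node in order:
        elem = ET.SubElement(root, "node",
                             {"id": ids[id(node)], "value": node.value})
        for label, target in node.edges.items():
            # An edge is an empty element that points at the target's id.
            ET.SubElement(elem, label, {"ref": ids[id(target)]})
    return ET.tostring(root, encoding="unicode")


def deserialize(text):
    """Rebuild the graph: create all nodes first, then connect the edges."""
    root = ET.fromstring(text)
    nodes = {e.get("id"): Node(e.get("value")) for e in root.findall("node")}
    for e in root.findall("node"):
        for edge in e:
            nodes[e.get("id")].edges[edge.tag] = nodes[edge.get("ref")]
    # The first serialized node is the one serialization started from.
    return nodes[root.find("node").get("id")]


# Two nodes that point at each other: a graph with no obvious root.
a, b = Node("a"), Node("b")
a.edges["next"] = b
b.edges["next"] = a

copy = deserialize(serialize(a))
```

Note that nothing here says what kind of data a node's value is; the sketch deals only with nodes and edges.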

Notice I said nothing about XML Schema datatypes when discussing the SOAP Encoding data model. That is because nothing in the encoding forces the use of datatypes. It only deals with the abstract concepts of nodes and edges. This means that SOAP Encoding itself is so general that code claiming to marshal data using SOAP Encoding cannot also claim to marshal that data into the native types of a language unless it makes a further, implicit assumption about the type system in use.

For example, notice that in Listing 1, the encoded payload, which follows SOAP Encoding rules, makes use of XML Schema to annotate the element as a double. What if I had chosen a type system other than XML Schema? For instance, I could use the XML-Data Reduced (XDR) specification's set of datatypes to annotate the node in Listing 1. If I had, it would look as shown in Listing 2.


Listing 2. Valid SOAP Envelope with a payload encoded with SOAP Section 5 Encoding annotated with an XDR datatype
		
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
        xmlns:dt="urn:schemas-microsoft-com:datatypes"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <soap:Body>
        <officeTemperature xsi:type="dt:r8">65.3</officeTemperature>
    </soap:Body>
</soap:Envelope>

A toolkit that expects XML Schema datatypes might be unable to handle this SOAP Envelope even though it is encoded properly according to SOAP Encoding. I could also have chosen to simply omit the xsi:type attribute and it would still be a valid encoded form. So even if your high-level API creates a SOAP Envelope for you using SOAP Section 5 Encoding and the Web service you are interested in expects data encoded via SOAP Encoding, it still might not work. This is because the Web service most likely expects more than simply SOAP Encoding for its data. It is also failing to tell you that it expects each node in the encoded graph you send it to be instrumented with a type specification. It needs to know the type of the data in the node. Such a requirement is not addressed by SOAP or SOAP Encoding, so most Web services require the instrumentation of types with XML Schema datatypes. This is simply because the XML Schema datatypes are a widely understood set of datatypes designed to be used with XML. Often Web services will require that the payload data use only the XML Schema built-in types and not any types derived from these.
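To make the failure concrete, here is a minimal sketch of how a receiving toolkit might decode the payloads of Listing 1 and Listing 2. It is an assumption about how such a toolkit could behave, not a description of any real product: the decode_payload function and the XSD_DECODERS table are hypothetical, and the table covers only a handful of XML Schema built-in types.

```python
import io
import xml.etree.ElementTree as ET

SOAP_BODY = "{http://schemas.xmlsoap.org/soap/envelope/}Body"
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical toolkit knowledge: a few XML Schema built-ins only.
XSD_DECODERS = {"double": float, "int": int, "string": str}

LISTING_1 = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <soap:Body>
        <officeTemperature xsi:type="xsd:double">65.3</officeTemperature>
    </soap:Body>
</soap:Envelope>"""

LISTING_2 = LISTING_1.replace(
    'xmlns:xsd="http://www.w3.org/2001/XMLSchema"',
    'xmlns:dt="urn:schemas-microsoft-com:datatypes"',
).replace("xsd:double", "dt:r8")


def decode_payload(envelope_text):
    """Return (element name, native value) for the first Body child."""
    # Collect prefix -> namespace declarations so the QName inside the
    # xsi:type attribute value can be resolved. Prefix scoping is ignored,
    # which is a simplification that is good enough for these listings.
    prefixes = {}
    parser = ET.iterparse(io.BytesIO(envelope_text.encode("utf-8")),
                          events=("start-ns",))
    for _event, (prefix, uri) in parser:
        prefixes[prefix] = uri

    payload = parser.root.find(SOAP_BODY)[0]
    type_attr = payload.get(XSI_TYPE)
    if type_attr is None:
        raise ValueError("no xsi:type annotation; cannot pick a native type")

    prefix, _, local = type_attr.partition(":")
    namespace = prefixes.get(prefix)
    if namespace != XSD_NS or local not in XSD_DECODERS:
        raise ValueError("unsupported type %r in namespace %r"
                         % (local, namespace))
    return payload.tag, XSD_DECODERS[local](payload.text)


name, value = decode_payload(LISTING_1)
```

Calling decode_payload(LISTING_2) raises ValueError in this sketch, even though both envelopes follow the same encoding rules. The only difference is the type system, which is exactly the assumption the service never stated.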

This means that if you want to know if a particular client API will generate SOAP Envelopes that are compatible with a particular Web service, you need to know whether the API and the service support a particular transport, a particular encoding style, and, in the case of SOAP Section 5 Encoding, the particular type system that might be required. This type-system ambiguity in SOAP Section 5 Encoding is one reason that many developers are leaning towards using literal encoding for their Web services.

Skipping SOAP Encoding

Literal encoding simply means that instead of having the Web service and the client agree to follow a set of rules for serializing the data, they agree on the exact format of the data. Often the distinction is referred to as reader-makes-right versus writer-makes-right. Literal encoding could simply involve some written prose documentation by the service author that states exactly what the message payload will look like. However, if the format is described using an XML Schema, a toolkit can read the schema and use it to provide automatic marshalling of the data from the native language structures into XML. So, in this case, all the toolkit has to understand is the XML Schema specification, instead of the combination of the particular encoding rules and the chosen type system. The only issue left with using literal encoding is how the toolkit finds the particular service's XML Schema. That issue is solved by WSDL.
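As a rough illustration of agreeing on the exact format, the sketch below marshals a native Python dictionary to XML and back from a single shared format description. The description is a plain Python tuple standing in for an XML Schema, purely as an assumption to keep the example short; a real toolkit would read the service's schema instead. The names READING_FORMAT, officeTemperatureReading, temperature, and room are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the service's XML Schema: the root element,
# then each child element in order with the native type it carries.
READING_FORMAT = ("officeTemperatureReading",
                  [("temperature", float), ("room", str)])


def marshal(fmt, values):
    """Turn a dict of native values into XML that follows the agreed format."""
    root_name, fields = fmt
    root = ET.Element(root_name)
    for field_name, native_type in fields:
        child = ET.SubElement(root, field_name)
        child.text = str(native_type(values[field_name]))
    return ET.tostring(root, encoding="unicode")


def unmarshal(fmt, text):
    """Check the XML against the agreed format and rebuild native values."""
    root_name, fields = fmt
    root = ET.fromstring(text)
    expected = [field_name for field_name, _ in fields]
    if root.tag != root_name or [child.tag for child in root] != expected:
        raise ValueError("payload does not match the agreed format")
    return {field_name: native_type(child.text)
            for (field_name, native_type), child in zip(fields, root)}


payload = marshal(READING_FORMAT, {"temperature": 65.3, "room": "B-204"})
reading = unmarshal(READING_FORMAT, payload)
```

No xsi:type annotation appears in the payload: the type of each element comes from the agreed format, so the message itself does not have to carry it.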

No machine-understandable standard exists for describing data models for use with SOAP Section 5 Encoding. Developers are therefore leaning towards literal encoding with XML Schema, because developer tools often provide features such as syntax assistance when working with XML Schema, as well as validation of data modeled with XML Schema. This may change soon, however, as the Web Services Description working group is considering creating a language to describe SOAP Section 5 data models as part of the WSDL binding for SOAP in WSDL version 1.2.


SOAP wrap-up

I hope the relationship between SOAP, SOAP Encoding, and XML Schema is much clearer now. I hope to have cut through much of the confusing hype which leads developers to believe that SOAP is the solution for everything they need. SOAP is a very powerful and important technology but it has a very specific purpose and scope. The most important concept to understand is that SOAP is simply a message format. To be able to communicate with a Web service, you need to know more than just that it supports SOAP. Most SOAP APIs for both consuming and creating services will either explicitly or implicitly specify the data encoding they expect and the transports they support. Properly evaluate whether the technologies for your specific task are compatible.


Where WSDL fits

Some say that SOAP is all you need to create and consume Web services. Because SOAP is simply a message format, that is not strictly true. It is true that if you choose a transport protocol, such as HTTP, and a data encoding, such as SOAP Section 5 Encoding, you have all the pieces needed to create and consume Web services. However, after writing a few clients and services you will begin to see repetition in the work being done and room for automation in the development process.
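The hand-written client work described above might look like the following sketch, which pairs a SOAP envelope with HTTP as the transport. The endpoint URL, the getOfficeTemperature request element, and the SOAPAction value are assumptions invented for the example; the request is only constructed, not sent, because the endpoint does not exist.

```python
import urllib.request

# Hypothetical endpoint and action for the office temperature service.
ENDPOINT = "http://example.com/office-temperature"
SOAP_ACTION = "urn:example:getOfficeTemperature"

REQUEST_ENVELOPE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Body>
        <getOfficeTemperature/>
    </soap:Body>
</soap:Envelope>"""


def build_request(endpoint, envelope, soap_action):
    """Wrap a SOAP envelope in an HTTP POST, as the SOAP 1.1 HTTP binding expects."""
    return urllib.request.Request(
        endpoint,
        data=envelope.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '"%s"' % soap_action,
        },
        method="POST",
    )


request = build_request(ENDPOINT, REQUEST_ENVELOPE, SOAP_ACTION)
# urllib.request.urlopen(request) would send it to a real service.
```

Every new operation or service needs another envelope template and another set of headers like these, which is the repetition that invites automation.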

A SOAP service would require some documentation explaining the operations exposed along with their parameters. But if this documentation is not in a machine-understandable standard format, it's like giving someone a shared library implementing some useful function without also including the library's header file. Theoretically, the library can be used if you explain the exposed methods in some documentation. A user could then load the library into an address space and call the methods using function pointers. But why should anyone have to do that?

Doing that is basically recreating the header file yourself from the written documentation. Few people would dispute the usefulness of a header file. In the Web services context, the analogous file is a Web Services Description Language (WSDL) document. It is the header file for a Web service. It is a machine-understandable standard for describing the operations of a Web service. It also specifies the wire format and transport protocol that the Web service uses to expose this functionality. It can also describe the payload data using a type system.
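To illustrate the header-file analogy, the sketch below reads a deliberately tiny WSDL 1.1 document and lists each operation with its input and output parts, much as a tool would read function prototypes from a header. The WSDL content is a made-up description of the office temperature service: its names and the urn:example namespace are assumptions, and the binding and service elements that a complete WSDL document would carry are omitted for brevity.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, abbreviated WSDL 1.1 description (no binding or service).
OFFICE_TEMPERATURE_WSDL = """<definitions name="OfficeTemperature"
        targetNamespace="urn:example:office-temperature"
        xmlns="http://schemas.xmlsoap.org/wsdl/"
        xmlns:tns="urn:example:office-temperature"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <message name="getOfficeTemperatureRequest"/>
    <message name="getOfficeTemperatureResponse">
        <part name="officeTemperature" type="xsd:double"/>
    </message>
    <portType name="OfficeTemperaturePortType">
        <operation name="getOfficeTemperature">
            <input message="tns:getOfficeTemperatureRequest"/>
            <output message="tns:getOfficeTemperatureResponse"/>
        </operation>
    </portType>
</definitions>"""


def describe_operations(wsdl_text):
    """Map each operation name to its input and output (part name, type) lists."""
    root = ET.fromstring(wsdl_text)

    # Index the messages by name so operations can look up their parts.
    parts = {}
    for message in root.findall("wsdl:message", WSDL_NS):
        parts[message.get("name")] = [
            (part.get("name"), part.get("type"))
            for part in message.findall("wsdl:part", WSDL_NS)
        ]

    operations = {}
    for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
        signature = {}
        for direction in ("input", "output"):
            ref = op.find("wsdl:" + direction, WSDL_NS)
            if ref is not None:
                # Drop the namespace prefix from the message QName.
                local_name = ref.get("message").split(":")[-1]
                signature[direction] = parts[local_name]
        operations[op.get("name")] = signature
    return operations


operations = describe_operations(OFFICE_TEMPERATURE_WSDL)
```

With this information a toolkit can generate the request envelope and decode the response without the developer reading any prose documentation.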

In the next article in this series, I will discuss the issues surrounding WSDL. I will attempt to extract the bare essence of what WSDL provides and I will tie the discussions of SOAP and WSDL together with some practical advice for creating and consuming Web services in a way that promotes interoperability as well as some ideas on the future of the Web services standards.


