The Web services (r)evolution: Part 3

How SOAP works

Graham Glass (graham-glass@mindspring.com), CEO/Chief Architect, The Mind Electric
Graham Glass is founder, CEO, and Chief Architect of The Mind Electric, which designs, builds and licenses forward-thinking distributed computing infrastructure. He believes that the evolution of the Internet will mirror that of a biological mind, and that architectures for helping people and businesses to network effectively will provide insight into those that wire together the human brain. He can be reached at graham-glass@mindspring.com.

Summary:  This article provides an explanation of how SOAP works, including information about its on-the-wire protocol and how messages are processed. It also explains how objects can be passed by value between Web services, and touches on performance and security issues.

Date:  01 Jan 2001
Level:  Introductory

Welcome to the third installment of my column that focuses on the evolutionary and revolutionary aspects of Web services technology. In the second installment, I showed how to build, deploy, and invoke a simple Web service using the Apache Simple Object Access Protocol (SOAP) implementation. In this installment, I'll explain how SOAP works under the hood, which is technically interesting and will help to demystify the standards that we'll discuss in future installments, such as Web Services Description Language (WSDL) and Universal Description, Discovery and Integration (UDDI) (see Resources).

Tools and installation

No new software needs to be installed since we are using the same tools as in the previous installment. Additionally, we will use the examples from the previous installment, so be sure you still have the \demo1 directory with the IExchange.java, Exchange.java, and Client.java source files present.

One new software tool I will introduce this time around is the TCP tunneling GUI, which allows you to see the TCP messages move between the client and server. This tool will allow you to examine the SOAP on-the-wire protocol and even check that the various competing SOAP implementations are standards-compliant.


Sniffing SOAP

The TCP tunneling GUI allows you to "sniff" SOAP messages by acting as a TCP router. To see it in action, first start the Apache Tomcat server (see last issue in Resources) in the \demo1 directory by typing:

> cd \demo1
> tomcat run

Tomcat starts up on port 8080 by default. Then start the TCP tunneler in a separate window by typing:

> java org.apache.soap.util.net.TcpTunnelGui 8070 localhost 8080

This starts a TCP server on port 8070 that acts as an intermediary between any client and port 8080 on the local host, which in this case is serviced by Tomcat. The GUI (see Figure 1) will display outgoing messages in the left pane and incoming replies in the right pane.
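To make the idea of a TCP intermediary concrete, here is a minimal console sketch of a tunnel. It is not the source of TcpTunnelGui; the class name MiniTunnel and the ">> " and "<< " labels are assumptions made for illustration. It accepts connections on one port, relays every byte to the target, and prints both directions to standard output.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical console stand-in for a TCP tunnel; not the Apache tool itself.
public class MiniTunnel {

    // Copies bytes from in to out until end of stream, echoing each chunk
    // to log with a label that identifies the direction. Returns the byte count.
    public static long copy(InputStream in, OutputStream out, PrintStream log, String label) {
        byte[] buffer = new byte[4096];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                log.print(label + new String(buffer, 0, n, StandardCharsets.ISO_8859_1));
                out.write(buffer, 0, n);
                out.flush();
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    // Relays one client connection to the target and back again.
    static void handle(Socket client, Socket target) {
        Thread upstream = new Thread(() -> {
            try {
                copy(client.getInputStream(), target.getOutputStream(), System.out, ">> ");
                target.shutdownOutput(); // the client has finished sending
            } catch (IOException | UncheckedIOException e) {
                // The connection went away; the downstream side closes the sockets.
            }
        });
        upstream.start();
        try {
            copy(target.getInputStream(), client.getOutputStream(), System.out, "<< ");
        } catch (IOException | UncheckedIOException e) {
            // Treat any I/O failure as the end of the conversation.
        } finally {
            try { client.close(); } catch (IOException ignored) { }
            try { target.close(); } catch (IOException ignored) { }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.out.println("usage: java MiniTunnel <listenPort> <targetHost> <targetPort>");
            return;
        }
        int listenPort = Integer.parseInt(args[0]);
        String targetHost = args[1];
        int targetPort = Integer.parseInt(args[2]);
        try (ServerSocket server = new ServerSocket(listenPort)) {
            while (true) {
                Socket client = server.accept();
                Socket target = new Socket(targetHost, targetPort);
                new Thread(() -> handle(client, target)).start();
            }
        }
    }
}
```

Started with the arguments 8070 localhost 8080, this sketch would sit between a client and Tomcat in the same way as the GUI tool, only without the two panes.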


Figure 1: The TCP Tunneler Interface

Now edit the Client.java code from the previous example and change the Web service URL to use port 8070 instead of 8080. The first line of the client program should now read:

URL url = new URL( "http://localhost:8070/soap/servlet/rpcrouter" );

Then compile and run the client. The GUI should display Figure 2.


Figure 2: Running the Tunneler

As you can see, the left-hand pane shows an outgoing SOAP request made up of 5 lines of standard HTTP header followed by an XML document that represents the client service invocation. The right-hand pane shows the resultant SOAP response consisting of 6 lines of standard HTTP header followed by an XML document that is the server response. Even without an explanation of the SOAP format, you can probably figure out what most of it means. Contrast this with the CORBA and DCOM protocols, which are binary, not self-describing, and tough to trace. I know this firsthand, having written a CORBA ORB in a previous lifetime.


Anatomy of a SOAP Request

The first thing to mention is that although this particular setup uses HTTP to deliver SOAP messages, SOAP can ride on any other transport protocol. You can use SMTP, the Internet email protocol, for example, to deliver SOAP messages. The header differs between transport layers, but the XML payload remains the same. A complete SOAP/HTTP request, with the XML content indented for clarity, is shown in Listing 1.

A SOAP request is sent as an HTTP POST with the content type set to text/xml and a field called SOAPAction set to either an empty string or the name of the SOAP method. The SOAPAction field allows a receiving Web server to detect that it's an incoming SOAP message and potentially route or filter it.
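As a rough illustration of the HTTP side of a request, the sketch below assembles the text of a POST that carries a SOAP envelope. It is not a capture of what Apache SOAP sends: the class name SoapHttpRequest, the exact set and order of headers, and the charset parameter are assumptions. Only the POST method, the text/xml content type, and the SOAPAction field come from the description above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that shows the shape of a SOAP/HTTP request.
public class SoapHttpRequest {

    // Builds the text of an HTTP POST that carries the given SOAP envelope.
    // Content-Length is the size of the envelope in bytes, not characters.
    public static String build(String host, String path, String soapAction, String envelope) {
        int length = envelope.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "Content-Type: text/xml; charset=utf-8\r\n"
             + "Content-Length: " + length + "\r\n"
             + "SOAPAction: \"" + soapAction + "\"\r\n"
             + "\r\n"                       // blank line ends the HTTP header
             + envelope;
    }

    public static void main(String[] args) {
        // An empty Body keeps the example short; a real call would name a method here.
        String envelope =
            "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\">"
          + "<SOAP-ENV:Body/></SOAP-ENV:Envelope>";
        System.out.println(build("localhost:8070", "/soap/servlet/rpcrouter", "", envelope));
    }
}
```

Passing an empty string for soapAction produces the empty quoted value that the text mentions as one of the two allowed forms.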

The XML part of the SOAP request consists of three main portions:

  • The Envelope defines the various namespaces that are used by the rest of the SOAP message, typically including xmlns:SOAP-ENV (SOAP Envelope namespace), xmlns:xsi (XML Schema for Instances) and xmlns:xsd (XML Schema for DataTypes).
  • The Header is an optional element for carrying auxiliary information for authentication, transactions, and payments. Any element in a SOAP processing chain can add or delete items from the Header; elements can also choose to ignore items if they are unknown. If a Header is present, it must be the first child of the Envelope. Because our example is simple and does not involve routers, the Header is absent.
  • The Body is the main payload of the message. When SOAP is used to perform an RPC call, the Body contains a single element that contains the method name, arguments, and Web service target address. The namespace of the element is equal to the target address, and the base name is the method name. In this example, ns1:getRate indicates that the target address is urn:demo1:exchange (the expanded form of ns1) and the method name is getRate. If a Header is present, the Body must be its immediate sibling, otherwise it must be the first child of the Envelope.
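The relationship between the call element, the target address, and the method name can be checked with a namespace-aware XML parser. The sketch below uses only the JDK, not Apache SOAP. The envelope is a simplified request for getRate: the argument values are illustrative assumptions, and attributes such as the encoding style are omitted for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

// Hypothetical helper that picks apart a SOAP RPC request.
public class EnvelopeAnatomy {

    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String XSI = "http://www.w3.org/1999/XMLSchema-instance";

    // Simplified request; the argument values are made up for this example.
    public static final String REQUEST =
        "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + SOAP_ENV + "\""
      + " xmlns:xsi=\"" + XSI + "\""
      + " xmlns:xsd=\"http://www.w3.org/1999/XMLSchema\">"
      + "<SOAP-ENV:Body>"
      + "<ns1:getRate xmlns:ns1=\"urn:demo1:exchange\">"
      + "<country1 xsi:type=\"xsd:string\">USA</country1>"
      + "<country2 xsi:type=\"xsd:string\">japan</country2>"
      + "</ns1:getRate>"
      + "</SOAP-ENV:Body>"
      + "</SOAP-ENV:Envelope>";

    // Returns the first element child of the Body, which is the RPC call element.
    public static Element callElement(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to resolve the ns1 prefix
            doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a parsable envelope", e);
        }
        Node body = doc.getElementsByTagNameNS(SOAP_ENV, "Body").item(0);
        if (body == null) {
            throw new IllegalArgumentException("no Body element");
        }
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        throw new IllegalArgumentException("empty Body");
    }

    public static void main(String[] args) {
        Element call = callElement(REQUEST);
        System.out.println("target address: " + call.getNamespaceURI());
        System.out.println("method name:    " + call.getLocalName());
        // Each child element is one argument, tagged with its XSD type.
        for (Node n = call.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                Element arg = (Element) n;
                System.out.println(arg.getTagName() + " (" + arg.getAttributeNS(XSI, "type")
                    + ") = " + arg.getTextContent());
            }
        }
    }
}
```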

When using SOAP as a remote procedure call (RPC) system, the SOAP parameters can be typed or untyped. The current version of Apache SOAP accepts only typed arguments, although there is a version in the works that will properly allow untyped arguments as well. The default SOAP encoding scheme uses the xsi:type attribute to indicate an XSD type. XSD defines several basic types, including int, byte, short, boolean, string, float, double, date, time and URL. It also specifies a format for sending arrays and blocks of opaque data.

Because SOAP is intended to be platform and language neutral, XSD does not define formats for encoding objects or structures unique to a single language. Later in this article, I'll show you how to send Java objects between machines that are both running Apache SOAP 2.0.

The SOAP/HTTP message is typically accepted by a servlet running on a Web server. In this example, Apache SOAP 2.0 installs a servlet into Tomcat that accepts HTTP requests on soap/servlet/rpcrouter. When the servlet gets a request, it first checks that the request has a SOAPAction field; if it does, it forwards the request to the Apache SOAP engine. The engine then parses the XML payload and uses the target Web service address to perform a lookup in its local registry (populated at startup when it reads the DeployedServices.ds file). It then uses reflection to locate and invoke the specified method on the service and get the result.
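The lookup-then-reflect step can be sketched in a few lines of plain Java. This is not the Apache SOAP engine: the class name ReflectiveDispatch is hypothetical, the registry is an in-memory map rather than something read from DeployedServices.ds, and methods are matched by name and argument count only.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a registry lookup followed by a reflective call.
public class ReflectiveDispatch {

    // Stand-in for the Exchange service from the previous installment.
    public static class Exchange {
        public float getRate(String country1, String country2) {
            return 144.52F; // always return the same value for now
        }
    }

    // Maps a target address (URN) to the object that services it.
    private final Map<String, Object> registry = new HashMap<>();

    public void deploy(String urn, Object service) {
        registry.put(urn, service);
    }

    // Finds the service for the URN, then finds and invokes the named method.
    public Object invoke(String urn, String methodName, Object... args) {
        Object service = registry.get(urn);
        if (service == null) {
            throw new IllegalArgumentException("no service deployed at " + urn);
        }
        for (Method m : service.getClass().getMethods()) {
            if (m.getName().equals(methodName) && m.getParameterCount() == args.length) {
                try {
                    return m.invoke(service, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("invocation failed", e);
                }
            }
        }
        throw new IllegalArgumentException("no method " + methodName + " on " + urn);
    }

    public static void main(String[] args) {
        ReflectiveDispatch engine = new ReflectiveDispatch();
        engine.deploy("urn:demo1:exchange", new Exchange());
        // The argument values are illustrative.
        System.out.println(engine.invoke("urn:demo1:exchange", "getRate", "USA", "japan"));
    }
}
```

A production engine would also have to convert the decoded XML arguments to the parameter types of the method and resolve overloads; the sketch leaves both out.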

The next section describes the format of the SOAP/HTTP response that is used to return the result back to the client.


Anatomy of a SOAP Response

A SOAP/HTTP response (see Listing 2) is returned as an XML document within a standard HTTP reply whose content type is set to text/xml. The XML document is structured just like the request except that the Body contains the encoded method result. The namespace of the result is the original target object URI and the base name is the name of the method that was invoked. The XSI/XSD tagging scheme is optionally used to denote the type of the result (see Resources). The SOAP standard does not specify what should be returned from a void method, although most implementations simply omit the <return> part of the Body.


SOAP Exceptions

If an exception occurs at any time during the processing of a message, a SOAP exception is thrown. SOAP exceptions are encoded in a manner similar to regular SOAP responses, with the exception that the Body is used to contain information about the exception. To illustrate this, edit the Exchange.java file that describes our currency exchange Web service and force it to throw an exception as in Listing 3.


Listing 3: A SOAP Exception

public class Exchange implements IExchange
  {
  public float getRate( String country1, String country2 )
    {
    throw new RuntimeException( "cannot calculate rate" );
    /*
    System.out.println( "getRate( " + country1 + ", " + country2 + " )" );
    return 144.52F; // always return the same value for now
    */
    }
  }

Then shut down and restart Tomcat so that it uses the new version of the Web service. Then invoke the service using the Client program. The output from the TCP tunneling GUI is shown in Figure 3.


Figure 3: Tunneler output on SOAP Exception

The standard HTTP reply header indicates an exception by using status code 500. The XML payload contains an Envelope and Body just like a regular response, except that the content of the Body is a Fault structure whose fields are defined as follows:

  • Faultcode is a code which indicates the type of the fault. The valid values are SOAP-ENV:Client (incorrectly formed message), SOAP-ENV:Server (delivery problem), SOAP-ENV:VersionMismatch (invalid namespace for Envelope element) and SOAP-ENV:MustUnderstand (error processing header content).
  • Faultstring is a human readable description of the fault.
  • Faultactor is an optional field that indicates the URI of the source of the fault.
  • Detail is application-specific XML that contains detailed information about the fault.
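A client can detect a fault by looking for a Fault element inside the Body and reading its fields. The sketch below does this with the JDK's DOM parser. The sample response is hand-written for illustration, not a capture from Tomcat; in particular, the faultstring text is an assumption, and the class name FaultReader is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper that extracts the fields of a SOAP Fault.
public class FaultReader {

    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

    // Hand-written fault response; the faultstring value is illustrative.
    public static final String RESPONSE =
        "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + SOAP_ENV + "\">"
      + "<SOAP-ENV:Body>"
      + "<SOAP-ENV:Fault>"
      + "<faultcode>SOAP-ENV:Server</faultcode>"
      + "<faultstring>cannot calculate rate</faultstring>"
      + "</SOAP-ENV:Fault>"
      + "</SOAP-ENV:Body>"
      + "</SOAP-ENV:Envelope>";

    // Returns { faultcode, faultstring }, or null when the response is not a fault.
    public static String[] readFault(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a parsable envelope", e);
        }
        NodeList faults = doc.getElementsByTagNameNS(SOAP_ENV, "Fault");
        if (faults.getLength() == 0) {
            return null;
        }
        Element fault = (Element) faults.item(0);
        return new String[] { text(fault, "faultcode"), text(fault, "faultstring") };
    }

    // Text of the first descendant element with the given name, or null if absent.
    static String text(Element parent, String name) {
        NodeList nodes = parent.getElementsByTagName(name);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) {
        String[] fault = readFault(RESPONSE);
        System.out.println("faultcode:   " + fault[0]);
        System.out.println("faultstring: " + fault[1]);
    }
}
```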

Some SOAP implementations use the detail element to encode information about remote exceptions such as their type, data, and stack trace so that they can be rethrown automatically on the client. This allows Remote Method Invocation (RMI)-style remote exceptions to be implemented using SOAP. Of course, if the client and server are not using the same SOAP implementation, the feature is automatically disabled.


Performance

Now that you've seen how SOAP messages are passed back and forth using HTTP and XML, it is interesting to contemplate performance issues.

CORBA, DCOM, and RMI use binary encoding for arguments and return values. In addition, they assume that both the sender and the receiver have full knowledge of the message context and do not encode any meta-information such as the names or types of the arguments. This approach results in good performance, but makes it hard for intermediaries to process messages. And since each system uses a different binary encoding, it's hard to build systems that interoperate.

Because SOAP uses XML to encode messages, it's very easy to process messages at every step of the invocation process. In addition, the ease of debugging SOAP messages is leading to a quick convergence of the various SOAP implementations, which is important because large scale interoperability is what SOAP is all about.

On the surface, it seems that an XML-based scheme would be intrinsically slower than a binary one, but it's not as straightforward as that. First, when SOAP is used for sending messages across the Internet, the time to encode/decode the messages at each endpoint is tiny compared with the time to transfer the bytes between endpoints, so the overhead of using XML is not significant in this case.

Second, when SOAP is used to send messages between endpoints in a closed environment, such as between departments within the same company, it's likely that the endpoints will be running the same implementation of SOAP. In this case, there are opportunities for optimizations that are unique to that particular implementation. For example, a SOAP client could add an HTTP header tag to a SOAP request that indicates that it supports a particular optimization. If the SOAP server also supports that optimization, it could return an HTTP header tag in the first SOAP response that tells the client that it's okay to use that optimization in subsequent communications. At that point, both the client and server could start using the optimization.
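The handshake described above boils down to intersecting two lists. The following sketch shows one way it might look. Everything specific in it is invented for illustration: the header name X-Soap-Optimizations, the optimization names, and the class name OptimizationHandshake do not come from any SOAP implementation or standard.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical negotiation of implementation-specific optimizations.
public class OptimizationHandshake {

    // Invented header name; no standard defines it.
    public static final String HEADER = "X-Soap-Optimizations";

    // Splits a comma-separated header value into trimmed, non-empty names.
    public static List<String> parse(String headerValue) {
        List<String> names = new ArrayList<>();
        if (headerValue == null) {
            return names;
        }
        for (String part : headerValue.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Server side: answer with the offered optimizations that it also supports.
    public static String serverReply(String offered, Set<String> supported) {
        List<String> agreed = parse(offered);
        agreed.retainAll(supported);
        return String.join(", ", agreed);
    }

    public static void main(String[] args) {
        Set<String> serverSupports = new HashSet<>(Arrays.asList("keep-alive", "binary-args"));
        String offered = "compact-xml, keep-alive";
        System.out.println("request  " + HEADER + ": " + offered);
        System.out.println("response " + HEADER + ": " + serverReply(offered, serverSupports));
    }
}
```

A server that does not recognize the header simply ignores it, so the client falls back to plain SOAP and interoperability is preserved.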

In my experiments with Apache, I normally get about 30 round trip messages per second between Java programs on the same machine. I have been evaluating another SOAP implementation that gets about 700 round trip messages per second in the same configuration, so obviously there is a lot of room for improvement.


Passing objects

Once the initial excitement of building and deploying a simple service has worn off, many developers wish to start sending objects between clients and servers. There are at least two approaches for performing this task.

If you can guarantee that both the client and the server are configured with the same proprietary extensions, then you could pass objects between the machines using whatever approach you wanted. For example, if the client and server are both written in the Java programming language, they could agree to send objects using Java serialization.

On the other hand, if you want your system to work with any combination of client and server, start off by specifying XML schemas for the objects that you wish to send. Then ensure that both client and server are configured to recognize XML documents conforming to these schemas and can convert them to or from objects as necessary. For example, if a C++ client wanted to send an object to a Java server, the client should be able to convert C++ objects to or from the schema, and the server should be able to convert Java objects to/from the same schema.

Apache SOAP 2.0 includes a flexible scheme for passing objects between SOAP engines by including a special mapping registry. You can register custom serializers/deserializers that are used to encode/decode data. The current documentation is pretty sparse on how to create your own serializers, but fortunately a useful default Java BeanSerializer is included that can serialize/deserialize any class conforming to the JavaBean specification -- it must include a public default constructor and declare public set/get methods for all its attributes.
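To see why the JavaBean conventions matter, consider how a generic serializer can discover an object's attributes. The sketch below is not Apache's BeanSerializer; it is a simplified illustration that uses the JDK's java.beans.Introspector to find get/set pairs and writes each property as a child element. The class name BeanToXml, the element name passed in, and the sample values are assumptions, and type tagging with xsi:type is left out.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;

// Hypothetical, simplified bean-to-XML writer; not Apache's BeanSerializer.
public class BeanToXml {

    // Compact bean with a public default constructor and get/set pairs.
    public static class Invoice {
        private String name;
        private int amount;
        public Invoice() { }
        public Invoice(String name, int amount) { this.name = name; this.amount = amount; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAmount() { return amount; }
        public void setAmount(int amount) { this.amount = amount; }
    }

    // Writes every readable and writable property of the bean as a child element.
    public static String serialize(String elementName, Object bean) {
        StringBuilder xml = new StringBuilder("<" + elementName + ">");
        try {
            // Stop at Object so that getClass() is not treated as a property.
            PropertyDescriptor[] properties =
                Introspector.getBeanInfo(bean.getClass(), Object.class).getPropertyDescriptors();
            for (PropertyDescriptor p : properties) {
                Method getter = p.getReadMethod();
                if (getter == null || p.getWriteMethod() == null) {
                    continue; // a deserializer could not set this property back
                }
                Object value = getter.invoke(bean);
                if (value == null) {
                    continue; // this sketch simply omits null properties
                }
                xml.append('<').append(p.getName()).append('>')
                   .append(escape(String.valueOf(value)))
                   .append("</").append(p.getName()).append('>');
            }
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException("cannot serialize " + bean.getClass(), e);
        }
        return xml.append("</").append(elementName).append('>').toString();
    }

    // Escapes the characters that would otherwise break the markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(serialize("invoice", new Invoice("widgets", 100)));
    }
}
```

Deserialization runs the same discovery in reverse: create the object with the default constructor, then call the set method for each child element, which is why both the constructor and the set/get pairs are required.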

The following program illustrates how you can send and receive objects. A Purchasing Web service is deployed on the Tomcat server and then the Client sends an Invoice object by value. The object is displayed when it arrives on the server and then redisplayed when it is returned back to the client. You can also pass the object by reference in SOAP, as I will explain in a future installment of this column.

The sample source code for this Purchasing Web service is given in Listings 4 through 7.


Listing 4. Code for IPurchasing.java

public interface IPurchasing
  {
  Invoice receive( Invoice invoice );
  } 


Listing 5. Code for Purchasing.java

public class Purchasing implements IPurchasing
  {
  public Invoice receive( Invoice invoice )
    {
    System.out.println( "got invoice " + invoice );
    return invoice;
    }
  }


Listing 6. Code for Invoice.java

public class Invoice
  {
  String name;
  int amount;
  public Invoice()
    {
    }
  public Invoice( String name, int amount )
    {
    this.name = name;
    this.amount = amount;
    }
  public String toString()
    {
    return "Invoice( " + name + ", " + amount + " )";
    }
  public void setName( String name )
    {
    this.name = name;
    }
  public String getName()
    {
    return name;
    }
  public void setAmount( int amount )
    {
    this.amount = amount;
    }
  public int getAmount()
    {
    return amount;
    }
  }

Listing 7. Code for Client2.java

To run the program, perform the following steps:

  1. Copy the Java files from Listings 4 through 7 into a new directory called \demo2 and add \demo2 to your CLASSPATH.
  2. Compile the source files.
  3. Run the TCP tunneling GUI as before, listening to port 8070 and routing to/from port 8080.
  4. Start Tomcat in the \demo2 directory.
  5. Open a browser and enter the URL http://localhost:8080/soap
  6. Use the admin interface to deploy a Purchasing service. Enter the following values:
    • ID=urn:demo2:purchasing
    • Scope=Request
    • Methods=receive
    • Provider Type=Java
    • Provider Class=Purchasing
    • Static=No
    • Number of Mappings=1
    • Encoding Style=SOAP
    • Namespace URI=urn:my_encoding
    • Local Part=Invoice
    • Java Type=Invoice
    • Java To XML Serializer=org.apache.soap.encoding.soapenc.BeanSerializer
    • XML To Java Deserializer=org.apache.soap.encoding.soapenc.BeanSerializer
  7. Run the Client2 program.

The Namespace URI and Local Part values are used by the Apache SOAP engine for the xsi:type attribute value. It is not particularly important what these values are, as long as they don't clash with any other mappings you use.

Assuming all goes well, you should see the results in the TCP GUI in Figure 4, and the client output in Figure 5.


Figure 4: Running the Purchasing Web service


Figure 5: The Purchasing Web service client output window


Security

Security is a complex topic, and the current efforts to secure SOAP seem to be taking a layered approach.

At the lowest level, SOAP messages can be passed over HTTPS. Since HTTPS uses SSL for its transport, the contents of the message are encrypted to prevent snooping, and the client and server can verify each other's identity. SOAP solutions based on SSL are available today, but they are not particularly easy to install and configure.

Although HTTPS solves the problem of shielding messages from eavesdroppers, it doesn't help much with the kind of finer grain security that is necessary to authenticate particular users of specific Web services. Many Web services will require a user to obtain some kind of user/password combination during initial registration and then use this authentication information when accessing the Web service in future. The UDDI Web services registries that are hosted online by IBM, Microsoft, and Ariba are examples of Web services that require a user to register before being able to use the publish service. There is no standard yet for how a Web service supports registration and authentication, but such a standard will no doubt emerge soon. Microsoft and Verisign recently announced their work on a standard called XML Key Management Specification (XKMS) that they will submit to the standards bodies for possible incorporation into SOAP and other systems (see Resources).

I intend to devote more time to the topic of security in a future installment of this column, by which time the security standards will hopefully have progressed.


Next installment

In the next installment, I'm going to introduce WSDL (Web Services Description Language) and UDDI (Universal Description, Discovery and Integration), two important standards that will help move Web services technology into the mainstream.


Resources
