Welcome to the third installment of my column, which focuses on the evolutionary and revolutionary aspects of Web services technology. In the second installment, I showed how to build, deploy, and invoke a simple Web service using the Apache Simple Object Access Protocol (SOAP) implementation. In this installment, I'll explain how SOAP works under the hood, which is technically interesting in its own right and will help to demystify the standards that we'll discuss in future installments, such as Web Services Description Language (WSDL) and Universal Description, Discovery and Integration (UDDI) (see Resources).
No new software needs to be installed since we are using the same tools as in the previous installment. Additionally, we will use the examples from the previous installment, so be sure you still have the \demo1 directory with the IExchange.java, Exchange.java, and Client.java source files present.
One new software tool I will introduce this time around is the TCP tunneling GUI, which allows you to see the TCP messages move between the client and server. This tool will allow you to examine the SOAP on-the-wire protocol and even check that the various competing SOAP implementations are standards-compliant.
The TCP tunneling GUI allows you to "sniff" SOAP messages by acting as a TCP router. To see it in action, first start the Apache Tomcat server (see last issue in Resources) in the \demo1 directory by typing:
> cd \demo1
> tomcat run
Tomcat starts up on port 8080 by default. Then start the TCP tunneler in a separate window by typing:
> java org.apache.soap.util.net.TcpTunnelGui 8070 localhost 8080
This starts a TCP server on port 8070 that acts as an intermediary between any client and port 8080 on the local host, which in this case is serviced by Tomcat. The GUI (see Figure 1) will display outgoing messages in the left pane and incoming replies in the right pane.
Figure 1: The TCP Tunneler Interface
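To make the idea of a TCP router concrete, here is a minimal sketch of a tunnel written with plain java.net sockets. It is not the TcpTunnelGui source: the class name TunnelSketch and its methods are hypothetical, and it simply echoes the forwarded bytes to the console instead of displaying them in two panes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical stand-in for a TCP tunnel; this is not the TcpTunnelGui source.
class TunnelSketch {

    // Copies bytes from in to out until end of stream, echoing each chunk to log.
    static long pump(InputStream in, OutputStream out, PrintStream log) {
        byte[] buffer = new byte[4096];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                out.flush();
                log.write(buffer, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    // Relays one direction, then signals end of stream to the receiving side.
    static void relay(Socket from, Socket to) {
        try {
            pump(from.getInputStream(), to.getOutputStream(), System.out);
            to.shutdownOutput();
        } catch (IOException | UncheckedIOException e) {
            // The peer dropped the connection; nothing more to forward.
        }
    }

    // Serves one client connection: requests inline, replies on a second thread.
    static void handle(Socket client, String host, int port) {
        try (Socket c = client; Socket t = new Socket(host, port)) {
            Thread replies = new Thread(() -> relay(t, c));
            replies.start();
            relay(c, t);
            replies.join();
        } catch (IOException e) {
            System.err.println("tunnel error: " + e.getMessage());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: java TunnelSketch <listenPort> <host> <port>");
            return;
        }
        int listenPort = Integer.parseInt(args[0]);
        int targetPort = Integer.parseInt(args[2]);
        try (ServerSocket server = new ServerSocket(listenPort)) {
            while (true) {
                Socket client = server.accept();
                new Thread(() -> handle(client, args[1], targetPort)).start();
            }
        }
    }
}
```

Running it with the arguments 8070 localhost 8080 would mirror the tunneler command line shown above, with requests and replies interleaved on standard output.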
Now edit the Client.java code from the previous example and change the
Web service URL to use port 8070 instead of 8080. The first line of the
client program should now read:
URL url = new URL( "http://localhost:8070/soap/servlet/rpcrouter" );
Then compile and run the client. The GUI should display Figure 2.
Figure 2: Running the Tunneler
As you can see, the left-hand pane shows an outgoing SOAP request comprised of 5 lines of standard HTTP header followed by an XML document that represents the client service invocation. The right-hand pane shows the resultant SOAP response consisting of 6 lines of standard HTTP header followed by an XML document that is the server response. Even without an explanation of the SOAP format, you can probably figure out what most of it means. Contrast this with the CORBA and DCOM protocols, which are binary, not self-describing, and tough to trace. I know this firsthand, having written a CORBA ORB in a previous lifetime.
The first thing to mention is that although this particular setup uses HTTP to deliver SOAP messages, SOAP can ride on any other transport protocol. You can use SMTP, the Internet email protocol, for example, to deliver SOAP messages. The header differs between transport layers, but the XML payload remains the same. A complete SOAP/HTTP request, with the XML content indented for clarity, is shown in Listing 1.
A SOAP request is sent as an HTTP POST with the content type set to
text/xml and a field called SOAPAction set to either
an empty string or the name of the SOAP method. The SOAPAction
field allows a receiving Web server to detect that it's an incoming SOAP
message and potentially route or filter it.
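As a rough illustration of that layout, the sketch below assembles the text of a SOAP/HTTP request by hand. The class name SoapHttpRequestSketch and the exact set of header fields are assumptions made for the example; a real client library may send additional or differently ordered fields.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that lays out a SOAP request as HTTP POST text.
class SoapHttpRequestSketch {

    // Builds the header lines followed by a blank line and the XML payload.
    static String build(String host, String path, String soapAction, String xml) {
        // Content-Length counts bytes, not characters.
        int length = xml.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "Content-Type: text/xml; charset=utf-8\r\n"
             + "Content-Length: " + length + "\r\n"
             + "SOAPAction: \"" + soapAction + "\"\r\n"
             + "\r\n"
             + xml;
    }

    public static void main(String[] args) {
        // An empty SOAPAction, one of the two options described in the text.
        System.out.println(build("localhost:8070", "/soap/servlet/rpcrouter", "", "<placeholder/>"));
    }
}
```

Only the header changes if a different transport is used; the XML string passed as the last argument would stay the same.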
The XML part of the SOAP request consists of three main portions:
- The Envelope defines the various namespaces that are used by the rest of the SOAP message, typically including xmlns:SOAP-ENV (SOAP Envelope namespace), xmlns:xsi (XML Schema for Instances) and xmlns:xsd (XML Schema for DataTypes).
- The Header is an optional element for carrying auxiliary information for authentication, transactions, and payments. Any element in a SOAP processing chain can add or delete items from the Header; elements can also choose to ignore items if they are unknown. If a Header is present, it must be the first child of the Envelope. Because our example is simple and does not involve routers, the Header is absent.
- The Body is the main payload of the message. When SOAP is used to perform an RPC call, the Body contains a single element that contains the method name, arguments, and Web service target address. The namespace of the element is equal to the target address, and the base name is the method name. In this example, ns1:getRate indicates that the target address is urn:demo1:exchange (the expanded form of ns1) and the method name is getRate. If a Header is present, the Body must be its immediate sibling, otherwise it must be the first child of the Envelope.
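The relationship between the Body element, the target address, and the method name can be checked with a namespace-aware XML parser. The sketch below is only an illustration: the class name EnvelopeSketch is hypothetical, and the sample envelope is a hand-written approximation (including the 1999 XML Schema namespace URIs and the argument values), not a copy of Listing 1.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical reader that pulls the RPC target and method out of a SOAP Body.
class EnvelopeSketch {
    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

    // Hand-written sample; namespace URIs and argument values are assumptions.
    static final String SAMPLE =
          "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + SOAP_ENV + "\""
        + " xmlns:xsi=\"http://www.w3.org/1999/XMLSchema-instance\""
        + " xmlns:xsd=\"http://www.w3.org/1999/XMLSchema\">"
        + "<SOAP-ENV:Body>"
        + "<ns1:getRate xmlns:ns1=\"urn:demo1:exchange\">"
        + "<country1 xsi:type=\"xsd:string\">USA</country1>"
        + "<country2 xsi:type=\"xsd:string\">japan</country2>"
        + "</ns1:getRate>"
        + "</SOAP-ENV:Body>"
        + "</SOAP-ENV:Envelope>";

    // Returns { target address, method name } taken from the first Body child.
    static String[] targetAndMethod(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to resolve ns1 to its URI
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a parsable envelope", e);
        }
        Element body = (Element) doc.getElementsByTagNameNS(SOAP_ENV, "Body").item(0);
        if (body == null) {
            throw new IllegalArgumentException("envelope has no Body");
        }
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return new String[] { n.getNamespaceURI(), n.getLocalName() };
            }
        }
        throw new IllegalArgumentException("Body is empty");
    }

    public static void main(String[] args) {
        String[] call = targetAndMethod(SAMPLE);
        System.out.println("target=" + call[0] + " method=" + call[1]);
    }
}
```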
When using SOAP as a remote procedure call (RPC) system, the SOAP parameters can be typed or untyped. The current version of Apache SOAP only accepts typed arguments, although there is a version in the works that will properly allow untyped arguments as well. The default SOAP encoding scheme uses the xsi:type attribute to indicate an XSD type. XSD defines several basic types, including int, byte, short, boolean, string, float, double, date, time and URL. It also specifies a format for sending arrays and blocks of opaque data.
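To show what a typed argument looks like, the following sketch maps a few Java values to XSD type names and writes them as elements carrying an xsi:type attribute. The class XsiTypeSketch and its mapping table are hypothetical and cover only some of the basic types listed above; they are not taken from the Apache SOAP source.

```java
// Hypothetical encoder for typed RPC arguments; covers only a few basic types.
class XsiTypeSketch {

    // Picks an XSD type name for a Java value.
    static String xsdTypeFor(Object value) {
        if (value instanceof String)  return "xsd:string";
        if (value instanceof Integer) return "xsd:int";
        if (value instanceof Short)   return "xsd:short";
        if (value instanceof Byte)    return "xsd:byte";
        if (value instanceof Boolean) return "xsd:boolean";
        if (value instanceof Float)   return "xsd:float";
        if (value instanceof Double)  return "xsd:double";
        throw new IllegalArgumentException("no XSD mapping for " + value.getClass());
    }

    // Writes <name xsi:type="...">value</name>, escaping markup characters.
    static String encode(String name, Object value) {
        String text = value.toString().replace("&", "&amp;").replace("<", "&lt;");
        return "<" + name + " xsi:type=\"" + xsdTypeFor(value) + "\">" + text + "</" + name + ">";
    }

    public static void main(String[] args) {
        System.out.println(encode("country1", "USA"));
        System.out.println(encode("count", 42));
    }
}
```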
Because SOAP is intended to be platform and language neutral, XSD does not define formats for encoding objects or structures unique to a single language. Later in this article, I'll show you how to send Java objects between machines that are both running Apache SOAP 2.0.
The SOAP/HTTP message is typically accepted by a servlet running on a Web server. In this example, Apache SOAP 2.0 installs a servlet into Tomcat that accepts HTTP requests on soap/servlet/rpcrouter. When the servlet gets a request, it first checks that the request has a SOAPAction field; if it does, it forwards it to the Apache SOAP engine. The engine then parses the XML payload and uses the target Web service address to perform a lookup in its local registry (populated at startup when it reads the DeployedServices.ds file). It then uses reflection to locate and invoke the specified method on the service and get the result.
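The lookup-then-reflect step can be sketched in a few lines of Java. This is an illustration of the general technique rather than the Apache SOAP engine itself: the DispatchSketch class, its deploy and invoke methods, and the in-memory map standing in for DeployedServices.ds are all hypothetical.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical dispatcher: a registry keyed by target address plus reflection.
class DispatchSketch {

    // Stand-in for the currency exchange service from the previous installment.
    public static class Exchange {
        public float getRate(String country1, String country2) {
            return 144.52F;
        }
    }

    private final Map<String, Object> registry = new HashMap<>();

    // Registers a service object under its target address.
    void deploy(String address, Object service) {
        registry.put(address, service);
    }

    // Finds the service, locates a public method by name and arity, invokes it.
    Object invoke(String address, String methodName, Object... args) {
        Object service = registry.get(address);
        if (service == null) {
            throw new IllegalArgumentException("no service deployed at " + address);
        }
        for (Method m : service.getClass().getMethods()) {
            if (m.getName().equals(methodName) && m.getParameterCount() == args.length) {
                try {
                    return m.invoke(service, args);
                } catch (InvocationTargetException e) {
                    // The service method itself threw; a real engine would build a Fault.
                    throw new IllegalStateException("service method failed", e.getCause());
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("no method " + methodName + " at " + address);
    }

    public static void main(String[] args) {
        DispatchSketch engine = new DispatchSketch();
        engine.deploy("urn:demo1:exchange", new Exchange());
        System.out.println(engine.invoke("urn:demo1:exchange", "getRate", "USA", "japan"));
    }
}
```

Matching on name and argument count alone keeps the sketch short; an engine that supports overloaded methods would also have to compare the decoded argument types.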
The next section describes the format of the SOAP/HTTP response that is used to return the result back to the client.
A SOAP/HTTP response (see Listing 2) is returned as
an XML document within a standard HTTP reply whose content type is set
to text/xml. The XML document is structured just like the request
except that the Body contains the encoded method result. The namespace
of the result is the original target object URI and the base name is the
name of the method that was invoked. The XSI/XSD tagging scheme is optionally
used to denote the type of the result (see Resources). The SOAP standard does not specify
what should be returned from a void method, although most implementations
simply omit the <return> part of the Body.
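On the client side, extracting the result amounts to finding the return element inside the Body. The sketch below assumes a hand-written response in which the wrapper element is named getRateResponse, a naming convention that many implementations follow but that is an assumption here, as are the class name ResponseSketch and the sample values. A void result is reported as null.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical reader for the <return> part of a SOAP RPC response.
class ResponseSketch {
    static final String ENV_OPEN =
          "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\""
        + " xmlns:xsi=\"http://www.w3.org/1999/XMLSchema-instance\""
        + " xmlns:xsd=\"http://www.w3.org/1999/XMLSchema\"><SOAP-ENV:Body>";
    static final String ENV_CLOSE = "</SOAP-ENV:Body></SOAP-ENV:Envelope>";

    // Hand-written samples; element names and values are assumptions.
    static final String WITH_RESULT = ENV_OPEN
        + "<ns1:getRateResponse xmlns:ns1=\"urn:demo1:exchange\">"
        + "<return xsi:type=\"xsd:float\">144.52</return>"
        + "</ns1:getRateResponse>" + ENV_CLOSE;
    static final String VOID_RESULT = ENV_OPEN
        + "<ns1:resetResponse xmlns:ns1=\"urn:demo1:exchange\"/>" + ENV_CLOSE;

    // Returns the text of the first <return> element, or null if there is none.
    static String returnValue(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a parsable response", e);
        }
        NodeList returns = doc.getElementsByTagName("return");
        return returns.getLength() == 0 ? null : returns.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        System.out.println(returnValue(WITH_RESULT));
        System.out.println(returnValue(VOID_RESULT));
    }
}
```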
If an exception occurs at any time during the processing of a message,
a SOAP exception is thrown. SOAP exceptions are encoded in a manner similar
to regular SOAP responses, with the exception that the Body is used to contain information
about the exception. To illustrate this, edit the Exchange.java
file that describes our currency exchange Web service and force it to throw
an exception as in Listing 3.
Listing 3: A SOAP Exception
public class Exchange implements IExchange
{
    public float getRate( String country1, String country2 )
    {
        throw new RuntimeException( "cannot calculate rate" );
        /*
        System.out.println( "getRate( " + country1 + ", " + country2 + " )" );
        return 144.52F; // always return the same value for now
        */
    }
}
Then shut down and restart Tomcat so that it uses the new version of the Web service. Then invoke the service using the Client program. The output from the TCP tunneling GUI is shown in Figure 3.
Figure 3: Tunneler output on SOAP Exception
The standard HTTP reply header indicates an exception by using status
code 500. The XML payload contains an Envelope and Body just like a regular
response, except that the content of the Body is a Fault structure whose
fields are defined as follows:
- faultcode is a code that indicates the type of the fault. The valid values are SOAP-ENV:Client (incorrectly formed message), SOAP-ENV:Server (delivery problem), SOAP-ENV:VersionMismatch (invalid namespace for Envelope element) and SOAP-ENV:MustUnderstand (error processing Header content).
- faultstring is a human-readable description of the fault.
- faultactor is an optional field that indicates the URI of the source of the fault.
- detail is application-specific XML that contains detailed information about the fault.
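A client can read these fields with the same DOM calls it uses for a normal response. The sketch below is a hedged illustration: the FaultSketch class is hypothetical, and the sample fault is hand-written to resemble what the modified Exchange service might produce, not copied from Figure 3.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical reader that collects the child fields of a SOAP Fault.
class FaultSketch {
    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

    // Hand-written sample fault; the field values are assumptions.
    static final String SAMPLE =
          "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + SOAP_ENV + "\">"
        + "<SOAP-ENV:Body><SOAP-ENV:Fault>"
        + "<faultcode>SOAP-ENV:Server</faultcode>"
        + "<faultstring>cannot calculate rate</faultstring>"
        + "<faultactor>/soap/servlet/rpcrouter</faultactor>"
        + "</SOAP-ENV:Fault></SOAP-ENV:Body></SOAP-ENV:Envelope>";

    // Returns field name -> text for each Fault child; empty if there is no Fault.
    static Map<String, String> faultFields(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a parsable response", e);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        Node fault = doc.getElementsByTagNameNS(SOAP_ENV, "Fault").item(0);
        if (fault == null) {
            return fields; // a regular response, not a fault
        }
        for (Node n = fault.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                fields.put(n.getNodeName(), n.getTextContent().trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(faultFields(SAMPLE));
    }
}
```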
Some SOAP implementations use the detail element to encode information about remote exceptions such as their type, data, and stack trace so that they can be rethrown automatically on the client. This allows Remote Method Invocation (RMI)-style remote exceptions to be implemented using SOAP. Of course, if the client and server are not using the same SOAP implementation, the feature is automatically disabled.
Now that you've seen how SOAP messages are passed back and forth using HTTP and XML, it is interesting to contemplate performance issues.
CORBA, DCOM, and RMI use binary encoding for arguments and return values. In addition, they assume that both the sender and the receiver have full knowledge of the message context and do not encode any meta-information such as the names or types of the arguments. This approach results in good performance, but makes it hard for intermediaries to process messages. And since each system uses a different binary encoding, it's hard to build systems that interoperate.
Because SOAP uses XML to encode messages, it's very easy to process messages at every step of the invocation process. In addition, the ease of debugging SOAP messages is leading to a quick convergence of the various SOAP implementations, which is important because large scale interoperability is what SOAP is all about.
On the surface, it seems that an XML-based scheme would be intrinsically slower than a binary one, but it's not as straightforward as that. First, when SOAP is used for sending messages across the Internet, the time to encode/decode the messages at each endpoint is tiny compared with the time to transfer the bytes between endpoints, so the overhead of using XML in this case is not significant.
Second, when SOAP is used to send messages between endpoints in a closed environment, such as between departments within the same company, it's likely that the endpoints will be running the same implementation of SOAP. In this case, there are opportunities for optimizations that are unique to that particular implementation. For example, a SOAP client could add an HTTP header tag to a SOAP request that indicates that it supports a particular optimization. If the SOAP server also supports that optimization, it could return an HTTP header tag in the first SOAP response that tells the client that it's okay to use that optimization in subsequent communications. At that point, both the client and server could start using the optimization.
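The handshake described above can be reduced to a small piece of server-side logic. Everything in this sketch is hypothetical, including the header name X-Soap-Optimizations and the optimization names; no SOAP implementation is claimed to use them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical negotiation of implementation-specific optimizations.
class NegotiationSketch {
    // Invented header name, used by both the request and the response.
    static final String HEADER = "X-Soap-Optimizations";

    // Returns the reply header value: the offered names the server also supports.
    static String serverReply(String offered, Set<String> serverSupports) {
        List<String> agreed = new ArrayList<>();
        if (offered != null) {
            for (String name : offered.split(",")) {
                String trimmed = name.trim();
                if (serverSupports.contains(trimmed)) {
                    agreed.add(trimmed);
                }
            }
        }
        return String.join(",", agreed);
    }

    public static void main(String[] args) {
        Set<String> supported = new HashSet<>(Arrays.asList("compact-tags"));
        // The client offers two optimizations; the server accepts the one it knows.
        String reply = serverReply("binary-floats, compact-tags", supported);
        System.out.println(HEADER + ": " + reply);
    }
}
```

A client that receives an empty or missing reply header would simply continue with standard SOAP, which keeps the scheme safe when the two endpoints run different implementations.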
In my experiments with Apache SOAP, I normally get about 30 round trip messages per second between Java programs on the same machine. I have been evaluating another SOAP implementation that gets about 700 round trip messages per second in the same configuration, so obviously there is a lot of room for improvement.
Once the initial excitement of building and deploying a simple service has worn off, many developers wish to start sending objects between clients and servers. There are at least two approaches for performing this task.
If you can guarantee that both the client and the server are configured with the same proprietary extensions, then you could pass objects between the machines using whatever approach you wanted. For example, if the client and server are both running the Java programming language, they could agree to send objects using Java serialization.
On the other hand, if you want your system to work with any combination of client and server, start off by specifying XML schemas for the objects that you wish to send. Then ensure that both client and server are configured to recognize XML documents conforming to these schemas and can convert them to or from objects as necessary. For example, if a C++ client wanted to send an object to a Java server, the client should be able to convert C++ objects to or from the schema, and the server should be able to convert Java objects to/from the same schema.
Apache SOAP 2.0 includes a flexible scheme for passing objects between SOAP engines by including a special mapping registry. You can register custom serializers/deserializers that are used to encode/decode data. The current documentation is pretty sparse on how to create your own serializers, but fortunately a useful default Java BeanSerializer is included that can serialize/deserialize any class conforming to the JavaBean specification -- it must include a public default constructor and declare public set/get methods for all its attributes.
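To give a feel for how a bean-style serializer can work, the sketch below uses plain reflection to turn the public get methods of an object into XML child elements. It illustrates the general technique only and is not the Apache BeanSerializer: the class BeanXmlSketch, the element naming, and the trimmed-down Invoice bean inside it are assumptions, and xsi:type attributes are omitted for brevity.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical bean-to-XML encoder driven by public getter methods.
class BeanXmlSketch {

    // Small JavaBean: public default constructor plus public set/get methods.
    public static class Invoice {
        private String name;
        private int amount;
        public Invoice() { }
        public Invoice(String name, int amount) { this.name = name; this.amount = amount; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAmount() { return amount; }
        public void setAmount(int amount) { this.amount = amount; }
    }

    // Writes one child element per getter, ordered by property name.
    static String toXml(String elementName, Object bean) {
        Map<String, Object> properties = new TreeMap<>();
        try {
            for (Method m : bean.getClass().getMethods()) {
                String n = m.getName();
                boolean getter = n.startsWith("get") && n.length() > 3
                        && m.getParameterCount() == 0 && !n.equals("getClass");
                if (getter) {
                    // getAmount -> amount
                    String property = Character.toLowerCase(n.charAt(3)) + n.substring(4);
                    properties.put(property, m.invoke(bean));
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot read bean", e);
        }
        StringBuilder xml = new StringBuilder("<" + elementName + ">");
        for (Map.Entry<String, Object> p : properties.entrySet()) {
            String text = String.valueOf(p.getValue()).replace("&", "&amp;").replace("<", "&lt;");
            xml.append("<").append(p.getKey()).append(">").append(text)
               .append("</").append(p.getKey()).append(">");
        }
        return xml.append("</").append(elementName).append(">").toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml("invoice", new Invoice("Acme", 100)));
    }
}
```

Deserialization would run the same idea in reverse: create the object with its default constructor, then call the matching set method for each child element, which is why the JavaBean conventions matter.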
The following program illustrates how you can send and receive objects. A Purchasing Web service is deployed on the Tomcat server and then the Client sends an Invoice object by value. The object is displayed when it arrives on the server and then redisplayed when it is returned back to the client. You can also pass the object by reference in SOAP, as I will explain in a future installment of this column.
The sample source code for this Purchasing Web service is given in Listings 4 through 7.
Listing 4. Code for IPurchasing.java
public interface IPurchasing
{
    Invoice receive( Invoice invoice );
}
Listing 5. Code for Purchasing.java
public class Purchasing implements IPurchasing
{
    public Invoice receive( Invoice invoice )
    {
        System.out.println( "got invoice " + invoice );
        return invoice;
    }
}
Listing 6. Code for Invoice.java
public class Invoice
{
    String name;
    int amount;

    public Invoice()
    {
    }

    public Invoice( String name, int amount )
    {
        this.name = name;
        this.amount = amount;
    }

    public String toString()
    {
        return "Invoice( " + name + ", " + amount + " )";
    }

    public void setName( String name )
    {
        this.name = name;
    }

    public String getName()
    {
        return name;
    }

    public void setAmount( int amount )
    {
        this.amount = amount;
    }

    public int getAmount()
    {
        return amount;
    }
}
Listing 7. Code for Client2.java
To run the program, perform the following steps:
- Copy the following Java files into a new directory called \demo2 and add \demo2 to your CLASSPATH.
- Compile the source files.
- Run the TCP tunneling GUI as before, listening to port 8070 and routing to/from port 8080.
- Start Tomcat in the \demo2 directory.
- Open a browser and enter the URL http://localhost:8080/soap
- Use the admin interface to deploy a Purchasing service. Enter the following values:
ID=urn:demo2:purchasing
Scope=Request
Methods=receive
Provider Type=Java
Provider Class=Purchasing
Static=No
Number of Mappings=1
Encoding Style=SOAP
Namespace URI=urn:my_encoding
Local Part=Invoice
Java Type=Invoice
Java To XML Serializer=org.apache.soap.encoding.soapenc.BeanSerializer
XML To Java Deserializer=org.apache.soap.encoding.soapenc.BeanSerializer
- Run the Client.
The Namespace URI and Local Part values are used by the Apache SOAP engine for the xsi:type attribute value. It is not particularly important what these values are, as long as they don't clash with any other mappings you use.
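One way to picture such a mapping is as a two-way table between a qualified XML name and a Java class. The sketch below is a simplified stand-in, not the Apache SOAP mapping registry: the class MappingRegistrySketch and its methods are hypothetical, and it records only the type pairing, not the serializer and deserializer.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;

// Hypothetical two-way table between xsi:type names and Java classes.
class MappingRegistrySketch {

    // Placeholder for the Invoice class from Listing 6.
    static class Invoice { }

    private final Map<QName, Class<?>> xmlToJava = new HashMap<>();
    private final Map<Class<?>, QName> javaToXml = new HashMap<>();

    // Registers a pairing and rejects a name that clashes with an earlier mapping.
    void map(QName xmlType, Class<?> javaType) {
        if (xmlToJava.containsKey(xmlType)) {
            throw new IllegalArgumentException(xmlType + " is already mapped");
        }
        xmlToJava.put(xmlType, javaType);
        javaToXml.put(javaType, xmlType);
    }

    // Used when decoding: which class does this xsi:type name stand for?
    Class<?> javaTypeFor(QName xmlType) {
        return xmlToJava.get(xmlType);
    }

    // Used when encoding: which xsi:type name should this class be written with?
    QName xmlTypeFor(Class<?> javaType) {
        return javaToXml.get(javaType);
    }

    public static void main(String[] args) {
        MappingRegistrySketch registry = new MappingRegistrySketch();
        // Namespace URI and Local Part as entered in the admin interface.
        registry.map(new QName("urn:my_encoding", "Invoice"), Invoice.class);
        System.out.println(registry.xmlTypeFor(Invoice.class));
    }
}
```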
Assuming all goes well, you should see the results in the TCP GUI in Figure 4, and the client output in Figure 5.
Figure 4: Running the Purchasing Web service
Figure 5: The Purchasing Web service client output window
Security is a complex topic, and the current efforts to secure SOAP seem to be taking a layered approach.
At the lowest level, SOAP messages can be passed over HTTPS. Since HTTPS uses SSL for its transport, the contents of the message are encrypted to prevent snooping, and the client and server can verify each other's identity. SOAP solutions based on SSL are available today, but are not particularly easy to install and configure.
Although HTTPS solves the problem of shielding messages from eavesdroppers, it doesn't help much with the kind of finer grain security that is necessary to authenticate particular users of specific Web services. Many Web services will require a user to obtain some kind of user/password combination during initial registration and then use this authentication information when accessing the Web service in future. The UDDI Web services registries that are hosted online by IBM, Microsoft, and Ariba are examples of Web services that require a user to register before being able to use the publish service. There is no standard yet for how a Web service supports registration and authentication, but such a standard will no doubt emerge soon. Microsoft and Verisign recently announced their work on a standard called XML Key Management Specification (XKMS) that they will submit to the standards bodies for possible incorporation into SOAP and other systems (see Resources).
I intend to devote more time to the topic of security in a future installment of this column, by which time the security standards will hopefully have progressed.
In the next installment, I'm going to introduce WSDL (Web Services Description Language) and UDDI (Universal Description, Discovery and Integration), two important standards that will help move Web services technology into the mainstream.
- In his first installment, Graham examines the benefits and challenges of building Web service applications to enable peer-to-peer distributed networks.
- In the second installment, Graham provided a step by step explanation of how to develop a Web service, including the tools you need, how to install them, how to write the code, and how to deploy the service.
- Visit http://www.w3.org/TR/xmlschema-0/ for a primer on XSI and XSD schemas.
- Visit http://www.w3.org/TR/SOAP/ for the SOAP 1.1 specification.
- Read the Microsoft/Verisign announcement.
- http://www.uddi.org/ is the main UDDI site.
Graham Glass is founder, CEO, and Chief Architect of The Mind Electric, which designs, builds and licenses forward-thinking distributed computing infrastructure. He believes that the evolution of the Internet will mirror that of a biological mind, and that architectures for helping people and businesses to network effectively will provide insight into those that wire together the human brain. He can be reached at graham-glass@mindspring.com.