In the first article in this series, you saw how SOAP maps data types to XML, and learned how to use the serializers and deserializers (hereafter referred to collectively as (de)serializers) included in the Apache SOAP toolkit. In this installment, I'll walk you through a cookbook that will show you how to write your own (de)serializers. I would advise you to have the sources of some of the base (de)serializers available for reference. You may also want to reread the "Type Mapping Pattern" section in Part 1 to refresh your memory on how type mappings are resolved internally.
Once I've finished with the cookbook, I will present a simple application that implements schema-constrained SOAP. The application will describe an interaction in which a purchase order document that's purposely noncompliant with Section 4 encoding is sent using SOAP.
Root and normal (de)serializers
If none of the (de)serializers bundled with the Apache SOAP toolkit will work with your Java class, you may need to write a custom (de)serializer yourself. First, you need to make the distinction between what I call root (de)serializers and normal (de)serializers. The initial bootstrapping of the serialization and deserialization of RPC parameters and responses is handled by the root (de)serializers. The three root (de)serializers in Apache SOAP are listed in Table 1.
Table 1. Root (de)serializers in Apache SOAP
| encodingStyle | (de)serializer |
| Section 5 | ParameterSerializer |
| Literal XML | XMLParameterSerializer |
| XMI | XMIParameterSerializer |
The appropriate root (de)serializer is dispatched based on two things: the encodingStyle, and either the class Parameter (for serialization) or the QName SOAP-ENV:Parameter (for deserialization).
To see how the actual dispatching takes place, you need to understand the chain
of events that leads to serialization of a Java type at the client side.
Call call = new Call();
...
resp = call.invoke(url, ""); //1
When the invoke method of the Call class is called
at line 1 above, the Call class iterates through its associated
parameters.
org.apache.soap.rpc.RPCMessage:
...
Serializer ser = xjmr.querySerializer(Parameter.class, actualParamEncStyle); //2
ser.marshall(...);
...
For each parameter, the type mapping registry is queried and the marshall() method
subsequently called on the returned serializer. The returned serializer in line 2
is a root serializer.
Let us now turn our attention to the deserialization process at the Web service.
On the server side, the listener for SOAP RPC messages is implemented as a
servlet. The doPost method retrieves the SOAP request
and attempts to reconstruct the Call object from it.
org.apache.soap.rpc.RPCMessage:
...
Bean paramBean = smr.unmarshall(actualParamEncStyle, RPCConstants.Q_ELEM_PARAMETER ...); //3
Parameter param = (Parameter)paramBean.value;
...
In line 3, RPCConstants.Q_ELEM_PARAMETER resolves to SOAP-ENV:Parameter. It is here at
line 3 that the dispatching to the root deserializer occurs, when the unmarshall() method is
called.
The root (de)serializer in turn will query the type mapping registry for the next normal (de)serializer to be called. This call stack will start to unravel only when the registry returns (de)serializers for primitive Java types (during serialization) or when the XML elements are purely containers for simple types (during deserialization).
The bulk of the supplied normal (de)serializers -- which can be found in the
package org.apache.soap.encoding.soapenc -- work on Section 5 encoding, as do most of the helper classes in Apache SOAP. This is the reason why I
will be focusing the cookbook solely on Section 5-encoding (de)serializers.
Figure 1. APIs for writing (de)serializers

A serializer implements org.apache.soap.util.xml.Serializer
and realizes a single method:
void marshall(
java.lang.String inScopeEncStyle,
java.lang.Class javaType,
java.lang.Object src,
java.lang.Object context,
java.io.Writer sink,
NSStack nsStack,
XMLJavaMappingRegistry xjmr,
SOAPContext ctx)
throws java.lang.IllegalArgumentException,
java.io.IOException
Let us investigate each of marshall()'s parameters in turn:
- inScopeEncStyle: This represents the encodingStyle URI as specified in the enclosing Call or Response object.
- javaType: This is the run-time type of the object that is to be serialized.
- src: This is a reference to the Java object to be serialized.
- context: A String denoting the accessor name. If this serializer is invoked by the ParameterSerializer, then the context value is equivalent to the named property in the Parameter class (declared on the SOAP client) or hardcoded to return if this is a SOAP server. It must be non-null.
- sink: The destination sink to which the SOAP XML instance will be written.
- nsStack: A data structure that implements a stack of namespace declarations that are currently in scope.
- xjmr: This is the type mapping registry (the smr), which you'll use to query for the next serializer based on the Java type. You will also invoke the xjmr marshall() method to delegate to other serializers based on javaType and encodingStyle -- you'd do this for compound structures like Hashtable or Vector, for instance.
- ctx: This is used to pass in things like javax.servlet.http.HttpServletRequest and javax.servlet.http.HttpSession from the servlet context.
The cookbook for the marshall() method is as follows:
Step 1: Create new namespace scope
nsStack.pushScope();
Use the NSStack class to track the scope of XML namespace
declarations. Later on in the method, NSStack can be used to
add a new namespace and search through the stack for the prefix
given a URI.
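To make the idea of scoped namespace declarations concrete, here is a minimal sketch of such a stack written against the Java standard library only. The class ScopedNamespaces and its methods declare() and findPrefix() are hypothetical names chosen for this illustration; they are not the API of Apache SOAP's NSStack, and the generated "nsN" prefixes are an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a scoped namespace stack; not the Apache SOAP NSStack class. */
final class ScopedNamespaces {
    // Each frame maps namespace URI -> prefix for one element scope.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();
    private int nextPrefix = 1;

    /** Opens a new scope, typically on entry to a serializer. */
    void pushScope() {
        scopes.push(new LinkedHashMap<>());
    }

    /** Discards the innermost scope and every declaration made in it. */
    void popScope() {
        scopes.pop();
    }

    /** Returns the prefix bound to uri in any enclosing scope, or null if none. */
    String findPrefix(String uri) {
        for (Map<String, String> scope : scopes) { // iterates innermost scope first
            String prefix = scope.get(uri);
            if (prefix != null) {
                return prefix;
            }
        }
        return null;
    }

    /** Returns the prefix already in scope for uri, or binds a generated one in the current scope. */
    String declare(String uri) {
        if (scopes.isEmpty()) {
            throw new IllegalStateException("pushScope() must be called first");
        }
        String existing = findPrefix(uri);
        if (existing != null) {
            return existing;
        }
        String prefix = "ns" + nextPrefix++;
        scopes.peek().put(uri, prefix);
        return prefix;
    }
}
```

A declaration made in an inner scope disappears when that scope is popped, which is why the cookbook pairs every pushScope() with a popScope() at the end of the method.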
Step 2: Check constraints on object argument
Two conditions need to be satisfied in order for
serialization to happen: the serializer must be given a
supported type and the object to be serialized must be non-null. The following code snippet demonstrates
how these constraints are enforced in VectorSerializer.
if ( (src != null) &&
!(src instanceof Vector) &&
!(src instanceof Enumeration))
throw new IllegalArgumentException("Tried to pass a '" +
src.getClass().toString() + "' to VectorSerializer");
Several built-in serializers actually compare the javaType parameter
with the expected type, like this:
if(!javaType.equals(Foo.class)) ...
I wouldn't recommend using this technique, however, as it is susceptible to the impostor type bug (see Resources).
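Independent of the bug pattern cited above, one practical difference between the two styles of check is how they treat subclasses. The following sketch, which uses only standard library classes, contrasts an exact-class comparison with an assignability check; the class and method names are made up for this illustration.

```java
import java.util.Stack;
import java.util.Vector;

final class TypeCheckDemo {
    /** Exact-class comparison: rejects every subclass of Vector. */
    static boolean acceptsByEquals(Class<?> javaType) {
        return javaType.equals(Vector.class);
    }

    /** Assignability check: accepts Vector and any subclass of it. */
    static boolean acceptsByAssignability(Class<?> javaType) {
        return Vector.class.isAssignableFrom(javaType);
    }

    public static void main(String[] args) {
        Object src = new Stack<String>(); // java.util.Stack extends Vector
        System.out.println(acceptsByEquals(src.getClass()));        // prints false
        System.out.println(acceptsByAssignability(src.getClass())); // prints true
        System.out.println(src instanceof Vector);                  // prints true
    }
}
```

With the equality check, a serializer written for Vector would refuse a Stack even though it could serialize it without any changes.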
Step 3: Generate a null accessor
If the object argument is null, you need to generate a null accessor for the type.
SoapEncUtils.generateNullStructure(inScopeEncStyle,
Serializing the object into a Section 5-compliant SOAP XML document involves three steps: generating the opening element for the accessor, serializing the value of the object, and closing the element.
The first step is easily achieved by calling the following utility method:
SoapEncUtils.generateStructureHeader(inScopeEncStyle,
javaType,
context,
sink,
nsStack,xjmr);
This call uses queryElementType to find the QName mapped to javaType
and writes an opening tag of the following form:
<context xsi:type="QName">
If the object argument, src, is a simple type, then the second step, serializing the value of the object, is
a simple matter of calling the src.toString() method
and writing that out. Otherwise, you will need to identify the constituent
parts of the object and individually pass them to more primitive serializers.
If you investigate the source for the built-in serializers, you'll notice
that these constituent parts can be identified in many ways:
- Java reflection (for example, BeanSerializer)
- Iterating through a List data structure (for example, ArraySerializer)
- Direct access via a priori knowledge of the class (that is, you know in advance that your serializer only works for one specific class)
Having identified the other serializers, you can delegate to them by calling:
xjmr.marshall(inScopeEncStyle,
componentType,
componentValue,
accessorName,
sink, nsStack, ctx);
Here, componentType and componentValue are representative of the run-time
type and object reference, respectively, for any constituent parts of the
original src parameter. The marshall() method actually calls querySerializer to retrieve the associated serializer and subsequently calls the marshall() method of the associated serializer. Obviously, this will only work if you've registered the serializers for all components in the type mapping registry.
The last step, closing the element, is completed by simply writing out the closing tag for the accessor.
sink.write("</" + context + '>');
Finally, you must clean up after yourself by leaving the current namespace scope.
nsStack.popScope();
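The serialization steps can be seen end to end in the following simplified sketch. It is not an Apache SOAP Serializer: the Point class, the "ns1:point" type name, and the reduced marshall() signature are all assumptions made for this illustration, namespace scoping is left out, and the xsi, xsd, and ns1 prefixes are assumed to be declared on an enclosing element.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical value class used only for this illustration. */
final class Point {
    final int x;
    final int y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

/** Simplified sketch; it does not implement the Apache SOAP Serializer interface. */
final class PointSerializerSketch {

    static void marshall(Object src, String context, Writer sink) throws IOException {
        // Check constraints: reject unsupported types, but let null through.
        if (src != null && !(src instanceof Point)) {
            throw new IllegalArgumentException(
                "Tried to pass a '" + src.getClass().getName() + "' to PointSerializerSketch");
        }
        // Null accessor: an empty element flagged as null.
        if (src == null) {
            sink.write("<" + context + " xsi:type=\"ns1:point\" xsi:null=\"true\"/>");
            return;
        }
        Point p = (Point) src;
        // Opening element for the accessor.
        sink.write("<" + context + " xsi:type=\"ns1:point\">");
        // Constituent parts, written as simple-typed child accessors.
        writeInt("x", p.x, sink);
        writeInt("y", p.y, sink);
        // Closing element for the accessor.
        sink.write("</" + context + ">");
    }

    private static void writeInt(String name, int value, Writer sink) throws IOException {
        sink.write("<" + name + " xsi:type=\"xsd:int\">" + value + "</" + name + ">");
    }

    /** Convenience wrapper that serializes into a String. */
    static String toXml(Object src, String context) {
        StringWriter out = new StringWriter();
        try {
            marshall(src, context, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml(new Point(3, 4), "origin"));
    }
}
```

In a real serializer the two writeInt() calls would instead be delegated through the type mapping registry, so that each constituent part is written by whichever serializer is registered for its type.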
A deserializer implements org.apache.soap.util.xml.Deserializer
and realizes a single method:
Bean unmarshall(
java.lang.String inScopeEncStyle,
QName elementType,
org.w3c.dom.Node src,
XMLJavaMappingRegistry xjmr,
SOAPContext ctx)
The purpose of the unmarshall() method is to reconstruct the parameters as Java objects.
In order to do that, you need to process the XML fragment
contained by the src DOM node. The preferred programming model
to achieve this is to use the DOM wrapper methods in org.apache.soap.util.xml.DOMUtils in
conjunction with the type mapping registry. In general, DOMUtils is
deserialization's counterpart to serialization's SoapEncUtils.
It is important to note that the XML contained in src is
guaranteed to be free of multireference values. All multireference hrefs
have been resolved back to the actual value by the root deserializer,
ParameterSerializer.
Thus, the deserialization cookbook is as follows:
It is advisable to check for the nullability attribute, like so:
Element root = (Element)src;
if (SoapEncUtils.isNull(root))
{
return new Bean(Your.class, null);
}
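For readers who want to see what such a nullability check involves at the DOM level, here is a rough sketch using only the JDK's XML APIs. It is not the implementation of SoapEncUtils.isNull(); the class name is hypothetical, and the sketch assumes the 2001 XML Schema instance namespace and accepts either a null or a nil attribute.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class NullAccessorCheck {
    // Assumption: only the 2001 schema-instance namespace is considered.
    static final String XSI_2001 = "http://www.w3.org/2001/XMLSchema-instance";

    /** True if the element carries xsi:null or xsi:nil with the value "true" or "1". */
    static boolean isNull(Element el) {
        for (String attr : new String[] {"null", "nil"}) {
            String value = el.getAttributeNS(XSI_2001, attr); // "" when the attribute is absent
            if ("true".equals(value) || "1".equals(value)) {
                return true;
            }
        }
        return false;
    }

    /** Parses a small XML string with a namespace-aware parser and returns its root element. */
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

Calling isNull() on an element such as the null purchase order shown later in this article would return true, provided the xsi prefix is bound to the namespace assumed above.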
Step 2: Reconstruct the Java object
The process of reconstructing a Java object varies depending on the category its data type falls into. (For more on these type categories, see Part 1.)
Simple type
If you're deserializing a simple type,
just use DOMUtils.getChildCharacterData(Element)
to retrieve the string value of src and optionally preprocess it
(for example, map the string "NaN" to Float.NaN in
FloatDeserializer) before using it to initialize the object
that's to be returned.
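The preprocessing step for a simple type might look like the sketch below. It starts from the string already extracted from the element and maps the XML Schema special literals for float (NaN, INF, and -INF) onto their Java counterparts. The class name is hypothetical, and this is not the source of FloatDeserializer.

```java
final class FloatTextSketch {
    /** Maps XML Schema special float literals onto Java values, then parses everything else. */
    static float parse(String text) {
        String t = text.trim();
        switch (t) {
            case "NaN":
                return Float.NaN;
            case "INF":
                return Float.POSITIVE_INFINITY;
            case "-INF":
                return Float.NEGATIVE_INFINITY;
            default:
                return Float.parseFloat(t); // throws NumberFormatException on malformed input
        }
    }
}
```

The explicit mapping matters for the infinities in particular, because Float.parseFloat() expects the spelling "Infinity" rather than the schema literal "INF".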
Compound type
Compound types fall into two major categories. The first comprises types
with a homogeneous structure of repeating elements; examples include Java arrays and classes implementing java.util.List and
java.util.Map. The other category is representative of all other Java
classes that exhibit arbitrary structure. The deserialization process, then,
boils down to the navigation of the XML structure to identify relevant
descendant elements and the subsequent delegation of deserialization
responsibilities to more primitive deserializers, as follows:
- Navigating the DOM. If you're dealing with a compound type from the first category, you may use DOMUtils.getFirstChildElement() and DOMUtils.getNextSiblingElement() to navigate through all its repeating members. Otherwise, use the DOM API to identify the elements that represent member properties.
- Delegate deserialization to other deserializers. First, you must extract the SOAP type:
  QName soapType = SoapEncUtils.getTypeQName(rootElement);
  Next, delegate to more primitive deserializers:
  xjmr.unmarshall(inScopeEncStyle, soapType, rootElement, ctx);
  xjmr.unmarshall() internally calls queryDeserializer and then invokes unmarshall() on the returned deserializer. The two steps above are better collapsed into one by delegating deserialization to ParameterSerializer. This is done because, in situations where the xsi:type attribute is missing, we would like to invoke xjmr.unmarshall() with the soapType set to the QName {""}/X, where X is the root element's tagName. Since the code to achieve this is already conveniently packaged inside ParameterSerializer.unmarshall(), the shortened version of the process becomes:
  Bean paramBean = xjmr.unmarshall(inScopeEncStyle, RPCConstants.Q_ELEM_PARAMETER, rootElement, ctx);
- Initialize the target object. The target object is the object instance you're reconstructing. As member properties get deserialized, you can restore their values by invoking the mutator methods on your target object, as follows:
  Foo foo = new Foo();
  foo.setS( paramBean.value );
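The navigate-and-delegate pattern described in the list above can be sketched with the JDK's DOM API alone. Everything named here is hypothetical: the CONVERTERS map stands in for the type mapping registry, the lookup key is the literal xsi:type string rather than a resolved QName, and the fallback to the element name mimics, in a simplified way, what happens when xsi:type is missing.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class CompoundWalkSketch {
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    // Hypothetical registry: type name (or element name) -> converter from text.
    static final Map<String, Function<String, Object>> CONVERTERS = new HashMap<>();
    static {
        CONVERTERS.put("xsd:string", s -> s);
        CONVERTERS.put("xsd:int", s -> Integer.valueOf(s));
        CONVERTERS.put("quantity", s -> Integer.valueOf(s)); // used when xsi:type is absent
    }

    /** Walks the child elements of parent and converts each one to a Java value. */
    static Map<String, Object> unmarshallChildren(Element parent) {
        Map<String, Object> props = new LinkedHashMap<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes and comments
            }
            Element child = (Element) n;
            String key = child.getAttributeNS(XSI, "type");
            if (key.isEmpty()) {
                key = child.getLocalName(); // fall back to the element name
            }
            Function<String, Object> converter = CONVERTERS.get(key);
            if (converter == null) {
                throw new IllegalArgumentException("No deserializer registered for " + key);
            }
            props.put(child.getLocalName(), converter.apply(child.getTextContent().trim()));
        }
        return props;
    }

    /** Parses a small XML string with a namespace-aware parser and returns its root element. */
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

The resulting map plays the role of the deserialized member properties; a real deserializer would pass each value to the corresponding mutator on the target object instead.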
Step 3: Return the reconstructed object
The Bean class encapsulates the run-time type
and the actual returned instance. The deserializer knows
what class it should be returning because in most cases it has been tailored for a specific class. For generic deserializers like
BeanSerializer and ArraySerializer, the javaType property in the type mapping
conveys the type to be returned:
return(new Bean(Foo.class, foo));
Registering root (de)serializers
I've already mentioned that if you intend to introduce custom encodingStyles,
then you must write root (de)serializers. Root (de)serializers are implemented the same way as normal (de)serializers
except for one small difference: all root (de)serializers
are registered into the type mapping registry with a specially designated QName and
Java type that will tell Apache SOAP to bootstrap the (de)serialization
process based on the encodingStyle property. In the sample code below,
take note of the highlighted values, which you must use when registering root (de)serializers.
[Client]
smr.mapTypes(customEncURI,
RPCConstants.Q_ELEM_PARAMETER,
Parameter.class,
customSerializer,
null);
[Server]
<isd:map encodingStyle="customEncURI"
xmlns:x="http://schemas.xmlsoap.org/soap/envelope/" qname="x:Parameter"
javaType="org.apache.soap.rpc.Parameter"
java2XMLClassName="foo.customSerializer" />
In this section, I'll walk you through an alternative solution
to BeanSerializer for (de)serializing complex types. This technique,
which I'll call schema-constrained SOAP, uses an XML Schema to
describe the literal XML structure of the RPC parameter(s). Here, we're
agreeing to interoperate strictly on the format of the message, without
caring about the data model on the client and server. To avoid confusion,
it should be noted that the RPC invocation is still encoded using Section 5,
but the parameter(s) are not.
I'll illustrate this technique with an example application; you can download the full code from Resources below. A client sends a purchase order to a Web service, and the service responds with an acknowledgement string. The method signature exported by the Web service is thus:
public String eatPo (PurchaseOrder p);
In order for this technique to work, we need an XML/object data binding framework. For this example, I chose to use Exolab's Castor toolkit. (See the Resources section below for links to Castor and a list of other serialization frameworks, like JSX, JAXB, and Schema2Java.)
The steps for this technique are as follows:
- Agree on the XML format for PurchaseOrder.
- Generate the Java classes using Castor.
- Write a custom (de)serializer.
- Write type mappings for the client and server pieces.
Step 1: Agree on the XML format for PurchaseOrder
For this use case, I removed the order details section from my PurchaseOrder
schema to keep things simple. Also, note that the PONumber attribute makes this
schema noncompliant with Section 5 encoding.
Figure 2. PurchaseOrder.xsd

Step 2: Generate the Java classes using Castor
Run Castor's SourceGenerator command-line tool to generate Java classes
that implement the schema in PurchaseOrder.xsd:
java org.exolab.castor.builder.SourceGenerator
-i PurchaseOrder.xsd
-package com.raverun.po.castor
The SourceGenerator tool only recognizes the latest schema namespace -- http://www.w3.org/2001/XMLSchema.
Next, compile the set of Java classes. Note that you'll need to use the -deprecation option because the
generated files use SAX 1.0 APIs. To circumvent this manual compilation,
Exolab is working on an Ant taskdef to automate it.
Step 3: Write a custom (de)serializer
You will now implement the (de)serialization methods in PurchaseOrderSerializer by
utilizing the counterpart methods exposed by the PurchaseOrder class. For serialization,
PurchaseOrder can marshal to a java.io.Writer or an
org.xml.sax.DocumentHandler. As shown in Listing 1, you delegate the
serialization to PurchaseOrder's marshal() method. One caveat: the XML
stream generated by the marshal() method contains the XML prolog. PurchaseOrderSerializer strips off this prolog
by wrapping sink with FilterXmlProlog, a java.io.FilterWriter. Listing 2 contains some exceptional cases that might arise during the deserialization process.
Listing 1. Extract from marshall() method in PurchaseOrderSerializer
----o<---------
SoapEncUtils.generateStructureHeader(inScopeEncStyle,
javaType,
context,
sink,
nsStack,
xjmr);
PurchaseOrder po = (PurchaseOrder)src;
try{
po.marshal( new FilterXmlProlog(sink) );
}catch(Exception e){
throw (new java.io.IOException("Castor: Error marshalling"));
}
sink.write( StringUtils.lineSeparator );
sink.write("</" + context + '>');
----o<---------
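The source of FilterXmlProlog is not reproduced in this article, so the following is only a sketch of how a prolog-stripping FilterWriter could work. The class name PrologStrippingWriter is hypothetical, and the sketch assumes that the XML declaration, if present, starts with "<?xml " at the very beginning of the stream.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical prolog-stripping writer; not the article's FilterXmlProlog class. */
final class PrologStrippingWriter extends FilterWriter {
    private static final String PROLOG_START = "<?xml ";
    private final StringBuilder head = new StringBuilder(); // buffered start of the stream
    private boolean decided = false; // true once the prolog has been skipped or ruled out

    PrologStrippingWriter(Writer out) {
        super(out);
    }

    @Override
    public void write(int c) throws IOException {
        if (decided) {
            out.write(c);
            return;
        }
        head.append((char) c);
        settle();
    }

    @Override
    public void write(char[] buf, int off, int len) throws IOException {
        for (int i = 0; i < len; i++) {
            write(buf[off + i]);
        }
    }

    @Override
    public void write(String s, int off, int len) throws IOException {
        for (int i = 0; i < len; i++) {
            write(s.charAt(off + i));
        }
    }

    /** Decides, as soon as enough characters are buffered, whether a prolog must be dropped. */
    private void settle() throws IOException {
        String h = head.toString();
        if (h.length() < PROLOG_START.length()) {
            if (PROLOG_START.startsWith(h)) {
                return; // could still turn out to be a prolog
            }
        } else if (h.startsWith(PROLOG_START)) {
            int end = h.indexOf("?>");
            if (end < 0) {
                return; // prolog not finished yet
            }
            h = h.substring(end + 2); // drop the prolog, keep what follows
        }
        decided = true;
        head.setLength(0);
        out.write(h);
    }

    @Override
    public void close() throws IOException {
        // Emit a short buffered head that never became a prolog.
        if (!decided && !head.toString().startsWith(PROLOG_START)) {
            out.write(head.toString());
        }
        super.close();
    }

    /** Convenience wrapper for trying the filter on a String. */
    static String strip(String xml) {
        StringWriter sink = new StringWriter();
        try (Writer w = new PrologStrippingWriter(sink)) {
            w.write(xml);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

Note that the strip() helper closes the wrapper, which also closes the underlying writer; inside a serializer the sink belongs to the caller and should be left open.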
Listing 2. Exceptional cases during deserialization in PurchaseOrderSerializer
(b1) Null PO.
---------------------------
<po
xmlns:ns2="urn:raverun"
xsi:type="ns2:po"
xsi:null="true"/>
---------------------------
(b2) Non null but nothing submitted in the body.
---------------------------
<po
xmlns:ns2="urn:raverun"
xsi:type="ns2:po" />
---------------------------
(b3) PO that violates the schema.
---------------------------
<po
xmlns:ns2="urn:raverun"
xsi:type="ns2:po">
<foo bar="123"/>
</po>
---------------------------
Step 4: Write type mappings for the client and server pieces
Lastly, you need to declare the type mappings to reference your custom (de)serializer.
It might surprise you to see Section 5 specified as the encoding for PurchaseOrder.
This is done for convenience's sake, as it grants you the ability to use ParameterSerializer
to bootstrap the deserialization process and also to use SoapEncUtils in the
serialization code.
[Client]
SOAPMappingRegistry smr = new SOAPMappingRegistry();
smr.mapTypes(Constants.NS_URI_SOAP_ENC,
new QName("urn:raverun", "po"),
PurchaseOrder.class, pos, null);
[Server]
<isd:map
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:x="urn:raverun" qname="x:po"
javaType="com.raverun.po.castor.PurchaseOrder"
xml2JavaClassName="com.raverun.po.PurchaseOrderSerializer" />
Potential problems with this solution
You should keep the following issues in mind while examining the schema-constrained SOAP example:
- In order to be standards compliant, you must turn off any claims about Section 5 encoding in the <po> element. (See Listing 3 for a more compliant SOAP XML instance.) The SOAP 1.1 specification (see Resources) describes this requirement as follows:
  A value of the zero-length URI ("") explicitly indicates that no claims are made for the encoding style of contained elements.
  An alternative to the null encodingStyle is to introduce a custom encodingStyle URI, tailored to your communication needs.
- There are some bugs to watch out for in Castor, but all have workarounds. If you're using a version of Castor older than 0.9.3, schema validation does not work as expected. The solution is to upgrade to the latest release. On the other hand, Castor 0.9.3 (the version I used) generates a spurious message to the standard output stream. The message I encountered:
  Warning : preserved is a bad entry for the whiteSpace value.
  The latest version of Castor, 0.9.3.9, suppresses this warning.
- PurchaseOrderSerializer does not serialize to multireference values. However, it will deserialize them correctly. This is not a feature of PurchaseOrderSerializer per se, but of ParameterSerializer.
Listing 3. A more compliant SOAP instance
<ns1:eatPo
xmlns:ns1="urn:poservice"
SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<po
xmlns:ns2="urn:raverun"
xsi:type="ns2:po"
SOAP-ENV:encodingStyle="">
<purchaseOrder xmlns="http://www.example.com/PO1">
<header PONumber="9999-1212">
<Date>2001-09-25T14:40:13.453</Date>
</header>
...
</purchaseOrder>
</po>
</ns1:eatPo>
The last official release of Apache SOAP (version 2.2) came out in May 2001. Although
development focus has shifted to Axis (currently at beta 1), bug fixes are
continually being added. While we await a 2.3 release (if there is one), users of the
official release should be aware that there have been major updates in the codebase,
especially in SOAPMappingRegistry and its related
classes. Existing code may need some changes to interoperate with the fixes.
Here is the list of notable changes:
- Schema namespaces now reference the 2001 recommendation namespace by default. Version 2.2 referenced the 1999 namespace.
- As a corollary, if you instantiate SOAPMappingRegistry with its no-arg constructor, a 2001 namespace-aware instance is returned.
- Instance creation for SOAPMappingRegistry has been redesigned according to the static factory pattern. Thus, you now should use the factory method getBaseRegistry(schemaURI) instead of the overloaded constructor SOAPMappingRegistry(schemaURI):
  public static SOAPMappingRegistry getBaseRegistry (String schemaURI);
- Version 2.2 offers the ability to chain registries. These methods were recently added:
  public SOAPMappingRegistry(SOAPMappingRegistry parent);
  public SOAPMappingRegistry(SOAPMappingRegistry parent, String schemaURI);
  public SOAPMappingRegistry getParent()
  public String getSchemaURI()
  The resolution of type mappings will percolate up the chain until a match is found.
- The DeploymentDescriptor class treats the qname attribute as optional in type mapping declarations.
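The way a lookup percolates up a chain of registries can be illustrated with a small standalone sketch. ChainedRegistry is a hypothetical class, not SOAPMappingRegistry; it keys mappings on encodingStyle plus Java type, as the type mapping registry does, but stores serializer names as plain strings to stay short.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical chained registry; not the Apache SOAP SOAPMappingRegistry class. */
final class ChainedRegistry {
    private final ChainedRegistry parent; // null for the root of the chain
    private final Map<String, String> serializers = new HashMap<>();

    ChainedRegistry(ChainedRegistry parent) {
        this.parent = parent;
    }

    private static String key(String encodingStyle, Class<?> javaType) {
        return encodingStyle + "|" + javaType.getName();
    }

    /** Registers a serializer name for the given encodingStyle and Java type in this registry only. */
    void map(String encodingStyle, Class<?> javaType, String serializerName) {
        serializers.put(key(encodingStyle, javaType), serializerName);
    }

    /** Looks in this registry first, then walks up the parent chain until a match is found. */
    String query(String encodingStyle, Class<?> javaType) {
        for (ChainedRegistry r = this; r != null; r = r.parent) {
            String hit = r.serializers.get(key(encodingStyle, javaType));
            if (hit != null) {
                return hit;
            }
        }
        throw new IllegalArgumentException(
            "No serializer for " + javaType.getName() + " with encodingStyle " + encodingStyle);
    }
}
```

A child registry can therefore override a single mapping while inheriting all the others from its parent.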
I hope that the examples in this article have made clear the theoretical concepts outlined in the first article in this series. If Web services operating across many machines on the network are to become a widespread reality, developers must understand how programmatic objects are transmitted from one machine to another. A better understanding of SOAP's type mapping abilities should help you build better distributed applications and services.
Resources
- Apache SOAP is an open source SOAP toolkit for the Java programming language.
- Apache Axis is the next-generation toolkit that will supersede Apache SOAP.
- Brendan Macmillan's Java Serialization to XML (JSX) toolkit contains some useful tools.
- Download an early access release of Sun's Java Architecture for XML Binding (JAXB).
- Castor is an open source data binding framework for Java technology.
- Schema2Java is a commercial XML data binding product from Creative Science Systems.
- Download the sample code that accompanies this article as PurchaseOrder.zip or PurchaseOrder.tar.gz.
- To learn more about XMI (XML Metadata Interchange), check out the IBM XMI Framework.
- The SOAP 1.1 specification is available as a W3C Technical Note.
- This post by Sanjiva Weerawarana describes the loosening of the xsi:type requirement in Apache SOAP.
- Apache SOAP's interoperability with other toolkits is covered in "The Web services insider, Part 3" by James Snell (developerWorks, May 2001).
- You can attend an IBM workshop on SOAP and Java technology.
- "SOAP Messages with Attachments," a W3C note, describes how SOAP is embedded within MIME.
- Eric E. Allen described the Impostor type bug pattern in his Diagnosing Java Code column for developerWorks.
- The W3C XML Schema, Part 1, describes the core XML Schema concepts and syntax.
- The W3C XML Schema, Part 2, describes the data types supported in the XML Schema.
- Apache SOAP bug report #2865 describes the nonsupport for the char primitive.
- Apache SOAP bug report #3000 describes the timeInstant bug.
- Apache SOAP bug report #2388 describes the use of old XML Schema namespace URIs.
- Apache SOAP bug report #2470 describes the HashtableSerializer problem of mixing SOAP and literal XML encoding styles.
Gavin Bong is a Java developer from Kuala Lumpur, Malaysia. His areas of interest include service-oriented architectures and wireless Java. You can contact Gavin at gavinb@eutama.com.