Apache SOAP type mapping, Part 2: A serialization cookbook

Follow these steps to write your own custom serializers and deserializers

Gavin Bong (gavinb@eutama.com), Software Engineer, eUtama Sdn. Bhd.
Gavin Bong is a Java developer from Kuala Lumpur, Malaysia. His areas of interest include service-oriented architectures and wireless Java. You can contact Gavin at gavinb@eutama.com.

Summary:  SOAP specifies an encoding to represent common types found in databases, programming languages (for example, the Java programming language), and data repositories. The Apache SOAP toolkit supports this encoding by supplying a base set of (de)serializers: classes that do the grunt work of mapping Java types to serialized XML representations. Part 1 of this two-part series explored the use of these (de)serializers. Here in Part 2, Gavin Bong shows you how to write your own (de)serializers when none from the toolkit suit your needs. He also provides an example application that demonstrates many of the concepts explored in this series.

Date:  01 Mar 2002
Level:  Introductory

In the first article in this series, you saw how SOAP maps data types to XML, and learned how to use the serializers and deserializers (hereafter referred to collectively as (de)serializers) included in the Apache SOAP toolkit. In this installment, I'll walk you through a cookbook that will show you how to write your own (de)serializers. I would advise you to have the sources of some of the base (de)serializers available for reference. You may also want to reread the "Type Mapping Pattern" section in Part 1 to refresh your memory on how type mappings are resolved internally.

Once I've finished with the cookbook, I will present a simple application that implements schema-constrained SOAP. The application will describe an interaction in which a purchase order document that's purposely noncompliant with Section 4 encoding is sent using SOAP.

Root and normal (de)serializers

If none of the (de)serializers bundled with the Apache SOAP toolkit will work with your Java class, you may need to write a custom (de)serializer yourself. First, you need to make the distinction between what I call root (de)serializers and normal (de)serializers. The initial bootstrapping of the serialization and deserialization of RPC parameters and responses is handled by the root (de)serializers. The three root (de)serializers in Apache SOAP are listed in Table 1.

Table 1. Root (de)serializers

encodingStyle    (de)serializer
Section 5        ParameterSerializer
Literal XML      XMLParameterSerializer
XMI              XMIParameterSerializer

The appropriate root (de)serializer is dispatched based on two things: encodingStyle and the class Parameter (for serialization) or the QName SOAP-ENV:Parameter (for deserialization). To see how the actual dispatching takes place, you need to understand the chain of events that leads to serialization of a Java type at the client side.

  Call call = new Call();
  ...
  resp = call.invoke(url, ""); //1 

When the invoke method of the Call class is called at line 1 above, the Call class iterates through its associated parameters.

org.apache.soap.rpc.RPCMessage:
  ...
  Serializer ser = xjmr.querySerializer(Parameter.class, actualParamEncStyle); //2
  ser.marshall(...); 
  ...

For each parameter, the type mapping registry is queried and the marshall() method subsequently called on the returned serializer. The returned serializer in line 2 is a root serializer.

Let us now turn our attention to the deserialization process at the Web service. On the server side, the listener for SOAP RPC messages is implemented as a servlet. The doPost method retrieves the SOAP request and attempts to reconstruct the Call object from it.

org.apache.soap.rpc.RPCMessage:
...
  Bean paramBean = smr.unmarshall(actualParamEncStyle, RPCConstants.Q_ELEM_PARAMETER
 ...); //3
  Parameter param = (Parameter)paramBean.value;
  ...

In line 3, RPCConstants.Q_ELEM_PARAMETER resolves to SOAP-ENV:Parameter. It is here at line 3 that the dispatching to the root deserializer occurs, when the unmarshall() method is called.

The root (de)serializer in turn will query the type mapping registry for the next normal (de)serializer to be called. This call stack will start to unravel only when the registry returns (de)serializers for primitive Java types (during serialization) or when the XML elements are purely containers for simple types (during deserialization).
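
The dispatch chain described above can be modeled with a toy registry. This is only a sketch under stated assumptions: ToySerializer, ToyParameter, and ToyRegistry are hypothetical stand-ins written for this illustration, not Apache SOAP classes, and the lookup key is simplified to encoding style plus Java class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; none of these are Apache SOAP classes.
interface ToySerializer {
    void marshall(String encStyle, Object src, String name, StringBuilder sink, ToyRegistry reg);
}

// Stand-in for an RPC parameter: a named value.
class ToyParameter {
    final String name;
    final Object value;
    ToyParameter(String name, Object value) { this.name = name; this.value = value; }
}

class ToyRegistry {
    private final Map<String, ToySerializer> table = new HashMap<>();

    void mapTypes(String encStyle, Class<?> javaType, ToySerializer s) {
        table.put(encStyle + "|" + javaType.getName(), s);
    }

    ToySerializer querySerializer(Class<?> javaType, String encStyle) {
        ToySerializer s = table.get(encStyle + "|" + javaType.getName());
        if (s == null) {
            throw new IllegalArgumentException("No serializer for " + javaType.getName());
        }
        return s;
    }

    // Look up the serializer for javaType, then hand the object to it.
    void marshall(String encStyle, Class<?> javaType, Object src, String name, StringBuilder sink) {
        querySerializer(javaType, encStyle).marshall(encStyle, src, name, sink, this);
    }

    static String demo() {
        String enc = "toy-encoding";
        ToyRegistry reg = new ToyRegistry();
        // Root serializer: unwraps the parameter and dispatches on the value's type.
        reg.mapTypes(enc, ToyParameter.class, (e, src, name, sink, r) -> {
            ToyParameter p = (ToyParameter) src;
            r.marshall(e, p.value.getClass(), p.value, p.name, sink);
        });
        // Normal serializer for a compound type: delegates each member back to the registry.
        reg.mapTypes(enc, ArrayList.class, (e, src, name, sink, r) -> {
            sink.append('<').append(name).append('>');
            for (Object item : (ArrayList<?>) src) {
                r.marshall(e, item.getClass(), item, "item", sink);
            }
            sink.append("</").append(name).append('>');
        });
        // Serializer for a simple type: the chain of delegation stops here.
        reg.mapTypes(enc, String.class, (e, src, name, sink, r) ->
            sink.append('<').append(name).append('>').append(src)
                .append("</").append(name).append('>'));

        StringBuilder sink = new StringBuilder();
        ArrayList<String> names = new ArrayList<>(Arrays.asList("a", "b"));
        reg.marshall(enc, ToyParameter.class, new ToyParameter("names", names), "unused", sink);
        return sink.toString();
    }
}
```

Calling ToyRegistry.demo() walks from the root serializer through the list serializer down to the string serializer, which is the point where the call stack starts to unwind.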

The bulk of the supplied normal (de)serializers -- which can be found in the package org.apache.soap.encoding.soapenc -- work on Section 5 encoding, as do most of the helper classes in Apache SOAP. This is the reason why I will be focusing the cookbook solely on Section 5-encoding (de)serializers.


Figure 1. APIs for writing (de)serializers

Serializer cookbook

A serializer implements org.apache.soap.util.xml.Serializer and realizes a single method:

  void marshall(
    java.lang.String inScopeEncStyle, 
    java.lang.Class javaType, 
    java.lang.Object src, 
    java.lang.Object context, 
    java.io.Writer sink, 
    NSStack nsStack, 
    XMLJavaMappingRegistry xjmr, 
    SOAPContext ctx) 
      throws java.lang.IllegalArgumentException, 
             java.io.IOException 

Let us investigate each of marshall()'s parameters in turn:

  • inScopeEncStyle : This represents the encodingStyleURI as specified in the enclosing Call or Response object.
  • javaType : This is the run-time type of the object that is to be serialized.
  • src : This is a reference to the Java object to be serialized.
  • context : A String denoting the accessor name. If this serializer is invoked by ParameterSerializer, the context value is the name property of the Parameter object (declared on the SOAP client), or the hardcoded string "return" on a SOAP server. It must be non-null.
  • sink : The destination sink to which the SOAP XML instance will be written.
  • nsStack : A data structure that implements a stack of namespace declarations that are currently in scope.
  • xjmr : This is the type mapping registry (the SOAPMappingRegistry instance, named smr in the listings), which you query for the serializer to use next based on the Java type. You will also invoke its marshall() method to delegate to other serializers based on javaType and encodingStyle; you'd do this for compound structures like Hashtable or Vector, for instance.
  • ctx : This is used to pass in things like javax.servlet.http.HttpServletRequest and javax.servlet.http.HttpSession from the servlet context.

The cookbook for the marshall() method is as follows:

Step 1: Create new namespace scope

   nsStack.pushScope();
  

Use the NSStack class to track the scope of XML namespace declarations. Later on in the method, NSStack can be used to add a new namespace and search through the stack for the prefix given a URI.
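
To make the idea of scoped namespace declarations concrete, here is a minimal sketch of such a stack. ToyNamespaceStack and its method names are assumptions made for this illustration; it is not the NSStack class from the toolkit, only a model of the behavior described above (push a scope, declare a namespace, look up a prefix by URI, pop the scope).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a scoped namespace stack; not the Apache SOAP NSStack.
class ToyNamespaceStack {
    // Each scope maps namespace URI -> prefix; the innermost scope sits at the head.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void pushScope() {
        scopes.push(new HashMap<>());
    }

    void popScope() {
        scopes.pop();
    }

    // Declare a prefix for a URI in the innermost scope.
    void addDeclaration(String prefix, String uri) {
        if (scopes.isEmpty()) {
            throw new IllegalStateException("pushScope() must be called first");
        }
        scopes.peek().put(uri, prefix);
    }

    // Search from the innermost scope outward; returns null if the URI is not in scope.
    String prefixFor(String uri) {
        for (Map<String, String> scope : scopes) {
            String prefix = scope.get(uri);
            if (prefix != null) {
                return prefix;
            }
        }
        return null;
    }
}
```

A declaration made in an inner scope disappears once that scope is popped, while declarations from enclosing scopes stay visible to the lookup.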

Step 2: Check constraints on object argument

Two conditions need to be satisfied in order for serialization to happen: the serializer must be given a supported type and the object to be serialized must be non-null. The following code snippet demonstrates how these constraints are enforced in VectorSerializer.

     if ( (src != null) &&
         !(src instanceof Vector) &&
         !(src instanceof Enumeration))
              throw new IllegalArgumentException("Tried to pass a '" +
                      src.getClass().toString() + "' to VectorSerializer");

Several built-in serializers actually compare the javaType parameter with the expected type, like this:

  if(!javaType.equals(Foo.class)) ...
  

I wouldn't recommend using this technique, however, as it is susceptible to the impostor type bug (see Resources).
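
The difference between the two checks can be shown in isolation. The sketch below assumes one reading of the problem: a check on the declared javaType trusts the caller and never inspects the object, so a mismatched object slips through while a legitimate subclass is rejected. TypeChecks and its methods are hypothetical names, and the article's Resources may describe the bug in more detail.

```java
import java.util.Vector;

// Hypothetical helper contrasting the two styles of type check.
class TypeChecks {
    // Trusts the caller-supplied javaType; the object itself is never examined.
    static boolean acceptsByDeclaredType(Class<?> javaType, Object src) {
        return javaType.equals(Vector.class);
    }

    // Examines the object, in the style of the VectorSerializer snippet above.
    static boolean acceptsByInstance(Class<?> javaType, Object src) {
        return src == null || src instanceof Vector;
    }
}
```

With javaType set to Vector.class and src set to a String, the first check accepts the call and the second rejects it. With a java.util.Stack (a Vector subclass) and javaType set to Stack.class, the first check rejects a perfectly serializable object and the second accepts it.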

Step 3: Generate a null accessor

If the object argument is null, you need to generate a null accessor for the type.

  SoapEncUtils.generateNullStructure(inScopeEncStyle,
javaType, context, sink, nsStack, xjmr);

Step 4: Serialize the object

Serializing the object into a Section 5-compliant SOAP XML document involves three steps: generating the opening element for the accessor, serializing the value of the object, and closing the element.

The first step is easily achieved by calling the following utility method:

  SoapEncUtils.generateStructureHeader(inScopeEncStyle,
         javaType,
         context,
         sink,
         nsStack,xjmr);
  

This code will call queryElementType to find the mapped QName for javaType and then write an opening tag of the following form:

  <context xsi:type="QName">
  

If the object argument, src, is a simple type, then the second step, serializing the value of the object, is a simple matter of calling the src.toString() method and writing that out. Otherwise, you will need to identify the constituent parts of the object and individually pass them to more primitive serializers. If you investigate the source for the built-in serializers, you'll notice that these constituent parts can be identified in many ways:

  • Java reflection (for example, BeanSerializer)
  • Iterating through a List data structure (for example, ArraySerializer)
  • Direct access via a priori knowledge of the class (that is, you know in advance that your serializer only works for one specific class)

Having identified the other serializers, you can delegate to them by calling:

  xjmr.marshall(inScopeEncStyle,
                componentType,
                componentValue,
                accessorName,
                sink, nsStack, ctx);
  

Here, componentType and componentValue are representative of the run-time type and object reference, respectively, for any constituent parts of the original src parameter. The marshall() method actually calls querySerializer to retrieve the associated serializer and subsequently calls the marshall() method of the associated serializer. Obviously, this will only work if you've registered the serializers for all components in the type mapping registry.

The last step, closing the element, is completed by simply writing out the closing tag for the accessor.

  sink.write("</" + context + '>');
  

Finally, you must clean up after yourself by leaving the current namespace scope.

  nsStack.popScope();
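
Steps 2 through 4 can be seen end to end in a stripped-down serializer for a simple type. This is a sketch, not toolkit code: ToyIntSerializer is a hypothetical class, it writes the tags by hand instead of calling SoapEncUtils, it assumes the xsd and xsi prefixes are declared by an enclosing element, and it leaves out the namespace scope handling from Step 1.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical, simplified serializer for Integer values; not an Apache SOAP class.
class ToyIntSerializer {
    // Assumes the xsd and xsi prefixes are already declared by an enclosing element.
    static final String TYPE = "xsd:int";

    static void marshall(Object src, String context, Writer sink) throws IOException {
        // Step 2: reject unsupported types, but let null through.
        if (src != null && !(src instanceof Integer)) {
            throw new IllegalArgumentException("Tried to pass a '"
                    + src.getClass().getName() + "' to ToyIntSerializer");
        }
        // Step 3: a null object becomes a null accessor.
        if (src == null) {
            sink.write("<" + context + " xsi:type=\"" + TYPE + "\" xsi:null=\"true\"/>");
            return;
        }
        // Step 4: opening element, value, closing element.
        sink.write("<" + context + " xsi:type=\"" + TYPE + "\">");
        sink.write(src.toString());
        sink.write("</" + context + ">");
    }

    // Convenience wrapper that collects the output in a String.
    static String toXml(Object src, String context) {
        StringWriter sink = new StringWriter();
        try {
            marshall(src, context, sink);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

For example, ToyIntSerializer.toXml(42, "quantity") produces an accessor named quantity carrying the text 42, and passing null produces the empty null accessor instead.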
  


Deserializer cookbook

A deserializer implements org.apache.soap.util.xml.Deserializer and realizes a single method:

  Bean unmarshall(
    java.lang.String inScopeEncStyle, 
    QName elementType, 
    org.w3c.dom.Node src, 
    XMLJavaMappingRegistry xjmr, 
    SOAPContext ctx)         

The purpose of the unmarshall() method is to reconstruct the parameters as Java objects. In order to do that, you need to process the XML fragment contained by the src DOM node. The preferred programming model to achieve this is to use the DOM wrapper methods in org.apache.soap.util.xml.DOMUtils in conjunction with the type mapping registry. In general, DOMUtils is deserialization's counterpart to serialization's SoapEncUtils.

It is important to note that the XML contained in src is guaranteed to be free of multireference values. All multireference hrefs have been resolved back to the actual value by the root deserializer, ParameterSerializer.

Thus, the deserialization cookbook is as follows:

Step 1: Check for null

It is advisable to check for the nullability attribute, like so:

    Element root = (Element)src;
    if (SoapEncUtils.isNull(root))
    {
        return new Bean(Your.class, null);
    }
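
For readers who want to see what such a check involves at the DOM level, here is a sketch that uses only the JDK's XML APIs. NullAccessorCheck is a hypothetical class, not the SoapEncUtils implementation. It assumes the attribute is spelled xsi:null under the 1999 schema-instance namespace and xsi:nil under the 2001 namespace, and it accepts both.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical DOM-level null check; not the SoapEncUtils implementation.
class NullAccessorCheck {
    static final String XSI_1999 = "http://www.w3.org/1999/XMLSchema-instance";
    static final String XSI_2001 = "http://www.w3.org/2001/XMLSchema-instance";

    // True if the element carries xsi:null (1999) or xsi:nil (2001) with a true value.
    static boolean isNull(Element e) {
        return isTrue(e.getAttributeNS(XSI_1999, "null"))
            || isTrue(e.getAttributeNS(XSI_2001, "nil"));
    }

    // getAttributeNS returns an empty string when the attribute is absent.
    private static boolean isTrue(String value) {
        return "true".equals(value) || "1".equals(value);
    }

    // Parse a small XML string with namespace awareness and return its root element.
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalArgumentException("Unparseable XML", ex);
        }
    }
}
```

The parser must be namespace aware, otherwise getAttributeNS cannot match the attribute by its namespace URI.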
  

Step 2: Reconstruct the Java object

The process of reconstructing a Java object varies depending on the category its data type falls into. (For more on these type categories, see Part 1.)

Simple type
If you're deserializing a simple type, just use DOMUtils.getChildCharacterData(Element) to retrieve the string value of src and optionally preprocess it (for example, map the string "NaN" to Float.NaN in FloatDeserializer) before using it to initialize the object that's to be returned.
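
The preprocessing step might look like the sketch below, which assumes the element's text has already been extracted as a String. ToyFloatParsing is a hypothetical helper, not FloatDeserializer. It maps the XML Schema spellings INF and -INF, which Java's own parser does not accept, and handles NaN explicitly for symmetry.

```java
// Hypothetical helper; not the toolkit's FloatDeserializer.
class ToyFloatParsing {
    // Convert the text of an XML Schema float into a java.lang.Float.
    static Float parseXsdFloat(String text) {
        String trimmed = text.trim();
        switch (trimmed) {
            case "INF":
                return Float.POSITIVE_INFINITY;
            case "-INF":
                return Float.NEGATIVE_INFINITY;
            case "NaN":
                return Float.NaN;
            default:
                // Throws NumberFormatException for text that is not a number.
                return Float.valueOf(trimmed);
        }
    }
}
```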

Compound type
Compound types fall into two major categories. The first comprises types with a homogeneous structure of repeating elements; examples include Java arrays and classes implementing java.util.List and java.util.Map. The other category is representative of all other Java classes that exhibit arbitrary structure. The deserialization process, then, boils down to the navigation of the XML structure to identify relevant descendant elements and the subsequent delegation of deserialization responsibilities to more primitive deserializers, as follows:

  • Navigating the DOM. If you're dealing with a compound type from the first category, you may use DOMUtils.getFirstChildElement() and DOMUtils.getNextSiblingElement() to navigate through all its repeating members. Otherwise, use the DOM API to identify the elements that represent member properties.
  • Delegate deserialization to other deserializers. First, you must extract the SOAP type:
    QName soapType = SoapEncUtils.getTypeQName(rootElement);
    

    Next, delegate to more primitive deserializers:

    xjmr.unmarshall(inScopeEncStyle, soapType, rootElement, ctx);
    

    xjmr.unmarshall internally calls queryDeserializer and then invokes unmarshall on the returned deserializer. The two steps above are better collapsed into one by delegating deserialization to ParameterSerializer. This is done because, in situations where the xsi:type attribute is missing, we would like to invoke xjmr.unmarshall() with the soapType set to the QName {""}/X, where X is the root element's tagName. Since the code to achieve this is already conveniently packaged inside ParameterSerializer.unmarshall(), the shortened version of the process becomes:

    Bean paramBean = xjmr.unmarshall(inScopeEncStyle,
                                    RPCConstants.Q_ELEM_PARAMETER,
                                    rootElement, ctx);
    

  • Initialize the target object. The target object is the object instance you're reconstructing. As member properties get deserialized, you can restore their values by invoking the mutator methods on your target object, as follows:
      Foo foo = new Foo();
      foo.setS( paramBean.value );
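
The navigation pattern for a homogeneous compound type can be sketched with plain JDK DOM calls. ToyListDeserializer is a hypothetical class, and its two helper methods are hand-written stand-ins playing the role of the DOMUtils navigation methods mentioned above. For brevity it converts each member directly instead of delegating to another deserializer through the registry.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical deserializer for a list of integers; not an Apache SOAP class.
class ToyListDeserializer {
    // First child that is an element, skipping whitespace text nodes and comments.
    static Element firstChildElement(Element parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        return null;
    }

    // Next sibling that is an element.
    static Element nextSiblingElement(Element e) {
        for (Node n = e.getNextSibling(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        return null;
    }

    // Walk the repeating members and add each one to the target object.
    static List<Integer> unmarshall(Element root) {
        List<Integer> target = new ArrayList<>();
        for (Element item = firstChildElement(root); item != null; item = nextSiblingElement(item)) {
            target.add(Integer.valueOf(item.getTextContent().trim()));
        }
        return target;
    }

    static List<Integer> fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            return unmarshall(root);
        } catch (Exception ex) {
            throw new IllegalArgumentException("Unparseable XML", ex);
        }
    }
}
```

In a real deserializer, the line that converts the item text would instead hand each member element to the registry, as in the xjmr.unmarshall() calls shown above.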
      

Step 3: Return the reconstructed object

The Bean class encapsulates the run-time type and the actual returned instance. The deserializer knows what class it should be returning because in most cases it has been tailored for a specific class. For generic deserializers like BeanSerializer and ArraySerializer, the javaType property in the type mapping conveys the type to be returned:

   return(new Bean(Foo.class, foo));


Registering root (de)serializers

I've already mentioned that if you intend to introduce custom encodingStyles, then you must write root (de)serializers. Root (de)serializers are implemented the same way as normal (de)serializers except for one small difference: all root (de)serializers are registered into the type mapping registry with a specially designated QName and Java type that will tell Apache SOAP to bootstrap the (de)serialization process based on the encodingStyle property. In the sample code below, take note of the QName (RPCConstants.Q_ELEM_PARAMETER on the client, x:Parameter on the server) and the Java type (Parameter), which you must use when registering root (de)serializers.

  [Client]

    smr.mapTypes(customEncURI,
      RPCConstants.Q_ELEM_PARAMETER, 
      Parameter.class,
      customSerializer, 
      null);

  [Server]

    <isd:map encodingStyle="customEncURI"
      xmlns:x="http://schemas.xmlsoap.org/soap/envelope/" qname="x:Parameter"
      javaType="org.apache.soap.rpc.Parameter"
      java2XMLClassName="foo.customSerializer" />


Schema-constrained SOAP

In this section, I'll walk you through an alternative solution to BeanSerializer for (de)serializing complex types. This technique, which I'll call schema-constrained SOAP, uses an XML Schema to describe the literal XML structure of the RPC parameter(s). Here, we're agreeing to interoperate strictly on the format of the message, without caring about the data model on the client and server. To avoid confusion, it should be noted that the RPC invocation is still encoded using Section 5, but the parameter(s) are not.

I'll illustrate this technique with an example application; you can download the full code from Resources below. A client sends a purchase order to a Web service, and the service responds with an acknowledgement string. The method signature exported by the Web service is thus:

public String eatPo (PurchaseOrder p);

In order for this technique to work, we need an XML/object data binding framework. For this example, I chose to use Exolab's Castor toolkit. (See the Resources section below for links to Castor and a list of other serialization frameworks, like JSX, JAXB, and Schema2Java.)

The steps for this technique are as follows:

  1. Agree on the XML format for PurchaseOrder.
  2. Generate the Java classes using Castor.
  3. Write a custom (de)serializer.
  4. Write type mappings for the client and server pieces.

Step 1: Agree on the XML format for PurchaseOrder

For this use case, I removed the order details section from my PurchaseOrder schema to keep things simple. Also, note that the PONumber attribute makes this schema noncompliant with Section 5 encoding.


Figure 2. PurchaseOrder.xsd

Step 2: Generate the Java classes using Castor

Run Castor's SourceGenerator command-line tool to generate Java classes that implement the schema in PurchaseOrder.xsd:

  java org.exolab.castor.builder.SourceGenerator
     -i PurchaseOrder.xsd
     -package com.raverun.po.castor

The SourceGenerator tool only recognizes the latest schema namespace -- http://www.w3.org/2001/XMLSchema.

Next, compile the set of Java classes. Note that you'll need to use the -deprecation option, as the generated files use SAX 1.0 APIs. To circumvent this manual compilation, Exolab is working on an Ant taskdef to automate it.

Step 3: Write a custom (de)serializer

You will now implement the (de)serialization methods in PurchaseOrderSerializer by using the counterpart methods exposed by the PurchaseOrder class. For serialization, PurchaseOrder can marshal to a java.io.Writer or an org.xml.sax.DocumentHandler. As shown in Listing 1, you delegate the serialization to PurchaseOrder's marshal() method. One caveat: the XML stream generated by the marshal() method contains the XML prolog. PurchaseOrderSerializer strips off this prolog by wrapping sink with FilterXmlProlog, a java.io.FilterWriter. Listing 2 contains some exceptional cases that might arise during the deserialization process.


Listing 1. Extract from marshall() method in PurchaseOrderSerializer

      ----o<---------
      SoapEncUtils.generateStructureHeader(inScopeEncStyle,
                                         javaType,
                                         context,
                                         sink,
                                         nsStack,
                                         xjmr);
      PurchaseOrder po = (PurchaseOrder)src; 
      try{
        po.marshal( new FilterXmlProlog(sink) );
      }catch(Exception e){ 
        throw (new java.io.IOException("Castor: Error marshalling"));
      }
      sink.write( StringUtils.lineSeparator );
      sink.write("</" + context + '>');
      ----o<---------
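
The article's FilterXmlProlog ships with the downloadable code. As an illustration of the idea only, here is a minimal sketch of a FilterWriter that drops a leading XML prolog. PrologStrippingWriter is a hypothetical class and makes simplifying assumptions: the prolog, if present, is the very first thing written, and whitespace following it is trimmed only when it arrives in the same buffered run.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical sketch of a prolog-stripping writer; not the article's FilterXmlProlog.
class PrologStrippingWriter extends FilterWriter {
    private static final String PROLOG_START = "<?xml";
    private final StringBuilder pending = new StringBuilder();
    private boolean decided = false;

    PrologStrippingWriter(Writer out) {
        super(out);
    }

    @Override
    public void write(int c) throws IOException {
        write(new char[] {(char) c}, 0, 1);
    }

    @Override
    public void write(String s, int off, int len) throws IOException {
        write(s.toCharArray(), off, len);
    }

    @Override
    public void write(char[] buf, int off, int len) throws IOException {
        if (decided) {
            out.write(buf, off, len);   // past the prolog: pass everything through
            return;
        }
        pending.append(buf, off, len);  // still at the start: buffer until we can decide
        decide(false);
    }

    // Decide whether the buffered text starts with a prolog, and forward the rest.
    private void decide(boolean endOfInput) throws IOException {
        String seen = pending.toString();
        int n = Math.min(seen.length(), PROLOG_START.length());
        if (seen.regionMatches(0, PROLOG_START, 0, n)) {
            int end = seen.length() >= PROLOG_START.length() ? seen.indexOf("?>") : -1;
            if (end >= 0) {
                int from = end + 2;     // skip the prolog and any whitespace after it
                while (from < seen.length() && Character.isWhitespace(seen.charAt(from))) {
                    from++;
                }
                seen = seen.substring(from);
            } else if (!endOfInput) {
                return;                 // might still be a prolog: wait for more input
            }
        }
        decided = true;
        pending.setLength(0);
        out.write(seen);
    }

    @Override
    public void flush() throws IOException {
        if (!decided && pending.length() > 0) {
            decide(true);               // nothing more is coming: emit what we have
        }
        out.flush();
    }

    @Override
    public void close() throws IOException {
        flush();
        super.close();
    }

    // Convenience wrapper for trying the writer out on a String.
    static String strip(String xml) {
        StringWriter sink = new StringWriter();
        try (Writer w = new PrologStrippingWriter(sink)) {
            w.write(xml);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

Note that closing this sketch also closes the wrapped writer, which is standard FilterWriter behavior. Inside a serializer you would flush the wrapper rather than close it, so that the sink stays open for the closing accessor tag.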


Listing 2. Exceptional cases during deserialization in PurchaseOrderSerializer

    (b1) Null PO.
                 
         ---------------------------
         <po 
           xmlns:ns2="urn:raverun" 
           xsi:type="ns2:po"
           xsi:null="true"/>
         ---------------------------

    (b2) Non null but nothing submitted in the body.

         ---------------------------
         <po 
           xmlns:ns2="urn:raverun" 
           xsi:type="ns2:po" />
         ---------------------------

    (b3) PO that violates the schema.

         ---------------------------
         <po 
           xmlns:ns2="urn:raverun" 
           xsi:type="ns2:po">
           <foo bar="123"/>
         </po>
         ---------------------------

Step 4: Write type mappings for the client and server pieces

Lastly, you need to declare the type mappings to reference your custom (de)serializer. It might surprise you to see Section 5 specified as the encoding for PurchaseOrder. This is done for convenience's sake, as it grants you the ability to use ParameterSerializer to bootstrap the deserialization process and also to use SoapEncUtils in the serialization code.

  [Client]
    SOAPMappingRegistry smr = new SOAPMappingRegistry();
    smr.mapTypes(Constants.NS_URI_SOAP_ENC,
                   new QName("urn:raverun", "po"),
                   PurchaseOrder.class, pos, null);

  [Server]
    <isd:map
       encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
       xmlns:x="urn:raverun" qname="x:po"
       javaType="com.raverun.po.castor.PurchaseOrder"
       xml2JavaClassName="com.raverun.po.PurchaseOrderSerializer" />

Potential problems with this solution

You should keep the following issues in mind while examining the schema-constrained SOAP example:

  • In order to be standards compliant, you must turn off any claims about Section 5 encoding in the <po> element. (See Listing 3 for a more compliant SOAP XML instance.) The SOAP 1.1 specification (see Resources) describes this requirement as follows:

    A value of the zero-length URI ("") explicitly indicates that no claims are made for the encoding style of contained elements.

    An alternative to the null encodingStyle is to introduce a custom encodingStyleURI, tailored to your communication needs.

  • There are some bugs to watch out for in Castor, but all have workarounds. If you're using a version of Castor older than 0.9.3, schema validation does not work as expected. The solution is to upgrade to the latest release. On the other hand, Castor 0.9.3 (the version I used) generates a spurious message to the standard output stream. The message I encountered:

              Warning : preserved is a bad entry for the whiteSpace value.
    

    The latest version of Castor, 0.9.3.9, suppresses this warning.

  • PurchaseOrderSerializer does not serialize to multireference values. However, it will deserialize them correctly. This is not a feature of PurchaseOrderSerializer per se, but of ParameterSerializer.


Listing 3. A more compliant SOAP instance

<ns1:eatPo 
  xmlns:ns1="urn:poservice" 
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <po 
    xmlns:ns2="urn:raverun" 
    xsi:type="ns2:po"
    SOAP-ENV:encodingStyle="">
    <purchaseOrder xmlns="http://www.example.com/PO1">
      <header PONumber="9999-1212">
        <Date>2001-09-25T14:40:13.453</Date>
      </header>
      ...
    </purchaseOrder>
  </po>
</ns1:eatPo>


Latest interface changes

The last official release of Apache SOAP (version 2.2) came out in May 2001. Although development focus has shifted to Axis (currently at beta 1), bug fixes are continually being added. While we await a 2.3 release (if there is one), users of the official release should be aware that there have been major updates in the codebase, especially in SOAPMappingRegistry and its related classes. Existing code may need some changes to interoperate with the fixes.

Here is the list of notable changes:

  • Schema namespaces now reference the 2001 recommendation namespace by default. Version 2.2 referenced the 1999 namespace.
  • As a corollary, if you instantiate SOAPMappingRegistry with its no-arg constructor, a 2001 namespace-aware instance is returned.
  • Instance creation for SOAPMappingRegistry has been redesigned according to the static factory pattern. Thus, you now should use the factory method getBaseRegistry(schemaURI) instead of the overloaded constructor SOAPMappingRegistry(schemaURI):
      public static SOAPMappingRegistry getBaseRegistry (String schemaURI);
    

  • Version 2.2 offers the ability to chain registries. These methods were recently added:
      public SOAPMappingRegistry(SOAPMappingRegistry parent);
      public SOAPMappingRegistry(SOAPMappingRegistry parent, String schemaURI);
      public SOAPMappingRegistry getParent();
      public String getSchemaURI();
      

    The resolution of type mappings will percolate up the chain until a match is found.
  • The DeploymentDescriptor class treats the qname attribute as optional in type mapping declarations.
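
The registry chaining mentioned in the list above, where resolution percolates up the chain until a match is found, can be sketched as follows. ChainedRegistry is a hypothetical class that stores serializer names as plain strings. It models only the lookup order and is not the SOAPMappingRegistry implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of parent-chained lookup; not SOAPMappingRegistry itself.
class ChainedRegistry {
    private final ChainedRegistry parent;
    private final Map<Class<?>, String> serializerNames = new HashMap<>();

    ChainedRegistry(ChainedRegistry parent) {
        this.parent = parent;
    }

    ChainedRegistry getParent() {
        return parent;
    }

    void map(Class<?> javaType, String serializerName) {
        serializerNames.put(javaType, serializerName);
    }

    // Check this registry first, then each ancestor in turn.
    String query(Class<?> javaType) {
        for (ChainedRegistry r = this; r != null; r = r.parent) {
            String name = r.serializerNames.get(javaType);
            if (name != null) {
                return name;
            }
        }
        throw new IllegalArgumentException("No mapping for " + javaType.getName());
    }
}
```

A child registry can therefore override a single mapping while inheriting every other mapping from its parent.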

Conclusion

I hope that the examples in this article have made clear the theoretical concepts outlined in the first article in this series. If Web services operating across many machines on the network are to become a widespread reality, developers must understand how programmatic objects are transmitted from one machine to another. A better understanding of SOAP's type mapping abilities should help you build better distributed applications and services.


Resources
