In 2001, I demonstrated to one of my skeptical clients that a Web service can be easy to deploy in the field. My goal was to replace most of my client's administrative paperwork with a Web service, thereby "fastening" their supply chain. The acid test was to deploy the Web service to my client's suppliers, most of which were small organizations (15 to 50 employees).
In such a setup, it is nearly impossible to use traditional RPC -- hence my client's skepticism. I proposed that we look at the RPC request as an XML document and use XML programming, rather than RPC coding, to construct the call. After two days of testing, we proved that my approach worked.
The lightweight XML client I have been working on for the past few columns embodies the same ideas. In my previous column, I showed you how to analyze a file export to turn it into an RPC request. This month I'll show you the actual programming.
Remember from Part 1 of this article that the ultimate goal is to turn legacy data files, such as the one in Listing 1, into a SOAP request. The request has one method, processOrder(), which takes an array of Order objects.
Listing 1. A sample legacy file
HDRAZ5251029200309252003 1899.00
HDRAZ5281029200310272003 149.95
LINWEBAZ525THPRE IBM ThinkPad R Series Economy 1099.00
LINDITAZ525THPRV IBM ThinkPad R Series Value 2 400.00
LINWEBAZ528BKXBE XML by Example 5 29.99
The Order object is specified in UML in Figure 1:
Figure 1. SOAP data model

Also recall that in Part 1, you analyzed this data and specified the mapping between the two formats, as shown in Table 1:
Table 1. Analysis results
| SOAP element | Export field | Comment |
| Order/Reference | HDR_ORFF | - |
| Order/Date | HDR_ODTE | convert to xsd:date format |
| Order/Buyer | - | "Example Corp." |
| Order/Address | - | "Example Street 5, 45202 Cincinnati, OH" |
| Order/Total | HDR_OTOT | - |
| Item/Code | LIN_OCOD | - |
| Item/Description | LIN_ODES | - |
| Item/Price | LIN_OPRI | - |
| Item/Quantity | LIN_OQTY | 1 if absent |
Once you've successfully completed the map analysis, it is a simple task to write the corresponding XI rules and XSLT stylesheet. If you are not familiar with XI, it uses regular expressions (regex) to pre-process text files and convert them to XML. The result of this pre-processing is fed to an XSL processor, which generates the final XML document.
The pre-processing is required because XSLT processors accept only XML documents as input.
The pre-processor works with a set of regexes (called a ruleset). Each regex is represented by a match element. The pre-processor attempts to match the source document against one of the regexes in the ruleset; if a regex matches, the pre-processor generates an XML element for the match, with one child element for every group in the regex.
In the first versions of XI, you would specify the regex through a pattern attribute associated with the match element. However, this proved too complex with long regexes (it was difficult to see how the group elements related to the regex). To simplify coding, the latest version of XI lets you attach portions of the regex directly to the group elements. A new filler element contains portions of the regex that are not in a group (that is, they do not generate XML tags). The pre-processor concatenates all portions to construct the regex for a given match element.
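To make the concatenation concrete, here is a minimal Python sketch of the idea. It is not XI's implementation: the names HEADER_PARTS, build_regex, and match_record are hypothetical, and the sample record assumes the total is right-aligned in its 9-character field.

```python
import re

# Hypothetical in-memory form of the "header" match element:
# (kind, pattern, name). Fillers have no name and produce no output.
HEADER_PARTS = [
    ("filler", r"^HDR", None),
    ("group", r".{5}", "ORFF"),
    ("group", r".{8}", "ODTE"),
    ("filler", r".{8}", None),
    ("group", r".{9}", "OTOT"),
    ("filler", r"$", None),
]


def build_regex(parts):
    """Concatenate all portions, in document order, into one regex."""
    pieces = []
    for kind, pattern, name in parts:
        if kind == "group":
            pieces.append(f"(?P<{name}>{pattern})")  # captured, named
        else:
            pieces.append(f"(?:{pattern})")          # matched, not captured
    return re.compile("".join(pieces))


def match_record(regex, record):
    """Return the trimmed group values, or None if the record does not match."""
    m = regex.match(record)
    if m is None:
        return None
    return {name: value.strip() for name, value in m.groupdict().items()}


HEADER_RE = build_regex(HEADER_PARTS)

# Assumed fixed-width layout: 3 + 5 + 8 + 8 + 9 characters.
sample = "HDR" + "AZ525" + "10292003" + "09252003" + "  1899.00"
fields = match_record(HEADER_RE, sample)
print(fields)  # {'ORFF': 'AZ525', 'ODTE': '10292003', 'OTOT': '1899.00'}
```

The filler portions take part in the match but never appear in the result, which is the behavior described above for fields that should not generate XML tags.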
Listing 2 contains an XI ruleset for the document in Listing 1 and an XSLT stylesheet to test it. The two records (HDR and LIN) translate into two match elements. Since the fields have a fixed length, the regexes are simple, taking the form .{5}, which means "any character, repeated five times."
Listing 2. Writing the XI rules
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xi="http://ananas.org/2002/xi/rules"
                xmlns:o="http://ananas.org/2003/order"
                version="1.0">
  <xi:rules version="1.0"
            defaultPrefix="o"
            targetNamespace="http://ananas.org/2003/order"
            defaultSpace="trim">
    <xi:ruleset name="export">
      <xi:match name="header">
        <xi:filler pattern="^HDR"/>
        <xi:group pattern=".{5}" name="ORFF"/>
        <xi:group pattern=".{8}" name="ODTE"/>
        <xi:filler pattern=".{8}"/>
        <xi:group pattern=".{9}" name="OTOT"/>
        <xi:filler pattern="$"/>
      </xi:match>
      <xi:match name="line">
        <xi:filler pattern="^LIN"/>
        <xi:filler pattern=".{3}"/>
        <xi:group pattern=".{5}" name="ORFF"/>
        <xi:group pattern=".{5}" name="OCOD"/>
        <xi:filler pattern=".{5}"/>
        <xi:group pattern=".{35}" name="ODES"/>
        <xi:group pattern=".{2}" name="OQTY"/>
        <xi:group pattern=".{9}" name="OPRI"/>
        <xi:filler pattern="$"/>
      </xi:match>
    </xi:ruleset>
  </xi:rules>
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Fillers are appropriate for those fields that are not part of the SOAP request, such as the order creation date. Other fields are given an XML name that matches their names in the export file (for instance, HDR_ORFF becomes ORFF).
Note the use of the defaultSpace attribute on the rules element. It automatically trims extra spaces from the field data. Fixed-length files always have a lot of unwanted spaces and it saves time to ask the XI pre-processor to remove them.
To test the regular expressions, the stylesheet simply copies its input unmodified. Since the input to the stylesheet is the output of the pre-processor, the stylesheet effectively outputs the XML document generated by the pre-processor. Admittedly, this is a roundabout way to save the output of the pre-processor.
The "copy all" stylesheet has only one template, which I copied directly from the XSLT recommendation.
Use the stylesheet in Listing 2 to validate the regex. It is pointless to forge ahead and try to build a SOAP request until the pre-processor can accurately extract all relevant information from the export file.
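If you want a rough idea of what to expect before running XI, the following Python sketch imitates the pre-processing step. It is only an approximation under stated assumptions: the element layout (an o:export root holding o:header and o:line elements, each with one child per group) is inferred from the XPath expressions used later in Listing 3, the regexes are written out by hand, and the padding of the sample records is assumed. The names RULES, preprocess, and SAMPLE are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

NS = "http://ananas.org/2003/order"  # targetNamespace from the ruleset
ET.register_namespace("o", NS)       # defaultPrefix from the ruleset

# One regex per match element; fillers are simply left uncaptured.
RULES = {
    "header": re.compile(
        r"^HDR(?P<ORFF>.{5})(?P<ODTE>.{8}).{8}(?P<OTOT>.{9})$"
    ),
    "line": re.compile(
        r"^LIN.{3}(?P<ORFF>.{5})(?P<OCOD>.{5}).{5}"
        r"(?P<ODES>.{35})(?P<OQTY>.{2})(?P<OPRI>.{9})$"
    ),
}


def preprocess(text, ruleset="export"):
    """Build an XML tree with one element per matching record."""
    root = ET.Element(f"{{{NS}}}{ruleset}")
    for record in text.splitlines():
        for name, regex in RULES.items():
            m = regex.match(record)
            if m:
                element = ET.SubElement(root, f"{{{NS}}}{name}")
                for field, value in m.groupdict().items():
                    child = ET.SubElement(element, f"{{{NS}}}{field}")
                    child.text = value.strip()  # like defaultSpace="trim"
                break
    return root


# Assumed fixed-width records; the quantity field of the line is blank.
SAMPLE = "\n".join([
    "HDR" + "AZ525" + "10292003" + "09252003" + "1899.00".rjust(9),
    "LIN" + "WEB" + "AZ525" + "THPRE" + " " * 5
    + "IBM ThinkPad R Series Economy".ljust(35) + " " * 2
    + "1099.00".rjust(9),
])

root = preprocess(SAMPLE)
print(ET.tostring(root, encoding="unicode"))
```

A record that matches no rule is silently skipped in this sketch, so an empty or incomplete tree is the signal that a field length is wrong.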
SOAP encoding was covered at length in the previous installment. To summarize, the procedure call, the classes, and their variables become XML elements. The most complex aspects of the encoding are the arrays, because they need two special attributes (in SOAP 1.1):
- An xsi:type attribute whose value must be soapenc:Array
- A soapenc:arrayType attribute whose value is an array declaration written in a syntax similar to the Java language -- that is, the element type followed by the size of the array between square brackets (for instance, xsd:int[5] or po:Order[3])
The encoding of arrays changed in SOAP 1.2. Essentially, SOAP 1.2 dispenses with these attributes and replaces them with the optional soapenc:itemType and soapenc:arraySize attributes. Still, the server I used for testing (Axis 1.1) recognizes only SOAP 1.1, as do most SOAP servers currently deployed.
Regardless of the version of SOAP used, the procedure call is wrapped in SOAP Envelope and Body elements.
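As an illustration of the SOAP 1.1 array attributes and the Envelope/Body wrapping, here is a small Python sketch that writes an array of integers. The ex:sum method, its urn:example namespace, and the numbers and item element names are made up for the example; only the SOAP and XML Schema namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

SOAPENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAPENC = "http://schemas.xmlsoap.org/soap/encoding/"
XSI = "http://www.w3.org/2001/XMLSchema-instance"
XSD = "http://www.w3.org/2001/XMLSchema"
EX = "urn:example"  # hypothetical namespace for the hypothetical method


def array_type(item_type, size):
    """Element type followed by the size between square brackets."""
    return f"{item_type}[{size}]"


def soap11_int_array_call(values):
    """Return a SOAP 1.1 request whose single parameter is an int array."""
    items = "".join(f"<item>{int(v)}</item>" for v in values)
    return (
        f'<soapenv:Envelope xmlns:soapenv="{SOAPENV}"'
        f' xmlns:soapenc="{SOAPENC}" xmlns:xsi="{XSI}" xmlns:xsd="{XSD}"'
        f' xmlns:ex="{EX}">'
        "<soapenv:Body>"
        f'<ex:sum soapenv:encodingStyle="{SOAPENC}">'
        '<numbers xsi:type="soapenc:Array"'
        f' soapenc:arrayType="{array_type("xsd:int", len(values))}">'
        f"{items}</numbers>"
        "</ex:sum></soapenv:Body></soapenv:Envelope>"
    )


request = soap11_int_array_call([3, 1, 4])
ET.fromstring(request)  # raises ParseError if the request is not well-formed
print(request)
```

The prefixes used inside attribute values (soapenc in xsi:type, xsd in soapenc:arrayType) must be declared on an enclosing element, which is why the sketch declares every namespace on the Envelope.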
The XSLT stylesheet is responsible for formatting the SOAP request. Because the server expects an array of Order objects and each Order may contain one or more Item objects, there are two loops in the stylesheet. The first one loops over the HDR records, the second one over the LIN records.
Listing 3 shows the stylesheet (the XI ruleset is identical to Listing 2):
Listing 3. Final stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xi="http://ananas.org/2002/xi/rules"
                xmlns:o="http://ananas.org/2003/order"
                xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns:op="http://ananas.org/2003/order-processor"
                xmlns:order="http://ananas.org/2003/order-processor/order"
                version="1.0">
  <xi:rules version="1.0"
            defaultPrefix="o"
            targetNamespace="http://ananas.org/2003/order"
            defaultSpace="trim">
    <xi:ruleset name="export">
      <xi:match name="header">
        <xi:filler pattern="^HDR"/>
        <xi:group pattern=".{5}" name="ORFF"/>
        <xi:group pattern=".{8}" name="ODTE"/>
        <xi:filler pattern=".{8}"/>
        <xi:group pattern=".{9}" name="OTOT"/>
        <xi:filler pattern="$"/>
      </xi:match>
      <xi:match name="line">
        <xi:filler pattern="^LIN"/>
        <xi:filler pattern=".{3}"/>
        <xi:group pattern=".{5}" name="ORFF"/>
        <xi:group pattern=".{5}" name="OCOD"/>
        <xi:filler pattern=".{5}"/>
        <xi:group pattern=".{35}" name="ODES"/>
        <xi:group pattern=".{2}" name="OQTY"/>
        <xi:group pattern=".{9}" name="OPRI"/>
        <xi:filler pattern="$"/>
      </xi:match>
    </xi:ruleset>
  </xi:rules>
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="o:export">
    <soapenv:Envelope>
      <soapenv:Body>
        <op:processOrder
            soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
          <orders xsi:type="soapenc:Array"
                  soapenc:arrayType="order:Order[{count(o:header)}]">
            <xsl:for-each select="o:header">
              <order>
                <reference><xsl:apply-templates select="o:ORFF"/></reference>
                <date><xsl:apply-templates select="o:ODTE"/></date>
                <buyer>Example Corp.</buyer>
                <address>Example Street 5, 45202 Cincinnati, OH</address>
                <xsl:variable name="orff" select="o:ORFF"/>
                <items xsi:type="soapenc:Array"
                       soapenc:arrayType="order:Item[{count(../o:line[o:ORFF = $orff])}]">
                  <xsl:for-each select="../o:line[o:ORFF = $orff]">
                    <item>
                      <code><xsl:apply-templates select="o:OCOD"/></code>
                      <description><xsl:apply-templates select="o:ODES"/></description>
                      <price><xsl:apply-templates select="o:OPRI"/></price>
                      <quantity><xsl:apply-templates select="o:OQTY"/></quantity>
                    </item>
                  </xsl:for-each>
                </items>
                <total><xsl:apply-templates select="o:OTOT"/></total>
              </order>
            </xsl:for-each>
          </orders>
        </op:processOrder>
      </soapenv:Body>
    </soapenv:Envelope>
  </xsl:template>
  <xsl:template match="o:ODTE">
    <xsl:value-of select="substring(.,5,4)"/>
    <xsl:text>-</xsl:text>
    <xsl:value-of select="substring(.,1,2)"/>
    <xsl:text>-</xsl:text>
    <xsl:value-of select="substring(.,3,2)"/>
  </xsl:template>
  <xsl:template match="o:OQTY[not(text())]">
    <xsl:text>1</xsl:text>
  </xsl:template>
</xsl:stylesheet>
The main template matches o:export, which is the first element generated by the pre-processor. It creates elements for the SOAP envelope, body, and procedure call. Next comes the loop over the header records. The stylesheet computes the array size with the count() function. The fields (reference, date, buyer, address, and total) are written according to the mapping table.
A second loop writes the items array. To associate the lines with their order, the stylesheet uses the order number, as shown in the condition o:line[o:ORFF = $orff].
Note the use of a variable ($orff) to hold the header reference. Inside the predicate, o:ORFF is evaluated relative to each o:line element, so without the variable the XPath could not tell the header reference from the line reference: they have the same name.
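The same join can be sketched in Python, with plain dictionaries standing in for the o:header and o:line elements (the references and item codes come from Listing 1; the function names are hypothetical). The second function shows what goes wrong when both sides of the comparison are read from the line.

```python
headers = [{"ORFF": "AZ525"}, {"ORFF": "AZ528"}]
lines = [
    {"ORFF": "AZ525", "OCOD": "THPRE"},
    {"ORFF": "AZ525", "OCOD": "THPRV"},
    {"ORFF": "AZ528", "OCOD": "BKXBE"},
]


def lines_for(header, all_lines):
    """Like ../o:line[o:ORFF = $orff]: compare against the saved header value."""
    orff = header["ORFF"]  # plays the role of the $orff variable
    return [line for line in all_lines if line["ORFF"] == orff]


def lines_for_without_variable(header, all_lines):
    """Like ../o:line[o:ORFF = o:ORFF]: both sides come from the line itself."""
    return [line for line in all_lines if line["ORFF"] == line["ORFF"]]


for header in headers:
    codes = [line["OCOD"] for line in lines_for(header, lines)]
    print(header["ORFF"], codes)
# AZ525 ['THPRE', 'THPRV']
# AZ528 ['BKXBE']

# Without the variable, every line is attached to every order.
print(len(lines_for_without_variable(headers[1], lines)))  # 3
```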
Finally, note the use of the xsl:apply-templates instruction to retrieve the value of fields. Most fields are processed by the default rule, which copies the original data, but the date and quantity fields have special templates to reformat the date and to supply missing information, respectively.
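For readers more at home in a general-purpose language, the two special templates amount to the following Python sketch. The function names are hypothetical, and the input check in to_xsd_date is an addition of this sketch; the XSLT template performs no validation.

```python
def to_xsd_date(odte):
    """Reorder MMDDYYYY into the xsd:date form YYYY-MM-DD.

    Mirrors the o:ODTE template: substring(.,5,4), substring(.,1,2),
    substring(.,3,2). XPath positions are 1-based, Python slices 0-based.
    """
    if len(odte) != 8 or not odte.isdigit():
        raise ValueError(f"expected 8 digits (MMDDYYYY), got {odte!r}")
    return f"{odte[4:8]}-{odte[0:2]}-{odte[2:4]}"


def quantity_or_default(oqty):
    """Mirror the o:OQTY[not(text())] template: an empty field means 1."""
    return oqty.strip() or "1"


print(to_xsd_date("10292003"))    # 2003-10-29
print(quantity_or_default(""))    # 1
print(quantity_or_default(" 2"))  # 2
```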
As this set of articles illustrates, SOAP has a unique feature when compared to other RPC standards: a textual representation in XML. Because so many tools can process XML, from regex-based pre-processors to XSLT processors, this representation makes SOAP more accessible than other RPC standards.
Lightweight XML clients, such as the client introduced in this series, harness this increased flexibility to deploy B2B e-commerce to companies that could not afford custom RPC and Java programming.
- Participate in the discussion forum.
- Download the XI lightweight XML client used in this article.
- Review "Map files into SOAP requests, Part 1" by Benoit Marchal for an introduction to the analysis behind this project (developerWorks, December 2003).
- Read earlier Working XML installments that cover the lightweight client:
  - "A lightweight XML client" (developerWorks, September 2003) launches the project -- an XML client for e-commerce, born out of the author's experience with B2B e-commerce over the last couple of years.
  - "A first version of the lightweight client" (developerWorks, October 2003) shows you how to create SOAP transactions through XSLT.
- Take a look at XI, an earlier Working XML project that dealt with importing text documents in an XML publishing solution (or any XML solution for that matter).
- Discover an alternative technique for managing the relationship between a header record and the lines in "Recurse, not divide, to conquer" (developerWorks, July 2001) by Benoit Marchal.
- Try the IBM SDK for Web Services, part of "IBM WebSphere SDK for Web Services (WSDK) Version 5.1." You could use this SDK to build a SOAP server (developerWorks, September 2003).
- Check out Dave Pawson's collection of information about the Muenchian grouping that you might want to use to manage the relationship between the header record and the lines.
- If you are new to regular expressions, try The Regex Coach tool.
- Find more XML resources on the developerWorks XML content area.
- Find out how you can become an IBM Certified Developer in XML and related technologies.

Benoit Marchal is a Belgian consultant. He is the author of XML by Example, Second Edition and other XML books. Benoit is available to help you with XML projects. You can contact him at bmarchal@pineapplesoft.com or through his personal site at marchal.com.




