
Working XML: Mapping files into SOAP requests, Part 2

Implement the analysis with XML and XSL

Benoit Marchal (bmarchal@pineapplesoft.com), Consultant, Pineapplesoft
Benoit Marchal is a Belgian consultant. He is the author of XML by Example, Second Edition and other XML books. Benoit is available to help you with XML projects. You can contact him at bmarchal@pineapplesoft.com or through his personal site at marchal.com.

Summary:  Many applications are being upgraded to accommodate e-commerce transactions. In his previous column, Benoit Marchal analyzed legacy data and showed how to map it into a state-of-the-art SOAP request. Now, in Part 2, he discusses the XML and XSL coding necessary to implement the analysis.

Date:  14 Jan 2004
Level:  Intermediate

In 2001, I demonstrated to one of my skeptical clients that a Web service can be easy to deploy in the field. My goal was to replace most of my client's administrative paperwork with a Web service, thereby speeding up their supply chain. The acid test was to deploy the Web service to my client's suppliers, most of which were small organizations (15 to 50 employees).

In such a setup, it is nearly impossible to use traditional RPC -- hence my client's skepticism. I proposed that we look at the RPC request as an XML document and use XML programming, rather than RPC coding, to construct the call. After two days of testing, we proved that my approach worked.

The lightweight XML client I have been working on for the past few columns embodies the same ideas. In my previous column, I showed you how to analyze a file export to turn it into an RPC request. This month I'll show you the actual programming.

Analysis summary

Remember from Part 1 of this article that the ultimate goal is to turn legacy data files, such as the one in Listing 1, into a SOAP request. The request has one method, processOrder(), which takes an array of Order objects.


Listing 1. A sample legacy file
HDRAZ5251029200309252003  1899.00
HDRAZ5281029200310272003   149.95
LINWEBAZ525THPRE     IBM ThinkPad R Series Economy          1099.00
LINDITAZ525THPRV     IBM ThinkPad R Series Value         2   400.00
LINWEBAZ528BKXBE     XML by Example                      5    29.99

The Order object is specified in UML in Figure 1:


Figure 1. SOAP data model

Also recall that in Part 1, you analyzed this data and specified the mapping between the two formats, as shown in Table 1:

Table 1. Analysis results

SOAP element        Export field   Comment
Order/Reference     HDR_ORFF       -
Order/Date          HDR_ODTE       convert to xsd:date format
Order/Buyer         -              "Example Corp."
Order/Address       -              "Example Street 5, 45202 Cincinnati, OH"
Order/Total         HDR_OTOT       -
Item/Code           LIN_OCOD       -
Item/Description    LIN_ODES       -
Item/Price          LIN_OPRI       -
Item/Quantity       LIN_OQTY       1 if absent

Stylesheet development

Once you've successfully completed the map analysis, it is a simple task to write the corresponding XI rules and XSLT stylesheet. If you are not familiar with XI, it uses regular expressions (regex) to pre-process text files and convert them to XML. The result of this pre-processing is fed to an XSLT processor, which generates the final XML document.

The pre-processing is required because XSLT processors accept only XML documents as input.

Regex writing

The pre-processor works with a set of regexes (called a ruleset). Each regex is represented by a match element. The pre-processor attempts to match the source document against the regexes in the ruleset; when a regex matches, the pre-processor generates an XML element for the match and, inside it, one element for every group in the regex.

In the first versions of XI, you would specify the regex through a pattern attribute associated with the match element. However, this proved too complex with long regexes (it was difficult to see how the group elements related to the regex). To simplify coding, the latest version of XI lets you attach each portion of the regex directly to its group element. A new filler element contains the portions of the regex that are not in a group (that is, they do not generate XML tags). The pre-processor concatenates all portions to construct the regex for a given match element.

Listing 2 contains an XI ruleset for the document in Listing 1 and an XSLT stylesheet to test it. The two records (HDR and LIN) translate into two match elements. Since the fields have a fixed length, the regexes are simple, taking the form .{5}, which means "any character, repeated five times."


Listing 2. Writing the XI rules
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xi="http://ananas.org/2002/xi/rules"
                xmlns:o="http://ananas.org/2003/order"
                version="1.0">

<xi:rules version="1.0"
          defaultPrefix="o"
          targetNamespace="http://ananas.org/2003/order"
          defaultSpace="trim">
   <xi:ruleset name="export">
      <xi:match name="header">
         <xi:filler pattern="^HDR"/>
         <xi:group  pattern=".{5}" name="ORFF"/>
         <xi:group  pattern=".{8}" name="ODTE"/>
         <xi:filler pattern=".{8}"/>
         <xi:group  pattern=".{9}" name="OTOT"/>
         <xi:filler pattern="$"/>
      </xi:match>
      <xi:match name="line">
         <xi:filler pattern="^LIN"/>
         <xi:filler pattern=".{3}"/>
         <xi:group  pattern=".{5}"  name="ORFF"/>
         <xi:group  pattern=".{5}"  name="OCOD"/>
         <xi:filler pattern=".{5}"/>
         <xi:group  pattern=".{35}" name="ODES"/>
         <xi:group  pattern=".{2}"  name="OQTY"/>
         <xi:group  pattern=".{9}"  name="OPRI"/>
         <xi:filler pattern="$"/>
      </xi:match>
   </xi:ruleset>
</xi:rules>

<xsl:output method="xml" indent="yes"/>

<xsl:template match="@*|node()">
   <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
   </xsl:copy>
</xsl:template>

</xsl:stylesheet>

Fillers are appropriate for those fields that are not part of the SOAP request, such as the order creation date. Other fields are given an XML name that matches their names in the export file (for instance, HDR_ORFF becomes ORFF).
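To make the concatenation concrete, here is the header rule from Listing 2 again, annotated with the regex it presumably produces. The assumption (not confirmed by the XI description above) is that the pre-processor joins the portions in document order and turns each xi:group portion into a capturing group; the regex in the comment is hand-built on that basis.

```xml
<!-- Presumed result of concatenating the portions below (assumption:
     each xi:group portion becomes a capturing group):

        ^HDR(.{5})(.{8}).{8}(.{9})$

     Matched against the first record of Listing 1,
        HDRAZ5251029200309252003  1899.00
     the three groups capture "AZ525", "10292003" and "  1899.00".
     The trim setting on xi:rules then strips the leading spaces
     of the last value. -->
<xi:match name="header" xmlns:xi="http://ananas.org/2002/xi/rules">
   <xi:filler pattern="^HDR"/>               <!-- record type, no element -->
   <xi:group  pattern=".{5}" name="ORFF"/>   <!-- order reference -->
   <xi:group  pattern=".{8}" name="ODTE"/>   <!-- order date, MMDDYYYY -->
   <xi:filler pattern=".{8}"/>               <!-- second date, skipped -->
   <xi:group  pattern=".{9}" name="OTOT"/>   <!-- order total -->
   <xi:filler pattern="$"/>                  <!-- end of record -->
</xi:match>
```

The portions add up to 33 characters (3 + 5 + 8 + 8 + 9), which is exactly the length of a header record in Listing 1.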

Note the use of the defaultSpace attribute on the rules element. It automatically trims extra spaces from the field data. Fixed-length files always have a lot of unwanted spaces and it saves time to ask the XI pre-processor to remove them.

To test the regular expressions, the stylesheet simply copies its input unmodified. Since the input to the stylesheet is the output of the pre-processor, the stylesheet effectively outputs the XML document generated by the pre-processor. Admittedly, this is a roundabout way to save the output of the pre-processor.

The "copy all" stylesheet has only one template, which I copied directly from the XSLT recommendation.

Use the stylesheet in Listing 2 to validate the regex. It is pointless to forge ahead and try to build a SOAP request until the pre-processor can accurately extract all relevant information from the export file.
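For reference, running the copy-through test on Listing 1 should produce something close to the document below. This is a hand-written sketch, not captured output: the structure is inferred from the rule names in Listing 2 and from the XPaths used later in Listing 3 (an o:export root with o:header and o:line children), and the empty o:OQTY element on the first line assumes that a blank field is still emitted as an element after trimming.

```xml
<?xml version="1.0"?>
<!-- Inferred pre-processor output for Listing 1 (assumed layout) -->
<o:export xmlns:o="http://ananas.org/2003/order">
   <o:header>
      <o:ORFF>AZ525</o:ORFF>
      <o:ODTE>10292003</o:ODTE>
      <o:OTOT>1899.00</o:OTOT>
   </o:header>
   <o:header>
      <o:ORFF>AZ528</o:ORFF>
      <o:ODTE>10272003</o:ODTE>
      <o:OTOT>149.95</o:OTOT>
   </o:header>
   <o:line>
      <o:ORFF>AZ525</o:ORFF>
      <o:OCOD>THPRE</o:OCOD>
      <o:ODES>IBM ThinkPad R Series Economy</o:ODES>
      <o:OQTY/>   <!-- quantity is blank in the export file -->
      <o:OPRI>1099.00</o:OPRI>
   </o:line>
   <o:line>
      <o:ORFF>AZ525</o:ORFF>
      <o:OCOD>THPRV</o:OCOD>
      <o:ODES>IBM ThinkPad R Series Value</o:ODES>
      <o:OQTY>2</o:OQTY>
      <o:OPRI>400.00</o:OPRI>
   </o:line>
   <o:line>
      <o:ORFF>AZ528</o:ORFF>
      <o:OCOD>BKXBE</o:OCOD>
      <o:ODES>XML by Example</o:ODES>
      <o:OQTY>5</o:OQTY>
      <o:OPRI>29.99</o:OPRI>
   </o:line>
</o:export>
```

If a field comes out shifted or truncated in your own test, the width of one of the patterns in the ruleset is the first thing to check.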

Variations on the header/line relationship

Two variations on the relationship between headers and lines are common. The first variation has no header records -- the header information is repeated on every line. The Muenchian method, which groups records with xsl:key, is the best solution for processing these cases (see Resources).
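As a minimal sketch of that grouping technique, the stylesheet below assumes a pre-processed document that contains only o:line records, each carrying its own o:ORFF. The key name lines-by-order and the simplified output elements are illustrative choices, not part of the original project.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:o="http://ananas.org/2003/order"
                exclude-result-prefixes="o"
                version="1.0">

<!-- Index every line record by its order reference
     (hypothetical key name) -->
<xsl:key name="lines-by-order" match="o:line" use="o:ORFF"/>

<xsl:template match="o:export">
   <orders>
      <!-- Keep one line per order: a line qualifies when it is the
           first node the key returns for its own reference -->
      <xsl:for-each select="o:line[generate-id() =
                 generate-id(key('lines-by-order', o:ORFF)[1])]">
         <order>
            <reference><xsl:value-of select="o:ORFF"/></reference>
            <items>
               <!-- All lines that share this order reference -->
               <xsl:for-each select="key('lines-by-order', o:ORFF)">
                  <item>
                     <code><xsl:value-of select="o:OCOD"/></code>
                  </item>
               </xsl:for-each>
            </items>
         </order>
      </xsl:for-each>
   </orders>
</xsl:template>

</xsl:stylesheet>
```

The outer loop runs once per distinct order reference, which is what stands in for the missing header records.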

The other variation does not repeat the order reference in the line records, but uses their positions to associate headers and lines (a new header record marks the beginning of a new order; every line record is part of this order until the next header record). To process these files, you have to walk through them sideways, one record at a time (that is, from each record to the sibling that follows it), as illustrated in the tip "Recurse, not divide to conquer" (see Resources).
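One possible shape for such a sideways walk is sketched below. It is not necessarily the code from the tip: it assumes that o:header and o:line records appear as siblings in file order under o:export, and the mode name walk is a hypothetical choice.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:o="http://ananas.org/2003/order"
                exclude-result-prefixes="o"
                version="1.0">

<xsl:template match="o:export">
   <orders>
      <xsl:apply-templates select="o:header"/>
   </orders>
</xsl:template>

<xsl:template match="o:header">
   <order>
      <reference><xsl:value-of select="o:ORFF"/></reference>
      <items>
         <!-- Step sideways to the record right after this header,
              but only if that record is a line -->
         <xsl:apply-templates mode="walk"
              select="following-sibling::*[1][self::o:line]"/>
      </items>
   </order>
</xsl:template>

<xsl:template match="o:line" mode="walk">
   <item>
      <code><xsl:value-of select="o:OCOD"/></code>
   </item>
   <!-- Recurse on the next record; the walk ends by itself when the
        next record is a header or when there is no next record -->
   <xsl:apply-templates mode="walk"
        select="following-sibling::*[1][self::o:line]"/>
</xsl:template>

</xsl:stylesheet>
```

Because each step selects at most one node, the recursion visits every line exactly once and never crosses into the next order.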

SOAP encoding

SOAP encoding was covered at length in the previous installment. To summarize, the procedure call, the classes, and their variables become XML elements. The most complex aspects of the encoding are the arrays, because they need two special attributes (in SOAP 1.1):

  • An xsi:type attribute whose value must be soapenc:Array
  • A soapenc:arrayType attribute whose value is an array declaration written in a syntax similar to the Java language -- that is, the element type followed by the size of the array between square brackets (for instance, xsd:int[5] or po:Order[3])

The encoding of arrays changed in SOAP 1.2. Essentially, SOAP 1.2 dispenses with these two attributes and replaces them with optional soapenc:itemType and soapenc:arraySize attributes. Still, the server I used for testing (Axis 1.1) only recognizes SOAP 1.1. Furthermore, most SOAP servers currently deployed implement SOAP 1.1.
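The fragment below places the two array encodings side by side for a small array of integers. It is an illustration only: the arrays wrapper and the numbers and number element names are hypothetical, and the enc prefix is just a local choice for the SOAP 1.2 encoding namespace.

```xml
<?xml version="1.0"?>
<!-- Hypothetical wrapper so that both variants fit in one document -->
<arrays xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

   <!-- SOAP 1.1: xsi:type plus soapenc:arrayType -->
   <numbers xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
            xsi:type="soapenc:Array"
            soapenc:arrayType="xsd:int[3]">
      <number>3</number>
      <number>5</number>
      <number>8</number>
   </numbers>

   <!-- SOAP 1.2: optional itemType and arraySize attributes -->
   <numbers xmlns:enc="http://www.w3.org/2003/05/soap-encoding"
            enc:itemType="xsd:int"
            enc:arraySize="3">
      <number>3</number>
      <number>5</number>
      <number>8</number>
   </numbers>

</arrays>
```

In both versions the size in the attribute has to agree with the number of child elements, which is why a stylesheet that generates the array needs to count its items before writing them.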

Regardless of the version of SOAP used, the procedure call is wrapped in SOAP Envelope and Body elements.

XSLT coding

The XSLT stylesheet is responsible for formatting the SOAP request. Because the server expects an array of Order objects and each Order may contain one or more Item objects, there are two loops in the stylesheet. The first one loops over the HDR records, the second one over the LIN records.

Listing 3 shows the stylesheet (the XI ruleset is identical to Listing 2):


Listing 3. Final stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xi="http://ananas.org/2002/xi/rules"
                xmlns:o="http://ananas.org/2003/order"
                xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns:op="http://ananas.org/2003/order-processor"
                xmlns:order="http://ananas.org/2003/order-processor/order"
                version="1.0">

<xi:rules version="1.0"
          defaultPrefix="o"
          targetNamespace="http://ananas.org/2003/order"
          defaultSpace="trim">
   <xi:ruleset name="export">
      <xi:match name="header">
         <xi:filler pattern="^HDR"/>
         <xi:group  pattern=".{5}" name="ORFF"/>
         <xi:group  pattern=".{8}" name="ODTE"/>
         <xi:filler pattern=".{8}"/>
         <xi:group  pattern=".{9}" name="OTOT"/>
         <xi:filler pattern="$"/>
      </xi:match>
      <xi:match name="line">
         <xi:filler pattern="^LIN"/>
         <xi:filler pattern=".{3}"/>
         <xi:group  pattern=".{5}"  name="ORFF"/>
         <xi:group  pattern=".{5}"  name="OCOD"/>
         <xi:filler pattern=".{5}"/>
         <xi:group  pattern=".{35}" name="ODES"/>
         <xi:group  pattern=".{2}"  name="OQTY"/>
         <xi:group  pattern=".{9}"  name="OPRI"/>
         <xi:filler pattern="$"/>
      </xi:match>
   </xi:ruleset>
</xi:rules>

<xsl:output method="xml" indent="yes"/>

<xsl:template match="o:export">
<soapenv:Envelope>
 <soapenv:Body>
  <op:processOrder 
           soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
   <orders xsi:type="soapenc:Array" 
           soapenc:arrayType="order:Order[{count(o:header)}]">
    <xsl:for-each select="o:header">
     <order>
      <reference><xsl:apply-templates select="o:ORFF"/></reference>
      <date><xsl:apply-templates select="o:ODTE"/></date>
      <buyer>Example Corp.</buyer>
      <address>Example Street 5, 45202 Cincinnati, OH</address>
      <xsl:variable name="orff" select="o:ORFF"/>
      <items xsi:type="soapenc:Array"
           soapenc:arrayType="order:Item[{count(../o:line[o:ORFF = $orff])}]">
       <xsl:for-each select="../o:line[o:ORFF = $orff]">
        <item>
         <code><xsl:apply-templates select="o:OCOD"/></code>
         <description><xsl:apply-templates select="o:ODES"/></description>
         <price><xsl:apply-templates select="o:OPRI"/></price>
         <quantity><xsl:apply-templates select="o:OQTY"/></quantity>
        </item>
       </xsl:for-each>
      </items>
      <total><xsl:apply-templates select="o:OTOT"/></total>
     </order>
    </xsl:for-each>
   </orders>
  </op:processOrder>
 </soapenv:Body>
</soapenv:Envelope>
</xsl:template>

<xsl:template match="o:ODTE">
<xsl:value-of select="substring(.,5,4)"/>
<xsl:text>-</xsl:text>
<xsl:value-of select="substring(.,1,2)"/>
<xsl:text>-</xsl:text>
<xsl:value-of select="substring(.,3,2)"/>
</xsl:template>

<xsl:template match="o:OQTY[not(text())]">
<xsl:text>1</xsl:text>
</xsl:template>

</xsl:stylesheet>

The main template matches o:export, which is the root element generated by the pre-processor. It creates elements for the SOAP envelope, body, and procedure call. Next comes the loop over the header records. The stylesheet computes the array size with the count() function. The fields (reference, date, buyer, address, and total) are written according to the mapping table.

A second loop writes the items array. To associate the lines with their order, the stylesheet uses the order number, as shown in the condition o:line[o:ORFF = $orff].

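Note the use of a variable ($orff) to hold the header reference. Inside the predicate, the context node is the line record, so a bare o:ORFF would refer to the line's own reference rather than to the header's. The test o:line[o:ORFF = o:ORFF] would compare each line's reference with itself and select every line.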
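A variable is not the only way to reach the header from inside the predicate. As an alternative sketch, the XSLT current() function returns the node that xsl:for-each is processing, whatever the context node of the predicate happens to be. The stylesheet below uses it to count the lines of each order; the orders and order output elements and their attributes are hypothetical, chosen to keep the example short.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:o="http://ananas.org/2003/order"
                exclude-result-prefixes="o"
                version="1.0">

<xsl:template match="o:export">
   <orders>
      <xsl:for-each select="o:header">
         <!-- Inside the predicate, o:ORFF belongs to the o:line being
              tested, while current()/o:ORFF belongs to the o:header
              selected by xsl:for-each -->
         <order reference="{o:ORFF}"
                lines="{count(../o:line[o:ORFF = current()/o:ORFF])}"/>
      </xsl:for-each>
   </orders>
</xsl:template>

</xsl:stylesheet>
```

Assuming the pre-processor output has one o:header per HDR record and one o:line per LIN record, the data in Listing 1 should give two lines for AZ525 and one for AZ528. The variable remains the better choice when the same set of lines is needed twice, as it is in Listing 3.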

Finally, note the use of the xsl:apply-templates instruction to retrieve the value of fields. Most fields are processed by the built-in template rules, which copy the text of the field unchanged, but the date and quantity fields have special templates to reformat the date and to supply missing information, respectively.
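Putting the pieces together, the request that Listing 3 should build from Listing 1 looks roughly like the document below. It is derived by hand from the stylesheet, not captured from a run, and it assumes that the pre-processor emits an empty o:OQTY element for the blank quantity. Indentation will vary by processor, and a real processor would also copy the unused namespace declarations (xi, o, xsd), which are omitted here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope
      xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
      xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:op="http://ananas.org/2003/order-processor"
      xmlns:order="http://ananas.org/2003/order-processor/order">
 <soapenv:Body>
  <op:processOrder
        soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
   <orders xsi:type="soapenc:Array" soapenc:arrayType="order:Order[2]">
    <order>
     <reference>AZ525</reference>
     <date>2003-10-29</date>   <!-- 10292003 rearranged by o:ODTE -->
     <buyer>Example Corp.</buyer>
     <address>Example Street 5, 45202 Cincinnati, OH</address>
     <items xsi:type="soapenc:Array" soapenc:arrayType="order:Item[2]">
      <item>
       <code>THPRE</code>
       <description>IBM ThinkPad R Series Economy</description>
       <price>1099.00</price>
       <quantity>1</quantity>   <!-- supplied by the o:OQTY template -->
      </item>
      <item>
       <code>THPRV</code>
       <description>IBM ThinkPad R Series Value</description>
       <price>400.00</price>
       <quantity>2</quantity>
      </item>
     </items>
     <total>1899.00</total>
    </order>
    <order>
     <reference>AZ528</reference>
     <date>2003-10-27</date>
     <buyer>Example Corp.</buyer>
     <address>Example Street 5, 45202 Cincinnati, OH</address>
     <items xsi:type="soapenc:Array" soapenc:arrayType="order:Item[1]">
      <item>
       <code>BKXBE</code>
       <description>XML by Example</description>
       <price>29.99</price>
       <quantity>5</quantity>
      </item>
     </items>
     <total>149.95</total>
    </order>
   </orders>
  </op:processOrder>
 </soapenv:Body>
</soapenv:Envelope>
```

The totals give a quick sanity check on the mapping: 1099.00 + 2 x 400.00 = 1899.00 for the first order and 5 x 29.99 = 149.95 for the second.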


Another reason to like SOAP?

As this set of articles illustrates, SOAP has a unique feature when compared to other RPC standards: a textual representation in XML. Because so many tools can read and write XML, this representation makes SOAP more accessible than other RPC standards.

Lightweight XML clients, such as the client introduced in this series, harness this increased flexibility to deploy B2B e-commerce to companies that could not afford custom RPC and Java programming.


Resources
