Automate XML file updates, Part 1: XML process introduction and conversion stylesheet creation

A methodology using XSLT, Apache Ant, and Java SE

Tom Coppedge (tcoppedg@us.ibm.com), developerWorks Software Engineer, IBM
Tom Coppedge has been a member of the developerWorks design team since the site was launched in 1999. Tom's focus includes XML & XSLT strategy, information architecture, and site design. He joined IBM in 1988 after receiving a degree in Information Systems & Operations Management from the University of North Carolina at Greensboro.

Summary:  This is the first part of a tutorial series that describes a method for automating updates to a library of XML files so that they all conform to an updated XML schema. In Part 1, you learn the steps in the entire process and then create an XSLT stylesheet to update the XML files. In Part 2, you learn how to install, configure, and run Apache Ant and Java SE to iteratively transform each of your XML files based on the updates specified in your XSLT stylesheet.

Date:  17 Aug 2006
Level:  Intermediate

Create and test XSLT conversion templates

Once you list the schema updates and the non-schema-controlled data updates, and determine which need to be reflected in your XML instance documents, you can begin writing what I'll call a conversion stylesheet. You'll be converting XML instance documents into new XML instance documents that conform to the updated schema. The process is simple if you take it one step, or XSLT template, at a time.

The template writing process

For each schema update, or each non-schema-controlled data update that requires an XML file update:

  1. Write a corresponding XSLT template to reflect that update in the XML instance documents. The template will likely begin by matching (<xsl:template match="">) a given construct; the rest of the template will be dedicated to outputting XML as intended by the item in the change list. If your development process includes tracking numbers associated with one or more of these changes, you might consider adding a comment to the template with the tracking number to ensure you maintain a history trail.
  2. Test the XSLT template. Simply create a one-template stylesheet and transform a representative sample of XML instance documents that will test the functions of the template. If the previous schema definition for a given construct allowed for wide variation in structure and data values, be sure your tests cover all of those cases.
  3. Revise your XSLT template to fix any problems that testing exposes.
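
Step 2 can be sketched with the Java SE transformation API. The example below is only an illustration under stated assumptions: the class name `OneTemplateTest`, the `title` and `heading` elements, and the sample inputs are all hypothetical, and the code uses Java text blocks (Java 15 or later). It wraps a single template in a stylesheet, runs it against a few representative samples, and compares each result with the output you expect.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class OneTemplateTest {

    // Hypothetical one-template stylesheet: renames <title> to <heading>.
    static final String STYLESHEET = """
        <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:output method="xml" omit-xml-declaration="yes"/>
          <xsl:template match="title">
            <xsl:element name="heading">
              <xsl:value-of select="."/>
            </xsl:element>
          </xsl:template>
        </xsl:stylesheet>
        """;

    /** Applies the stylesheet to one XML document and returns the serialized result. */
    static String transform(String stylesheet, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString().trim();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Representative samples, each mapped to the output it should produce.
        Map<String, String> cases = new LinkedHashMap<>();
        cases.put("<title>Hello</title>", "<heading>Hello</heading>");
        cases.put("<title/>", "<heading/>");

        for (Map.Entry<String, String> c : cases.entrySet()) {
            String actual = transform(STYLESHEET, c.getKey());
            String verdict = actual.equals(c.getValue()) ? "PASS " : "FAIL ";
            System.out.println(verdict + c.getKey() + " -> " + actual);
        }
    }
}
```

Keeping the samples in a map makes it easy to add one entry for each structural variation the previous schema allowed.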

A tale of two templates

The scenario and templates described below illustrate some common conversion stylesheet tasks, such as adding, removing, and changing constructs and data values within an XML file.

Imagine that you maintain a schema that describes employee information. The current employee element within the schema is described by a sequence of the following elements, along with their min/max occurrence indicators:

  • Name (min 1, max 1)
  • Department (min 1, max 1)
  • Location (min 1, max 1)
  • Project (min 1, max unbounded)
  • Phone (min 1, max 1)

Now imagine that, for various reasons, you or someone else will have to update the schema (and therefore the XML instance documents that are based on it) as follows:

  • Replace the dept element with a new dept attribute on the employee element.
  • Replace the old dept values with new ones.
  • Delete the location element.
  • Add a manager element (based on dept number).
  • Replace the '123' area code with '333'.

All of these changes require you to update the existing XML instance documents so that they validate against the new schema.
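
To make the change list concrete, here is a rough sketch of one employee document before and after conversion, together with a quick XPath check that the converted document reflects the structural items in the list. Everything in it is an assumption made for illustration: the class name `ChangeListCheck`, the lowercase element names `name` and `project`, and the sample values. The new dept number and manager name follow the mapping used in Listing 1. The check is not a substitute for validating against the new schema.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class ChangeListCheck {

    // Hypothetical instance document that conforms to the current schema.
    static final String BEFORE = """
        <employee>
          <name>Pat Example</name>
          <dept>123</dept>
          <location>Raleigh</location>
          <project>Alpha</project>
          <project>Beta</project>
          <phone>123-555-0100</phone>
        </employee>
        """;

    // The same employee as the updated schema would expect it.
    static final String AFTER = """
        <employee dept="02">
          <name>Pat Example</name>
          <project>Alpha</project>
          <project>Beta</project>
          <phone>333-555-0100</phone>
          <manager>Mr. Bravo</manager>
        </employee>
        """;

    /** Evaluates a boolean XPath expression against an XML string. */
    static boolean holds(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Boolean) xpath.evaluate(expression,
                    new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Bad XPath: " + expression, e);
        }
    }

    /** True when the document reflects the structural items in the change list. */
    static boolean reflectsChanges(String xml) {
        return holds(xml, "boolean(/employee/@dept)")       // dept is now an attribute
            && holds(xml, "not(/employee/dept)")            // dept element is gone
            && holds(xml, "not(/employee/location)")        // location element is gone
            && holds(xml, "count(/employee/manager) = 1")   // manager element was added
            && holds(xml, "not(starts-with(/employee/phone, '123'))"); // old area code replaced
    }

    public static void main(String[] args) {
        System.out.println("before: " + reflectsChanges(BEFORE));
        System.out.println("after:  " + reflectsChanges(AFTER));
    }
}
```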

You can divide the requested changes into two templates: one to handle the area code updates for the phone numbers, and one to handle the rest. In real life, the developer responsible for maintaining the phone information (and the corresponding section of the schema) may be separate from those who maintain the other information, so it would be logical to divide the work this way.

Whatever the justification, here are two templates that implement the required updates. The templates are coded with some assumptions to make them relatively brief. (As any XSL coder will tell you, there are many ways to code a template.) This isn't an XSLT tutorial, so I won't go into any detail about the templates themselves. (See Resources for very popular XSLT articles and tutorials on developerWorks.) The comments in each listing, however, mark the lines that contain or introduce the primary functions.


Listing 1. Example conversion template: Updates to the employee element
<!-- Change employee element as follows:
       1. Replace dept element with new dept attribute.
       2. Replace old dept numbers with new ones.
       3. Delete location element.
       4. Add manager element.
  -->
  <xsl:template match="employee">
    <xsl:element name="employee">
      <!-- Add dept attribute; source from dept element. -->
      <xsl:attribute name="dept">
         <xsl:choose>
         <!-- Replace old dept numbers with new values -->
            <xsl:when test="dept='012'">
               <xsl:text>01</xsl:text>
            </xsl:when>
            <xsl:when test="dept='123'">
               <xsl:text>02</xsl:text>
            </xsl:when>
            <xsl:when test="dept='456'">
               <xsl:text>03</xsl:text>
            </xsl:when>
            <xsl:when test="dept='789'">
               <xsl:text>04</xsl:text>
            </xsl:when>
            <xsl:otherwise>
               <xsl:text>05</xsl:text>
            </xsl:otherwise>
         </xsl:choose>
      </xsl:attribute>
      <xsl:for-each select="*">
        <xsl:choose>
          <!-- Delete dept and location elements. -->
          <xsl:when test="name()='dept' or name()='location'"/>
          <xsl:otherwise>
            <xsl:choose>
              <xsl:when test="name()='phone'">
                <!-- The match="phone" template will be called -->
                <xsl:apply-templates select="."/>
              </xsl:when>
              <xsl:otherwise>
                <xsl:element name="{name()}">
                  <xsl:value-of select="."/>
                </xsl:element>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
      <!-- Add manager element -->
      <xsl:element name="manager">
        <xsl:choose>
          <xsl:when test="dept='012'">
            <xsl:text>Ms. Alpha</xsl:text>
          </xsl:when>
          <xsl:when test="dept='123'">
            <xsl:text>Mr. Bravo</xsl:text>
          </xsl:when>
          <xsl:when test="dept='456'">
            <xsl:text>Mr. Charlie</xsl:text>
          </xsl:when>
          <xsl:when test="dept='789'">
            <xsl:text>Ms. Delta</xsl:text>
          </xsl:when>
          <xsl:otherwise>
             <xsl:text>Ms. Parker</xsl:text>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:element>
    </xsl:element>
  </xsl:template>


Listing 2. Example conversion template: Updates to the phone element
<!-- Change phone element as follows:
        Replace '123' area code with '333'.
        Assumes the value begins with the area code.
  -->
  <xsl:template match="phone">
    <xsl:element name="phone">
      <xsl:choose>
        <!-- Change area code 123 to 333, but keep the exchange+number -->
        <xsl:when test="starts-with(.,'123')">
          <xsl:value-of select="concat('333',substring-after(.,'123'))"/>
        </xsl:when>
        <!-- Leave numbers with any other area code unchanged -->
        <xsl:otherwise>
          <xsl:value-of select="."/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:element>
  </xsl:template>
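
Phone numbers are a good place to test edge cases, because a bare substring-after(.,'123') returns an empty string when '123' is absent and discards anything that precedes '123' when it appears later in the number. The sketch below exercises a phone template that wraps the replacement in a starts-with test, which is one way to let other numbers pass through unchanged. The class name `PhoneTemplateTest` and the sample numbers are hypothetical, the hyphenated number format with a leading area code is an assumption, and the code uses Java text blocks (Java 15 or later).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PhoneTemplateTest {

    // Phone template with a starts-with guard. Assumption: the area code
    // comes first in the value, as in 123-555-0100.
    static final String STYLESHEET = """
        <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:output method="xml" omit-xml-declaration="yes"/>
          <xsl:template match="phone">
            <xsl:element name="phone">
              <xsl:choose>
                <xsl:when test="starts-with(.,'123')">
                  <xsl:value-of select="concat('333',substring-after(.,'123'))"/>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:value-of select="."/>
                </xsl:otherwise>
              </xsl:choose>
            </xsl:element>
          </xsl:template>
        </xsl:stylesheet>
        """;

    /** Runs the phone template against a single <phone> element. */
    static String convert(String phoneXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(phoneXml)), new StreamResult(out));
            return out.toString().trim();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<phone>123-555-0100</phone>",  // old area code: should change
            "<phone>919-555-0100</phone>",  // different area code: should not change
            "<phone>919-123-0100</phone>"   // 123 appears, but not as the area code
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + convert(sample));
        }
    }
}
```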


Additional benefits of this process

By addressing each change with a separate template, you gain the ability to divide the work among several XSLT developers. Each developer can code and test the assigned list of updates independently, thereby adding some flexibility to the schedule and work to be done. If you are able to divide up the work, try to do it so your developers won't work on the same constructs. If overlap is unavoidable, consider combining the work to be done on that construct and assign one person to the entire template.

Another side benefit to this approach is that it lends itself well to teaching XSLT coding to novices. When teaching XSLT concepts, it's useful to offer relatively small, self-contained exercises. If you're in the position to teach or mentor a co-worker, consider assigning them some of the easier changes to be made. Look for a wide variety of situations in which they will have to match a construct and then get the value of, copy, or restructure that construct or its child, parent, or sibling constructs. Due to the nature of these types of changes, it's likely that your student will have the opportunity to think through several possible cases and use conditional logic, which is always good practice.
