
Planning to upgrade XSLT 1.0 to 2.0, Part 2: Five strategies for changing from XSLT 1.0 to 2.0

A top-down view of upgrading



Level: Intermediate

David Marston (David_Marston@us.ibm.com), Software Engineer, IBM
Joanne Tong (joannet@ca.ibm.com), Software Developer, IBM

14 Nov 2006

XSLT 2.0 has features that allow a gradual upgrade of 1.0 stylesheets. However, some situations call for an overhaul, so that the whole architecture can be reviewed and improved. Should you overhaul or try the gradual approach? This article presents some relevant design issues to help you decide. You also get some guidance on the organizational characteristics that indicate success or difficulty for each upgrade strategy. To read the other articles in this series, check the Planning to upgrade overview page.

About this series

XSLT 2.0, the latest specification released by the W3C, is a language for transforming XML documents. It includes numerous new features, with some specifically designed to address shortcomings in XSLT 1.0. This collection of articles provides a high-level overview and an in-depth look at XSLT 2.0 from the point of view of an XSLT 1.0 user wishing to fix old problems, learn new techniques, and discover what to look out for. We provide examples derived from common applications and practical suggestions for those who wish to upgrade. To help you begin to use XSLT 2.0, migration techniques will be provided.





Thinking strategically about an XSLT upgrade

You might have heard that XSLT 2.0 is closely compatible with 1.0 and that it adds many new features. Your 1.0 stylesheets probably use only a fraction of the 1.0 features and would use an even smaller fraction of the 2.0 feature set. As you plan an upgrade, concentrate on the old features you use and the new features that motivate you. This article presents five options, each an idealized view of upgrading (or of not upgrading, because the do-nothing option is also covered). Consider two kinds of decision factors when you choose an option: organizational capability factors, and impact factors that derive from the 2.0 features that appeal to you.





Things you need to know

The 2.0 version of the XSLT spec introduces the term stylesheet modules to encompass the units (typically files) that are imported or included into the collective entity called the stylesheet. Modules offer stronger separation than the separation between templates, mainly because XSLT declarations are often scoped to a single module. If, after reading this article and thinking about what would work best, you plan to upgrade incrementally, you can use modules to separate old XSLT code from the new (in addition to whatever modularity you already have).

To ease transitioning 1.0 stylesheets to 2.0, vendors can supply (at their option) a 2.0 processor that supports backwards compatibility (BC). In a stylesheet, this feature is controlled by a version attribute, which can appear on any element. (Recall that in 1.0, the version attribute can only appear on the top-level xsl:stylesheet element.) If an element has this attribute with a value of 1.0, then the BC feature is enabled for the element itself and the sub-tree within it. It is conceptually like an island of code that will execute as it would have in 1.0, with very few exceptions. Keep in mind that this feature is enabled based on the document structure of the stylesheet, not based on the flow of execution when running a transformation. For example, if an xsl:call-template instruction has an attribute version="1.0", but the named template that is called during execution has an attribute version="2.0", then BC is not enabled during the execution of that called template.
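As a minimal sketch of this lexical scoping, consider the stylesheet below. The element names (order, item) and the template name show-items are invented for illustration, and the sketch assumes a 2.0 processor that supports BC. It relies on one well-known difference between the versions: with no separator attribute, xsl:value-of outputs only the first selected item under BC, but all selected items under 2.0 rules.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- 1.0 island: BC applies to this template and to everything written inside it. -->
  <xsl:template match="order" version="1.0">
    <first-item>
      <!-- Under BC, only the first item element contributes to the output. -->
      <xsl:value-of select="item"/>
    </first-item>
    <!-- The call is written inside the island, but the called template is not. -->
    <xsl:call-template name="show-items"/>
  </xsl:template>

  <!-- No version attribute here, so the stylesheet-level version="2.0" is in effect. -->
  <xsl:template name="show-items">
    <all-items>
      <!-- 2.0 behavior: every item is output, separated by single spaces. -->
      <xsl:value-of select="item"/>
    </all-items>
  </xsl:template>

</xsl:stylesheet>
```

Although show-items is reached only through the island, its own position in the stylesheet decides which rules apply to it.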

Similarly, an imported or included stylesheet module might have an attribute version="1.0" and the BC feature is then enabled for that module, even though the primary stylesheet is a 2.0 stylesheet. When an instruction is executed by a 2.0 processor in BC mode, its result might not be identical to the result produced by a 1.0 processor. Refer to the compatibility appendices of XSLT 2.0 (Appendix J), F&O (Appendix D), and XPath 2.0 (Appendix I) for a list of discrepancies between a 1.0 processor and a 2.0 processor in BC mode. (See the developerWorks Standards list in Resources for links.)
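The module-level case might look like the following sketch. The file names main.xsl and legacy-format.xsl and the element names are hypothetical, and a 2.0 processor with BC support is assumed. The principal module is written for 2.0:

```xml
<!-- main.xsl: principal stylesheet module, written for 2.0 -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The imported module keeps its own version setting. -->
  <xsl:import href="legacy-format.xsl"/>

  <xsl:template match="/">
    <report>
      <xsl:apply-templates/>
    </report>
  </xsl:template>

</xsl:stylesheet>
```

The imported module is existing 1.0 code, left unchanged:

```xml
<!-- legacy-format.xsl: version="1.0" enables BC for this module only -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="price">
    <price>
      <xsl:value-of select="format-number(., '0.00')"/>
    </price>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the old code in its own file makes the version boundary easy to find, and easy to remove once that module is eventually rewritten.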

Other mechanisms can present different code to XSLT 1.0 and XSLT 2.0 processors, but they work at a finer level of detail. The concepts in this section should suffice for strategic planning of an upgrade.





Choosing the option that works for you

Each option has advantages and disadvantages. This section provides a brief description of each option and a long-form catalog of benefits and the price you might have to pay. A few of the options claim "you can apply locally standard tweaks or code cleanup" as a benefit, which means that your existing list of code-cleanup items is still useful. The cleanup might be local rules about coding style or policies about which modules have the exclusive right to contain certain declarations.

Option 1: Full rewrite to 2.0

Choosing this option is like starting over. To take full advantage of XSLT 2.0 syntax and features, you need to plan, possibly change the whole design, and probably rewrite most of your stylesheet modules. If your current stylesheets have a good modular structure, you might choose to retain that structure. More generally, if you already have a clean architecture for 1.0 in place, you might save some planning time, depending on which 2.0 features you wish to exploit.

Pros
  • You can take full advantage of 2.0 syntax and features because you are basically starting from scratch.
  • You do not need to find a 2.0 processor that supports backwards compatibility.
  • You can make your code more readable by using 2.0 features. See Part 1 of this series for more information.
  • You can make your code more straightforward because it does not involve backwards compatibility.
  • The result is portable across 2.0 processors.

Cons
  • You must study all 2.0 features that might be relevant. (Start now!)
  • Much up-front analysis and planning is required.
  • This option has the longest wait for payoff to be realized.
  • The design is hard to work back to a 1.0 version, should you need to cut short your development effort.

Option 2: Convert most of the stylesheet to 2.0 but keep 1.0 islands

If a complete rewrite to 2.0 seems too drastic, then the next best option is to convert most of your stylesheet modules to 2.0 and reuse some 1.0 modules. You could make the 1.0 islands at the level of a single template or even deeper. This option still requires some planning, particularly about the places where you will use new 2.0 features, and investigating the reusability of old modules in your new stylesheet structures.

Pros
  • This option is the best approach for a gradual transition, if you eventually want to eliminate BC usage.
  • This option allows a narrow focus on backwards compatibility and a favorite feature.
  • You can make your code more readable. See Part 1 of this series for examples of readability improvements.
  • You can enhance modularity and reusability of your code.
  • You can apply locally standard tweaks or code cleanup as you go.

Cons
  • The result is vulnerable to mistakes in local versioning and tweaks.
  • This option requires a processor with backwards compatibility support.
  • The code might become harder to read around the islands (though better elsewhere).
  • Much up-front planning is needed to get net improvement in the code.
  • This approach might prolong an awkward mix of versions if you avoid touching the brittle parts.
  • It might consume more coding time on branching code for alternative versions.

Option 3: Institute 2.0 islands or rewrite modules that need overhaul

If specific 2.0 features are desirable, then you can try to patch 2.0 islands into your 1.0 stylesheet and run it using a 2.0 processor that supports backwards compatibility. If the processor generates unexpected results, then tweak your 1.0 modules until they work. This option lets you capitalize on the 2.0 features you want with minimal planning or redesign. Naturally, this works best if the new features you want are easily isolated. See Narrowing the options later in this article for some thoughts about which features are more likely to be isolated.
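A 2.0 island might look like the sketch below, in which a single template is rewritten to use 2.0 grouping while the rest of the stylesheet stays at 1.0. The element and attribute names (catalog, section, product, category) are invented for illustration, and the stylesheet is assumed to run on a 2.0 processor with BC support.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Existing 1.0 code, left alone; it runs under BC on the 2.0 processor. -->
  <xsl:template match="catalog">
    <catalog-summary>
      <xsl:apply-templates select="section"/>
    </catalog-summary>
  </xsl:template>

  <!-- 2.0 island: only this template was rewritten, to use xsl:for-each-group. -->
  <xsl:template match="section" version="2.0">
    <xsl:for-each-group select="product" group-by="@category">
      <category name="{current-grouping-key()}"
                count="{count(current-group())}"/>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Grouping is a reasonable candidate for an island here because the change stays inside one template. A feature that touches many templates would spread islands across the stylesheet.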

Pros
  • Changes occur gradually, as driven by specific needs or a big payoff for one particular enhancement.
  • You can learn new features incrementally.
  • You can focus on performance bottlenecks.
  • You can focus on enhancing modularity and reusability.
  • This is an easier option when you have to maintain compatibility with pure 1.0 processors.
  • You can apply locally standard tweaks or code cleanup as you go.

Cons
  • Choosing this option usually results in a slow uptake of new 2.0 features and their associated benefits.
  • This is a patchy approach that might cause errors due to the wrong version being in effect.
  • This option requires a processor with backwards compatibility support.
  • Use of islands might hurt readability overall (though the islands themselves might be better).
  • The number and spread of islands can be unpredictable when attempting to localize a change.
  • It is harder to capitalize on new features that have wide impact, such as schema-awareness.
  • You can still have problems when a 2.0-with-BC processor runs 1.0 code, and debugging them requires knowing both versions.

Option 4: Change the stylesheet version to 2.0 and debug from there

If you need to change your existing 1.0 stylesheets into 2.0 stylesheets quickly, then why not just set the stylesheet version attribute to 2.0 and see if you get the expected output from a 2.0 processor? If that doesn't work, then start your debugging by back-tracking the result of each template and tweak until it works. Tweaking might involve fixing it with the appropriate 2.0 instruction, or (if you have the BC feature) you can just set the version attribute to 1.0 and see what happens. Of course, this option shows much less concern for exploiting new 2.0 features.
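The two kinds of tweak can be sketched as follows. The book, article, and author elements are hypothetical, and the sketch assumes that the original 1.0 code relied on xsl:value-of outputting only the first selected node, which is one of the documented behavior differences.

```xml
<!-- Step 1: at first, the only change is version="1.0" to version="2.0" here. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Tweak A (a 2.0 fix): state the old first-node behavior explicitly. -->
  <xsl:template match="book">
    <author>
      <xsl:value-of select="author[1]"/>
    </author>
  </xsl:template>

  <!-- Tweak B (needs BC support): set this template back to 1.0 and retest. -->
  <xsl:template match="article" version="1.0">
    <author>
      <xsl:value-of select="author"/>
    </author>
  </xsl:template>

</xsl:stylesheet>
```

Tweak A leaves no BC dependency behind, while tweak B is quicker to apply but adds an island that someone has to remember later.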

Pros
  • You can take an incremental approach to conversion.
  • This might be the fastest conversion if your 1.0 stylesheet did not run afoul of the compatibility exceptions. A good test set assures that the conversion is successful.
  • If you're lucky, you do not need to find a 2.0 processor that supports backwards compatibility.
  • This option is easy to initiate and involves low mental effort at the start because it does not require planning, other than knowing which new features you want to use.
  • This option allows a narrow focus on backwards compatibility and a favorite feature.
  • You can apply locally standard tweaks or code cleanup as you go.

Cons
  • This option requires a processor with backwards compatibility support when the chosen tweak to a problem is to set the version back to 1.0.
  • The result is vulnerable to mistakes in local versioning and tweaks, thus prolonging the debug phase.
  • The resulting stylesheet can be hard to read due to backwards compatibility and tweaks.
  • Without initial planning, it is harder to capitalize on new features that have wide impact, such as schema-awareness.
  • It is hard to predict how long the conversion will take. In this crisis-driven approach, you might have a big bug list at the start.

Option 5: Stay at 1.0

You might ask: why switch at all when your 1.0 stylesheets work as expected? You don't have to learn 2.0 syntax, don't have to plan for a 2.0 stylesheet structure, and don't have to install a 2.0 processor. This might be a viable option for you, but remember that 2.0 can do some things that a 1.0 processor simply cannot do, and 2.0 addresses many shortcomings of 1.0.

Pros
  • No work and no new processor needed (until you can't hire a person to write 1.0-only code).
  • You do not have to read six new specifications.
  • If you are using an open-source product, especially with locally written extensions, and you feel comfortable having access to the source, you can stay comfortable.
  • 1.0 processors are mature and well debugged.

Cons
  • You can't take advantage of new 2.0 features (see Part 1).
  • Turnaround time for processor bug fixing can be slower because XSLT product developers might not invest as much in 1.0 anymore.
  • The old techniques in XSLT 1.0 can become more irritating over time when you know that a readable 2.0 replacement exists.

Blending the options

The preceding options are deliberately stated with a single-minded focus for each, but you can mix these approaches to suit your needs. For example, you can draw up a grand plan as in options 1 and 2, but implement changes using the try-it-and-see approach of option 4. If you don't like the current modularization of your stylesheets, you might want to refactor all the modules for functional purposes, then refactor further to break out modules that will use a designated XSLT version, as in options 2 and 3. Another possibility is to take one motivating feature, such as stylesheet functions (see Part 1 of this series for more information), find all the places where you could use it, then decide whether you can isolate those places as in option 3 or force a complete refactoring as in option 1.
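To make the stylesheet-function example concrete, here is a minimal sketch of the kind of code you would be looking for places to use. The my: namespace URI, the function name my:line-total, and the order and line elements are all hypothetical. In 1.0, the same calculation would typically be a named template called with xsl:call-template and xsl:with-param.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <!-- A stylesheet function: callable from any XPath expression in the stylesheet. -->
  <xsl:function name="my:line-total" as="xs:double">
    <xsl:param name="line" as="element(line)"/>
    <xsl:sequence select="number($line/@qty) * number($line/@price)"/>
  </xsl:function>

  <xsl:template match="order">
    <total>
      <!-- The function is called once per line; sum() adds up the results. -->
      <xsl:value-of select="sum(for $l in line return my:line-total($l))"/>
    </total>
  </xsl:template>

</xsl:stylesheet>
```

If calls like this would land in only a few templates, the option 3 approach of isolating them is plausible. If they would appear throughout the code base, that points toward the refactoring of option 1.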

How about XQuery?

If your goal is to extract data from a datastore and present it in a tagged or formatted way, without heavy restructuring, you might find that XQuery meets those needs. Other developerWorks articles (see Resources) discuss XQuery 1.0 and compare it to XSLT.





Ask yourself these questions before you decide

Apart from the technical factors that can push you in the direction of one option or another, you also have factors of corporate culture to consider.

Why are you doing this?

Upgrading to 2.0 is often driven by the attractiveness of a 2.0 feature that is simply not available in 1.0. Is this why you are investigating an upgrade of your stylesheets to 2.0? If so, then what are the 2.0 features that are most attractive to you? Which additional features are nice to have and should be employed as opportunities arise?

How does your organization manage development?

How well does your corporate culture do planning? How much risk-taking can your organization accept? Is anyone adept at scanning the whole code base, looking for pervasive issues or opportunities? Are less expensive tweaks and hacks preferable over doing it right (but expensively) the first time? Are resources available for maintenance (bug fixing)? Is the bug-fixing mentality dominant, and has it worked well in the past?

At what stage are resources available?

How much time do you have up front for learning new specs? To plan for the transition? To redesign your stylesheet structure? By contrast, how much time is available to rewrite stylesheet modules? Is there a deadline for having the desired 2.0 features in place? If you can't commit to the up-front cost, then does your organization prefer an incremental transition (like option 3) or just a quick fix now?

Is a full rewrite too much for your organization to handle?

Does option 1, a full rewrite to 2.0, seem too drastic for your organization? On the other hand, do you have an architect begging for the chance to do a clean-slate design for your stylesheets? Is a full rewrite considered the last resort, to be used only if no other option works? If you might do a clean-slate design anyway, you might as well put the expanded feature set of 2.0 at your disposal.

Are you concerned with interoperability across XSLT processors?

Determine up front if your stylesheets can be used by multiple processors, which might come from different vendors, be on multiple platforms, and even be processors that conform to different versions of XSLT.

Can you get a 2.0 processor that supports backwards compatibility?

If you do not have access to a 2.0 processor with BC support for your operating environment, then you cannot choose options 2 and 3. Also, option 4 will be more limited, because every bug fix has to use the 2.0 approach.





Narrowing the options

The previous sections presented several lists of pros and cons for you to weigh. Following are some shortcuts that might help you eliminate options. To use these shortcuts, you need to know which specific 2.0 features you are most likely to use:

  • If you do not have access to a 2.0 processor that supports backwards compatibility, then you must choose option 1, 5, or a subset of option 4, where every bug fix must be a 2.0 solution.
  • If you're interested in schema-awareness, using new data types, choosing collations, enhanced type matching (see Resources for access to a link to section 2.5.4 of XPath 2.0), or manipulating namespace prefixes, then option 1 is preferable, and option 2 might also work. These are the features that are most likely to have wide impact.
  • If you're interested in tunnel parameters, stylesheet functions, forExpr in XPath (subset of FLWOR), next-match, or enhancements to modes, then options 1 and 2 are preferable, and option 3 might suffice. It is fine to scan your whole code base for opportunities to use these features, but it is not essential.
  • If you're interested in ifExpr, rangeExpr, regular-expression functions and instructions, enhancements to temporary trees, better control of output characters, or multiple result documents, then option 3 might suffice. These are features whose impact is often narrow. The last three features are described in Part 1 of this series.
  • If your stylesheet is relatively shallow or simple, then option 4 might suffice as a quick fix option.
  • If you mainly hope to use grouping, sequences, quantified expressions (see Resources for access to a link to section 3.9 of XPath 2.0), or the option to designate a named template as the initial template, then that fact alone does not indicate that a particular option is favorable. Options 1, 2, and 3 are likely candidates, but you need to take a deeper look to see whether the impact on your stylesheets is narrow or wide.
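To illustrate why some of these features tend to have narrow impact, the sketch below uses ifExpr, a regular-expression function, and rangeExpr inside a single template without affecting any other part of a stylesheet. The part element and its weight and code attributes are invented for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="part">
    <part>
      <!-- ifExpr: a conditional inside one XPath expression -->
      <xsl:attribute name="size"
          select="if (@weight &gt; 10) then 'heavy' else 'light'"/>
      <!-- Regular-expression function: remove every character that is not a digit -->
      <xsl:attribute name="code" select="replace(@code, '[^0-9]', '')"/>
      <!-- rangeExpr: one slot element for each integer from 1 to 3 -->
      <xsl:for-each select="1 to 3">
        <slot n="{.}"/>
      </xsl:for-each>
    </part>
  </xsl:template>

</xsl:stylesheet>
```

Each of these expressions replaces a few lines of 1.0 code in place, which is what makes them good candidates for the islands of option 3.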





Plan on more planning

Once you answer the organizational assessment questions and review the 2.0 features, use the decision factors presented to set an initial course for your upgrade. Of course, this is a high-level strategic view, and your plan should include revisiting the questions as you go along.



Resources

Learn

Get products and technologies
  • IBM trial software: Build your next development project with trial software available for download directly from developerWorks.


Discuss


About the authors

David Marston has worked with XML technologies since late 1998, particularly on standards conformance. Over his 25+ years in the computing business, he has been involved with all aspects of software development. He is a graduate of Dartmouth College and a member of the ACM. He is on the Next-Generation Web team at IBM Research. You can contact him at David_Marston@us.ibm.com.


Joanne Tong is a developer working on IBM's XSLT processors in the IBM Toronto lab. She is currently an editor of the W3C XSLT 2.0 and XQuery 1.0 Serialization specification and is an active member of the XSL working group. You can contact her at joannet@ca.ibm.com.



