Skip to main content

Code generation in XSLT 2.0, Part 1: Generate SQL with XSLT 2.0

Eliminate the hard work while building better code

Jack Herrington (jherr@pobox.com), Editor-in-Chief, Code Generation Network
An engineer with with more than 20 years of experience, Jack Herrington is currently Editor-in-Chief of the Code Generation Network. He is the author of Code Generation in Action. You can contact him at jack_d_herrington@codegeneration.net.

Summary:  Learn to use the cutting-edge features of XSLT 2.0 and generate PHP code from an abstract data model. In Part 1 of this two-part series, Jack Herrington uses a robust multilevel transform technique to show you how to take a simple model of a target database and generate the SQL for the database server.

Date:  08 Feb 2005
Level:  Intermediate
Activity:  3257 views

You're the engineering team lead on a new PHP project. The requirements are fulfilled, and the preliminary data model is approximately 150 tables. Now, it's time to think about the schedule. You estimate a day to write each database access class and unit test it, so 150 tables at one table per day is 150 days. Assuming about a 20-day work month, will you have the almost eight months required just to write the database access layer?

Probably not, so you need a way to reduce the time required. You might use a persistence framework, but even if such a framework cut the time in half you would still need four months. You might write a generic library, but such libraries can be complex to use, error prone, and difficult to debug.

Or you might write a sample class or two, then build the rest with a code generator. Worst-case scenario, it could take up to a month to write the generator itself. But after that, the generation of all the code would take only minutes. In addition, any bugs that occur when the SQL schema doesn't match the database access code are all but eliminated because the generator builds both the SQL schema and the access code.

Can such a plan work? It can, and this article shows you how to do it and explains how your application will be easier to maintain, more robust, and enjoyable to write. Part 1 of this two-part series lays down the basics of code generation and walks through building the portion of the generator that creates SQL for a given model. Part 2 will drill deeper into XSL and show you how to finish the generator by walking through the code that creates the PHP.

Code generation basics

Before you begin, you'll need a generator that takes an abstract model of the database and builds both an SQL schema and PHP database access classes. Figure 1 illustrates this model and shows the potential for generating other database access code for other technologies, such as Java™ code.


Figure 1. Basic information flow for the generator
Basic information flow for the generator

The dashed boxes in Figure 1 (SQL model and Database access model) indicate temporary models created from the abstract model. The dotted section (Java) indicates what you could potentially generate.

The next question is which technology to use. You can write the generator using Java code, Perl, Python, or Ruby. Given the simplicity of the task, you might get by with a template engine such as Velocity or XSLT. A template engine like XSLT 2.0 is a good place to start because it has a lot of power on its own and you can embed it within Java code to extend its reach even farther.

XSLT 2.0

XSLT 2.0 offers expanded language capabilities to developers seeking a template engine for code generation. This article exploits these features in the context of building a robust generator for both SQL and PHP that works off an abstract table definition. In XSLT 1.x, only single-level transformations were possible; but with XSLT 2.0, you can build intermediate models of the code to be generated. This functionality makes understanding and maintaining the generator easier, and also makes it possible to repurpose the generator for other language targets.

As an XML translation technology, XSLT can translate XML into either XML or text. (I use both of these facilities in this code generator project.) XSLT 2.0 is a vast improvement over XSLT 1.x, and in this article I use three new features available in the second revision of the standard:

  • Functions: You can now define custom functions. This is a big improvement over the first version of XSLT, which used a cumbersome template syntax for this purpose.
  • Temporary trees: In the first version of XSLT, you had to operate on the input XML tree. XSLT 2.0 can build temporary trees in memory that you can use to drive other templates.
  • Result documents: Now, a single template can generate multiple output files. I use this feature to generate the SQL file and the individual PHP files.

Now, I'll show you how to build the generator.


Build the generator

Start with the input to the generator. Listing 1 provides an example of a table definition for a simple book database.


Listing 1. The input abstract table definition
	
<?xml version="1.0" encoding="UTF-8"?>
<tables>
    <table name="Author">
        <field name="first" type="text"/>
        <field name="last" type="text"/>
    </table>
    <table name="Publisher">
        <field name="name" type="text"/>
        <field name="last" type="text"/>
    </table>
    <table name="Book">
        <field name="name" type="text"/>
        <field name="author" type="id"/>
        <field name="publisher" type="id"/>
    </table>
</tables>

This simple XML script defines three database tables with their fields: Author, Publisher, and Book. The input for your full PHP application will be much larger and probably more complex, but this code serves as a reasonable starting set.

You can build a generator that turns this XML into SQL or PHP in a single pass, but I don't recommend doing so. The templates become complex and hard to maintain. Today's best practice for code generation is to have multiple levels of models, each with an increasing level of target specificity. This practice reduces the complexity level of any one transformation, and is similar to using multiple tiers in a Web server to reduce overall complexity and encourage reuse.

Referring back to Figure 1, you see how the abstract model generates one model for the SQL and another for the PHP through the Database access model. Those models, stored in temporary trees, are then used to build the code. Figure 2 tracks the Author table through the transformations in the generator.


Figure 2. The evolution of a single SQL table
The evolution of a single SQL table

The top box in Figure 2 shows the original abstract table model. The middle box is the SQL model built from that abstract model. The bottom box shows the resulting SQL output. As you can see, the SQL model specifies point-by-point what should be in the final SQL file. The code template's job is simply to translate the XML into SQL, which makes it possible to generate code for a variety of databases from a single model. Now, take a look at the XSL.


Generate the SQL

Listing 2 shows the first part of the main XSL stylesheet for the generator project.


Listing 2. The beginning of the main generator XSL sheet
	
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:gen="http://www.codegeneration.net/" version="2.0">

    <!-- Output specifications -->
    <xsl:output method="text"/>
    <xsl:output method="xml" name="debug-xml" indent="yes"/>
    <xsl:output method="text" name="sql" indent="yes"/>

    <!-- SQL Templates -->
    <xsl:include href="gen2-sql.xsl" />
    <xsl:include href="gen2-sql-model.xsl" />
    <xsl:include href="gen2-queries.xsl" />

    <!-- Database Access Templates -->
    <xsl:include href="gen2-dba.xsl" />
    <xsl:include href="gen2-php.xsl" />

    <!-- The generator main entry point -->
    <xsl:template match="/">
        <!-- Create the SQL model -->
<xsl:text>Building SQL model
</xsl:text>
        <xsl:variable name="sql-model">
            <xsl:call-template name="gen-sql-model">
                <xsl:with-param name="model" select="."/>
            </xsl:call-template>
        </xsl:variable>
        
        <!-- Dump it out for debugging -->
<xsl:text>Dumping SQL model
</xsl:text>
        <xsl:result-document href="db/gen-tables.xml" format="debug-xml">
            <xsl:copy-of select="$sql-model"/>
        </xsl:result-document>

        <!-- Generate the SQL from the SQL model -->
<xsl:text>Generating SQL
</xsl:text>
        <xsl:result-document href="db/gen-tables.sql" format="sql" >
            <xsl:apply-templates mode="sql" select="$sql-model/sql" />
        <xsl:result-document>

The important parts of this script are where I create the sql-model variable, which holds the SQL model, and the result-document tag, which creates the SQL file using the contents of the sql-model variable. These tags show two valuable additions to XSLT 2.0: the ability to create temporary trees with the xsl:variable tag and the ability to create multiple output files with the xsl:result-document tag.

Now, I'll show you the template that generates the SQL model from the abstract table model. Listing 3 shows the gen-sql-model template.


Listing 3. The SQL model generator
	
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:gen="http://www.codegeneration.net/" version="2.0">
    
    <!-- Builds the SQL model from the original data model -->
    <xsl:template name="gen-sql-model">
        <xsl:param name="model"/>
        <sql>
            <xsl:for-each select="$model/tables/table">
                <create name="{lower-case(@name)}"
   primary-key="{concat(lower-case(@name),'_id')}">
                    <field name="{concat(lower-case(@name),'_id')}"
    type="{gen:model-type-to-sql('integer')}"/>
                    <xsl:for-each select="field">
                        <field name="{lower-case(@name)}"
   type="{gen:model-type-to-sql(@type)}"/>
                    </xsl:for-each>
                </create>
            </xsl:for-each>
        </sql>
    </xsl:template>
</xsl:stylesheet>

This template is fairly simple: It takes the original model and builds some new tags for the SQL model. Note the call to the gen:model-type-to-sql function. This call isn't part of the standard XSLT function set -- it's a custom function that I define in Listing 4.


Listing 4. The model-to-SQL-type transform function
	
    <xsl:function name="gen:model-type-to-sql">
        <xsl:param name="type"/>
        <xsl:choose>
            <xsl:when test="$type eq 'text'">TEXT NOT NULL</xsl:when>
            <xsl:when test="$type eq 'id'">INTEGER NOT NULL</xsl:when>
            <xsl:when test="$type eq 'integer'">INTEGER NOT NULL</xsl:when>
        </xsl:choose>
    </xsl:function>

This ability to build new XPath functions is new with XSLT 2.0. Previously, you had to use the xsl:call-template syntax, which can become cumbersome as the number of parameters grows. Having templates act as functions also pollutes the template namespace.

The final step in generating the SQL is to format your SQL model as SQL. You do this using the two templates shown in Listing 5.


Listing 5. SQL generation templates
	
    <!-- Template for SQL create tags -->
    <xsl:template match="create" mode="sql">
DROP TABLE IF EXISTS <xsl:value-of select="@name" />;
CREATE <xsl:value-of select="@name" /> (
<xsl:apply-templates mode="sql" select="field" />
PRIMARY KEY ( <xsl:value-of select="@primary-key" /> )
        );
    </xsl:template>

    <!-- Template for SQL field tags -->
    <xsl:template match="field" mode="sql">
<xsl:value-of select="concat(@name,' ',@type,',')" /><xsl:text>
</xsl:text>
    </xsl:template>

That's really all it takes. I have one template that responds to the create tag in the SQL model. This template then uses the apply-templates tag to handle its child field tags, which are formatted by the second template.

Refer back to Figure 2. Notice how the structure of the XML looks like the structure of the SQL. A create tag includes several field tags. As with SQL, a create command includes several field parameters. Structuring the model the same way as the code can greatly simplify the code templates.


Next time

In this article, I have introduced you to some of the new features available in XSLT 2.0 -- in particular, the expanded language capabilities and the ability to build intermediate models of code to be generated -- and illustrated how you can use them to build a robust generator for SQL that works off an abstract table definition.

In Part 2, I will show you how to generate the PHP portion of the code that will provide database access for the Web server. I will also introduce a few more new features in XSLT 2.0.


Resources

About the author

An engineer with with more than 20 years of experience, Jack Herrington is currently Editor-in-Chief of the Code Generation Network. He is the author of Code Generation in Action. You can contact him at jack_d_herrington@codegeneration.net.

Comments (Undergoing maintenance)



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=XML
ArticleID=35122
ArticleTitle=Code generation in XSLT 2.0, Part 1: Generate SQL with XSLT 2.0
publish-date=02082005
author1-email=jherr@pobox.com
author1-email-cc=

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Special offers