This column is currently focused on modeling, UML, and XML. More specifically, I am exploring the use of UML modeling for XML development and in particular how XSLT stylesheets can help through automatic derivation.
As XML has become a common feature in development projects, many developers have grown interested in integrating XML with the rest of their development. While many organizations still rely on ad-hoc tools for XML development, the trend is towards adopting the same methodology for XML -- or at least one common set of tools -- that is already in use for other development needs, such as Java technology, databases, or the Web.
As discussed in the previous column, a model is a simplified description of a system that can assist in calculations and predictions. In the context of this article, the system is always an XML vocabulary.
Figure 1 illustrates the modeling cycle as a continuum of models. The first models are drawn on the whiteboard (or on a sheet of paper) and tend to be informal. At this stage, the goal is to give all participants (users, developers, designers) a chance to express themselves freely.
Figure 1. A continuum of models
The next step is to draw a UML model (or several models if the vocabulary is complex). The UML model is more refined and formal, but it remains synthetic and readable because it is intended primarily as a communication device between the team members.
The last model is the XML schema, which is the most precise of them all. Its goal is to allow the parser to validate XML documents against the vocabulary definition so it can forego readability in favor of precision.
The major difference between all these models is their goal: from informal communication to precise, formal validation by the parser. The difference is not in the nature of the models (simplified description of an XML vocabulary), but in the level of assistance each model provides.
If you think of a continuum of models, from the least precise to the most formal, it makes sense to look into automatic derivation -- the process of generating one model automatically from an earlier model. Automatic derivation works well only if the two models are equally descriptive, which is at odds with the idea that some models are more descriptive than others. Addressing the different levels of description in the models will be the topic of the next column; here, I will focus on derivation.
XML Metadata Interchange (XMI)
You will recall from the previous installment that I implemented automatic derivation through XMI and XSLT. Assuming that you are already familiar with XML schema (if you're not, see Resources), I will introduce XML Metadata Interchange (XMI) in this section.
Vocabularies and compatibility
XMI is a sophisticated specification (version 1.2 is over 400 pages), so, in this article, I will limit myself to the bare minimum description needed for automatic derivation.
XMI does not specify an XML vocabulary, but rather an algorithm that generates vocabularies for metamodels. In other words, XMI does not define Class, Attribute, Association, or other tags as you would expect. Instead, XMI specifies how to create tags for concepts in a metamodel. I know that's a lot of models to work with, but bear with me -- it will become clearer in a moment.
Therefore XMI is not so much a vocabulary as a framework. Unfortunately, this means that no two tools interpret this framework in the same way. Differences also exist between different versions of the same tool: Rational Rose originally supported XMI through an add-on developed by Unisys. The latest versions of Rational XDE have built-in support for XMI, but it's a slightly different variant. The differences are not necessarily significant, but they may cause incompatibilities. In practice, it makes sense to target your stylesheets to the one or two tools that are used in your community and not worry about the rest.
In this article, rather than adopting one specific version of XMI, I will stick with the examples published by the OMG. Although no tool is directly compatible with the samples, this is good middle ground. Adapting them to your tool of choice will not be difficult.
Although it mostly specifies an algorithm, XMI also defines a few tags and attributes. You will need the following:
- XMI is always the root element. It must have an xmi.version attribute (valid versions are 1.0, 1.1, 1.2, and 2.0).
- XMI.header is a placeholder for information on the model. Its most important children are XMI.documentation and XMI.metamodel.
- XMI.documentation holds end-user information as these child elements (whose names are self-explanatory): XMI.owner, XMI.contact, XMI.longDescription, XMI.shortDescription, XMI.exporter, XMI.exporterVersion, XMI.exporterID, and XMI.notice.
- XMI.metamodel records the metamodel to which the XMI algorithm has been applied -- in this case, the UML metamodel (XMI works with other metamodels such as the Meta-Object Facility, MOF, also published by the OMG).
- XMI.content contains the actual model.
- xmi.id and xmi.idref are attributes for encoding links: xmi.id is an element identifier that must be unique; xmi.idref is a reference to an element by its identifier.
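To make the link attributes concrete, here is a minimal sketch. The fragment is hypothetical: the D.1 identifier, the String data type, and the UML:StructuralFeature.type wrapper are assumptions written in the style of the OMG samples, and a given tool may encode the same link differently.

```xml
<!-- Hypothetical fragment: the identifiers and the String type are made up -->
<UML:DataType xmi.id="D.1" name="String"/>
<UML:Attribute xmi.id="A.1" name="name">
  <UML:StructuralFeature.type>
    <!-- A reference, not a definition: it points at D.1 above -->
    <UML:DataType xmi.idref="D.1"/>
  </UML:StructuralFeature.type>
</UML:Attribute>
```

One way a stylesheet could follow such a link is with an XSLT key that indexes every element by its xmi.id. The key name by-id is an arbitrary choice for this sketch.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:UML="org.omg/UML/1.4"
                version="1.0">
  <xsl:output method="text"/>
  <!-- Index every element that carries an xmi.id -->
  <xsl:key name="by-id" match="*[@xmi.id]" use="@xmi.id"/>
  <!-- For each attribute, write its name and the name of the referenced type -->
  <xsl:template match="UML:Attribute">
    <xsl:value-of select="@name"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of
        select="key('by-id', UML:StructuralFeature.type/*/@xmi.idref)/@name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
  <!-- Ignore all other text in the XMI document -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

The key keeps the lookup independent of where the referenced element sits in the document, which matters because XMI files produced by different tools place definitions in different spots.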
The UML metamodel is a model that describes the UML language -- specifically, it describes classes, attributes, associations, packages, collaborations, use cases, actors, messages, states, and all the other concepts in the UML language. For coherence, the metamodel is written in UML.
The prefix "meta" indicates that the metamodel describes a model of a model. Likewise, XML is a metalanguage because it's a language that describes languages.
The UML metamodel is published in the UML specification. More specifically, XMI uses the "UML Model Interchange" described in chapter 5 of the UML specification (see Resources).
Be warned that the UML metamodel is rather large and intimidating. I can only give you a flavor for it in this article. Figure 2 is an excerpt from the metamodel that describes the class, one of the central concepts in class diagrams.
Figure 2. The metamodel for a class
In the metamodel, the class concept is modeled as the metaclass Class which inherits from the abstract metaclass Classifier. Classifier is the parent for Class, Interface, and DataType (the latter two are not represented in Figure 2). The inheritance chain continues to: GeneralizableElement, which represents all concepts that can be generalized (inherited from); ModelElement, which represents all abstractions in the model (such as namespace, constraints, and class); and finally Element, the topmost metaclass. Each of these metaclasses has attributes from which Class inherits.
A composition exists between Classifier and Feature, which is the parent of StructuralFeature. Attribute is derived from StructuralFeature.
Confused by the metamodel? Try to forget it's a metamodel, try to forget it's about UML, and look at it as an ordinary model. Figure 2 is simply pointing out the concept of Class, which is a highly specialized element that is related to interface and data type (through its inheritance from Classifier). Class has a name, visibility, and many more attributes. Finally, there is an association between Class and Attribute.
So Figure 2 formally expresses that a class has a name, visibility, and other properties, and that it may have attributes. Indeed, Figure 2 is the definition of a UML class. If you find this confusing, it's probably because the definition itself is written in UML!
I have intentionally simplified Figure 2 to ignore namespace, constraint, stereotype, inheritance, and many other aspects of what makes a class a class. Trust me, they are included in the complete UML metamodel but they are not useful for this article.
Why bother with the metamodel? Because when you feed it to the XMI algorithm, you get an XML vocabulary for UML. As an example, Listing 1 is an XMI representation of Figure 3 (using the variation of XMI illustrated in the specification -- see above):
Figure 3. A UML model for an address
Listing 1. The address exported to XMI
<?xml version="1.0"?>
<XMI xmi.version="1.2" xmlns:UML="org.omg/UML/1.4">
  <XMI.header>
    <XMI.documentation>
      <XMI.exporter>ananas.org stylesheet</XMI.exporter>
    </XMI.documentation>
    <XMI.metamodel xmi.name="UML" xmi.version="1.4"/>
  </XMI.header>
  <XMI.content>
    <UML:Model xmi.id="M.1" name="address" visibility="public"
               isSpecification="false" isRoot="false"
               isLeaf="false" isAbstract="false">
      <UML:Namespace.ownedElement>
        <UML:Class xmi.id="C.1" name="address" visibility="public"
                   isSpecification="false" namespace="M.1" isRoot="true"
                   isLeaf="true" isAbstract="false" isActive="false">
          <UML:Classifier.feature>
            <UML:Attribute xmi.id="A.1" name="name" visibility="private"
                           isSpecification="false" ownerScope="instance"/>
            <UML:Attribute xmi.id="A.2" name="street" visibility="private"
                           isSpecification="false" ownerScope="instance"/>
            <UML:Attribute xmi.id="A.3" name="zip" visibility="private"
                           isSpecification="false" ownerScope="instance"/>
            <UML:Attribute xmi.id="A.4" name="region" visibility="private"
                           isSpecification="false" ownerScope="instance"/>
            <UML:Attribute xmi.id="A.5" name="city" visibility="private"
                           isSpecification="false" ownerScope="instance"/>
            <UML:Attribute xmi.id="A.6" name="country" visibility="private"
                           isSpecification="false" ownerScope="instance"/>
          </UML:Classifier.feature>
        </UML:Class>
      </UML:Namespace.ownedElement>
    </UML:Model>
  </XMI.content>
</XMI>
Notice how the XML elements and attributes in Listing 1 match the classes and attributes in Figure 2. You've now come full circle: The XMI document is a direct representation of the UML metamodel because the UML metamodel is a description of UML itself.
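The correspondence can be spelled out on a fragment of Listing 1. The comments below attribute each XML name to a metaclass as read from the UML 1.4 metamodel; treat them as a reading aid and check them against the specification rather than as a normative mapping.

```xml
<!-- UML:Class: the metaclass Class.
     name, visibility, isSpecification: attributes inherited from ModelElement.
     isRoot, isLeaf, isAbstract: attributes inherited from GeneralizableElement.
     isActive: attribute declared on Class itself.
     namespace: link to the owning model, by its xmi.id (M.1 in Listing 1). -->
<UML:Class xmi.id="C.1" name="address" visibility="public"
           isSpecification="false" namespace="M.1" isRoot="true"
           isLeaf="true" isAbstract="false" isActive="false">
  <!-- UML:Classifier.feature: the composition between Classifier and Feature -->
  <UML:Classifier.feature>
    <!-- UML:Attribute: the metaclass Attribute, reached through
         StructuralFeature and Feature; ownerScope is inherited from Feature -->
    <UML:Attribute xmi.id="A.1" name="name" visibility="private"
                   isSpecification="false" ownerScope="instance"/>
  </UML:Classifier.feature>
</UML:Class>
```

The pattern is regular: a metaclass becomes an element, a metaclass attribute becomes an XML attribute, and a composition becomes a wrapper element named after the owning metaclass and the role.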
A portion of the UML metamodel deals with the visual representation of concepts -- where to draw the concepts on screen. I don't process that information in my stylesheets for two reasons:
- It is not needed when deriving an XML schema from the UML model.
- It is extremely difficult to produce the visual output when deriving the UML model from an XML schema. A more reasonable option is to open the model in a modeling tool and take a few minutes to prepare the visual representation of the model on screen. The hardest work (getting the definitions right) has been done by the stylesheet.
Now that you have the key to reading XMI files, it's easy to map XMI tags to their XML schema equivalents. One possible mapping is:
- UML:Model becomes xs:schema; its target namespace is derived from the model name.
- UML:Class becomes a global XML element declaration (xs:element).
- UML:Attribute becomes a local XML element declaration (xs:element).
Listing 2 is an XSLT stylesheet that implements the mapping:
Listing 2. XML schema derivation
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:UML="org.omg/UML/1.4"
                exclude-result-prefixes="UML"
                version="1.0">
  <xsl:output indent="yes"/>
  <!-- Process XMI 1.2 documents only -->
  <xsl:template match="XMI[@xmi.version='1.2']">
    <xsl:apply-templates select="XMI.content/UML:Model"/>
  </xsl:template>
  <!-- Any other XMI version stops the transformation -->
  <xsl:template match="XMI">
    <xsl:message terminate="yes">Unknown XMI version</xsl:message>
  </xsl:template>
  <!-- UML:Model becomes xs:schema -->
  <xsl:template match="UML:Model">
    <xs:schema targetNamespace="http://psol.com/uml/{@name}">
      <xsl:apply-templates/>
    </xs:schema>
  </xsl:template>
  <!-- UML:Class becomes a global element declaration -->
  <xsl:template match="UML:Namespace.ownedElement/UML:Class">
    <xs:element name="{@name}">
      <xs:complexType>
        <xs:sequence>
          <xsl:apply-templates/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xsl:template>
  <!-- UML:Attribute becomes a local element declaration -->
  <xsl:template match="UML:Attribute">
    <xs:element name="{@name}" type="xs:string"/>
  </xsl:template>
  <!-- Normalize text nodes so whitespace-only text from the XMI is dropped -->
  <xsl:template match="text()">
    <xsl:value-of select="normalize-space(.)"/>
  </xsl:template>
</xsl:stylesheet>
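For reference, applying Listing 2 to Listing 1 should yield a schema along the following lines. This is worked out by hand from the templates rather than copied from a tool run; the XML declaration, the order of attributes on xs:schema, and the indentation depend on the XSLT processor.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://psol.com/uml/address">
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="street" type="xs:string"/>
        <xs:element name="zip" type="xs:string"/>
        <xs:element name="region" type="xs:string"/>
        <xs:element name="city" type="xs:string"/>
        <xs:element name="country" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The UML namespace does not appear in the result because the stylesheet lists it in exclude-result-prefixes.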
The stylesheet in Listing 2 is deliberately limited: it supports only a small subset of the UML metamodel and does very little error checking. It ignores packages, interfaces, associations, and more. You can enrich the stylesheet to support those concepts by extending the process I've shown you so far: Study the appropriate portion of the UML metamodel, define a mapping to XML schema, and implement it.
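As an illustration of that process, the sketch below extends the class mapping with one more property. It assumes Listing 2 has been saved as listing2.xsl (a hypothetical file name), and the decision to map the UML isAbstract flag to the schema's abstract attribute is an assumption made for the example, not a mapping taken from this column.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:UML="org.omg/UML/1.4"
                exclude-result-prefixes="UML"
                version="1.0">
  <!-- Hypothetical file name: Listing 2 saved next to this stylesheet -->
  <xsl:import href="listing2.xsl"/>
  <!-- Overrides the UML:Class template of Listing 2 (import precedence) -->
  <xsl:template match="UML:Namespace.ownedElement/UML:Class">
    <xs:element name="{@name}">
      <!-- Assumed mapping: carry the isAbstract flag over to the schema -->
      <xsl:if test="@isAbstract = 'true'">
        <xsl:attribute name="abstract">true</xsl:attribute>
      </xsl:if>
      <xs:complexType>
        <xs:sequence>
          <xsl:apply-templates/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xsl:template>
</xsl:stylesheet>
```

Using xsl:import keeps the original stylesheet untouched: templates in the importing stylesheet take precedence over the imported ones, so each extension can live in its own small file.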
Listing 2 is very handy if you follow the normal modeling workflow: from the least detailed to the most detailed model. Frequently, however, you will find that an XML schema already exists and that it should serve as the starting point for your work. It would be tedious to recreate the UML model by hand, so a stylesheet that implements the reverse mapping is useful. Listing 3 is an example:
Listing 3. Reverse derivation (from XML schema to UML)
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:UML="org.omg/UML/1.4"
                exclude-result-prefixes="xs"
                version="1.0">
  <xsl:output indent="yes"/>
  <!-- xs:schema becomes the XMI envelope and the UML:Model -->
  <xsl:template match="xs:schema">
    <XMI xmi.version="1.2">
      <XMI.header>
        <XMI.documentation>
          <XMI.exporter>dW simple stylesheet</XMI.exporter>
        </XMI.documentation>
        <XMI.metamodel xmi.name="UML" xmi.version="1.4"/>
      </XMI.header>
      <XMI.content>
        <UML:Model xmi.id="{generate-id()}"
                   name="{substring-after(@targetNamespace,'http://psol.com/uml/')}"
                   visibility="public" isSpecification="false"
                   isRoot="false" isLeaf="false" isAbstract="false">
          <UML:Namespace.ownedElement>
            <xsl:apply-templates/>
          </UML:Namespace.ownedElement>
        </UML:Model>
      </XMI.content>
    </XMI>
  </xsl:template>
  <!-- An xs:element outside a sequence (a global declaration) becomes a UML:Class -->
  <xsl:template match="xs:element">
    <UML:Class xmi.id="{generate-id()}" name="{@name}"
               visibility="public" isSpecification="false" isRoot="true"
               isLeaf="true" isAbstract="false" isActive="false">
      <xsl:apply-templates/>
    </UML:Class>
  </xsl:template>
  <xsl:template match="xs:sequence">
    <UML:Classifier.feature>
      <xsl:apply-templates/>
    </UML:Classifier.feature>
  </xsl:template>
  <!-- An xs:element inside a sequence (a local declaration) becomes a UML:Attribute -->
  <xsl:template match="xs:sequence/xs:element">
    <UML:Attribute xmi.id="{generate-id(.)}" name="{@name}"
                   visibility="private" isSpecification="false"
                   ownerScope="instance"/>
  </xsl:template>
</xsl:stylesheet>
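One detail to watch when running Listing 3 on real schemas: it has no rule for text nodes, so XSLT's built-in rule copies them to the output. Whitespace is harmless, but text inside xs:documentation would end up in the middle of the XMI document. A minimal sketch of a fix, assuming Listing 3 has been saved as listing3.xsl (a hypothetical file name):

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- Hypothetical file name: Listing 3 saved next to this stylesheet -->
  <xsl:import href="listing3.xsl"/>
  <!-- Discard every text node from the schema; Listing 3 reads only attributes -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Listing 2 takes a slightly different route and normalizes text instead of discarding it; either approach keeps stray text out of the result.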
Towards more comprehensive stylesheets
To say that the stylesheets I have introduced in this article are simplistic would be an understatement. They are fewer than 50 lines long and deal with a small subset of the UML metamodel. Real-world stylesheets recognize many more UML concepts, and typically weigh in at 500 lines or more. My goal in this installment has been to introduce the concepts behind automatic model derivation:
- Models themselves (UML, XML schema) can be represented as data; the model that describes their structure is called the metamodel.
- You can establish a mapping between the UML metamodel and an XML schema.
- You can implement the mapping in XSLT stylesheets.
- UML and XML schema are just different representations of the same reality; they differ because they serve different goals.
In this article, I have had to make simplifications. If you try to extend the stylesheets from Listings 2 and 3, you may encounter two problems:
- The UML model may not be specific enough (because it's a high-level view) to enable meaningful derivation of XML schema (low-level, detailed model).
- You may find more than one sensible mapping between the UML metamodel and XML schema.
Solving these two problems is the topic of my next two column installments.
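To see why more than one mapping can be sensible, consider representing UML attributes as XML attributes rather than as local elements. The sketch below is only an illustration of that alternative, not the mapping this column adopts; it assumes Listing 2 has been saved as listing2.xsl (a hypothetical file name).

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:UML="org.omg/UML/1.4"
                exclude-result-prefixes="UML"
                version="1.0">
  <!-- Hypothetical file name: Listing 2 saved next to this stylesheet -->
  <xsl:import href="listing2.xsl"/>
  <!-- The class no longer wraps its features in xs:sequence, because
       xs:attribute declarations sit directly under xs:complexType -->
  <xsl:template match="UML:Namespace.ownedElement/UML:Class">
    <xs:element name="{@name}">
      <xs:complexType>
        <xsl:apply-templates/>
      </xs:complexType>
    </xs:element>
  </xsl:template>
  <!-- Alternative mapping: UML:Attribute becomes an XML attribute declaration -->
  <xsl:template match="UML:Attribute">
    <xs:attribute name="{@name}" type="xs:string"/>
  </xsl:template>
</xsl:stylesheet>
```

Both stylesheets start from the same UML model and both produce a valid schema, yet the resulting XML documents look quite different. Nothing in the model of Figure 3 says which one the designer intended.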
Resources
- Participate in the discussion forum.
- Read the UML specification, which includes the complete UML metamodel, at the Object Management Group site.
- While you're there, check out the XMI specification, which includes samples of UML data. Although no tools follow these samples faithfully, they are a good middle ground from which it is easy to adapt to the specifics of each tool.
- Review the previous installment of this column, "UML, XMI, and code generation, Part 1" (developerWorks, March 2004), in which Benoît Marchal discusses the relationship between UML and XML schema.
- If you're new to XML schema, the tutorial "XML Schema Infoset Model" (developerWorks, November 2003) will bring you up to speed.
- Explore IBM Rational Rose, the leading UML modeling product. You'll also find plenty of Rational and UML resources on the Rational section of developerWorks.
- Find hundreds more XML resources on the developerWorks XML zone, including previous installments of Benoît Marchal's Working XML column.
- Find out how you can become an IBM Certified Developer in XML and related technologies.

Benoît Marchal is a Belgian consultant. He is the author of XML by Example, Second Edition and other XML books. You can contact him at bmarchal@pineapplesoft.com or through his personal site at www.marchal.com.





