In the last two installments of the Working XML column, I explored modeling and, more specifically, the use of UML modeling for XML application development. Modeling is an important aspect of XML development. After all, XML is a structured language, so structuring and organizing information is the raison d'être of XML. This series of articles focuses on how to combine the XML-specific modeling languages with UML, the industry standard for software development.
When it comes to modeling, I have explained my bias at length in the previous two articles: Briefly, I believe that the most reasonable strategy is to view modeling as a continuous activity that starts with an open discussion in front of a whiteboard (or a piece of paper in smaller offices) and ends with the production of a W3C XML Schema or a WSDL file.
At each step, the model is refined and made more formal. Bearing in mind that a model is a simplified representation of a system that is created to assist in understanding the system, it seems logical that your model will become more complete as your understanding of the system deepens.
Therefore, I believe it is crucial that you use tools to support your modeling activity -- tools that will help you refine the models. I have witnessed several nightmare projects in which modeling failed, and the one thing they all had in common was a lack of integration between the modeling and development activities. Fortunately, the remedy is as simple as deploying tools that integrate the two activities.
Ideally, you want any changes in the model to be instantaneously reported in the implementation, but it is seldom possible to achieve this ideal. For example, Java code generators may generate and update skeleton implementations but they cannot update the algorithms. With XML, you can achieve the ideal because you are always working with data models. The UML model and the XML Schema are both data models, though they use different languages and usually offer different levels of detail.
In my last article, I introduced two stylesheets: The first converts UML models saved in XMI into XML Schema; the second performs the reverse operation, generating an XMI file from an XML Schema.
The transformation relies on a mapping from the UML metamodel -- the data model into which UML models are saved -- to an XML Schema. Every UML concept (class, attribute, association, and more) is represented in the UML metamodel, so if you establish that a UML class should become an XML element, the stylesheet simply transforms UML:Class from XMI into an xs:element in the XML Schema.
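The class-to-element mapping can be sketched outside XSLT as well. The following Python fragment is a minimal illustration, not the stylesheet from this series: the XMI input is a simplified, hypothetical fragment, the class name person is invented for the example, and the UML namespace URI is an assumption that depends on the XMI version your tool exports.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
UML = "org.omg.xmi.namespace.UML"  # assumption: varies with the XMI version

# Simplified, hypothetical XMI fragment; real exports nest classes much deeper.
XMI = """<XMI xmlns:UML="org.omg.xmi.namespace.UML">
  <UML:Class xmi.id="c1" name="person"/>
  <UML:Class xmi.id="c2" name="job"/>
</XMI>"""

def classes_to_schema(xmi_text):
    """Turn every UML:Class in the input into an xs:element declaration."""
    ET.register_namespace("xs", XS)
    model = ET.fromstring(xmi_text)
    schema = ET.Element(f"{{{XS}}}schema")
    # iter() finds UML:Class at any depth, so deeper nesting is not a problem
    for cls in model.iter(f"{{{UML}}}Class"):
        ET.SubElement(schema, f"{{{XS}}}element", name=cls.get("name"))
    return schema

schema = classes_to_schema(XMI)
print(ET.tostring(schema, encoding="unicode"))
```

The result is a schema element with one xs:element child per class, in document order.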
A simple stylesheet can automate the tedious process of implementing UML models as XML Schemas, but to fit all this material in this article, I have made numerous simplifications.
Note that while most of this discussion centers on the transformation from UML models to XML Schemas, the reverse transformation is helpful too. For example, you might need to include in your project elements developed by another team or company which are available as XML Schema only.
Here, I revisit the stylesheets and address one of the two issues that I identified in my last article: the lack of implementation information in the UML model. This problem develops for two reasons:
- UML is a universal language, so although it supports many modeling concepts it lacks some concepts that are specific to XML. For example, XML has ordered sequences of elements but UML does not order all its concepts.
- XML is a hierarchical data model, but UML works with graphs that are more flexible. Therefore, you may find more than one valid mapping between UML and XML. You have to tell the stylesheet which one to adopt.
From the outset, the designers of UML recognized that UML would have applications that they had not anticipated, so they implemented extension mechanisms that allow users to improve UML.
One of these extension mechanisms is the stereotype, which allows users to define new concepts in UML that refine existing concepts. Using a stereotype, the user can basically say to the modeling tool "I have a concept that is almost a class (or an object, actor, association, and so forth), but is more specialized."
For all practical purposes, a stereotype is a form of inheritance in the metamodel. It helps to think of a stereotype as a descendant of a UML concept.
In fact, if you read the UML specification you will see that the standard itself uses stereotypes quite heavily. For example:
- Comment has stereotypes for requirement and responsibility.
- Constraint has stereotypes for invariant, postcondition, precondition, and stateInvariant.
Of course these are in addition to any user-defined stereotypes.
Graphically, stereotypes are indicated by a keyword in angle brackets, such as <<requirement>> in Figure 1. Note that UML allows users to redefine the icon for a stereotype, but few tools implement this feature.
Figure 1. A stereotype in UML
A second extension mechanism that is closely linked to stereotypes is the tag. While stereotypes allow users to define new concepts in UML, tags allow users to store additional information about these new concepts. While a stereotype offers metamodel-level inheritance, a tag offers a metamodel-level mechanism to add attribute-like information to a stereotype.
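One way to picture the two mechanisms is as extra fields on every model element. The Python sketch below is only an illustration: the ModelElement class, its field names, and the priority tag are hypothetical, while requirement and responsibility are the Comment stereotypes mentioned above.

```python
from dataclasses import dataclass, field

@dataclass
class ModelElement:
    """Any UML model element: a class, a comment, an association, and so on."""
    name: str
    concept: str                                   # base UML concept, e.g. "Comment"
    stereotypes: set = field(default_factory=set)  # refinements of the concept
    tags: dict = field(default_factory=dict)       # extra name/value information

    def is_a(self, stereotype):
        """True if the element carries the given stereotype."""
        return stereotype in self.stereotypes

# A comment refined into a requirement, with a hypothetical "priority" tag
note = ModelElement("must print invoices", "Comment",
                    stereotypes={"requirement"}, tags={"priority": "high"})
print(note.is_a("requirement"), note.tags["priority"])
```

The stereotype says what kind of element this is; the tag carries a value about it.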
I'll start by showing you how to specify a hierarchy. You'll define a stereotype, root, to mark the classes that could be a root in the XML hierarchy. You mark one or more classes with the stereotype, and the stylesheet implements them as global elements. (In an XML Schema, only global elements can become a root.)
Note that more than one element can be marked as a root, which is in keeping with the XML Schema language. Note also that I have decided to call the stereotype "root" and not "global" (for "global element"). Several possible mappings can exist between UML and XML; for example, you can make all elements global in the Schema to expose the elements for reuse in other Schemas, or you can make as many elements local as possible. (In fact, there's a sophisticated theory behind the use of global and local elements -- see Resources.)
I believe these implementation choices should not be exposed in the UML model. Amongst other reasons, this allows me to change my implementation rules without revisiting the UML model. Indeed, if I mark elements as "global" and I decide to change the implementation rules -- for example to expose all elements as global -- I need to revisit the entire model and change most of the stereotypes. I don't like having that much exposure to implementation details in a UML model. I plan to explore this issue in more detail in my next article.
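The separation argued for here can be sketched as follows: the model records only which classes are roots, and the choice between "roots only" and "everything global" is a parameter of the generator. This is a hedged Python illustration; the function name, the all_global parameter, and the sample class person are hypothetical and do not come from the stylesheet.

```python
def select_global(classes, all_global=False):
    """Return the names of the classes to declare as global elements.

    The model only says which classes are roots; whether other classes
    are also exposed as global is decided here, outside the model.
    """
    if all_global:
        return [c["name"] for c in classes]
    return [c["name"] for c in classes if "root" in c["stereotypes"]]

# Hypothetical model: one root class and one regular class
classes = [
    {"name": "person", "stereotypes": {"root"}},
    {"name": "job", "stereotypes": set()},
]

print(select_global(classes))                   # only the classes marked root
print(select_global(classes, all_global=True))  # a different rule, same model
```

Switching the implementation rule changes one argument and leaves every stereotype in the model untouched.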
The vocabulary that you're working on will also affect what you'll want to expose as a stereotype. For example, when working on a vocabulary for publishing- or document-type applications, you may want to define an attribute stereotype. When working on a vocabulary for data storage, it may be more sensible to have fixed rules as to what maps to attributes versus elements (for instance, the stylesheet I present in this article maps everything to XML elements).
Similarly, I have defined a position tag that specifies the order in which elements must appear in a sequence in XML. The difficulty is that the order of elements in the schema must not change from one conversion to the next. This is particularly a problem with associations: UML tools don't order associations, so you cannot rely on a tool to export them in the same order every time.
One workaround is to order associations by name; another is to use tags to control the position explicitly. I have found that the latter solution is often preferable.
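Both workarounds amount to choosing a sort key. The Python sketch below combines them under stated assumptions: associations are plain dictionaries, the association names are invented, the position tag holds an integer stored as text, and untagged associations fall back to name order after the tagged ones. None of this is prescribed by the article; it only shows that an explicit key makes the output independent of the export order.

```python
def order_associations(associations):
    """Sort associations by their position tag, then by name.

    Tagged associations come first, in numeric position order; untagged
    ones follow in name order, so the result never depends on the order
    in which the modeling tool happened to export them.
    """
    def key(assoc):
        position = assoc["tags"].get("position")
        if position is not None:
            return (0, int(position), assoc["name"])
        return (1, 0, assoc["name"])
    return sorted(associations, key=key)

# Hypothetical associations, deliberately listed out of order
associations = [
    {"name": "works-as", "tags": {"position": "2"}},
    {"name": "knows", "tags": {}},
    {"name": "lives-at", "tags": {"position": "1"}},
]

ordered = [a["name"] for a in order_associations(associations)]
print(ordered)
```

Shuffling the input list does not change the result, which is the property the schema generation needs.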
Figure 2 is a UML model that defines three classes (including job) and the associations between them. This model is an extension of the model introduced in the previous article.
Figure 2. A UML model
Listing 1 is the same model, exported to XMI. As in the last article, this listing is based on the XMI standard and not on the XMI document that's produced by any particular tool, but it is easy to adapt to a specific tool. New in this listing are stereotypes, tags, associations, and multiplicity. As I explained in the previous article, you will need to review the UML metamodel to interpret these. They follow the logic introduced previously, so I won't comment on them any further.
Listing 2 is an updated version of the stylesheet that converts this XMI model into an XML Schema. Most of the new code is similar to code introduced in the last article, except for the following templates:
- The stereotypes and tags are defined in the beginning of the XMI file and later used by reference, so it is convenient to declare two variables to store their IDs.
- The template for model selects the classes with the root stereotype and declares them as global elements; other classes are declared local. You could change this rule to declare all elements global if you desired.
- The template for class walks through the associations and sorts them using the position tag.
- The template for association walks through the class attached on the right end of the association using a special mode. Two templates are defined in this mode: The first applies if the class has the root stereotype, inserting a reference to the class; the second applies to regular classes, inserting the definition as a local element.
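The logic of these templates can be summarized in a short Python sketch. It is not a translation of Listing 2: the in-memory model, the class names person and address, the choice of which classes are roots, and the helper names are all hypothetical, and the sketch assumes that non-root classes do not refer to each other in a cycle. Classes without associations are emitted as empty element declarations for brevity.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical model: class name -> root flag and (target class, position) pairs
MODEL = {
    "person":  {"root": True,  "assoc": [("job", 2), ("address", 1)]},
    "address": {"root": False, "assoc": []},
    "job":     {"root": True,  "assoc": []},
}

def xs(tag):
    """Qualify a tag name with the XML Schema namespace."""
    return f"{{{XS}}}{tag}"

def declare(parent, name, model):
    """Declare class `name` as an element under `parent`, with its children."""
    element = ET.SubElement(parent, xs("element"), name=name)
    # Sort by the position tag so the sequence order is stable
    assocs = sorted(model[name]["assoc"], key=lambda a: a[1])
    if assocs:
        ctype = ET.SubElement(element, xs("complexType"))
        seq = ET.SubElement(ctype, xs("sequence"))
        for target, _position in assocs:
            if model[target]["root"]:
                # Root classes are global: insert a reference
                ET.SubElement(seq, xs("element"), ref=target)
            else:
                # Regular classes: insert the definition as a local element
                declare(seq, target, model)
    return element

def build_schema(model):
    """Declare every root class as a global element of the schema."""
    ET.register_namespace("xs", XS)
    schema = ET.Element(xs("schema"))
    for name, info in model.items():
        if info["root"]:
            declare(schema, name, model)
    return schema

schema = build_schema(MODEL)
print(ET.tostring(schema, encoding="unicode"))
```

In this sketch, person and job become global elements; inside person, address is defined locally and job is included by reference, in the order given by the position values.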
I'll leave updating the other stylesheet, which converts XML Schemas into XMI, as an exercise for the reader.
With the material introduced in this series so far, you should be able to prepare your own stylesheets to convert any UML model into XML Schemas. I trust that as you gain experience with the technique, you will find that UML modeling is one of the easiest ways to design XML Schemas.
Although the stylesheets I have introduced are limited to a subset of the UML metamodel, they provide a good starting point from which to design more powerful models.
In the next article, I'll show you how to solve the last remaining problem: designing a stylesheet when more than one mapping from UML to XML is possible.
- Participate in the discussion forum.
- Read Roger Costello's paper "Global versus Local", which discusses different design techniques with exotic names such as Venetian Blind and Russian Doll.
- Review the UML specification, for the complete UML metamodel, at the Object Management Group site.
- Explore IBM Rational Rose, the leading UML modeling product. You'll also find plenty of Rational and UML resources on the Rational section
- Review the previous installments of this series by Benoît Marchal:
  - Part 1 discusses the relationship between UML and XML Schema (developerWorks, March 2004).
  - Part 2 introduces the UML metamodel and then proceeds to XMI, the XML-based specification for the exchange of models. The author then shows how to map from the metamodel to XML Schema (developerWorks, May 2004).
- Find hundreds more XML resources on the developerWorks XML zone, including previous installments of Benoît Marchal's Working XML column.
- Browse for books on these and other technical topics.
- Find out how you can become an IBM Certified Developer in XML and related technologies.