As stated in the summary, the Content Assembly Mechanism (CAM) represents the latest technology for validating XML documents. This, of course, implies that earlier technologies validated XML documents as well.
The oldest is known by the acronym DTD, which stands for Document Type Definition. As with most entry points in emerging technologies, it was limited. It facilitated validation of XML document structure, but not much in the way of semantics. It also used somewhat awkward syntax to define the valid XML structure.
DTD was later replaced by XSD, which stands for XML Schema Definition. This was a much more powerful means of validating XML documents. First, the syntax was similar to an XML document itself. Next, it offered improved support for semantics. For the last several years, bleeding-edge technologists have opted to validate their XML documents with XSD as opposed to DTD.
The history of technology has shown repeatedly that there is always a better way to build the proverbial mousetrap. XML validation is no exception to that principle. CAM represents the latest and most sophisticated entry in the family of technologies used to validate XML documents.
CAM is offered by the standards body known as OASIS. This organization has provided a number of specifications, most notably regarding Web services and electronic business Extensible Markup Language (ebXML).
CAM is more powerful and flexible than its predecessors. Unlike XSD, it doesn't tightly couple the data structure to the business rules. It also provides for context-driven validation, something which is lacking in both XSD and DTD.
For most people who are familiar with XML, CAM is also much easier to learn than XSD or DTD. This is because, in defining structure, the format of a CAM document is strikingly similar to an XML instance. And, in defining business rules, CAM uses the well-known XPath language.
In Listing 1, you can see that the structure of a CAM template is not complicated.
Listing 1. The structure of a CAM template
<as:CAM xmlns:as="http://www.oasis-open.org/committees/cam"
        CAMlevel="1" version="1.0">
  <as:Header />
  <as:AssemblyStructure />
  <as:BusinessUseContext />
</as:CAM>
The root element, CAM, defines the namespace used throughout the template itself as well as the level and version of CAM.
The Header element provides specific information about the validation document. Many of its child elements (not shown) are self-explanatory.
The AssemblyStructure element defines the actual structure of the XML document instance. This is where CAM and XSD part company: the AssemblyStructure element provides validation against the structure of the XML document but does not contain any information about semantics.
And, finally, the BusinessUseContext element provides the business rules that were lacking in the previous element. How are these business rules enforced? That is an excellent question, but first you should be familiar with how CAM defines structure.
Listing 2 shows how CAM defines the structure for a simple purchase order.
Listing 2. A CAM structure for a simple purchase order
<as:AssemblyStructure>
  <as:Structure ID="myPO" taxonomy="XML">
    <PurchaseOrder>
      <ShippingAddress>
        <Name>%string%</Name>
        <Street>%string%</Street>
        <City>%string%</City>
        <State>%string%</State>
        <Zip>%string%</Zip>
      </ShippingAddress>
      <ShipDate>%DD-MM-YYYY%</ShipDate>
      <comment>%string%</comment>
      <LineItems>
        <LineItem>
          <ItemName>%string%</ItemName>
          <Quantity>%1%</Quantity>
          <Price>%54321.00%</Price>
          <Comment>%string%</Comment>
        </LineItem>
      </LineItems>
      <TotalPrice>%54321.00%</TotalPrice>
      <ShippingMethod>%string%</ShippingMethod>
    </PurchaseOrder>
  </as:Structure>
</as:AssemblyStructure>
Looking at Listing 2, note that the structure of the XML document is defined almost exactly as though it were an XML instance. In this respect, most IT professionals would probably agree that CAM is far more readable than XSD for people who already understand XML syntax. In fact, the structure really is written as an XML instance, only with placeholder content, which I explain shortly.
The Structure element is the parent of the actual structure definition. It has an ID attribute that identifies this particular structure. The only currently recognized value for the taxonomy attribute is XML.
Notice that most elements include values demarcated by percent signs (%). These are simply placeholders for the actual content that will appear in the XML instance. They make the document easier for a human reader to understand; they do not provide any validation logic. Some people, when constructing CAM templates, place example values inside the elements instead of the more generic values used in Listing 2. How best to write placeholders is up to the individual developer.
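To make the placeholder idea concrete, here is a sketch of an XML instance shaped like the structure in Listing 2. Every value (the name, address, item, and prices) is invented for illustration; only the element names and their nesting come from the template.

```xml
<?xml version="1.0"?>
<!-- Hypothetical instance; all values are made up for illustration. -->
<PurchaseOrder>
  <ShippingAddress>
    <Name>Pat Example</Name>
    <Street>123 Sample Street</Street>
    <City>Springfield</City>
    <State>IL</State>
    <Zip>62701</Zip>
  </ShippingAddress>
  <ShipDate>03-03-2009</ShipDate>
  <comment>Leave at the front desk</comment>
  <LineItems>
    <LineItem>
      <ItemName>Widget</ItemName>
      <Quantity>2</Quantity>
      <Price>19.99</Price>
      <Comment>Blue, if available</Comment>
    </LineItem>
  </LineItems>
  <TotalPrice>39.98</TotalPrice>
  <ShippingMethod>Ground</ShippingMethod>
</PurchaseOrder>
```

Each placeholder between percent signs has simply been swapped for real content; the element names and their order are unchanged.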
Now that you understand how structure is defined in CAM, it's time to learn a little more about how business rules are enforced.
It's really this simple: XPath.
Yes, that's right. XPath.
And now you have yet another advantage of CAM versus older validation technologies. It uses syntax that most XML technologists already understand to enforce business rules. For these people, there is no need to learn another language to implement CAM validation within their applications.
Listing 3 has an example of the BusinessUseContext element.
Listing 3. Enforcing business rules with CAM
<as:BusinessUseContext>
  <as:Rules>
    <as:default>
      <as:context>
        <as:constraint action="makeRepeatable(//PurchaseOrder/LineItems/LineItem)"/>
        <as:constraint action="makeOptional(//LineItem/Comment)"/>
        <as:constraint action="setLength(//ShippingAddress/State,2)"/>
        <as:constraint action="setDateMask(//PurchaseOrder/ShipDate,DD-MM-YYYY)"/>
        <as:constraint action="setNumberMask(//LineItem/Quantity,###)"/>
        <as:constraint action="setNumberMask(//LineItem/Price,###.##)"/>
        <as:constraint action="setNumberMask(//PurchaseOrder/TotalPrice,###.##)"/>
        <as:constraint condition="//PurchaseOrder/TotalPrice > 100"
                       action="makeOptional(//PurchaseOrder/ShippingMethod)"/>
      </as:context>
    </as:default>
  </as:Rules>
</as:BusinessUseContext>
To the experienced XML developer, this structure should be fairly easy to interpret. This is not only because the constraints use XPath, but also because the validation rules are named in standard English. Again, this is what makes CAM so attractive.
The rules themselves are defined within the context element. Each rule is the action attribute of one of the constraint child elements.
Note the first rule: makeRepeatable(//PurchaseOrder/LineItems/LineItem). As the name implies, this tells the validator that the LineItem child element of the LineItems element is repeatable. This means that there can be many of them, which makes perfect sense because a typical purchase order may contain many different items.
The next rule is about the Comment element. This rule states that comments are optional. In other words, the XML document can be valid with an empty Comment element.
The next rule enforces the maximum length, in characters, of the State element. In this case, that maximum length is 2, which matches the two-letter postal abbreviation for a state in the United States.
The next rule enforces the format of the date. Here, the format DD-MM-YYYY is used, although you can certainly use other formats as well. In this case, a valid date would be something like 03-03-2009, meaning March 3, 2009.
The next rule enforces the format of the Quantity element. In this case, the contents of that element must be a number conforming to the ### mask. In other words, a purchase order containing a line item with a four-digit number in the Quantity element would be considered invalid. With this rule, a purchase order cannot contain a line item that orders a quantity of more than 999 of any one product.
The next two rules, for Price and TotalPrice, are similar to the previous rule. Like the Quantity rule, they enforce a number mask. The difference is that the number mask allows for a decimal point. This is because these two elements are dollar values that can contain fractional values representing cents.
And, finally, there is a particularly interesting rule. It is interesting because it introduces a context-driven constraint. What exactly is that? It's a constraint that can validate an XML document based on the content of certain elements. In this case, if the total price of the purchase order exceeds $100, the ShippingMethod element of the XML document can be empty. Otherwise, it cannot be empty. The business rule being applied here is that orders totaling more than $100 automatically get free standard shipping. For orders of $100 or less, the document must specify a shipping method.
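As an illustration of these rules working together, the hypothetical instance below is the kind of document that Listing 3 is described as accepting. The values are invented, and the XML comments restate the explanations above rather than the output of any actual CAM validator.

```xml
<?xml version="1.0"?>
<!-- Hypothetical instance; values are invented to exercise the rules in Listing 3. -->
<PurchaseOrder>
  <ShippingAddress>
    <Name>Pat Example</Name>
    <Street>123 Sample Street</Street>
    <City>Springfield</City>
    <!-- setLength: two characters -->
    <State>IL</State>
    <Zip>62701</Zip>
  </ShippingAddress>
  <!-- setDateMask: DD-MM-YYYY -->
  <ShipDate>03-03-2009</ShipDate>
  <comment>Leave at the front desk</comment>
  <LineItems>
    <!-- makeRepeatable: more than one LineItem -->
    <LineItem>
      <ItemName>Widget</ItemName>
      <!-- setNumberMask ###: at most three digits -->
      <Quantity>2</Quantity>
      <Price>19.99</Price>
      <!-- makeOptional: Comment left empty -->
      <Comment/>
    </LineItem>
    <LineItem>
      <ItemName>Gadget</ItemName>
      <Quantity>1</Quantity>
      <Price>75.50</Price>
      <Comment>Gift wrap</Comment>
    </LineItem>
  </LineItems>
  <!-- 2 x 19.99 + 1 x 75.50 = 115.48, which is greater than 100 -->
  <TotalPrice>115.48</TotalPrice>
  <!-- condition met, so ShippingMethod may be empty -->
  <ShippingMethod/>
</PurchaseOrder>
```

Lowering the total to 100 or less while leaving ShippingMethod empty would, by the same reading of the final rule, make the document invalid.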
Listing 4 shows an entire CAM document assembled from the fragments provided earlier.
Listing 4. All together now
<?xml version='1.0'?>
<as:CAM CAMlevel="1" version="1.0"
        xmlns:as="http://www.oasis-open.org/committees/cam">
  <as:Header>
    <as:Description>Simple Purchase Order</as:Description>
    <as:Owner>developerWorks</as:Owner>
    <as:Version>0.1</as:Version>
    <as:DateTime>2009-07-07T12:00:00</as:DateTime>
  </as:Header>
  <as:AssemblyStructure>
    <as:Structure ID="myPO" taxonomy="XML">
      <PurchaseOrder>
        <ShippingAddress>
          <Name>%string%</Name>
          <Street>%string%</Street>
          <City>%string%</City>
          <State>%string%</State>
          <Zip>%string%</Zip>
        </ShippingAddress>
        <ShipDate>%DD-MM-YYYY%</ShipDate>
        <comment>%string%</comment>
        <LineItems>
          <LineItem>
            <ItemName>%string%</ItemName>
            <Quantity>%1%</Quantity>
            <Price>%54321.00%</Price>
            <Comment>%string%</Comment>
          </LineItem>
        </LineItems>
        <TotalPrice>%54321.00%</TotalPrice>
        <ShippingMethod>%string%</ShippingMethod>
      </PurchaseOrder>
    </as:Structure>
  </as:AssemblyStructure>
  <as:BusinessUseContext>
    <as:Rules>
      <as:default>
        <as:context>
          <as:constraint action="makeRepeatable(//PurchaseOrder/LineItems/LineItem)"/>
          <as:constraint action="makeOptional(//LineItem/Comment)"/>
          <as:constraint action="setLength(//ShippingAddress/State,2)"/>
          <as:constraint action="setDateMask(//PurchaseOrder/ShipDate,DD-MM-YYYY)"/>
          <as:constraint action="setNumberMask(//LineItem/Quantity,###)"/>
          <as:constraint action="setNumberMask(//LineItem/Price,###.##)"/>
          <as:constraint action="setNumberMask(//PurchaseOrder/TotalPrice,###.##)"/>
          <as:constraint condition="//PurchaseOrder/TotalPrice > 100"
                         action="makeOptional(//PurchaseOrder/ShippingMethod)"/>
        </as:context>
      </as:default>
    </as:Rules>
  </as:BusinessUseContext>
</as:CAM>
As you can see, Listing 4 is little more than a concatenation of Listings 2 and 3. A Header element is added, which simply identifies information about this particular validation file. In this case, a simple description, an owner, a version, and a document date are added.
Although it is not shown in Listing 4, the Header element can also contain parameters. The validation of the XML document can vary based on the values of those parameters. For example, if a parameter noMoreThan10LineItems is set to true, the CAM document enforces a business rule that there can be no more than 10 LineItem elements in the entire order. This is an example of how powerful and flexible CAM can be when it comes to validation. The benefit here is that you can simply change that parameter to false to turn that rule off.
Obviously, just because a certain technology is new does not mean that it is useful or provides a higher return on investment than its predecessors. CAM, however, has several distinct advantages compared to its competition.
First, CAM separates structure from business rules. This kind of separation is a recurring pattern throughout software development and is not at all limited to CAM. For example, the Model-View-Controller (MVC) pattern in distributed object development environments separates the model, the view, and the controller from one another. Unlike CAM, XSD tightly couples the structure and the business rules, resulting in higher maintenance overhead.
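For comparison, here is a minimal sketch of how the two-character limit on State might be written in XSD. It is an illustration written for this comparison, not a translation of the full purchase order; the point is that the length constraint sits inside the element declaration itself rather than in a separate rules section.

```xml
<?xml version="1.0"?>
<!-- Illustrative XSD fragment; covers only the State element. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="State">
    <xs:simpleType>
      <!-- The rule (maximum length of 2) is embedded in the structure. -->
      <xs:restriction base="xs:string">
        <xs:maxLength value="2"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

In the CAM template, the same limit is the single setLength constraint in Listing 3, and the State element in the structure stays a plain placeholder.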
CAM also enables context-driven validation. In other words, CAM recognizes a dynamic structure based on the content of certain elements or attributes. So, if element X contains a certain value, a business rule is applied to element Y. If it contains another value, that business rule can instead be applied to element Z. This was demonstrated in Listing 3 with the final rule. In that case, purchase order documents with a total price of more than $100 do not need to specify a shipping method because the standard shipping is free for those orders. CAM's predecessors do not facilitate such complex validation.
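The X, Y, and Z pattern might be sketched as follows, reusing only the condition and action attributes and the makeOptional action that appear in Listing 3. The ShippingMethod values 'Pickup' and 'Express' and both business rules are hypothetical, chosen purely for illustration; the namespace is declared on the context element only so that the fragment stands on its own.

```xml
<!-- Hypothetical rules; the ShippingMethod values are invented for illustration. -->
<as:context xmlns:as="http://www.oasis-open.org/committees/cam">
  <!-- X = ShippingMethod is 'Pickup': the rule is applied to Y = ShipDate. -->
  <as:constraint condition="//PurchaseOrder/ShippingMethod = 'Pickup'"
                 action="makeOptional(//PurchaseOrder/ShipDate)"/>
  <!-- X = ShippingMethod is 'Express': the rule is applied to Z = comment instead. -->
  <as:constraint condition="//PurchaseOrder/ShippingMethod = 'Express'"
                 action="makeOptional(//PurchaseOrder/comment)"/>
</as:context>
```

Both constraints use the same action; only the XPath condition decides which element the rule applies to.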
Analyzing rule sets and structure is much easier with CAM. The structure is represented as an XML instance in the usual tree format, so humans as well as computers can read it more easily. The grammar used to enforce business rules is likewise intuitive: makeRepeatable, makeOptional, setLength, and so on are not terribly difficult to decipher. And, although the rules and the structure are separated, they are in the same document, making it easy to get a bird's-eye view of the overall validation requirements. XSD, on the other hand, requires an understanding of a whole new set of non-intuitive definitions, such as complexType (what does that mean?), and is not so easy to read.
Sticking with the "you don't have to learn anything new" theme, CAM uses XPath. As shown previously, this is the language that enforces business rules on certain elements. Not only is XPath intuitive and easy to learn, it is already understood by most XML technologists. This makes the transition to CAM much smoother because the business logic validation does not require XML developers to learn something totally new. The XSD grammar is not anything like XPath.
Another advantage of CAM over XSD is that localization needs are more easily enforced with CAM. With XSD, enumerations are static and, therefore, cannot be made context-aware. However, with CAM, you can apply particular enumerations based on context values. In the emerging global marketplace, the need for such streamlined validation should be self-explanatory.
CAM templates also provide next-generation Service-Oriented Architecture (SOA) support. CAM supports business processing technologies such as Business Process Execution Language (BPEL), Business Process Specification Schema (BPSS), and Business Process Modeling Notation (BPMN) modeling tools. To quote from the Wiki: "Completing the SOA picture CAM has extension mechanisms that can be used to support semantic registry referencing (such as ebXML-regrep) and metadata definitions (such as CCTS and OWL) external to the templates that are key to next generation SOA exchanges." Also, CAM was developed by OASIS, so you can be sure that the organization will ensure that CAM is compliant, if not compatible, with its other standards.
CAM represents the latest generation of XML validation technologies. It provides numerous benefits over its predecessors. Those benefits include a separation of concerns regarding structure and business logic, dynamic validation based on context, interoperability with cutting-edge technologies, lower maintenance overhead, and a gentler learning curve. CAM is also endorsed by a well-respected standards organization, OASIS.
CAM is an emerging technology. As such, it is not as well documented and does not enjoy the benefit of mass experience. However, it certainly is robust in its initial release and promises to be a much more efficient means of XML validation.
CAM is almost certainly here to stay and supplant its predecessors.
- The OASIS CAM Wiki: Learn more about CAM.
- XML Schema Tutorial: Explore how to create XML Schemas, why XML Schemas are more powerful than DTDs, and how to use XML Schema in your application.
- DTD Tutorial: Learn how to use DTDs.
- Introduction to XML (Doug Tidwell, developerWorks, August 2002): XML, the Extensible Markup Language, has gone from the latest buzzword to an entrenched eBusiness technology in record time. Learn what XML is, why it was developed, and how it's shaping the future of electronic commerce.
- Validating XML (Nicholas Chase, developerWorks, August 2003): Validate files and documents to make sure that data fits integrity constraints. Learn what validation is and how to check a document against a Document Type Definition (DTD) or XML Schema document.
- Design XML schemas for enterprise data (Bilal Siddiqui, developerWorks, October 2006): Learn to use W3C XML Schema features to design data formats for production management.