Business and technical challenges
In today's complex information exchanges, which combine XML with large associated XSD schemas and an array of trading partners, it is a significant challenge to support and maintain accurate handling of all incoming transactions. Currently, XML schemas and DTDs provide the ability to validate, or verify, the structural content of an XML document. Certain validation rules can also be accommodated as part of XML schemas, but not all kinds of transaction validations can be performed using XML schemas or DTDs.
With the advent of industry specific standards such as the Standards for Technology in Automotive Retail (STAR), whole collections of standard XML message exchange formats are provided in the form of XML Schemas. Both the consumers and providers of Web services must comply with these schemas to be certified by their industry standards body. However, such industry specific schemas are loosely bound with minimal validations and can be used for only structural validation of the incoming XML. Additional code is required to implement required validations that augment the schema checks. These validations prevent errors when the data is received by applications or components that expect the data to be in a particular structure and comply with business content validation rules.
The most common way to implement the needed validation logic in a Web service and its associated XML applications is to write custom code; as a result, the validation rules are buried inside the applications and cannot be easily adapted, documented, or shared. Depending upon the number and nature of the validations required, the validation code can be complex and lengthy and its maintenance can become a significant burden as more partners are added. Add to that the time, effort, and risk associated with recompiling and redeploying that code to a production server every time the validation logic changes.
In addition to standalone applications, validations are also required when exposing services through an Enterprise Service Bus (ESB). Figure 1 illustrates the typical architecture of an ESB centered on a messaging bus. The bus provides message delivery services based on standards such as SOAP, HTTP, and Java Message Service (JMS). The ESB enables services to interact with each other based on the quality of service requirements of the individual transactions. It also supports different standards such as SOAP, XML, WSDL, JMS, J2EE, JAX-RPC, and so on.
Figure 1. How to perform validations in an ESB architecture
One of the major challenges facing developers is how to perform message validations at the message provider and message consumer end points while interacting across the ESB. For example, as in Figure 1, a Web services component might require information from an existing application. The Web service (consumer) sends a message requesting information to the existing application (provider) through the ESB. The application component requires a request in a certain format with correct information, so it will validate the request message before processing it. The Web services component has its own set of requirements and will validate the response message. If the two endpoints use different protocols or standards, the ESB can transform each message and will perform validations before transforming the messages.
Each provider and consumer has its own requirements; hence, depending on the number of transaction types and validations, this can result in a long development cycle to define, create, and test all the validations. This stabilization phase proceeds until each validation component is able to provide correct feedback about message validation to its invoking component.
The solution approach we describe here is to implement the XML validation services based on the OASIS Content Assembly Mechanism (CAM) specification. The OASIS CAM template approach offers a simple way to handle and validate XML content and allows businesses to create common interchange models for their exchanges in XML. CAM templates support context-based rules, code-lists, and cross-field validations. Many cross-field validations cannot be implemented in an XSD schema alone; in other cases, the published industry schemas cannot accommodate all the validation variations.
The solution includes CAM Studio (an Eclipse-based UI template editor) that is used to define the CAM template. The CAMV validation engine then provides a set of open source Java APIs that are used to validate the XML against the specific compiled CAM templates at run time. The CAM Studio template editor supports adding custom XPath expressions to its generated templates, but the UI can define most rules without writing any custom expressions.
Figure 2 shows the Model, Author and Test, Deploy, and Monitor stages in the life-cycle of developing the validation rules:
Figure 2. Validation rules life cycle
In the Model stage, the data entities and their data elements are identified along with their corresponding validation rules. The required XML exchange schema is designed; alternatively, the required elements are mapped to an existing industry standard schema such as one from STAR.
In the Author and Test stage, CAM templates are assembled or authored using the CAM Studio editor. The editor provides three options for creating a CAM template:
- Create from scratch or hand-crafted
- Use an existing XML Schema
- Use an existing XML instance
Once you create the CAM template, the next step is to review each and every element and attribute and specify the validation rules as applicable. A panel in the editor displays the rules for each template node. Figure 3 displays a screen capture of the template structure in the CAM Template Editor:
Figure 3. CAM template in the CAM Template Editor
Not all validation rules need to be binary in nature (that is, either pass or fail), so CAM supports classifying validation failures as Warnings. This feature is useful in scenarios where corrective action can be taken at the service provider end, modifying the payload to make the message usable rather than rejecting the complete message. For example, a rule might require a particular comment field to be no longer than 255 characters; a request message that exceeds this limit should not be rejected, but a warning should be sent to the consumer stating that only the first 255 characters of the comment will be used.
You will see the details of how to set up a validation message classification as a Warning in the Tips and tricks section of this article.
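As a minimal sketch of the corrective action in the comment-field example, the provider could truncate the value itself once the warning has been raised. The class name, method name, and constant below are hypothetical and are not part of CAMV; only the 255-character limit comes from the example.

```java
/** Hypothetical provider-side helper; not part of the CAMV API. */
final class CommentTruncator {
    /** Limit taken from the example rule in the text. */
    static final int MAX_COMMENT_LENGTH = 255;

    /** Returns the comment unchanged if it fits, otherwise at most its first 255 characters. */
    static String truncate(String comment) {
        if (comment == null || comment.length() <= MAX_COMMENT_LENGTH) {
            return comment;
        }
        int end = MAX_COMMENT_LENGTH;
        // Avoid cutting a surrogate pair in half at the boundary.
        if (Character.isHighSurrogate(comment.charAt(end - 1))) {
            end--;
        }
        return comment.substring(0, end);
    }
}
```

The backend then receives a value it can store, while the consumer still learns from the warning that the comment was shortened.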
In the Deploy stage, the CAM templates are compiled using the CAM Studio Editor before you use them with the application run-time CAMV engine. The compiled format is a condensed XML version of the original CAM template and is designed to optimize the performance of the CAMV validation engine. To compile the CAM template, select the menu option Tools > Compile Template. This generates the .cxx file format of the template, which is used at run time.
The CAMV validation engine offers a simple, open-source Java API which can be used in any Java application to validate an input XML with the applicable CAM template. The code snippets in Listing 1 illustrate the usage of CAMV:
Listing 1. Usage of CAMV API
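// templateDocument holds the CAM template; ioReader supplies the XML to validate
TemplateValidator tv = new TemplateValidator(templateDocument);
tv.setErrHandler(new ElementErrorHandler(tv));

// validate(..) returns true when no errors were found; warnings may still be present
boolean tvResult = tv.validate(ioReader);
if (tvResult) {
    System.out.println("No errors, might be warnings.....");
}

// Retrieve the collected error and warning messages
List errList = tv.getErrors();
List warnList = tv.getWarnings();

Error and warning messages are formatted as: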
<error classification>: <XPATH> => <error or warning message> => Node: <node name> => attribute: <attribute name>
For example, an error message would look like this:
/p:ProcessRepairOrder[1]/p:ApplicationArea[1]/p:CreationDateTime[1]=>Content does not conform to the mask:YYYY-MM-DD'T'HH:MI:SSZ =>Node: CreationDateTime
A warning message would look like this:
Warning: /p:ProcessRepairOrder[1]/p:ProcessRepairOrderDataArea[1]/p:RepairOrder[1]/p:RepairOrderHeader[1]/p:OwnerParty[1]/p:SpecifiedPerson[1]/p:ResidenceAddress[1]/p:LineOne[1]=> length should be less than 80 =>Node: LineOne
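If a client needs the individual parts of such a message, a small parser can split it on the => separators. The following is a sketch only: the class and method names are hypothetical (CAMV does not provide them), and treating a message without a prefix as an error is an assumption based on the sample error message.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not part of CAMV) that splits one message into named fields. */
final class CamvMessageParser {

    static Map<String, String> parse(String message) {
        Map<String, String> fields = new LinkedHashMap<>();
        String[] parts = message.split("=>");

        // First segment: optional classification prefix, then the XPath, which starts at the first '/'.
        String head = parts[0].trim();
        int slash = head.indexOf('/');
        String prefix = slash > 0 ? head.substring(0, slash).trim() : "";
        if (prefix.endsWith(":")) {
            prefix = prefix.substring(0, prefix.length() - 1).trim();
        }
        // Assumption: a message with no prefix is an error, as in the sample error message.
        fields.put("classification", prefix.isEmpty() ? "Error" : prefix);
        fields.put("xpath", slash >= 0 ? head.substring(slash) : head);

        // Second segment: the validation message text.
        if (parts.length > 1) {
            fields.put("message", parts[1].trim());
        }

        // Remaining segments look like "Node: name" or "attribute: name".
        for (int i = 2; i < parts.length; i++) {
            String part = parts[i].trim();
            int colon = part.indexOf(':');
            if (colon > 0) {
                String key = part.substring(0, colon).trim().toLowerCase(Locale.ROOT);
                fields.put(key, part.substring(colon + 1).trim());
            }
        }
        return fields;
    }
}
```

Only the first segment is inspected for a prefix, so colons inside the XPath (p:LineOne) or inside the message text (the date mask) do not confuse the split.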
By using CAMV, you externalize all the validation checks instead of embedding them in custom code. During the monitoring cycle, you can meet the need for additional validations by simply updating the validation templates. To add validations or remove existing ones, redistribute the compiled CAM templates (.cxx files). You do not need to recompile and redeploy any Java code when the validation logic changes.
New features in the latest CAMV release
Some of the key features added to the latest (December 2009) release of CAMV are:
- A backward compatible release download for Java 1.5 has been created in addition to the default Java 1.6.
- CAMV is thread-safe; hence, it can be deployed in any J2EE container such as WebSphere® Application Server.
- CAMV can now accept XML input as StringReader in addition to JDOM documents, reducing the possible instances of serialization and de-serialization during message handling.
- Multiple conditions can now be defined on a single XML element or attribute.
The following are tips and tricks that we identified from a recent project where we used CAMV to create a validation framework for a B2B Gateway that exposes STAR-based Web services for a leading automotive industry organization.
CAMV supports creating validation rules that produce Warning messages in addition to Errors. To define the validation for a Warning message, specify a conditional XPath expression on the XML element.
For example, consider a business scenario where the Web service request need not be rejected if the length of a particular field exceeds the specified limit of 255 characters. The business decision is to truncate the field to 255 characters when it exceeds that limit, as required by the backend system; however, a warning must be issued to the invoking component.
Such scenarios can be handled by specifying a printmessage() expression in the CAM template rules.
The message text must have the prefix Warning: followed by the required warning message, such as length should be less than 255. The complete message text will then appear as Warning: length should be less than 255.
Because the warning is returned only if the length of the specific element exceeds the specified length, this rule is specified as conditional, and an XPath expression is created to perform the length check, as depicted in Figure 4, a screen capture of the CAM Studio Editor expression entry wizard:
Figure 4. How to configure a warning rule
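The idea behind such a conditional length check can be sketched outside CAM Studio with the standard Java XPath API (javax.xml.xpath). The expression form string-length(...) > limit is an assumption about the general shape of the check; the exact expression entered in the wizard is the one shown in Figure 4. The class, method, and element names below are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical demo of an XPath length condition; it does not use the CAMV engine. */
final class LengthCheck {

    /** Returns true when the text of the element selected by elementPath is longer than limit. */
    static boolean exceeds(String xml, String elementPath, int limit) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // string-length() is a standard XPath 1.0 function.
            String expression = "string-length(" + elementPath + ") > " + limit;
            return (Boolean) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath or XML input", e);
        }
    }
}
```

A rule that prints the warning only when this condition is true leaves messages within the limit untouched.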
You can cache CAMV templates in memory for repeated validations instead of reading them from disk for every validation. This reduces disk I/O and can noticeably improve performance and throughput.
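One way to sketch such a cache is a map keyed by template path that invokes a loader only on first use. Everything here is hypothetical: the class is not part of CAMV, the type parameter T stands for whatever parsed template object your validator accepts, and whether one loaded template object can safely be shared across validations is an assumption to verify against the CAMV documentation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical in-memory template cache; T is the parsed template type. */
final class TemplateCache<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> loader;

    /** The loader reads and parses a template from the given path. */
    TemplateCache(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Returns the cached template, invoking the loader only on first use of a path. */
    T get(String templatePath) {
        return cache.computeIfAbsent(templatePath, loader);
    }

    /** Drops one entry so that a redistributed template is reloaded on the next call. */
    void invalidate(String templatePath) {
        cache.remove(templatePath);
    }
}
```

The invalidate method gives you a hook for picking up a redistributed .cxx file without restarting the application.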
Checking for validation errors
The CAMV Java method TemplateValidator.validate(..) returns true even if warnings are returned. It returns false only when errors are returned.
Hence, when only warnings are returned, use the getWarnings() method to get the list of warning messages.
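A caller can fold these two signals into a single outcome. The sketch below takes the boolean returned by validate(..) and the list returned by getWarnings() as plain arguments so that it stays independent of the CAMV classes; the class, enum, and method names are hypothetical.

```java
import java.util.List;

/** Hypothetical helper that combines the validate(..) result with the warning list. */
final class ValidationOutcome {

    enum Outcome { ACCEPTED, ACCEPTED_WITH_WARNINGS, REJECTED }

    /**
     * @param validateResult value returned by TemplateValidator.validate(..)
     * @param warnings       list returned by getWarnings(); may be null or empty
     */
    static Outcome classify(boolean validateResult, List<?> warnings) {
        if (!validateResult) {
            return Outcome.REJECTED; // false means at least one error was reported
        }
        boolean hasWarnings = warnings != null && !warnings.isEmpty();
        return hasWarnings ? Outcome.ACCEPTED_WITH_WARNINGS : Outcome.ACCEPTED;
    }
}
```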
If the returned messages (which contain the XPath, a validation message, and a node name) are not sufficient for the business scenario and more information is required, the client application can process the validated XML with custom code. CAMV returns the same input XML after adding CAMERROR and CAMWARN attributes to the affected elements of the input exchange message, as depicted in Listing 2.
Listing 2. Modified XML after performing validation
<p:ApplicationArea>
  <p:Sender>
    <p:CreatorNameCode>CNV</p:CreatorNameCode>
    <p:SenderNameCode>SNC</p:SenderNameCode>
  </p:Sender>
  <p:CreationDateTime CAMERROR="CreationDateTime | Content does not conform to the mask:YYYY-MM-DD'T'HH:MI:SSZ">2001-12-31T12:00:00</p:CreationDateTime>
  <p:Destination/>
</p:ApplicationArea>
<p:ResidenceAddress>
  <p:LineOne CAMWARN="WARNING:LineOne | length should be less than 80">100 Moon Drive 100 Moon Drive 100 Moon Drive 100 Moon Drive 100 Moon Drive 100 Moon Drive</p:LineOne>
  <p:LineTwo>APT # 100</p:LineTwo>
  <p:CityName>MALIBU</p:CityName>
  <p:CountryID>US</p:CountryID>
  <p:Postcode>99999</p:Postcode>
  <p:StateOrProvinceCountrySub-DivisionID>CA</p:StateOrProvinceCountrySub-DivisionID>
</p:ResidenceAddress>
When entering rules into the template, the XPath validation expressions are specified (by default) using the wildcard expression of two slashes (//), which selects all nodes in the document from the current node that match the selection, no matter where they are.
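Custom client code could, for instance, walk the returned XML and collect every element that carries one of these attributes. The sketch below uses only the standard Java DOM API and assumes the annotated XML is available as a well-formed string; the class and method names are hypothetical and are not part of CAMV.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical scanner for CAMERROR / CAMWARN attributes in validated XML. */
final class CamAnnotationScanner {

    /** Returns "elementName: attributeValue" for each element carrying the given attribute. */
    static List<String> collect(String annotatedXml, String attributeName) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(annotatedXml)));

            List<String> found = new ArrayList<>();
            // "*" matches every element, in document order.
            NodeList all = doc.getElementsByTagName("*");
            for (int i = 0; i < all.getLength(); i++) {
                Element element = (Element) all.item(i);
                if (element.hasAttribute(attributeName)) {
                    found.add(element.getTagName() + ": " + element.getAttribute(attributeName));
                }
            }
            return found;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse annotated XML", e);
        }
    }
}
```

Calling collect(xml, "CAMERROR") and collect(xml, "CAMWARN") separately keeps errors and warnings apart, and the element reference gives the client full access to the surrounding context if it needs more than the message text.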
Figure 5. How to specify wildcard expressions while defining rules
This results in rules being applied to all such instances of a particular element. (Note: The rules might not be visible immediately at all other instances of a particular element but become visible once the template is refreshed in the CAM template editor view).
However, if you need to apply the check to one particular instance of an XML element, it is advisable to select the Full check box for Rule XPath.
Figure 6. How to specify explicit expression while defining rules
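The difference between the two choices can be illustrated with the standard Java XPath API, independent of CAM Studio. In this sketch the sample document, its element names, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical demo comparing a wildcard (//) path with a full path. */
final class XPathScopeDemo {

    /** Returns how many nodes the given XPath selects in the document. */
    static int count(String xml, String path) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // count() is a standard XPath 1.0 function; NUMBER results are returned as Double.
            Double matches = (Double) xpath.evaluate(
                    "count(" + path + ")", new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
            return matches.intValue();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath or XML input", e);
        }
    }
}
```

For a document with a LineOne element under both an OwnerParty and a BillTo parent, count(xml, "//LineOne") selects both, whereas count(xml, "/Order/OwnerParty/LineOne") selects only the owner's address line.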
Using CAMV, you can enforce the validation checks consistently and then rapidly change the rules to fine-tune message handling to match particular partner exchanges and content. By externalizing the validation rules, which conventionally have been embedded deep inside the backend application code, you have much better control and management along with more predictable message handling. These standards-based rules templates can optionally be shared with partners to facilitate better content handling alignment across systems.
With a more adaptive and fault-tolerant process, the application can handle a wider variation in content and, hence, more easily support a broad set of interaction partners with reduced support and maintenance costs, which is the opposite of the usual experience.
The use of open source greatly facilitated collaboration on developing the solution and integrating the CAMV engine into the deployment environment.
Overall, this project demonstrated that innovative use of XML and dynamically configurable XML rule templates can provide a better, more stable, faster, and more capable customer application experience than relying on static compiled code resources alone.
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample Java project that uses CAMV Java APIs | ValidationFrameworkSample.zip | 2032KB | HTTP |
Learn
- JCAM Engine with XML Editor / Validator: See information about the CAMV project at the SourceForge Web site.
- The OASIS CAM Wiki: Visit a resource site for users and developers of CAM templates and CAM processors.
- Meet CAM: A new XML validation technology (Brian M. Carey, developerWorks, September 2009): Read an introduction and overview of CAM.
- Taking XML Validation to the Next Level: Introducing CAM: Read an article series representing CAM: The Missing Manual.
Get products and technologies
- OASIS CAM specification standard: Download and review the CAM specifications.

Puneet Kathuria is an Integration Architect working with IBM India Ltd. He has more than 13 years of experience, mainly in the application and integration architectures, and has been with IBM for the past four years.

David is currently consulting on NIEM IEPD development for the US government and is based in Washington DC, USA. David is Chair of the OASIS CAM technical committee and co-developer of the CAM Studio Eclipse editor, responsible for the majority of the XSLT processing scripts. David has over 30 years of experience in the industry and in 2007 was recognized as a Senior Member of the ACM for his industry work in XML. David has authored many articles on XML and information exchange optimization, written standards specifications for OASIS, and presented widely on XML in North America, Europe, and Asia.

Martin is a consultant based in Suffolk, England, specializing in XML, ontologies, Java, Eclipse, and Web solutions, with over 20 years of experience. Martin authored both the original OASIS CAM specifications and the CAM Studio Eclipse editor and CAMV validation engine implementations. Martin was also previously active in telecommunications industry work on XML-based message exchanges and standards in Europe. He has presented at a number of industry events, including OASIS-sponsored technology expositions, particularly in Europe.




