Industry formats are an important part of standardized information exchange between information systems in industries such as healthcare, insurance, and finance. These formats are based on XML. An XML schema defines the structure with which all derived documents must comply. In addition to XML Schema, Schematron, a language driven by XML stylesheet transformations, can be used to specify rules that make assertions about the content of XML documents. Even though DB2 pureXML is capable of XML schema registration, XML document validation, and XML stylesheet transformations, WebSphere DataPower SOA Appliances can complement DB2 pureXML solutions. For example, they can offload XML validation and transformation work from the DB2 processor by using DataPower's XML capabilities, in addition to its routing and security features.
This article demonstrates native storage of XML documents in a DB2 pureXML database after the documents have been successfully validated through a DataPower SOA Appliance, as shown in Figure 1. The validation performed through the DataPower box includes the validation of XML documents against their XML schema and content validation using Schematron. (For more details on Schematron, please refer to the Resources section of this article.)
Figure 1. Simplified scenario
A major benefit of this solution is that the WebSphere DataPower Appliance performs all validation steps, appropriate error handling, and the insertion of the XML document into the DB2 pureXML database, off-loading the validation steps from the database processor. Please note that the insertion is performed only if the document has successfully passed all validation steps.
This is the first of two articles in this series on WebSphere DataPower and DB2 pureXML. The second article will describe how DB2 pureXML can be used as an audit log that is easy to access and query for XML messages that are being routed, transformed, or validated through WebSphere DataPower.
Setting up the scenario
The following sections provide a detailed overview of how the scenario is set up, including a sample XML schema, sample XML documents, a Schematron example, a DB2 pureXML database, a Data Web Service, and the configuration of a WebSphere DataPower SOA Appliance.
Step 1: XML schema, XML documents, and Schematron
Any industry format based on XML can be used in this scenario, such as those used in the free and publicly available DB2 pureXML online demonstration "Industry Formats and Services with pureXML" (see Resources). This article uses a simple XML schema, shown in Listing 1, and corresponding XML documents, shown in Listings 2, 3, and 5.
Listing 1. Sample XML schema (simple.xsd)
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="identification" type="xs:integer"/>
        <xs:element name="name">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="first" type="xs:string"/>
              <xs:element name="last" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Based on the XML schema defined above, the following two XML sample documents are provided:
- The first sample document, as shown in Listing 2, is valid.
- The second sample document, as shown in Listing 3, contains well-formed XML, but is invalid according to the corresponding XML schema because the required element <identification/> is missing.
Listing 2. Sample XML document 1 (simple_1.xml)
<?xml version="1.0" encoding="utf-8"?>
<person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="simple.xsd">
  <identification>1</identification>
  <name>
    <first>christian</first>
    <last>pichler</last>
  </name>
</person>
Listing 3. Sample XML document 2 (simple_2.xml)
<?xml version="1.0" encoding="utf-8"?>
<person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="simple.xsd">
  <name>
    <first>christian</first>
    <last>pichler</last>
  </name>
</person>
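The difference between the two sample documents can be illustrated locally with a minimal sketch. The Python standard library has no XML Schema validator, so the function below is a hand-written approximation of the rules in simple.xsd, not real schema validation and not what the DataPower Appliance runs. The names check_person, SIMPLE_1, and SIMPLE_2 are hypothetical, and the namespace attributes of the sample documents are omitted for brevity.

```python
import re
import xml.etree.ElementTree as ET

# Condensed copies of the two sample documents (xsi attributes omitted).
SIMPLE_1 = (
    "<person><identification>1</identification>"
    "<name><first>christian</first><last>pichler</last></name></person>"
)
SIMPLE_2 = "<person><name><first>christian</first><last>pichler</last></name></person>"


def check_person(xml_text):
    """Return a list of structural problems; an empty list means none were found."""
    root = ET.fromstring(xml_text)  # raises ParseError if the text is not well-formed
    if root.tag != "person":
        return ["root element must be person"]
    # xs:sequence: exactly identification followed by name.
    if [child.tag for child in root] != ["identification", "name"]:
        return ["person must contain identification followed by name"]
    problems = []
    identification, name = root
    # xs:integer: optional sign followed by decimal digits.
    if not re.fullmatch(r"[+-]?[0-9]+", (identification.text or "").strip()):
        problems.append("identification must be an integer")
    if [child.tag for child in name] != ["first", "last"]:
        problems.append("name must contain first followed by last")
    return problems
```

Under these assumptions, check_person(SIMPLE_1) returns an empty list, while check_person(SIMPLE_2) reports the missing identification element even though the document is well-formed.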
To go a step further and validate the content of XML documents, a language called Schematron is used. Schematron is a declarative validation language that enables the checking and cross-checking of XML content through the specification of rules in XPath, and of custom error messages should the rules fail. This article does not cover Schematron details, but it is important to know that Schematron is driven by XML stylesheet transformations. First of all, Schematron rules need to be defined in an XML format, as shown in Listing 4. This "rules document" is transformed into an XSL stylesheet using the Schematron XSL stylesheet. The resulting, new XSL stylesheet is then applied to each XML document and will, if the content is not as expected, produce the custom error messages.
Listing 4. Schematron implementation (simple.sch)
<?xml version="1.0" encoding="utf-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <title>Simple Schematron Validation Example</title>
  <pattern name="Personal Information">
    <rule context="/person/name/first">
      <report test="text() = 'christian'">
        First name must not be 'christian'!
      </report>
    </rule>
  </pattern>
</schema>
The above example looks up the first name in an XML document, checks whether the first name equals "christian", and prints a validation failure message if it does. In case of validation failure, the message is: First name must not be 'christian'!
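As a rough illustration of what the rule in Listing 4 expresses, the sketch below evaluates the same condition directly in Python. It does not run Schematron or any XSL transformation; it only mirrors the single report rule, and the function name schematron_reports is hypothetical.

```python
import xml.etree.ElementTree as ET

MESSAGE = "First name must not be 'christian'!"


def schematron_reports(xml_text):
    """Return the messages that the single report rule would produce."""
    root = ET.fromstring(xml_text)
    reports = []
    if root.tag != "person":
        return reports
    # rule context="/person/name/first"
    for first in root.findall("./name/first"):
        # A report fires when its test is true: text() = 'christian'
        if (first.text or "") == "christian":
            reports.append(MESSAGE)
    return reports
```

A document whose first name is "keith" yields an empty list, while one whose first name is "christian" yields the custom message.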
Finally, a sample XML document is created that is valid with respect to validation against the XML schema and that satisfies the Schematron rule defined above:
Listing 5. Sample XML document 3 (simple_3.xml)
<?xml version="1.0" encoding="utf-8"?>
<person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="simple.xsd">
  <identification>3</identification>
  <name>
    <first>keith</first>
    <last>wells</last>
  </name>
</person>
Step 2: DB2 pureXML database and Data Web Services
This section describes the setup of the DB2 pureXML database that serves as the data store for XML documents after they have been validated. As shown in Listings 6 and 7, the setup consists of a database containing only one table and one stored procedure:
Listing 6. Setup of DB2 pureXML database (setup_environment.db2)
DROP DATABASE HOSPITAL@
CREATE DATABASE HOSPITAL USING CODESET UTF-8 TERRITORY US@
CONNECT TO HOSPITAL@
CREATE SCHEMA DB2ADMIN@
CREATE TABLE DB2ADMIN.PATIENT (
    ID INT PRIMARY KEY NOT NULL GENERATED ALWAYS AS IDENTITY,
    COMMENT VARCHAR(500),
    RECORD XML
)@
Listing 7. Setup of the stored procedure to insert record into the Patient table
CREATE PROCEDURE insertPatient (IN xmlRecord XML)
SPECIFIC insertPatient
DYNAMIC RESULT SETS 1
P1: BEGIN
    INSERT INTO DB2ADMIN.PATIENT (COMMENT, RECORD) VALUES ('', xmlRecord);
END P1@
The stored procedure insertPatient is then exposed through a Data Web Service, as shown in Figure 2, which means that the stored procedure can be called through SOAP or REST requests.
Figure 2. DB2 pureXML database configuration overview
This article does not cover any more details on how to create Data Web Services. If you need more information, please read the article "Generate Web services for DB2 9 pureXML" (developerWorks, June 2007).
Step 3: WebSphere DataPower SOA Appliance
A WebSphere DataPower SOA Appliance is a versatile device that can be used to, among many other functions, process XML documents in various ways. The features of the device that are discussed in this article include the validation of XML documents against an XML schema and XSL transformations.
Before going into details on the configuration itself, this article provides some theoretical background. The WebSphere DataPower SOA Appliance can serve in many different ways, including XML Firewall, Web Services Proxy, XSL Accelerator, and many others. The scenario in this article uses the XML Firewall. Every XML Firewall contains at least one processing policy, and all of those processing policies contain at least one processing rule. Within every processing rule, simple processing actions can be specified, which are, for example, XML schema validation, XPath-based routing, encryption, XML stylesheet transformations, and many others.
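The containment described above (firewall, policy, rule, action) can be pictured with a small data model. This is only an illustrative sketch and not the appliance's configuration format. The class names, the firewall name "validation-firewall", the rule name "Rule #1", and the stylesheet name "error_message.xsl" are hypothetical; "Rule #2", simple.xsd, and content_based_routing.xsl are taken from this article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Action:
    kind: str         # "on-error", "validate", or "transform" in this sketch
    target: str = ""  # rule, schema, or stylesheet the action refers to


@dataclass
class Rule:
    name: str
    actions: List[Action] = field(default_factory=list)


@dataclass
class Policy:
    rules: List[Rule]

    def __post_init__(self):
        # Every processing policy contains at least one processing rule.
        if not self.rules:
            raise ValueError("a processing policy needs at least one rule")


@dataclass
class XMLFirewall:
    name: str
    policies: List[Policy]

    def __post_init__(self):
        # Every XML Firewall contains at least one processing policy.
        if not self.policies:
            raise ValueError("an XML Firewall needs at least one policy")


def build_scenario():
    """Model the rules used in this article's scenario."""
    main_rule = Rule("Rule #1", [
        Action("on-error", "Rule #2"),
        Action("validate", "simple.xsd"),
        Action("transform", "content_based_routing.xsl"),
    ])
    error_rule = Rule("Rule #2", [Action("transform", "error_message.xsl")])
    return XMLFirewall("validation-firewall", [Policy([main_rule, error_rule])])
```

The model makes the ordering explicit: the on-error action is declared first so that it covers failures in the validation and transformation actions that follow it.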
The first step is to configure XML schema validation. In other words, XML documents being sent to this policy on the DataPower Appliance are validated against a particular XML schema. Configuring XML schema validation is achieved by adding an XML schema validation processing action to the processing rule, as shown in Figure 3, Number 3:
Figure 3. XML Firewall configuration of the DataPower SOA Appliance
If the validation action fails, the DataPower Appliance responds to the client that initially sent the XML document with a failure message and the HTTP 500 error code. The standard error message for this case does not contain any specific information on why the validation action failed. To provide more information, this example includes an on-error action in the rule, as shown in Figure 3, Number 2. The on-error action causes the policy to call another rule, named Rule #2 (shown in Figure 4, Number 1), if a fatal error occurs during any action in the rule:
Figure 4. XML Firewall configuration of the DataPower SOA Appliance
If Rule #2 is called, it will execute the XSL stylesheet, as shown in Listing 8:
Listing 8. XSL stylesheet returning the specific error message
<?xml version='1.0' encoding='UTF-8' ?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dp="http://www.datapower.com/extensions"
    extension-element-prefixes="dp"
    exclude-result-prefixes="dp">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="dp:variable('var://service/error-message')"/>
  </xsl:template>
</xsl:stylesheet>
The XSL stylesheet shown in Listing 8 obtains the specific error message that explains why the validation action failed and returns it to the client that issued the request.
Two tasks remain for the DataPower Appliance. The first is to apply the Schematron XSL stylesheet to the incoming XML document. The second is to route the request based on the result: if the Schematron stylesheet produces an error message, that message is sent back to the client that initially sent the XML document; if there is no error, the DataPower Appliance forwards the XML document to the DB2 pureXML Data Web Service, which then inserts the valid XML document into the database. Both tasks are performed by another XSL stylesheet transformation action, as shown in Figure 3, Number 4, which executes the XSL stylesheet shown in Listing 9:
Listing 9. XSL stylesheet executing Schematron XSL stylesheet and performing content-based routing, based on the Schematron processing result (content_based_routing.xsl)
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dp="http://www.datapower.com/extensions"
    extension-element-prefixes="dp"
    exclude-result-prefixes="dp">
  <xsl:output method="xml" />
  <xsl:template match="/">
    <xsl:variable name="schematronResult">
      <error>
        <xsl:value-of select="dp:transform('local:///simple.xsl', .)" />
      </error>
    </xsl:variable>
    <xsl:choose>
      <xsl:when test="$schematronResult/error/text()">
        <dp:send-error override="true">
          <xsl:copy-of select="$schematronResult" />
        </dp:send-error>
      </xsl:when>
      <xsl:otherwise>
        <dp:url-open target="http://db2:8080/healthcarepatient/rest/patient/insertPatient"
                     response="xml" data-type="xml" content-type="text/xml">
          <q0:insertPatient xmlns:q0="urn:example">
            <q0:_xFFFF_xmlRecord>
              <xsl:copy-of select="." />
            </q0:_xFFFF_xmlRecord>
          </q0:insertPatient>
        </dp:url-open>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
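To make the request body in Listing 9 easier to picture, the following sketch builds the same wrapper around a document outside the appliance. It assumes the element names and the urn:example namespace exactly as they appear in Listing 9; the function name build_insert_request is hypothetical, and nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

NS = "urn:example"  # namespace bound to the q0 prefix in Listing 9


def build_insert_request(xml_text):
    """Wrap a document in the insertPatient request element used in Listing 9."""
    ET.register_namespace("q0", NS)
    request = ET.Element("{%s}insertPatient" % NS)
    record = ET.SubElement(request, "{%s}_xFFFF_xmlRecord" % NS)
    # Equivalent of xsl:copy-of select=".": embed the whole incoming document.
    record.append(ET.fromstring(xml_text))
    return ET.tostring(request, encoding="unicode")
```

Calling build_insert_request with the content of simple_3.xml produces a q0:insertPatient element whose q0:_xFFFF_xmlRecord child contains the complete person document.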
It is also important to know whether the insert into the DB2 pureXML database through the Data Web Service was successful or not. Therefore, the DataPower XML Firewall will forward the response message from the DB2 Data Web Service to the client that initially sent the request XML document to the DataPower Appliance, indicating whether the insertion operation was successful or not.
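The overall decision flow of the processing rule can be summarized in a few lines. The sketch below is an assumption-laden outline rather than appliance code: the three steps are passed in as plain callables (validate_schema, run_schematron, and forward are hypothetical names), and each validation callable is assumed to return an error message, or None if the document passes.

```python
def process(document, validate_schema, run_schematron, forward):
    """Return (outcome, body), following the order of actions in the rule."""
    schema_message = validate_schema(document)
    if schema_message:
        # Corresponds to the on-error rule: return the specific error message.
        return ("schema-error", schema_message)
    schematron_message = run_schematron(document)
    if schematron_message:
        # Corresponds to dp:send-error with the <error> element as the body.
        return ("schematron-error", "<error>%s</error>" % schematron_message)
    # Only a document that passed both checks reaches the Data Web Service,
    # and the service's response is handed back to the client.
    return ("forwarded", forward(document))
```

Passing the steps in as parameters keeps the ordering visible: the forwarding callable is never invoked unless both validation steps return no message.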
After successfully setting up this example, it's now time to see the DataPower SOA Appliance, DB2 pureXML, and Data Web Services together in action. The XML documents that have been previously defined in this article are used in this demonstration.
Listing 10. Submitting the first XML document
cpichle@DAIRYFARM /tmp
$ curl --data-binary @simple_1.xml http://datapowerbox:2055/
<?xml version="1.0" encoding="UTF-8"?>
<error>First name must not be 'christian'! </error>
cpichle@DAIRYFARM /tmp
$
Listing 10 shows that for the XML document simple_1.xml, the XML schema validation action must have been successful. However, the presence of the <error> tag indicates that the Schematron validation failed because the first name supplied was "christian".
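For readers who prefer a script to curl, the same submission can be sketched with the Python standard library. The URL is the one used in the listings of this article; the function names are hypothetical. Because the article states that a failed validation action is answered with HTTP 500, the sketch catches the HTTPError that urllib raises for such responses and still returns the body.

```python
import urllib.error
import urllib.request

APPLIANCE_URL = "http://datapowerbox:2055/"  # host and port as used in the listings


def build_request(document_bytes, url=APPLIANCE_URL):
    """Build a POST request whose body is the raw document, like curl --data-binary."""
    return urllib.request.Request(url, data=document_bytes, method="POST")


def submit(document_bytes, url=APPLIANCE_URL):
    """Send the document and return (status, body); needs a reachable appliance."""
    try:
        with urllib.request.urlopen(build_request(document_bytes, url)) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the body still carries the message.
        return error.code, error.read().decode("utf-8")
```

A call such as submit(open("simple_1.xml", "rb").read()) would correspond to the curl command in Listing 10, assuming the appliance is reachable under that host name.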
Listing 11. Submitting the second XML document
cpichle@DAIRYFARM /tmp
$ curl --data-binary @simple_2.xml http://datapowerbox:2055/
http://datapowerbox:2055/: cvc-particle 3.1: in element person with anonymous type, found <name> (in default namespace), but next item should be identification
cpichle@DAIRYFARM /tmp
$
Listing 11 shows how the DataPower Appliance responds to receiving an XML document that does not conform to the schema. The error message indicates that the XML document was not valid when compared against the XML schema, and details on why the schema validation failed are included in the error message returned by the appliance.
Listing 12. Submitting the third XML document
cpichle@DAIRYFARM /tmp
$ curl --data-binary @simple_3.xml http://datapowerbox:2055/
<?xml version="1.0" encoding="UTF-8"?>
<ns1:insertPatientResponse xmlns:ns1="urn:example" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
cpichle@DAIRYFARM /tmp
$
After submitting the third example XML document (simple_3.xml), another response message is returned. The message returned has been generated by the Data Web Service, indicating that the insert of the XML document into the DB2 pureXML database was successful. This implies that the supplied document passed both the XML schema and Schematron validation steps.
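A client can tell the three outcomes apart by inspecting the response body. The following sketch is based only on the three responses shown in Listings 10 through 12; the function name and the outcome labels are hypothetical, and a real client would likely also check the HTTP status code.

```python
import xml.etree.ElementTree as ET


def classify_response(body):
    """Map a response body from the appliance to one of the observed outcomes."""
    try:
        root = ET.fromstring(body.strip())
    except ET.ParseError:
        # The schema validation message in Listing 11 is plain text, not XML.
        return "schema-validation-failed"
    local_name = root.tag.rsplit("}", 1)[-1]  # drop a "{namespace}" prefix if present
    if local_name == "error":
        return "schematron-failed"
    if local_name == "insertPatientResponse":
        return "inserted"
    return "unknown"
```

Only the "inserted" outcome means that the document passed both validation steps and was stored in the database.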
This article has shown how DB2 pureXML and the WebSphere DataPower SOA Appliance can complement each other to realize powerful applications, where the WebSphere DataPower appliance performs XML validation, and the DB2 pureXML database manages XML storage, indexing, and querying. Both XML structure validation (through XML schema) and content validation (through Schematron) have been described. The combination of the two products, WebSphere DataPower and DB2 pureXML, provides flexible and speedy access to validated XML documents.
Thank you to Bob Callaway and others who have contributed to this work by providing their knowledge and guiding advice.
Downloads for this article: download.zip (3KB)
- "WebSphere DataPower and DB2 pureXML, Part 2: DB2 pureXML as an audit log for WebSphere DataPower" (developerWorks, June 2008): See how IBM DB2 pureXML can further complement the WebSphere DataPower SOA appliance by providing an easily accessed and queried audit log. The example scenario illustrated in Part 2 is applicable to any situation where XML document instances are being exchanged.
- "Generate Web services for DB2 9 pureXML" (developerWorks, June 2007): Create Web services using a simple Java™ class to insert and retrieve XML data, into and from DB2 9 using the pureXML feature.
- "Exposing DB2 9 pureXML using WebSphere Integration Developer" (developerWorks, September 2007): Follow step-by-step instructions to build, test, and deploy a mediation module that can store well-formed XML documents in a DB2 XML column using a WebSphere Integration Developer module.
- "IBM Data Studio: Get Started with Data Web Services" (developerWorks, November 2007): Get a detailed yet simple introduction to developing your first Data Web Service.
- "Data Web Services: Build Web Services the new way to access IBM database servers" (developerWorks, December 2007): Create and customize a Data Web Service. This article provides useful theoretical background on Data Web Services, which includes an architectural overview on Data Web Services, and addresses different aspects of Data Web Services, such as security.
- Schematron: Get more information about Schematron. Schematron is a language for making assertions about patterns found in XML documents and can therefore be used to validate the content of XML documents.
- IBM WebSphere DataPower SOA Appliance: Get an overview of the IBM DataPower SOA Appliances product line.
- developerWorks Information Management zone: Learn more about Information Management. Find technical documentation, how-to articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
- Technology bookstore: Browse for books on these and other technical topics.
Get products and technologies
- Industry Formats and Services with pureXML: Download a great variety of examples, for free! Each example illustrates how to work with XML-based Industry Formats and pureXML. The examples show how to register an XML schema, how to perform validation of XML instance documents, how to query XML data using XQuery or SQL/XML, and much more.
- IBM Data Studio: Download the development environment used to develop Data Web Services, for free.
- DB2 Express-C: Download the free version of DB2, which includes the same core functionality as the other Data Servers, such as the pureXML technology. DB2 Express-C is free to develop, deploy, and distribute.
- Build your next development project with IBM trial software, available for download directly from developerWorks.
- Participate in the discussion forum.
- Participate in developerWorks blogs and get involved in the developerWorks community.