The DB2 9 pureXML feature enables firms to effectively manage both relational and XML data in one database. XML data can be stored and queried natively with SQL/XML or XQuery. Extensive XML parsing is avoided in applications, and all database parsing for XML takes place at insert time. XML schemas can be stored in the XML Schema Repository (XSR) for pureXML. DB2 pureXML data can optionally be validated against XML schemas stored in the XSR, ensuring that the stored data conforms to the desired schema.
WebSphere Service Registry and Repository (Service Registry) is a complete environment for storing, managing and governing services artifacts. It supports the storage and retrieval of XML schema definitions following a defined governance model. As part of this model, you can define access rights to specific artifacts, classify them according to taxonomies (for example, industry standard classifications), and define a lifecycle with specified states that each artifact will transition through.
This article describes how to use the Service Registry product to govern XML schemas that are stored in DB2 pureXML XSR to validate industry standard data, or any XML data. With proper governance, you can stop XML schema proliferation, enable consistent usage, and increase interoperability throughout your enterprise. This article shows how the schemas can automatically be synchronized between the two environments.
There are other potential relationships between DB2 and Service Registry. For example, you can use Data Studio to build Data Web Services to access DB2 data. With the Service Registry Eclipse plug-in installed in Data Studio, you can upload Web Services Description Language (WSDL) documents generated with Data Web Service tooling to a Service Registry repository. This article, however, does not cover this scenario.
A number of standards have been established to improve the exchange of information between enterprises. In many cases, these standards are defined for a particular industry and express the structure of relevant data in the form of XML schemas. One example of this is the Financial products Markup Language, or FpML (see Resources for more information). It describes a set of standard XML schemas for the exchange of information about complex financial products.
This article uses the FpML 4.3 schemas to show how to use the Service Registry and DB2 pureXML together. Note that this article does not go into the details of how this standard is used in any financial services scenario; it merely covers how to leverage standard schemas in both products in an integrated way. Hence, the article can be applied to other standard schemas, too.
How do you know which XML schema is in production? How do you ensure an XML schema is used consistently throughout your enterprise? Who is using the XML schema in your enterprise? Can your developers easily find the XML schema for reuse? What are the XML schema's dependencies? If you change this XML schema, who will be affected, and how can you notify them? Who has the authority to change the XML schema? How do you version the XML schema? These are just a few of the questions that XML schema governance can help you answer.
With proper governance of your XML schemas, you can enable developers to easily find not only XML schemas as a whole, but also the types within these schemas, thus fostering reuse. By governing your XML schemas, you can effectively manage the changes to the XML schemas throughout the entire lifecycle, from inception to retirement. As these changes occur, notifications can be sent to the consumers of the XML schema, reducing unplanned disruptions. By governing your XML schemas, you define access rights, versioning policies, and standards that the XML schemas must adhere to. All of these, when enforced, increase your IT organization's effectiveness.
This article introduces the Service Registry as a tool to enable and enforce governance of your XML schema documents.
Above, this article described how proper governance of your XML schema will benefit your enterprise. To have successful governance you need a tool to enable and enforce it. The Service Registry is a tool that does just that. The Service Registry is a single component with both registry and repository capabilities, which enables it to be an authoritative governance source for all your SOA artifacts. It stores and manages SOA artifacts such as WSDL, XML, XSD, WS-Policy, and binary documents. This allows you to view and manage how an artifact is utilized throughout your entire SOA. The Service Registry handles, among other things, the versioning of the artifacts when new ones are introduced, the security and access policies of those artifacts, the enforcement of policies upon lifecycle transition of the artifacts, the notifications when the artifacts are changed, and the taxonomy by which the artifacts can be structured.
This section highlights only a small subset of the Service Registry's capabilities. This article discusses the capabilities in terms of XML schema, but it is important to note that these capabilities apply to all SOA artifacts. (See the Resources section of this article for more detailed information on the Service Registry.)
The Service Registry enables management by promoting the visibility of the XML schema documents and the logical parts within the documents. Upon loading, it parses the XML schema document into its logical parts: complex types, simple types, elements, and attributes. The document and the logical parts can then be annotated with additional metadata, such as properties and classifications. This allows you to search the Service Registry not only for a specific document, but also for a specific logical part without knowing which document the part belongs to.
Figure 1. View of the complex types in the Service Registry
The Service Registry leverages Web Ontology Language (OWL) for its classification system (see Resources for more information). By utilizing OWL classification, it can provide additional vocabulary along with formal semantics to describe the XML schema assets. This, in turn, enables these assets to be more easily managed, searched, and interpreted by humans or machines.
For example, given the scenario introduced in this article, you can classify XML schema documents to be published to pureXML's XSR. You can then use the classification to automatically publish only the schemas that have been identified accordingly. You can also use this classification in the Service Registry's faceted search to quickly find XML schemas that will be or have been published to the XSR.
Figure 2 shows an example of such a search using the Service Registry UI:
Figure 2. Faceted search for the XML schema documents classified as pureXML
The Service Registry provides graphical impact analysis to enable you to fully understand who and what will be affected if the given XML schema changes. You have the ability to specify the dependency relationship options for performing impact analysis. For example, you can choose to show only artifacts that depend on this schema or show artifacts that this schema depends on. Moreover, you can specify the dependency depth. You can limit the impact analysis to pre-built relationships or custom-defined relationships. You can also choose to view the results of the impact analysis in graphical or text form.
Figure 3 shows an example of a graphical impact analysis for the XML schema document fpml-cd-4-3.xsd, which is part of the FpML standard schema set. The impact analysis shows all logical parts derived from fpml-cd-4-3.xsd and artifacts that depend on fpml-cd-4-3.xsd at the top of the view. It shows which artifacts the schema depends on at the bottom of the view. Note that there is a Viewing Window that helps you navigate and refocus the graphical view in the right navigation pane.
Figure 3. Impact analysis example
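The direction and depth options described above can be pictured as a depth-limited, breadth-first walk over a dependency graph. The following is a minimal sketch of that idea, not the Service Registry's implementation; the class `DependencyGraph`, its methods, and any file names other than fpml-cd-4-3.xsd are hypothetical and chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory dependency graph; not a Service Registry API.
class DependencyGraph {
    private final Map<String, List<String>> dependsOn = new HashMap<>();
    private final Map<String, List<String>> dependedOnBy = new HashMap<>();

    /** Records that artifact "from" depends on artifact "to". */
    void addDependency(String from, String to) {
        dependsOn.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        dependedOnBy.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    /**
     * Breadth-first walk of at most maxDepth hops from start.
     * outgoing = true follows "depends on" edges;
     * outgoing = false follows "is depended on by" edges.
     */
    Set<String> impact(String start, boolean outgoing, int maxDepth) {
        Map<String, List<String>> edges = outgoing ? dependsOn : dependedOnBy;
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String node : frontier) {
                for (String neighbor : edges.getOrDefault(node, Collections.<String>emptyList())) {
                    // Skip the start node and anything already reported.
                    if (!neighbor.equals(start) && seen.add(neighbor)) {
                        next.add(neighbor);
                    }
                }
            }
            frontier = next;
        }
        return seen;
    }
}
```

Calling `impact("fpml-cd-4-3.xsd", false, 1)` on such a graph would list only the direct consumers of the schema, while a larger depth also reaches the artifacts that depend on those consumers.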
SOA artifacts follow different lifecycles. For example, a service's lifecycle will be different from that of an XML schema. The Service Registry provides a customizable state machine-enforced governance lifecycle. This means you can define lifecycles for your SOA artifacts to fit your needs. Sample lifecycles for services, interfaces, policies, and contracts are available out of the box and can be modified. The Service Registry provides validation and notification interfaces, which allow you to write custom validation and notification code. This code, in turn, is invoked using a validation or notification plug-in whenever an artifact is made governable, is removed from governance, or transitions to a new state in the lifecycle. The validation plug-in is invoked before the operation (for example, a transition) actually occurs. If the artifact does not meet all the requirements specified by the validation plug-in, the operation will not occur, and an error is returned.
The notification plug-in, on the other hand, is invoked after the operation has occurred. The customizable governance lifecycle, along with the validation and notification interfaces, provides powerful capabilities to ensure that the governance policies are enforced at each stage of the artifacts' lifecycle, to ensure that the appropriate people are notified when changes occur, and to help automate the governance process.
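The ordering described above (validators run before the operation and can veto it, notifiers run only after it has succeeded) can be sketched in a few lines. The interfaces `TransitionValidator` and `TransitionNotifier` and the class `GovernedTransition` are hypothetical stand-ins invented for this illustration; they are not the Service Registry plug-in interfaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hooks that only illustrate the call order;
// these are NOT the Service Registry interfaces.
interface TransitionValidator {
    /** Returns an error message, or null if the artifact may transition. */
    String validate(String artifact, String targetState);
}

interface TransitionNotifier {
    void transitioned(String artifact, String targetState);
}

class GovernedTransition {
    private final List<TransitionValidator> validators = new ArrayList<>();
    private final List<TransitionNotifier> notifiers = new ArrayList<>();

    void addValidator(TransitionValidator v) { validators.add(v); }
    void addNotifier(TransitionNotifier n) { notifiers.add(n); }

    /** Returns true if the transition happened, false if a validator vetoed it. */
    boolean transition(String artifact, String targetState) {
        for (TransitionValidator v : validators) {
            if (v.validate(artifact, targetState) != null) {
                return false; // the operation does not occur; no notifier is called
            }
        }
        // ... the state change itself would be applied here ...
        for (TransitionNotifier n : notifiers) {
            n.transitioned(artifact, targetState);
        }
        return true;
    }
}
```

In this sketch a veto is reported as a `false` return value for brevity; a real plug-in would more likely surface the validator's error message to the caller.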
Let's take a look at an example of how an XML schema lifecycle is leveraged using the standard industry schema FpML. When a new version of FpML is released, there are things you need to consider before you can move it into production. For example, you may have extended the FpML schema to better fit your particular business. You need to ensure that those extensions are applied to the latest version of FpML before you move it to production. Developers embarking on new projects, or updating existing ones, need to understand where the FpML versions are in their lifecycle in order to select the appropriate version to use based on the project's timeline. Applying and enforcing the XML schema lifecycle ensures that all considerations for the FpML version have been addressed; it enables developers to choose the appropriate version of the schema, and it notifies all interested parties of any state transitions.
Below is an example governance lifecycle state machine for XML schema. The state machine has the following states:
- Specify - A new version of FpML has been loaded into the Service Registry. When all the appropriate approvals and policies have been met to begin testing the new schema set, it can transition to the Test state.
- Test - When a schema artifact is transitioned to this state, a custom-developed notification plug-in is invoked and the schema is published to a test instance of pureXML XSR. When testing has completed and, again, all approvals and policies are met, the FpML schemas can transition to the Publish state.
- Publish - After the FpML schemas have transitioned to this state, yet another custom-developed notification plug-in is invoked and the schema is published to a production instance of pureXML XSR.
- Retired - When the FpML schemas are transitioned to this state, a notification plug-in is invoked and the schemas are deleted from all instances of pureXML XSR.
Figure 4 shows the states and transitions of this sample lifecycle as a UML state machine diagram:
Figure 4. Sample XML schema lifecycle
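The four states listed above can be expressed as a small table-driven state machine. This is a minimal sketch under stated assumptions: the transition names ApprovedForTest and Publish are taken from the walkthrough later in this article, while the transition name "Retire" and the choice that Retired is reachable only from Publish are assumptions, since the figure is not reproduced here. `SchemaState` and `SchemaLifecycle` are hypothetical names.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

enum SchemaState { SPECIFY, TEST, PUBLISH, RETIRED }

// Hypothetical model of the sample lifecycle; not a Service Registry API.
class SchemaLifecycle {
    private static final Map<SchemaState, Map<String, SchemaState>> TRANSITIONS =
            new EnumMap<>(SchemaState.class);

    static {
        add(SchemaState.SPECIFY, "ApprovedForTest", SchemaState.TEST);
        add(SchemaState.TEST, "Publish", SchemaState.PUBLISH);
        // Assumption: the transition name and its source state are not given in the text.
        add(SchemaState.PUBLISH, "Retire", SchemaState.RETIRED);
    }

    private static void add(SchemaState from, String name, SchemaState to) {
        TRANSITIONS.computeIfAbsent(from, k -> new HashMap<>()).put(name, to);
    }

    private SchemaState state = SchemaState.SPECIFY;

    SchemaState state() { return state; }

    /** Applies a named transition, or throws if it is not allowed in the current state. */
    SchemaState apply(String transition) {
        Map<String, SchemaState> allowed = TRANSITIONS.get(state);
        SchemaState next = (allowed == null) ? null : allowed.get(transition);
        if (next == null) {
            throw new IllegalStateException(
                    "Transition '" + transition + "' is not allowed in state " + state);
        }
        state = next;
        return state;
    }
}
```

Because every move goes through `apply`, a schema cannot jump from Specify straight to Publish, which is the property the enforced lifecycle is meant to guarantee.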
As described above, the Service Registry handles all aspects of storing and governing artifacts related to your SOA. It is here where, among other things, new versions of those artifacts are introduced, where they can be structured according to a common taxonomy, and where the impact of changes to one artifact on other artifacts can be appropriately analyzed.
At the same time, DB2 pureXML is the product where the actual runtime data is stored (and retrieved, of course). XML data can be handled without the need to convert this data into a relational form. Moreover, structural compliance of this data with the proper XML schema can be enforced with the XSR.
This section shows how you can govern XML schema in the Service Registry and then synchronize it with the DB2 pureXML XSR for runtime use. Figure 5 shows an overview of how both the Service Registry and pureXML handle FpML schemas for different purposes:
Figure 5. XML schemas are synchronized between Service Registry and DB2 pureXML XSR
All aspects of governance, as described earlier in this article, are handled exclusively by the Service Registry. Any change to an existing schema is managed there and must be properly approved before being deployed to a production runtime system. Deployment to this runtime environment is done by automatically invoking DB2 pureXML scripts that load the appropriate schema artifacts into the database. There, they can be associated with XML documents that are stored and retrieved by business applications.
As mentioned above, the Service Registry offers a notification plug-in that is invoked whenever a particular artifact transitions from one state to another. The actual logic that is executed when this happens is defined in the form of so-called governance notifiers. These notifiers are executed when an artifact has been created, deleted, or changed, or when it has transitioned to a new state in its lifecycle.
A governance notifier is a custom-coded class that implements the governance notifier interface. It contains methods for creation, deletion, update, and transitioning that you can implement to contain whichever logic is appropriate. You can create more than one governance-notifier class. Which governance notifiers are actually executed is configured with the notification plug-in properties file.
Let's apply this to the lifecycle example for FpML schemas that was introduced earlier. Assume that all of the FpML schemas, which you can govern as one collection in the Service Registry, go from the "Test" state to the "Publish" state. The notification plug-in properties file is configured so that a specific governance notifier class is invoked. This class implements a method called transition, which has access to the content of each artifact that is transitioned to the new state. It can now extract this content and, you guessed it, send it to the pureXML XSR.
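To make the idea concrete, the following is a minimal sketch of a notifier whose transition method writes an artifact's content to an export directory when the target state is Publish. The interface `LifecycleNotifier`, its method signature, and the class `ExportOnPublishNotifier` are hypothetical; the real Service Registry interface and the notifier shipped with this article may look quite different.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical callback shape; the real Service Registry interface and its
// method signatures are not reproduced here.
interface LifecycleNotifier {
    void transition(String artifactName, byte[] content, String newState) throws IOException;
}

class ExportOnPublishNotifier implements LifecycleNotifier {
    private final Path exportDir;

    ExportOnPublishNotifier(Path exportDir) {
        this.exportDir = exportDir;
    }

    @Override
    public void transition(String artifactName, byte[] content, String newState)
            throws IOException {
        if (!"Publish".equals(newState)) {
            return; // only the transition into the Publish state triggers an export
        }
        Files.createDirectories(exportDir);
        Files.write(exportDir.resolve(artifactName), content);
        // A separate step (for example, a script) would then load the
        // exported files into the XSR.
    }
}
```

Keeping the export separate from the database load means the notifier itself needs no database driver; it only hands files to whatever loads them.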
Schemas are loaded into the pureXML XSR by invoking a set of stored procedures. These stored procedures can easily be called from within a script and thus be automated to occur at predefined times. To install a schema that has never been installed before, follow these steps:
- Register the main XML schema, using the
- Optionally add additional XML schema documents that are related to the main schema, using the ADD XMLSCHEMA DOCUMENT command.
- Complete the process by calling the
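The three steps above can be sketched as a small helper that generates the corresponding DB2 command line processor statements. This is a sketch under assumptions: it assumes the closing step corresponds to the CLP command COMPLETE XMLSCHEMA and that the commands take the general form shown (check the DB2 Command Reference for the exact syntax of your release). The class `XsrRegistrationScript`, the relational identifier FPML.FPML43, and the example.org URIs are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that only builds command text; it does not talk to DB2.
class XsrRegistrationScript {

    /**
     * @param relationalId   SQL identifier of the schema in the XSR (assumed name)
     * @param mainUri        schema URI of the main schema document
     * @param mainFile       file URI where the main schema document is located
     * @param additionalDocs document URI -> file URI for each related document
     */
    static List<String> commands(String relationalId, String mainUri, String mainFile,
                                 Map<String, String> additionalDocs) {
        List<String> out = new ArrayList<>();
        // Step 1: register the main schema document.
        out.add("REGISTER XMLSCHEMA '" + mainUri + "' FROM '" + mainFile
                + "' AS " + relationalId);
        // Step 2: optionally add related schema documents.
        for (Map.Entry<String, String> doc : additionalDocs.entrySet()) {
            out.add("ADD XMLSCHEMA DOCUMENT TO " + relationalId
                    + " ADD '" + doc.getKey() + "' FROM '" + doc.getValue() + "'");
        }
        // Step 3: complete the registration (assumed command name).
        out.add("COMPLETE XMLSCHEMA " + relationalId);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> extra = new LinkedHashMap<>();
        extra.put("http://example.org/fpml/fpml-cd-4-3.xsd",
                  "file:///c:/pureXMLArticle/publish/fpml-cd-4-3.xsd");
        for (String cmd : commands("FPML.FPML43",
                "http://example.org/fpml/fpml-main-4-3.xsd",
                "file:///c:/pureXMLArticle/publish/fpml-main-4-3.xsd", extra)) {
            System.out.println(cmd + ";");
        }
    }
}
```

Generating the statements as text keeps the example independent of any database connection; the output could be written to a file and run from a script.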
For a schema that already exists in the XSR and that has changed, pureXML provides a command called UPDATE XMLSCHEMA. This command can only be used in cases where the change to the schema is backward-compatible, and the updated schema is automatically applied to all XML documents that are stored in the database. When evolving a schema from version to version, you have to carefully consider whether to store a new version of the schema as an updated schema in the XSR, or whether to store it as a new document altogether.
There are various levels of detail at which version numbers can be applied. You can version a service as a whole, including all of its related artifacts. You can also version individual files such as XML schemas as they are being developed, and checked into and out of a Source Code Management system. Finally, industry standards, like the FpML standard used for the scenario in this article, have versions, too.
This article does not describe a detailed approach to service versioning. Instead, it assumes that the industry standard we use will undergo only infrequent changes and thus will not change its version very often. In that respect, it follows a slower lifecycle than other artifacts in your service portfolio.
Above, this article described the command for storing updated, backward-compatible versions of a schema in the XSR. Here, it is assumed that each new version of the FpML standard is not backward-compatible. This means that existing XML documents that are stored in DB2 pureXML are not compliant with any newer versions of the FpML schemas. Thus, you cannot use that update command, but need to use REGISTER XMLSCHEMA instead.
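The choice between the two approaches can be sketched as a small decision helper. This is a sketch under assumptions, not the article's notifier: it assumes that a new version is always registered under its own relational identifier, and that only a backward-compatible version is then folded into the existing schema with UPDATE XMLSCHEMA ... WITH ... DROP NEW SCHEMA, which is how we understand the CLP syntax. The class `SchemaEvolution` and the identifiers FPML.FPML43 and FPML.FPML44 are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that only builds command text; it does not talk to DB2.
class SchemaEvolution {

    static List<String> commands(boolean backwardCompatible, String currentId, String newId,
                                 String schemaUri, String contentUri) {
        List<String> out = new ArrayList<>();
        // The new version is registered under its own identifier in both cases.
        out.add("REGISTER XMLSCHEMA '" + schemaUri + "' FROM '" + contentUri + "' AS " + newId);
        out.add("COMPLETE XMLSCHEMA " + newId);
        if (backwardCompatible) {
            // Assumption: a compatible version replaces the existing schema in place,
            // so documents validated against currentId remain associated with it.
            out.add("UPDATE XMLSCHEMA " + currentId + " WITH " + newId + " DROP NEW SCHEMA");
        }
        // Otherwise the new version simply lives next to the old one.
        return out;
    }
}
```

For the FpML scenario in this article, `backwardCompatible` would be false, so each version of the standard ends up as a separate entry in the XSR.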
This section describes how you can run a simple sample scenario on your own computer. We assume that you have an instance of Service Registry and DB2 9.5 installed. Note that you can download a free copy of DB2 9.5 Express-C (see Resources).
The Service Registry comes with a default profile, which you can modify and extend for your business needs. For this article, we extended the default profile to include a governance lifecycle for XML schema, an ontology to classify the schema as pureXML, and a custom notification plug-in to automatically publish the schema to pureXML's XSR. (For more detailed information on the Service Registry's profile, lifecycle, and notification capabilities, consult the Resources section of this article.)
To load and activate the pureXML Service Registry profile, complete the following tasks from the Service Registry Console (http://yourhost/ServiceRegistry):
- Switch to the configuration perspective by selecting Configuration from the perspective drop-down list, and click on Go.
- Expand Manage Configuration Profiles.
- Click on the Configuration Profiles link.
- Click on Load Configuration Profile.
- Select Browse, and browse to c:\pureXMLArticle\PUREXML_PROFILE.zip.
- Enter PUREXML in the "Provide the Profile configuration item name" field.
- Click on OK.
- Select the PUREXML profile, and click on Make Active.
Figure 6. Active pureXML profile
To load the FpML schemas into the Service Registry, retrieve the FpML.zip from the Download section of this article, and then perform the following tasks from the Service Registry Console:
- Select Administrator from the perspectives drop-down list, and click on Go.
- In the "Load Documents" section, select Local file system, and then browse to the location of the FpML.zip.
- Select ZIP/JAR file from the Document type drop-down list.
- In the "Version" field, enter
- Optionally, enter a description. You can leave the namespace field empty, since the namespace property will be filled in automatically, based on the namespace of the loaded schema.
- Click on OK.
Figure 7. Batch load the FpML schemas into the Service Registry
The Service Registry automatically detects document dependencies and then helps you resolve them. It checks whether the dependent documents already exist in the Service Registry. If they do exist, you can either have the Service Registry automatically create a relationship between the documents, or choose to load a new version of the dependent document.
Figure 8. Resolving document dependencies
Save these schemas as a group. This enables you to govern them as a collection, as described above.
- Click Save as Group.
- Enter FpML43 in the "Name" field.
- Click on Finish.
Figure 9. Save the FpML schema documents as a group
After the schema documents have been loaded, you can graphically navigate the schema documents' dependencies by selecting Service Documents -> XSD Documents from the left navigation and clicking on the graph icon. See Figures 10 and 11:
Figure 10. From the XSD Documents view you can click on the Graph icon to navigate dependencies
Figure 11. Graphical navigation showing only external relationships starting at the fpml-cd-4-3.xsd document
In this section, you will place the group of schemas you imported in the previous section under governance, you will transition the governance state to Published, and you will verify the publishing of the schema to pureXML XSR was successful. When transitioned to the Published state, a notification event will invoke the custom notification plug-in, which we have created and configured for you as part of the pureXML Service Registry profile. The notification plug-in will export the schema to a directory and execute a script, which will then publish the exported schema to the XSR. Figure 12 shows the configuration of the PureXMLNotifier in the notification properties:
Figure 12. The notification properties configuration of the PureXMLNotifier
You can find the source code for the com.ibm.wsrr.purexml.PureXMLNotifier class in the download material for this article (see Download). The notifier extracts the appropriate content from the registry and then uses the DB2 commands for loading schema into the XSR, as described above.
Govern the schema group
To govern the schema group, perform the following tasks from the Service Registry Console:
- Expand Service Documents, and select Document Groups.
- Select FpML43.
This brings you to the FpML43 document group's detail page. From here you can view and add properties, classifications, relationships to other artifacts in the registry, and additional document group members. If you click on the Document Group Members link, you see the schema documents you imported in the previous section.
- Click on the Governance tab.
- Select InitiateXMLSchemaLifeCycle from the "Initial state transactions" drop-down list. You defined this lifecycle in the schema governance lifecycle section of this article. See Figure 4 for the lifecycle state machine.
- Click on Govern. Figure 13 shows the FpML43 document group governed, and it is in the Specify state:
Figure 13. FpML43 document group placed under governance
Transition the schema group's lifecycle state to Published
The current state of the FpML43 document group is Specify. To transition to the Published state, you must first transition to the Test state. The schema governance lifecycle section stated that when you transition to the Test state, a notification plug-in is invoked and the schemas are published to a test pureXML XSR instance. To keep this article simple, we did not configure the notification plug-in to be invoked upon the transition to the Test state.
Transition to the Published state:
- Select ApprovedForTest, and click on Transition. This puts you in the Test state.
- Select Publish, and click on Transition.
The notification plug-in will be invoked after a successful transition to the Published state. This notification will publish the schema to pureXML's XSR. Figure 14 shows a successful transition to the Published state. In the next section, you will verify that the schemas have actually been published to the XSR.
Figure 14. FpML43 document group in the Published state
Verification of publication to pureXML's XSR
The custom notification plug-in with the custom-coded notifier class first exports the schemas to the c:\pureXMLArticle\publish directory and then executes the bat file called C:\PureXMLArticle\articlescripts\start.bat, which publishes the schema to XSR.
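The second half of that sequence, running a batch file from Java once the export has finished, can be sketched with the standard ProcessBuilder class. This is a minimal sketch and not the code of the notifier shipped with this article; the class `PublishScriptLauncher` is hypothetical, and using the export directory as the working directory is an assumption.

```java
import java.io.File;
import java.util.List;

// Hypothetical launcher; it prepares the process but leaves starting it to the caller.
class PublishScriptLauncher {

    /** Builds (but does not start) the process that would run a Windows batch file. */
    static ProcessBuilder forScript(String scriptPath, String workingDir) {
        // cmd.exe /c runs the given batch file and then exits.
        ProcessBuilder pb = new ProcessBuilder("cmd.exe", "/c", scriptPath);
        pb.directory(new File(workingDir)); // assumption: run from the export directory
        pb.redirectErrorStream(true);       // merge stderr into stdout for a single log
        return pb;
    }

    public static void main(String[] args) {
        ProcessBuilder pb = forScript(
                "C:\\PureXMLArticle\\articlescripts\\start.bat",
                "c:\\pureXMLArticle\\publish");
        List<String> command = pb.command();
        System.out.println("Would run: " + String.join(" ", command));
        // To actually run the script on Windows: pb.start().waitFor()
    }
}
```

Separating the construction of the process from starting it makes the command line easy to log and to check before anything is executed against the database.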
To verify that the schema publishing has completed successfully, you can use the DB2 Control Center. After running the final state transition of the schemas, as described above, you should see a new database named "FPML43". Note that if you expand this database in the Control Center, you see an entry called "XML Schema Repository (XSR)". When you click on it, you can view the new entries in the XSR, as shown in Figure 15, below:
Figure 15. The FpML schemas stored in the pureXML XSR
You can select an individual schema and view it using the Open option. Figure 16 shows an example of this:
Figure 16. The DB2 XML Document Viewer
You can now begin to load XML documents into the database that follow the imported schemas and start taking full advantage of the functionality offered by pureXML.
This article showed you how to utilize the WebSphere Service Registry and Repository to store, manage, and govern standard industry XML schemas. It also described how these schemas can be automatically synchronized with the XML Schema Repository (XSR) in DB2 pureXML, where they are used for data validation and other functions around the storage and retrieval of standards-compliant XML data. As an example, you leveraged the FpML industry schema for the financial services industry.
The scenario showed how Service Registry serves as the master registry and repository for an overall service oriented architecture, with DB2 pureXML serving as the implementation platform for handling (XML) data. Using the Service Registry with DB2 pureXML prevents XML schema proliferation, enables consistent usage, and increases interoperability throughout your enterprise. You can easily find not only XML schemas as a whole, but also the types within these schemas, thus fostering reuse. Moreover, you can effectively manage the changes to the XML schemas throughout the entire lifecycle, from inception to retirement.
Download: Sample content and code for this article (pureXMLArticle.zip, 357KB, HTTP)
Service Registry and Repository Information Center: Find complete product documentation.
"Introducing IBM WebSphere Service Registry and Repository, Part 1: A day in the life of WebSphere Service Registry and Repository in the SOA life cycle" (developerWorks, September 2006): Get an introduction to the main concepts and capabilities of IBM WebSphere Service Registry and Repository. This article explains the role of Service Registry and Repository throughout the service-oriented architecture (SOA) life cycle and provides resources to help you learn more about it.
"Introducing IBM WebSphere Service Registry and Repository, Part 2: Architecture, APIs, and (developerWorks, September 2006): Explore an architectural overview of WebSphere Service Registry and Repository and its capabilities. This article describes how WebSphere Service Registry and Repository components work together to advertise, find, retrieve, manage, and govern service metadata.
Universal Database Information Center: Find detailed information about the DB2 Control Center.
DB2 Express-C: Find DB2 Express-C product information.
IBM Data Studio: Learn more about Data Studio, and use it to build your data solution.
"Get started with Industry Formats and Services with pureXML" (developerWorks, May 2007): Discover a fastpath to storing your industry XML content in DB2.
FpML: Find more information about FpML.
Web Ontology Language (OWL): Learn more about the Web Ontology Language.
Schema Registration and Validation in DB2" (IBM, June 2006): Learn more about how to manage XML schemas in DB2.
developerWorks resource page for DB2 for Linux, UNIX, and Windows: Find articles and tutorials and connect to other resources to expand your DB2
developerWorks Information Management zone: Learn more about Information Management. Find technical documentation, how-to articles, education, downloads, product information, and more.
Get products and technologies
DB2 pureXML FpML software
Get the free DB2 pureXML FpML software bundle at IBM alphaWorks.
Download a free trial version of DB2 Enterprise 9.
Now you can use DB2 for free. Download a no-charge version of DB2 Express Edition
for the community that offers the same core data features as DB2 Express Edition
and provides a solid base to build and deploy applications.
Laura Olson is a Certified Consulting IT Specialist at IBM in Rochester, Minnesota. She specializes in helping customers reach the full potential of their SOA by enabling SOA governance. Previously, she was the lead portal architect for the Advanced Web Technologies group in IBM where she architected and created the My Software Support portal. She likes to spend her spare time with her family and working on her private pilot's license.
Andre Tost works as a Senior Technical Staff Member in the IBM Software Services for WebSphere organization, where he helps IBM's customers establish Service-Oriented Architectures. His special focus is on Web services and Enterprise Service Bus technology. Before his current assignment, he spent ten years in various partner enablement, development, and architecture roles in IBM software development, most recently for the WebSphere Business Development group. Originally from Germany, he now lives and works in Rochester, Minnesota. In his spare time, he likes to spend time with his family and play and watch soccer whenever possible.