Use DB2 pureXML to implement a healthcare industry data solution

A Query Existing Data (QED) solution

Interoperability and standards are the current buzzwords in the healthcare industry. Standards are key to giving hospitals and doctors the ability to interoperate and share patient records more effectively. IBM Research has been investigating the evolution of healthcare industry standards, including the IHE and HL7 standards. This article briefly introduces these standards and protocols, and it presents a scenario of an IBM DB2® pureXML® solution that follows the IHE QED protocol.


Amnon Shabo, Research Relationship Manager, WSO2 Inc

Amnon Shabo (Shvo), PhD, works at the IBM Research Lab in Haifa as a research staff member specializing in health informatics. Amnon heads the Healthcare and Life Sciences Standards Program in IBM. He holds leading positions in HL7 (Health Level 7, a major standards-developing organization dedicated to health information). Amnon is leading the contribution of the Haifa Healthcare and Life Sciences group to the Hypergenes project, funded by the European Union to explore the genetic background of essential hypertension, a study conducted by a consortium of about twenty European partners. Amnon specializes in longitudinal and cross-institutional Electronic Health Records (EHR). A pioneer of the Independent Health Record Banks (IHRBs) vision, Amnon has promoted this idea for the past decade in various venues, including the U.S. Congress, where he gave a special briefing to Congressional Policy Staff (2007) on the new legislation introduced on IHRB.

Mary Desisto, Solutions Architect, IBM

Mary Desisto is a Solutions Architect in the IM Technologies group. She has been focused on DB2 pureXML since 2007. Before that, she was a technical specialist in the ECM group.

Anna Burla, Software Developer, WSO2 Inc

Anna Burla works in the Healthcare and Life Sciences team at the IBM Haifa Research Lab. She is currently working on a European project called HyperGenes, which creates personalized treatment for essential hypertension. Anna is currently studying at the Technion for a Bachelor of Science in Computer Engineering. She has experience in the design and implementation of large-scale systems in the military industry.

Yonatan Maman, Research Relationship Manager, WSO2 Inc

Yonatan Maman is a research staff member at IBM Research Labs in Haifa. He received his Bachelor of Science in Software Engineering from the Technion. He is mainly involved in various IBM Healthcare and Life Sciences projects, focusing on interoperability and standards-based clinical and genomic repositories. Currently he works on a project that empowers patients and physicians by providing them with the means to monitor patient conditions, exploit live medical knowledge from various sources, and harness cutting-edge analysis techniques to increase patient safety, focusing on adverse drug events (ADE). Yonatan has experience in the design and implementation of systems based on IBM and open source technologies.

04 March 2010



This article does the following:

  • Introduces the architecture of the QED implementation
  • Describes the components of the QED solution
  • Introduces the queries that the QED profile requires
  • Illustrates the queries using the XQuery language from the W3C

Understanding the terms

About the IHE

Integrating the Healthcare Enterprise (IHE) is an initiative by healthcare professionals and the industry to improve the way computer systems in healthcare share information. Systems that support IHE integration profiles work together better, are easier to implement, and help care-providers use information more effectively.

The IHE initiative's goal is efficient delivery of optimal patient care. Several hundred products support one or more IHE profiles. A typical IHE profile uses existing standards, such as HL7, for administrative and clinical data. A typical IHE profile also uses DICOM for medical imaging information. The profile constrains each of the selected messages to meet the requested business use-cases.

About QED

Query Existing Data (QED) is one of the integration profiles that support dynamic queries for clinical data. QED defines a use-case in which a clinical consumer system queries a clinical source system for clinical information. QED defines different types of queries that are all patient centric. A clinical consumer system might request various types of clinical data for a specific patient, including vital signs, allergies, medications, immunizations, diagnostic results, procedures, and visit history. A QED profile leverages HL7 domain standards for the content model, and it leverages the common HL7 messaging formats for conveying both query and result. The response is composed of clinical statements, such as CDA snippets, that describe clinical data according to the query's parameters. HL7 developed the XML Schema Generator to enable the representation of the standard through XML.

About RIM

The Reference Information Model (RIM) is an essential part of the HL7 V3 development methodology. RIM expresses the data content needed in a specific clinical or administrative context. RIM provides an explicit representation of the semantic and lexical connections that exist between the information carried in the HL7 message fields. RIM is an ANSI-approved and ISO-approved standard. RIM-based applications and repositories have been developed based on relational schemas.


About the Connectathon

The IHE North American Connectathon is a week-long testing event managed by its sponsoring organizations, including the Healthcare Information and Management Systems Society (HIMSS) and the Radiological Society of North America (RSNA). The goal of IHE Connectathons worldwide is to promote standards-based IHE interoperability solutions in commercially available healthcare IT systems. IHE Connectathons serve as industry-wide testing events where participants can test their implementations against those of other vendors.

About RIMon

In February 2009, an IBM Haifa Research team presented an IHE QED solution for querying patient clinical data at the North American 2009 Connectathon. The team successfully demonstrated that QED queries issued by other vendors' systems (such as General Electric, Corp.) could be run against the IBM QED data source using a data warehouse that integrates clinical data in various formats into a standardized framework. This article describes the QED solution, called RIMon, and its benefits to the healthcare industry when dispersed clinical data is queried.

RIMon, the HL7 RIM-based warehouse, introduces the usage of XML at the persistence layer, as shown in Figure 1. RIMon supports semantic interoperability, because XML is the preferred implementation technology of the HL7 specifications.

Figure 1. UML class diagram of the Reference Information Model
Flow diagram showing a defined Entity, Role, RoleLink, Participation, Act, and ActRelationship


About DB2 pureXML

DB2 pureXML is an ideal vehicle to implement a QED clinical data source system because the return format QED queries need (that is, clinical statements represented in XML) is already available in RIMon. The RIMon data warehouse enables the integration of data from various sources and in various formats into a standardized framework that is best maintained using HL7 XML representation.

Understanding the QED architecture

The solution from the IBM Haifa Research Team is shown in Figure 2.

Figure 2. IBM QED profile solution architecture
Request to data source interface, fills request from RIMon warehouse and response back to data consumer

The major components

The major components of the IBM Research solution are the clinical data source, the clinical data source interface, and a clinical data consumer. The clinical data source consists of a RIMon repository. The RIMon repository has a DB2 database in which the CDA documents are inserted as XML using the pureXML feature of DB2 V9. All the results required by the QED standard queries are within the CDA documents. Static XQueries written against the inserted CDA documents can extract the data needed to respond to the QED standard queries. These queries can be run in a batch mode to create tables containing:

  • The required data to respond to the QED queries
  • Additional attribute data
  • The referenced XPath from which the data originated

Someone using the IBM Research solution might choose to extract selected data from the XML document for various reasons. A user might have many different schemas in an XML column. Or, if the XML documents are very large and complex, extracting the portion of data required by a particular application might help adapt the data structure to a specific use case. Keeping the entire XML document intact is nevertheless required to construct QED query results.

Making the complexity usable

The attribute mapping, the XPath of the data, and the additional data in a configuration file are all needed because the CDA structure is very complex. The CDA structure is intentionally generic, because its schema is a standard designed for almost any type of medical observation or action. Someone reading a CDA document directly, or trying to query one (without a preliminary understanding of the varied templates in the warehouse), would have difficulty understanding all the embedded medical codes that represent the actual medical procedure or task that took place in the doctor's office. For example, the code for a heart beat is 8867-4, and an allergy is identified by its own template ID. The codes in CDA are derived from standard health terminologies, including SNOMED and LOINC, but non-medical users are typically not familiar with them. The mapping allows a non-medical person to ask for a heart beat as defined by the QED specifications without knowing the code in the standard global terminologies. Heart beat is mapped to code 8867-4 internally, and the correct observation data is returned.
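The mapping idea can be sketched in a few lines of Python. This is only an illustration, not the RIMon configuration format: the dictionary name, the function name, and the choice of Python are assumptions. The only value taken from the article is the code 8867-4 for heart beat.

```python
# Hypothetical mapping from friendly attribute names to terminology codes.
# Only the heart beat entry (code 8867-4) comes from the article.
ATTRIBUTE_TO_CODE = {
    "HEART_BEAT": "8867-4",
}


def resolve_code(friendly_name: str) -> str:
    """Return the terminology code for a friendly name such as 'heart beat'."""
    key = friendly_name.strip().upper().replace(" ", "_")
    try:
        return ATTRIBUTE_TO_CODE[key]
    except KeyError:
        # Unknown names fail loudly instead of silently returning no data.
        raise ValueError(f"no code mapped for {friendly_name!r}") from None


print(resolve_code("heart beat"))
```

A lookup of this kind lets the user interface offer plain names while the warehouse continues to store only the standard codes.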

This extracted data is the clinical data source. The clinical data consumer can issue a request asking for heart beat, height, or allergies, as shown in Figure 3.

Figure 3. Clinical data source
Shows screen capture displaying patient information query parameters, including Patient ID and Select Code as Heart Beat

The graphical user interface (GUI) enables a user to issue any of the QED standard queries defined in the QED specification (see Resources). The clinical data consumer wraps the request into a PCC-1 QED request message in XML format, according to the QED requirements. The clinical data consumer then sends the request to the clinical data source interface, as shown in Listing 1.

Listing 1. PCC-1 QED request
<?xml version="1.0" encoding="UTF-8"?>
 <soapenv:Envelope xmlns:q0="urn:hl7-org:v3" 
         <QUPC_IN043100UV01 ITSVersion="XML_1.0" xmlns="urn:hl7-
             <!-- A unique identifier for this transmission (mandatory). -->  
             <id extension="123" root="2.16.840.1.113883.19.4"/>  
             <!-- The time that the transmission was created. -->  
             <creationTime value="20100222185444"/>  
             <interactionId extension="QUQI_IN000003UV01" root="2.16.840.1.113883.5"/>
              <!-- debugging, production or training -->  
              <processingCode code="T"/>  
              <!-- Archive, Current Processing, Initial Load or Restore from Archive -->
              <processingModeCode code="T"/>  
              <!-- Always, Errors only or Never -->  
              <acceptAckCode code="AL"/>  
              <receiver typeCode="RCV">  
                  <device classCode="DEV" determinerCode="INSTANCE">  
                      <id root="2.16.840.1.113883."/>  
              <sender typeCode="SND">  
                  <device classCode="DEV" determinerCode="INSTANCE">  
                      <id root="2.16.840.1.113883."/>  
                      <name>IBM QED clinical Consumer testing utility</name>  
                      <telecom value="http://ibm2:8080/qedConsumer/QedConsumer.html"/>
                      <manufacturerModelName>QED clinical consumer 
                      <softwareName>QED clinical consumer device 
              <controlActProcess classCode="CACT" moodCode="RQO">  
                  <id extension="111" root="2.16.840.1.113883.19.4"/>  
                  <!-- the identifier of the query -->  
                  <code code="QUPC_TE043100UV"/>  
                  <!-- The trigger event code -->  
                  <effectiveTime value="20100222185444"/>

Using an API, the QED clinical data source interface creates a simple SQL statement from the XML request and runs it against the clinical data source, such as a relational database created from the static XQueries. The clinical data source interface wraps the matching CDA results returned by this query into the PCC-1 QED response, and the clinical data consumer displays it to the end user, as shown in Figure 4.

Figure 4. PCC-1 QED result
Screen cap: Clinical Statements on Results window showing the Heart Beat request was completed

The example solution maps care-provision codes in QED messages to attributes inside the clinical data source. The XPath from which each value was extracted is also recorded in the clinical data source.
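The relational lookup described above can be sketched as follows. This is a minimal illustration that uses Python's built-in sqlite3 module as a stand-in for the DB2 clinical data source. The table name, the column names, the sample value 72, and the sample XPath are all invented for the sketch; only the patient ID 120 and the HEART_BEAT attribute name appear in the article.

```python
import sqlite3

# In-memory stand-in for the relational clinical data source.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE extracted (patient_id TEXT, attr TEXT, value TEXT, xpath TEXT)"
)
# One invented row: the mapped attribute, its value, and the originating XPath.
conn.execute(
    "INSERT INTO extracted VALUES (?, ?, ?, ?)",
    ("120", "HEART_BEAT", "72", "//section/entry/observation"),
)


def find_values(connection, patient_id, attr):
    """Run the simple SQL lookup that a parsed QED request would trigger."""
    sql = "SELECT value, xpath FROM extracted WHERE patient_id = ? AND attr = ?"
    # Parameter markers keep request values out of the SQL text itself.
    return connection.execute(sql, (patient_id, attr)).fetchall()


rows = find_values(conn, "120", "HEART_BEAT")
print(rows)
```

Keeping the originating XPath next to each value is what would let an interface go back to the intact CDA document when it builds the response.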

Handling queries

Static queries are run against the CDA documents to extract the data for the QED standard queries. In addition to the query result being added to the clinical data source database, additional fields were added and mapped so that the queries could locate the data required to meet the QED specification. The XPath for the data was also included in a column. The Connectathon committee dictated the queries that the IBM Research team's demo had to accommodate. The query that extracted the heart beat observation is shown in Listing 2.

Listing 2. Heart beat observation
  declare default element namespace "urn:hl7-org:v3"; 
  for $obs in db2-fn:xmlcolumn('RIM.INSTANCE')
    let $y := fn:root($obs)
      let $entity_ext:=$y/*/id/@extension 
      let $entity_root:=$y/*/id/@root 
      let $effectiveTime:=$obs/effectiveTime[1] 
      let $statusCode:=$obs/statusCode 
      return <ibm:res xmlns:ibm="" xmlns:v3="urn:hl7-org:v3"
          entity_ext="{string($entity_ext)}" entity_root="{string($entity_root)}" 
          attr="HEART_BEAT" value="{string($obs/value/@value)}" 
          bakRef_root="{string($obs/id/@root)}" bakRef_ext="{string($obs/id/@extension)}">

The heart beat query uses the W3C XQuery language, which is the standard for querying XML data. The default element namespace is declared because all CDA elements belong to this namespace. The db2-fn:xmlcolumn function is a DB2-specific function for querying XML columns. Listing 2 queries the INSTANCE column in the RIM table. The path //section[@classCode='DOCSECT'] is the XPath that navigates to the appropriate observation containing the heart beat. The codes 8716-3 and 8867-4 indicate that this is the correct observation segment for heart beat.

A CDA document can have many observation segments. The other elements in Listing 2, such as effective time, status code, and so on, are other elements from the CDA document that are required in the QED response. These elements are extracted and returned to the clinical data source.

An additional attribute called HEART_BEAT is coded and mapped to the code for heart beat (8867-4). In addition, the XPath from the original document is maintained to locate these values.
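The extraction step can be illustrated outside DB2 with a short Python sketch that uses the standard xml.etree.ElementTree module. It is not the article's XQuery and does not reproduce it exactly. The sample document, its identifiers and values, and the placeholder code 0000-0 are invented; the namespace, the DOCSECT class code, the code 8867-4, and the output attribute names follow the text and Listing 2.

```python
import xml.etree.ElementTree as ET

NS = {"v3": "urn:hl7-org:v3"}

# Tiny CDA-like fragment; structure, identifiers, and values are invented.
SAMPLE = """
<ClinicalDocument xmlns="urn:hl7-org:v3">
  <id root="1.2.3" extension="120"/>
  <section classCode="DOCSECT">
    <entry>
      <observation classCode="OBS">
        <id root="4.5.6" extension="obs-1"/>
        <code code="8867-4"/>
        <value value="72"/>
      </observation>
    </entry>
    <entry>
      <observation classCode="OBS">
        <id root="4.5.6" extension="obs-2"/>
        <code code="0000-0"/>
        <value value="180"/>
      </observation>
    </entry>
  </section>
</ClinicalDocument>
"""


def extract_heart_beat(xml_text):
    """Return one dict per heart beat observation (code 8867-4)."""
    root = ET.fromstring(xml_text)
    doc_id = root.find("v3:id", NS)
    rows = []
    path = ".//v3:section[@classCode='DOCSECT']/v3:entry/v3:observation"
    for obs in root.findall(path, NS):
        code = obs.find("v3:code", NS)
        if code is None or code.get("code") != "8867-4":
            continue  # skip observations that are not heart beat
        obs_id = obs.find("v3:id", NS)
        rows.append({
            "entity_ext": doc_id.get("extension"),
            "entity_root": doc_id.get("root"),
            "attr": "HEART_BEAT",
            "value": obs.find("v3:value", NS).get("value"),
            "bakRef_root": obs_id.get("root"),
            "bakRef_ext": obs_id.get("extension"),
        })
    return rows


print(extract_heart_beat(SAMPLE))
```

Only the first observation is returned, because the second carries a different code. Each returned dictionary corresponds to one row that a batch job could store in the relational clinical data source.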

Using the pureXML repository directly

The clinical data source interface can also access the DB2 pureXML repository directly. The pureXML repository would then become the clinical data source. The API that currently generates a simple SQL statement could be altered to generate an XQuery or an SQL/XML query that accesses the CDA documents directly. Running batch jobs to extract the data into a relational database would no longer be needed. The query could return the same data required for the QED response, which the interface could use to populate the clinical data consumer. Listing 3 shows an example of an SQL/XML query to obtain the heart beat vital sign.

Listing 3. SQL/XML query to obtain heart beat directly
SELECT XMLQUERY ('$d//*:section/*:entry/*:observation[@classCode="OBS"
 and *:code/@code="8867-4"]' PASSING DOCUMENT AS "d")
FROM CDA
WHERE ID = ?

The value for ID comes from the patient ID entered into the GUI. The predicate *:code/@code="8867-4" comes from the heart beat selection. In this query, the *: prefix is a wildcard for the namespace, which can be replaced with the actual namespace. This query is written against a database table (CDA) in which the patient ID has been extracted from the CDA document into a relational column. The DOCUMENT column is where the XML-based CDA has been inserted.
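The effect of the namespace wildcard can be tried out in Python, whose xml.etree.ElementTree module (Python 3.8 and later) accepts {*} as an any-namespace wildcard, analogous to *: in XQuery. The sketch below is an illustration only; the one-line sample document and the helper name has_code are invented. It shows that the wildcard path and the path with the explicit HL7 namespace select the same observation.

```python
import xml.etree.ElementTree as ET

# Invented single-observation document in the HL7 v3 namespace.
DOC = (
    '<ClinicalDocument xmlns="urn:hl7-org:v3"><section><entry>'
    '<observation classCode="OBS"><code code="8867-4"/><value value="72"/>'
    '</observation></entry></section></ClinicalDocument>'
)

root = ET.fromstring(DOC)

# Wildcard form: {*} matches an element in any namespace (Python 3.8+).
wild = root.findall(".//{*}section/{*}entry/{*}observation[@classCode='OBS']")

# Explicit form: the same path with the HL7 v3 namespace spelled out.
ns = {"v3": "urn:hl7-org:v3"}
explicit = root.findall(
    ".//v3:section/v3:entry/v3:observation[@classCode='OBS']", ns
)


def has_code(obs, code):
    """True if the observation has a code child with the given code value."""
    return any(c.get("code") == code for c in obs.findall("{*}code"))


wild_hits = [o for o in wild if has_code(o, "8867-4")]
explicit_hits = [o for o in explicit if has_code(o, "8867-4")]
print(len(wild_hits), len(explicit_hits))
```

The wildcard is convenient for ad hoc queries, while spelling out the namespace avoids accidentally matching same-named elements from another vocabulary.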

Using the demo yourself

You can experiment with the QED Consumer Web interface (see Resources) by sending QED requests to the RIMon QED Data Source. This Web interface can help you to build QED requests, render QED responses, and show which information goes on the wire. Valid patient IDs are 120 and 130. Leave the default patient ID as root.


The IBM Research team's demonstration proved that a QED clinical data source system can be implemented on DB2 pureXML successfully and conveniently. Many factors influence whether to extract data into a relational database. One of the benefits of using pureXML, however, is that the source XML document stays intact. Having this source stored enables you to use the repository for many different applications that might need the data, including CDAs, CCDs, EMRs, and other healthcare XML-based documents.



Get products and technologies

  • Experiment with the QED Consumer Web interface by sending QED requests to the RIMon QED Data Source. Valid patient IDs are 120 and 130. Leave the default patient ID as root.
  • Build your next development project with IBM trial software, available for download directly from developerWorks.


