As Part one describes, application developers often use middle-tier programming languages such as PHP, along with the languages' respective DOM parser in order to re-describe and re-structure the data into XML. At an enterprise level, multiply this n-fold. Imagine n-services and n-applications, each of which converts the data into XML, possibly using different techniques. From a software engineering perspective this poses a problem -- specifically regarding maintainability, performance, and data integrity.
First, note that the data is actually being manipulated twice: once by the SQL within the database engine, and again at the application level. This introduces additional risk in that the integrity of the data can no longer be guaranteed since it has been altered at the application level, outside of the database engine. Also, any additional work to parse and construct the data outside of the database engine, while constant in algorithmic complexity, may vary in performance depending on the programming language used.
It is simply not enough for the application architecture within a business to change. The underlying technology which manages the data that drives the business must evolve as well.
IBM DB2 9 database management system addresses this issue through the use of its pureXML capabilities and hybrid database engine. The data that previously could only be stored in the traditional relational format, can now be stored natively as XML as well. As in the case of the DB2 Health Monitor Sample Application, it is also possible to convert traditional SQL data into an XML document on-the-fly within the database engine.
In this sense, DB2 9 allows:
- The business to improve data integrity and prevent data from being locked into a legacy format
- The application developer to utilize the best storage medium for the task and eliminate embedded application logic for re-structuring data as XML
- The database administrator to centralize logic at the database server and optimize and guarantee performance of a specific level
While the DB2 Health Monitor Sample Application uses pureXML from the start, it is still possible to illustrate the thinking involved when attempting the transition to pureXML by examining the queries and code from all three possible approaches:
- Multiple SQL Queries with DOM Parsing
- Multiple SQL-XML Queries with DOM Parsing
- One XQUERY with multiple embedded SQL-XML Queries
Each of the above approaches, from top to bottom, builds on the previous approach and utilizes more pureXML than the last.
**For illustration purposes, the code and queries below will only build the health definitions section of the DB2 Health Monitor Sample Application health report as defined by its XML Schema Definition (XSD).
The SQL query below extracts the desired health definition information for the database manager, database, tablespace container, and tablespace objects. A standard SQL table is returned; therefore, DOM parsing, as shown below the query, must be used to assemble the data as XML.
Listing 1. SQL query extracting a desired health definition
SELECT t1.ID, t1.NAME, t1.SHORT_DESCRIPTION, t1.LONG_DESCRIPTION, t1.TYPE, t1.FORMULA,
t1.UNIT, t2.EVALUATE, t2.SENSITIVITY, t2.ALARM_THRESHOLD, t2.WARNING_THRESHOLD
FROM table(health_get_ind_definition('en_US')) as t1,
table(
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD, WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('TS', 'G', '','')) as t
UNION
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD, WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('TSC', 'G', '','')) as t
UNION
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD, WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('DBM', 'G', '','')) as t
UNION
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD, WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('DB', 'G', '','')) as t
) as t2
WHERE t1.ID = t2.ID
|
Listing 2. DOM parsing to assemble the data as XML
$dom = new DOMDocument('1.0');
$root = $dom->appendChild(new DOMElement('DB2Health_Report'));
$defset = $root->appendChild(new DOMElement('HIDefinitionSet'));
for($i = 0; $i < count($result); $i++)
{
$def = $defset->appendChild(new DOMElement('HIDefinition'));
$def->setAttribute('hiIdentifier', $result[$i]['ID']);
$def->setAttribute('hiName', $result[$i]['NAME']);
$def->setAttribute('hiShortDesc', $result[$i]['SHORT_DESCRIPTION']);
$def->setAttribute('hiLongDesc', $result[$i]['LONG_DESCRIPTION']);
$def->setAttribute('hiType', $result[$i]['TYPE']);
$def->setAttribute('hiFormula', $result[$i]['FORMULA']);
$def->setAttribute('hiUnit', $result[$i]['UNIT']);
$settings = $def->appendChild(new DOMElement('HISettings'));
$settings->setAttribute('sensitivity', $result[$i]['SENSITIVITY']);
$settings->setAttribute('evaluate', $result[$i]['EVALUATE']);
$settings->setAttribute('alarmThreshold', $result[$i]['ALARM_THRESHOLD']);
$settings->setAttribute('warningThreshold', $result[$i]['WARNING_THRESHOLD']);
}
$xml = $dom->saveXML();
|
SQL-XML is another standard that allows the XML structure to be defined within the query itself. As Listing 3 illustrates, rather that defining the elements and attributes within DOM, it is done through the query. In other words, for each row that was previously returned as a set of columns, a single XML document is returned.
Listing 3. SQL-XML query that returns a single XML document
SELECT
XMLELEMENT
(
NAME "HIDefinition",
XMLATTRIBUTES
(
t1.ID as "hiIdentifier",
t1.NAME as "hiName",
t1.SHORT_DESCRIPTION as "hiShortDesc",
t1.LONG_DESCRIPTION as "hiLongDesc",
t1.TYPE as "hiType",
t1.FORMULA as "hiFormula",
t1.UNIT as "hiUnit"
),
XMLELEMENT
(
NAME "HISettings",
XMLATTRIBUTES
(
t2.SENSITIVITY as "sensitivity",
t2.EVALUATE as "evaluate",
t2.ALARM_THRESHOLD as "alarmThreshold",
t2.WARNING_THRESHOLD as "warningThreshold"
)
)
)
FROM table(health_get_ind_definition('en_US')) as t1,
table(
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD, WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('TS', 'G', '','')) as t
UNION
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD, WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('TSC', 'G', '','')) as t
UNION
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD, WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('DBM', 'G', '','')) as t
UNION
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD, WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('DB', 'G', '','')) as t
) as t2
WHERE t1.ID = t2.ID
|
Therefore, only a minimal amount of DOM parsing is required in order to put all of the definitions within a single XML document structure. Listing 4 shows this minimal DOM parsing:
Listing 4. Minimal DOM parsing to put all of the definitions within a single XML document
$dom = new DOMDocument('1.0');
$dom->loadXML('<DB2Health_Report></DB2Health_Report>');
$def_set = new DOMDocument('1.0');
$def_set->loadXML('<HIDefinitionSet>'. $result .'</HIDefinitionSet>');
$root = $dom->getElementsByTagName('DB2Health_Report')->item(0);
$def_set_root = $def_set->getElementsByTagName('HIDefinitionSet')->item(0);
$import = $dom->importNode($def_set_root, true);
$root->appendChild($import);
$xml = $dom->saveXML();
|
The query below represents the pureXML alternative to the process described in Scenario 1. Using this approach, the DOM parsing has disappeared entirely and no external concatenation mechanisms are required. XQuery is used as the wrapping mechanism to define the overall structure of the XML document and the SQL-XML query from Scenario 2 is embedded within the structure in order to build the health definition subsection within the report. Normally, only an XQuery statement would be needed; however, since the health data is in the traditional SQL format, the use of SQL-XML is required to transform the data to its XML equivalent.
Listing 5. XQuery to define the overall structure of the XML document
xquery
<DB2Health_Report>
<HIDefinitionSet>
{
for $def in db2-fn:sqlquery('
SELECT
XMLELEMENT
(
NAME "HIDefinition",
XMLATTRIBUTES
(
t1.ID as "hiIdentifier",
t1.NAME as "hiName",
t1.SHORT_DESCRIPTION as "hiShortDesc",
t1.LONG_DESCRIPTION as "hiLongDesc",
t1.TYPE as "hiType",
t1.FORMULA as "hiFormula",
t1.UNIT as "hiUnit"
),
XMLELEMENT
(
NAME "HISettings",
XMLATTRIBUTES
(
t2.SENSITIVITY as "sensitivity",
t2.EVALUATE as "evaluate",
t2.ALARM_THRESHOLD as "alarmThreshold",
t2.WARNING_THRESHOLD as "warningThreshold"
)
)
)
FROM table(health_get_ind_definition('en_US')) as t1,
table(
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD,WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('TS', 'G', '','')) as t
UNION
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD,WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('TSC', 'G', '','')) as t
UNION
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD,WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('DBM', 'G', '','')) as t
UNION
SELECT ID, EVALUATE, SENSITIVITY, ALARM_THRESHOLD,WARNING_THRESHOLD
FROM table(HEALTH_GET_ALERT_CFG('DB', 'G', '','')) as t
) as t2
WHERE t1.ID = t2.ID
')
return $def
}
</HIDefinitionSet>
</DB2Health_Report>
|
Again, the above three scenarios only build the definition section of the health report; to build the full structure, as defined by the XSD, this pureXML solution was used.
The above query may seem intimidating. Although it is a single XQuery statement, it contains multiple embedded SQL-XML queries in order to gather health information on four types of database objects. XQuery only acts as a wrapper in order to glue all of the sub-XML documents together. To give an idea of how many queries are used, here is a breakdown per health report section:
Table 1. XQuery sub-query breakdown by Health Report section
| Health report section | Number of sub-queries | Description |
|---|---|---|
| InfoSet | 6 | Health Dashboard Information |
| HIDefinitionSet | 1 | Health Definition Information |
| AlertSet | 7 | Health Alert Information |
| ErrorSet | 0 | Error Information |
| TOTAL | 14 |
Notice the clear level of overlap between the scenarios and the distinct pattern for placement of the assembly logic:
- Scenario One: Gets the data as a table structure with the specified columns from the health UDFs. DOM parsing is used to assemble the entire XML structure.
- Scenario Two: Gets the data from the same source, but, instead of defining columns, it defines elements and attributes. DOM parsing is used to place the data within overall XML structure.
- Scenario Three: Places the structure from scenario two within the overall XML structure using XQuery as the wrapping mechanism.
The above findings can be summarized by the following diagram:
Figure 1. Scenario summary
Therefore, with only a single query being used, application maintenance has just been simplified; and from a DBA perspective, the troubleshooting process has just been improved. With the ability to centralize the logic entirely at the database server, in a stored procedure for example, DBAs can now do what they do best: optimize the database to guarantee performance and data integrity. While the data may be in XML, the same ideas of indexing, server-side logic, and query optimization still apply as they did with SQL.
This article examined the creation of a single DB2 Health Monitoring Service based on existing, traditional SQL data and analyzed it from both an SOA and data-centric perspective. The SOA perspective gives the notion of software stack flexibility -- the idea of utilizing the best technology for the task while maintaining a clear separation between presentation, business logic, and data layers. More importantly, the data perspective gave us the idea of reducing the complexity of an SOA application back to that of traditional application development by utilizing the database engine for what it does best: to describe, retrieve, and store data.
So where do we go from here?
If this single service is capable of monitoring a DB2 database on any server, imagine it monitoring multiple databases on multiple servers. Extending this idea even further, what if all of the XML health reports could be mined, then integrated into a single health report, and finally transformed using an XSLT into an RSS/ATOM feed?
To put it in context, the database engine has just been used to re-describe and re-organize the data into a form that can provide a summarized view of the health status changes of DB2 databases/servers across the entire enterprise.
If DB2 can do this to improve the operational efficiency of the business from an administrative point of view, can similar techniques be applied towards analyzing customer-related data?
Note
- The Health Monitor Demo has been bundled with the Web 2.0 Starter Toolkit for DB2 (PHP)
Learn
- Read Part
1 in this series and learn to use DB2 9 to implement SOA.
- Browse the
technology bookstore
for books on these and other technical topics.
Get products and technologies
- Download
IBM product evaluation versions
and get your hands on application development tools and middleware products from
DB2®, Lotus®, Rational®, Tivoli®, and
WebSphere®.
Discuss
- Participate in the discussion forum.
- Check out
developerWorks
blogs and
get involved in the
developerWorks community.

Rizwan Tejpar is an Industry Internship Program (IIP) student at IBM, where he develops sample applications for showcasing DB2 working with open-source technologies. He attended the ZendCon conference in 2006 and SDWest conference in 2007, where he demonstrated DB2 pureXML capabilities to developers on the trade show floor. Mr. Tejpar's most recent contribution was the DB2 Health Monitor Sample Application for PHP, which put an SOA spin on database health monitoring. Some of his other programming language experience includes C++, C# .NET, SQL, XQuery, and JSP.




