One of the major formats used for information publishing on the Internet is the Portable Document Format (PDF), created by Adobe Systems in 1993. PDF is an open de facto standard that reproduces document format characteristics, fonts and images in a high fidelity way on any platform where Adobe Acrobat Reader is present. If you are publishing your DB2® data on the Web, you may wish to publish your data directly into PDF format. As we'll discuss in this article, you can leverage the support for XML offered through DB2 Universal DatabaseTM to publish your data as a PDF document.
IBM has provided continuous improvement to XML support in recent versions of DB2 UDB. Version 8 brought new functions (XMLAGG, XMLELEMENT and XMLATTRIBUTES) that facilitate the conversion of DB2 data to XML format. These functions are independent of the XML Extender and conform to the future SQL standard (ISO/IEC 9075-14 - SQL/XML).
XML technology is being used more and more in several kinds of applications. XML itself is the core of the XML family. Depending on the application, you can use other family standards ("W3C recommendations") that deal with specific aspects. For example, one of the major XML characteristics is the ability to separate content from presentation, although if you want to define presentation characteristics, you may use XSL.
The article covers the following topics:
- DB2 XML built-in functions - This section discusses the new DB2 functions.
- XSL = XSLT + XSL-FO + XPath - This section covers the basic concepts of the W3C recommendations that deal with styles.
- PDF creation - This section show you how to use a XSL-FO renderer (FOP) to create a PDF document.
To illustrate each section, I will show an example that generates a PDF file with data from some tables from the DB2 UDB SAMPLE database.
Before examining how the new built-in functions work, you should become familiar with a new DB2 data type,
the "XML data type". This is an XML internal and transient representation that does not allow for
column definition or for the return of this data type to the application. It is always
necessary to use the scalar function XML2CLOB( xml-data-type ) : clob-data-type to serialize the
XML contents in a CLOB data type.
The new DB2 XML functions permit the creation of data in the XML format in a simple and direct way. Combining the XMLELEMENT and XMLATTRIBUTES scalar functions, it is possible to set the basic elements that form an XML document. Since these are scalars functions, for each line of the result set, a result of the XML data type is created. Also, it is possible to group lines of the result set using the XMLAGG column function.
The new XML functions are as follows:
- XMLELEMENT - scalar function to create an XML element
- XMLATTRIBUTES - scalar function to create element attributes (must always be used within an XMLELEMENT function)
- XMLAGG - column function that groups several elements within a parent element.
The syntax for the XMLELEMENT and XMLATTRIBUTES functions is presented in Listing 1. Notice in line 3 that, if there are attributes to an element, they should be informed right after the element's name, followed by the element content (line 4). The parameter "element_content" is the result of an expression that provides one of these data types: SMALLINT, INTEGER, BIGINT, DECIMAL, NUMERIC, REAL, DOUBLE, CHAR, VARCHAR, LONG VARCHAR, CLOB, GRAPHIC, VARGRAPHIC, LONG VARGRAPHIC, DBCLOB, DATE, TIME, TIMESTAMP, XML or any distinct type whose source type is one of the preceding data types. On the other hand, the parameter "attribute_value" (lines 7-8) is also the result of an expression that gives back the same data type of "element_content", except for XML type.
Listing 1. XMLELEMENT and XMLATTRIBUTES functions syntax
1. XMLELEMENT_FUNCTION ::=
2. XMLELEMENT ( NAME element_name
3. [ , XMLATTRIBUTES_FUNCTION ]
4. [ { , element_content } ] )
5.
6. XMLATTRIBUTES_FUCTION ::=
7. XMLATTRIBUTES ( attribute_value [ AS attribute_name ]
8. [ { , attribute_value [ AS attribute_name ] } ] )
|
Table 1 shows some examples of these functions. The result column has been formatted for so that we can easily view the resulting XML. DB2, as a result of processing the XML2CLOB function, doesn't fill blank spaces or insert line breaks between the XML elements.
Table 1. Functions XMLELEMENT and XMLATTRIBUTES examples
| # | SQL | Result | ||
|---|---|---|---|---|
| 1 |
|
| ||
| 2 |
|
| ||
| 3 |
|
|
Notice, as you can see in the examples, it is only necessary to use the XML2CLOB function once, at the outer level. Also notice, in example 3, the use of the function XMLATTRIBUTES to create the "empNo" attribute in the "employee" element.
In the Table 1 examples, the resulting XML refers to just one line of the EMPLOYEE table. It is possible to group several lines using the XMLAGG function. Listing 2 shows the syntax for this function.
Listing 2. Function XMLAGG syntax
1. XMLAGG_FUNCTION ::=
2. XMLAGG ( XMLELEMENT_FUNCTION
3. [ ORDER BY sort_key [ ASC | DESC ]
4. [ { , sort_key [ ASC | DESC ] } ] )
|
Notice, in lines 3-4, the ORDER BY option is used for specifying the order in which the elements should be concatenated. The example presented in Listing 4 uses the function XMLAGG to obtain three groups: departments, employees and projects.
It's very common to have many nested functions in a SELECT statement. You will find that it will help to format the command similar to the XML elements hierarchy you want to create. This practice considerably improves the command readability. Compare below an SQL example conforming to the suggested format, and another that has been formatted without this concern.
Table 2. Formatting the SQL command
OK |
| |
Not OK |
|
It is possible in just one SELECT command to create complex XML content. The more complex the XML is, the more difficult the SQL maintenance will be. For this reason it is helpful to use common table expressions for better command structuring. The example in Listing 3 shows the creation of structured XML using common table expressions.
Listing 3. XML structure with data from the SAMPLE database
01. <depts>
02. <dept deptno="..."> -- For each department
03. <deptName>...</deptName>
04. <emps>
05. <emp empNo="..."> -- For each employee in this department
06. <empName>...</empName> -- concat firstname with lastname
07. <empJob>...</empJob>
08. <empSalary>...</empSalary> -- empSalary = salary + comm
-- + bonus
09. </emp>
10. </emps>
11. <projs>
12. <proj projNo="..."> -- For each project with department
-- employee participation
13. <projName>...</projName>
14. </proj>
15. </projs>
16. </dept>
17. </depts>
|
Notice that three repetition groups are present within the intended structure (lines 2, 5 and 12). It's a good practice to group the repetitions in a higher level element (lines 1, 4 and 11). Among other things this grouping will help the creation of XSL transformation rules. The SELECT command below creates XML that conforms with the structure presented in Listing 2.
Listing 4. Example of XML creation using common table expressions
01. WITH 02. empXMLtab (workdept, empXMLcol) as 03. (select 04. workdept, 05. xmlelement(NAME "emps", 06. xmlagg( 07. xmlelement(NAME "emp", 08. xmlattributes(empno as "empNo"), 09. xmlelement(NAME "empName", firstnme || ' ' || lastname), 10. xmlelement(NAME "empJob", job), 11. xmlelement(NAME "empSalary", salary + bonus + comm) 12. ) order by firstnme 13. ) 14. ) 15. from employee 16. group by workdept 17. ), 18. projXMLtab(workdept, projXMLcol) as 19. (select 20. t.workdept, 21. xmlelement(NAME "projs", 22. xmlagg( 23. xmlelement(NAME "proj", 24. xmlattributes(t.projno as "projNo"), 25. xmlelement(NAME "projName", t.projname) 26. ) order by projname 27. ) 28. ) 29. from (select distinct e.workdept, p.projno, p.projname 30. from employee e, 31. emp_act ea, 32. project p 33. where e.empno = ea.empno 34. and ea.projno = p.projno 35. ) as t 36. group by t.workdept 37. ), 38. depXMLtab (depXMLcol) as 39. (select 40. xmlelement(NAME "depts", 41. xmlagg( 42. xmlelement(NAME "dept", 43. xmlattributes(deptno as "deptno"), 44. xmlelement(NAME "deptName", deptname), 45. empXMLcol, 46. projXMLcol 47. ) order by deptname 48. ) 49. ) 50. from department d, 51. empXMLtab e, 52. projXMLtab p 53. where d.deptno = e.workdept 54. and d.deptno = p.workdept 55. ) 56. select xml2clob(depXMLcol) 57. from depXMLtab; |
Notice that for each repetition group a common table expression has been created (lines 2, 18 and 38). And also, for each XMLAGG function (lines 6, 22 and 41), there is an XMLELEMENT function of a higher level (lines 5, 21 and 40) aiming to encapsulate the elements that have been concatenated by the XMLAGG function. XMLAGG functions use the ORDER BY clause to sort the elements within the group.
The common table expressions empXMLtab and projXMLtab select the column WORKDEPT aiming to get a join with DEPARTMENT table. Notice, in lines 45-46, the XML contents placement from the first two common table expressions.
The main SELECT (lines 56-57) uses the XML2CLOB function to serialize the XML contents.
In Listing 5 we see part of the XML created from the SUPPORT SERVICES department.
Listing 5. Resulting XML (E01 department, only)
01. <dept deptno="E01"> 02. <deptName>SUPPORT SERVICES</deptName> 03. <emps> 04. <emp empNo="000050"> 05. <empName>JOHN GEYER</empName> 06. <empJob>MANAGER</empJob> 07. <empSalary>000044189.00</empSalary> 08. </emp> 09. </emps> 10. <projs> 11. <proj projNo="OP1000"> 12. <projName>OPERATION SUPPORT</projName> 13. </proj> 14. <proj projNo="OP2010"> 15. <projName>SYSTEMS SUPPORT</projName> 16. </proj> 17. </projs> 18. </dept> |
Utilizing the new functions we create the SELECT command that generates the XML containing DB2 data. The next step will be to transform the XML tree into another that utilizes the XSL-FO vocabulary to define how the contents will be presented. This implementation will be presented in the next section.
According to W3C, "XSL is a language for expressing stylesheets. It consists of three parts: XSL Transformations (XSLT): a language for transforming XML documents, the XML Path Language (XPath), an expression language used by XSLT to access or refer to parts of an XML document (XPath is also used by the XML Linking specification). The third part is XSL Formatting Objects: an XML vocabulary for specifying formatting semantics".
It can seem odd, at first, the number of W3C recommendations that deal with styling. XML expert G. Ken Holman is very clear at this point:
"Styling is the rendering of information into a form suitable for consumption by a target audience. Because the audience can change for a given set of information, we often need to apply different styling for that information in order to obtain dissimilar renderings in order to meet the needs of each audience. Perhaps some information needs to be rearranged to make more sense for the reader. Perhaps some information needs to be highlighted differently to bring focus to key content.
"It is important when we think about styling information to remember that two distinct processes are involved, not just one. First, we must transform the information from the organization used when it was created into the organization needed for consumption. Second, when rendering we must express, whatever the target medium, the aspects of the appearance of the reorganized information."
The next step in our example is to create an XSL file for transforming the original XML document to another with the XSL-FO tags. This transformation is done by the XSLT Engine that gets two inputs: the XML file with the data and the XSLT file with the transformation rules (see figure 1). The output from this processing is the XML file with the tags that are in conformity with the defined vocabulary for the XSL-FO recommendation.
Figure 1. Styling Process with XSL-FO
The transformation rules are specified in a declarative way using templates. Each template matches elements from the original XML file and defines the elements structure and attributes that will be created as the output of processing. The elements of the XSL-FO vocabulary included in this transformation process deal with specific presentation aspects, such as fonts types and sizes, outlines, tables, pagination, and so forth. Its is important to remember that the XSL-FO recommendation only takes care of the page layout description, being, therefore, independent of the output format. Notice, in figure 1, that the XSL-FO render can generate the output in several formats.
To understand how this transformation occurs, see figure 2. This shows a formatted page corresponding to the Software Support Department. Notice that several elements haven been added: table titles, salary column totalization, besides many graphical elements.
Figure 2. PDF output for Software Support Department
As implemented through the SELECT command, where each grouping is related to a common table expression, the page stylesheet file (db2pdf.xsl file) has been structured using a template for each grouping. See in listing 6 part of the xsl file referring to the template that deals with the employees grouping.
Notice that the elements in Listing 6h are prefixed by aliases. Alias "xsl:" prefixes the elements that follow the XSLT recommendation vocabulary. These elements define the transformation rules from the original XML. On the other hand the alias "fo:" defines the elements to be aggregated to the original XML aiming to define page layout (XSL-FO Recommendation).
Line 1 defines, with the "match" attribute, the element that will allow the use of this template in the transformation process. This way, each occurrence of the "emps" element, will be subject to the rules of this template. Line 2 starts the block that contains the table layout with the employee information. Then, line 3 starts the table definition, and the 4 following lines define the column widths in this table. Lines 9 thru 31 define the table headline. Notice the presence of elements and attributes with definition of presentation characteristics such as outline, padding, colors, and so on. Line 32 uses the"xsl:for-each" element to define the loop that creates the main part of the table, generating one line for each "emp" element occurrence. Lines 33 thru 51 format the body of the table. Notice in lines 48 and 60 the use of the "format-numbe" function in the numeric field format mask definition.
And last, but not least, lines 53 thru 63 define the presentation of the last line of the table with the total amount of the salaries of the department's employees. Notice, in line 60, the XPath expression as argument of SUM function.
Listing 6. Page style template that deals with employees
01. <xsl:template match="emps">
02. <fo:block font-size="10pt" margin-left="5pt">
03. <fo:table table-layout="fixed" >
04. <fo:table-column column-width="2cm" />
05. <fo:table-column column-width="7cm" />
06. <fo:table-column column-width="4cm" />
07. <fo:table-column column-width="4cm" />
08. <fo:table-body>
09. <fo:table-row>
10. <xsl:attribute name="font-weight">bold</xsl:attribute>
11. <fo:table-cell />
12. <fo:table-cell border-width="0.2pt" border-style="solid"
border-color="gray"
13. border-bottom-width="1pt" border-top-width="1pt" >
14. <fo:block padding="2pt">
15. Name
16. </fo:block>
17. </fo:table-cell>
18. <fo:table-cell border-width="0.2pt" border-style="solid"
border-color="gray"
19. border-bottom-width="1pt" border-top-width="1pt">
20. <fo:block padding="2pt">
21. Job
22. </fo:block>
23. </fo:table-cell>
24. <fo:table-cell border-width="0.2pt" border-style="solid"
border-color="gray"
25. border-bottom-width="1pt" border-top-width="1pt"
text-align="right"
26. padding-right="4pt">
27. <fo:block padding="2pt">
28. Total Salary
29. </fo:block>
30. </fo:table-cell>
31. </fo:table-row>
32. <xsl:for-each select="emp">
33. <fo:table-row>
34. <fo:table-cell />
35. <fo:table-cell border-width="0.2pt" border-style="solid"
border-color="gray">
36. <fo:block padding="2pt">
37. <xsl:value-of select="empName"/>
38. </fo:block>
39. </fo:table-cell>
40. <fo:table-cell border-width="0.2pt" border-style="solid"
border-color="gray">
41. <fo:block padding="2pt">
42. <xsl:value-of select="empJob"/>
43. </fo:block>
44. </fo:table-cell>
45. <fo:table-cell border-width="0.2pt" border-style="solid"
border-color="gray"
46. text-align="right" padding-right="6pt">
47. <fo:block padding="2pt">
48. <xsl:value-of select="format-number(empSalary,
'###,###')" />
49. </fo:block>
50. </fo:table-cell>
51. </fo:table-row>
52. </xsl:for-each>
53. <fo:table-row>
54. <fo:table-cell />
55. <fo:table-cell />
56. <fo:table-cell />
57. <fo:table-cell border-width="0.2pt" border-style="solid"
border-color="gray"
58. text-align="right" padding-right="6pt">
59. <fo:block padding="2pt">
60. <xsl:value-of select="format-number(sum(emp/empSalary),
'###,###')" />
61. </fo:block>
62. </fo:table-cell>
63. </fo:table-row>
64. </fo:table-body>
65. </fo:table>
66. </fo:block>
67. </xsl:template>
|
We have created the stylesheet that defines the original transformation rules from the original XML into another that uses the vocabulary of the XSL-FO page definition. After performing the transformation, it's time to turn our attention to rendering the output according to the requested format. The PDF creation will be shown in the next section.
The last part of our example consists of rendering the PDF (see Figure 1, second step). This is done in an automatic way by an XSL-FO Render software. In our example, we will use FOP - Formatting Object Processor. FOP is part of The Apache Open Source project, and its purpose is to "deliver an XSL-FO to PDF formatter that is compliant to at least the Basic conformance level described in the W3C Recommendation from 15 October 2001, and that complies with the 11 March 1999 Portable Document Format Specification (Version 1.3) from Adobe Systems".
To create a PDF file using FOP, just inform the file name that is in conformity with XSL-FO recommendation and the output file name (lines 1 and 2 at Listing 7). It is also possible to delegate to FOP the transformation task from the original XML, being necessary to inform the XML file name and the stylesheet (lines 4 and 5 at Listing 7). In this last case, FOP will invoke the Xalan XSLT Processor before starting to render the PDF file.
Listing 7. FOP execution
01. fop.bat -fo db2toPDF.fo -pdf db2toPDF.PDF (windows) 02. fop.sh -fo db2toPDF.fo -pdf db2toPDF.PDF (unix) 03. or 04. fop.bat -xml db2toPDF.xml -xsl db2toPDF.xsl -pdf db2toPDF.PDF (windows) 05. fop.sh -xml db2toPDF.xml -xsl db2toPDF.xsl -pdf db2toPDF.PDF (unix) |
PDF rendering is the simplest step in our example. Just invoke the renderer to get the file created. FOP also allows you to render the results in other formats, such as PCL, PS, TXT, and AWT.
PDF format permits the publishing of data independent of the browser type or platform and in recent years it has became a de facto standard. XML technology allows, using XSL-FO, the page layout of published material to be described. And finally, new built-in XML functions allow easy creation of XML contents from structured data in DB2 tables. Using a simple example I've showed you how to generate XML contents with the new functions and how to transform it into another XML that uses the XSL-FO vocabulary. In our last step we rendered the PDF using an XSL-FO renderer.
For further information about XSL-FO e XSLT, I suggest the following tutorials: "XSL Formatting Objects (XSL-FO) basics" and "Practical transformation using XSLT and XPath".
| Name | Size | Download method |
|---|---|---|
| db2topdf.zip | 12.7 KB |
FTP
|
Information about download methods
- "What is XSL-FO ?", G. Ken Holman, http://www.xml.com/lpt/a/2002/03/20/xsl-fo.html
- World Wide Web Consortium, Extensible Markup Language (XML), Version 1.0 (Second Edition), W3C Recommendation, October, 2000.
- World Wide Web Consortium, XSL Transformations (XSLT), Version 1.0, W3C Recommendation, November, 1999.
- World Wide Web Consortium, Extensible Stylesheet Language (XSL), Version 1.0, W3C Recommendation, October, 2001.
- World Wide Web Consortium, XML Path Language (XPath), Version 1.0, W3C Recommendation, November, 1999.
João Alberto de Oliveira Lima began his career in software industry as a programmer/analyst in 1991. Since 1993 he has been involved with DB2 as developer, DBA and SQL performance analyst. João has a BSc in Computer Science and a MSc in Software Engineering. He has the following certifications: IBM Certified Developer in "XML and Related Technologies", IBM Certified Solution Designer in "Business Intelligence V8.1" and IBM Certified Solutions Expert in "DB2 UDB V7.1 DBA for OS/390", "DB2 UDB V7.1 DBA for Unix, Windows and OS/2" and "DB2 UDB V7.1 Family Application Development". He is the author of reference guide "DB2 UDB [Mainframe v. 7]" published by Novatec Editora (only in Portuguese). He lives in Brasília, Brazil.
Comments (Undergoing maintenance)





