My company has a problem. One of our clients wants to use Linux, and we need to port a large set of database applications from Windows to Linux. Those applications were written in non-portable C++, and we will rewrite them in Java. We selected Java because we need applications that work in both operating systems using the same source code to simplify maintenance. But we don't like the options we currently have for database reporting in Java.
What my company needs is a way to generate reports from the data stored in the database, while keeping the report format and content outside the application. Over the years we've discovered that a high percentage of requests for changes are related to printed reports. If we keep the report outside the application, we don't have to recompile anything when we receive a request like "move this title to the left." There are commercial products that make it possible to do this, but after testing and evaluating several report tools for Java, we decided to create our own. In this article I will describe the reporting tool we are working on.
A key piece in the set of Windows applications is a report generator program I wrote using C++ Builder. That tool creates a visual composition of the report layout, making reports maintenance an easy task. Reports are currently stored in a proprietary binary format, and without the report generator no one can edit them. This program only runs in Windows, and because it uses third-party Delphi components without source code, I'm unable to port it to Linux. We need either a replacement tool to edit reports in Windows and Linux, or we need to write the reports in plain text so we can use any editor.
We also have a set of Delphi components that we use in the rest of our Windows applications. These components are used to read the report definition from disk, open the database, extract the data, and show a preview or print a nice report. We can't use these components in Linux, which means we can't use our reporting tools in that OS.
The tool I wrote using C++ Builder works great when we have a local copy of the database available. Most of the time, unfortunately, data is stored in servers outside of our office. We use Telnet to administer remote sites, but when we are working with a Linux server we simply can't run the report editor remotely. To download a copy of the database for local use generates another problem: the database usually is too large.
As stated before, we need a new way to generate database reports. Our wish list for the new reporting system is very long, but at minimum we would like the following:
- To store reports in plain ASCII text, allowing us to use any editor to update them. We use Telnet to administer remote Linux boxes, and it should be possible to use character-based editors like vi.
- A set of JavaBeans that we can use in our applications to read and display/print reports.
- A stand-alone application that we can use in Windows and Linux to preview/print any report, especially from the command line under Linux.
The solution I'm working on involves storing reports in XML format. An XML document is beautiful plain text, and the tools I'll need for development (Java compilers and IDEs) are available in both target platforms, Linux and Windows. Since a picture is worth a thousand words, I have included Figure 1 to provide a visual description of the planned solution.
Figure 1. The solution

Report definitions have to describe the data that will be presented to the final reader and the appearance that the document should have. The report type we use is called a "banded report" -- a set of data sources plus a page layout composed of different band types. For the sake of simplicity, this article will focus only on three band types: page header, page footer, and detail band (the zone where the data will be listed). Figure 2 shows a sample of a report layout (left side) and a sample of the output that can be generated (right side). In the example, bands are filled with static text (labels) and data fields; in the real world, bands include graphics, bar codes, calculations, and other things we will leave aside.
XML will be used by our new reporting system to describe the source of the data, the size and content of the header and footer of each page, and the origin and distribution of the data being reported.
Figure 2. A sample report layout and output

Processing a report can consume a lot of database resources, so we must be sure we are dealing with a valid document before we start. It's not funny to discover that the report definition is incomplete after fetching thousands of records from the server. Remember that sometimes we will use text editors that lack parsing capabilities, and we may make mistakes easily. A missing XML tag can leave the report processor in an infinite loop. To validate an XML file, we need to check if it complies with a Document Type Definition (DTD). As we didn't have a DTD ready to use, I had to write one from scratch, as displayed in the next section.
The DTD should model the report completely. Using a DTD editor or a text editor (see Resources), we start defining the root element for every report, as shown in Listing 1:
Listing 1: Defining a root element for reports
Basic items in a report:
Additional attributes for the report:
| <!ELEMENT Report
(Page,SQLSatement)><br /><!ATTLIST Report<br />Name ID
#REQUIRED<br />Title CDATA #REQUIRED<br />Version CDATA #FIXED
"1.0"<br />> |
Now our document has page formatting and one SQL statement. We use an SQL query to get the data displayed in the detail band, and sometimes we use additional queries to summarize results in the last page or to show specific information in header bands. This sample is limited to one query only.
The attribute Version will be used in future versions to ensure
backward compatibility. The report processor bean (see Figure 1) should check what version of the DTD was used to generate the report.
Before we define what a page is in XML we need to understand why we need a page at all. A page is the canvas on which we will design our report. It has size (Height and Width ) and contains one or more report bands. In XML terms we define our page as:
<!ELEMENT Page (ReportBand+)>
<!ATTLIST Page
Height CDATA #REQUIRED
Width CDATA #IMPLIED
>
|
The definition of ReportBand is straightforward: It is a placeholder for all the elements (ReportElements ) we want to print. A ReportBand has a band type attribute that is used to select the position of the band when rendering the final view (header, footer or detail band). The ReportElements we will use are restricted to Labels and DataFields (see the DTD for more details).
Data for the report will be gathered from JDBC sources, and we need to specify this, as shown in Listing 2:
<!ELEMENT SQLStatement EMPTY> <!ATTLIST SQLStatement Name ID #REQUIRED Username CDATA #REQUIRED Password CDATA #REQUIRED Statement CDATA #REQUIRED Driver CDATA #REQUIRED URL CDATA #REQUIRED > |
The DTD should also describe additional elements, such as:
- ReportElement
- Label
- DataField
- ReportElementType
- Font
The above elements are defined in a simplified version of the real DTD. Please take some time to browse the DTD; it will be easier for you to understand the report model after you read it.
Writing an XML report definition
Now that we have a DTD that defines our documents, we can use any editor to write a test report. It doesn´t matter if the editor is a validating one or not. We will use the DTD with our processor bean to verify the report. Our sample report begins with something like:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Report SYSTEM
"file://localhost/C:/MXReports/MXReports.dtd">
<Report Title="Currencies" Name="ID0" Version="1.0">
|
The first line lets the reader (human or application) know the version of XML and the character set that are used in the document. The second line explains the type of root node being used, and shows the DTD that should be used for validation (make sure you use the right location of the DTD when you write an XML document). The last line defines attribute values for the root node.
The RDBMS behind this project is InterBase, and the sample SQL statement for this article will use InterBase syntax and Employee.gdb, an example database included in the standard distribution. Don't worry if you don't have InterBase; any database with JDBC driver can be used if you want to test these ideas.
Employee.gdb (the example database that we will use) includes a table named COUNTRY that has two fields: COUNTRY and CURRENCY. We will write an XML report listing the contents of those fields. The SQL statement to retrieve the data from the database is:
SELECT Country, Currency FROM Country ORDER BY Country |
This translates to:
<SQLStatement Username="SYSDBA" Password="masterkey"
Driver="interbase.interclient.Driver"
URL="jdbc:interbase://localhost/c:\mxreports\employee.gdb"
Statement="SELECT Country, Currency FROM Country ORDER BY Country"
Name="ST1"/>
|
The sample report, which is available separately (see Resources), includes a header band with a title label and a detail band with two data fields (country + currency). The report must have a detail band, or it will be useless. The DTD that is provided asks for at least one band but does not require it to be a DetailBand.
Extracting data from the database
At this stage we have a database, a sample report, and a DTD. We need something to combine those elements, generating something more useful. There is a box in Figure 1 that represents a report processor bean. That bean is used in a Java application to generate an intermediate document also in XML format.
The steps to generate the intermediate document are:
- Read and parse the XML report
- Establish a connection to the database and execute the
SELECTstatement specified in the report - Generate the content of the first header band
- Iterate through all the rows of the query's result set generating content for the detail bands
- Check the number of detail bands generated, inserting a new footer/header pair when the page size demands it
The code in Listing 3 illustrates the first part of the process listed above:
Listing 3: Reading and parsing the report
// create a parser and read the document String reportFileName = args[0]; FileInputStream file = new FileInputStream(reportFileName); DocumentBuilderFactory dFactory = DocumentBuilderFactory.newInstance(); dFactory.setValidating( true ); DocumentBuilder docBuilder = dFactory.newDocumentBuilder(); Document doc = docBuilder.parse( file ); Element root = doc.getDocumentElement(); |
// get page size from the document
NodeList nl = root.getElementsByTagName("Page");
Element pg = (Element)nl.item(0);
pageWidth = new Float( pg.getAttribute("Width") );
pageHeight = new Float( pg.getAttribute("Height") );
|
// read SQL related stuff from the document
NodeList nl2 = root.getElementsByTagName("SQLStatement");
Element sql = (Element)nl2.item(0);
driver = sql.getAttribute("Driver");
url = sql.getAttribute("URL");
statement = sql.getAttribute("Statement");
username = sql.getAttribute("Username");
password = sql.getAttribute("Password");
|
// open the database and get the data Class.forName ( driver ); DriverManager.setLogStream(System.out); Connection con = DriverManager.getConnection( url, username, password ); Statement stmt = con.createStatement (); ResultSet rs = stmt.executeQuery ( statement ); |
The code shown in Listing 3 allows us to load a report from disk and store it in memory as a validated XML tree. It shows you how to get data stored in the report, such as page size and all the parameters needed to establish a connection to the database.
The variable doc in the example holds a complete version of the
report in what is called a DOM (Document Object Model) tree. We can keep
the whole report definition in memory because, as is always the case, it's a
small document. The data used to generate the report will be fetched one record
at a time, and it doesn't matter how large the data set is.
The rest of the source code used to read the report into memory is available for download (see Resources). For the sake of brevity, I have omitted all exception catching and several variable declarations belonging to this excerpt.
Once we are able to read the XML report definition and get the data from the database with our program, we face a problem concerning what we should do with what we have. I've played with the following options:
- Generate a set of graphics, each of them representing a page, and render them with a special viewer.
- Generate an intermediate XML document and use XSL transformations to obtain different output formats.
The first option is exactly what our current C++ software does. It works for us and is enough for the moment. However, the second option provides additional benefits for the future. If you check Figure 1 again, you will see that option 2 was the one we selected for the new reporting system.
My first idea was to use a style sheet to transform the intermediate XML document into HTML, but then I remembered that we need printed output. HTML would make it hard to use footer bands in every page, and HTML printing varies with every browser.
Instead of generating HTML, there's the option of generating PDF files using FOP (Formating Objects Processor), a utility developed by Apache.org. According to Apache's FOP documentation: "FOP is the world's first print formatter driven by XSL formatting objects. It is a Java application that reads a formatting object tree and then turns it into a PDF document. The formatting object tree can be in the form of an XML document ... or can be passed in memory as a DOM document ... or SAX events" (see Resources). An issue that concerns me about using FOP is the possible need to have Acrobat Reader installed on the computer for viewing the result.
Another option is to generate RTF files from our intermediate document using a specialized style sheet. Microsoft has made available the whole definition of RTF files. While I would prefer to use RTF, it may be simpler to use FOP because developing a style sheet to generate RTF could be a difficult task, and I need this project working not now but yesterday.
While drafting the code used to read and parse the report, I decided to use Xerces and Xalan from Apache. Because, like InterBase, they are open-source and multiplatform projects, I figured I wouldn't have to be tied to proprietary, unportable code again. There is a problem now, as I'm unable to compile the latest FOP release using the latest version of Xalan. (Xalan, Xerces, and FOP are different projects and they evolve at different rates.) However, the latest version of FOP for beginners includes compatible versions of Xalan and Xerces. Using the set of JAR files that were in the distribution for beginners, I can keep my work moving forward.
So it's clear that I decided to go with FOP. Using Xalan with the code in the previous section, we built a tree that represents our report definition, and we have a connection to the database ready. Now we have to do something with that tree and the database. We should modify the report processor bean and make it generate the intermediate XML document.
The intermediate XML document that we have to generate must have a header, and we write one to the output device with the code in Listing 4.
Listing 4: Writing the header
System.out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");.
System.out.println("<report xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">");
|
I write to the output device, and you should redirect it if you want to save the resulting document to a file for later reuse. Notice that I don't use XMLReports.DTD anymore but use the DTD used by FOP instead.
After writing the header, I need to process all report elements that belong to the HeaderBand. For each element in the HeaderBand branch of our tree, I have to:
- Check its type (Label or DataField)
- If it is a DataField, get its value from the data source
- Specify font, size, and position to use when rendering
- Write everything to the output device using XML format
When I loaded the report into memory, I obtained the page size and the heights of all bands. With a simple calculation, I know how many detail rows fit in our canvas:
Listing 5: Calculating the detail rows
numberOfRows = (pageHeight - headerHeight - footerHeight) / detailHeight; |
After processing the header band, I have to read and process
NumberOfRows records from the database using the same method used to
process the elements of the header. When I am done reading all data that fits in
one page, I have to process the FooterBand and, if there still are records
available, write a new page header starting over and over.
Seems easy, doesn't it? Doing this repetitive work, I created an XML document
that ideally would serve as input to FOP. The ideal document should contain only
a set of fo (formating objects) as specified in FOP's documentation.
But the hard part is ahead. I need an XSL style sheet that FOP can use to
transform my data into the desired PDF file.
If you download the FOP version for beginners, you will get a lot of samples that are really easy to follow. After unzipping FOP, you will find xml2pdf.bat and xml2pdf.xsl at the root directory. The latter is the XSL style sheet to use if you want to continue working with this project. So the style sheet is right there and ready to use.
When you have an XML document ready and the right XSL style sheet (name it xml2pdf.xsl), all you have to do is run xml2pdf.bat. Simply provide your XML document, style sheet, and output PDF file as parameters for the batch file. Once you have FOP properly installed, go to FOP's root directory and test it with:
xml2pdf test.xml xml2pdf.xsl test.pdf |
The sample program does not generate a document that you can use to successfully build a PDF file using xml2pdf.xsl. Remember that this is just an explanation of a new approach, and that here we are dealing with a small subset of the options a report could have. Looking at xml2pdf.xsl, you will be able to identify the missing pieces in this puzzle.
When we started researching a solution to our reporting problem, we found several options that we still keep under consideration, like the generation of RTF files that could also be used in any word processor or spreadsheet.
Applying XSL on the intermediate document to generate a large HTML document as an alternative reporting method creates a problem with page footers, but having a set of HTML pages could be a good option. Sun developed Java Help, a help viewer similar to the help system in Windows 98/2000 and Internet Explorer 4.x/5.x. Java Help may be used to compile and preview/print a help document that would meet our requirements. Note: This would be the easiest path to take if you don't care too much about the differences between the previewed and the printed version.
So far I have described how you can generate a report in PDF format from the data stored in a RDMS. The method I explained uses several tools to achieve the desired goal. With our current tool (the one I wrote in C++), we can do design and print or preview a report using only one program. My dreams for the future tool include a WYSIWYG editor that can use the DTD and help build SQL queries visually and preview or print the final report, all functions in one program.
In this article I tried to explain how Java and XML are helping my company replace a database reporting system that is tied to one operating system. Searching for a cross-platform solution, I found promising tools like Xalan and FOP. There's still a lot of development and testing to do before we release a new reporting system, but the results obtained so far justify all the effort invested.
I invite you to send me your comments about the article. And more: I invite you to send me your ideas to improve this project. I would like to see a professional database reporting tool emerge from this sketch.
| Name | Size | Download method |
|---|---|---|
| x-dbrep.zip | 3KB | HTTP |
Information about download methods
- Visual DTD, now incorporated into the XML and Web Services Development Environment from alphaWorks.
- Windows Notepad. It might look like a joke, but I use it because it is a
nice tool.
- Xeena
I found it the only usable editor when working with large DTDs like
DocBook. Providing the right XSL style sheet you can perform a transformation
and preview your document. However, you will only see a tree representation
when editing.
- Morphon
A commercial XML editor that allows you to define cascaded style sheets to
preview the document while writing. A nice candidate as a WYSIWYG editor to
use in the future.
- JDK 1.2
- Xalan 2
- Download x-dbrep.zip to work with the following code files:
- MXReports.DTD This is really an HTML file and not a DTD to avoid confusing your browser.
- ReportLoader.java This program loads a report in a DOM tree and establishes a connection to a database using parameters defined in the report.
- Currency.xml is a sample report easy to understand. This is also an HTML file to view in your browser.
- A database with suitable JDBC driver. If you use InterBase you will need
jndi.jar and jdbc2_0-stdext.jar that are available at java.sun.com

Rodolfo M. Raya works for Maxprograms as consultant/developer/co-owner, trying to find the magic formula to fit 8 days in a week. No one knows why, after he spent 13 years programming in C/C++, he is dealing now with Java, XML, and all those new technologies that take time to understand and digest. This article is dedicated to Martín, his little baby who makes sleeping at night another difficult task. You can contact Rodolfo at rmraya@maxprograms.com.
Comments (Undergoing maintenance)





