Skip to main content

Generating XML from IDS 9.x

Jacques Roy (jacquesr@us.ibm.com), Sales Support Specialist, IBM, Software Group
Jacques Roy photo
Jacques Roy is a member of IBM's worldwide sales support organization. He has over 20 years of industry experience and over 5 years of experience with database extensibility. He is the author of Informix Dynamic Server.2000: Server-Side Programming in C and co-author of Open-Source Components for the Informix Dynamic Server 9.x.

Summary:  This article describes how to use the extensibility features of IBM Informix® Dynamic Server to generate XML-formatted data. Sample code is included.

Date:  25 Feb 2003
Level:  Intermediate
Activity:  735 views

Introduction

XML is becoming more and more important to the enterprise. Frequently users need to create XML documents from the contents of a relational database. This can be done in many ways, but the extensibility features of IBM® Informix Dynamic Server® version 9.x (IDS 9.x) give you the opportunity to create your own functions to generate XML output from your database.

This article explains a simple approach to generating XML that can answer many needs and provide a building block for more elaborate solutions. The example that accompanies this article uses the stores7 demonstration database provided with the IDS 9.x product.


Building block

We may want to generate XML-formatted data using multiple types of SQL statements. The data could be in a single table, or could require the joining of multiple tables. What these SQL statements have in common is that they all return rows of data. The solution is then the same for manipulating one table or multiple tables.

IDS 9.x provides support for a new type called rows. This means that an entire row result can be passed to a user-defined function (UDF) as a single argument. The row is defined by columns that have names and types. What is left to do is pass an additional argument that identifies a row name to encapsulate the row data in the XML representation. In order to accomplish this, consider a function that would have the following definition:

CREATE FUNCTION genxml(varchar(30), ROW)
RETURNING lvarchar
. . .

The first argument is the name we want to give to the row and the second argument is the row itself. A row includes its definition, so the genxml function can find how many columns are in the row, their names, and so on. The following statement generates an XML representation for each row of the customer table:

SELECT genxml("customer", customer)
FROM customer;

The second argument is the table name, which means that the entire row is passed as the second argument. The first row returned looks like this (reformatted for this article):

<customer>
<customer_num>101&lt//customer_num>
<fname>Ludwig        </fname>
<lname>Pauli         </lname>
<company>All Sports Supplies </company>
<address1>213 Erstwild Court  </address1>
<city>Sunnyvale      </city>
<state>CA</state>
<zipcode>94086</zipcode>
<phone>408-789-8075      </phone>
</customer>

The column names were extracted from the row definition by the genxml function and used as tags for their respective columns. But what if we want a partial list of columns? We can dynamically create a row with the desired columns as the second argument to the genxml function as follows:

SELECT genxml("customer", ROW(customer_num, fname, lname))
FROM customer;

The resulting XML row is similar to the previous output with fewer fields. Since we can build a row dynamically, we can also join multiple tables and create output based on all tables, as shown in the following example:

SELECT genxml("customer_calls",
        ROW(A.customer_num, fname, lname, call_dtime, call_code,
            call_descr, res_dtime, res_dtime, res_descr)
       )
FROM customer a, cust_calls b
WHERE a.customer_num = b.customer_num;

These examples cover many of the requirements for generating XML-formatted rows. There may also be a requirement to generate XML output based on the results of an SQL statement that includes an aggregation. Aggregate expressions can be passed to the genxml function, as you see here:

SELECT genxml("stats", ROW(customer_num, COUNT(*)) )
FROM cust_calls
GROUP BY customer_num;

With this basic tool, we are now ready to complete the building of an XML document by adding its header and footer.


Handling NULL values

If a column contains a NULL value, the genxml function, and all other functions described in this article, does not process the column. This makes sense, as the only things that would appear would be the opening and closing tags for the column.


Row type processing

A table may contain a column that is a row type. In that case, the row passed to the genxml function would contain a named or unnamed row type. For example, we could have a definition as follows:

CREATE ROW TYPE address_t (
  name     varchar(20),
  address1 varchar(20),
  address2 varchar(20),
  city     varchar(15),
  state    char(2),
  zipcode  char(5)
);
CREATE TABLE employee (
  name varchar(30),
  address  address_t,
  phone  varchar(18)
);

The function will process the row type, named or unnamed, just like other base types. The execution and result of a query on the employee table would look like this:

SELECT genxml("employee", employee) FROM employee;

<employee>
<name>Roy</name>
<address>
  <address1>123 first street</address1>
  <city>Denver</city>
  <state>CO</state>
  <zipcode>80111</zipcode>
</address>
<phone>303-555-1212</phone>
</employee>

As you can see, the address column tags surround the definition of the row, and the row columns names are used to identify the values just as in the higher level. Note that the address2 column is a NULL value and therefore does not appear in the output.


Table hierarchies

There is one more case we must consider. The IDS 9.x product supports the definition of table hierarchies that are based on named row types. Figure 1 shows an example of a hierarchy.


Figure 1. Hierarchy example
Figure 1

One reason to have such a hierarchy is to be able to do risk calculations that differ depending on the type of loan. By creating multiple risk functions that operate on specific row types, we can submit a query such as the following:

SELECT loan_number, risk(loans) FROM loans WHERE branch_id = 127;

This query goes through all the rows for branch_id 127 and calls the appropriate risk function depending on the type of row being returned. If we want to generate an XML representation for each loan for a specific branch using our existing genxml function, we must submit one SQL statement for each sub-table. We must do this to give the proper row type name for each row.

Since we are dealing with named row types, the genxml function could retrieve this information and use it implicitly. Note that IDS differentiates functions based on their names and arguments types. Therefore we can create a new, additional genxml function that takes only the row as its argument and retrieves the name from the row definition. We can then generate an XML representation of loans for a specific branch with an SQL statement such as this:

SELECT genxml(loans) FROM loans WHERE branch_id = 127;

If you were to use this function on an unnamed row type, the header would be the complete definition of the row. For the customer table, you would get the following header:

<ROW(customer_num integer, fname char(15), lname char(15),
     company char(20), address1 char(20), address2 char(20),
     city char(15),state char(2), zipcode char(5),
     phone char(18)
    )>

The same row name would be used as the row closing tag.


Completing the XML document

XML is used to describe data. It can also include information that identifies the document type and the definitions of valid documents of this type. These validation files are called document type declarations or DTDs for short. DTDs are themselves XML documents.

XML describes the data but not the way it should be presented. The presentation of the information can be described in an extensible stylesheet language (XSL).

Finally, you should identify the document as an XML document that follows a specific XML standard. This means that you should wrap the generated rows with at least with the identification that it is an XML document and a tag for the complete document. The result could be something such as this:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE customer_set SYSTEM "/home/dtd/customer_set.dtd">
<?xml-stylesheet type="text/xsl"
                 href="/home/xsl/customer_set.xsl" ?>
<customer_set>
. . . (XML-formatted rows)
</customer_set>

CREATE PROCEDURE xmlcustomerset()
RETURNING LVARCHAR
  DEFINE result LVARCHAR;
  DEFINE ressql LVARCHAR;
  LET result = '<?xml version="1.0" encoding="ISO-8859-1" ?>';
  LET result = result ||
      '<!DOCTYPE customer_set SYSTEM "/home/dtd/customer_set.dtd">';
  LET result = result ||
'<?xml-stylesheet type="text/xsl" href="/home/xsl/customer_set.xsl" ?>';
  LET result = result || '<customer_set>';
  FOREACH SELECT genxml('customer', customer) INTO ressql FROM customer
    LET result = result || ressql;
  END FOREACH;
  LET result = result || '</customer_set>';
  RETURN result;
END PROCEDURE;
EXECUTE PROCEDURE xmlcustomerset();

The procedure is very simple. It simply initializes the result variable to include the header information. It then executes the SQL statement that returns multiple XML-formatted rows and appends the result to the result variable. Finally, it closes the document by adding the closure of the customer_set tag and returns the result. The last line shows how the procedure is executed.


Keeping track of header information

The creation of an XML document from a stored procedure is always the same. It would be nice to generalize the process instead of having to create multiple stored procedures. The first step is to find a way to keep track of the document type and the stylesheet that should be used. We can use a table to keep track of this information:

CREATE TABLE genxmlinfo (
  name       varchar(30) PRIMARY KEY,
  dtypepath  lvarchar,
  xslpath    lvarchar
);

The name column is the document type used. In the example above, it is customer_set. The dtypepath column contains the path to the DTD and the xslpath contains the path to the XSL document to use. Both these columns allow NULLs. If the column is NULL, then the header line does not appear in the document. This gives us the flexibility to decide what must be included.


Generalizing the creation of XML documents

If we were to use stored procedures to generate XML documents, we would have to have one procedure for each SQL statement that we want to use. This is required because the stored procedure language only supports static SQL statements.

Now that we have a way to retrieve the header information, we can create additional user-defined functions that will execute SQL statements dynamically and add the header information. We can approach this process in one of two ways. Either we simply deal with the output of genxml() or we accept a generic SQL statement and format the rows that are returned by the statement. Let's look at the latter first.

We need to develop a function that takes a document type and an SQL statement as arguments and returns a complete XML document:

CREATE FUNCTION genxmlhdr(LVARCHAR, LVARCHAR)
RETURNING LVARCHAR
. . .

We can then generate the customer_set XML document by executing:

EXECUTE FUNCTION genxmlhdr("customer_set",
                           "SELECT * FROM customer");

This will generate an XML document that is slightly different from the XML document generated by the xmlcustomerset() procedure: we have no way to provide a name for the returned rows. We have to give them a generic name. In this case: row. This is not a problem. We simply have to take that into consideration when we put together the DTDs and XLS documents that relate to our document types.

We can provide a little more flexibility by adding a function similar to the genxml(lvarchar, lvarchar) function that assumes that the output of the SQL statement will be XML-formatted rows. The creation of the customer_set document would become:

EXECUTE FUNCTION addxmlhdr("customer_set",
        "SELECT genxml('customer', customer) FROM customer");

To complete our set of functions, we need to add one function that takes an SQL statement and returns all the rows as an XML document without the header information. With this function, it is then possible to execute dynamic SQL statements within a stored procedure. Since the function signature of this function is different from the one of genxml(lvarchar, row), we can use the same name:

CREATE FUNCTION genxml(LVARCHAR, LVARCHAR)
RETURNING LVARCHAR
. . .

The first argument represents the row name and the second, an SQL statement. The generation of the customer_set document without the header is then:

EXECUTE FUNCTION genxml("customer_set", "SELECT * FROM customer");


User-defined aggregate

IDS 9.x provides the capability to create your own aggregation functions. Since the creation of an XML document involves processing a set of rows to return one result, the user-defined aggregate (UDA) capabilities are well suited to XML processing.

By definition, a relational database does not guarantee any order in the rows. The only way to order the rows is by processing them after an SQL statement that includes an ORDER BY clause. You can work around this problem by using a temporary table to contain the result you want to generate as XML. A future release of IDS is supposed to allow selecting data from the result of an SQL statement.

If the row order is not an issue, you could generate an XML document by using a UDA. This UDA could be used in the following manner:

SELECT aggrxml(customer, "customer_set") FROM customer;

The first argument of the UDA is the row we want to process. The second argument is an initialization value that represents the document type name that is generated. This means that the output is the same as the one generated by the genxml(lvarchar, lvarchar) functions discussed above. The usage is more natural than using an EXECUTE FUNCTION statement. Since this is an aggregate function, you can take advantage of the GROUP BY SQL clause. This capability can be useful if you have to generate multiple XML documents. For example, this could reduce the number of SQL statements executed within a stored procedure by removing an embedded FOR EACH loop.


Implementation

All the functions discussed above have been implemented in C and are available as example code. Here is a list of the functions including their argument types:

addxmlhdr(lvarchar, lvarchar)aggrxml(ROW, lvarchar)genxml(lvarchar, ROW)
genxml(ROW)genxml(lvarchar, lvarchar)genxmlhdr(lvarchar, lvarchar)

Note that the addxmlhdr(lvarchar, lvarchar), genxml(lvarchar, lvarchar) and genxmlhdr(lvarchar, lvarchar) functions can only be executed by an EXECUTE FUNCTION statement.

The source code for these functions is around 300 lines of C code. The code was tested on IDS 9.30.TC2 under Microsoft® Windows® 2000 and IDS 9.30.Uc2 on Linux®.

The implementation also includes a function that's useful if you intend to modify the code: set_tracing. This function allows you to turn on the execution tracing of a function. You will need to remove "-DMITRACE_OFF=1" from the make files and add tracing code to the source. For more information on tracing, please see the book Server-Side Programming in C listed in the Resources section below.


Installation

The installation of the example code involved three steps:

  • Loading the code: All the files included in the example should be loaded in a directory under the $INFORMIXDIR/extend directory. The directory name must be genxml.
  • Compiling the code: You must compile the source code for your specific platform. The example code includes two makefiles: winnt.mak and makefile. Winnt.mak is used for compilation under Microsoft windows 2000. It assumes the use of Microsoft Visual C++. The file makefile is used to compile the code for UNIX#x422-type platforms. You will need to modify the makefile include statement (line 2) to reflect the appropriate platform
  • Register the functions: Before you can use the functions described above, they must be registered in the databases where the functions are needed. The script named genxml.sql registers all the required functions and creates the genxmlinfo table that is used to keep track of the information about document types.

To remove the functions, you can execute the genxml_d.sql script in the appropriate database. It removes all the functions and the genxmlinfo table. If you want to keep the table, you should modify the installation script and the removal script and comment out the lines that involve the table.

The easiest way to register and de-register the functions is to execute the scripts through the command line:

dbaccess -e jroy genxml <errlog 2<&1

This example assumes a database name of jroy and redirects the output of dbaccess to a file named errlog. If an error occurs, the script continues the execution at the next statement in the script file. The -e option tells dbaccess to echo the commands to the output so you can see what command generated which result.


Limitations

The example code provided with this article contains two major limitations:

  • The XML output of a row cannot exceed 2048 characters.
  • The XML document output is limited to around 32KB characters.

These limitations can be removed by adding the code to handle these limits. You can find example code on a type called idn_mrlvarchar at the IBM web site listed in the Resources section. You can also find the code and the description of the type and how to use it in the Open Source Components book listed also in the Resources section. By using an implementation similar to idn_mrlvarchar, your XML document limit would jump to 4TB.

This implementation supports most of the IDS data types except for collection types (LIST, MULTISET, SET), and the large object types (BLOB, BYTE, CLOB, and TEXT).


Conclusion

The ability to generate XML documents directly from the database server can greatly simplify the implementation of a solution. It simplifies the interaction between the application and the database. This application can be code running in an application server, code running in a web server, a CGI program, or a stand-alone application. Since the genxml code runs in the database server, it is readily available to any application. This availability fosters code reuse.

The genxml code demonstrates the power of object-relational technology. And it demonstrates at least two more things:

  • You don't always need to wait for the database vendor to provide functionality to solve your business problems.
  • Some interesting functionality can be added to the database server with little effort and provide significant business advantage.


Download

DescriptionNameSizeDownload method
genxml sample codegenxml.zip128 KB FTP | HTTP

Information about download methods


Resources

About the author

Jacques Roy photo

Jacques Roy is a member of IBM's worldwide sales support organization. He has over 20 years of industry experience and over 5 years of experience with database extensibility. He is the author of Informix Dynamic Server.2000: Server-Side Programming in C and co-author of Open-Source Components for the Informix Dynamic Server 9.x.

Comments (Undergoing maintenance)



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, WebSphere, XML
ArticleID=95791
ArticleTitle=Generating XML from IDS 9.x
publish-date=02252003
author1-email=jacquesr@us.ibm.com
author1-email-cc=

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Special offers