Simulate XQuery and XInclude functionality with PHP

Enjoy the power of XML data processing in web programming

Many browsers can handle XML in general, but are currently weak in their support of XQuery and XInclude. You need to work around this issue when you build web applications that combine and process XML-structured data from multiple sources. Using PHP is one solution. This article first shows how your web pages can deliver data extracts from XML using XQuery and XInclude explicitly, and then how to do the equivalent work with PHP, all within the context of a cooperative effort among garden centers.


Colin Beckingham, Writer and Researcher, Freelance

Colin Beckingham is a freelance researcher, writer, and programmer who lives in eastern Ontario, Canada. Holding degrees from Queen's University, Kingston, and the University of Windsor, he has worked in a rich variety of fields including banking, horticulture, horse racing, teaching, civil service, retail, and travel and tourism. The author of database applications and numerous newspaper, magazine, and online articles, his research interests include open source programming, VoIP, and voice-control applications on Linux. You can reach Colin at colbec@start.ca.



21 September 2010


XInclude and XQuery are XML tools that help web programmers process data dynamically. XInclude lets you treat multiple XML files as if they were one file, and XQuery can process the combined data and prepare it for inclusion into output for web-page display. Together, they perform this service elegantly and efficiently with few lines of code.

Frequently used acronyms

  • HTTP: Hypertext Transfer Protocol
  • W3C: World Wide Web Consortium
  • XHTML: Extensible Hypertext Markup Language
  • XML: Extensible Markup Language
  • XSL: Extensible Stylesheet Language

Most browsers can display and process XML files either directly or in cooperation with XSL templates. In an ideal world, browsers would understand XQuery and XInclude directly too. But at this point they support these tools only by placing unreasonable demands on users—for example, by requiring them to load experimental add-ons. Fetching the data from widely different sources and combining them into one large data set for processing can be a painstaking task for the web programmer.

Through a hypothetical business example, this article first shows you the strength of the combination of XQuery and XInclude. Then you'll learn how to use PHP to simulate the functionality that XQuery and XInclude provide. Moving all the data processing to the server side gives you a workaround to limited browser support for XQuery and XInclude. Another benefit is that PHP gives you much finer control over the final output presentation.

Example: Garden-center cooperation

Imagine that a town has three garden centers. They compete with one another but provide sufficiently different services that they can do so cooperatively rather than antagonistically. They do some market research to find out how their customers buy their plants.

In the horticulture business, sales of highly perishable products occur in high volume in a short period of time, and customers can be quite particular about the type and quality of products they want. It is in the best interest of the business to get the high-maintenance products (for example, those that need to be watered and kept free of pests) out the door as soon as possible and to keep a mercurial customer base happy by providing them with the right products at the right time and in the right place. Otherwise, customers flock to a competitor that can.

Research shows that customers do not like phoning around to find where they can get their plants, and the high volume of calls is inconvenient to the businesses as well. At the beginning of the season, all the businesses have plenty of product. Later in the season, despite good planning, product starts to run out unevenly. Customers looking for petunias might have to phone or visit all three garden centers before they find the hot pink color they're looking for. In general, customers want to know who has which plant, in what size pot, in what quantity, and at what price.

The proposed setup

The three IT managers hold a meeting. They decide to create a common website where customers can find out which garden center has specific plants in stock.

Although all three garden centers are computerized, each uses a different system to store information. Apples and Things uses a Microsoft® Access® database system, Birch Trees Unlimited uses a Linux® system with MySQL, and Carnation Tarnation uses Mac OS® X with IBM® DB2.

Broadly, they decide to work toward an XML-based architecture. XML is a convenient data format because each of their systems can export current data to an XML file that is available in the cloud. A master XML file collects the individual stores' data into one central place, using XInclude. Finally, the main web page examines the master file, extracts the data using XQuery, and renders the final display.


Processing the XML with XInclude and XQuery

The IT managers decide that each store will produce an XML file similar to the one in Listing 1:

Listing 1. Store XML file
<store>
  <info>
    <name>...</name>
    <address>... </address>
    <phone>... </phone>
  </info>
  <plants>
    <plant>
      <name>Petunia</name>
      <description>Pink, in 4 inch pots</description>
      <quantity>100</quantity>
      <price>3.00</price>
    </plant>
    <plant>
      <name>Apple tree 'Spartan'</name>
      <description></description>
      <quantity>6</quantity>
      <price>25.00</price>
    </plant>
    ...
  </plants>
</store>

This file has a store root element and two child elements. The info child element contains information about the store in general. The plants child element contains many plant child elements, each of which contains further information about that plant, including the name, description, quantity and price. All three stores follow this pattern exactly.
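Because every store file must follow this pattern exactly, it can help to check an exported file before it is published. The following is a minimal sketch, assuming PHP with the SimpleXML extension. The function name check_store_xml() and the sample data are hypothetical and not part of the garden centers' setup.

```php
<?php
// Hypothetical helper: returns a list of problems found in a store file
// (an empty list means the file follows the agreed pattern).
function check_store_xml(string $xmlText): array {
    libxml_use_internal_errors(true);          // keep parse errors quiet
    $store = simplexml_load_string($xmlText);
    if ($store === false) {
        return array("not well-formed XML");
    }
    $problems = array();
    if ($store->getName() !== "store") {
        $problems[] = "root element is not <store>";
    }
    if (!isset($store->info->name)) {
        $problems[] = "missing <info><name>";
    }
    if (!isset($store->plants)) {
        $problems[] = "missing <plants>";
        return $problems;
    }
    $n = 0;
    foreach ($store->plants->plant as $plant) {
        $n++;
        foreach (array("name", "description", "quantity", "price") as $child) {
            if (!isset($plant->$child)) {
                $problems[] = "plant $n: missing <$child>";
            }
        }
    }
    return $problems;
}

// Made-up sample: the second plant lacks a description and a price
$sample = "<store><info><name>Apples</name></info><plants>
  <plant><name>Petunia</name><description>Pink</description>
    <quantity>100</quantity><price>3.00</price></plant>
  <plant><name>Coleus</name><quantity>5</quantity></plant>
</plants></store>";
$problems = check_store_xml($sample);
print_r($problems);
```

A check like this could run in each store's export job, so that a malformed file is caught before the combined page ever tries to include it.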

Listing 2 is their version of a main XML file that combines all three store files so they appear to be one large file, maintaining details about the products:

Listing 2. Main XML file
<?xml version="1.0"?>
<storeData xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="http://path-to-apples.xml"/>
  <xi:include href="http://path-to-birches.xml"/>
  <xi:include href="http://path-to-carnations.xml"/>
</storeData>

In Listing 2, XInclude fetches the data from the three locations in the cloud over HTTP. After processing, the main file looks like Listing 3:

Listing 3. Main XML file expanded
<?xml version="1.0"?>
<storeData ...>
  <store>
    <info>
      <name>Apples</name>
      ...
    </info>
    <plants>
      ...
    </plants>
  </store>
  <store>
    <info>
      <name>Birches</name>
      ...
    </info>
    <plants>
      ...
    </plants>
  </store>
  ...
</storeData>

In Listing 3, the include instructions from Listing 2 are replaced by the contents of the individual store XML files. Each store element is now a child element of the global storeData element.
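PHP's DOM extension can also perform this kind of expansion on the server through DOMDocument::xinclude(). The sketch below is an illustration only and is not the approach this article develops later. It writes two tiny store files to a temporary directory (the file names and contents are made up) and then expands a main file, in the style of Listing 2, that includes them.

```php
<?php
// Create a scratch directory with two made-up store files
$dir = sys_get_temp_dir() . "/xinclude_demo_" . uniqid();
mkdir($dir);
file_put_contents("$dir/apples.xml",
    "<store><info><name>Apples</name></info><plants/></store>");
file_put_contents("$dir/birches.xml",
    "<store><info><name>Birches</name></info><plants/></store>");

// Main file in the style of Listing 2, with relative hrefs
file_put_contents("$dir/main.xml",
    '<?xml version="1.0"?>
<storeData xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="apples.xml"/>
  <xi:include href="birches.xml"/>
</storeData>');

$doc = new DOMDocument();
$doc->load("$dir/main.xml");   // load() sets the base URI used to resolve hrefs
$count = $doc->xinclude();     // number of includes performed; false or -1 signals a problem

// Collect the store names from the expanded document
$storeNames = array();
foreach ($doc->getElementsByTagName("name") as $node) {
    $storeNames[] = $node->textContent;
}
echo "Includes processed: ";
var_dump($count);
echo implode(", ", $storeNames) . "\n";
```

After the call, the xi:include elements have been replaced by the store elements, which is the same shape as Listing 3.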

You can now apply XQuery to the combined dataset, as in Listing 4:

Listing 4. Applying XQuery
<div>{
  for $store in doc("mainXMLfile.xml")/storeData/store
  let $plants := $store/plants/plant
  return
      <div>{
        for $plant in $plants
            order by $plant/name
        return <div>{ concat($plant/quantity," of ",$plant/name," at ",
            $store/info/name," for ",$plant/price," each") }</div>
        }</div>
}</div>

Listing 4 requests XHTML output with div sections. Iterating through the stores, it reports the quantity, name, store name, and price for each plant. The output looks like Listing 5:

Listing 5. Generated XHTML
<div>
    <div>
        <div>6 of Apple tree 'Spartan' at apples for 25.00 each</div>
        <div>100 of Petunia at apples for 3.00 each</div>
    </div>
    <div>
        <div>100 of Coleus at birches for 3.00 each</div>
        <div>6 of Pear tree 'Flemish Beauty' at birches for 25.00 each</div>
    </div>
    <div>
        <div>6 of Orchids at carnation tarnation for 25.00 each</div>
        <div>100 of Roses at carnation tarnation for 3.00 each</div>
    </div>
</div>

You can check this output using any of a number of XQuery clients. For example, you could use the open source Java™-based eXist application to verify the output of various XQuery commands—including Listing 4's query—on a test database. Or you could use Zorba, a stand-alone XQuery application, to do the same thing. See Resources for more information about these XQuery clients. JavaScript approaches are available as well.

An alternative to using XInclude and XQuery in combination is to perform all the steps on the server side. You can do this with PHP by using the functions provided by PHP's SimpleXML extension. PHP then delivers the filtered and formatted data straight to the browser as XHTML, which spares readers from installing anything extra in their browsers.


Including with PHP

Take the include step first, which you can accomplish with simple string substitution:

  1. Start with a basic empty XML framework.
  2. Insert a string for each of the stores.
  3. Fetch the files for the stores.
  4. Replace the store names with the files.

The result is one large file for all stores. Other ways to build the file, such as using the SimpleXML addChild(...) function, are also available. However, in this case string substitution is simpler and provides as much validation as is needed. Listing 6 shows the string-substitution technique:

Listing 6. Include by string substitution
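<?php
// string substitution approach
libxml_use_internal_errors(true); // collect XML parse errors instead of printing warnings
$arr = array("apples","birches","carnations");
$myxml = "<?xml version=\"1.0\"?>
<storeData>";
// one distinctive placeholder per store, so that a later substitution
// cannot match text inside a store file that is already included
foreach ($arr as $a) { $myxml .= placeholder($a)."\n"; }
$myxml .= "</storeData>";
// now process includes
foreach ($arr as $a) {
  $f = @file_get_contents($a.".xml");
  if ($f === false || simplexml_load_string($f) === false) {
    // file is missing or is not well-formed XML
    xml_fallback($a);
  } else {
    xml_include($a,$f);
  }
}
$xml = simplexml_load_string($myxml);
if ($xml === false) {
  echo "Combined file failed!\n";
} else {
  echo $xml->asXML();
}
//
// functions
//
function placeholder($a) {
  return "[[include:$a]]";
}
function xml_include($a,$rep) {
  global $myxml;
  // an XML declaration is allowed only at the very start of a document,
  // so strip it from the store file before inserting
  $rep = preg_replace('/^\s*<\?xml.*?\?>\s*/s', '', $rep);
  $myxml = str_replace(placeholder($a),$rep,$myxml);
}
function xml_fallback($a) {
  global $myxml;
  $rep = "<store><info>
    <name>$a</name>
  </info></store>";
  $myxml = str_replace(placeholder($a),$rep,$myxml);
}
?>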

Listing 6 starts by declaring an array that contains the short names of the stores and builds those names into a basic XML file as placeholders. It then loops through the stores, fetching each store's XML file as simple text. The file might not be there, in which case you need to take special action. If the file is there, you test it to see if, on its own, it is well-formed XML. If it is not, you take the same action as if the file were missing, using the equivalent of XInclude's fallback mechanism (the xi:fallback element). In this case, the xml_fallback() function substitutes a basic XML snippet to keep the combined file well-formed. If the file is well-formed, it is substituted into the main file with xml_include(). Finally, the fully assembled XML file is tested for well-formedness. In this example, because you are testing only the include functionality, upon success the process stops and outputs the XML for visual verification. The XML is now ready for query processing.
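As an alternative to string substitution, you can merge the store documents as trees, so that text inside one store file can never be mistaken for another store's placeholder. The following is a minimal sketch using PHP's DOM extension. The function name combine_stores() is hypothetical, and the XML strings stand in for the fetched store files.

```php
<?php
// Hypothetical helper: merges store documents into one <storeData> tree.
// $sources maps a short store name to the XML text fetched for that store,
// or to false when the fetch failed.
function combine_stores(array $sources): DOMDocument {
    libxml_use_internal_errors(true);           // suppress parse warnings
    $main = new DOMDocument("1.0");
    $root = $main->appendChild($main->createElement("storeData"));
    foreach ($sources as $shortName => $xmlText) {
        $storeDoc = new DOMDocument();
        if (is_string($xmlText) && $xmlText !== "" && $storeDoc->loadXML($xmlText)) {
            // deep-copy the store's root element into the main document
            $root->appendChild($main->importNode($storeDoc->documentElement, true));
        } else {
            // fallback: a stub store that keeps the combined file usable
            $store = $root->appendChild($main->createElement("store"));
            $info = $store->appendChild($main->createElement("info"));
            $info->appendChild($main->createElement("name", $shortName));
        }
    }
    return $main;
}

$combined = combine_stores(array(
    "apples"  => '<?xml version="1.0"?><store><info><name>Apples</name></info><plants/></store>',
    "birches" => "<store><info><name>Birches</name>",   // truncated on purpose
));
echo $combined->saveXML();
```

The trade-off is a little more code than string substitution, in exchange for not having to think about placeholders or about XML declarations inside the included files.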


Querying with PHP

You can now process the XML within PHP code either with an XQuery application such as Zorba or by using SimpleXML.

Query using Zorba

One way to tie Zorba to PHP is through a PHP Extension Community Library (PECL) extension. (For an explanation of this method, see Resources for a link to the article, "Building XQuery-powered applications with PHP and Zorba".) Alternatively, you can call Zorba from PHP without using an extension by using exec(), as demonstrated in the PHP snippet in Listing 7:

Listing 7. PHP and Zorba without using an extension
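<?php
...
function run_zorba($myxquery) {
  ...
  // escapeshellarg() quotes the query so that quotes or shell
  // metacharacters inside it cannot break the command line
  exec("zorba -i -q ".escapeshellarg($myxquery),$array);
  ...
  $xmlstr = implode("\n",$array);
  $xml = simplexml_load_string($xmlstr);
  echo $xml->asXML();
}
?>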

In Listing 7, the function receives the XQuery request in the $myxquery string argument and uses the exec() function to run Zorba as an external command. Because the query becomes part of a shell command line, treat it with care, particularly if any part of it comes from user input. You may wish to wrap this in a process fork for safety. The exec() function fills the $array array with the lines of output from Zorba. The -q and -i options call for a query to be run and for the output to be indented, respectively. This last option is particularly useful in debugging because it makes the XML code more readable. The contents of the array are then imploded into one long string, which is in turn loaded into an XML object and (in this test case) echoed.
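When you call an external program this way, it is worth checking whether the command actually succeeded before parsing its output. The following is a minimal sketch under stated assumptions: the helper names zorba_command() and run_zorba_checked() are hypothetical, the -i and -q options are the ones used in Listing 7, and a zero exit status is assumed to mean success.

```php
<?php
// Hypothetical helper: builds the shell command separately so that it can
// be inspected or logged before it is run.
function zorba_command(string $query, string $binary = "zorba"): string {
    // escapeshellarg() quotes the query as a single shell argument
    return escapeshellcmd($binary) . " -i -q " . escapeshellarg($query);
}

// Hypothetical helper: runs the query and returns the parsed result,
// or null if the command failed or its output is not well-formed XML.
function run_zorba_checked(string $query): ?SimpleXMLElement {
    $lines = array();
    exec(zorba_command($query), $lines, $status);
    if ($status !== 0) {
        return null;   // Zorba is missing or reported an error
    }
    libxml_use_internal_errors(true);
    $xml = simplexml_load_string(implode("\n", $lines));
    return $xml === false ? null : $xml;
}

// Show the command that would be executed for a trivial query
echo zorba_command("1 + 1") . "\n";
```

Returning null instead of echoing lets the calling page decide how to report a failure, for example by falling back to the SimpleXML approach.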

Query using SimpleXML

Although using an XQuery agent such as Zorba is convenient, it is also possible to pull the data with SimpleXML directly, in the following way.

At this point, assume you have the $xml variable from Listing 6 loaded into an XML object that is ready for processing. Listing 8 shows the final querying of the XML data:

Listing 8. Query processing with SimpleXML
<?php
// Query with PHP
include 'mainstore2.php';
if (!$xml = simplexml_load_string($myxml)) {
  die("Cannot read output from XInclude step\n");
}
// begin output
$myout = "\n\n<html><head></head><body>\n";
$myout .= "<div>\n";
foreach ($xml->store as $store) {
  foreach ($store->plants->plant as $plant) {
    $myout .= $plant->quantity." of "
        .$plant->name." at "
        .$store->info->name." for "
        .$plant->price
        ."\n";
  } 
}
$myout .= "</div>\n";
$myout .= "</body></html>\n";
echo $myout;
?>

Listing 8 begins by loading the equivalent of Listing 6 and checking yet again if the XML is well-formed. It then loops through the stores, then through the plants available at each store, and finally generates the output shown in Listing 9. This output is comparable to the output shown in Listing 5, from the earlier XQuery approach. You can add XHTML markup as required.
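To bring the SimpleXML output closer to Listing 5, each store can get its own div, with plants sorted by name and text escaped for XHTML. The following is a minimal sketch. The function name render_stores() is hypothetical, and the inline XML stands in for the combined data from the include step.

```php
<?php
// Hypothetical helper: renders the combined store data as nested divs,
// one inner div per store, with plants ordered by name as in Listing 4.
function render_stores(SimpleXMLElement $xml): string {
    $out = "<div>\n";
    foreach ($xml->store as $store) {
        // copy the plant elements into a plain array so they can be sorted
        $plants = array();
        if (isset($store->plants->plant)) {
            foreach ($store->plants->plant as $plant) {
                $plants[] = $plant;
            }
        }
        usort($plants, function ($a, $b) {
            return strcmp((string)$a->name, (string)$b->name);
        });
        $out .= "  <div>\n";
        foreach ($plants as $plant) {
            $line = $plant->quantity." of ".$plant->name." at "
                .$store->info->name." for ".$plant->price." each";
            // escape &, < and > before placing the text in markup
            $out .= "    <div>".htmlspecialchars($line, ENT_NOQUOTES)."</div>\n";
        }
        $out .= "  </div>\n";
    }
    return $out."</div>\n";
}

$xml = simplexml_load_string("<storeData><store><info><name>apples</name></info><plants>
  <plant><name>Petunia</name><quantity>100</quantity><price>3.00</price></plant>
  <plant><name>Apple tree 'Spartan'</name><quantity>6</quantity><price>25.00</price></plant>
</plants></store></storeData>");
$html = render_stores($xml);
echo $html;
```

Note that strcmp() compares byte by byte, so this ordering is case-sensitive. A different comparison function can be swapped in if the stores need another collation.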

Listing 9. Output to browser
<html><head></head><body>
<div>
100 of Petunia at apples for 3.00
6 of Apple tree 'Spartan' at apples for 25.00
100 of Coleus at birches for 3.00
6 of Pear tree 'Flemish Beauty' at birches for 25.00
100 of Roses at carnation tarnation for 3.00
6 of Orchids at carnation tarnation for 25.00
</div>
</body></html>

You can readily extend this kind of programming to give the garden center customers the ability to filter and sort based on query parameters such as price ranges and names, as required.
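As one way to sketch such an extension, the function below filters plants by a price range and a name fragment and sorts the matches by price. The function name find_plants() and its parameter names are hypothetical, and the sample prices are made up. In a web page, the values might come from $_GET after validation.

```php
<?php
// Hypothetical helper: returns matching plants as plain arrays,
// cheapest first. $nameContains is matched case-insensitively.
function find_plants(SimpleXMLElement $xml, float $minPrice, float $maxPrice,
                     string $nameContains = ""): array {
    $matches = array();
    // XPath selects every plant of every store in one step
    foreach ($xml->xpath("/storeData/store/plants/plant") as $plant) {
        $price = (float)$plant->price;
        $name = (string)$plant->name;
        if ($price < $minPrice || $price > $maxPrice) {
            continue;
        }
        if ($nameContains !== "" && stripos($name, $nameContains) === false) {
            continue;
        }
        // step up from plant to plants to store to read the store name
        $storeNames = $plant->xpath("../../info/name");
        $matches[] = array(
            "store"    => $storeNames ? (string)$storeNames[0] : "",
            "name"     => $name,
            "quantity" => (int)$plant->quantity,
            "price"    => $price,
        );
    }
    usort($matches, function ($a, $b) {
        return $a["price"] <=> $b["price"];
    });
    return $matches;
}

$xml = simplexml_load_string("<storeData>
 <store><info><name>apples</name></info><plants>
  <plant><name>Petunia</name><quantity>100</quantity><price>3.00</price></plant>
  <plant><name>Apple tree 'Spartan'</name><quantity>6</quantity><price>25.00</price></plant>
 </plants></store>
 <store><info><name>birches</name></info><plants>
  <plant><name>Coleus</name><quantity>100</quantity><price>2.50</price></plant>
 </plants></store>
</storeData>");
$cheap = find_plants($xml, 0.0, 5.0);
foreach ($cheap as $p) {
    echo $p["quantity"]." of ".$p["name"]." at ".$p["store"]
        ." for ".number_format($p["price"], 2)."\n";
}
```

Returning plain arrays rather than SimpleXML elements keeps the presentation code independent of the XML structure, so the same results could feed an XHTML table or another output format.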


Conclusion

The programmer's concern is to compare two approaches: PHP code that uses the readily available SimpleXML library, and PHP combined with intermediate tools provided by dedicated XInclude and XQuery libraries. A library of special tools is of value only if it eases the programmer's burden, reducing and clarifying the code required to get a job done.

In the case of includes, the insertion of data from other files is rather straightforward and does not require much coding using either method.

In the case of queries, XQuery libraries can replace quite a lot of the code ('for' loops, and so on) that the PHP+SimpleXML method otherwise requires.

However, the more condensed the data-retrieval code is, the less opportunity you have to use PHP's other capabilities. For example, you could reduce your query to one line of XQuery that replaces 20 lines of PHP+SimpleXML. The opportunity cost is that you forgo the chance to easily and clearly insert other statements between those 20 separate statements. This is the trade-off.

Resources

Learn

Get products and technologies

Discuss
