Combine advanced spreadsheet export with PHP to create pivot tables

Build flexible, exportable spreadsheets from an XML template

Add PHP to an advanced export function in Microsoft Excel to create a flexible pivot table in XML format. Then, you and users can sort and analyze the data in the pivot table on the fly.

Share:

Jack D. Herrington, Senior Software Engineer, Fortify Software, Inc.

Photo of Jack HerringtonJack Herrington is an engineer, author, and presenter who lives and works in the Bay Area. You can keep up with his work and his writing at http://jackherrington.com.



12 April 2011

Also available in Chinese Russian Japanese

Six years ago, I wrote an article, "Read and write Excel data with PHP," about using PHP to export data as a Microsoft® Excel® spreadsheet. As it turns out, that article was popular enough to warrant this follow-up article. The first article barely scratched the surface of the power of Excel. In this article, I show you how to use one of the more advanced features, pivot tables, as an example of the potential for Excel export.

Starting to build pivot tables with advanced Excel export with PHP

Frequently used acronyms

  • CSV: Comma-separated value
  • W3C: World Wide Web Consortium
  • XML: Extensible Markup Language

A pivot table is a table in which the user can dynamically choose which fields are used as the row headings, the column headings, and what is shown as the data in each cell. Users can rearrange the fields on the fly, which gives them the ability to answer complex questions and mine the data for interesting patterns.

Figure 1 shows a basic Excel spreadsheet. The first row contains the field names (Account, Genre, Images, Average Ran, Total Size), and the rows that follow contain the data. In this case, you are looking at the export from an image storage site (for example, Flickr). Each row contains the information for a given account including the genre of images, the number of images uploaded, their average ranking, and the total size of all of the images combined (in bytes).

Figure 1. The basic Excel table
Screen capture of a basic Excel table with account information for an image storage site

To build a pivot table in Excel, you select the data columns to be used in the pivot table, then select the PivotTable Report menu item from the Data menu. You are then prompted to specify which data to use for the report in a wizard. After the wizard is completed, a new sheet is added to the Excel file.

Figure 2 shows an empty pivot table with the PivotTable toolbar floating over the window. The spreadsheet includes regions in which you can drop each of the field items to generate the report.

Figure 2. Building a pivot table
Screen capture of empty pivot table with regions ready to receive dropped row fields, column fields, and data items

Figure 3 shows the PivotTable toolbar in more detail. At the top are the PivotTable menu and some toolbar buttons to control the depth of the table. At the bottom of the toolbar are the available fields (Account, Genre, Images, Average Ran, Total Size).

Figure 3. The pivot toolbar
Screen capture of pivotTable toolbar with menu, buttons, and available fields

To adjust the pivot table, you select fields and drop them in the appropriate zones on the page. Figure 4 shows the result of first dropping the genre, then the account and total size in the data region on the left side of the report.

Figure 4. A populated pivot table from the data
Screen capture of

The report now shows the records first grouped by genre, then by account. The sizes are then shown for each account, then the sizes are totaled for each genre, and finally a grand total is shown. Adjusting the ordering of the data shown is simply a matter of dragging and dropping fields. It's an extremely powerful mechanism for exploring your data and well worth the effort if you find yourself with large or complex data sets.

When you save an Excel spreadsheet in XML format, all the data and the pivot table parameters are stored in the XML file. Listing 1 shows an example fragment of this file.

Listing 1. Example Excel export
<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
 xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882"
 xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
  <Author>Jack Herrington</Author>
  <LastAuthor>Jack Herrington</LastAuthor>
  <Created>2011-03-26T21:15:20Z</Created>
  <LastSaved>2011-03-26T21:45:33Z</LastSaved>
  <Company>Myself</Company>
  <Version>12.0</Version>
 </DocumentProperties>
...
</Workbook>

The example PHP application builds an Excel XML file that has one worksheet for the raw data and another for the PivotTable. That pivot table tab has all the dynamic capabilities available, just as if you created it in Excel.

To get started, you first need the data that you put into the spreadsheet.


Getting the data

To make it easier on yourself, store the data in an XML file as in Listing 2.

Listing 2. data.xml
<users>
  <user account="Megan" genre="Portraits" images="20"
     avgrank="4.0" size="72000" />
  <user account="Hannah" genre="Landscapes" images="31"
     avgrank="3.5" size="83000" />
  <user account="Vicky" genre="Floral" images="25"
     avgrank="4.2" size="42000" />
  <user account="Ian" genre="Portraits" images="40"
     avgrank="3.7" size="92000" />
  <user account="Michael" genre="Landscapes" images="23"
     avgrank="3.8" size="72000" />
  <user account="Daniel" genre="Landscapes" images="29"
     avgrank="4.4" size="85000" />
</users>

The root of the XML is a users tag that contains a tag for each user. Each user tag contains account, genre, images, avgrank, and size attributes. These attributes hold the relevant data for each account.

To read the data, a simple data.php file contains a single function, load_data. This function reads the XML and returns the data back as an array of hash tables. Listing 3 shows the code for this function.

Listing 3. data.php
<?php
function load_data() {
  $xmlDoc = new DOMDocument();
  $xmlDoc->load("data.xml");
  $data = array();
  foreach ($xmlDoc->documentElement->childNodes AS $item)
  {
    if ( $item->nodeType == XML_ELEMENT_NODE ) {
    $data []= array( "account" => $item->getAttribute("account"),
      "genre" => $item->getAttribute("genre"),
      "images" => $item->getAttribute("images"),
      "avgrank" => $item->getAttribute("avgrank"),
      "size" => $item->getAttribute("size") );
    }
  }
  return $data;
}
?>

The code starts by creating a DOMDocument object that is used to load the XML from the file. The foreach loop then iterates through each of the user nodes and stores their attributes in the data array. This data array is then returned to the caller.

In addition to the data reader, another function, called print_file, is used to print the entire contents of the file. Listing 4 shows this function.

Listing 4. helpers.php
<?php
function print_file( $file ) {
  print file_get_contents( $file );
}
?>

The reason for print_file is simple. The Excel XML file is pretty complex, so the easiest way to build one is to start with a file exported from XML, then grab large chunks of it as seed code for the export file. In this case, you take the top portion of the file up to the point where the first workbook is defined and put that into body_header.txt. The bottom portion of the file, after the last workbook is finished, is stored in body_footer.txt.

Similar files, data_header.txt and data_footer.txt, bracket the table in the data tab of the spreadsheet.


Building a simple Excel spreadsheet

With the data reader and the file printer ready to go, you can start to build Excel spreadsheets. Listing 5 shows a first take on the Excel export code.

Listing 5. helpers.php
<?php
require_once 'data.php';
require_once 'helpers.php';
require_once 'exporters.php';

$data = load_data();

print_file( 'body_header.txt' );
print_file( 'data_header.txt' );
?>
<Table ss:ExpandedColumnCount="5" 
       ss:ExpandedRowCount="<?php echo( count($data) + 1 ) ?>" x:FullColumns="1"
 x:FullRows="1">
 <Row>
  <Cell ss:StyleID="s21"><Data ss:Type="String">Account</Data></Cell>
  <Cell ss:StyleID="s21"><Data ss:Type="String">Genre</Data></Cell>
  <Cell ss:StyleID="s21"><Data ss:Type="String">Images</Data></Cell>
  <Cell ss:StyleID="s21"><Data ss:Type="String">Average Ranking</Data></Cell>
  <Cell ss:StyleID="s21"><Data ss:Type="String">Total Size</Data></Cell>
 </Row>
<?php foreach( $data as $row ) { ?>
<Row>
 <Cell><Data ss:Type="String"><?php echo( $row['account'] ) ?></Data></Cell>
 <Cell><Data ss:Type="String"><?php echo( $row['genre'] ) ?></Data></Cell>
 <Cell><Data ss:Type="Number"><?php echo( $row['images'] ) ?></Data></Cell>
 <Cell><Data ss:Type="Number"><?php echo( $row['avgrank'] ) ?></Data></Cell>
 <Cell><Data ss:Type="Number"><?php echo( $row['size'] ) ?></Data></Cell>
</Row>
<?php } ?>
</Table>
<?php
print_file( 'data_footer.txt' );
print_file( 'body_footer.txt' );
?>

The code first reads in the data, then prints both the body and data headers. Then the XML code defines the table, the first row of which has the headers. Then the foreach loop iterates through each data item and creates a new row with the data.

When the program is run, the output can be stored in a file that is then opened by Excel. The result should be similar to the result in Figure 5.

Figure 5. The basic table exported from PHP
Screen capture of pivot table with data as originally formatted in the file

Figure 5 shows the data properly formatted as it appears in the original file; the first row displays the field headings, and each subsequent row provides the data.

Obviously, you won't use this exact code for your Excel export because your fields and data will be different. But the process of building the sheet from first creating an Excel file, then cutting out the header and footer sections, then injecting your own data into the sheet remains the same.


Building the pivot table

The pivot table spreadsheet is a little more complex, but the idea is roughly the same. The script starts by reading in the data, then printing the body header. It then prints the pages for the data, then for the pivot table, then for the footer. Listing 6 shows the code for this process.

Listing 6. build2.php
<?php
require_once 'data.php';
require_once 'helpers.php';
require_once 'exporters.php';

header( 'Content-type: application/excel' );
header( 'Content-Disposition: attachment; filename="pivot.xml"' );

$data = load_data();

print_file( 'body_header.txt' );
export_data( $data );
export_pivot( $data );
print_file( 'body_footer.txt' );
?>

The header function calls at the beginning are also interesting because they tell the browser what to do with the file. The Content-type header tells the browser that the script is exporting an Excel spreadsheet. The Content-disposition header tells the browser that this spreadsheet should be stored as a file and recommends a name, in this case pivot.xml.

The bulk of the work is done with export functions, which are defined in the exporters.php file in Listing 7.

Listing 7. exporters.php
<?php
require_once 'helpers.php';

function export_data( $data ) {
  print_file( 'data_header.txt' );
?>
<Table ss:ExpandedColumnCount="5" 
       ss:ExpandedRowCount="<?php echo( count($data) + 1 ) ?>" x:FullColumns="1"
 x:FullRows="1">
...
</Table>
<?php
  print_file( 'data_footer.txt' );
}

function export_pivot( $data ) {
?>
<Worksheet ss:Name="Pivot 1">
<?php
  export_pivot_table( $data );
  export_pivot_worksheet_options( $data );
  export_pivot_pivottable( $data );
?>
</Worksheet>
<?php
  export_pivotcache( $data );
}

function export_pivot_table( $data ) {
?>
<Table ss:ExpandedColumnCount="2" 
       ss:ExpandedRowCount="<?php echo( count($data) + 5 ) ?>" x:FullColumns="1"
 x:FullRows="1">
 <Column ss:Width="96.0"/>
 <Column ss:AutoFitWidth="0" ss:Width="49.0"/>
 <Row ss:Index="3">
  <Cell ss:StyleID="s22">
     <Data ss:Type="String">Sum of Total Size</Data>
  </Cell>
  <Cell ss:StyleID="s24"/>
 </Row>
 <Row>
  <Cell ss:StyleID="s34">
     <Data ss:Type="String">Account</Data>
  </Cell>
  <Cell ss:StyleID="s24">
     <Data ss:Type="String">Total</Data>
  </Cell>
 </Row>
<?php
$total = 0;
foreach( $data as $row ) { 
$total += intval( $row['size'] );
?>
<Row>
  <Cell ss:StyleID="s22">
     <Data ss:Type="String"><?php echo( $row['account'] ) ?></Data>
  </Cell>
 <Cell ss:StyleID="s26">
     <Data ss:Type="Number"><?php echo( $row['size'] ) ?></Data>
  </Cell>
</Row>
<?php } ?>
 <Row>
  <Cell ss:StyleID="s31">
     <Data ss:Type="String">Grand Total</Data>
  </Cell>
  <Cell ss:StyleID="s33">
     <Data ss:Type="Number"><?php echo( $total ) ?></Data>
  </Cell>
 </Row>
</Table>
<?php
}

function export_pivot_pivottable( $data ) {
?>
<PivotTable xmlns="urn:schemas-microsoft-com:office:excel">
...
</PivotTable>
<?php
}

function export_pivot_worksheet_options( $data ) {
?>
<WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
...
</WorksheetOptions>
<?php
}

function export_pivotcache( $data ) {
?>
<PivotCache xmlns="urn:schemas-microsoft-com:office:excel">
...
<?php foreach( $data as $row ) { ?>
  <row Col1="<?php echo( $row['account'] ) ?>" Col2="<?php echo( $row['genre'] ) ?>" 
	Col3="<?php echo( $row['images'] ) ?>" Col4="<?php echo( $row['avgrank'] ) ?>" 
	Col5="<?php echo( $row['size'] ) ?>" xmlns="#RowsetSchema"/>
<?php } ?>
 </data>
</PivotCache>
<?php
}
?>

Much of the content of these functions has been removed to save space. The complete working code is available in the excelpivotxml.zip download file (see Download). The first function, export_data, does what the first script did to export the data. To create the pivot table tab, the export_pivot function first creates a worksheet and then adds the table of data, the worksheet options, and the PivotTable XML. The final PivotCache creation happens after the worksheet is finished.

To be honest, I don't know why so much XML code is required for the definition of the pivot table. It's certainly more code than that for a simple table of data. But the nice thing is that you can create the pivot table in Excel, save it as XML, and then use that XML code as a template for your own code.

After you run the PHP script, save the result to a file, then open it in Excel to see something like Figure 6.

Figure 6. The exported pivot table
Screen capture of spreadsheet with list of accounts and (size) totals, plus a grand total

In Figure 6, the pivot table shows each of the accounts, their size totals, and a grand total at the end. The Excel pivot table was correctly created so a PivotTable toolbar displays over the window. Using this toolbar, the user can now add fields and adjust the table.


Conclusion

The XML format Excel uses might be a little complex, but you can use a saved XML file as a template for building the spreadsheets that you export. This approach greatly simplifies the process of generating the files and means that you can have your users build the spreadsheets that they want to see, then you can export them and populate them with the live data on the fly. That's an excellent way to provide new value to your customers while staying in the PHP and XML world of the server.


Download

DescriptionNameSize
Source code for articleexcelpivotxml.zip---

Resources

Learn

Get products and technologies

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into XML on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=XML, Open source
ArticleID=645441
ArticleTitle=Combine advanced spreadsheet export with PHP to create pivot tables
publish-date=04122011