Did you try out creative excuse #432 on your boss this week? Were you tempted to?
Hopefully, you didn't have to. Part 1 of this series helped you understand how to work with Microsoft® Excel® spreadsheets in Java™ technology (see the link to "Read, recycle, and reuse: Reporting made easy with Excel, XML, and Java technologies, Part 1" in Resources). Simply download Apache POI and set up to use it, and soon enough, walking through an Excel spreadsheet is almost as easy as walking through a park. And almost as green.
But reading Excel files is only a start. This installment shows you how to use Apache POI and the XML Object Model (XOM) to store an Excel file in XML objects. Then, you can recycle those objects to write an entirely new Excel spreadsheet and XML file.
The sample application contains an Excel spreadsheet called Employee_List.xls from the fictional Planet Power corporation. The Big Boss has convinced Planet Power's top employees to donate 1% of their salaries to his favorite cause: the Genetically Engineered Enormous Wild Hamster Interplanetary Sanctuary (GEE WHIS). The sample application calculates the amount and creates an XML report to rush to the sanctuary. Meanwhile, the application writes an Excel spreadsheet for the Big Boss.
To follow the examples in this article, download the samples and extract the files to C:\Planet Power. Then, start Eclipse.
To import the Employees2 Eclipse project containing the sample application, perform these steps:
- In Eclipse, right-click in the Package Explorer, and then click Import.
- Expand General, and then select Existing Projects into Workspace.
Click Next (Figure 1).
Figure 1. Bring an existing project into the workspace
- Click Browse beside Select root directory, and then navigate to C:\Planet Power\Employees2.
- Select the Employees2 folder, click OK, and then click
Finish (Figure 2).
Figure 2. Finish importing a project into Eclipse
The Employees2 folder should appear in the Package Explorer pane.
Note: For this project, use the file ExcelXML.java in the Employees2 project under src\default_package.
In Part 1 of this series, your first step is to import Apache POI along with exception and file-handling classes (see Resources for the link to Part 1). In addition, you'll need to add some XML API classes along with classes for working with numbers, as in Listing 1.
Listing 1. Importing classes (ExcelXML.java)
// File and exception handling imports import java.io.FileInputStream; import java.io.FileOutputStream; import java.io.IOException; import java.text.ParseException; // Apache POI imports import org.apache.poi.hssf.usermodel.HSSFCell; import org.apache.poi.hssf.usermodel.HSSFHyperlink; import org.apache.poi.hssf.usermodel.HSSFRow; import org.apache.poi.hssf.usermodel.HSSFSheet; import org.apache.poi.hssf.usermodel.HSSFWorkbook; import org.apache.poi.hssf.usermodel.HSSFDataFormatter; import org.apache.poi.hssf.usermodel.HSSFCellStyle; import org.apache.poi.hssf.usermodel.HSSFDataFormat; // XOM imports import nu.xom.Document; import nu.xom.Elements; import nu.xom.Element; import nu.xom.Attribute; import nu.xom.Serializer; // Imports for later calculations import java.text.NumberFormat; |
After importing the classes,you're ready to begin programming inside
main().
The code in Listing 2 represents an exception-handling structure
for file (IOException) and number conversion
(ParseException) errors.
Listing 2. Set up exception handling (ExcelXML.java)
public class ExcelXML {
public static void main(String[] args) {
try {
// Put file and number handling code here.
// End try
}
catch (IOException e) {
System.out.println("File Input/Output Exception!");
}
catch (ParseException e) {
System.out.println("Number Parse Exception!");
}
// End main method
}
// End class block
}
|
Now you can begin to work with XOM.
XOM, Document, and XML objects
XOM simplifies working with XML by parsing it into objects that represent an XML
document's pieces. The class that represents an entire XML document is
nu.xom.Document. From the Document,
you can add or access other pieces. Some classes for working with XML include:
nu.xom.Buildernu.xom.Documentnu.xom.Elementsnu.xom.Elementnu.xom.Serializer
The main pieces of XML are called elements. An element consists of a pair of tags and the content in between. Here's one example of an element from a sample file called weather_service.xml.
<dawn>07:00</dawn> |
The words dawn and their brackets (<>)
and slash (/) are called tags. The
07:00 is the content.
Elements can contain other elements, text content, or both. A containing element is called a parent, and the elements inside are called child elements or children.
In XOM, elements are represented by nu.xom.Element
objects. A set of elements is a nu.xom.Elements
object.
To find elements, use the one element every well-formed XML document is guaranteed to have: the root element. The root acts as a container for every other element. If a document doesn't have a root, it's not proper XML.
To find the root, use getRootElement() on the
Document object. Getting just one element opens
strategies for exploring the document. Want to work with the root's children?
Get a list of all child elements or child elements with a specific name using
variations of Element.getChildElements().
Want a single element? Get only the first child element with a specific name from
Element.getFirstChildElement().
Does this seem familiar? Iterating through an XML document to find children of children is similar to iterating through an Excel worksheet.
In the previous article, you saw how Apache POI's getStringCellValue()
method retrieved a string value from an Excel HSSFCell.
Similarly, XOM uses Element.getValue() to get string
content from an XML element. However, unlike working with cells, working with
XOM Elements does not require testing the data types. XML is all text.
Excel-XML mashup and XML markup
When traversing spreadsheets and extracting data, you can store the data in any number of objects: Strings. Arrays. Squirrels. This article uses XML elements. (That way, it's easier on the Squirrels.) It also offers some insight into translating between the two formats.
Listing 3 prepares a new HSSFWorkbook
and XML Document to store the workbook's information.
Listing 3. Prepare a workbook and an XML Document (ExcelXML.java)
// First, create a new XML Document object to load the Excel sheet into XML.
// To create an XML Document, first create an element to be its root.
Element reportRoot = new Element("sheet");
// Create a new XML Document object using the root element
Document XMLReport = new Document(reportRoot);
// Set up a FileInputStream to represent the Excel spreadsheet
FileInputStream excelFIS = new FileInputStream("C:\\Planet Power\\Employee_List.xls");
// Create an Excel Workbook object using the FileInputStream created above
HSSFWorkbook excelWB = new HSSFWorkbook(excelFIS);
|
Does something seem backward about Listing 3? The code
creates a new Element before it creates a
Document. Why?
A brand new Document requires one thing: an
Element to be its root element. Remember, the
Document can't represent a well-formed XML document
without having a root element. Therefore, you must pass it a root element when you
create it.
In contrast, you can create elements that don't belong to a root. You can leave them as free-floating elements or append them to a document root or other elements. However, before you start, you should plan the structure of your XML.
XML describes and formats data, mainly through elements and attributes. Attributes are name-value pairs. The name part of an attribute describes the data the attribute holds. The value of the attribute is its data. Writers of HTML code are familiar with attributes such as:
<font size="12">Hello, Planet!</font> |
The entire listing is an element. The tag is <font>.
The attribute is size="12". The name of the
attribute is size. Its value is 12.
The equal sign (=) joins the two. Generally, data is
stored in elements, while metadata is stored in attributes.
So, how should the Employee_List.xls workbook translate to XML?
Markup method 1: mimic workbook structure
One way to structure the XML is to mimic the physical structure of an Excel workbook. Consider Listing 4.
Listing 4. XML structure based on general workbook structure
<sheet>
<row>
<cell>
Thumb
</cell>
<cell>
Green
</cell>
<cell>
Growing Plants
</cell>
<cell>
5:00 AM
</cell>
<cell>
2:00 PM
</cell>
<cell>
150,000
</cell>
</row>
</sheet>
|
With this format, the rows and cells are clear. But what data does each cell hold? Is 5:00 AM the employee's start or end time? What about using the column name as an attribute of the cell?
If maintaining the Excel spreadsheet structure is important, that might work. However, in XOM, without learning how to write XPath queries, there is no easy method to extract a collection of elements based on their attribute values. With the column names stored as attributes, extra code is required to locate a collection of all of the elements from a specific column. Because XOM contains a method to find elements by their names, consider using the Excel columns as element names rather than as attributes.
Markup method 2: mimic the data structure
Instead of basing XML on the presentation of data, consider a structure that describes the data. That's what XML does best, as Listing 5 shows.
Listing 5. XML structure based on describing data
<EmployeeData>
<Employee>
<EmployeeLastName>
Thumb
</EmployeeLastName>
<EmployeeFirstName>
Green
</EmployeeFirstName>
<MainSuperPower>
Growing Plants
</MainSuperPower>
<DailyStartTime>
5:00 AM
</DailyStartTime>
<DailyEndTime>
2:00 PM
</DailyEndTime>
<Salary>
150,000
</Salary>
</Employee>
</EmployeeData>
|
This approach allows you to use the getChildElements("Salary")
method on each Employee element to quickly find
the employees' salaries.
However, using Excel column names for the element names is risky. Excel columns can use characters that are not valid in XML element names, such as spaces. So, be sure to scrub those characters out of potential element names.
To structure the data this way, you must be familiar with the data. It would be
difficult to calculate programmatically that a row in the Excel spreadsheet
should be called an Employee in the XML. It would
also be difficult to calculate a name for the root element
(EmployeeData in the above example).
Also, what happens if the structure changes or the Big Boss wants to recycle the code for other spreadsheets? Spreadsheet rows could list types of hamsters rather than employees. You would then have to adjust what the program calls the rows.
Markup method 3: Excel XML mashup blend
Consider blending the Excel-structured XML with the data-structured markup, as in Listing 6.
Listing 6. Blending workbook structure with data-based markup
<sheet>
<row>
<EmployeeLastName>
Thumb
</EmployeeLastName>
<EmployeeFirstName>
Green
</EmployeeFirstName>
<MainSuperPower>
Growing Plants
</MainSuperPower>
<DailyStartTime>
5:00 AM
</DailyStartTime>
<DailyEndTime>
2:00 PM
</DailyEndTime>
<Salary>
150,000
</Salary>
</row>
</sheet>
|
If the first row in each Excel spreadsheet contains column names, this blend
could be flexible enough to work with several Excel workbooks. Because the
data in the document and rows might not be consistent between spreadsheets,
you can use the generic sheet and row as element names, and they
will still have meaning to programmers reading the code.
Note: As with pure data-centric markup, beware of illegal characters weaseling into XML element names.
This article uses an XML-Excel blended format to store cell values. But what about tracking other cell information, like data type and formatting?
Data type and formatting are metadata. Depending on what metadata must be preserved,
you can use attributes like dataFormat and
dataType.
After deciding how your XML should look, begin storing Excel data in XML elements. Use the same loops to traverse the Excel spreadsheet as in Part 1 of this series (see Resources for the link). Then, add XML. Listing 7 recycles Excel-reading code from the previous article.
Listing 7. Start traversing through the Excel spreadsheet (ExcelXML.java)
// Traverse the workbook's rows and cells.
// Remember, excelWB is the workbook object obtained earlier.
// Just use the first sheet in the book to keep the example simple.
// Pretend this is an outer loop (looping through sheets).
HSSFSheet oneSheet = excelWB.getSheetAt(0);
// Now get the number of rows in the sheet
int rows = oneSheet.getPhysicalNumberOfRows();
// Middle loop: Loop through rows in the sheet
for (int rowNumber = 0; rowNumber < rows; rowNumber++) {
HSSFRow oneRow = oneSheet.getRow(rowNumber);
// Skip empty (null) rows.
if (oneRow == null) {
continue;
}
|
As you iterate through the spreadsheet's rows, create XML elements to represent the rows, as in Listing 8.
Listing 8. Create a row element (ExcelXML.java)
// Create an XML element to represent the row.
Element rowElement = new Element("row");
|
Don't attach the rows to the Document yet, in case you
have empty rows or null rows. If the row isn't empty after adding cells,
you can attach it to the root element at the bottom of the row loop.
Next, start the inner loop to read the cells, as in Listing 9.
Listing 9. Continue to iterate through the Excel cells (ExcelXML.java)
// Get the number of cells in the row
int cells = oneRow.getPhysicalNumberOfCells();
// Inner loop: Loop through each cell in the row
for (int cellNumber = 0; cellNumber < cells; cellNumber++) {
HSSFCell oneCell = oneRow.getCell(cellNumber);
// If the cell is blank, the cell object is null, so don't
// try to use it. It will cause errors.
// Use continue to skip it and just keep going.
if (oneCell == null) {
continue;
}
|
Once you're inside the inner loop, create an XML element to represent the cell,
using the appropriate column name as the element name. In Listing 10,
because element names can't be empty, each name defaults to header. After
the first row, calculate a new name based on the data stored in the first row
elements, which contain the Excel spreadsheet column names.
Listing 10. Create elements for the cells using the column names as element names (ExcelXML.java)
// Set up a string to use just "header" as the element name
// to store the column header cells themselves.
String elementName="header";
// Figure out the column position of the Excel cell.
int cellColumnNumber = oneCell.getColumnIndex();
// If on the first Excel row, don't change the element name from "header,"
// because the first row is headers. Before changing the element name,
// test to make sure you're past the first row, which is the zero row.
if (rowNumber > 0)
// Set the elementName variable equal to the column name
elementName=reportRoot.getFirstChildElement("row").getChild(cellColumnNumber).getValue();
// Remove weird characters and spaces from elementName,
// as they're not allowed in element names.
elementName = elementName.replaceAll("[\\P{ASCII}]","");
elementName = elementName.replaceAll(" ", "");
// Create an XML element to represent the cell, using
// the calculated elementName
Element cellElement = new Element(elementName);
|
A lot happens in Listing 10. In the long line in the middle
beneath the comment, // Set the elementName variable equal to the column name,
elementName is set equal to something. Reading the end
of the line, you can see it's set to the text value inside an element using
getValue(). Which element's value does it use?
Inside the reportRoot, inside the first
row element (reportRoot.getFirstChildElement("row")),
the code locates a child element by its index number using
getChild(cellColumnNumber). You have already stored
the first row of the spreadsheet in the first row element, so the values of the row's
child elements are the column names in the spreadsheet. The index number is the
same as the current cell's column number in the spreadsheet. So the value set for
the elementName is the corresponding column name
from the first row of heading elements.
Next, the code scrubs the elementName string of illegal
characters likely to exist in spreadsheets. First, the replaceAll()
method of String replaces all non-ASCII characters with
empty strings. Then, it replaces all spaces. Both lines use regular expressions. For
information on regular expressions, see Resources.
Finally, the last line of Listing 10 creates the element using the proper column name.
Appending attributes and elements
Just like elements, you can create attributes independently, and they can remain
free-floating until you append them to an element using its
addAttribute() method. Create an attribute, and then
populate it with cell metadata using a getter method such as
getDataFormatString() from the Apache POI's
HSSFCell object, as in Listing 11.
Listing 11. Add Attributes (ExcelXML.java)
// Create an attribute to hold the cell's format.
// May be repeated for any other formatting item of interest.
String attributeValue = oneCell.getCellStyle().getDataFormatString();
Attribute dataFormatAttribute = new Attribute("dataFormat", attributeValue);
// Add the attribute to the cell element
cellElement.addAttribute(dataFormatAttribute);
|
Now, the element exists and has an attribute.
To get data for the element, remember to test the data type of each
HSSFCell, so you know which getter method to use.
Because you're testing for the data type anyway, you can also create an attribute
that stores the cell's data type information.
Adding data to the element and appending the element as a child of the row is straightforward when working with string values, as in Listing 12.
Listing 12. Add Attributes, append the cell text to the element, and append the element into the row (ExcelXML.java)
switch (oneCell.getCellType()) {
case HSSFCell.CELL_TYPE_STRING:
// If the cell value is string, create an attribute
// for the cellElement to state the data type is a string
Attribute strTypeAttribute = new Attribute("dataType", "String");
cellElement.addAttribute(strTypeAttribute);
// Append the cell text into the element
cellElement.appendChild(oneCell.getStringCellValue());
// Append the cell element into the row
rowElement.appendChild(cellElement);
break;
|
Repeat the case section for each data type. But numeric
data can be tricky. Part 1 of this article series
pointed out that extracted Excel data is not the same as spreadsheet data (see the link to Part 1 in Resources). It is raw
data. Certain numbers, like dates, look awfully funny as raw data and would not be
the correct values if stored without formatting. You'll want to properly format
numeric data before it goes into an element. You can use the Apache POI class
HSSFDataFormatter and its method
formatCellValue() to accomplish this. See
Listing 13.
Listing 13. Add Attributes, format numeric data, and append the cell element (ExcelXML.java)
case HSSFCell.CELL_TYPE_NUMERIC:
// If the cell value is a number, create an attribute
// for the cellElement to state the data type is numeric
Attribute cellAttribute = new Attribute("dataType", "Numeric");
// Add the attribute to the cell
cellElement.addAttribute(cellAttribute);
// Apply the formatting from the cells to the raw data
// to get the right format in the XML. First, create an
// HSSFDataFormatter object.
HSSFDataFormatter dataFormatter = new HSSFDataFormatter();
// Then use the HSSFDataFormatter to return a formatted string
// from the cell rather than a raw numeric value:
String cellFormatted = dataFormatter.formatCellValue(oneCell);
//Append the formatted data into the element
cellElement.appendChild(cellFormatted);
// Append the cell element into the row
rowElement.appendChild(cellElement);
break;
|
Repeat the case section for each possible cell
type. Check out the full example in ExcelXML.java.
After storing the cell data, close the inner loop. Before closing the middle loop,
which represents rows, test whether the row element
is empty. If it's not, append it to the root element. Then, close the middle loop,
as in Listing 14.
Listing 14. Append the row element into the root element (ExcelXML.java)
// End inner loop
}
// Append the row element into the root
// if the row isn't empty.
if (rowElement.getChildCount() > 0) {
reportRoot.appendChild(rowElement);
}
// End middle loop
}
|
Now your Excel file is a complete XML document.
When using XML to perform calculations, such as figuring out 1% of salaries, you must convert between strings and numbers. Listing 15 provides an example.
Listing 15. Calculate 1% of the salary and store it in a donation element (ExcelXML.java)
// To get employees' salaries, iterate through row elements and get a collection of rows
Elements rowElements = reportRoot.getChildElements("row");
// For each row element
for (int i = 0; i < rowElements.size(); i++) {
// Get the salary element,
// Calculate 1% of it and store it in a donation element.
// Unless it's the first row (0), which needs a header element.
if (i==0) {
Element donationElement = new Element("header");
donationElement.appendChild("Donation");
Attribute dataType = new Attribute("dataType","String");
donationElement.addAttribute(dataType);
Attribute dataFormat = new Attribute("dataFormat","General");
donationElement.addAttribute(dataFormat);
// Append the donation element to the row element.
rowElements.get(i).appendChild(donationElement);
}
// If the row is not the first row, put the donation in the element.
else {
Element donationElement = new Element("Donation");
Attribute dataType = new Attribute("dataType","Numeric");
donationElement.addAttribute(dataType);
// The dataFormat of the donation should be the same
// number format as salary, which looking at the XML file tells
// us is "#,##0".
Attribute dataFormat = new Attribute("dataFormat","#,##0");
donationElement.addAttribute(dataFormat);
// Get the salary element and its value
Element salaryElement = rowElements.get(i).getFirstChildElement("Salary");
String salaryString = salaryElement.getValue();
// Calculate 1% of the salary. Salary is a string
// with commas, so it
// must be converted for the calculation.
// Get a java.text.NumberFormat object for converting string to a double
NumberFormat numberFormat = NumberFormat.getInstance();
// Use numberFormat.parse() to convert string to double.
// Throws ParseException
Number salaryNumber = numberFormat.parse(salaryString);
// Use Number.doubleValue() method on salaryNumber to
// return a double to use in the calculation.
// Perform the calculation to figure out 1%.
double donationAmount = salaryNumber.doubleValue()*.01;
// Append the value of the donation into the donationElement.
// donationAmount is a double and must be converted to a string.
donationElement.appendChild(Double.toString(donationAmount));
// Append the donation element to the row element
rowElements.get(i).appendChild(donationElement);
//End else
}
// End for loop
}
|
Now you have stored an extra Donation element for
each row. You have finished building your XML Document object.
XOM makes it easy to write your Document object to an
XML file. Use nu.xom.Serailizer.write(), as in
Listing 16.
Listing 16. Writing XML to Excel (ExcelXML.java)
// Print out the XML version of the spreadsheet to see it in the console
System.out.println(XMLReport.toXML());
// To save the XML into a file for GEE WHIS, start with a FileOutputStream
// to represent the file to write, C:\Planet Power\GEE_WHIS.xml.
FileOutputStream hamsterFile = new FileOutputStream("C:\\Planet Power\\GEE_WHIS.xml");
// Create a serializer to handle writing the XML
Serializer saveTheHamsters = new Serializer(hamsterFile);
// Set child element indent level to 5 spaces to make it pretty
saveTheHamsters.setIndent(5);
// Write the XML to the file C:\Planet Power\GEE_WHIS.xml
saveTheHamsters.write(XMLReport);
|
The hamsters will love their new donation report. XML is their native language.
To write the XML to an Excel spreadsheet, iterate through the XML and set cell values and formatting. Listing 17 sets up and starts looping through rows.
Listing 17. Set up to write XML to Excel (ExcelXML.java)
// Create a new Excel workbook and iterate through the XML
// to fill the cells.
// Create an Excel workbook object
HSSFWorkbook donationWorkbook = new HSSFWorkbook();
// Next, create a sheet for the workbook.
HSSFSheet donationSheet = donationWorkbook.createSheet();
// Iterate through the row elements and then cell elements
// Outer loop: There was already an elements collection of all row elements
// created earlier. It's called rowElements.
// For each row element in rowElements:
for (int j = 0; j < rowElements.size(); j++) {
// Create a row in the workbook for each row element (j)
HSSFRow createdRow = donationSheet.createRow(j);
// Get the cell elements from that row element and add them to the workbook.
Elements cellElements = rowElements.get(j).getChildElements();
|
Like looping through rows, looping through cell creation is straightforward. The harder part is formatting the cells.
An HSSFCellStyle object represents style options for
a cell, like font, border, and numeric formatting (including date and time formats).
However, styles are per workbook, not per cell. The HSSFCellStyle
object represents a named style existing in the workbook that can be applied to a
cell. These styles are groupings of style options, like named styles in Microsoft
Office Word. Similarly, an HSSFDataFormat is created
per workbook but represents only numeric formatting, like date and time formats.
To style a cell, create a new HSSFCellStyle for the workbook, or
use an existing HSSFCellStyle. Then, apply it to the
cell using HSSFCell.setCellStyle(). To set a numeric
format for a cell, set the numeric format of the HSSFCellStyle,
not the cell. Then, apply the HSSFCellStyle to the cell.
The HSSFDataFormat objects are indexed by number in the
workbook. To tell an HSSFCellStyle which
HSSFDataFormat to use, you need the index number of
the HSSFDataFormat. This is a numeric short, not an
HSSFDataFormat object.
Fortunately, the HSSFDataFormat object has a method called
getFormat(). When passed a string representing a desired
format, it returns the index number of the HSSFDataFormat
that matches the string. The index is returned as a numeric short. If there is no
match, it creates a new HSSFDataFormat and returns its
index. You can use that index to apply the formatting to the cell style and apply
the cell style to the cell, as in Listing 18.
Listing 18. Loop through cell elements and set up proper numeric formatting before inserting data
// Middle loop: Loop through the cell elements.
for (int k = 0; k < cellElements.size(); k++) {
// Create cells and cell styles. Use the row's
// createCell (int column) method.
// The column index is the same as the cell element index, which is k.
HSSFCell createdCell = createdRow.createCell(k);
// To set the cell data format, retrieve it from the attribute
// where it was stored: the dataFormat attribute. Store it in a string.
String dataFormatString = cellElements.get(k).getAttributeValue("dataFormat");
// Create an HSSFCellStyle using the createCellStyle() method of the workbook.
HSSFCellStyle currentCellStyle = donationWorkbook.createCellStyle();
// Create an HSSFDataFormat object from the workbook's method
HSSFDataFormat currentDataFormat = donationWorkbook.createDataFormat();
// Get the index of the HSSFDataFormat to use. The index of the numeric format
// matching the dataFormatString is returned by getFormat(dataFormatString).
short dataFormatIndex = currentDataFormat.getFormat(dataFormatString);
// Next, use the retrieved index to set the HSSFCellStyle object's DataFormat.
currentCellStyle.setDataFormat(dataFormatIndex);
// Then apply the HSSFCellStyle to the created cell.
createdCell.setCellStyle(currentCellStyle);
|
After setting the cell's style, use setCellValue() to put
most data types in the cell.
However, numeric data needs special handling. To store numbers as numbers rather than text, convert them to doubles first. Do not convert dates to doubles, or they will be incorrect. Test the data format of numeric data to determine whether it's a date (see Listing 19).
Listing 19. Insert cell data, converting to doubles for certain numeric data
// Set cell value and types depending on the dataType attribute
if (cellElements.get(k).getAttributeValue("dataType")=="String") {
createdCell.setCellType(HSSFCell.CELL_TYPE_STRING);
createdCell.setCellValue(cellElements.get(k).getValue());
}
if (cellElements.get(k).getAttributeValue("dataType")=="Numeric") {
createdCell.setCellType(HSSFCell.CELL_TYPE_NUMERIC);
// In this spreadsheet, number styles are times, dates,
// or salaries. To store as a number and not as text,
// salaries should be converted to doubles first.
// Dates and times should not be converted to doubles first,
// or you'll be inputting the wrong date or time value.
// Dates and times can be entered as Java Date objects.
if (cellElements.get(k).getAttributeValue("dataFormat").contains("#")) {
// If formatting contains a pound sign, it's not a date.
// Use a Java NumberFormat to format the numeric type cell as a double,
// because like before, the element has commas in it.
NumberFormat numberFormat = NumberFormat.getInstance();
Number cellValueNumber = numberFormat.parse(cellElements.get(k).getValue());
createdCell.setCellValue(cellValueNumber.doubleValue());
// Add a hyperlink to the fictional GEE WHIS Web site just
// to demonstrate that you can.
HSSFHyperlink hyperlink = new HSSFHyperlink(HSSFHyperlink.LINK_URL);
hyperlink.setAddress("http://www.ibm.com/developerworks/");
createdCell.setHyperlink(hyperlink);
}
else {
// if it's a date, don't convert to double
createdCell.setCellValue(cellElements.get(k).getValue());
}
}
// Handle formula and error type cells. See ExcelXML.java for the full example.
//End middle (cell) for loop
}
// End outer (row) for loop
}
|
Formatting is also necessary for Excel functions that create dates, such as
TODAY() and NOW().
See Listing 20.
Listing 20. Use Excel functions with proper formatting
// Demonstrate functions:
// Add the TODAY() and NOW() functions at bottom of the Excel report
// to say when the workbook was opened.
// Find the last row and increment by two to skip a row
int lastRowIndex = donationSheet.getLastRowNum()+2;
// Create a row and three cells to hold the information.
HSSFRow lastRow = donationSheet.createRow(lastRowIndex);
HSSFCell notationCell = lastRow.createCell(0);
HSSFCell reportDateCell = lastRow.createCell(1);
HSSFCell reportTimeCell = lastRow.createCell(2);
// Set a regular string value in one cell
notationCell.setCellValue("Time:");
// Setting formula values uses setCellFormula()
reportDateCell.setCellFormula("TODAY()");
reportTimeCell.setCellFormula("NOW()");
// Create HSSFCellStyle objects for the date and time cells.
// Use the createCellStyle() method of the workbook.
HSSFCellStyle dateCellStyle = donationWorkbook.createCellStyle();
HSSFCellStyle timeCellStyle = donationWorkbook.createCellStyle();
// Get a HSSFDataFormat object to set the time and date formats for the cell styles
HSSFDataFormat dataFormat = donationWorkbook.createDataFormat();
// Set the cell styles to the right format by using the index numbers of
// the desired formats retrieved from the getFormat() function of the HSSFDataFormat.
dateCellStyle.setDataFormat(dataFormat.getFormat("m/dd/yy"));
timeCellStyle.setDataFormat(dataFormat.getFormat("h:mm AM/PM"));
// Set the date and time cells to the appropriate HSSFCellStyles.
reportDateCell.setCellStyle(dateCellStyle);
reportTimeCell.setCellStyle(timeCellStyle);
|
Finally, after finishing the desired workbook object, write it to a file using the
workbook's write() method (Listing 21).
Listing 21. Write the Excel workbook to a file
// Write out the workbook to a file. First, // you need some sort of OutputStream to represent the file. String filePathString = "C:\\Planet Power\\Employee_Donations.xls"; FileOutputStream donationStream = new FileOutputStream(filePathString); donationWorkbook.write(donationStream); |
You have now written an Excel spreadsheet calculating donations to GEE WHIS.
The reports are done. Along with reading Excel and creating XML, Java programmers can now write XML back to Excel files. After understanding the basics of converting between the two formats, you might feel yourself globally warming to the whole idea of reporting.
The Big Boss is happy, and you've done your part toward helping Planet Power's environmental superheroes save Genetically Engineered Enormous Wild Hamsters. Everyone's happy. Reporting really can be good for the environment.
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample Excel spreadsheet and Java code | Java-Excel-XML-Planet-Power2.zip | 17KB | HTTP |
Information about download methods
Learn
- Read, recycle, and reuse: Reporting made easy with Excel, XML, and Java technologies, Part 1: Read Excel files and write them to new files using Java and XML (Shaene M. Siders, developerWorks, February 2010): Every company faces the challenge of extracting business data. Learn to extract data from Excel and convert it between Excel and XML using Java technology in Part 1 of this series by the author.
- JavaDoc documentation for Apache POI: Browse the Apache POI documentation.
- XML and Java technologies: Data binding, Part 2: Performance (Dennis Sosnoski, developerWorks, January 2003): Take XML data binding frameworks out for a test drive.
- Regular expressions: Learn more about regular expressions using regular expressions in Java and the ASCII POSIX bracket expression used in this article.
- The Busy Developers' Guide to HSSF and XSSF Features: Jump start your POI skills with this guide, available on Apache's site.
- Get started with XPath (Bertrand Portier, developerWorks, May 2004): Explore XPath and learn about its syntax and semantics, XPath location paths, XPath expressions, XPath functions, and how XPath relates to XSLT in this introductory XPath tutorial.
- XPath: Explore XPath with W3Schools.
- XML area on developerWorks: Get the resources you need to advance your skills in the XML arena.
- IBM XML certification: Find out how you can become an IBM-Certified Developer in XML and related technologies.
- XML technical library: See the developerWorks XML Zone for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks.
- developerWorks technical events and webcasts: Stay current with technology in these sessions.
- developerWorks on Twitter: Join today to follow developerWorks's tweets.
- developerWorks
podcasts: Listen to interesting interviews and discussions for software developers.
Get products and technologies
- Eclipse Classic: Download Eclipse. This article uses version 3.5.1.
- Apache POI: Learn more about and download version 3.6 of the Apache POI, which is the latest stable release.
- Complete zip for XOM: Learn more about and download Elliotte Rusty Harold's XML API.
- Examine an elegant code snippet: Detect (and remove) Excel rows that are blank (but not null).
- IBM product evaluation versions: Download or explore the online trials in the IBM SOA Sandbox and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
Discuss
- XML zone discussion forums: Participate in any of several XML-related discussions.
- developerWorks blogs: Check out these blogs and get involved.

Shaene Siders is a consultant and writer for technology, training, and entertainment. She started a writing and Web development company in 1996 and has been an IBM Rational Certified Instructor for ClearCase and ClearQuest since 2003. She is the author and producer of the training courseware series Java for Mad Scientists, and she is also an IBM Rational Certified Instructor for Build Forge, Rational Functional Tester, Rational Performance Tester, and Rational Manual Tester. You can reach Shaene at consult@DragonUnderGlass.com.



