ODFDOM for Java: Simplifying programmatic control of documents and their data, Part 3

The third of a three-part series, this article shows how to use the Open Document Format (ODF) Document Object Model (DOM) for Java™, ODFDOM, to create text, spreadsheet, and presentation documents.


Li Wei (weili@cn.ibm.com), Software Engineer, IBM

Li Wei is a Software Engineer based in the IBM China Development Laboratory, where he works in the Emerging Technology Institute department. He is involved in projects concerning standards such as ACORD, NAVA, and ODF and is a member of the ODF Toolkit development community. You can reach him at weili@cn.ibm.com.



27 April 2010


Using ODFDOM to create text, spreadsheet, and presentation graphic documents

First, let's briefly describe the ODF document structure. ODF documents are stored in a ZIP package that includes content.xml, styles.xml, and several other files.

The content.xml file stores the contents of the document, and styles.xml stores the document style information. The content.xml file also contains some style information: automatic styles, which the editing application creates for formatting, such as fonts and colors, that is applied directly to individual elements.
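Because an ODF document is an ordinary ZIP package, its parts can be inspected with the JDK alone. The following is a minimal sketch, not part of ODFDOM: the class name OdfPackageListing and the placeholder entry contents are assumptions made for illustration, and the in-memory package it builds is not a valid ODF file. It only shows how the entry names of such a package can be listed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not an ODFDOM class.
class OdfPackageListing {

    /** Builds a tiny in-memory ZIP that mimics the layout of an ODF package. */
    static byte[] buildSamplePackage() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : new String[] {"mimetype", "content.xml", "styles.xml"}) {
                zip.putNextEntry(new ZipEntry(name));
                // Placeholder bytes only; a real package holds XML here.
                zip.write(("placeholder for " + name).getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns the entry names of a ZIP package in the order they are stored. */
    static List<String> listEntries(byte[] packageBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(packageBytes))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(listEntries(buildSamplePackage()));
    }
}
```

To look inside a real document, the same listEntries method could be given the bytes of an .odt, .ods, or .odp file instead of the sample package.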

In general, there are four steps to manipulate an ODF document:

  1. Load an existing ODF document or create a new one.
  2. Insert content into the ODF document.
  3. Set the style for different parts of the information.
  4. Save the document.

ODFDOM now provides some convenient APIs with which you can perform these four major types of document operations.


Creating a text file

In this section, we demonstrate a simple use case: Read an XML file and output to an ODF document.

For a text file, the content.xml hierarchy is as follows (see listing 1):

  • The first element is <office:body>, which is a subelement of the document root.
  • The next layer is the <office:text> element, which holds all the content elements to be saved in the output document.
  • Before <office:body>, <office:automatic-styles> is another subelement of the document root, used to store style information for individual elements.

Listing 1. Structure of the text file content.xml
<office:document-content>
    <office:automatic-styles/>
    <office:body>
        <office:text/>
    </office:body>
</office:document-content>
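
The hierarchy described above can also be built with the plain JDK DOM API, which makes the nesting explicit. This is a sketch under assumptions: the class name TextContentSkeleton is hypothetical, the code does not use ODFDOM, and only the office namespace URI is taken from the ODF specification.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration; not an ODFDOM class.
class TextContentSkeleton {
    // Namespace URI bound to the "office" prefix in ODF documents.
    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";

    /** Builds document-content > (automatic-styles, body > text). */
    static Document build() {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElementNS(OFFICE_NS, "office:document-content");
        doc.appendChild(root);
        // Automatic styles come before the body.
        root.appendChild(doc.createElementNS(OFFICE_NS, "office:automatic-styles"));
        Element body = doc.createElementNS(OFFICE_NS, "office:body");
        root.appendChild(body);
        // All text content would be appended under office:text.
        body.appendChild(doc.createElementNS(OFFICE_NS, "office:text"));
        return doc;
    }

    public static void main(String[] args) {
        Element root = build().getDocumentElement();
        System.out.println(root.getFirstChild().getNodeName());
        System.out.println(root.getLastChild().getFirstChild().getNodeName());
    }
}
```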

Here, <office:styles> and <office:automatic-styles> both define styles, but there are some differences. <office:styles> is usually used to define commonly used styles. We generally apply a style defined in <office:styles> to an element by setting the style name on that element. From the point of view of an ODF editor, a style defined in <office:styles> is a named set of property values defined by the user.

On the other hand, a style defined in <office:automatic-styles> holds formatting properties that were set directly on a particular object. From the point of view of an ODF editor, it results from editing certain attributes of an object rather than applying a named style.

ODFDOM provides objects to represent the ODF package of the various documents:

  • OdfTextDocument textDocument: Corresponds to a text file (odt) object.
  • OdfFileDom contentDom: Corresponds to the XML document object of content.xml.
  • OdfFileDom stylesDom: Corresponds to the XML document object of styles.xml.

A text file object can use the API to get the content object (content.xml) and style object (styles.xml):

  • OdfFileDom contentDom = textDocument.getContentDom()
  • OdfFileDom stylesDom = textDocument.getStylesDom()

ODFDOM also provides a number of objects to represent the various elements of content and style:

  • OdfOfficeAutomaticStyles contentAutoStyles: Corresponds to the <office:automatic-styles> element in content.xml.
  • OdfOfficeStyles stylesOfficeStyles: Corresponds to the <office:styles> element in styles.xml.
  • OdfOfficeText officeText: Corresponds to the <office:text> element in content.xml.

With the ODFDOM API it is also easy to get the content element object and text element object:

  • OdfOfficeAutomaticStyles autoStyles = textDocument.getContentDom().getAutomaticStyles();
  • OdfOfficeStyles styles = textDocument.getDocumentStyles();
  • OdfOfficeText text = (OdfOfficeText) textDocument.getContentRoot();

ODFDOM provides an API to manipulate ODF documents at the file level that can be used, for example, to create a text file, load an existing text file, and save a file:

  • OdfTextDocument odtDoc = OdfTextDocument.newTextDocument(); // Create a text file
  • OdfTextDocument odtDoc = (OdfTextDocument) OdfDocument.loadDocument("text.odt"); // Load an existing text file
  • odtDoc.save("text.odt"); // Save the file

Certainly, a text file cannot do without rich style attributes, such as text fonts, paragraph layout, and bullets. Therefore, ODFDOM has a method to deal with these style attributes:

  • OdfStyle style = odtDoc.getDocumentStyles().getStyle("myStyle",
    OdfStyleFamily.Paragraph);
    // Obtain a user-defined style object from a text file
  • style.setProperty(OdfStyleTextProperties.FontWeight, "bold");
    // Set a specific property value on the style

The code for setting the text style attributes is shown in listing 2.

Listing 2. Code for setting text style attributes
// para: an OdfTextParagraph already obtained from the document
para.setProperty(OdfStyleTextProperties.FontSize, "17pt");
para.setProperty(OdfStyleParagraphProperties.TextAlign, "left");
para.setProperty(OdfStyleChartProperties.DataLabelNumber, "value");

Code that sets property values in this way applies them to one specific document element, and the style created as a result is an automatic style stored under <office:automatic-styles>.

So far we have introduced the most important APIs for working with a text file. In the following example, we look at an ODFDOM application that reads data from an XML file and then uses the ODFDOM API to save that data to an ODF text document in a specified format (see listings 3-7).

Listing 3. XML data file book.xml
<book>
    <title>The ODFDOM tutorial</title>
    <author>IBM ODF team</author>
    <content>introduce ODFDOM usage</content>
</book>

Listing 4. Using the Java DOM API to parse and read the XML document
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

DocumentBuilder builder = null;
Document inputDocument = null;
XPath inputXPath = null;
try {
    inputXPath = XPathFactory.newInstance().newXPath();
    builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    inputDocument = builder.parse("book.xml");
} catch (IOException e) {
    System.err.println("Unable to read input file.");
    System.err.println(e.getMessage());
} catch (Exception e) {
    System.err.println("Unable to parse input file.");
    System.err.println(e.getMessage());
}
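
The parsing step can be tried on its own, without a book.xml file on disk. The following sketch uses only the JDK; the class name BookReader and the inline XML string are assumptions made so that the example is self-contained. It also shows that an XPath expression is evaluated relative to the context node that is passed in.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not an ODFDOM class.
class BookReader {
    // Inline copy of the sample data so that no file is needed.
    static final String BOOK_XML =
        "<book>"
        + "<title>The ODFDOM tutorial</title>"
        + "<author>IBM ODF team</author>"
        + "<content>introduce ODFDOM usage</content>"
        + "</book>";

    /** Evaluates an XPath expression relative to the first <book> element. */
    static String read(String relativePath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(BOOK_XML)));
            Node book = doc.getElementsByTagName("book").item(0);
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(relativePath, book).trim();
        } catch (Exception e) {
            throw new IllegalStateException("Unable to read book data", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(read("title"));
        System.out.println(read("author"));
        System.out.println(read("content"));
    }
}
```

With the <book> element as the context node, the path "title" selects its child directly, whereas "book/title" would look for a nested <book> child and return an empty string.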

Listing 5. Create a text ODF object
// Declared outside the try block so that later listings can use them
OdfTextDocument odtDocument = null;
OdfFileDom contentDom = null;
OdfFileDom stylesDom = null;
OdfOfficeAutomaticStyles contentAutoStyles = null;
OdfOfficeStyles stylesOfficeStyles = null;
OdfOfficeText officeText = null;
try {
    odtDocument = OdfTextDocument.newTextDocument();
    contentDom = odtDocument.getContentDom();
    stylesDom = odtDocument.getStylesDom();
    contentAutoStyles = contentDom.getOrCreateAutomaticStyles();
    stylesOfficeStyles = odtDocument.getOrCreateDocumentStyles();
    officeText = (OdfOfficeText) odtDocument.getContentRoot();
} catch (Exception e) {
    System.err.println("Unable to create output file.");
    System.err.println(e.getMessage());
    odtDocument = null;
}

The new text file is still empty, so we must insert content elements into its content DOM tree and add the style elements to the content and styles DOM trees (see listings 6 and 7).

Listing 6. Read the XML content into the ODF text file
NodeList booklist = inputDocument.getElementsByTagName("book");
Node book = booklist.item(0);
// XPath expressions are evaluated relative to the <book> node
String title = inputXPath.evaluate("title", book);
String author = inputXPath.evaluate("author", book);
String content = inputXPath.evaluate("content", book);
OdfTextHeading heading = (OdfTextHeading) officeText.newTextHElement(title);
OdfTextParagraph para = (OdfTextParagraph) officeText.newTextPElement();
para.addContent(content);

Listing 7. Style applied to text content
// Headings and paragraphs reference styles of the paragraph family
OdfStyle style1 = odtDocument.getOrCreateDocumentStyles()
        .newStyle("hStyle", OdfStyleFamily.Paragraph);
style1.setProperty(OdfStyleTextProperties.FontWeight, "bold");
style1.setProperty(OdfStyleTextProperties.FontStyle, "italic");
style1.setProperty(OdfStyleTextProperties.FontSize, "16pt");
heading.setStyleName("hStyle");

OdfStyle style2 = odtDocument.getOrCreateDocumentStyles()
        .newStyle("pStyle", OdfStyleFamily.Paragraph);
style2.setProperty(OdfStyleTextProperties.FontStyle, "italic");
style2.setProperty(OdfStyleTextProperties.FontSize, "10pt");
para.setStyleName("pStyle");

Finally, we save the ODT text file:

odtDocument.save("text.odt");

You can open this new file in OpenOffice or IBM® Lotus® Symphony™ to see the result that the ODFDOM API calls produced.


Creating a spreadsheet file

First, let's look at the main structure of content.xml for the spreadsheet document (see listing 8).

Listing 8. Structure of the spreadsheet content.xml
<office:document-content>
    <office:automatic-styles/>
    <office:body>
        <office:spreadsheet>
            <table:table/>
        </office:spreadsheet>
    </office:body>
</office:document-content>

In a spreadsheet document, the main elements are these:

  • <table:table>, which is the root element of the table; all elements of the table contents are its subelements
  • <table:table-column>, which specifies the column widths of a spreadsheet and the default style definitions
  • <table:table-row>, which represents a table row and is composed of multiple <table:table-cell> elements

For the <table:table-cell> element, two attributes, office:value-type and office:value, usually must be specified, along with a <text:p> subelement.

ODFDOM has a number of objects related to the table feature:

Object                    Equivalent
OdfSpreadsheetDocument    content.xml
OdfTable                  <table:table>
OdfTableColumn            <table:table-column>
OdfTableRow               <table:table-row>
OdfTableCell              <table:table-cell>

where:

  • OdfTable object represents a table of the spreadsheet.
  • OdfTableColumn is used to specify a column in a spreadsheet, and the number of columns it stands for is set by the value of the TableNumberColumnsRepeatedAttribute property.
  • OdfTableRow is used to represent the table row. One row is made up of one or more of the OdfTableCell objects.
  • OdfTableCell is the unit constituting a table element, and each cell object can be used to place values, paragraphs, and other text content; usually it's necessary to set these three property values:
    • OfficeValueAttribute
    • OfficeValueTypeAttribute
    • TextContent

OfficeValueTypeAttribute is used to specify the type of stored data (characters, numbers, date, time, formulas, and so on); OfficeValueAttribute is used to store values; and TextContent is used to store the value seen by a user.
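
To make the difference between the stored value and the displayed text concrete, the following sketch builds the markup of a single cell with the plain JDK DOM API. It does not use ODFDOM; the class name CellMarkup and the method floatCell are hypothetical, and only the namespace URIs and attribute names come from the ODF specification.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration; not an ODFDOM class.
class CellMarkup {
    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
    static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Builds a float cell whose stored value and displayed text may differ. */
    static Element floatCell(double value, String displayText) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element cell = doc.createElementNS(TABLE_NS, "table:table-cell");
        doc.appendChild(cell);
        // The type of the stored data.
        cell.setAttributeNS(OFFICE_NS, "office:value-type", "float");
        // The value that calculations use.
        cell.setAttributeNS(OFFICE_NS, "office:value", Double.toString(value));
        // The text that the user sees in the cell.
        Element paragraph = doc.createElementNS(TEXT_NS, "text:p");
        paragraph.setTextContent(displayText);
        cell.appendChild(paragraph);
        return cell;
    }

    public static void main(String[] args) {
        Element cell = floatCell(3.14159, "3.14");
        System.out.println(cell.getAttributeNS(OFFICE_NS, "value"));
        System.out.println(cell.getTextContent());
    }
}
```

Here the cell stores 3.14159 for calculations while showing the rounded text 3.14, which mirrors the roles of OfficeValueAttribute and TextContent described above.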

Now we create a table with three rows and four columns to illustrate how to apply some ODFDOM API operations on the table feature (see listing 9):

  1. Create a spreadsheet object OdfSpreadsheetDocument.
  2. Get the root of OdfOfficeSpreadsheet.
  3. Create an OdfTable object.
  4. Create a <table:table-column> element, and set the attribute table:number-columns-repeated to the number of columns in this table.
  5. Use a loop to create a <table:table-row> element for each row.
  6. Use a loop to create a <table:table-cell> element for each cell in a row, and fill in the values.
  7. Save the spreadsheet.

Listing 9. Building a spreadsheet
int[][] data = {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}};
OdfSpreadsheetDocument odfdoc = OdfSpreadsheetDocument.newSpreadsheetDocument();
OdfOfficeSpreadsheet spreadsheet = odfdoc.getContentRoot();
OdfTable table = (OdfTable) spreadsheet.newTableTableElement();
OdfTableColumn column = (OdfTableColumn) table.newTableTableColumnElement();
// One column definition, repeated for all four columns
column.setTableNumberColumnsRepeatedAttribute(new Integer(4));
for (int i = 0; i < 3; i++) {
    OdfTableRow row = (OdfTableRow) table.newTableTableRowElement();
    // row.setStyleName("ro1");
    for (int j = 0; j < 4; j++) {
        OdfTableCell cell = (OdfTableCell) row.newTableTableCellElement();
        cell.setOfficeValueAttribute(new Double(data[i][j]));
        cell.setOfficeValueTypeAttribute("float");
        cell.setTextContent(new Double(data[i][j]).toString());
    }
}
odfdoc.save(ResourceUtilities.createTestResource("table3R4C.ods"));
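
For readers who want to see the markup that such loops produce, the same row and cell structure can be sketched with the plain JDK DOM API. This is an illustration under assumptions, not the ODFDOM implementation: the class name TableMarkup is hypothetical, the input array is assumed to be non-empty and rectangular, and only the namespace URIs and element names come from the ODF specification.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration; not an ODFDOM class.
class TableMarkup {
    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
    static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Builds a table:table element; data must be non-empty and rectangular. */
    static Element buildTable(int[][] data) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element table = doc.createElementNS(TABLE_NS, "table:table");
        doc.appendChild(table);
        // A single column definition that is repeated for every column.
        Element column = doc.createElementNS(TABLE_NS, "table:table-column");
        column.setAttributeNS(TABLE_NS, "table:number-columns-repeated",
                String.valueOf(data[0].length));
        table.appendChild(column);
        for (int[] rowData : data) {
            Element row = doc.createElementNS(TABLE_NS, "table:table-row");
            table.appendChild(row);
            for (int value : rowData) {
                Element cell = doc.createElementNS(TABLE_NS, "table:table-cell");
                cell.setAttributeNS(OFFICE_NS, "office:value-type", "float");
                cell.setAttributeNS(OFFICE_NS, "office:value", String.valueOf(value));
                Element paragraph = doc.createElementNS(TEXT_NS, "text:p");
                paragraph.setTextContent(String.valueOf(value));
                cell.appendChild(paragraph);
                row.appendChild(cell);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        Element table = buildTable(new int[][] {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}});
        System.out.println(table.getElementsByTagNameNS(TABLE_NS, "table-row").getLength());
        System.out.println(table.getElementsByTagNameNS(TABLE_NS, "table-cell").getLength());
    }
}
```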

Tables often contain date and time values, and the representation of these values differs between countries and regions. ODFDOM provides not only the means to set date and time formats, but also dedicated style classes for many kinds of formats.

For example, OdfNumberDateStyle is used to handle date formats, OdfNumberTimeStyle is used to handle time formats, and OdfNumberStyle is used to handle number formats.

These style elements are to be placed under the <office:automatic-styles> element (see listing 10) as follows:

  1. Obtain the OdfOfficeAutomaticStyles object with getStylesDom().getAutomaticStyles().
  2. Create objects of the corresponding style classes.
  3. Set the specific formats of the style objects.

Listing 10. Set the number style
OdfOfficeAutomaticStyles autoStyles = odfdoc.getStylesDom().getAutomaticStyles();
OdfNumberDateStyle dateStyle = (OdfNumberDateStyle)
        autoStyles.newNumberDateStyleElement("numberDateStyle");
dateStyle.buildFromFormat("yyyy-MM-dd");
OdfNumberTimeStyle timeStyle = (OdfNumberTimeStyle)
        autoStyles.newNumberTimeStyleElement("numberTimeStyle");
timeStyle.buildFromFormat("hh:mm:ss");
OdfNumberStyle numberStyle = (OdfNumberStyle)
        autoStyles.newNumberNumberStyleElement("numberStyle");
numberStyle.buildFromFormat("#0.00");
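
The format strings in this listing look like the pattern letters used by the JDK formatting classes. Assuming that buildFromFormat reads them in a similar way, which this article does not state, the following sketch previews what such patterns produce when the JDK applies them. The class name PatternPreview is hypothetical, and the code does not call ODFDOM.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper for illustration; it shows JDK behavior, not ODFDOM behavior.
class PatternPreview {

    /** Four-digit year, two-digit month, two-digit day. */
    static String date(LocalDate value) {
        return DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.ROOT).format(value);
    }

    /** In the JDK, "hh" is the hour on a 12-hour clock. */
    static String time(LocalTime value) {
        return DateTimeFormatter.ofPattern("hh:mm:ss", Locale.ROOT).format(value);
    }

    /** At least one integer digit and exactly two decimal places. */
    static String number(double value) {
        // Locale.ROOT fixes the decimal separator to a period.
        return new DecimalFormat("#0.00", new DecimalFormatSymbols(Locale.ROOT)).format(value);
    }

    public static void main(String[] args) {
        System.out.println(date(LocalDate.of(2010, 4, 27)));
        System.out.println(time(LocalTime.of(9, 5, 7)));
        System.out.println(number(21.5));
    }
}
```

Run as written, the sketch prints 2010-04-27, 09:05:07, and 21.50. How a particular ODFDOM version maps each pattern letter to ODF number-style elements should be checked against its own documentation.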

Then we specify a cell style and associate these objects of date and number styles with objects of the cell style (see listing 11):

  1. Create an OdfStyle object whose family is table-cell.
  2. Get the generated name of the cell style with getStyleNameAttribute().
  3. Set the style:data-style-name attribute of the cell style object to the name of the date, time, or number style with setStyleDataStyleNameAttribute().
  4. Apply this cell style to a specific cell.

Listing 11. Apply the cell styles
// Cell style for date cells
OdfStyle style;
style = autoStyles.newStyle(OdfStyleFamily.TableCell);
String dataCellStyleName = style.getStyleNameAttribute();
style.setStyleDataStyleNameAttribute("numberDateStyle");
cell.setStyleName(dataCellStyleName);

// Cell style for time cells
style = autoStyles.newStyle(OdfStyleFamily.TableCell);
String timeCellStyleName = style.getStyleNameAttribute();
style.setStyleDataStyleNameAttribute("numberTimeStyle");
cell.setStyleName(timeCellStyleName);

// Cell style for number cells, such as temperatures
style = autoStyles.newStyle(OdfStyleFamily.TableCell);
String numberCellStyleName = style.getStyleNameAttribute();
style.setStyleDataStyleNameAttribute("numberStyle");
cell.setStyleName(numberCellStyleName);

In this example, we created a simple spreadsheet, but real spreadsheets can be quite complex. For example, they can have table cells that span multiple rows and columns, apply a variety of styles, and embed objects and media.

These features can be implemented with the ODFDOM API, though the code might be complex. As ODFDOM grows, however, such spreadsheets should become easier to create.


Creating a presentation file

Let's start by defining the relevant terms in the presentation content.xml (see listing 12):

  • <office:presentation> is the root element for the content of a presentation document; it is a subelement of <office:body>.
  • <draw:page>, which is a subelement of <office:presentation>, represents a slide in a presentation. Only graphics elements can be stored in <draw:page>, so text elements, such as <text:title>, must be placed under <draw:frame> to appear in a slide.
  • <style:master-page> is a generic template page and a subelement of <office:master-styles>. Styles, such as the background, are set in a master page, and every slide can be associated with a master page so that the corresponding template is applied to it.

Listing 12. Structure of the presentation content.xml file
<office:document-content>
    <office:automatic-styles/>
    <office:body>
        <office:presentation>
            <draw:page/>
        </office:presentation>
    </office:body>
</office:document-content>

In this section, we illustrate how to create a presentation file, insert a slide, add a title, apply a master page template, and then save the new presentation.

Table 1 lists the ODF classes that are used in the code.

Table 1. ODF classes and purposes
ODF class                   Purpose
OdfPresentationDocument     Presentation file
OdfStyleDom                 Style DOM
OdfOfficePresentation       <office:presentation> element
OdfOfficeStyles             <office:styles>, the document style element
OdfOfficeAutomaticStyles    <office:automatic-styles>, which holds the automatic styles
OdfStylePageLayout          <style:page-layout>, which defines the layout of a page
OdfOfficeMasterStyles       <office:master-styles>, which defines the master styles for pages
OdfStyleMasterPage          <style:master-page>, a subelement of <office:master-styles> that defines a master (template) page
OdfDrawPage                 <draw:page>, which represents a page, or slide, in a presentation
OdfDrawFrame                <draw:frame>, a container element in which other elements are placed

Here are the steps (see listing 13):

  1. Create an object of OdfPresentationDocument.
  2. Get the object of OdfOfficePresentation.
  3. Get the object of OdfOfficeStyles, which represents the document style element; create one if this element does not exist.
  4. Create a <style:page-layout> element; this element belongs under the OdfOfficeAutomaticStyles element, which can be obtained using getAutomaticStyles().
  5. Get the OdfOfficeMasterStyles object with the getOfficeMasterStyles() method of the OdfPresentationDocument class.
  6. Create an OdfStyleMasterPage object, for which we must specify the name of this master page and the name of the layout style. (The name of the layout style created in step 4 can be used.)
  7. A presentation document is composed of slides, so the next step is to create an OdfDrawPage object with the newDrawPageElement(masterPageStyleName) method, where we can specify a master page. After this method is invoked, the master page is applied to the new slide.
  8. Because only graphic elements can be stored in <draw:page>, we need to create an object of OdfDrawFrame to add text content.
  9. We create two objects of OdfDrawFrame. One is used to store the title; the other is used to store the image.

Thus, the presentation document is created, after which you can save it as an ODP document and use OpenOffice or IBM Lotus Symphony to open it.

Listing 13. Create presentation files
OdfPresentationDocument presentationDoc =
        OdfPresentationDocument.newPresentationDocument();
OdfOfficePresentation presentation1 = presentationDoc.getContentRoot();
presentationDoc.getOrCreateDocumentStyles();
presentationDoc.getStylesDom().getAutomaticStyles()
        .newStylePageLayoutElement("PM01");
OdfOfficeMasterStyles officeMasterStyles = presentationDoc.getOfficeMasterStyles();

OdfStyleMasterPage masterPage = (OdfStyleMasterPage) officeMasterStyles
        .newStyleMasterPageElement("master-name-1", "PM01");
OdfDrawPage page4 = (OdfDrawPage) presentation1
        .newDrawPageElement("master-name-1");
OdfDrawFrame frame1 = (OdfDrawFrame) page4.newDrawFrameElement();
frame1.newDrawTextBoxElement().setTextContent("title");
OdfDrawFrame frame2 = (OdfDrawFrame) page4.newDrawFrameElement();
frame2.newDrawImageElement().setXlinkHrefAttribute("http://impage");
presentationDoc.save("presentation.odp");
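
The nesting that these calls create for the title frame (slide, frame, text box, text) can be sketched with the plain JDK DOM API. This is an illustration under assumptions and does not use ODFDOM: the class name SlideMarkup is hypothetical, the title is wrapped in a <text:p> paragraph for the sake of the example, and only the namespace URIs and element names come from the ODF specification.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration; not an ODFDOM class.
class SlideMarkup {
    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
    static final String DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Builds office:presentation with one slide that holds a title frame. */
    static Element buildPresentation(String masterPageName, String title) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element presentation = doc.createElementNS(OFFICE_NS, "office:presentation");
        doc.appendChild(presentation);
        // Each draw:page is one slide; it names the master page it uses.
        Element page = doc.createElementNS(DRAW_NS, "draw:page");
        page.setAttributeNS(DRAW_NS, "draw:master-page-name", masterPageName);
        presentation.appendChild(page);
        // Text cannot sit directly in draw:page, so it goes inside a frame.
        Element frame = doc.createElementNS(DRAW_NS, "draw:frame");
        page.appendChild(frame);
        Element textBox = doc.createElementNS(DRAW_NS, "draw:text-box");
        frame.appendChild(textBox);
        Element paragraph = doc.createElementNS(TEXT_NS, "text:p");
        paragraph.setTextContent(title);
        textBox.appendChild(paragraph);
        return presentation;
    }

    public static void main(String[] args) {
        Element presentation = buildPresentation("master-name-1", "title");
        Element page = (Element) presentation.getFirstChild();
        System.out.println(page.getAttributeNS(DRAW_NS, "master-page-name"));
        System.out.println(presentation.getTextContent());
    }
}
```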

Conclusion

Using the three examples detailed in this article, we illustrated how to use the ODFDOM API to create the contents, styles, and other features of text, spreadsheet, and presentation ODF documents.


Acknowledgment

The author extends a special acknowledgment to the Project Leader, Ying Chun (Daisy) Guo, for her contributions to this article.


Download

Description    Name                            Size
Code sample    Part3-ODFDOM-practice_EN.zip    10KB
