
Parse XML with dom4j

Create and modify an XML document with the dom4j API

Deepak Vohra (dvohra09@yahoo.com) is a Web Developer, a NuBean Consultant, and a Sun Certified Java 1.4 Programmer. You can reach him at dvohra09@yahoo.com.

Summary:  dom4j is an open-source XML framework for parsing XML documents. This article shows you how to create an XML document and modify it with the parser that's included with dom4j.

Date:  31 Mar 2004
Level:  Introductory

The dom4j download includes a parser for XML documents. In this article, you create an example XML document with the dom4j API and then modify it. Listing 1 shows the example XML document, catalog.xml.


Listing 1. Example XML document (catalog.xml)
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
  <!--An XML catalog-->
  <?target text?>
  <journal title="XML Zone" publisher="IBM developerWorks">
    <article level="Intermediate" date="December-2001">
      <title>Java configuration with XML Schema</title>
      <author>
        <firstname>Marcello</firstname>
        <lastname>Vitaletti</lastname>
      </author>
    </article>
  </journal>
</catalog>

The document catalog.xml is then modified with the dom4j API. Listing 2 shows the modified XML document, catalog-modified.xml.


Listing 2. The modified XML document (catalog-modified.xml)
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
  <!--An XML catalog-->
  <?target text?>
  <journal title="XML Zone" publisher="IBM developerWorks">
    <article level="Introductory" date="October-2002">
      <title>Create flexible and extensible XML schemas</title>
      <author>
        <firstname>Ayesha</firstname>
        <lastname>Malik</lastname>
      </author>
    </article>
  </journal>
</catalog>

The advantage of using dom4j over the W3C DOM API is its native XPath support: you can select nodes directly with an XPath expression, which the core DOM API does not provide.
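As a minimal sketch of what this looks like in practice, the following class parses a small catalog from a string and selects nodes with XPath expressions. The class name XPathSketch, its helper methods, and the inline XML are illustrative assumptions, and the sketch assumes that dom4j and its XPath engine are on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;

// Illustrative only: the class name, helpers, and sample XML are assumptions.
class XPathSketch {

    static final String XML =
        "<catalog><journal title=\"XML Zone\">"
      + "<article level=\"Intermediate\"><title>First</title></article>"
      + "<article level=\"Advanced\"><title>Second</title></article>"
      + "</journal></catalog>";

    // Parses a string into a dom4j Document, rethrowing parse errors unchecked.
    static Document parse(String xml) {
        try {
            return DocumentHelper.parseText(xml);
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Collects the value of every level attribute selected by the XPath expression.
    static List<String> levels(String xml) {
        List<String> values = new ArrayList<String>();
        for (Object o : parse(xml).selectNodes("//article/@level")) {
            values.add(((Node) o).getText());
        }
        return values;
    }

    // valueOf() evaluates an XPath expression and returns its string value.
    static String titleForLevel(String xml, String level) {
        return parse(xml).valueOf("//article[@level='" + level + "']/title");
    }

    public static void main(String[] args) {
        System.out.println(levels(XML));
        System.out.println(titleForLevel(XML, "Advanced"));
    }
}
```

With the plain DOM API, the same selection would require walking the child nodes of each element by hand.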

This article is structured into the following sections:

  • Preliminary setup
  • Creating a document
  • Modifying a document

Preliminary setup

You can download dom4j from http://dom4j.org. Add dom4j-1.4/dom4j-full.jar to the classpath; it contains the dom4j classes, the XPath engine, and the interfaces for SAX and DOM. If you already use the SAX and DOM interfaces that come with a JAXP parser, add dom4j-1.4/dom4j.jar to the classpath instead. dom4j.jar contains the dom4j classes and the XPath engine, but not the SAX and DOM interfaces.
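A quick way to confirm the classpath is to try loading one class from each part of the library. The sketch below is only a convenience check. The class name ClasspathCheck is hypothetical, and org.jaxen.XPath is assumed to be the entry point of the bundled XPath engine.

```java
// Illustrative helper; the class and method names are assumptions.
class ClasspathCheck {

    // Returns true if the named class can be loaded from the current classpath.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but one of its dependencies is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("dom4j classes: " + isAvailable("org.dom4j.Document"));
        // Assumption: the XPath engine shipped with dom4j lives in org.jaxen.
        System.out.println("XPath engine:  " + isAvailable("org.jaxen.XPath"));
        System.out.println("SAX interface: " + isAvailable("org.xml.sax.XMLReader"));
    }
}
```

If any line prints false, the corresponding JAR file is probably missing from the classpath.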


Creating a document

In this section, I discuss the procedure for creating an XML document with the dom4j API. You'll create the example XML document, catalog.xml.

Import the dom4j API classes with import statements:

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

Create a document instance with the DocumentHelper class. DocumentHelper is a dom4j API factory for generating XML document nodes.

Document document = DocumentHelper.createDocument();

Create the root element catalog with the addElement() method. addElement() is used to add elements to an XML document.

Element catalogElement = document.addElement("catalog");

Create and add the comment "An XML catalog" to the catalog element with the addComment() method.

catalogElement.addComment("An XML catalog");

Add a processing instruction to the catalog element with the addProcessingInstruction() method.

catalogElement.addProcessingInstruction("target","text");

Add the journal element to the catalog element with the addElement() method.

Element journalElement = catalogElement.addElement("journal");

Add the title and publisher attributes to the journal element with the addAttribute() method.

journalElement.addAttribute("title", "XML Zone");
journalElement.addAttribute("publisher", "IBM developerWorks");

Add an article element to the journal element.

Element articleElement = journalElement.addElement("article");

Add the level and date attributes to the article element.

articleElement.addAttribute("level", "Intermediate");
articleElement.addAttribute("date", "December-2001");

Add the title element to the article element.

Element titleElement = articleElement.addElement("title");

Set the text of the title element with the setText() method.

titleElement.setText("Java configuration with XML Schema");

Add an author element to the article element.

Element authorElement = articleElement.addElement("author");

Add a firstname element to the author element and set the text of the firstname element.

Element firstNameElement = authorElement.addElement("firstname");
firstNameElement.setText("Marcello");

Add a lastname element to the author element and set the text of the lastname element.

Element lastNameElement = authorElement.addElement("lastname");
lastNameElement.setText("Vitaletti");

You can add a document type declaration to the document with the addDocType() method.

document.addDocType("catalog", null, "file://c:/Dtds/catalog.dtd");

This adds the following document type declaration to the XML document:

<!DOCTYPE catalog SYSTEM "file://c:/Dtds/catalog.dtd">

A DOCTYPE declaration is required if the document is to be validated against a Document Type Definition (DTD).

The XML declaration <?xml version="1.0" encoding="UTF-8"?> gets added to the XML document by default.
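To see the declaration and the DOCTYPE together, you can serialize a small document to a string. The following is a minimal sketch that uses the asXML() method of the Document interface. The class name DocTypeSketch is illustrative, and the document contains only an empty root element.

```java
import org.dom4j.Document;
import org.dom4j.DocumentHelper;

// Illustrative only: the class name is an assumption.
class DocTypeSketch {

    // Builds a document with a root element and a DOCTYPE, then serializes it.
    static String build() {
        Document document = DocumentHelper.createDocument();
        document.addElement("catalog");
        document.addDocType("catalog", null, "file://c:/Dtds/catalog.dtd");
        // asXML() returns the whole document as text, including the XML declaration.
        return document.asXML();
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

The printed text should start with the XML declaration, followed by the DOCTYPE and then the catalog element.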

Listing 3 shows the example program XmlDom4J.java, which is used to create the XML document catalog.xml.


Listing 3. Program that generates the XML document catalog.xml (XmlDom4J.java)
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.XMLWriter;

public class XmlDom4J {

    public void generateDocument() {
        // Create the document, its root element, a comment, and a processing instruction.
        Document document = DocumentHelper.createDocument();
        Element catalogElement = document.addElement("catalog");
        catalogElement.addComment("An XML catalog");
        catalogElement.addProcessingInstruction("target", "text");

        // Add the journal element and its attributes.
        Element journalElement = catalogElement.addElement("journal");
        journalElement.addAttribute("title", "XML Zone");
        journalElement.addAttribute("publisher", "IBM developerWorks");

        // Add the article element with its title and author.
        Element articleElement = journalElement.addElement("article");
        articleElement.addAttribute("level", "Intermediate");
        articleElement.addAttribute("date", "December-2001");
        Element titleElement = articleElement.addElement("title");
        titleElement.setText("Java configuration with XML Schema");
        Element authorElement = articleElement.addElement("author");
        Element firstNameElement = authorElement.addElement("firstname");
        firstNameElement.setText("Marcello");
        Element lastNameElement = authorElement.addElement("lastname");
        lastNameElement.setText("Vitaletti");

        document.addDocType("catalog", null, "file://c:/Dtds/catalog.dtd");

        // Write the document to c:/catalog/catalog.xml.
        try {
            XMLWriter output = new XMLWriter(
                new FileWriter(new File("c:/catalog/catalog.xml")));
            output.write(document);
            output.close();
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }

    public static void main(String[] argv) {
        XmlDom4J dom4j = new XmlDom4J();
        dom4j.generateDocument();
    }
}
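The XMLWriter in Listing 3 uses the default output format, which does not indent the output, and a FileWriter, which encodes characters with the platform default charset on many JVMs even though the XML declaration says UTF-8. As a hedged variation, the sketch below writes through an OutputStream with an OutputFormat created by OutputFormat.createPrettyPrint(). The class name PrettyPrintSketch and the small sample document are illustrative assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

// Illustrative only: the class name and sample document are assumptions.
class PrettyPrintSketch {

    // Writes the document as indented XML, encoded as UTF-8, to the given stream.
    static void write(Document document, OutputStream out) throws IOException {
        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setEncoding("UTF-8");
        XMLWriter writer = new XMLWriter(out, format);
        writer.write(document);
        writer.flush();
    }

    // Builds a tiny catalog and returns its pretty-printed form as a string.
    static String sample() {
        Document document = DocumentHelper.createDocument();
        Element catalog = document.addElement("catalog");
        catalog.addElement("journal").addAttribute("title", "XML Zone");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            write(document, bytes);
            return bytes.toString("UTF-8");
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sample());
    }
}
```

To write a file instead of an in-memory buffer, pass a java.io.FileOutputStream for the target path to write().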

This section discussed the procedure for creating an XML document. In the next section, you'll see how to modify the XML document catalog.xml, created in this section, with the dom4j API.


Modifying a document

In this section, I'll show you how to modify the example XML document, catalog.xml, with the dom4j API.

Parse the XML document, catalog.xml, with the SAXReader:

SAXReader saxReader = new SAXReader();
Document document = saxReader.read(inputXml);

The SAXReader is included in the org.dom4j.io package.

Here, inputXml is a java.io.File for c:/catalog/catalog.xml. Use an XPath expression to get a list of the level attribute nodes of the article elements. If the value of a level attribute is "Intermediate", change it to "Introductory".

List list = document.selectNodes("//article/@level");
Iterator iter = list.iterator();
while (iter.hasNext()) {
    Attribute attribute = (Attribute) iter.next();
    if (attribute.getValue().equals("Intermediate"))
        attribute.setValue("Introductory");
}

Get a list of article elements. For each article element, obtain an iterator over its title child elements, and modify the text of the title element.

list = document.selectNodes("//article");
iter = list.iterator();
while (iter.hasNext()) {
    Element element = (Element) iter.next();
    Iterator iterator = element.elementIterator("title");
    while (iterator.hasNext()) {
        Element titleElement = (Element) iterator.next();
        if (titleElement.getText().equals("Java configuration with XML Schema"))
            titleElement.setText("Create flexible and extensible XML schemas");
    }
}

Modify the author element with a procedure similar to that of the title element.
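As an alternative sketch for the author element, the XPath expression itself can filter on the current text, so no nested iterator is needed. The class name AuthorUpdateSketch, the replaceText() helper, and the inline XML are hypothetical. The for-each loop assumes Java 5 or later.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;

// Illustrative only: the class name, helper, and sample XML are assumptions.
class AuthorUpdateSketch {

    // Sets new text on every node selected by the XPath expression.
    // Returns the number of nodes that were changed.
    static int replaceText(Document document, String xpath, String newText) {
        int changed = 0;
        for (Object o : document.selectNodes(xpath)) {
            ((Node) o).setText(newText);
            changed++;
        }
        return changed;
    }

    // Parses a small catalog, renames the author, and returns the resulting XML.
    static String demo() {
        String xml = "<catalog><journal><article><author>"
            + "<firstname>Marcello</firstname><lastname>Vitaletti</lastname>"
            + "</author></article></journal></catalog>";
        try {
            Document document = DocumentHelper.parseText(xml);
            replaceText(document,
                "//article/author/firstname[text()='Marcello']", "Ayesha");
            replaceText(document,
                "//article/author/lastname[text()='Vitaletti']", "Malik");
            return document.getRootElement().asXML();
        } catch (DocumentException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The predicate in square brackets plays the role of the equals() test in the iterator-based version.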

Listing 4 shows the example program, Dom4JParser.java, which modifies catalog.xml and writes the result to catalog-modified.xml.


Listing 4. Program used to modify catalog.xml (Dom4JParser.java)
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

public class Dom4JParser {

    public void modifyDocument(File inputXml) {
        try {
            SAXReader saxReader = new SAXReader();
            Document document = saxReader.read(inputXml);

            // Change the level attribute from "Intermediate" to "Introductory".
            List list = document.selectNodes("//article/@level");
            Iterator iter = list.iterator();
            while (iter.hasNext()) {
                Attribute attribute = (Attribute) iter.next();
                if (attribute.getValue().equals("Intermediate"))
                    attribute.setValue("Introductory");
            }

            // Change the date attribute from "December-2001" to "October-2002".
            list = document.selectNodes("//article/@date");
            iter = list.iterator();
            while (iter.hasNext()) {
                Attribute attribute = (Attribute) iter.next();
                if (attribute.getValue().equals("December-2001"))
                    attribute.setValue("October-2002");
            }

            // Replace the text of the title element.
            list = document.selectNodes("//article");
            iter = list.iterator();
            while (iter.hasNext()) {
                Element element = (Element) iter.next();
                Iterator iterator = element.elementIterator("title");
                while (iterator.hasNext()) {
                    Element titleElement = (Element) iterator.next();
                    if (titleElement.getText().equals("Java configuration with XML Schema"))
                        titleElement.setText("Create flexible and extensible XML schemas");
                }
            }

            // Replace the text of the firstname element.
            list = document.selectNodes("//article/author");
            iter = list.iterator();
            while (iter.hasNext()) {
                Element element = (Element) iter.next();
                Iterator iterator = element.elementIterator("firstname");
                while (iterator.hasNext()) {
                    Element firstNameElement = (Element) iterator.next();
                    if (firstNameElement.getText().equals("Marcello"))
                        firstNameElement.setText("Ayesha");
                }
            }

            // Replace the text of the lastname element.
            list = document.selectNodes("//article/author");
            iter = list.iterator();
            while (iter.hasNext()) {
                Element element = (Element) iter.next();
                Iterator iterator = element.elementIterator("lastname");
                while (iterator.hasNext()) {
                    Element lastNameElement = (Element) iterator.next();
                    if (lastNameElement.getText().equals("Vitaletti"))
                        lastNameElement.setText("Malik");
                }
            }

            // Write the modified document to catalog-modified.xml.
            XMLWriter output = new XMLWriter(
                new FileWriter(new File("c:/catalog/catalog-modified.xml")));
            output.write(document);
            output.close();
        } catch (DocumentException e) {
            System.out.println(e.getMessage());
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }

    public static void main(String[] argv) {
        Dom4JParser dom4jParser = new Dom4JParser();
        dom4jParser.modifyDocument(new File("c:/catalog/catalog.xml"));
    }
}

This section showed you how to modify the example XML document with the parser that's included with dom4j. By default, this parser does not validate an XML document against a DTD or a schema. If validation of an XML document is required, integrate dom4j with the JAXP SAX parser.
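One way to approach this, shown here only as a sketch, is to construct the SAXReader with its validating flag set to true, which asks the underlying SAX parser to validate against the DTD. The class name ValidationSketch and the inline DTD are illustrative assumptions. The sketch also assumes that the SAX parser found through JAXP supports DTD validation and that a validity error surfaces as a DocumentException.

```java
import java.io.StringReader;

import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

// Illustrative only: the class name and the inline DTD are assumptions.
class ValidationSketch {

    // A small internal DTD: a catalog holds journal elements with a required title.
    static final String DTD =
        "<!DOCTYPE catalog [\n"
      + "<!ELEMENT catalog (journal*)>\n"
      + "<!ELEMENT journal EMPTY>\n"
      + "<!ATTLIST journal title CDATA #REQUIRED>\n"
      + "]>\n";

    // Returns true if the XML parses and no validity error is reported.
    static boolean isValid(String xml) {
        // true asks the underlying SAX parser to validate against the DTD.
        SAXReader reader = new SAXReader(true);
        try {
            reader.read(new StringReader(xml));
            return true;
        } catch (DocumentException e) {
            // Assumption: validity errors are rethrown as DocumentException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(DTD + "<catalog><journal title=\"XML Zone\"/></catalog>"));
        System.out.println(isValid(DTD + "<catalog><journal/></catalog>"));
    }
}
```

The second document omits the required title attribute, so it is the one that should be rejected when validation is active.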


Conclusion

The parser that's included with dom4j is a non-validating tool that is used to parse XML documents. It may be integrated with a JAXP, Crimson, or Xerces parser. This article has shown you how to use this parser to create and modify an XML document.


