Understanding DOM

Nicholas Chase (nicholas@nicholaschase.com), Author, Web site developer
Nicholas Chase, a Studio B author, has been involved in Web site development for companies such as Lucent Technologies, Sun Microsystems, Oracle, and the Tampa Bay Buccaneers. Nick has been a high school physics teacher, a low-level radioactive waste facility manager, an online science fiction magazine editor, a multimedia engineer, and an Oracle instructor. More recently, he was the Chief Technology Officer of Site Dynamics Interactive Communications in Clearwater, Florida, USA, and is the author of four books on Web development, including XML Primer Plus (Sams). He loves to hear from readers and can be reached at nicholas@nicholaschase.com.

Summary:  Even before there was XML, there was the Document Object Model, or DOM. It allows a developer to refer to, retrieve, and change items within an XML structure, and is essential to working with XML. In this tutorial, you will learn about the structure of a DOM document. You will also learn how to use Java™ technology to create a Document from an XML file, make changes to it, and retrieve the output.

Date:  12 Mar 2007 (Published 29 Jul 2003)
Level:  Introductory

Stepping through the document

Get the root element

Once the document is parsed and a Document is created, an application can step through the structure to review, find, or display information. This navigation is the basis for many operations that will be performed on a Document.

Stepping through the document begins with the root element. A well-formed document has only one root element, also known as the DocumentElement. First the application retrieves this element.

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.File;
import org.w3c.dom.Document;

import org.w3c.dom.Element;

public class OrderProcessor {
...
      System.exit(1);
   }

   //STEP 1:  Get the root element

   Element root = doc.getDocumentElement();
   System.out.println("The root element is " + root.getNodeName());

 }
}

Compiling and running the application outputs the name of the root element, orders.
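
Because the listing above elides the parsing code, here is a minimal, self-contained sketch of the same step. It parses an inline XML string rather than orders.xml; the class name RootElementSketch, the parse() helper, and the sample markup are illustrative assumptions, not part of the tutorial's OrderProcessor.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class RootElementSketch {

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns the name of the document's single root element.
    static String rootName(String xml) throws Exception {
        Element root = parse(xml).getDocumentElement();
        return root.getNodeName();
    }

    public static void main(String[] args) throws Exception {
        // Assumed stand-in for the contents of orders.xml.
        String xml = "<orders><order/></orders>";
        System.out.println("The root element is " + rootName(xml));
    }
}
```

Returning the name from a small method, instead of printing it directly, makes the step easy to check in isolation.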


Get the children of a node

Once the application determines the root element, it retrieves a list of the root element's children as a NodeList. A NodeList is an ordered collection of nodes through which the application can iterate by index. In this example, for brevity, the application gets the child nodes and verifies the retrieval by showing only how many nodes appear in the resulting NodeList.

Notice that the root element has only two child elements, but the NodeList contains five children: the two order elements plus three text nodes that contain line feeds. This is another reminder that nodes and elements are not equivalent in the DOM.

...


import org.w3c.dom.NodeList;

...
   //STEP 1:  Get the root element
   Element root = doc.getDocumentElement();
   System.out.println("The root element is "+root.getNodeName());
      
   //STEP 2:  Get the children
   NodeList children = root.getChildNodes();
   System.out.println("There are "+children.getLength()
                                  +" nodes in this document.");
                                  
 }
}
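
The difference between child nodes and child elements can be checked with a small sketch. It assumes an inline document with two order elements separated by line feeds, which stands in for orders.xml; the class name ChildCountSketch and its helper methods are hypothetical. With a default, non-validating parser, the whitespace between elements should be kept as text nodes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ChildCountSketch {

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Counts every child node of the root, including whitespace text nodes.
    static int countChildNodes(String xml) throws Exception {
        return parse(xml).getDocumentElement().getChildNodes().getLength();
    }

    // Counts only those children of the root that are elements.
    static int countChildElements(String xml) throws Exception {
        NodeList children = parse(xml).getDocumentElement().getChildNodes();
        int elements = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                elements++;
            }
        }
        return elements;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample: two order elements, each on its own line.
        String xml = "<orders>\n  <order/>\n  <order/>\n</orders>";
        System.out.println("Child nodes:    " + countChildNodes(xml));
        System.out.println("Child elements: " + countChildElements(xml));
    }
}
```

Removing the line feeds from the sample string should bring the two counts together, since no text nodes remain between the elements.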


Using getFirstChild() and getNextSibling()

The parent-child and sibling relationships offer an alternative means of iterating through all of the children of a node. This approach may be more appropriate in some situations, such as when these relationships and the order in which children appear are crucial to understanding the data.

In Step 3, a for-loop starts with the first child of the root. The application iterates through each of the siblings of the first child until they have all been evaluated.

Each time the application executes the loop, it retrieves a Node object, outputting its name and value. Notice that the five children of orders include the order elements and three text nodes. Notice also that the elements carry a value of null, rather than the expected text. It is the text nodes that are children of the elements that carry the actual content as their values.

...

 import org.w3c.dom.Node; 

...
      
      //STEP 3:  Step through the children
      for (Node child = root.getFirstChild(); 
          child != null;
          child = child.getNextSibling())
      {
         System.out.println(child.getNodeName()+" = " 
                                       +child.getNodeValue());
      }

   }
}
...
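
The observation that an element's value is null while its child text node carries the content can be sketched as follows. The class name SiblingWalkSketch, its helpers, and the sample markup are assumptions made for illustration; only getFirstChild(), getNextSibling(), getNodeName(), and getNodeValue() come from the text above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class SiblingWalkSketch {

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Walks the first-level children of parent via the sibling chain
    // and records "name = value" for each one.
    static List<String> describeChildren(Node parent) {
        List<String> lines = new ArrayList<>();
        for (Node child = parent.getFirstChild();
             child != null;
             child = child.getNextSibling()) {
            lines.add(child.getNodeName() + " = " + child.getNodeValue());
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample document with no whitespace between elements.
        Node root = parse("<orders><order>widget</order></orders>").getDocumentElement();

        // The element itself reports a null value...
        for (String line : describeChildren(root)) {
            System.out.println(line);
        }
        // ...while the text node beneath it holds the content.
        Node order = root.getFirstChild();
        for (String line : describeChildren(order)) {
            System.out.println(line);
        }
    }
}
```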


Recursing through multiple levels of children

The code in Using getFirstChild() and getNextSibling() shows the first-level children, but that's hardly the entire document. To see all of the elements, the functionality in the previous example must be turned into a method and called recursively.

The application starts with the root element and prints the name and value to the screen. The application then runs through each of its children, just as before. But for each of the children, the application also runs through each of their children, and so on, so that every descendant of the root element is examined, not just its children and grandchildren.

...

public class OrderProcessor {
   
   private static void stepThrough (Node start)
   {

      System.out.println(start.getNodeName()+" = "+start.getNodeValue());

      for (Node child = start.getFirstChild(); 
          child != null;
          child = child.getNextSibling())
      {
         stepThrough(child);

      }
   }


   public static void main (String args[]) {
      File docFile = new File("orders.xml");
      
...     
      System.out.println("There are "+children.getLength()
                            +" nodes in this document.");

      //STEP 4:  Recurse this functionality
      stepThrough(root);

   }
}
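
To make the nesting visible, a variant of the recursive method can carry a depth counter and indent each node name accordingly. This is a sketch under stated assumptions rather than the tutorial's code: the depth parameter, the output list, the class name RecursiveWalkSketch, and the sample markup are all additions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class RecursiveWalkSketch {

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Records the name of start, indented two spaces per level,
    // then recurses into each of its children in document order.
    static void stepThrough(Node start, int depth, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(start.getNodeName());
        out.add(line.toString());

        for (Node child = start.getFirstChild();
             child != null;
             child = child.getNextSibling()) {
            stepThrough(child, depth + 1, out);
        }
    }

    // Convenience wrapper: returns the indented outline of a whole document.
    static List<String> outline(String xml) throws Exception {
        List<String> out = new ArrayList<>();
        stepThrough(parse(xml).getDocumentElement(), 0, out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample document, three element levels deep.
        for (String line : outline("<orders><order><item>A</item></order></orders>")) {
            System.out.println(line);
        }
    }
}
```

The recursion ends on its own: a node with no children yields null from getFirstChild(), so the loop body never runs for it.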


Including attributes

The stepThrough() method as written so far can run through most types of nodes, but it misses attributes entirely because they are not children of any node. To show attributes, a modified version of the method, named stepThroughAll() in the listing below, checks element nodes for attributes.

The modified code checks each node it outputs to see whether it is an element by comparing the result of getNodeType() to the constant ELEMENT_NODE. The Node interface defines a constant for each type of node, such as ELEMENT_NODE or ATTRIBUTE_NODE. If the node type matches ELEMENT_NODE, the node is an element.

For every element it finds, the application creates a NamedNodeMap that contains all of the attributes for the element. The application can iterate through a NamedNodeMap, printing each attribute's name and value, just as it iterated through the NodeList.

...

 import org.w3c.dom.NamedNodeMap;

...
private static void stepThroughAll (Node start)
   {
      System.out.println(start.getNodeName()+" = "+start.getNodeValue());   
         
      if (start.getNodeType() == Node.ELEMENT_NODE) 
      {   
          NamedNodeMap startAttr = start.getAttributes();
          for (int i = 0; 
               i < startAttr.getLength();
               i++) {
             Node attr = startAttr.item(i);
             System.out.println("  Attribute:  "+ attr.getNodeName()
                                          +" = "+attr.getNodeValue());
          }   
      } 

      
      for (Node child = start.getFirstChild(); 
          child != null;
          child = child.getNextSibling())
      {
         stepThroughAll(child);
      }
   }
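
The attribute lookup can also be tried in isolation. The sketch below collects an element's attributes into a sorted map; the class name AttributeSketch, the method names, the id and status attributes, and the choice of a TreeMap are all assumptions for illustration. The sorted map is used because a NamedNodeMap does not promise any particular order.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

public class AttributeSketch {

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns the attributes of node as name/value pairs, sorted by name.
    // Non-element nodes yield an empty map, because getAttributes()
    // returns null for them.
    static Map<String, String> attributesOf(Node node) {
        Map<String, String> result = new TreeMap<>();
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                result.put(attr.getNodeName(), attr.getNodeValue());
            }
        }
        return result;
    }

    // Convenience wrapper: attributes of the root element of an XML string.
    static Map<String, String> rootAttributes(String xml) throws Exception {
        return attributesOf(parse(xml).getDocumentElement());
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample element with two attributes.
        System.out.println(rootAttributes("<order id=\"1\" status=\"open\"/>"));
    }
}
```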
