Tip: Using a DOM NodeFilter

Control which nodes are visible to a TreeWalker or NodeIterator

Nicholas Chase (nicholas@nicholaschase.com), President, Chase and Chase Inc.
Nicholas Chase has been involved in Web site development for companies such as Lucent Technologies, Sun Microsystems, Oracle, and the Tampa Bay Buccaneers. Nick has been a high school physics teacher, a low-level radioactive waste facility manager, an online science fiction magazine editor, a multimedia engineer, and an Oracle instructor. More recently, he was the Chief Technology Officer of Site Dynamics Interactive Communications in Clearwater, Florida, USA, and is the author of three books on Web development, including Java and XML From Scratch (Que) and the upcoming Primer Plus XML Programming (Sams). He loves to hear from readers and can be reached at nicholas@nicholaschase.com.

Summary:  XML's DOM Level 2 Traversal module provides two new objects, the TreeWalker and the NodeIterator, which simplify the process of navigating a Document. More than that, the module defines a NodeFilter, which can be used to programmatically control which nodes are visible to the TreeWalker or NodeIterator. This tip shows you how to create a NodeFilter as well as a Traversal object that uses it.


Date:  01 Nov 2002
Level:  Introductory


Note: This tip uses JAXP, but the sample application will also work with Xerces-Java 2, and the concepts are applicable for any XML parser environment.

The source code

This tip creates an application that traverses a simple XML document that contains information on which employees to contact in case of emergency:


Listing 1. The source document
                

<?xml version="1.0"?>
<personnel>
   <employee empid="332" status="contact">
        <deptid>24</deptid>
        <shift>night</shift>
        <name>Jenny Berman</name>
   </employee>
   <!-- Other employees listed here -->
</personnel>

Ultimately, the application counts on the NodeFilter to eliminate employees with a status value of donotcontact.


Traversing the tree

The Document Object Model Level 2 Traversal module defines objects that walk the tree of an XML document one Node at a time. The entire process of creating a TreeWalker is described in the tip Traversing an XML document with a TreeWalker, but for convenience, consider this application, which displays the elements, attributes, and text of the employee document:


Listing 2. Creating the TreeWalker
                

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.File;
import org.w3c.dom.Document;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.TreeWalker;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Element;

public class ShowDocument {

    public static void main (String[] args) {
       File docFile = new File("employees.xml");

       Document doc = null;
       try {
          DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
          DocumentBuilder db = dbf.newDocumentBuilder();

          doc = db.parse(docFile);
       } catch (Exception e) {
           System.out.println("Problem parsing the file: " + e.getMessage());
           return;   // without a Document there is nothing to traverse
       }

       DOMImplementation domimpl = doc.getImplementation();
       if (domimpl.hasFeature("Traversal", "2.0")) {

           Node root = doc.getDocumentElement();
           int whattoshow = NodeFilter.SHOW_ALL;
           NodeFilter nodefilter = null;
           boolean expandreferences = false;

           DocumentTraversal traversal = (DocumentTraversal)doc;

           TreeWalker walker = traversal.createTreeWalker(root,
                                                          whattoshow,
                                                          nodefilter,
                                                          expandreferences);

           // A TreeWalker starts out positioned on its root, so begin with
           // the current Node; nextNode() moves on to the Node after it.
           Node thisNode = walker.getCurrentNode();
           while (thisNode != null) {
              if (thisNode.getNodeType() == Node.ELEMENT_NODE) {
                 // Print the element name followed by its attributes
                 System.out.print(thisNode.getNodeName() + " ");
                 Element thisElement = (Element)thisNode;
                 NamedNodeMap attributes = thisElement.getAttributes();
                 System.out.print("(");
                 for (int i = 0; i < attributes.getLength(); i++) {
                    System.out.print(attributes.item(i).getNodeName() + "=\"" +
                                     attributes.item(i).getNodeValue() + "\" ");
                 }
                 System.out.print(") : ");
              } else if (thisNode.getNodeType() == Node.TEXT_NODE) {
                 // Text nodes include the whitespace between elements
                 System.out.print(thisNode.getNodeValue());
              }
              thisNode = walker.nextNode();
           }

        } else {
           System.out.println("The Traversal module isn't supported.");
        }
   }
}

When the TreeWalker traverses the Document tree, it displays Element names, attributes, and Text Nodes:


Listing 3. The application output -- all nodes
                
	
personnel () :
   employee (empid="332" status="contact" ) :
        deptid () : 24
        shift () : night
        name () : Jenny Berman

   employee (empid="994" status="donotcontact" ) :
        deptid () : 24
        shift () : day
        name () : Andrew Fule

   employee (empid="948" status="contact" ) :
        deptid () : 3
        shift () : night
        name () : Anna Bangle

Notice that one of the parameters passed on the creation of the TreeWalker is a NodeFilter object that has been set to null. The result is that the TreeWalker sees all of the Nodes of the Document that satisfy the whattoshow value, NodeFilter.SHOW_ALL.
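The whattoshow value can also narrow the view on its own, before any NodeFilter is involved. The following is a minimal sketch rather than part of the sample application: the class name WhatToShowDemo, the inline XML string, and the parse() and visibleNodeNames() helpers are assumptions made for illustration. It walks the same kind of document twice, once with NodeFilter.SHOW_ALL and once with NodeFilter.SHOW_ELEMENT, and collects the name of each Node the TreeWalker visits.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.TreeWalker;

public class WhatToShowDemo {

    // Hypothetical inline sample in the shape of Listing 1 (no whitespace between tags).
    static final String XML =
        "<personnel>"
      + "<employee empid=\"332\" status=\"contact\">"
      + "<deptid>24</deptid><shift>night</shift><name>Jenny Berman</name>"
      + "</employee>"
      + "</personnel>";

    // Illustrative helper: parses a String instead of reading employees.xml.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    // Walks the sample with the given whatToShow mask and a null NodeFilter,
    // returning the node name of the root and of every Node visited after it.
    static List<String> visibleNodeNames(int whatToShow) {
        Document doc = parse(XML);
        TreeWalker walker = ((DocumentTraversal) doc).createTreeWalker(
                doc.getDocumentElement(), whatToShow, null, false);
        List<String> names = new ArrayList<>();
        for (Node n = walker.getCurrentNode(); n != null; n = walker.nextNode()) {
            names.add(n.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("SHOW_ALL:     " + visibleNodeNames(NodeFilter.SHOW_ALL));
        System.out.println("SHOW_ELEMENT: " + visibleNodeNames(NodeFilter.SHOW_ELEMENT));
    }
}
```

With SHOW_ELEMENT, the Text Nodes (reported under the name #text) drop out of the list, so whattoshow is a reasonable first cut when the decision depends only on the type of Node.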


Creating a NodeFilter

Creating a NodeFilter object gives you fine-grained control over the Nodes that are seen by the TreeWalker object. All that's required is a class that implements the NodeFilter interface, which consists of a single method, acceptNode(). When the TreeWalker encounters a Node, it passes it to the acceptNode() method to determine whether the Node is acceptable or not. Because this is a custom class, you can base that judgment on anything you can pack into an application. In this case, the judgment is based on the value of the status attribute:


Listing 4. Implementing the NodeFilter
                

import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.Node;
import org.w3c.dom.Element;

public class EmployeeFilter implements NodeFilter { 

    public short acceptNode(Node thisNode) { 
         if (thisNode.getNodeType() == Node.ELEMENT_NODE) { 
              Element e = (Element)thisNode; 
              if (e.getAttribute("status").equals("donotcontact")) {
                   return NodeFilter.FILTER_SKIP; 
              }  
         } 
         return NodeFilter.FILTER_ACCEPT; 
    } 
} 

Each Node is checked to see if it's an Element. If it is, the status attribute (if any) is checked. The filter skips all elements with a status attribute of donotcontact while accepting everything else.
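Because acceptNode() is an ordinary method, you can check a filter's decisions without creating a TreeWalker at all. The sketch below is one way to do that, not part of the sample application: FilterCheck and the verdictFor() helper are hypothetical names, and the EmployeeFilter logic from Listing 4 is repeated as a nested class only so that the example compiles on its own.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.NodeFilter;

public class FilterCheck {

    // Same logic as the EmployeeFilter in Listing 4, nested here for self-containment.
    static class EmployeeFilter implements NodeFilter {
        public short acceptNode(Node thisNode) {
            if (thisNode.getNodeType() == Node.ELEMENT_NODE) {
                Element e = (Element) thisNode;
                if (e.getAttribute("status").equals("donotcontact")) {
                    return NodeFilter.FILTER_SKIP;
                }
            }
            return NodeFilter.FILTER_ACCEPT;
        }
    }

    // Hypothetical helper: builds a lone <employee> element and asks the filter about it.
    // A null status means the attribute is left off entirely.
    static short verdictFor(String status) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not create an empty Document", ex);
        }
        Element employee = doc.createElement("employee");
        if (status != null) {
            employee.setAttribute("status", status);
        }
        return new EmployeeFilter().acceptNode(employee);
    }

    public static void main(String[] args) {
        // Each comparison checks the verdict against the constant the filter should return.
        System.out.println(verdictFor("contact") == NodeFilter.FILTER_ACCEPT);
        System.out.println(verdictFor("donotcontact") == NodeFilter.FILTER_SKIP);
        System.out.println(verdictFor(null) == NodeFilter.FILTER_ACCEPT);
    }
}
```

The last case works because getAttribute() returns an empty string when the attribute is absent, so elements without a status attribute are accepted.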

All that's necessary now is to create the TreeWalker with the new NodeFilter object:


Listing 5. Setting the TreeWalker to see the NodeFilter
                

...
           Node root = doc.getDocumentElement();
           int whattoshow = NodeFilter.SHOW_ALL;
           NodeFilter nodefilter = new EmployeeFilter(); 
           boolean expandreferences = false;

           DocumentTraversal traversal = (DocumentTraversal)doc;
  
           TreeWalker walker = traversal.createTreeWalker(root, 
                                                          whattoshow, 
                                                          nodefilter, 
                                                          expandreferences);
...

Now when the TreeWalker traverses the Document, it checks each Node against the EmployeeFilter object, so it skips the Node that contains a status attribute of donotcontact:


Listing 6. The results
                

personnel () :
   employee (empid="332" status="contact" ) :
        deptid () : 24
        shift () : night
        name () : Jenny Berman


        deptid () : 24
        shift () : day
        name () : Andrew Fule

   employee (empid="948" status="contact" ) :
        deptid () : 3
        shift () : night
        name () : Anna Bangle

Notice that the employee element with a status of donotcontact is missing, but its children are not. In this application, that isn't what you want: instead of skipping the Node, you want to reject it altogether.


FILTER_SKIP vs. FILTER_REJECT

When a TreeWalker skips a Node, it moves on to the next Node encountered. In some cases, this is a child of the original. For this application, you're trying to eliminate employees who shouldn't be contacted, so rather than just skipping the employee element, you want to reject that element and all of its children. You can do this easily by changing the NodeFilter to use FILTER_REJECT instead of FILTER_SKIP:


Listing 7. Rejecting a Node
                

...
         if (thisNode.getNodeType()==Node.ELEMENT_NODE) { 
              Element e = (Element)thisNode; 
              if (e.getAttribute("status").equals("donotcontact")) {
                   return NodeFilter.FILTER_REJECT; 
              }  
         } 
         return NodeFilter.FILTER_ACCEPT; 
    } 
}

Now when the application runs, the entire element (including its children) is missing:


Listing 8. Results of rejecting a Node
                

personnel () :
   employee (empid="332" status="contact" ) :
        deptid () : 24
        shift () : night
        name () : Jenny Berman


   employee (empid="948" status="contact" ) :
        deptid () : 3
        shift () : night
        name () : Anna Bangle
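To see the two return values side by side, the following sketch runs the same TreeWalker traversal twice and changes only the value returned for a donotcontact employee. It is an illustration under stated assumptions, not the article's application: SkipVersusReject, the inline XML (a trimmed version of the three employees shown in Listing 3), and the parse() and visibleNames() helpers are all hypothetical, and the filter is written as a lambda, which assumes Java 8 or later.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.TreeWalker;

public class SkipVersusReject {

    // Hypothetical inline sample: three employees, one marked donotcontact.
    static final String XML =
        "<personnel>"
      + "<employee empid=\"332\" status=\"contact\"><name>Jenny Berman</name></employee>"
      + "<employee empid=\"994\" status=\"donotcontact\"><name>Andrew Fule</name></employee>"
      + "<employee empid=\"948\" status=\"contact\"><name>Anna Bangle</name></employee>"
      + "</personnel>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    // Returns the text of every <name> element the TreeWalker can reach when
    // donotcontact elements receive the given verdict (FILTER_SKIP or FILTER_REJECT).
    static List<String> visibleNames(short verdictForBlocked) {
        Document doc = parse(XML);
        NodeFilter filter = node -> {
            boolean blocked = node.getNodeType() == Node.ELEMENT_NODE
                    && "donotcontact".equals(((Element) node).getAttribute("status"));
            return blocked ? verdictForBlocked : NodeFilter.FILTER_ACCEPT;
        };
        TreeWalker walker = ((DocumentTraversal) doc).createTreeWalker(
                doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, filter, false);
        List<String> names = new ArrayList<>();
        for (Node n = walker.nextNode(); n != null; n = walker.nextNode()) {
            if (n.getNodeName().equals("name")) {
                names.add(n.getTextContent());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("FILTER_SKIP:   " + visibleNames(NodeFilter.FILTER_SKIP));
        System.out.println("FILTER_REJECT: " + visibleNames(NodeFilter.FILTER_REJECT));
    }
}
```

With FILTER_SKIP the walker still descends into the skipped employee and finds Andrew Fule; with FILTER_REJECT that whole subtree is passed over, leaving only the two contactable employees.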

The TreeWalker is able to hide the entire Element because it understands the inherent parent-child relationships. A NodeIterator, on the other hand, sees the document in a flattened way, much like a SAX stream, and has no concept of parents or children. If you were to create a NodeIterator rather than a TreeWalker, FILTER_REJECT would act the same as FILTER_SKIP.
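The difference between the two Traversal objects can be checked with a small experiment. The sketch below applies one filter, which returns FILTER_REJECT for donotcontact employees, to both a NodeIterator and a TreeWalker over the same document. The class name IteratorVersusWalker, the inline XML, and the helper methods are assumptions made for illustration, and the lambda filter assumes Java 8 or later.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.w3c.dom.traversal.TreeWalker;

public class IteratorVersusWalker {

    // Hypothetical inline sample: three employees, one marked donotcontact.
    static final String XML =
        "<personnel>"
      + "<employee empid=\"332\" status=\"contact\"><name>Jenny Berman</name></employee>"
      + "<employee empid=\"994\" status=\"donotcontact\"><name>Andrew Fule</name></employee>"
      + "<employee empid=\"948\" status=\"contact\"><name>Anna Bangle</name></employee>"
      + "</personnel>";

    // Rejects donotcontact elements and accepts everything else.
    static final NodeFilter REJECTING_FILTER = node ->
        (node.getNodeType() == Node.ELEMENT_NODE
                && "donotcontact".equals(((Element) node).getAttribute("status")))
            ? NodeFilter.FILTER_REJECT
            : NodeFilter.FILTER_ACCEPT;

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    // Collects the text of each <name> element returned by a NodeIterator.
    static List<String> namesSeenByIterator() {
        Document doc = parse(XML);
        NodeIterator iterator = ((DocumentTraversal) doc).createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, REJECTING_FILTER, false);
        List<String> names = new ArrayList<>();
        for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
            if (n.getNodeName().equals("name")) {
                names.add(n.getTextContent());
            }
        }
        return names;
    }

    // Collects the text of each <name> element returned by a TreeWalker.
    static List<String> namesSeenByWalker() {
        Document doc = parse(XML);
        TreeWalker walker = ((DocumentTraversal) doc).createTreeWalker(
                doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, REJECTING_FILTER, false);
        List<String> names = new ArrayList<>();
        for (Node n = walker.nextNode(); n != null; n = walker.nextNode()) {
            if (n.getNodeName().equals("name")) {
                names.add(n.getTextContent());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("NodeIterator: " + namesSeenByIterator());
        System.out.println("TreeWalker:   " + namesSeenByWalker());
    }
}
```

The NodeIterator still reports Andrew Fule, because rejecting the employee element only removes that one Node from its flat list, while the TreeWalker leaves him out along with the rest of the rejected subtree.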


Summary

The Traversal module defines TreeWalkers and NodeIterators that look to an external NodeFilter object to determine which Nodes are visible. This enables you to create an application in which the available data can be controlled from outside the main application. A Node can be skipped, in which case the next Node is processed, or it can be rejected, in which case a TreeWalker also hides all of its children from the main application, while a NodeIterator treats the rejection as a skip.

