In pursuit of code quality: Discover XMLUnit

A JUnit extension framework for testing XML documents

Andrew Glover (aglover@stelligent.com), President, Stelligent Incorporated
Andrew Glover is president of Stelligent Incorporated, which helps companies address software quality with effective developer testing strategies and continuous integration techniques that enable teams to monitor code quality early and often. Check out Andy's blog for a list of his publications.

Summary:  Java™ developers are natural problem solvers, so it makes sense that someone has come up with an easier way to validate XML documents. This month, Andrew introduces XMLUnit, a JUnit extension framework that meets all your XML validation needs.

Date:  19 Dec 2006
Level:  Intermediate

From time to time in the software development cycle, you need to verify the structure or content of XML documents. No matter what type of applications you're building, testing XML documents presents some challenges, especially without tools to facilitate the process.

This month, I'll first show you why you don't want to use String comparisons to verify the structure and content of XML documents. Then I'll introduce XMLUnit, an XML validation tool created by and for Java developers, and show you how to use it to validate XML documents.

Improve your code quality

Don't miss Andrew's accompanying discussion forum for assistance with code metrics, test frameworks, and writing quality-focused code.

Good old String comparisons

To get started, let's imagine you've built an application that outputs an XML document representing an object-dependency report. For a given collection of classes and corresponding filters, a report is generated that outputs a class and its class dependencies (think imports).

Listing 1 shows the report for a given list of classes, com.acme.web.Widget and com.acme.web.Account, with filters set to ignore outside classes such as java.lang.String:


Listing 1. A sample dependency XML report

<DependencyReport date="Sun Dec 03 22:30:21 EST 2006">
  <FiltersApplied>
    <Filter pattern="java|org"/>
    <Filter pattern="net."/>
  </FiltersApplied>
  <Class name="com.acme.web.Widget">
    <Dependency name="com.acme.resource.Configuration"/>
    <Dependency name="com.acme.xml.Document"/>
  </Class>
  <Class name="com.acme.web.Account">
    <Dependency name="com.acme.resource.Configuration"/>
    <Dependency name="com.acme.xml.Document"/>
  </Class>
</DependencyReport>

Listing 1 is obviously generated by an application; consequently, the first level of testing is to verify the application actually can generate a document. Once that's been verified, you'll want to test at least three aspects of the specific document:

  • Structure
  • Content
  • Specific content

You can handle the first two aspects with JUnit alone using String comparisons, as shown in Listing 2:


Listing 2. Validating XML the hard way

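import java.util.Date;
import junit.framework.TestCase;

// Filter, Dependency, and BatchDependencyXMLReport are the application's own classes.
public class XMLReportTest extends TestCase {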

 private Filter[] getFilters(){
  Filter[] fltrs = new Filter[2];
  fltrs[0] = new RegexPackageFilter("java|org");
  fltrs[1] = new SimplePackageFilter("net.");
  return fltrs;
 }

 private Dependency[] getDependencies(){
  Dependency[] deps = new Dependency[2];
  deps[0] = new Dependency("com.acme.resource.Configuration");
  deps[1] = new Dependency("com.acme.xml.Document");
  return deps;
 }

 public void testToXML() {
  Date now = new Date();
  BatchDependencyXMLReport report = 
   new BatchDependencyXMLReport(now, this.getFilters());

  report.addTargetAndDependencies(
    "com.acme.web.Widget", this.getDependencies());
  report.addTargetAndDependencies(
    "com.acme.web.Account", this.getDependencies());

  String valid = "<DependencyReport date=\"" + now.toString() + "\">"+
    "<FiltersApplied><Filter pattern=\"java|org\" /><Filter pattern=\"net.\" />"+
    "</FiltersApplied><Class name=\"com.acme.web.Widget\">" +
    " <Dependency name=\"com.acme.resource.Configuration\" />"+
    "<Dependency name=\"com.acme.xml.Document\" /></Class>"+
    "<Class name=\"com.acme.web.Account\">"+
    "<Dependency name=\"com.acme.resource.Configuration\" />"+
    "<Dependency name=\"com.acme.xml.Document\" />"+
    "</Class></DependencyReport>";

   assertEquals("report didn't match xml", valid, report.toXML());
 }
}

The test in Listing 2 has some major drawbacks -- and not just the hard-coded String comparisons, either. First, the test isn't exactly readable. Second, it's amazingly brittle; should the format of the XML document change (including the addition of white space), you would be better off pasting in a new copy of the document than attempting to fix the String itself. Finally, the nature of the test forces you to contend with the Date aspect, even though you probably don't care about it.
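The brittleness is easy to reproduce without the report application. The following minimal sketch (the class name BrittleStringComparison is hypothetical, and it uses only the JDK's built-in JAXP parser rather than XMLUnit) compares two spellings of the same Filter element, first as Strings and then as parsed DOM trees:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's application.
final class BrittleStringComparison {

    // Parses an XML string into a DOM tree, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    // True when both strings parse to equal DOM trees.
    static boolean sameDom(String a, String b) {
        return parse(a).getDocumentElement()
            .isEqualNode(parse(b).getDocumentElement());
    }

    public static void main(String[] args) {
        String generated = "<Filter pattern=\"net.\"/>";
        String expected  = "<Filter pattern=\"net.\" />";
        System.out.println(generated.equals(expected));   // false
        System.out.println(sameDom(generated, expected)); // true
    }
}
```

Note that this plain DOM comparison still treats white space between elements as significant, so "<a><b/></a>" and "<a> <b/></a>" do not compare as equal here; a test would need additional handling for that case.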

What if you wanted to ensure that the second Class element's name value in the document was com.acme.web.Account? Sure, you could use regular expressions or String searches, but that would be too much work. Wouldn't it make more sense to manipulate the DOM directly using a parsing framework?
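As a rough illustration of that DOM-based approach, the sketch below reads the name of the second Class element using only the JDK's parser. The class SecondClassName, its classNameAt helper, and the trimmed-down REPORT constant are assumptions made for the example; they are not part of the report application.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only.
final class SecondClassName {

    // A shortened version of the report in Listing 1.
    static final String REPORT =
        "<DependencyReport date=\"Sun Dec 03 22:30:21 EST 2006\">"
        + "<Class name=\"com.acme.web.Widget\"/>"
        + "<Class name=\"com.acme.web.Account\"/>"
        + "</DependencyReport>";

    // Returns the name attribute of the Class element at the given
    // zero-based position, or null if there is no such element.
    static String classNameAt(String xml, int index) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList classes = doc.getElementsByTagName("Class");
            if (index < 0 || index >= classes.getLength()) {
                return null;
            }
            return ((Element) classes.item(index)).getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse report", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(classNameAt(REPORT, 1)); // com.acme.web.Account
    }
}
```

This works, but every such check needs its own parsing and navigation code, which is the kind of plumbing a dedicated XML testing tool can take off your hands.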

XMLUnit with TestNG?

XMLUnit is a JUnit extension, but that doesn't necessarily mean you can't use it within TestNG. You can incorporate almost any framework into TestNG so long as it has an API that supports delegation and is not decorator based.

Testing with XMLUnit

When you get that feeling that you're working too hard, you can usually assume someone else has figured out an easier way to solve the problem. When it comes to programmatically verifying XML documents, the solution that comes to mind is XMLUnit.

XMLUnit is a JUnit extension framework that facilitates developer testing of XML documents. In fact, XMLUnit is a veritable XML-testing hat trick: you can use it to validate the structure of an XML document, its contents, and even specific portions of the document.

The simplest thing to do is use XMLUnit to logically compare run-time XML documents with predefined, valid control files. Essentially, this is a difference test: Given an XML document that you know is correct, does the application at run time generate the same thing? It's a relatively simple test, but you can use it to validate the structure and content of an XML document. You can also validate specific content with a little help from XPath.

Delegate, don't inherit!

As a rule of thumb, avoid test-case inheritance whenever possible. Many JUnit extension frameworks, including XMLUnit, offer specialized test cases that can be inherited from to facilitate testing a particular architecture. Test cases that inherit classes from a framework suffer from inflexibility, however, because of the Java platform's single-inheritance paradigm. More often than not, these same JUnit extension frameworks offer a delegation API, which makes it easy to combine various frameworks without taking on a rigid inheritance structure.

Validating content

You can utilize XMLUnit through delegation or inheritance. As a rule of thumb, I recommend avoiding test-case inheritance. On the other hand, inheriting from XMLUnit's XMLTestCase does provide some convenient assertion methods (which aren't static and therefore can't be referenced statically like JUnit's TestCase asserts).

Regardless of how you choose to use XMLUnit, you must initialize XMLUnit's parsers. You can either initialize them through System.setProperty calls or through some handy static methods on the XMLUnit core class.

Once you've properly initialized XMLUnit with the various required parsers, you can use the Diff class, which is the central mechanism for logically comparing two XML documents. In Listing 3, I've improved the testToXML test with a dash of XMLUnit:


Listing 3. An improved testToXML test

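import java.io.File;
import java.io.FileReader;
import java.io.StringReader;
import java.util.Date;
import junit.framework.TestCase;
import org.custommonkey.xmlunit.Diff;
import org.custommonkey.xmlunit.XMLUnit;

public class XMLReportTest extends TestCase {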

 protected void setUp() throws Exception {		 
  XMLUnit.setControlParser(
    "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
  XMLUnit.setTestParser(
    "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
  XMLUnit.setSAXParserFactory(
    "org.apache.xerces.jaxp.SAXParserFactoryImpl");
  XMLUnit.setIgnoreWhitespace(true);   
 }

 private Filter[] getFilters(){
  Filter[] fltrs = new Filter[2];
  fltrs[0] = new RegexPackageFilter("java|org");
  fltrs[1] = new SimplePackageFilter("net.");
  return fltrs;
 }

 private Dependency[] getDependencies(){
  Dependency[] deps = new Dependency[2];
  deps[0] = new Dependency("com.acme.resource.Configuration");
  deps[1] = new Dependency("com.acme.xml.Document");
  return deps;
 }

 // FileReader and Diff throw checked exceptions, so the test declares them.
 public void testToXML() throws Exception {
  BatchDependencyXMLReport report = 
    new BatchDependencyXMLReport(new Date(1165203021718L), 
      this.getFilters());

  report.addTargetAndDependencies(
    "com.acme.web.Widget", this.getDependencies());
  report.addTargetAndDependencies(
    "com.acme.web.Account", this.getDependencies());

  Diff diff = new Diff(new FileReader(
    new File("./test/conf/report-control.xml")),
    new StringReader(report.toXML()));

  assertTrue("XML was not identical", diff.identical());		
 }
}

Notice how my fixture calls XMLUnit's setControlParser, setTestParser, and setSAXParserFactory methods. You can use any JAXP-compliant parser framework for these values. Also note that I call setIgnoreWhitespace with true -- this is a lifesaver, believe me! Otherwise, you'll see a lot of failures when two documents differ only because of inconsistent white space.


Comparisons with Diff

The Diff class supports two types of comparisons: identical and similar. If two compared documents are exactly the same in structure and values (ignoring white space if that flag is set), then they are considered identical; two identical documents are necessarily similar as well. The opposite, however, isn't necessarily true.

For example, Listing 4 shows a simple XML snippet that is logically similar to the XML found in Listing 5; however, they are not identical:


Listing 4. An account XML snippet

<account>
 <id>3A-00</id>
 <name>acme</name>
</account>

The XML snippet in Listing 5 is the same logical document as the one you see in Listing 4. XMLUnit doesn't consider them identical, however, because the name and id elements are swapped.


Listing 5. A similar XML snippet

<account>
 <name>acme</name>
 <id>3A-00</id>
</account>

Accordingly, I can write a test case to verify XMLUnit's behavior, as shown in Listing 6:


Listing 6. A test to verify similar and identical

public void testIdenticalAndSimilar() throws Exception {
 String controlXML = "<account><id>3A-00</id><name>acme</name></account>";
 String testXML = "<account><name>acme</name><id>3A-00</id></account>"; 
 Diff diff = new Diff(controlXML, testXML);
 assertTrue(diff.similar());
 assertFalse(diff.identical());
}

The difference between similar and identical XML documents is subtle; however, the ability to validate both can be quite helpful, such as in testing situations where documents are generated by different applications or clients.
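To make the distinction concrete, here is a minimal JDK-only sketch of the idea. It is not XMLUnit's algorithm: the class SimilarVersusIdentical and its methods are hypothetical, attributes are ignored for brevity, and "similar" is reduced to "same elements and text, in any sibling order."

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration; XMLUnit's own rules are richer than this.
final class SimilarVersusIdentical {

    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    // Order-sensitive comparison of the two trees.
    static boolean identical(String control, String test) {
        return parse(control).isEqualNode(parse(test));
    }

    // Order-insensitive comparison: sibling elements are sorted first.
    static boolean similar(String control, String test) {
        return canonical(parse(control)).equals(canonical(parse(test)));
    }

    // Builds a text form of an element in which child order is normalized.
    static String canonical(Element e) {
        List<String> children = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                children.add(canonical((Element) n));
            } else if (n.getNodeType() == Node.TEXT_NODE) {
                text.append(n.getNodeValue().trim());
            }
        }
        Collections.sort(children);
        return "<" + e.getTagName() + ">" + text + children
            + "</" + e.getTagName() + ">";
    }

    public static void main(String[] args) {
        String control = "<account><id>3A-00</id><name>acme</name></account>";
        String test = "<account><name>acme</name><id>3A-00</id></account>";
        System.out.println(similar(control, test));   // true
        System.out.println(identical(control, test)); // false
    }
}
```

With the two account snippets from Listing 6, the sketch reports the documents as similar but not identical, mirroring the assertions in that test.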


Validating structure

In addition to validating content, you will occasionally need to validate the structure of an XML document. In this case, the values of individual elements and attributes don't matter -- it's the structure you're concerned about.

Fortunately, I can reuse the test case defined in Listing 3 to validate the document's structure, by effectively ignoring element text values and attribute values. I do this by calling overrideDifferenceListener() on the Diff class and providing it with the IgnoreTextAndAttributeValuesDifferenceListener, which is supplied by XMLUnit. The revised test is shown in Listing 7:


Listing 7. Verifying an XML structure without attribute values

public void testToXMLFormatOnly() throws Exception{
 BatchDependencyXMLReport report = 
   new BatchDependencyXMLReport(new Date(), this.getFilters());

 report.addTargetAndDependencies(
   "com.acme.web.Widget", this.getDependencies());
 report.addTargetAndDependencies(
   "com.acme.web.Account", this.getDependencies());
 
 Diff diff = new Diff(new FileReader(
   new File("./test/conf/report-control.xml")),
   new StringReader(report.toXML()));

 diff.overrideDifferenceListener(
   new IgnoreTextAndAttributeValuesDifferenceListener());
 assertTrue("XML was not similar", diff.similar());		
}
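What "structure only" means can be sketched without XMLUnit. The hypothetical StructureOnly class below reduces each document to a skeleton of element names and attribute names, discarding attribute values and text, and then compares the skeletons. It uses only the JDK and is an illustration of the concept, not a description of how the XMLUnit listener is implemented.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration of a structure-only comparison.
final class StructureOnly {

    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    // Element and attribute names only; values and text are dropped.
    static String skeleton(Element e) {
        List<String> attributeNames = new ArrayList<>();
        NamedNodeMap attrs = e.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            attributeNames.add(attrs.item(i).getNodeName());
        }
        Collections.sort(attributeNames);

        StringBuilder sb = new StringBuilder("<").append(e.getTagName())
            .append(attributeNames).append(">");
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                sb.append(skeleton((Element) n));
            }
        }
        return sb.append("</").append(e.getTagName()).append(">").toString();
    }

    static boolean sameStructure(String control, String test) {
        return skeleton(parse(control)).equals(skeleton(parse(test)));
    }
}
```

Two documents that differ only in a name attribute value share a skeleton, whereas a document with a missing Dependency element does not.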

Similar not identical!

When using the IgnoreTextAndAttributeValuesDifferenceListener class, you must assert that two documents are similar, not identical. If you mistakenly call identical, the attribute values will still be compared.

Of course, DTDs and XML schemas facilitate XML structure validation; however, sometimes documents don't reference them -- in these scenarios, structure validation can be helpful. Also, if you need to ignore specific values (such as dates), you can implement the DifferenceListener interface (as IgnoreTextAndAttributeValuesDifferenceListener does) and provide a custom implementation.
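A custom listener needs XMLUnit on the classpath, but the goal of ignoring one volatile value can be sketched with the JDK alone. The hypothetical IgnoreDateAttribute class below takes a different route from a DifferenceListener: it strips the root element's date attribute from both documents and then compares what is left. Treat it as a sketch of the intent, not as a substitute for the XMLUnit interface.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration; not an XMLUnit DifferenceListener.
final class IgnoreDateAttribute {

    // Parses the XML and removes the root element's date attribute.
    static Element parseWithoutDate(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            root.removeAttribute("date"); // no-op when the attribute is absent
            return root;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    // Compares two reports while disregarding the report date.
    static boolean equalIgnoringDate(String control, String test) {
        return parseWithoutDate(control).isEqualNode(parseWithoutDate(test));
    }
}
```

With this in place, two reports generated at different times compare as equal as long as everything other than the date attribute matches.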

XMLUnit with XPath

To complete the XML testing hat trick, XMLUnit facilitates validating specific portions of an XML document with XPath.

For example, using the same document format from Listing 1, I'd like to validate that the first Class element's name attribute value generated by my application is com.acme.web.Widget. To do so, I create an XPath expression that navigates to that precise location and pass it to assertXpathExists(), a handy method provided by XMLUnit's XMLTestCase; this means my test class must now extend XMLTestCase. Listing 8 shows the test:


Listing 8. Using XPath to validate precise XML values

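// Assumes the enclosing test class extends XMLTestCase.
// Renamed so it doesn't clash with the test in Listing 7.
public void testToXMLFirstClassName() throws Exception{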
 BatchDependencyXMLReport report = 
   new BatchDependencyXMLReport(new Date(), this.getFilters());

 report.addTargetAndDependencies(
   "com.acme.web.Widget", this.getDependencies());
 report.addTargetAndDependencies(
   "com.acme.web.Account", this.getDependencies());
 
 assertXpathExists("//Class[1][@name='com.acme.web.Widget']", 
  report.toXML());	
}

As you can see in Listing 8, XMLUnit, in concert with XPath, provides a handy mechanism for validating precise aspects of an XML document rather than doing a large difference test. Keep in mind that to take advantage of XPath in XMLUnit, your test cases must extend XMLTestCase. Familiarity with XPath also helps!

XWhat?

XPath or the XML Path Language is an expression language for addressing portions of an XML document based on a tree representation. XPath allows you to navigate an XML document and facilitates selecting document values.
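If XPath is new to you, it can help to try expressions outside of a test first. The sketch below evaluates the expression from Listing 8 with the JDK's javax.xml.xpath API; the XPathCheck class, its helper methods, and the shortened REPORT constant are assumptions made for this example and are not XMLUnit code.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for experimenting with XPath expressions.
final class XPathCheck {

    // A shortened version of the report in Listing 1.
    static final String REPORT =
        "<DependencyReport date=\"Sun Dec 03 22:30:21 EST 2006\">"
        + "<Class name=\"com.acme.web.Widget\"/>"
        + "<Class name=\"com.acme.web.Account\"/>"
        + "</DependencyReport>";

    // True when the expression selects at least one node.
    static boolean exists(String expression, String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Boolean) xpath.evaluate(expression,
                new InputSource(new StringReader(xml)),
                XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + expression, e);
        }
    }

    // Returns the string value of the first node the expression selects.
    static String value(String expression, String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression,
                new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // The expression used in Listing 8.
        System.out.println(
            exists("//Class[1][@name='com.acme.web.Widget']", REPORT)); // true
        // Address the second Class element's name attribute directly.
        System.out.println(
            value("/DependencyReport/Class[2]/@name", REPORT)); // com.acme.web.Account
    }
}
```

Keep in mind that XPath positions are 1-based, so Class[1] is the first Class element under its parent, not the second.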

Why work harder?

XMLUnit is an open source Java-based tool that makes testing XML documents much easier and more flexible than anything you could do with String comparisons. The only possible downside of using XMLUnit for difference testing is that the tests will rely on a file system to load the control document. Account for this added dependency when you write your tests.

While XMLUnit hasn't released any new updates in some time, its current set of features is robust enough to provide plenty of bang for the testing buck -- which, in this case, is basically free!


Resources

Get products and technologies

  • Download JUnit: Find out what's new with JUnit 4.

  • Download TestNG: A powerful, easy-to-use testing framework inspired by JUnit and NUnit.

  • Download XMLUnit: A JUnit extension framework that facilitates developer testing of XML documents.
