In pursuit of code quality

Discover XMLUnit

A JUnit extension framework for testing XML documents

From time to time in the software development cycle, you need to verify the structure or content of XML documents. No matter what type of application you're building, testing XML documents presents some challenges, especially without tools to facilitate the process.

This month, I'll first show you why you don't want to use String comparisons to verify the structure and content of XML documents. Then I'll introduce XMLUnit, an XML validation tool created by and for Java developers, and show you how to use it to validate XML documents.

Good old String comparisons

To get started, let's imagine you've built an application that outputs an XML document representing an object-dependency report. For a given collection of classes and corresponding filters, a report is generated that outputs a class and its class dependencies (think imports).

Listing 1 shows the report for a given list of classes, com.acme.web.Widget and com.acme.web.Account, with filters set to ignore outside classes such as java.lang.String:

Listing 1. A sample dependency XML report
<DependencyReport date="Sun Dec 03 22:30:21 EST 2006">
  <FiltersApplied>
    <Filter pattern="java|org"/>
    <Filter pattern="net."/>
  </FiltersApplied>
  <Class name="com.acme.web.Widget">
    <Dependency name="com.acme.resource.Configuration"/>
    <Dependency name="com.acme.xml.Document"/>
  </Class>
  <Class name="com.acme.web.Account">
    <Dependency name="com.acme.resource.Configuration"/>
    <Dependency name="com.acme.xml.Document"/>
  </Class>
</DependencyReport>

Listing 1 is obviously generated by an application; consequently, the first level of testing is to verify the application actually can generate a document. Once that's been verified, you'll want to test at least three aspects of the specific document:

  • Structure
  • Content
  • Specific content

You can handle the first two aspects with JUnit alone using String comparisons, as shown in Listing 2:

Listing 2. Validating XML the hard way
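import java.util.Date;
import junit.framework.TestCase;

// The report classes (BatchDependencyXMLReport, Filter, Dependency, and so on)
// are assumed to live in the same package as this test.
public class XMLReportTest extends TestCase {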

 private Filter[] getFilters(){
  Filter[] fltrs = new Filter[2];
  fltrs[0] = new RegexPackageFilter("java|org");
  fltrs[1] = new SimplePackageFilter("net.");
  return fltrs;
 }

 private Dependency[] getDependencies(){
  Dependency[] deps = new Dependency[2];
  deps[0] = new Dependency("com.acme.resource.Configuration");
  deps[1] = new Dependency("com.acme.xml.Document");
  return deps;
 }

 public void testToXML() {
  Date now = new Date();
  BatchDependencyXMLReport report = 
   new BatchDependencyXMLReport(now, this.getFilters());

  report.addTargetAndDependencies(
    "com.acme.web.Widget", this.getDependencies());
  report.addTargetAndDependencies(
    "com.acme.web.Account", this.getDependencies());

  // Expected output, hand-built as one long String; any formatting change breaks it
  String valid = "<DependencyReport date=\"" + now.toString() + "\">"+
    "<FiltersApplied><Filter pattern=\"java|org\" /><Filter pattern=\"net.\" />"+
    "</FiltersApplied><Class name=\"com.acme.web.Widget\">" +
    "<Dependency name=\"com.acme.resource.Configuration\" />"+
    "<Dependency name=\"com.acme.xml.Document\" /></Class>"+
    "<Class name=\"com.acme.web.Account\">"+
    "<Dependency name=\"com.acme.resource.Configuration\" />"+
    "<Dependency name=\"com.acme.xml.Document\" />"+
    "</Class></DependencyReport>";

  assertEquals("report didn't match xml", valid, report.toXML());
 }
}

The test in Listing 2 has some major drawbacks -- and not just the hard-coded String comparisons, either. First, the test isn't exactly readable. Second, it's amazingly brittle; should the format of the XML document change (including the addition of white space), you would be better off pasting in a new copy of the document than attempting to fix the String itself. Finally, the nature of the test forces you to contend with the Date aspect, even though you probably don't care about it.

What if you wanted to ensure that the second Class element's name value in the document was com.acme.web.Account? Sure, you could use regular expressions or String searches, but that would be too much work. Wouldn't it make more sense to manipulate the DOM directly using a parsing framework?
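To make that question concrete, here is a minimal sketch of the DOM approach using only the JAXP parser that ships with the JDK (not XMLUnit). The class name ClassNameLookup and its helper method are hypothetical, invented for illustration; the sample document is a trimmed-down version of Listing 1.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of the report application or of XMLUnit.
class ClassNameLookup {

    // Returns the name attribute of the Class element at the given
    // zero-based position, counting in document order.
    static String classNameAt(String xml, int index) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse report XML", e);
        }
        NodeList classes = doc.getElementsByTagName("Class");
        if (index < 0 || index >= classes.getLength()) {
            throw new IllegalArgumentException("no Class element at position " + index);
        }
        return ((Element) classes.item(index)).getAttribute("name");
    }

    public static void main(String[] args) {
        String report =
            "<DependencyReport date=\"Sun Dec 03 22:30:21 EST 2006\">"
            + "<Class name=\"com.acme.web.Widget\"/>"
            + "<Class name=\"com.acme.web.Account\"/>"
            + "</DependencyReport>";
        System.out.println(classNameAt(report, 1)); // prints com.acme.web.Account
    }
}
```

This works, but every such check means more hand-written parsing and traversal code in the test, which is the kind of effort a dedicated testing library is meant to remove.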

Testing with XMLUnit

When you get that feeling that you're working too hard, you can usually assume someone else has figured out an easier way to solve the problem. When it comes to programmatically verifying XML documents, the solution that comes to mind is XMLUnit.

XMLUnit is a JUnit extension framework that facilitates developer testing of XML documents. In fact, XMLUnit is a veritable XML-testing hat trick: you can use it to validate the structure of an XML document, its contents, and even specific portions of the document.

The simplest thing to do is use XMLUnit to logically compare run-time XML documents with predefined, valid control files. Essentially, this is a difference test: Given an XML document that you know is correct, does the application at run time generate the same thing? It's a relatively simple test, but you can use it to validate the structure and content of an XML document. You can also validate specific content with a little help from XPath.

Validating content

You can utilize XMLUnit through delegation or inheritance. As a rule of thumb, I recommend avoiding test-case inheritance. On the other hand, inheriting from XMLUnit's XMLTestCase does provide some convenient assertion methods (which aren't static and therefore can't be referenced statically like JUnit's TestCase asserts).

Regardless of how you choose to use XMLUnit, you must initialize XMLUnit's parsers. You can either initialize them through System.setProperty calls or through some handy static methods on the XMLUnit core class.

Once you've properly initialized XMLUnit with the various required parsers, you can use the Diff class, which is the central mechanism for logically comparing two XML documents. In Listing 3, I've improved the testToXML test with a dash of XMLUnit:

Listing 3. An improved testToXML test
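import java.io.File;
import java.io.FileReader;
import java.io.StringReader;
import java.util.Date;
import junit.framework.TestCase;
import org.custommonkey.xmlunit.Diff;
import org.custommonkey.xmlunit.XMLUnit;

public class XMLReportTest extends TestCase {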

 protected void setUp() throws Exception {		 
  XMLUnit.setControlParser(
    "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
  XMLUnit.setTestParser(
    "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
  XMLUnit.setSAXParserFactory(
    "org.apache.xerces.jaxp.SAXParserFactoryImpl");
  XMLUnit.setIgnoreWhitespace(true);   
 }

 private Filter[] getFilters(){
  Filter[] fltrs = new Filter[2];
  fltrs[0] = new RegexPackageFilter("java|org");
  fltrs[1] = new SimplePackageFilter("net.");
  return fltrs;
 }

 private Dependency[] getDependencies(){
  Dependency[] deps = new Dependency[2];
  deps[0] = new Dependency("com.acme.resource.Configuration");
  deps[1] = new Dependency("com.acme.xml.Document");
  return deps;
 }

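 // FileReader and Diff throw checked exceptions, so the test must declare them
 public void testToXML() throws Exception {
  // A fixed timestamp lets the date attribute match the control file
  BatchDependencyXMLReport report = 
    new BatchDependencyXMLReport(new Date(1165203021718L), 
      this.getFilters());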

  report.addTargetAndDependencies(
    "com.acme.web.Widget", this.getDependencies());
  report.addTargetAndDependencies(
    "com.acme.web.Account", this.getDependencies());

  Diff diff = new Diff(new FileReader(
    new File("./test/conf/report-control.xml")),
    new StringReader(report.toXML()));

  assertTrue("XML was not identical", diff.identical());		
 }
}

Notice how my fixture calls XMLUnit's setControlParser, setTestParser, and setSAXParserFactory methods. You can use any JAXP-compliant parser framework for these values. Also note that I call setIgnoreWhitespace with true -- this is a lifesaver, believe me! Otherwise, you'll see a lot of failures caused by nothing more than inconsistent white space between the two documents.
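To see why white space causes trouble, consider the rough sketch below. It uses plain JAXP from the JDK rather than XMLUnit, and the class WhitespaceSketch and its methods are hypothetical names of my own. A DOM parser keeps the line breaks and indentation between elements as text nodes, so a pretty-printed document and a compact one differ node for node until those blank text nodes are removed. This only approximates the idea behind setIgnoreWhitespace; it is not how XMLUnit implements it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of XMLUnit.
class WhitespaceSketch {

    // Parses an XML string and returns its root element.
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    // Recursively removes text nodes that contain nothing but white space.
    static void stripBlankText(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripBlankText(child);
            }
            child = next;
        }
    }

    // Compares the two trees exactly as the parser produced them.
    static boolean equalAsParsed(String control, String test) {
        return root(control).isEqualNode(root(test));
    }

    // Compares the two trees after dropping blank text nodes from both.
    static boolean equalIgnoringBlankText(String control, String test) {
        Element left = root(control);
        Element right = root(test);
        stripBlankText(left);
        stripBlankText(right);
        return left.isEqualNode(right);
    }

    public static void main(String[] args) {
        String compact = "<account><id>3A-00</id><name>acme</name></account>";
        String pretty = "<account>\n <id>3A-00</id>\n <name>acme</name>\n</account>";
        System.out.println(equalAsParsed(compact, pretty));          // prints false
        System.out.println(equalIgnoringBlankText(compact, pretty)); // prints true
    }
}
```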

Comparisons with Diff

The Diff class supports two types of comparisons: identical and similar. If two compared documents are exactly the same in structure and values (ignoring white space if that flag is set), then they are considered identical; two identical documents are always similar as well. The converse, however, isn't necessarily true.

For example, Listing 4 shows a simple XML snippet that is logically similar to the XML found in Listing 5; however, they are not identical:

Listing 4. An account XML snippet
<account>
 <id>3A-00</id>
 <name>acme</name>
</account>

The XML snippet in Listing 5 is the same logical document as the one you see in Listing 4. XMLUnit doesn't consider them identical, however, because the name and id elements are swapped.

Listing 5. A similar XML snippet
<account>
 <name>acme</name>
 <id>3A-00</id>
</account>

Accordingly, I can write a test case to verify XMLUnit's behavior, as shown in Listing 6:

Listing 6. A test to verify similar and identical
public void testIdenticalAndSimilar() throws Exception {
 String controlXML = "<account><id>3A-00</id><name>acme</name></account>";
 String testXML = "<account><name>acme</name><id>3A-00</id></account>"; 
 Diff diff = new Diff(controlXML, testXML);
 assertTrue(diff.similar());
 assertFalse(diff.identical());
}

The difference between similar and identical XML documents is subtle; however, the ability to validate both can be quite helpful, such as in testing situations where documents are generated by different applications or clients.
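One way to build intuition for the distinction is the simplified sketch below, which again uses only JAXP from the JDK. It treats two documents as "identical" when their elements match in order and as "similar" when they match once sibling order is disregarded. The class SimilarVersusIdentical and its methods are hypothetical, and this is a deliberate simplification: XMLUnit's own rules for what counts as a recoverable difference cover more cases than sibling order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not XMLUnit's algorithm.
class SimilarVersusIdentical {

    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    // Renders an element as a string. When sortSiblings is true, child
    // elements are sorted so their original order no longer matters.
    static String render(Element element, boolean sortSiblings) {
        List<String> attributes = new ArrayList<String>();
        NamedNodeMap map = element.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            Node attribute = map.item(i);
            attributes.add(attribute.getNodeName() + "=" + attribute.getNodeValue());
        }
        Collections.sort(attributes); // attribute order is never significant

        List<String> children = new ArrayList<String>();
        StringBuilder text = new StringBuilder();
        for (Node child = element.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                children.add(render((Element) child, sortSiblings));
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                text.append(child.getNodeValue().trim());
            }
        }
        if (sortSiblings) {
            Collections.sort(children);
        }
        return element.getTagName() + attributes + "{" + text + children + "}";
    }

    static boolean identical(String control, String test) {
        return render(root(control), false).equals(render(root(test), false));
    }

    static boolean similar(String control, String test) {
        return render(root(control), true).equals(render(root(test), true));
    }

    public static void main(String[] args) {
        String control = "<account><id>3A-00</id><name>acme</name></account>";
        String test = "<account><name>acme</name><id>3A-00</id></account>";
        System.out.println(similar(control, test));   // prints true
        System.out.println(identical(control, test)); // prints false
    }
}
```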

Validating structure

In addition to validating content, you will occasionally need to validate the structure of an XML document. In this case, the values of individual elements and attributes don't matter -- it's the structure you're concerned about.

Fortunately, I can reuse the test case defined in Listing 3 to validate the document's structure by effectively ignoring element text values and attribute values. I do this by calling overrideDifferenceListener() on the Diff instance and providing it with the IgnoreTextAndAttributeValuesDifferenceListener, which is supplied by XMLUnit. The revised test is shown in Listing 7:

Listing 7. Verifying an XML structure without attribute values
public void testToXMLFormatOnly() throws Exception{
 BatchDependencyXMLReport report = 
   new BatchDependencyXMLReport(new Date(), this.getFilters());

 report.addTargetAndDependencies(
   "com.acme.web.Widget", this.getDependencies());
 report.addTargetAndDependencies(
   "com.acme.web.Account", this.getDependencies());
 
 Diff diff = new Diff(new FileReader(
   new File("./test/conf/report-control.xml")),
   new StringReader(report.toXML()));

 diff.overrideDifferenceListener(
   new IgnoreTextAndAttributeValuesDifferenceListener());
 assertTrue("XML was not similar", diff.similar());		
}

Of course, DTDs and XML schemas facilitate XML structure validation; however, sometimes documents don't reference them -- in these scenarios, structure validation can be helpful. Also, if you need to ignore specific values (such as Dates), you can implement the DifferenceListener interface (as IgnoreTextAndAttributeValuesDifferenceListener does) and provide a custom implementation.
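The idea of a structure-only comparison can be sketched without XMLUnit as well. The JAXP-based example below reduces each document to a "shape" made of element names, attribute names, and nesting, leaving out every text and attribute value, and then compares the shapes. The class StructureOnlySketch and its methods are hypothetical, and this is a stand-in for the concept rather than a picture of how the DifferenceListener mechanism works internally.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of XMLUnit.
class StructureOnlySketch {

    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    // Describes an element by its tag name, its attribute names, and the
    // shapes of its child elements in order. Values are left out on purpose.
    static String shape(Element element) {
        Set<String> attributeNames = new TreeSet<String>();
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            attributeNames.add(attributes.item(i).getNodeName());
        }
        StringBuilder out = new StringBuilder(element.getTagName())
            .append(attributeNames)
            .append('(');
        for (Node child = element.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                out.append(shape((Element) child));
            }
        }
        return out.append(')').toString();
    }

    static boolean sameStructure(String control, String test) {
        return shape(root(control)).equals(shape(root(test)));
    }

    public static void main(String[] args) {
        String control = "<DependencyReport date=\"Sun Dec 03 22:30:21 EST 2006\">"
            + "<Class name=\"com.acme.web.Widget\"/></DependencyReport>";
        String later = "<DependencyReport date=\"Mon Jan 01 09:00:00 EST 2007\">"
            + "<Class name=\"com.acme.web.Account\"/></DependencyReport>";
        System.out.println(sameStructure(control, later)); // prints true
    }
}
```

Because the date attribute's value never enters the shape, a test built this way does not need a fixed timestamp.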

XMLUnit with XPath

To complete the XML testing hat trick, XMLUnit facilitates validating specific portions of an XML document with XPath.

For example, using the same document format from Listing 1, I'd like to validate that the first Class element's name attribute value generated by my application is com.acme.web.Widget. To do so, I must create an XPath expression that navigates to the precise location. XMLUnit's XMLTestCase provides a handy assertXpathExists() method for exactly this, which means I must now extend XMLTestCase.

Listing 8. Using XPath to validate precise XML values
// Assumes the enclosing test class extends XMLTestCase
public void testToXMLFirstClassName() throws Exception{
 BatchDependencyXMLReport report = 
   new BatchDependencyXMLReport(new Date(), this.getFilters());

 report.addTargetAndDependencies(
   "com.acme.web.Widget", this.getDependencies());
 report.addTargetAndDependencies(
   "com.acme.web.Account", this.getDependencies());
 
 assertXpathExists("//Class[1][@name='com.acme.web.Widget']", 
  report.toXML());	
}

As you can see in Listing 8, XMLUnit, in concert with XPath, provides a handy mechanism for validating precise aspects of an XML document rather than doing a large difference test. Keep in mind that to take advantage of XPath in XMLUnit, your test cases must extend XMLTestCase. Familiarity with XPath also helps!
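If XPath is new to you, it can help to try an expression outside of a test first. The sketch below evaluates the same expression from Listing 8 with the javax.xml.xpath API that ships with the JDK. The class XPathExistsSketch and its xpathExists method are hypothetical names chosen for illustration, and the sample document is a shortened version of Listing 1; this is not the code behind assertXpathExists().

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of XMLUnit.
class XPathExistsSketch {

    // Returns true when the expression selects at least one node in the XML.
    static boolean xpathExists(String expression, String xml) {
        try {
            NodeList matches = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            return matches.getLength() > 0;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String report =
            "<DependencyReport date=\"Sun Dec 03 22:30:21 EST 2006\">"
            + "<Class name=\"com.acme.web.Widget\"/>"
            + "<Class name=\"com.acme.web.Account\"/>"
            + "</DependencyReport>";
        // The first Class element is Widget, so this prints true
        System.out.println(
            xpathExists("//Class[1][@name='com.acme.web.Widget']", report));
        // Account is the second Class element, not the first, so this prints false
        System.out.println(
            xpathExists("//Class[1][@name='com.acme.web.Account']", report));
    }
}
```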

Why work harder?

XMLUnit is an open source Java-based tool that makes testing XML documents much easier and more flexible than anything you could do with String comparisons. The only possible downside of using XMLUnit for difference testing is that the tests will rely on a file system to load the control document. Account for this added dependency when you write your tests.

While XMLUnit hasn't released any new updates in some time, its current set of features is robust enough to provide plenty of bang for the testing buck -- which, in this case, is basically free!


Related topics

  • "Use test categorization for agile builds" (Andrew Glover, developerWorks, October 2006): Andrew Glover reveals the three categories of testing needed to ensure end-to-end system soundness and then shows you how to automatically sort and run tests by category.
  • "The Java XPath API" (Elliotte Rusty Harold, developerWorks, July 2006): Elliotte Harold demonstrates Java 5's new XPath API.
  • "Get started with XPath" (Bertrand Portier, developerWorks, May 2004): Learn more about XPath.
  • "Verifying XML -- the easy way" (Andrew Glover, testearly.com, March 2006): A short introduction to XMLUnit.
  • In pursuit of code quality series (Andrew Glover, developerWorks): Learn more about code metrics, test frameworks, and writing quality-focused code.
  • Download JUnit: Find out what's new with JUnit 4.
  • Download TestNG: A powerful, easy-to-use testing framework inspired by JUnit and NUnit.
  • Download XMLUnit: A JUnit extension framework that facilitates developer testing of XML documents.

In pursuit of code quality: Discover XMLUnit (published December 19, 2006)