In pursuit of code quality: Discover XMLUnit

A JUnit extension framework for testing XML documents

Java™ developers are natural problem solvers, so it makes sense that someone has come up with an easier way to validate XML documents. This month, Andrew introduces XMLUnit, a JUnit extension framework that meets all your XML validation needs.

Andrew Glover, President, Stelligent Incorporated

Andrew Glover is president of Stelligent Incorporated, which helps companies address software quality with effective developer testing strategies and continuous integration techniques that enable teams to monitor code quality early and often. Check out Andy's blog for a list of his publications.



19 December 2006


From time to time in the software development cycle, you need to verify the structure or content of XML documents. No matter what type of applications you're building, testing XML documents presents some challenges, especially without tools to facilitate the process.

This month, I'll first show you why you don't want to use String comparisons to verify the structure and content of XML documents. Then I'll introduce XMLUnit, an XML validation tool created by and for Java developers, and show you how to use it to validate XML documents.

Improve your code quality

Don't miss Andrew's accompanying discussion forum for assistance with code metrics, test frameworks, and writing quality-focused code.

Good old String comparisons

To get started, let's imagine you've built an application that outputs an XML document representing an object-dependency report. For a given collection of classes and corresponding filters, a report is generated that outputs a class and its class dependencies (think imports).

Listing 1 shows the report for a given list of classes, com.acme.web.Widget and com.acme.web.Account, with filters set to ignore outside classes such as java.lang.String:

Listing 1. A sample dependency XML report
<DependencyReport date="Sun Dec 03 22:30:21 EST 2006">
  <FiltersApplied>
    <Filter pattern="java|org"/>
    <Filter pattern="net."/>
  </FiltersApplied>
  <Class name="com.acme.web.Widget">
    <Dependency name="com.acme.resource.Configuration"/>
    <Dependency name="com.acme.xml.Document"/>
  </Class>
  <Class name="com.acme.web.Account">
    <Dependency name="com.acme.resource.Configuration"/>
    <Dependency name="com.acme.xml.Document"/>
  </Class>
</DependencyReport>

Listing 1 is obviously generated by an application; consequently, the first level of testing is to verify the application actually can generate a document. Once that's been verified, you'll want to test at least three aspects of the specific document:

  • Structure
  • Content
  • Specific content

You can handle the first two aspects with JUnit alone using String comparisons, as shown in Listing 2:

Listing 2. Validating XML the hard way
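import java.util.Date;

import junit.framework.TestCase;

// Filter, Dependency, and BatchDependencyXMLReport belong to the application under test
public class XMLReportTest extends TestCase {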

 private Filter[] getFilters(){
  Filter[] fltrs = new Filter[2];
  fltrs[0] = new RegexPackageFilter("java|org");
  fltrs[1] = new SimplePackageFilter("net.");
  return fltrs;
 }

 private Dependency[] getDependencies(){
  Dependency[] deps = new Dependency[2];
  deps[0] = new Dependency("com.acme.resource.Configuration");
  deps[1] = new Dependency("com.acme.xml.Document");
  return deps;
 }

 public void testToXML() {
  Date now = new Date();
  BatchDependencyXMLReport report = 
   new BatchDependencyXMLReport(now, this.getFilters());

  report.addTargetAndDependencies(
    "com.acme.web.Widget", this.getDependencies());
  report.addTargetAndDependencies(
    "com.acme.web.Account", this.getDependencies());

  String valid = "<DependencyReport date=\"" + now.toString() + "\">"+
    "<FiltersApplied><Filter pattern=\"java|org\" /><Filter pattern=\"net.\" />"+
    "</FiltersApplied><Class name=\"com.acme.web.Widget\">" +
    " <Dependency name=\"com.acme.resource.Configuration\" />"+
    "<Dependency name=\"com.acme.xml.Document\" /></Class>"+
    "<Class name=\"com.acme.web.Account\">"+
    "<Dependency name=\"com.acme.resource.Configuration\" />"+
    "<Dependency name=\"com.acme.xml.Document\" />"+
    "</Class></DependencyReport>";

   assertEquals("report didn't match xml", valid, report.toXML());
 }
}

The test in Listing 2 has some major drawbacks -- and not just the hard-coded String comparisons, either. First, the test isn't exactly readable. Second, it's amazingly brittle; should the format of the XML document change (including the addition of white space), you would be better off pasting in a new copy of the document than attempting to fix the String itself. Finally, the nature of the test forces you to contend with the Date aspect, even though you probably don't care about it.

What if you wanted to ensure that the second Class element's name value in the document was com.acme.web.Account? Sure, you could use regular expressions or String searches, but that would be too much work. Wouldn't it make more sense to manipulate the DOM directly using a parsing framework?
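
To make that alternative concrete, here is a minimal sketch of the same check done with the JDK's own DOM parser (JAXP) instead of String searches. The ClassNameLookup class and its classNameAt method are hypothetical names used only for illustration; they are not part of the report application or of XMLUnit.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only
class ClassNameLookup {

    // Returns the name attribute of the Class element at the given
    // zero-based position (document order), or null if there is none.
    static String classNameAt(String xml, int index) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList classes = doc.getElementsByTagName("Class");
            if (index < 0 || index >= classes.getLength()) {
                return null;
            }
            return ((Element) classes.item(index)).getAttribute("name");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse report XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<DependencyReport date=\"Sun Dec 03 22:30:21 EST 2006\">"
            + "<Class name=\"com.acme.web.Widget\"/>"
            + "<Class name=\"com.acme.web.Account\"/>"
            + "</DependencyReport>";
        // Index 1 is the second Class element
        System.out.println(classNameAt(xml, 1));
    }
}
```

This works, but every new check means more hand-written parsing and navigation code in the test itself.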

XMLUnit with TestNG?

XMLUnit is a JUnit extension, but that doesn't necessarily mean you can't use it within TestNG. You can incorporate almost any framework into TestNG so long as it has an API that supports delegation and is not decorator based.

Testing with XMLUnit

When you get that feeling that you're working too hard, you can usually assume someone else has figured out an easier way to solve the problem. When it comes to programmatically verifying XML documents, the solution that comes to mind is XMLUnit.

XMLUnit is a JUnit extension framework that facilitates developer testing of XML documents. In fact, XMLUnit is a veritable XML-testing hat trick: you can use it to validate the structure of an XML document, its contents, and even specific portions of the document.

The simplest thing to do is use XMLUnit to logically compare run-time XML documents with predefined, valid control files. Essentially, this is a difference test: Given an XML document that you know is correct, does the application at run time generate the same thing? It's a relatively simple test, but you can use it to validate the structure and content of an XML document. You can also validate specific content with a little help from XPath.

Delegate, don't inherit!

As a rule of thumb, avoid test-case inheritance whenever possible. Many JUnit extension frameworks, including XMLUnit, offer specialized test cases that can be inherited from to facilitate testing a particular architecture. Test cases that inherit classes from a framework suffer from inflexibility, however, because of the Java platform's single-inheritance paradigm. More often than not, these same JUnit extension frameworks offer a delegation API, which makes it easy to combine various frameworks without taking on a rigid inheritance structure.

Validating content

You can utilize XMLUnit through delegation or inheritance. As a rule of thumb, I recommend avoiding test-case inheritance. On the other hand, inheriting from XMLUnit's XMLTestCase does provide some convenient assertion methods (which aren't static and therefore can't be referenced statically like JUnit's TestCase asserts).

Regardless of how you choose to use XMLUnit, you must initialize XMLUnit's parsers. You can either initialize them through System.setProperty calls or through some handy static methods on the XMLUnit core class.

Once you've properly initialized XMLUnit with the various required parsers, you can use the Diff class, which is the central mechanism for logically comparing two XML documents. In Listing 3, I've improved the testToXML test with a dash of XMLUnit:

Listing 3. An improved testToXML test
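import java.io.File;
import java.io.FileReader;
import java.io.StringReader;
import java.util.Date;

import junit.framework.TestCase;

import org.custommonkey.xmlunit.Diff;
import org.custommonkey.xmlunit.XMLUnit;

public class XMLReportTest extends TestCase {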

 protected void setUp() throws Exception {		 
  XMLUnit.setControlParser(
    "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
  XMLUnit.setTestParser(
    "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
  XMLUnit.setSAXParserFactory(
    "org.apache.xerces.jaxp.SAXParserFactoryImpl");
  XMLUnit.setIgnoreWhitespace(true);   
 }

 private Filter[] getFilters(){
  Filter[] fltrs = new Filter[2];
  fltrs[0] = new RegexPackageFilter("java|org");
  fltrs[1] = new SimplePackageFilter("net.");
  return fltrs;
 }

 private Dependency[] getDependencies(){
  Dependency[] deps = new Dependency[2];
  deps[0] = new Dependency("com.acme.resource.Configuration");
  deps[1] = new Dependency("com.acme.xml.Document");
  return deps;
 }

 public void testToXML() throws Exception {
  // A fixed timestamp keeps the report's date attribute the same on every run
  BatchDependencyXMLReport report = 
    new BatchDependencyXMLReport(new Date(1165203021718L), 
      this.getFilters());

  report.addTargetAndDependencies(
    "com.acme.web.Widget", this.getDependencies());
  report.addTargetAndDependencies(
    "com.acme.web.Account", this.getDependencies());

  Diff diff = new Diff(new FileReader(
    new File("./test/conf/report-control.xml")),
    new StringReader(report.toXML()));

  assertTrue("XML was not identical", diff.identical());		
 }
}

Notice how my fixture calls XMLUnit's setControlParser, setTestParser, and setSAXParserFactory methods. You can use any JAXP-compliant parser framework for these values. Also note that I call setIgnoreWhitespace with true -- this is a lifesaver, believe me! Otherwise, you'll see a lot of failures when two documents differ only because of inconsistent white space.
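
To see where those white space failures come from, the following sketch counts the child nodes a DOM parser reports under the root element. It uses plain JAXP rather than XMLUnit, and the WhitespaceNodes class is a hypothetical name chosen for this illustration. Indentation between elements shows up as extra text nodes, so a pretty-printed document and a compact one don't produce the same node tree unless white space is ignored.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only
class WhitespaceNodes {

    // Number of direct child nodes (elements and text nodes) under the root
    static int rootChildCount(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String compact = "<account><id>3A-00</id><name>acme</name></account>";
        String indented =
            "<account>\n <id>3A-00</id>\n <name>acme</name>\n</account>";
        // Two element children only
        System.out.println(rootChildCount(compact));   // prints 2
        // Two elements plus three white space text nodes
        System.out.println(rootChildCount(indented));  // prints 5
    }
}
```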


Comparisons with Diff

The Diff class supports two types of comparisons: identical and similar. If two compared documents are exactly the same in structure and values (ignoring white space if that flag is set), then they are considered identical. Documents are similar when they differ only in ways XMLUnit treats as recoverable, such as the order of sibling elements. Two identical documents are therefore always similar as well; the converse, however, isn't necessarily true.

For example, Listing 4 shows a simple XML snippet that is logically similar to the XML found in Listing 5; however, they are not identical:

Listing 4. An account XML snippet
<account>
 <id>3A-00</id>
 <name>acme</name>
</account>

The XML snippet in Listing 5 is the same logical document as the one you see in Listing 4. XMLUnit doesn't consider them identical, however, because the name and id elements are swapped.

Listing 5. A similar XML snippet
<account>
 <name>acme</name>
 <id>3A-00</id>
</account>

Accordingly, I can write a test case to verify XMLUnit's behavior, as shown in Listing 6:

Listing 6. A test to verify similar and identical
public void testIdenticalAndSimilar() throws Exception {
 String controlXML = "<account><id>3A-00</id><name>acme</name></account>";
 String testXML = "<account><name>acme</name><id>3A-00</id></account>"; 
 Diff diff = new Diff(controlXML, testXML);
 assertTrue(diff.similar());
 assertFalse(diff.identical());
}

The difference between similar and identical XML documents is subtle; however, the ability to validate both can be quite helpful, such as in testing situations where documents are generated by different applications or clients.


Validating structure

In addition to validating content, you will occasionally need to validate the structure of an XML document. In this case, the values of individual elements and attributes don't matter -- it's the structure you're concerned about.
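
As a rough illustration of what "structure only" means, the sketch below reduces a document to its element names and attribute names and throws away every text and attribute value. It uses plain JAXP, the StructureSignature class is a hypothetical name, and it is not how XMLUnit implements the comparison; it only shows the idea that two documents with different values can share one structure.

```java
import java.io.StringReader;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only
class StructureSignature {

    // Builds a string from element and attribute names, ignoring all values
    static String of(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            append(doc.getDocumentElement(), out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    private static void append(Element element, StringBuilder out) {
        out.append('<').append(element.getTagName());
        // Sort attribute names so their order in the source doesn't matter
        NamedNodeMap attributes = element.getAttributes();
        TreeSet<String> names = new TreeSet<String>();
        for (int i = 0; i < attributes.getLength(); i++) {
            names.add(attributes.item(i).getNodeName());
        }
        for (String name : names) {
            out.append(" @").append(name);
        }
        out.append('>');
        // Recurse into child elements; text nodes are skipped entirely
        for (Node child = element.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child instanceof Element) {
                append((Element) child, out);
            }
        }
        out.append("</>");
    }

    public static void main(String[] args) {
        String a = "<Class name=\"com.acme.web.Widget\">"
            + "<Dependency name=\"com.acme.xml.Document\"/></Class>";
        String b = "<Class name=\"other\"><Dependency name=\"different\"/></Class>";
        // Same structure, different values
        System.out.println(of(a).equals(of(b)));  // prints true
    }
}
```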

Fortunately, I can reuse the test case defined in Listing 3 to validate the document's structure, by effectively ignoring element text values and attribute values. I do this by calling overrideDifferenceListener() on the Diff class and providing it with the IgnoreTextAndAttributeValuesDifferenceListener, which is supplied by XMLUnit. The revised test is shown in Listing 7:

Listing 7. Verifying an XML structure without attribute values
public void testToXMLFormatOnly() throws Exception{
 BatchDependencyXMLReport report = 
   new BatchDependencyXMLReport(new Date(), this.getFilters());

 report.addTargetAndDependencies(
   "com.acme.web.Widget", this.getDependencies());
 report.addTargetAndDependencies(
   "com.acme.web.Account", this.getDependencies());
 
 Diff diff = new Diff(new FileReader(
   new File("./test/conf/report-control.xml")),
   new StringReader(report.toXML()));

 diff.overrideDifferenceListener(
   new IgnoreTextAndAttributeValuesDifferenceListener());
 assertTrue("XML was not similar", diff.similar());		
}

Similar not identical!

When using the IgnoreTextAndAttributeValuesDifferenceListener class, you must assert that two documents are similar, not identical. If you mistakenly call identical, differences in text and attribute values will still be taken into account.

Of course, DTDs and XML schemas facilitate XML structure validation; however, sometimes documents don't reference them -- in these scenarios, structure validation can be helpful. Also, if you need to ignore specific values (Dates, for instance), you can implement the DifferenceListener interface (as IgnoreTextAndAttributeValuesDifferenceListener does) and provide a custom implementation.

XMLUnit with XPath

To complete the XML testing hat trick, XMLUnit facilitates validating specific portions of an XML document with XPath.

For example, using the same document format from Listing 1, I'd like to validate that the first Class element's name attribute value generated by my application is com.acme.web.Widget. To do so, I must create an XPath expression to navigate to the precise location; furthermore, XMLUnit's XMLTestCase provides a handy assertXpathExists() method, which means I must now extend XMLTestCase.

Listing 8. Using XPath to validate precise XML values
public void testFirstClassNameWithXPath() throws Exception{
 BatchDependencyXMLReport report = 
   new BatchDependencyXMLReport(new Date(), this.getFilters());

 report.addTargetAndDependencies(
   "com.acme.web.Widget", this.getDependencies());
 report.addTargetAndDependencies(
   "com.acme.web.Account", this.getDependencies());
 
 assertXpathExists("//Class[1][@name='com.acme.web.Widget']", 
  report.toXML());	
}

As you can see in Listing 8, XMLUnit, in concert with XPath, provides a handy mechanism for validating precise aspects of an XML document rather than doing a large difference test. Keep in mind that to take advantage of XPath in XMLUnit, your test cases must extend XMLTestCase. Familiarity with XPath also helps!

XWhat?

XPath, or the XML Path Language, is an expression language for addressing portions of an XML document based on a tree representation. XPath allows you to navigate an XML document and facilitates selecting document values.
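
If XPath is new to you, it can help to try expressions outside of a test first. The sketch below evaluates expressions against a cut-down version of the dependency report using the JDK's javax.xml.xpath API rather than XMLUnit; the XPathProbe class and its countMatches method are hypothetical names used only for this illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only
class XPathProbe {

    // Returns how many nodes in the document the expression selects
    static int countMatches(String xml, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(
                "could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<DependencyReport>"
            + "<Class name=\"com.acme.web.Widget\"/>"
            + "<Class name=\"com.acme.web.Account\"/>"
            + "</DependencyReport>";
        // The first Class element is Widget, so this selects one node
        System.out.println(countMatches(xml,
            "//Class[1][@name='com.acme.web.Widget']"));   // prints 1
        // The first Class element is not Account, so nothing is selected
        System.out.println(countMatches(xml,
            "//Class[1][@name='com.acme.web.Account']"));  // prints 0
    }
}
```

Note that XPath positions start at 1, so Class[1] is the first Class element under its parent.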

Why work harder?

XMLUnit is an open source Java-based tool that makes testing XML documents much easier and more flexible than anything you could do with String comparisons. The only possible downside of using XMLUnit for difference testing is that the tests will rely on a file system to load the control document. Account for this added dependency when you write your tests.

While XMLUnit hasn't released any new updates in some time, its current set of features is robust enough to provide plenty of bang for the testing buck -- which, in this case, is basically free!

Resources

Get products and technologies

  • Download JUnit: Find out what's new with JUnit 4.
  • Download TestNG: A powerful, easy-to-use testing framework inspired by JUnit and NUnit.
  • Download XMLUnit: A JUnit extension framework that facilitates developer testing of XML documents.
