About this tutorial
This tutorial is for developers who need to build XML Schema support into Xerces-Java based applications. It discusses the XML Schema capabilities of the Xerces-Java 2.x parser and demonstrates their use in a Java application.
Familiarity with XML and Java technology is required. The tutorial covers the basics of the W3C XML Schema recommendation, so while it is helpful, no previous XML Schema knowledge is required. For basic XML information, see the XML programming in Java technology tutorial. An understanding of XML namespaces is also helpful; you can find basic background in the Understanding DOM tutorial.
The original XML specification included Document Type Definitions, or DTDs, which enabled developers to define a grammar, or structure, for their data. Validating parsers, such as IBM's XML4J or its descendant, Xerces, could then validate XML data against this grammar. The W3C XML Schema specification is an attempt to improve on the DTD -- eliminating some of its weaknesses -- and to allow developers to create XML grammars using XML syntax.
Several schema proposals exist, but this tutorial deals with the W3C XML Schema recommendation; this is the specification I mean when this tutorial mentions XML Schema (with an uppercase S). When I use the lowercase schema or schemas, I am referring to documents that I've written or that you might write that conform to the W3C XML Schema Recommendation.
Xerces-Java 1.x provided basic support for XML Schemas, but version 2.x offers essentially complete XML Schema support. This tutorial explains how to use the XML Schema-related features and properties of Xerces-Java to validate documents against an XML Schema document.
The tutorial begins with a quick review of the Document Object Model (DOM) and the Simple API for XML (SAX), and explains how to create a parser for each using Xerces 2.2.0.
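As a rough preview of the "features" mentioned above, the sketch below turns on XML Schema validation for a SAX parser factory. The feature URI is the one documented by Xerces-J. The class name SchemaFeatureSketch and the helper method are hypothetical, invented only for this illustration, and the sketch goes through the standard JAXP factory rather than instantiating a Xerces parser class directly. It assumes whatever parser JAXP finds on the classpath; a parser that does not recognize the feature is simply reported as such.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

// Hypothetical illustration class; the name is not taken from the tutorial.
class SchemaFeatureSketch {

    // Feature identifier documented by Xerces-J for XML Schema validation.
    static final String SCHEMA_FEATURE =
            "http://apache.org/xml/features/validation/schema";

    // Configures the factory for namespace-aware, validating parsing and
    // tries to switch on schema validation. Returns false if the underlying
    // parser does not recognize or support the feature.
    static boolean enableSchemaValidation(SAXParserFactory factory) {
        factory.setNamespaceAware(true); // XML Schema relies on namespaces
        factory.setValidating(true);     // general validation switch
        try {
            factory.setFeature(SCHEMA_FEATURE, true);
            return true;
        } catch (ParserConfigurationException e) {
            return false;
        } catch (SAXException e) {
            // Covers SAXNotRecognizedException and SAXNotSupportedException
            return false;
        }
    }

    public static void main(String[] args) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        System.out.println("Schema validation feature accepted: "
                + enableSchemaValidation(factory));
    }
}
```

This only configures the factory; it does not parse a document or report validation errors, which would additionally require an error handler.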
Make sure that the following tools are installed and tested before beginning the tutorial:
- A Java Virtual Machine, such as the IBM Java machine or Sun's JDK 1.2 or higher, must be installed and working on the target machine. You can find links to JVMs for various platforms in Resources. Users of Java 2 version 1.4.x need to use the Endorsed Standards Override Mechanism.
- The Xerces-Java 2.0 binary files: Apache provides precompiled files, so you don't need the source files. Download the binaries.
- A plain text editor for creating Java technology applications. If you choose to use an IDE such as VisualAge instead, make sure that the Xerces *.jar files are on the classpath, and that they precede any other XML APIs. (WebSphere Studio includes support for Xerces-J as part of the default install.)
The Endorsed Standards Override Mechanism
Many of the DOM- and SAX-related classes included in Xerces-J are also part of Java 1.4, and under normal circumstances, the versions included with Java will be used instead of those provided with Xerces.
Fortunately, these classes are "endorsed standards," so you can override the Java version using the Endorsed Standards Override Mechanism by placing the relevant jar files (in this case, xercesImpl.jar) in the appropriate directory. The default location is
%JAVA_HOME%\lib\endorsed
on Windows, or
$JAVA_HOME/lib/endorsed
on a Linux system.
Developers can also choose a new location by setting the java.endorsed.dirs system property.
Users of older versions of Java need not be concerned with this issue.
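One way to check whether the override has taken effect is to ask JAXP which factory classes it actually loads. The following is a minimal sketch, not part of the original tutorial: the class name ParserCheck and its helper methods are hypothetical. If the reported class names begin with org.apache.xerces, that suggests the Xerces-J classes are the ones in use; any other package suggests the parser bundled with the Java runtime was picked up instead.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper class; the name is not taken from the tutorial.
class ParserCheck {

    // Formats the value of the java.endorsed.dirs system property,
    // or notes that it is not set.
    static String describeEndorsedDirs(String value) {
        if (value == null || value.length() == 0) {
            return "java.endorsed.dirs is not set";
        }
        return "java.endorsed.dirs = " + value;
    }

    // Class name of the DOM factory implementation that JAXP selected.
    static String domImplementation() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Class name of the SAX factory implementation that JAXP selected.
    static String saxImplementation() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(
                describeEndorsedDirs(System.getProperty("java.endorsed.dirs")));
        System.out.println("DOM factory: " + domImplementation());
        System.out.println("SAX factory: " + saxImplementation());
    }
}
```

To try a non-default location, the property can be passed on the command line, for example java -Djava.endorsed.dirs=somedir ParserCheck, where somedir is a placeholder for the directory that holds xercesImpl.jar.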

