This tutorial examines the use of the Simple API for XML version 2.0.x, or SAX 2.0.x. It is aimed at developers who have an understanding of XML and wish to learn this lightweight, event-based API for working with XML data. It assumes that you are familiar with concepts such as well-formedness and the tag-like nature of an XML document. (You can get a basic grounding in XML itself through the Introduction to XML tutorial, if necessary.) In this tutorial, you will learn how to use SAX to retrieve, manipulate, and output XML data.
Prerequisites: SAX is available in a number of programming languages, such as Java, Perl, C++, and Python. This tutorial uses the Java language in its demonstrations, but the concepts are substantially similar in all languages, and you can gain a thorough understanding of SAX without actually working through the examples.
The standard means for reading and manipulating XML files is the Document Object Model (DOM). Unfortunately, this method reads the entire file and stores it in a tree structure before you can work with it, which can be slow and memory-intensive, particularly for large documents.
One alternative is the Simple API for XML, or SAX. SAX allows you to process a document as it's being read, which avoids the need to wait for all of it to be stored before taking action.
SAX was developed by the members of the XML-DEV mailing list, and the Java version is now a SourceForge project (see Resources). The purpose of the project was to provide a more natural means for working with XML -- in other words, one that did not involve the overhead and conceptual leap required for the DOM.
The result was an API that is event-based. The parser sends events, such as the start or end of an element, to an event handler, which processes the information. The application itself can then deal with the data. The original document remains untouched, but SAX provides the means for manipulating the data, which can then be directed to another process or document.
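The event flow described above can be sketched in a few lines of Java. This is a minimal illustration, not the tutorial's own code: the class name `EventLogger`, its `parse` helper, and the sample document are made up for this example, and the code assumes Java 6 or later (it uses generics and annotations that Java 1.4 lacks). The handler extends `DefaultHandler` and records each event it receives, so you can see the order in which the parser reports them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that records the events the parser sends to it.
public class EventLogger extends DefaultHandler {
    private final List<String> events = new ArrayList<String>();

    // Called by the parser when it reads an opening tag.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start:" + qName);
    }

    // Called by the parser when it reads a closing tag.
    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    // Called with character data; a parser may deliver one run of text
    // in several chunks, so each non-blank chunk is logged separately.
    @Override
    public void characters(char[] ch, int start, int length) {
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("text:" + text);
        }
    }

    // Parses an XML string and returns the recorded events in order.
    public static List<String> parse(String xml) throws Exception {
        EventLogger handler = new EventLogger();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch.
        for (String event : parse("<survey><question>Yes</question></survey>")) {
            System.out.println(event);
        }
    }
}
```

Note that the handler never sees the document as a whole. It only reacts to each event as it arrives, and the application decides what to keep.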
SAX is not maintained by the World Wide Web Consortium (W3C) or any other official standards body, but it is a de facto standard in the XML community.
The examples in this tutorial, should you decide to try them out, require the following tools to be installed and working correctly. Running the examples is not a requirement for understanding.
- A text editor: XML files are simply text. To create and read them, a text editor is all you need.
- Java™ 2 SDK, Standard Edition version 1.4.x: SAX support is built into this version of Java (available at http://java.sun.com/j2se/1.4.2/download.html), so you won't need to install any separate classes. If you're using an earlier version of Java, such as Java 1.3.x, you'll also need an XML parser such as the Apache project's Xerces-Java (available at http://xml.apache.org/xerces2-j/index.html), or Sun's Java API for XML Processing (JAXP), part of the Java Web Services Developer Pack (available at http://java.sun.com/webservices/downloads/webservicespack.html). You can also download the official version of SAX from SourceForge (available at http://sourceforge.net/project/showfiles.php?group_id=29449).
- Other Languages: Should you wish to adapt the examples, SAX implementations are also available in other programming languages. You can find information on C, C++, Visual Basic, Perl, and Python implementations of a SAX parser at http://www.saxproject.org/?selected=langs.
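If you want to confirm that your Java installation can supply a SAX parser before trying the examples, a quick check along the following lines may help. This is only a sketch: the class name `SaxCheck` and its `parserClassName` method are hypothetical, and it assumes the JAXP classes (`javax.xml.parsers`) are on the classpath, as they are in Java 1.4 and later.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Hypothetical helper that reports which SAX parser implementation is in use.
public class SaxCheck {

    // Asks JAXP for a SAX parser and returns the class name of its XMLReader.
    public static String parserClassName() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        return reader.getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("SAX parser found: " + parserClassName());
    }
}
```

If no parser is available, `SAXParserFactory.newInstance()` throws an error instead of returning, which indicates that a parser such as Xerces-Java still needs to be added to the classpath.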