XML standards and extensions

Other standards and extensions to XML work together to make your information more portable and useful.

You need to know about these standards and extensions in order to do the following:

XML is good for describing information, but it cannot do everything. For example, XML documents do not contain the kind of information that current browsers and many other devices require to display the data in a useful way. The same is true for linking to other information, transporting XML data so that the receiving application can use it in a meaningful way, and so on.

The XML community has developed, and continues to develop, standards and extensions that expand the capabilities of XML:

APIs

Application programming interfaces (APIs) allow applications to work with XML information using a standard set of portable interfaces.

DOM Level 1 and DOM Level 2

The Document Object Model (DOM) API enables you to build XML documents as well as parse them. These interfaces enable you to access, manipulate, and create XML documents (and the data within) as programming objects that have methods and events. Your programs can construct or change a DOM tree in memory and then persist that DOM tree to a file or stream. DOM is best suited for instances where you will parse few XML documents but require extensive control over the contents.
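As a minimal sketch of the parse, change, and persist cycle described above, the following example uses Python's standard `xml.dom.minidom` module. The `order`, `item`, and `note` element names and the `sku` attribute are made up for illustration; other DOM implementations expose similar but not identical interfaces.

```python
from xml.dom.minidom import parseString

# Parse a small document into an in-memory DOM tree.
doc = parseString("<order><item sku='A1'>Widget</item></order>")

# Navigate the tree and change existing content.
item = doc.getElementsByTagName("item")[0]
item.setAttribute("sku", "A2")

# Create a new element with a text child and attach it to the root.
note = doc.createElement("note")
note.appendChild(doc.createTextNode("rush"))
doc.documentElement.appendChild(note)

# Persist the modified tree; here it is serialized to a string,
# but it could equally be written to a file or stream.
serialized = doc.documentElement.toxml()
print(serialized)
```

Because the whole tree is held in memory, this style gives full control over the content at the cost of memory use that grows with document size.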

Namespaces

Namespaces enable you to differentiate between duplicate XML element or attribute names, a situation that can occur when using XSLT style sheets or more than a single DTD. For example, the <code> element from one DTD might mean something different from a <code> element in another DTD. To avoid name collisions and ambiguity, identify each namespace by a unique name, typically a URI, and bind it to a prefix that qualifies the element and attribute names belonging to it. This makes it simple to tell which namespace a given name comes from.
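The following sketch illustrates the <code> collision using Python's standard `xml.etree.ElementTree` module. The two namespace URIs and the `prog` and `law` prefixes are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both define a <code> element; the namespace keeps them apart.
XML = """
<doc xmlns:prog="http://example.com/ns/programming"
     xmlns:law="http://example.com/ns/legal">
  <prog:code>print('hi')</prog:code>
  <law:code>Title 17</law:code>
</doc>
"""

root = ET.fromstring(XML)

# Map short prefixes (local to this script) to the namespace names.
ns = {
    "p": "http://example.com/ns/programming",
    "l": "http://example.com/ns/legal",
}

prog_code = root.find("p:code", ns)
law_code = root.find("l:code", ns)

# The parser stores each tag together with its namespace name,
# so the two <code> elements have different fully qualified tags.
print(prog_code.tag, "->", prog_code.text)
print(law_code.tag, "->", law_code.text)
```

The prefixes used in the script do not have to match the prefixes in the document; only the namespace name itself identifies the vocabulary.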

SAX 1.0 and SAX 2.0

The Simple API for XML (SAX) is a read-only, single-pass interface best suited for processing many documents or very large documents. You can use this API to extract information from XML documents, but you cannot use it to add data to them or to change their content. The SAX API is event-driven: the parser notifies your application when certain events happen as it parses your document. For example, your application might need to know when the parser encounters the start or end of an element node. Note that your application must keep the state information it needs to determine the content and context of these XML events.
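A minimal sketch of this event-driven model, using Python's standard `xml.sax` module, might look like the following. The `TitleCollector` class and the `books`, `book`, and `title` element names are assumptions made for illustration. Note how the handler itself tracks whether it is currently inside a `title` element, because the parser supplies only the events.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element (hypothetical example)."""

    def __init__(self):
        super().__init__()
        self.in_title = False   # state kept by the application, not the parser
        self.buffer = []        # text chunks for the current <title>
        self.titles = []        # finished titles

    def startElement(self, name, attrs):
        if name == "title":
            self.in_title = True
            self.buffer = []

    def characters(self, content):
        # Text can arrive in several chunks, so accumulate it.
        if self.in_title:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self.buffer))
            self.in_title = False


handler = TitleCollector()
xml.sax.parseString(
    b"<books>"
    b"<book><title>XML Basics</title></book>"
    b"<book><title>SAX in Depth</title></book>"
    b"</books>",
    handler,
)
print(handler.titles)
```

The parser never builds a tree, so memory use stays roughly constant regardless of document size.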

XSL and XSLT

Extensible Stylesheet Language (XSL) and XSL Transformations (XSLT) work in combination to enable you to present XML data in a variety of ways, for example, in a browser, on a PDA, or in a printed brochure. XSL and XSLT processing also enables you to transform an XML message or document from one XML markup language to another, which has key applications in e-business.

See XSL introduction for more information.

XPath and XPointer

XML Path Language (XPath) and XML Pointer Language (XPointer) enable you to search for and identify data in the hierarchical XML document structure.

XPath defines a syntax for locating data in an XML document. Both XSLT and XPointer use XPath. XPath models an XML document as a hierarchy of nodes, with the top node being the root. Just as a regular expression matches patterns in text, an XPath expression matches patterns in the nodes of one or more XML documents.
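To make the idea of path expressions concrete, the following sketch uses Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath (a full XPath processor offers many more functions and axes). The `catalog` document and its element names are invented for this example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<catalog>"
    "<book lang='en'><title>XML Basics</title><price>20</price></book>"
    "<book lang='fr'><title>Guide XML</title><price>35</price></book>"
    "<book lang='en'><title>SAX in Depth</title><price>30</price></book>"
    "</catalog>"
)

# Select by attribute value: titles of all books whose lang attribute is 'en'.
english_titles = [t.text for t in root.findall("./book[@lang='en']/title")]

# Select by relative position: the title of the second <book> child.
second_title = root.find("./book[2]/title").text

# Select at any depth: every <price> element below the root.
all_prices = [p.text for p in root.findall(".//price")]

print(english_titles)
print(second_title)
print(all_prices)
```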

XML Pointer Language (XPointer) extends XPath to enable locating specific portions of data, called fragments, based on XML attribute values, types, content, or relative position. These fragments can be discrete pieces of data, a range of information between two points, or a continuous series of elements.

XML Schema

XML Schema Language defines the logical structure of an XML document, much like a document type definition (DTD).

The significant differences between DTDs and XML Schemas are that schemas do the following:

  • Are themselves written in XML, making them extensible, unlike DTDs
  • Address cardinality, enabling you to specify the minimum and maximum number of times an element can occur
  • Allow constraints on values
  • Allow additional data types and definitions of data types that can be inherited

All of these enhancements give you more control over the allowable content of the XML document or message. For example, you can add a different type of element to an existing schema as long as your addition does not break the original schema. Schemas also have many more available data types than do DTDs, making importing and exporting data somewhat easier.
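Because a schema is itself an XML document, ordinary XML tools can read it. The following sketch parses a small, hypothetical schema with Python's standard `xml.etree.ElementTree` module and lists the data type and cardinality declared for each element. It only inspects the schema; validating a document against a schema requires a schema-aware processor, which the Python standard library does not include.

```python
import xml.etree.ElementTree as ET

# A hypothetical schema: an order holds 1 to 10 items, one total,
# and an optional note.
XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="xs:string" minOccurs="1" maxOccurs="10"/>
        <xs:element name="total" type="xs:decimal"/>
        <xs:element name="note" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

XS = "{http://www.w3.org/2001/XMLSchema}"
schema = ET.fromstring(XSD)

rules = {}
for element in schema.iter(XS + "element"):
    if element.get("type") is None:
        continue  # skip the wrapper <order>, whose type is defined inline
    # minOccurs and maxOccurs both default to 1 when they are omitted.
    rules[element.get("name")] = (
        element.get("type"),
        element.get("minOccurs", "1"),
        element.get("maxOccurs", "1"),
    )

for name, (data_type, low, high) in rules.items():
    print(f"{name}: type={data_type}, occurs {low}..{high}")
```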