Document Schema Definition Languages (DSDL)

Use a framework of various schema languages

developerWorks

Level: Intermediate

Contributors: ISO

06 Feb 2007
Updated 25 Apr 2007

Document Schema Definition Languages (DSDL) is a framework approach to XML validation and core processing comprising individual specifications from expert individuals or small groups, each of which addresses a well-defined and well-bounded problem domain. Discover the parts of DSDL, including RELAX NG and Schematron, that already have traction on their own, and the ones that are still works in progress.

Document Schema Definition Languages (DSDL), a set of ISO standards and draft standards from ISO/IEC JTC 1 SC 34 WG 1, is a collection of specifications related to the validation and basic document composition of XML. The insight behind DSDL is that you can approach XML validation and core processing in numerous ways, and that many of these ways are complementary. Rather than create one giant system covering all this capability, DSDL creates a framework of standards that you can use separately or together for XML validation. In this way, it keeps each specification simple while still providing the power to address complex problems. DSDL has 10 parts, which are listed here:

  • Part 1: Interoperability framework: This part is a formal roadmap and outline of DSDL as a whole.
  • Part 2: Grammar-based validation: This part is ISO RELAX NG.
  • Part 3: Rule-based validation: This part is ISO Schematron.
  • Part 4: Selection of validation candidates: This part is Namespace-based Validation Dispatching Language (NVDL), a means of splitting up documents comprising multiple vocabularies so that you can validate them more easily. There have been many inputs to this part, but James Clark's Namespace Routing Language (NRL) is the main input to the process.
  • Part 5: Datatypes: This part is a framework for creating new primitive data types. Jeni Tennison's Datatype Library Language is an input. It defines an XML language for writing regular expressions that describe the lexical representation of new types. The facet mechanisms in W3C XML Schema (WXS) already provide this much, to some extent. The important distinction is that DSDL Part 5 adds a mechanism for mapping these new data types to the value space, which is not possible in WXS. In effect, it allows you to specify the semantics as well as the syntax of new data types.
  • Part 6: Path-based integrity constraints: The goal of this part is to define features similar to WXS's xs:unique, xs:key, and xs:keyref.
  • Part 7: Character repertoire validation: The goal of this part is to develop a language that allows schema designers to constrain the characters that can be used in various lexical structures in XML. You can express some such restrictions in RELAX NG at present, but they break down in cases such as mixed content. Part 7 would allow you to express constraints such as "element and attribute names, as well as PI targets, must be basic Latin-1" or "numbers must not appear in element and attribute names."
  • Part 8: Declarative document manipulation: This part is a means to define patterns that can be expressed in more than one actual XML syntax, based on a powerful schema technology called Architectural Forms.
  • Part 9: Datatype- and namespace-aware DTDs: This part makes DTDs more useful in the face of recent developments in the XML space by adding features that are common in newer schema languages.
  • Part 10: Validation management: This part is meant to be the glue that allows you to combine different parts of DSDL. It provides a pipeline framework for preprocessing and validation of documents.
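
To make the kind of constraint that Part 7 targets more concrete, here is a minimal Python sketch that checks the two example rules quoted above against element and attribute names. It is not the Part 7 language and does not follow any DSDL syntax. The function names and the two rule patterns are assumptions made for illustration, and PI targets are not checked because the default ElementTree parser discards processing instructions.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rules modeled on the two example constraints in the Part 7 item.
LATIN1_ONLY = re.compile(r"[\u0000-\u00FF]+")  # every character within Latin-1
HAS_DIGIT = re.compile(r"\d")                  # any digit anywhere in the name


def local_name(name):
    """Strip ElementTree's '{namespace}' prefix, if present."""
    return name.rsplit("}", 1)[-1]


def check_names(xml_text):
    """Return (kind, name, problem) tuples for names that break a rule."""
    problems = []
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        names = [("element", local_name(elem.tag))]
        names += [("attribute", local_name(attr)) for attr in elem.attrib]
        for kind, name in names:
            if not LATIN1_ONLY.fullmatch(name):
                problems.append((kind, name, "outside Latin-1"))
            if HAS_DIGIT.search(name):
                problems.append((kind, name, "contains a digit"))
    return problems
```

Calling `check_names('<doc><h1 id2="x"/><para/></doc>')` reports the element name `h1` and the attribute name `id2`, because both contain a digit.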

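The idea behind Part 4, splitting a multi-vocabulary document by namespace and handing each piece to its own validator, can be sketched in a few lines of Python. This is only an illustration of the dispatching concept and not an NVDL implementation. Real NVDL works on sections of a document and is driven by an NVDL script, whereas this sketch simply groups elements by namespace URI. The `urn:example:*` namespaces, the `require_title` rule, and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def namespace_of(elem):
    """Return the namespace URI from ElementTree's '{uri}local' tag form."""
    if elem.tag.startswith("{"):
        return elem.tag[1:].split("}", 1)[0]
    return ""


def split_by_namespace(root):
    """Group elements by namespace URI (a rough stand-in for NVDL sections)."""
    groups = {}
    for elem in root.iter():
        groups.setdefault(namespace_of(elem), []).append(elem)
    return groups


def dispatch(xml_text, validators):
    """Run each namespace's validator over its elements and collect messages."""
    messages = []
    groups = split_by_namespace(ET.fromstring(xml_text))
    for uri, elems in groups.items():
        validator = validators.get(uri)
        if validator is None:
            messages.append(f"no validator for namespace {uri!r}")
            continue
        for elem in elems:
            messages.extend(validator(elem))
    return messages


def require_title(elem):
    """Hypothetical rule: every <book> element needs a title attribute."""
    local = elem.tag.rsplit("}", 1)[-1]
    if local == "book" and "title" not in elem.attrib:
        return [f"{local}: missing title attribute"]
    return []
```

A caller would register one validator per vocabulary, for example `dispatch(document_text, {"urn:example:lib": require_title})`. In a fuller setup, each validator could wrap a RELAX NG or Schematron processor for its namespace.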