History and Relationships of SGML, HTML and XML

XML and HTML are both document formats derived from Standard Generalized Markup Language (SGML). SGML became an ISO standard (ISO 8879) in 1986 as a way of expressing data in text-processing applications. All three share certain characteristics, such as a similar syntax and the use of bracketed tags. The difference is that HTML is an application of SGML, that is, one particular markup language defined using SGML's rules, whereas XML is a simplified subset of SGML itself. A diagram developed by the W3C helps clarify this relationship.

Adapted from World Wide Web Consortium note http://www.w3.org/TR/NOTE-rdfarch, by Tim Berners-Lee.

SGML is popular among organizations that have large amounts of document data to create, manage and distribute. An SGML document declares the character set it uses and can define entities (named objects). Entities also allow external data to be referenced and extended. SGML prescribes the rules for creating a specific markup language such as HTML: HTML is a single, fixed set of tags, while SGML provides the capability for creating any desired set of tags.

XML was invented because there are barriers to delivering SGML over the Web. These include a lack of stylesheet support, no mainstream browser support, software complexity, and obstacles to interchanging SGML data caused by varying levels of SGML compliance among SGML software packages. Because mainstream Web browsers do not support SGML, most applications delivering SGML information over the Web convert the SGML to HTML. This down-translation removes much of the intelligence of the original SGML information, which eliminates flexibility and poses a barrier to reuse, interchange and automation.
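The loss caused by down-translation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source record is written as XML rather than SGML (the Python standard library has no SGML parser), and the element names part, number and price, the currency attribute, and the function down_translate are all invented for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source record; the tag names and the attribute are invented.
SOURCE = '<part><number>A-100</number><price currency="USD">12.50</price></part>'

def down_translate(xml_text):
    """Convert the record into a generic HTML table row.

    Only the text content survives. The element names (number, price)
    and the currency attribute are discarded in the process.
    """
    root = ET.fromstring(xml_text)
    cells = "".join(f"<td>{escape(child.text or '')}</td>" for child in root)
    return f"<table><tr>{cells}</tr></table>"

page = down_translate(SOURCE)
print(page)
```

After conversion, a program reading the HTML can no longer tell which cell holds the part number and which holds the price, or what currency the price is in. That is the kind of reuse and automation barrier described above.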

XML will displace HTML in Web applications where high degrees of reuse, interchange, and automation are required, and will displace HTML as the preferred way to deliver SGML information over the Web. Full SGML will remain the appropriate technology for creating and storing enterprise-critical documents and data. XML will become the primary means of delivering over the Web the vast amount of SGML-based information that currently exists.

XML is a much smaller specification than SGML, and has a number of related specifications, including Extensible Stylesheet Language (XSL) and Extensible Linking Language (XLL), for presenting and linking the data in a browser. XML is similar to SGML in that it provides the capability for creating any desired set of tags. XML's document type definition (DTD) is carried over from SGML. XML 1.0 became a formal specification (a W3C Recommendation) on 10 February 1998.
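As a minimal sketch of what "any desired set of tags" means in practice, the following Python example parses a small XML document whose vocabulary is declared in an internal DTD. The tag names memo, to, from and body are invented for illustration and belong to no standard vocabulary, and read_memo is a hypothetical helper. Python's xml.etree.ElementTree parser accepts the DTD but does not validate the document against it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an invented tag set declared in an internal DTD.
DOC = """<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (to, from, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<memo>
  <to>Editorial</to>
  <from>Production</from>
  <body>Proofs are ready.</body>
</memo>"""

def read_memo(text):
    """Return the root tag name and a dict mapping each child tag to its text.

    The parser is non-validating: it reads the DTD but does not check
    the document against it.
    """
    root = ET.fromstring(text)
    return root.tag, {child.tag: child.text for child in root}

root_tag, memo = read_memo(DOC)
print(root_tag, memo)
```

Unlike HTML, where the tag set is fixed, the element names here carry the meaning chosen by the document's author, so a program can ask for the "to" or "body" field directly.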

