Extensible Markup Language (XML), which is based on Standard Generalized Markup Language (SGML), features strict syntax rules and a language -- Document Type Definition (DTD) -- for defining structural constraints. Learn about XML 1.0 and its Unicode foundation, as well as all the new features that XML 1.1 offers, and the controversy surrounding this latest version.
Extensible Markup Language (XML) 1.0 (Fourth Edition) [W3C Recommendation] is, of course, the trunk of the sprawling XML technology tree. It builds on Unicode [Unicode Consortium technical report and ISO standard] to define strict rules for text format as well as the DTD validation language. The current (fourth) edition incorporates accumulated corrections to the specification and updates that accommodate more recent versions of Unicode. It has been widely translated, although the English version is the only normative one, meaning the only one intended to carry the force of a standard.
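To make the idea of strict syntax rules concrete, here is a minimal sketch in Python that checks whether a string is well-formed XML. It uses the standard library's `xml.etree.ElementTree` parser; the helper name `is_well_formed` and the sample documents are made up for illustration. Note that this parser checks well-formedness only and does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Any violation of the syntax rules is a fatal error for the parser.
        return False


# Illustrative documents, not taken from the specification.
good = "<note><to>Reader</to></note>"
bad_nesting = "<note><to>Reader</note></to>"  # elements overlap instead of nesting
bad_attribute = "<note lang=en/>"             # attribute value is not quoted

for sample in (good, bad_nesting, bad_attribute):
    print(sample, "->", is_well_formed(sample))
```

Unlike a typical HTML browser, a conforming XML parser does not try to recover from such errors, which is why the two malformed samples are rejected outright.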
XML 1.1 (Second Edition) [in development] is the first revision that changes the definition of a well-formed XML document. The primary change is to the treatment of characters: the specification now tracks the most recent Unicode version rather than a fixed one. It also provides for the normalization of characters across Unicode versions by referencing the Character Model for the World Wide Web 1.0: Fundamentals [W3C Recommendation]. XML 1.1 also extends the list of line-end characters to include NEL (U+0085), the character used for end of line (EOL) on IBM mainframe systems. This change is controversial because some feel that the modest benefit to mainframe users does not justify such a fundamental change. Other observers argue more broadly that the changes as a whole are too modest to justify the interoperability problems that an XML version change is likely to cause. XML 1.1 has seen little adoption since its completion in February 2004.
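The two character-level ideas above, NEL as a line end and Unicode normalization, can be illustrated outside of any XML parser. The following sketch uses only Python's own string handling and the standard `unicodedata` module; it is not an XML 1.1 implementation. The helper names are hypothetical, and the choice of the NFC normalization form is an assumption made for the example.

```python
import unicodedata

NEL = "\u0085"  # NEXT LINE, the end-of-line character used on IBM mainframes


def split_lines(text: str) -> list[str]:
    """Split text into lines (hypothetical helper).

    Python's str.splitlines() already treats U+0085 as a line boundary.
    """
    return text.splitlines()


def equal_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization (assumed form)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A two-line text that uses NEL instead of LF or CR LF.
mainframe_text = "first line" + NEL + "second line"
print(split_lines(mainframe_text))

# The same visible character written two ways.
composed = "\u00e9"     # single code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

print(composed == decomposed)                 # raw code points differ
print(equal_after_nfc(composed, decomposed))  # equal once normalized
```

Without normalization, two documents that look identical to a reader can differ code point by code point, which is the interoperability problem that the normalization provisions are meant to address.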
XML is based on Standard Generalized Markup Language (SGML), defined in ISO 8879:1986 [ISO Standard]. XML represents a significant simplification of SGML, and it includes adjustments that make it better suited to the Web environment.
- Start with the IBM developerWorks New to XML page.
- Learn all about what XML 1.1 has to offer in XML 1.1 and Namespaces 1.1 revealed by Arnaud Le Hors (developerWorks, May 2004).
- Learn Unicode from a Web point of view in Unicode fundamentals by Jim Melnick (developerWorks, February 2001).
- Mike Brown's skew.org XML Tutorial is a "reintroduction to XML with an emphasis on character encoding." It highlights topics that are too often glossed over in other treatments.
- Learn what makes the experts successful with XML in one of the many developerWorks XML tips.
- In The Annotated XML Specification, Tim Bray provides useful, in-line commentary and clarifications on the text of XML 1.0.
- The XML FAQ is edited by Peter Flynn.
- Unicode in XML and other Markup Languages is a formal technical report for people (probably implementors) who need a rigorous discussion of the intersection of Unicode and XML.
- The open internationalization resources directory is a reference site for all aspects of managing internationalized data, which is the core goal of XML in building on Unicode.
- Read about other XML standards: Index of XML standards.
- Participate in any of several XML-centered forums: XML zone discussion forums.
