XML Matters, TEI -- the Text Encoding Initiative
An XML dialect for archival and complex documents
From the developerWorks archives
Date archived: May 14, 2019 | First published: September 04, 2003
Nowadays, XML is usually thought of as a markup technique utilized by programmers to encode computer-oriented data. Even DocBook and similar document-oriented DTDs focus on preparation of technical documentation. However, the real roots of XML are in the SGML community, which is largely composed of publishers, archivists, librarians, and scholars. In this installment, David looks at Text Encoding Initiative, an XML schema devoted to the markup of literary and linguistic texts. TEI allows useful abstractions of typographic features of source documents, but in a manner that enables effective searching, indexing, comparison, and print publication -- something not possible with publications archived as mere photographic images.
This content is no longer being updated or maintained. The full article is provided "as is" in a PDF file. Given the rapid evolution of technology, some content, steps, or illustrations may have changed.