Developing Drupal publications to support standards-based XML

Customize your Drupal installation to support the publication of TEI (or other) XML documents

Garrick Bodine (garrick.bodine@gmail.com), Information Technology Manager, Penn State University
Garrick Bodine is an Information Technology Manager in the Office of Undergraduate Admissions at Penn State University.
Stephanie Schlitz (sschlitz@gmail.com), Linguistics Professor, Bloomsburg University of Pennsylvania
Stephanie Schlitz is a Linguistics Professor at Bloomsburg University of Pennsylvania. She collaborates on several ongoing XML-based projects.

Summary:  Academic and corporate clients seeking digital journals or other types of web publications regularly require platforms that support standards-based XML. This tutorial explains how to customize a Drupal implementation to develop publications that enable editors, authors, and users to submit and edit content in standards-based XML, where the standard can be enforced using server-side validation settings. For illustrative purposes, the discussion references TEI XML, the markup standard in widespread use in academia.

Date:  08 Feb 2011
Level:  Intermediate

Before you start

Frequently used acronyms

  • CMS: Content management system
  • CSS: Cascading Style Sheets
  • FTP: File Transfer Protocol
  • HTML: HyperText Markup Language
  • SQL: Structured Query Language
  • URL: Uniform Resource Locator
  • XML: Extensible Markup Language
  • XSL: Extensible Stylesheet Language
  • XSLT: Extensible Stylesheet Language Transformations

This tutorial is for developers interested in collecting and publishing documents based on a standardized XML format. In this case, we use the Text Encoding Initiative's TEI P5, a format widely used by academics, archivists, and librarians worldwide for archival and research purposes. While some hands-on Drupal experience is recommended, we introduce fundamental Drupal concepts and walk you through the basic steps of installation. Drupal experience, therefore, is not essential. After you complete the tutorial, you will have learned how to install Drupal and how to configure the Content Construction Kit (CCK) and XML Content modules to enable various content types that can be input in XML, validated against your custom schema, and published according to the specifications defined in your stylesheets.

About this tutorial

The sample site covered in this tutorial demonstrates how to publish documents that strictly adhere to custom XML standards using the Drupal content management system.

Although Drupal is not the only option (not even the only free and open source option) to implement a system that enables publication of TEI documents, it is one of the most widely used platforms, running hundreds of thousands of sites worldwide, making it both mature (well tested) and well supported by the community.

Because TEI P5 XML is one of the most widely used published standards for academic, archival, and research purposes, it is the format we chose for this tutorial. Other XML standards with available schemas, such as DocBook or DITA XML, can be used wherever we use TEI, provided that you substitute the corresponding schema and stylesheets.

Among the driving factors for many who choose TEI XML (including the authors) for archival and research purposes are the range of data types supported by the TEI's Guidelines for Electronic Text Encoding and Interchange (that is, TEI's markup standard) and the active, ongoing development of the standard by the TEI community. We therefore consider TEI markup to be one of the best choices for describing, displaying, and retaining documents, offering powerful and flexible display capabilities when it is leveraged together with any number of the available free and open source XML tools.


Prerequisites

Drupal CMS—Drupal is freely available and can be downloaded from http://drupal.org/download. This tutorial uses Drupal version 6.

You need a web server or web host with PHP installed and access to a database in order to install Drupal and make your site available to the public across the web. We used Apache and MySQL. Although it is beyond the scope of this tutorial to take you through the selection of a web hosting provider or the installation of a local web server and database, many inexpensive web hosts support the installation of Drupal and provide access to databases such as MySQL or PostgreSQL.

In addition to Drupal itself, you also need to download a few Drupal modules to enable the publishing features described in the rest of the tutorial:

  • The XML Content module to enable uploading, enforcement, and guidance with regard to the site publisher's chosen XML features.
  • The Content Construction Kit (CCK) module for Drupal to enable custom types of Drupal content, in this case the addition of an XML content type defined by the site publisher.
  • You might also wish to choose a Drupal theme that enables you to change the appearance of your site.

TEI Roma—TEI Roma is a web-based tool for generating custom XML schemas that the publication module described in the tutorial uses to enforce the standards chosen by the site publisher.
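To give a rough sense of what server-side enforcement of an XML standard involves, the following is a minimal Python sketch of a pre-validation check. It is not the XML Content module's implementation (that module runs inside Drupal), and the function name `precheck_tei` and the sample document are hypothetical. It only confirms that a submission is well-formed XML whose root is a `TEI` element in the TEI P5 namespace and that it contains a `teiHeader`. Full validation against a schema generated with Roma would require a schema-aware validator, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def precheck_tei(xml_text):
    """Cheap structural check run before real schema validation.

    Hypothetical helper: returns (ok, message). It does NOT validate
    against a Roma-generated schema; it only checks well-formedness,
    the root element, and the presence of a teiHeader child.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, f"not well-formed: {err}"

    # ElementTree writes namespaced tags as "{namespace}localname".
    if root.tag != f"{{{TEI_NS}}}TEI":
        return False, f"unexpected root element: {root.tag}"

    if root.find(f"{{{TEI_NS}}}teiHeader") is None:
        return False, "missing teiHeader"

    return True, "ok"


# Illustrative sample document (an assumption, not taken from the tutorial).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample</title></titleStmt>
      <publicationStmt><p>Unpublished</p></publicationStmt>
      <sourceDesc><p>Born digital</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>Hello, TEI.</p></body></text>
</TEI>"""

if __name__ == "__main__":
    print(precheck_tei(SAMPLE))
    print(precheck_tei("<doc/>"))
```

A check like this can reject obviously malformed submissions early, but it is no substitute for validating against the custom schema itself.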

See Resources for links to all the tool downloads.
