Creating an XML schema
Now that you have set up Drupal and the necessary TEI XML modules, it's time to support XML content by developing schemas that validate your XML.
The XML specification defines well-formedness: the syntax rules that a document must follow to be a viable, machine-readable XML document. Beyond those rules, XML is meant to be almost infinitely flexible. You can write syntactically valid, well-formed XML using elements and attributes that no one has seen before. This flexibility is essential to making XML universally useful, but specific applications of XML often require that only certain elements, attributes, and values be used. Although there are several ways to enforce such constraints, schemas are regularly used to ensure consistency among documents within a community or collection.
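To make the idea of well-formedness concrete, here is a minimal sketch that uses Python's standard xml.etree.ElementTree parser. The function name and the sample documents are invented for illustration and are not part of Drupal or the XML Content module. The point is that a parser accepts made-up elements as long as the syntax is correct, and rejects only syntax errors.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Invented elements and attributes, but the syntax is correct.
made_up = '<recipe flavor="odd"><sprocket>42</sprocket></recipe>'

# The <sprocket> element is never closed, so this is not well-formed.
broken = '<recipe><sprocket>42</recipe>'

print(is_well_formed(made_up))
print(is_well_formed(broken))
```

A well-formedness check like this says nothing about whether the elements are the ones you expect, which is the gap that a schema fills.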
XML schemas are machine-readable technical descriptions of what constitutes a valid XML document according to the rules described within the schema. The rules might be strict or lax, and they are set at the discretion of the document authors or designers.
In the site we are creating, we plan to allow only a certain subset of all the elements available within TEI XML markup. This constraint allows us to prepare the necessary XSL and CSS display aspects more accurately, as we can ensure that there are no unexpected tags or attributes in the materials. TEI P5 XML is already a strictly defined XML application, but we can further narrow the available options for our documents by using a schema customization tool provided by the TEI Consortium: TEI Roma.
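The effect of restricting documents to a subset of elements can be sketched in a few lines of Python. This is only an illustration of the idea, not how Roma or the XML Content module works: the ALLOWED set is a hypothetical whitelist chosen for the example, and the check compares local element names only, ignoring attributes, nesting rules, and namespaces that a real schema would also enforce.

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelist for illustration; a real Roma customization
# defines far more than a flat list of element names.
ALLOWED = {
    "TEI", "teiHeader", "fileDesc", "titleStmt", "title",
    "publicationStmt", "sourceDesc", "text", "body", "p",
}


def unexpected_elements(xml_text, allowed=ALLOWED):
    """Return a sorted list of element names that are not in the whitelist."""
    root = ET.fromstring(xml_text)
    found = set()
    for element in root.iter():
        # ElementTree writes namespaced tags as "{namespace}local";
        # keep only the local part for the comparison.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name not in allowed:
            found.add(local_name)
    return sorted(found)


sample = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
    '<text><body><p>Hello <blink>world</blink></p></body></text>'
    '</TEI>'
)

print(unexpected_elements(sample))
```

Knowing in advance that an element such as the invented blink above can never appear is what lets the XSL and CSS cover every case.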
TEI Roma initially gives several options as starting points for creating a TEI schema. You can create a customized validator based on some of the most commonly used applications of TEI, as in Figure 8.
Figure 8. Create an XML schema
We use a plain, unadulterated version of TEI Lite, a subset of TEI P5 that contains most of the commonly used elements needed for describing documents in a digital format, though you can use any version of TEI produced by Roma with the XML Content module as described in the text that follows.
One important consideration is that Roma offers multiple output formats for your custom schema. Although the XML Content module also handles several formats, be sure to select a compatible one when you tell Roma to output your file. We have found that the RELAX NG format in XML syntax (see Figure 9) works well with the XML Content module's validator and is a powerful and portable format if you need to use your schema for other purposes, so we use it for the rest of this tutorial (see Resources).
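If you are unsure which format a downloaded schema file is in, a quick check along the following lines can help. This is a sketch under stated assumptions: the function name is hypothetical, and it tests only whether the file is XML whose root element is in the RELAX NG namespace (http://relaxng.org/ns/structure/1.0). It does not verify that the schema itself is correct or that the XML Content module will accept it.

```python
import xml.etree.ElementTree as ET

RELAXNG_NS = "http://relaxng.org/ns/structure/1.0"


def looks_like_relaxng_xml(schema_text):
    """Return True if schema_text is XML with a RELAX NG root element."""
    try:
        root = ET.fromstring(schema_text)
    except ET.ParseError:
        # RELAX NG compact syntax and DTDs are not XML, so they land here.
        return False
    return root.tag.startswith("{" + RELAXNG_NS + "}")


xml_syntax = (
    '<grammar xmlns="http://relaxng.org/ns/structure/1.0">'
    '<start><element name="p"><text/></element></start>'
    '</grammar>'
)
compact_syntax = 'element p { text }'
w3c_schema = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>'

print(looks_like_relaxng_xml(xml_syntax))
print(looks_like_relaxng_xml(compact_syntax))
print(looks_like_relaxng_xml(w3c_schema))
```

To check a file from Roma, read its contents and pass the text to the function, for example the teilite.rng file used later in this tutorial.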
Figure 9. Select a schema format
After you download the schema file from Roma, upload it to the appropriate location on your website using the same methods that you used to upload Drupal and its modules. We place the file in the XML Content module's expected location in the sites/all/modules/xmlcontent directory.
You now need to update the XML Content module's validator so that it validates all incoming XML content against your custom schema. Navigate to the Input Formats section of Site Configuration in your Drupal administration console (Administer > Site configuration > Input formats > TEI XML). As in Figure 10, in the Schema File Path field, type the filename and extension of the file you just uploaded (as provided by Roma in this case): teilite.rng.
Figure 10. Configure the schema in the XML Content module
After you upload the teilite.rng file containing your custom schema and enable schema validation in the input filter configuration settings, you are ready to begin uploading or creating TEI-compliant XML content on your site.