Though it's best known as a powerful text editor favored by UNIX developers, Emacs can also be used to work with XML on non-UNIX platforms such as Windows, MS-DOS, and MacOS. Emacs (see the sidebar Emacs in a nutshell) works as a full-blown development environment for processing text, writing applications, and, as I'll discuss, creating structured information like XML and SGML. I use it as a general-purpose editor for creating and managing some of my programming projects, and for writing XHTML and playing around with SGML and XML. In fact, I used it to write this article.
This article tells how to install Emacs and the extensions PSGML and OpenSP. It also outlines how to customize Emacs to make it function with a variety of DTDs. I present many of the Emacs customizations one piece at a time. However, you can download a zip file with sample DTDs and all of the Emacs customizations (see Resources). My intent is to get you started using Emacs by providing you with just enough information for you to understand what's going on. Then you'll be able to add DTDs and customize Emacs based on your needs and preferences.
Start by installing Emacs. You can access additional Emacs information and distributions from the GNU Web site or its mirrors (see Resources). Some UNIX-based distributions come with Emacs. For example, my Red Hat Linux 7.1 came with Emacs version 20.5.1 (and an older version of PSGML) already installed.
Because most UNIX and Linux users are savvy enough to get and install software without any guidance from me, I'll just direct you to the GNU project site. The customizations I describe in the rest of the article also apply to UNIX/Linux environments.
Windows users can find the latest binary distribution in the windows/emacs/ directory of any of the FTP sites listed on the GNU FTP list. The emacs-20.7-bin-i386.tar.gz file does not include the Emacs Lisp source. If you're interested in programming Emacs or seeing how particular functions are implemented, download the emacs-20.7-fullbin-i386.tar.gz file instead. (Editor's note: A newer version, version 21.1, was released in late October, while this article was in production. This article is based on the 20.7 version and will be updated to include details on the new version.)
Download the .gz file to your local hard drive. Use WinZip or some other .gz-aware tool to extract the contents to your hard drive (make sure you "retain folder information" when you extract so the appropriate directory structure is created). If you allow the original directory structure to be created, you will end up with a base path of d:\emacs-20.7, where d: is the drive on which you unpacked the distribution. For the remainder of this article, I'll refer to this directory as d:\Emacs. The readme suggests that you avoid spaces in your install path. I'd heed this warning.
After you've unpacked the distribution, the main directory will contain a number of files and four subdirectories: bin, etc, info, and lisp. The README.W32 file contains information on obtaining future distributions, setting up Emacs, and so on. (The README file also includes a URL for the FAQ for GNU Emacs on Windows 95/98/ME and 2000.) Though it is not required, I suggest that you run the addpm.exe file in the bin subdirectory to register Emacs so that it's accessible from your Start menu. Once it is registered, select Start->Gnu Emacs->Emacs. If you opt not to register Emacs, start it by double-clicking the runemacs.exe file installed in the d:\Emacs\bin directory.
You can take a tutorial by starting Emacs and selecting Help->Emacs Tutorial. Don't get discouraged by the fact that you must use control-key sequences for many of the functions. You can begin by learning a few commonly used control-key sequences and learn new ones as you find you need them. Besides, in the GUI version of Emacs, many functions are accessible from menus. See Resources for a couple of suggestions for other tutorials on Emacs and PSGML.
The next step is to start customizing Emacs as necessary, such as:
- Setting variables to control various behaviors
- Adding packages
- Writing your own Emacs Lisp code
So first I'll cover how to set variables and add packages.
Your first step is to locate or create an Emacs initialization file. Emacs looks for this file in your home directory. In a UNIX environment, the initialization file is typically named .emacs.
On Windows, I use a file named _emacs since Windows doesn't generally like filenames that start with a period. On Windows, you specify the home directory by setting an environment variable or by setting a registry entry. As a last resort, Emacs looks for the initialization file in the directory c:\. (So for now, either create this file in c:\, or consult the GNU Emacs FAQ For Windows (see Resources) for other options.)
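If you are unsure which directory Emacs treats as your home directory, you can ask Emacs itself. The following is a minimal sketch and not one of the article's listings: type the expressions into the *scratch* buffer and evaluate each one by placing the cursor after its closing parenthesis and pressing [Ctrl]-j.

```elisp
;; Evaluate these in the *scratch* buffer, one at a time.

;; Shows the HOME setting as Emacs sees it.
(getenv "HOME")

;; Shows the directory that "~" expands to, which is where Emacs
;; looks for the initialization file.
(expand-file-name "~")
```

The directory reported by the second expression is the place to create your _emacs (or .emacs) file.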
To test that Emacs is finding your initialization file, use your favorite text editor to add the entry in Listing 1, which turns on the clock in the Emacs status bar. After turning on the clock and starting Emacs, look for the time in the status area (after the name of the current file). If you see the clock, all is well.
Listing 1. Testing the Emacs initialization file
; Display the time in the Emacs status area (an easy way to test
; that we are picking up our Emacs customizations).
(display-time)
Now that you have Emacs installed and you've laid the foundation for customizing it, we'll look at how to add packages that provide an environment for editing and validating SGML and XML documents.
The current distribution of GNU Emacs includes major editing modes for HTML and SGML. Generally, the functionality these modes provide is limited to assisting with element and attribute entry and navigating among elements. The HTML support is based on earlier versions of HTML.
Around the time SGML was starting to become popular for document publishing, Lennart Staflin created PSGML, a package for adding an SGML major editing mode to Emacs. Because HTML is an application of SGML and XML is a subset of SGML, you can use PSGML for editing those as well. In fact, recent PSGML versions provide an XML editing mode.
PSGML also includes a built-in SGML parser that is DTD aware. If you have your own dialect of SGML or XML, you simply install your DTD(s). Changes in HTML standards are handled by installing a new DTD (or set of DTDs). PSGML provides context-sensitive editing, so you can add elements or attributes based on where you are in the document. Navigation features allow you to move among elements and even move to the next-trouble-spot to locate markup that doesn't conform to your DTD. Formatting features indent elements, based on nesting, or hide element content so you can restrict the view to specific areas. Finally, you can validate documents with an external validating parser, which I discuss later in this article.
Figure 1. Emacs with PSGML installed (editing the DITA FAQs)
Figure 1 shows some of the structured editing features PSGML adds to Emacs, including:
- Colored markup syntax.
- Markup indented based on nesting level.
- Element folding. Note how the <prolog> and first two <section> elements have been collapsed to one line, to get them out of the way, while you can see the subelements in the unfolded Tips and Techniques section.
- Validation using an external parser. The results of the validation are displayed in a buffer below the document buffer. In this case, I used OpenSP to validate the document. If validation results in errors, you can use the Emacs next-error command ([Ctrl]-x `) to locate the error(s) in the source.
In addition to the features visible in Figure 1, PSGML adds many functions that you can access via pull-down menus, pop-up menus, or control-key sequences or commands.
You'll need to download the current version of PSGML. Version 1.2.2, current as of this writing, is available from SourceForge (see Resources). As with Emacs, PSGML is downloaded as a .gz file; unpack it using a .gz-aware utility such as WinZip. I unpacked the PSGML distribution into the site-lisp directory of my Emacs installation. Again, remember to retain directory information when you unpack. In my installation, I have d:\Emacs\site-lisp\psgml-1.2.2.
Once unpacked, consult README.psgml for some basic information, including how to install it in the UNIX version of Emacs.
Installing PSGML in Emacs for Windows
To prepare to install PSGML in the Windows version of Emacs, first create a directory for it (mine is site-lisp) and unpack the .gz file into it, retaining directory information.
Next you need to make sure that Emacs can find the files that make up PSGML. You do that by adding the contents of Listing 2 to your Emacs initialization file (_emacs).
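As a rough illustration of what such initialization-file entries look like, here is a minimal sketch. It assumes PSGML was unpacked to d:/Emacs/site-lisp/psgml-1.2.2 as in my installation; the autoload lines follow the form suggested in the PSGML README, so treat the whole fragment as an example to adapt rather than a definitive listing.

```elisp
;; Sketch only: assumes PSGML lives in d:/Emacs/site-lisp/psgml-1.2.2.
;; Note the forward slashes in the path.
(setq load-path (cons "d:/Emacs/site-lisp/psgml-1.2.2" load-path))

;; Load PSGML on demand the first time either mode is invoked.
(autoload 'sgml-mode "psgml" "Major mode to edit SGML files." t)
(autoload 'xml-mode "psgml" "Major mode to edit XML files." t)
```

Using autoload keeps Emacs startup fast, because the PSGML code is read only when you actually open an SGML or XML document.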
Now Emacs should have access to the PSGML files and it will use PSGML whenever you invoke sgml-mode or xml-mode. Later I'll show how to invoke those modes automatically, based on the file extension of the file being edited.
Whether you're working in Linux or Windows, there's one more thing to do to complete the installation: compile the PSGML files. Look in the psgml directory and you'll find a number of .el files. These are Emacs Lisp files. If you compile them, the PSGML support runs faster. Here's a simple way to accomplish this:
- Start Emacs.
- When prompted for a command, enter
- When prompted for a directory name, change the path to your PSGML files, for example d:/Emacs/site-lisp/psgml-1.2.2, and press [Enter].
That ought to compile most of the .el files and display the results in a "*Compile-log*" buffer. (I received a couple of warnings about obsolete variables when I compiled, but I believe they are harmless enough to ignore.) The end result should be an .elc file for most of the .el files in the psgml directory (not all of the files will be compiled, so don't worry if some are missing).
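If you prefer to compile from Lisp rather than from the command prompt, the following sketch shows one way to do it. It is not necessarily the same command used in the steps above, and the path is an assumption based on my installation. Evaluate it in the *scratch* buffer after the PSGML directory has been added to your load-path.

```elisp
;; Sketch only: byte-compile the .el files in the PSGML directory.
;; The second argument, 0, tells Emacs to compile files that have
;; no .elc file yet without asking about each one.
(byte-recompile-directory "d:/Emacs/site-lisp/psgml-1.2.2" 0)
```

The results appear in the same compile log buffer, so you can check for warnings in the usual way.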
SGML and XML modes aren't much use without incorporating the DTDs to describe the types of documents you need to create. So here's how to add some DTDs and the appropriate configuration to make them useful with PSGML.
Let's start with XHTML 1.0, which is an XMLized version of HTML 4.01 (see Resources for more information on XHTML). The XHTML DTDs will let you create HTML that conforms to the XML standard and can be validated with a parser (more on this later), thereby providing more robust and manageable documents. (See Resources for a zip file that contains the XHTML 1.0 DTDs and catalog file I discuss in this section).
Here's how to download the XHTML DTDs and the related entities:
- Create a subdirectory for the XHTML DTDs. I keep all of my DTDs in one place on my system; let's assume they will reside under a DTDs folder at the same level as Emacs: d:\DTDs. Under there, create a folder for the XHTML DTDs, d:\DTDs\xhtml1.
- After creating a folder to hold them, simply go to the W3C's DTD site (see Resources) to obtain the XHTML DTDs. There are three document types (strict, transitional, and frameset).
- For each of the three document types, click mouse-button-2 on the links and then save the target as a file. (You may need to remove the extra .txt extension that the browser adds when saving the files).
- Save the three entity sets (xhtml-lat1.ent, xhtml-special.ent, and xhtml-symbol.ent) into the same subdirectory as the DTDs.
Next, you need to create an SGML catalog file that PSGML can use to find these DTDs.
In the same directory as the DTDs, create a file called xhtml1.soc. The content should look like Listing 3.
See Resources for background on SGML Open Catalogs. For this article, I'll just explain the particular features that are used in Listing 3. The PUBLIC entries map what is referred to as a formal public identifier to a file system entity, which in this case is the file containing one of the DTDs. This allows us to refer to these DTDs without having to know where they are in the file system. They require that your documents have a <!DOCTYPE xxxxxx PUBLIC "yyyyy"> document type declaration, where "yyyyy" matches the public identifier of one of the entries in your catalog file. The DTDDECL entries are not actually used by PSGML, but they will be used by the SGML parser (stay tuned!), and they indicate what SGML declaration should be used with the DTD that has the same formal public identifier.
Lastly, the DOCTYPE entry allows us to refer to a particular DTD without using the formal public identifier or an actual filename. The downside to this is that, for XHTML, there are several DTDs that define the same document type, html, so you have to pick one. I would simply choose the one you expect to use the most. In Listing 3, I've chosen the transitional DTD.
Remember, you can use any of the XHTML document types as long as you include the full DOCTYPE declaration, with its public identifier, in your document.
There's one more piece of configuration that you need to do. PSGML needs to know where to find the SGML catalog files. There are a couple of ways to accomplish this, as described in the PSGML documentation. I use the method that relies on the environment variable SGML_CATALOG_FILES because it is also used by the SGML parser (patience, I come to it later in this article). So, now that you have a set of DTDs and a catalog file, create the aforementioned environment variable and set it to include the path to your xhtml1.soc file, for example d:\DTDs\xhtml1\xhtml1.soc. If you have more than one catalog file, you can include them all, separating them with a path delimiter (";" on Windows, ":" on UNIX-based systems).
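For comparison, one of the other ways is to set PSGML's sgml-catalog-files variable in your Emacs initialization file. The fragment below is a minimal sketch of that alternative, using the example path from this article. It is not the method I use, and it affects only PSGML, so an external parser would still need the environment variable.

```elisp
;; Sketch only: tell PSGML directly which catalog files to search.
;; The value is a list of file names; note the forward slashes.
(setq sgml-catalog-files '("d:/DTDs/xhtml1/xhtml1.soc"))
```

Additional catalogs can be added as further strings in the same list.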
I'll show you how to add one more set of DTDs, this time for DITA (see Resources):
- If necessary, create a subdirectory for the new DTDs, such as d:\DTDs\dita.
- Download the current DITA zip.
- Once you have the download, use your favorite utility to unpack the distribution to d:\DTDs\dita, once again preserving the directory information.
- Add the included catalog file to your SGML_CATALOG_FILES environment variable, so you might now have d:\DTDs\xhtml1\xhtml1.soc;d:\DTDs\dita\dtd\dita.soc.
Listing 4. dita.soc - SGML catalog file for DITA DTDs
OVERRIDE YES
-- For documents that don't include a DOCTYPE declaration --
DOCTYPE topic "topic.dtd"
--DOCTYPE topic "ditabase.dtd"--
DOCTYPE task "task.dtd"
DOCTYPE reftopic "reftopic.dtd"
DOCTYPE concept "concept.dtd"
DOCTYPE APIdesc "APIdesc.dtd"
DOCTYPE bctask "bctask.dtd"
-- There should probably be an entry here referencing the standard --
-- XML SGML declaration for example SGMLDECL or DTDDECL --
-- (once we have public identifiers for the DTDs) --
As you can see, once you get things initially set up, adding new DTDs is relatively easy.
Now that you have Emacs with PSGML installed and you have a set of DTDs to work with, you can begin editing documents using PSGML. Whenever you edit a document with an extension of .sgml or .xml, you will note that Emacs invokes SGML major mode (indicated in the status area) and the menu changes to look like the one shown in Figure 2.
Figure 2. Emacs menu with SGML editing mode
So far, if you edit an .html document, the old HTML major mode will be invoked. I'll show you how to fix that in a moment. In the meantime, you could invoke [Alt]-x and key in xml-mode to force XML mode.
To try using PSGML, edit a test file called test.html and insert beginning and ending html tags (<html> and </html>). Turn on XML mode by invoking [Alt]-x and then keying in xml-mode.
Next, click on the menu item DTD->Info->General DTD Info. This causes PSGML to parse the DTD and display general information in a buffer below your document. If your test was not successful, check for an error in your catalog file or environment variable. Also, this test assumes you have a DOCTYPE entry for html in one of your SGML catalog files so that PSGML knows what DTD to associate with a doctype of "html". Alternatively, you could include a doctype declaration, such as <!DOCTYPE html PUBLIC ...>, where the PUBLIC identifier matches an entry in one of your SGML catalog files. If you have your catalogs and environment variables set up correctly, you should see something like this:
Doctype: html
Element types: 89
Entities: 253
Parameter entities: 63
Files used:
  d:/DTDs/xhtml1/xhtml-special.ent
  d:/DTDs/xhtml1/xhtml-symbol.ent
  d:/DTDs/xhtml1/xhtml-lat1.ent
  d:/DTDs/xhtml1/xhtml1-transitional.dtd
The output indicates that PSGML was able to locate the DTD and parse it, including all of the referenced entity modules.
Now PSGML is aware of your DTD, and you can begin utilizing some of PSGML's more powerful features. For example, place the cursor after the <html> tag and select menu item Markup->Insert Element. You will be presented with a list of elements that are valid at that location in the document. But before getting into any more of the editing features, let's do some more customization to get more out of PSGML.
Now that you can edit documents with PSGML, let's explore some more customizations that will exploit more of PSGML's features and make it easier to use. Listing 5 shows some more customizations you can append to your existing Emacs initialization file.
The first section of Listing 5 tells Emacs which major mode to invoke when you load a file with a particular extension, similar to the way Windows associates applications with file types. Note here that I've set .htm and .html files to use xml-mode. This is because I'm actually writing XHTML.
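A minimal sketch of this kind of association, using the standard Emacs auto-mode-alist variable, might look like the following. The exact patterns are my illustration here and may differ from the ones in the downloadable customizations.

```elisp
;; Sketch only: pick a major mode from the file extension.
;; New entries go at the front so they take precedence over the defaults.
(setq auto-mode-alist
      (append '(("\\.sgml\\'"  . sgml-mode)
                ("\\.xml\\'"   . xml-mode)
                ("\\.html?\\'" . xml-mode))  ; .htm and .html, edited as XHTML
              auto-mode-alist))
```

Each entry pairs a regular expression that is matched against the file name with the mode function to call.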
The next four sections of Listing 5 provide for syntax-based highlighting which causes different markup constructs to appear in different colors in the editor. By default, PSGML simply defines tags to appear in bold and comments to appear in italic. Here, I've set start and end tags to appear in blue, comments to appear in purple, entity references to appear in blue, PIs to appear in magenta, and so on. In addition to the constructs I've modified, you can also define the appearance of ignored marked sections, marked section start and ends, and short references. The purpose of the four sections is to:
- Define a face
- Set the characteristics of the face
- Associate the face with the particular markup type
- Activate the settings
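The four steps can be sketched as follows. The face names are hypothetical ones chosen for this example, and only a few markup types are covered; the variables sgml-markup-faces and sgml-set-face are the ones PSGML provides for this purpose.

```elisp
;; 1. Define the faces (these face names are hypothetical).
(make-face 'my-sgml-tag-face)
(make-face 'my-sgml-comment-face)
(make-face 'my-sgml-entity-face)
(make-face 'my-sgml-pi-face)

;; 2. Set the characteristics of each face.
(set-face-foreground 'my-sgml-tag-face "blue")
(set-face-foreground 'my-sgml-comment-face "purple")
(set-face-foreground 'my-sgml-entity-face "blue")
(set-face-foreground 'my-sgml-pi-face "magenta")

;; 3. Associate the faces with particular markup types.
(setq sgml-markup-faces
      '((start-tag . my-sgml-tag-face)
        (end-tag   . my-sgml-tag-face)
        (comment   . my-sgml-comment-face)
        (entity    . my-sgml-entity-face)
        (pi        . my-sgml-pi-face)))

;; 4. Activate the settings.
(setq sgml-set-face t)
```

Because this replaces the whole list, markup types that are left out (such as marked sections and short references) get no special face until you add entries for them.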
The next section of Listing 5, sgml-auto-activate-dtd, causes the DTD associated with the document to be parsed as soon as the document is loaded. This is set to false by default because of the processing required. With processors as fast as they are, this shouldn't be a concern. Also, if this is not set to true, the syntax coloring will not take effect when a document is initially loaded; you have to explicitly parse the DTD first, using either the DTD->Parse DTD menu item or the [Ctrl]-c [Ctrl]-p key sequence.
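In Emacs Lisp, "true" is written as t and "false" as nil, so a minimal version of this setting looks like this:

```elisp
;; Parse the DTD as soon as a document is loaded.
(setq sgml-auto-activate-dtd t)
```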
The next section modifies the DTD->Insert DTD menu item to allow you to quickly insert the DOCTYPE declaration for a new document. I've included a variety of document types, including both SGML and XML document types (some are commented out). Note how the XML document types include the XML declaration. Whenever you add a new DTD, you'll probably want to update the sgml-custom-dtd variable to add your new DTD to the Insert DTD menu.
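To give an idea of the format, here is a minimal sketch with two XHTML entries. Each entry is a list whose first element is the menu text and whose second element is the text to insert. The public and system identifiers are the ones the W3C publishes for XHTML 1.0; the choice of entries is only an illustration, and the public identifiers must match the PUBLIC entries in your own catalog file.

```elisp
;; Sketch only: two entries for the DTD->Insert DTD menu.
(setq sgml-custom-dtd
      '(("XHTML 1.0 Transitional"
         "<?xml version=\"1.0\"?>
<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"
    \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">")
        ("XHTML 1.0 Strict"
         "<?xml version=\"1.0\"?>
<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"
    \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">")))
```

To add a DITA or other document type, append another list with its own menu text and DOCTYPE declaration.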
The last section defines my-psgml-hook and hooks it into the SGML mode. This allows you to launch your default browser against the current file you are editing. This is handy for viewing HTML and XHTML as you edit. It will be even more handy when browsers more fully support XML and XSLT.
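A hook of this kind can be sketched as follows. The function browse-url-of-buffer comes from the browse-url package distributed with Emacs; the key binding shown is an arbitrary choice for this example and may not match the one in the downloadable customizations.

```elisp
;; Sketch only: the key sequence C-c C-b is an arbitrary choice.
(defun my-psgml-hook ()
  "Bind a key that opens the current buffer in the default browser."
  (local-set-key "\C-c\C-b" 'browse-url-of-buffer))

;; Run the function each time SGML mode is started.
(add-hook 'sgml-mode-hook 'my-psgml-hook)
```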
Now that you have some customizations in place, let's take a quick test drive to see some of the PSGML editing features.
- Start Emacs and open a file ([Ctrl]-x [Ctrl]-f) called test.html. That should put Emacs into XML mode, which you can verify by looking at the status line.
- From the menu, select DTD->Insert DTD->XHTML 1.0 Transitional. That should insert the XML declaration and a <!DOCTYPE html...> declaration for a document with the default name "html." Also notice the syntax coloring of these two entries.
- Next, place the cursor after the DOCTYPE declaration and from the menu select Markup->Insert Element (or press [Shift] and mouse-button-2). You should see a pop-up menu with a list of elements that are valid at this point in the document, in this case the html element. Notice that when you insert the html element, its required elements, head and body, are also inserted. Also, a comment appears prompting you that you must insert either a title or base element. This feature is handy until you get used to a particular markup language, after which it's more annoying than helpful. You can disable the prompting by setting the sgml-insert-missing-element-comment variable to false in your Emacs initialization file.
- You can use the same technique to add or modify attributes: Place the cursor inside a start-tag and select from the menu Markup->Insert Attribute (or press [Shift] and mouse-button-2). A pop-up menu appears that offers valid attributes for the selected element. Select an attribute from the pop-up menu.
Note how the structure is indented based on element nesting. If you insert an H1 inside the body, it will not be indented. This is because the default settings do not indent mixed content elements (elements that may contain both markup and text, or PCDATA in SGML/DTD parlance). You can change the indenting assumptions by setting sgml-indent-data to true in your Emacs initialization file. Before doing that, consider whether white-space will be significant in your XML application (see Resources).
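Both of the settings mentioned in this test drive are ordinary PSGML variables. A minimal sketch of the corresponding initialization-file entries, using nil for false and t for true, looks like this:

```elisp
;; Stop PSGML from inserting comments that prompt for missing elements.
(setq sgml-insert-missing-element-comment nil)

;; Indent elements with mixed content as well.
(setq sgml-indent-data t)
```

You can enable either setting on its own; they are independent of each other.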
- If you have already installed an external validator, try validating your document: Select SGML->Validate and then press [Enter] (you may be prompted to save your file), or press [Ctrl]-c [Ctrl]-v and then press [Enter].
Note: If validation doesn't work, install an external validator (as I explain how to do in the next section) and test drive that feature later. If validation does work, you should receive an error indicating the "head" is not finished. If you press [Ctrl]-x ` (note the backtick), you will be taken to the line in the source where the error occurred. Go ahead and insert a title element.
Although PSGML contains an SGML parser, it is not a fully functional parser. It does, however, provide the ability to validate SGML and XML documents using an external parser. This allows you to fully validate your source and find, for example, elements with IDREFs that lack a corresponding target element with a matching ID.
When you invoke SGML->Validate from the menu or keyboard ([Ctrl]-c [Ctrl]-v), PSGML starts a separate process to run the SGML parser against the file you are currently editing. It displays the results of the validation in a buffer below that file. If it encounters errors, use [Ctrl]-x ` (note the backtick) to have Emacs take you to the location of the error in your source document.
By default, PSGML is configured to invoke nsgmls, part of SP, an SGML parser originally written by James Clark. SP is no longer being supported, but it is the foundation for OpenSP, which is now maintained on SourceForge.net as part of the OpenJade project. (See Resources for more information on SP and OpenSP.) You can download and use SP or OpenSP. I chose OpenSP because it is actively supported, and it contains support for the DTDDECL keyword of SGML catalogs whereas SP does not (DTDDECL is supported as of the 1.4 version of OpenSP). If you are dealing only with XML, you will need only a single SGML declaration defined for XML. If, however, you will also be dealing with SGML, the DTD you are using will probably reference its own declaration. Because PSGML allows you to specify only one particular SGML declaration to be used (via the sgml-declaration variable, or sgml-xml-declaration for XML mode), the DTDDECL catalog feature can come in handy. One last consideration is that I was unable to locate binaries for OpenSP for the Windows platform. Because SourceForge.net maintains only source code, you will need to build the binaries yourself or locate them by searching more diligently than I did.
If you prefer to use SP, all you really need to do is download SP (see Resources), unpack it, and update two environment variables. You will need to append to your PATH so that nsgmls can be found when invoked by PSGML. Assuming you unpack the distribution to the path d:\SP, you would need to add d:\SP\bin to your PATH. Also, you will need to add an entry to your SGML_CATALOG_FILES so the SGML declaration for XML can be found. If you don't pick up the correct SGML declaration when validating your XML, you will probably receive a lot of error messages. This is because XML doesn't support SGML's OMITTAG feature, which requires the DTD to specify minimization information (XML DTDs do not include this information because all tags are required). Again, assuming you installed SP in d:\SP, an SGML declaration for XML will be in d:\SP\pubtext\xml.dcl, which is referenced by d:\SP\pubtext\xml.soc (see the SGMLDECL entry). So simply add d:\SP\pubtext\xml.soc to your SGML_CATALOG_FILES so nsgmls can find this catalog. Alternatively, you can set the Emacs/PSGML variable sgml-xml-declaration in your Emacs initialization file to point to this file, as shown in Listing 6.
Listing 6. _emacs - enabling SP for validation
; Note the forward slashes in the path!!!!
(setq sgml-xml-declaration "d:/SP/pubtext/xml.dcl")
If you wish to use OpenSP, you need to make a couple of slight modifications to PSGML. However, all of this can be done using the Emacs initialization file.
Assuming you have built and installed OpenSP or found a pre-built binary distribution, again the first thing you need to do is update your PATH so the executables can be found. Assuming OpenSP is installed in d:\OpenSP, you would need to add d:\OpenSP\bin to your PATH. Note that you can have both SP and OpenSP installed and accessible at the same time because the executables in OpenSP have been renamed.
The next thing you need to do is update your Emacs configuration to alter the command used for validation. This would normally be done by setting the Emacs variable sgml-validate-command, and in fact we will set this variable to handle the case of using OpenSP's onsgmls executable to validate in sgml-mode. For xml-mode, however, this doesn't seem to work correctly: When I set this variable in my Emacs initialization file, the sgml-mode picks up the change, but the xml-mode does not. You can get around this issue by providing a mode-hook. The goal is to override the default validate command, which is defined as nsgmls -wxml -s %s %s, setting it to onsgmls -wxml -s %s %s. The fragment of Emacs initialization code in Listing 7 takes care of both of these tasks.
You really don't need to understand what's going on here to make PSGML work with OpenSP. However, if you're interested, a mode-hook basically defines an Emacs function that will be invoked after the mode is initialized. This gives you an opportunity to override functions and settings established by that mode. In this case, since the validate command is hardwired in the PSGML code, you can use the mode-hook to override that setting without having to modify the PSGML code and recompile it (which would need to be done each time you install a new version of PSGML).
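To make the idea concrete, here is a minimal sketch of the two tasks. It rests on assumptions that you should check against your own PSGML version: the hook name xml-mode-hook and the function name my-xml-validate-hook are illustrative, and the SGML-mode command string is simply the XML one without the -wxml option.

```elisp
;; SGML mode: use OpenSP's onsgmls instead of nsgmls.
;; (Assumed command string; adjust the options to your needs.)
(setq sgml-validate-command "onsgmls -s %s %s")

;; XML mode: override the hardwired command after the mode is set up.
;; ASSUMPTION: xml-mode-hook is run by your PSGML version once XML
;; mode has been initialized; my-xml-validate-hook is a made-up name.
(defun my-xml-validate-hook ()
  "Use onsgmls for validation in XML mode."
  (setq sgml-validate-command "onsgmls -wxml -s %s %s"))

(add-hook 'xml-mode-hook 'my-xml-validate-hook)
```

After restarting Emacs, you can confirm the change by invoking validation and checking the command shown in the prompt before you press [Enter].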
Once you get comfortable with the basic functions I've described, try exploring each of the menus that PSGML adds to the Emacs menu bar:
- On the SGML menu, experimenting with the File Options and User Options can give you a good idea of what you can customize within PSGML. For more information on particular settings, you can refer to the online documentation or consult the "Editing SGML with Emacs and PSGML" document included with PSGML. Changes you make through this menu persist only for that particular editing session. If you prefer to make a permanent change, you have to update your Emacs initialization file.
- The Modify menu mainly provides functions for changing existing markup. Some of these functions, for example Normalize, might come in handy for trying to clean up HTML and make it XHTML.
- Functions under the Move menu basically allow for quicker navigation of the structure of your document.
- The Markup menu provides menu access for inserting elements, tags, attributes, entities, and so on. I'll just point out two things that might not be obvious. Tag Region allows you to wrap existing text inside an element, using PSGML's internal parser to determine what elements are valid for the highlighted location. Insert Entity allows you to insert general text entities defined in your DTD. If you define new text entities in your internal subset at the beginning of the document, you will need to reparse the DTD to pick up the newly defined entities during your editing session.
- Items under View are self explanatory.
- Most of the items under the DTD menu have been covered. The Info items are worth a mention, however, because they can be useful for exploring your DTD if you are not already familiar with it.
Source code for this article: x-emacs/emacscust.zip (35 KB, HTTP)
The XHTML 1.0 DTDs, _emacs customizations, and updated dita.soc files I described in this article are available in the emacscust.zip.
- Download PSGML version 1.2.2 (or whatever version is current) from SourceForge.
- The GNU Web site provides information on Emacs as well as numerous other GNU projects.
- If you prefer to learn from a book, O'Reilly & Associates publishes a good book called Learning GNU Emacs, which provides information on how to accomplish basic editing tasks, use many of the major editing modes, customize, and even program Emacs.
- There's also an excellent tutorial in Bob DuCharme's book SGML CD, in the chapter "Editing SGML with the Emacs Text Editor", which is available online. In addition to providing a tutorial on using Emacs, Bob also discusses using PSGML for editing SGML documents, and in fact this chapter is what got me started.
- Check out the GNU Emacs FAQ for Windows.
- For more information on XHTML, visit the XHTML 1.0 section of the W3C Web site.
- The Darwinian Information Typing Architecture, DITA, is an architecture for creating article-based information. DITA includes a base set of DTDs and a framework that allows for specialization using derived DTDs and processing conventions.
- In the DTD samples file provided, I've included a DTD I used to edit Host On Demand (HOD) macros (in the hodmacro directory). This demonstrates how Emacs with PSGML can be used to edit XML which is not of the traditional book or article type of information. For more information on HOD, see WebSphere Host Publisher. You can learn more about WebSphere Host Publisher file formats from WebSphere Host Publisher Programmer's and Reference.
- For a more data-oriented XML editing tool, check out the replacement for the WebSphere Studio Application Developer environment -- WebSphere Studio Site Developer which contains a visual XML editor, or check out the Downloads and products section of developerWorks XML zone (to view editing tools only, select Editing in the View by field).
- Find out more about SGML Open Catalogs.
- SP is an SGML parser originally written by James Clark. It is no longer being supported, but is the foundation for OpenSP, which is now maintained on SourceForge.net as part of the OpenJade project. If you're looking for a pre-built RPM package for Linux, you can try RPM Find (SP) or RPM Find (OpenSP).
- Another good source of publicly available XML/SGML tools is The XML Cover Pages.
- For details on comparing XML documents, including a discussion of significant and nonsignificant whitespace, see Brett McLaughlin's tip, What's the diff?.
Brian Gillan works in the ID Technology and Design group for IBM in Research Triangle Park, North Carolina, providing programming, integration, strategy, and support for publishing tools used by the IBM Information Design community. You can contact Brian at firstname.lastname@example.org.