A multitude of XML plug-ins have been developed for Eclipse, and new ones appear all the time. This article focuses on the plug-in called XMLBuddy, because its rich feature set covers most of the functions needed for XML document development. We do touch on other plug-ins when they provide a richer set of user options for specific tasks. This article will familiarize you with the basic XML editing features, but bear in mind that Eclipse is a dynamic framework that puts a wide and growing array of tools and features at your fingertips.
Eclipse already includes source code for a very primitive XML editor that offers only XML syntax highlighting. It extends classes included in the org.eclipse.ui.editors package, which provides a standard text editor and a file-based document provider for the Eclipse Platform. This simple XML editor serves as a code example that you can use as a base for your own Eclipse XML plug-ins. Its source code can only be generated from the Eclipse project wizard, and you will need to compile it yourself as described here.
To build this basic XML editor, go to the menu File => New and select Project. In the project wizard, select Plug-in Development => Plug-in Project.
If you don't see the Plug-in Development option, it means that you have the Eclipse Platform Runtime Binary without the Plug-in development environment. Go to the eclipse.org downloads page (see Resources later in this article for a link) and download the Eclipse Platform Plug-in SDK.
Click Next. Give your project a name, such as org.my.eclipse.xmleditor, click Next, and accept the default values on the Plug-in Project Structure screen. Now select the Create a plug-in project using a code generation wizard and the Plug-in with an editor options. The wizard automatically generates the source code for the XML editor.
You still need to compile it, though. Click Next, and then click Finish on the next screen, Simple Plug-in Content. Then go to the Project menu and select Rebuild All to build the project.
Now you need to create the editor.jar file using the File => Export menu. Exit Eclipse and copy the whole org.my.eclipse.xmleditor plug-in directory into the plugins subdirectory of your Eclipse installation. When you run Eclipse again, you can add an XML file to your project and see how XML syntax highlighting works (see Figure 1). Remember that this sample XML editor doesn't offer any kind of validation or syntax checking.
Figure 1. Simple syntax highlighting is provided by the Eclipse XML editor

The most popular and advanced XML editor plug-in for Eclipse is XMLBuddy, developed by Bocaloco Software (see Resources for a link). XMLBuddy is a freeware plug-in that enriches Eclipse with XML editing capabilities, including user-configurable syntax coloring, DTD-driven code assist, validation, and a synchronized outline view. XMLBuddy also adds an XML perspective to the Workspace and new project templates for XML documents and DTDs. You can install XMLBuddy in the same way as any other Eclipse plug-in: just unzip the plug-in archive into the \eclipse\plugins subdirectory under the main Eclipse installation directory. Remember to restart Eclipse. Figure 2 shows XMLBuddy in action.
Figure 2. XMLBuddy in action: The main editor window with XML Outline view

XML is a meta-markup language. An XML element is made up of a start tag, an end tag, and the data in between, so you need good editing features along with syntax highlighting. XMLBuddy (in the current version 0.2) extends Eclipse with the following XML editing features:
- Formatting. You can format an entire XML document, or only part of it, automatically by selecting all or part of the code.
- Advanced syntax coloring. You can configure XML code coloring through the Window => Preferences => XML => Colors menu. Coloring applies to ordinary XML documents, DTDs (internal and external subsets), and JSP files. Figure 3 shows how you can change the default settings for syntax highlighting.
Figure 3. Changing default settings for XML syntax highlighting

- XML code assist. Based on a document's DTD, assistance is provided for elements or other tag names, attribute names, and attribute values.
- Extended character encoding support. XMLBuddy auto-detects document encoding according to the XML 1.0 specification, honoring the encoding given in the <?xml ?> declaration, if provided. You can also specify a default encoding for all XML documents or for only a specific file (see Validating and character encoding XML code below).
- Outline view. The outline view window shows the structure of the elements in your document. By default, the outline is dynamically synchronized with editing. This gives you a quick look at a document's logic.
- DTD generation. You can automatically generate a DTD from a document's contents. XMLBuddy caches Internet-based DTDs locally, so a DTD and associated documents are downloaded only once, no matter how many times they are used.
Validating and character encoding XML code
The fundamental difficulty with XML documents is checking their internal validity (the cohesion of the document's logic). Syntax checking verifies that all tags and definitions are properly formed and properly used. Only after a document passes syntax checking can it be confirmed as well formed and its logical structure be parsed. XML documents are validated by XML parsers.
All the Eclipse XML plug-ins described below are able to perform XML validation, indicating warnings and errors inside the code. If you try to open an XML document, the XML parser might generate an error. The exact error code, the error text, and even the line that caused the error can be retrieved. You can validate an XML document on demand or automatically when you save the document. You may clear validation error tasks as a group. The XMLBuddy plug-in uses a system-wide XML parser, but remember that the Eclipse Platform comes with one of the best XML parsers available: Xerces (XML4J). See Resources for download information. You are not limited to Xerces or a system parser, though, because you can point to other XML parsers using Run => External Tools => Configure.
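The idea of retrieving the error text and the offending line can be illustrated outside Eclipse with the JAXP parser that ships with the JDK. This is only a minimal sketch, not the code XMLBuddy itself runs: the class name WellFormedCheck and the sample document are made up for illustration, and only well-formedness is checked (no DTD or Schema validation).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: reports the first well-formedness error with its line number.
final class WellFormedCheck {

    /** Returns null when the text is well formed, otherwise "line N: message". */
    static String check(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow errors instead of letting the default handler print to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) throws SAXException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            // The parser records where in the input the problem was detected.
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return "parse failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The end tag on the third line does not match its start tag.
        String broken = "<note>\n  <to>Ann</to>\n  <from>Bob</form>\n</note>";
        System.out.println(check("<note><to>Ann</to></note>"));
        System.out.println(check(broken));
    }
}
```

An editor plug-in would typically turn such a line number into a marker next to the source line rather than print it.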
The other significant XMLBuddy feature is support for different character encodings. This comes in handy if, for example, you need to work with portable XML documents written in different languages (for example, Polish and English). This is not an easy task, considering that there are three main ways of encoding Polish national characters: Windows Latin-2 (CP1250), used by Windows 9x/2000; ISO Latin-2 (ISO-8859-2), used on the Internet and by UNIX and UNIX-like systems (such as Linux); and the encodings used by MacOS and MacOS X, which follow different character encoding standards for the Polish language.
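To see why the encoding matters, consider how a single Polish word is stored under the two Latin-2 variants mentioned above. The following sketch is an illustration only; the class name PolishEncodings and the sample word are assumptions, and it relies on the ISO-8859-2 and windows-1250 charsets being available in your Java runtime.

```java
import java.nio.charset.Charset;

// Hypothetical demo: the same text yields different bytes in ISO Latin-2 and Windows Latin-2.
final class PolishEncodings {

    static final Charset ISO_LATIN_2 = Charset.forName("ISO-8859-2");
    static final Charset WINDOWS_LATIN_2 = Charset.forName("windows-1250");

    /** Encodes the text with the given charset and renders the bytes as hex pairs. */
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    /** Decodes bytes written as Windows Latin-2 as if they were ISO Latin-2. */
    static String misread(String text) {
        return new String(text.getBytes(WINDOWS_LATIN_2), ISO_LATIN_2);
    }

    public static void main(String[] args) {
        // "maka" with an a-ogonek as the second letter (Polish for "flour"),
        // written with a Unicode escape so the source file encoding does not matter.
        String word = "m\u0105ka";
        System.out.println("ISO-8859-2:   " + hex(word, ISO_LATIN_2));
        System.out.println("windows-1250: " + hex(word, WINDOWS_LATIN_2));
        System.out.println("misread equals original: " + misread(word).equals(word));
    }
}
```

The a-ogonek is stored as byte B1 in ISO-8859-2 but as B9 in CP1250, so a document saved in one encoding and opened in the other shows the wrong letter even though every ASCII character survives.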
In general, XMLBuddy offers two solutions for character encoding: XML document encoding may be auto-detected based on file contents or set to a default encoding. Default encoding may be workspace-wide or resource-specific. To open the XML encoding preferences, select Window => Preferences => XML => Encoding.
The problem with these solutions to character encoding is that, for XML, one encoding (per workbench) does not fit all. XML documents may arrive from any number of sources worldwide. In many cases, users have no control over the encoding of documents that come from others and may have no way to partition their work along encoding boundaries. It is very unlikely that the same encoding preference will suit, say, Java source files and XML documents. In cases where no single set of global preferences is sufficient, XMLBuddy provides per-document properties. Specifying the properties for each file in a project is a burdensome task, but when a document arrives that uses an uncommon encoding that cannot be auto-detected and is not specified in the document, properties are the only solution. To open the encoding properties for a particular file, right-click on the file and select Properties => XML => Encoding. Figure 4 shows how to set global character encoding.
Figure 4. Setting global character encoding for XML documents in Eclipse
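The interplay of auto-detection, per-file properties, and a workspace default described above can be sketched in a few lines of Java. This is a simplified illustration under stated assumptions, not XMLBuddy's actual algorithm: the class XmlEncodingResolver and the precedence order (detected, then per-file, then workspace default) are hypothetical, and the detection covers only byte order marks and an ASCII-compatible XML declaration, not the full procedure suggested by the XML 1.0 specification.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical resolver: detected encoding first, then a per-file setting, then a default.
final class XmlEncodingResolver {

    // Matches encoding="..." inside an XML declaration at the very start of the text.
    private static final Pattern DECLARED = Pattern.compile(
            "^<\\?xml\\s[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns the encoding implied by a byte order mark or XML declaration, or null. */
    static String detect(byte[] content) {
        if (content.length >= 3 && (content[0] & 0xFF) == 0xEF
                && (content[1] & 0xFF) == 0xBB && (content[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (content.length >= 2) {
            int b0 = content[0] & 0xFF;
            int b1 = content[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) return "UTF-16BE";
            if (b0 == 0xFF && b1 == 0xFE) return "UTF-16LE";
        }
        // The declaration itself uses ASCII characters, so Latin-1 is safe for reading it.
        int headLength = Math.min(content.length, 200);
        String head = new String(content, 0, headLength, StandardCharsets.ISO_8859_1);
        Matcher m = DECLARED.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    /** Applies the assumed precedence: detected, then per-file override, then default. */
    static String resolve(byte[] content, String fileOverride, String workspaceDefault) {
        String detected = detect(content);
        if (detected != null) return detected;
        if (fileOverride != null) return fileOverride;
        return workspaceDefault;
    }

    public static void main(String[] args) {
        byte[] declared = "<?xml version=\"1.0\" encoding=\"ISO-8859-2\"?><a/>"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] bare = "<a/>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(resolve(declared, null, "UTF-8"));
        System.out.println(resolve(bare, "windows-1250", "UTF-8"));
        System.out.println(resolve(bare, null, "UTF-8"));
    }
}
```

The per-file override is consulted only when nothing can be read from the document itself, which mirrors the situation described above where properties are the last resort.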

XML Schema specifies the XML Schema definition language, which offers facilities for describing the structure and constraining the contents of XML 1.0 documents, including those that exploit the XML Namespace facility. The schema language, which is itself represented in XML 1.0 and uses Namespaces, substantially reconstructs and considerably extends the capabilities found in XML 1.0 DTDs. Keep in mind that DTDs have a number of limitations:
- The content models are generally hard to use for complex requirements.
- There is no support for Namespaces.
- There is very limited support for modularity and reuse.
- There is no support for extensions or inheritance of declarations.
- It is difficult to write, maintain, and read large DTDs, and to define families of related schemas.
- There is no embedded, structured self-documentation (only <!-- comments --> is available).
- Content and attribute declarations cannot depend on attributes or element context (many XML languages use that, but their DTDs have to "allow too much").
- Only a primitive ID attribute mechanism is available (in other words, no uniqueness scope).
But XML Schema also has disadvantages:
- XML Schema is complicated; programmers that only need to use XML occasionally may find it unnecessarily difficult.
- XML Schema cannot require a specific root element (so extra information is required to validate even the simplest documents).
- When describing mixed content, the character data cannot be constrained in any way.
- Content and attribute declarations cannot depend on attributes or element context (this is also a central problem of DTDs).
- Defaults cannot be specified separate from the declarations.
- Element defaults can only be character data (not containing markup).
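To make the comparison above more concrete, the following sketch validates two small documents against an XML Schema using the javax.xml.validation API from the JDK. It is an illustration only: the class SchemaTypeCheck and the order/quantity vocabulary are invented for this example. The schema types quantity as xs:positiveInteger, a constraint on element content that a DTD cannot express, because a DTD can only declare such content as character data.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical example: a schema that constrains the data type of element content.
final class SchemaTypeCheck {

    static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "  <xs:element name='order'>"
          + "    <xs:complexType>"
          + "      <xs:sequence>"
          + "        <xs:element name='quantity' type='xs:positiveInteger'/>"
          + "      </xs:sequence>"
          + "    </xs:complexType>"
          + "  </xs:element>"
          + "</xs:schema>";

    /** Returns true when the document is valid against the schema above. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no custom error handler, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<order><quantity>3</quantity></order>"));
        System.out.println(isValid("<order><quantity>three</quantity></order>"));
    }
}
```

Note that the instance document does not name its schema; the caller supplies it, which relates to the point above that extra information is needed to validate even simple documents.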
XMLBuddy offers adequate support for both DTD and Schema, but if you need really good support for XML Schema, you should use the XSD-XML Infoset Browser for Java plug-in (see Resources for a link). It is a Java reference library that implements the XML Schema Infoset Model as described in the W3C XML Schema specifications. It is extremely useful for any code that examines, creates, or modifies XML Schemas. The XML Infoset Browser provides an API for manipulating the components of an XML Schema, as well as an API for manipulating the DOM-accessible representation of XML Schema as a series of XML documents. XML Infoset essentially allows two or more developers to use Java and XML together, thereby providing a standard way of recognizing and creating XML-based schemas.
Figure 5. XML Schema validation becomes available after installing IBM XML SQC

The necessary supplement for the XML Infoset Browser is the IBM XML Schema Quality Checker, or SQC (see also Figure 5 and Resources for a link). SQC is a Java program that takes as input an XML Schema written in the W3C XML Schema language and diagnoses improper uses of the Schema language. SQC reads Schemas conforming to the latest XML Schema specifications and attempts to determine if they are valid under the various constraints that apply to Schemas. When SQC encounters a non-conformant element, it gives diagnostic messages that may include a suggestion about how to fix the problem. For Schemas that are composed of numerous Schema documents connected via <include>, <import>, or <redefine> element information items, a full Schema-wide check is performed. SQC can also run in batch mode to check multiple XML Schemas in a single run.
Other helpful XML plug-ins are Transclipse and Eclipse Tidy (see Resources for links). Transclipse is a plug-in for XML transformation. It processes XML documents via XSLT with any JAXP-compliant XSL stylesheet processor and XSL-FO documents using the Apache Formatting Objects Processor (FOP). Transclipse is a part of the j2h or Java to HTML plug-in, which converts Java source code to HTML, XHTML, and LaTeX with syntax highlighting. The Eclipse Tidy project provides a plug-in for formatting and printing XML/HTML documents. Visit the categorized Eclipse plug-in registry for more information (see Resources).
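For readers unfamiliar with JAXP, the kind of XSLT processing mentioned above looks roughly like the following when done directly in Java. This is a minimal sketch using the XSLT processor bundled with the JDK, not Transclipse code; the class XsltSketch, the greeting document, and the stylesheet are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example: apply a small XSLT stylesheet through the JAXP transformation API.
final class XsltSketch {

    static final String XSL =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/greeting'>Hello, <xsl:value-of select='@to'/>!</xsl:template>"
          + "</xsl:stylesheet>";

    /** Transforms the given XML with the stylesheet above and returns the text result. */
    static String transform(String xml) {
        try {
            // TransformerFactory picks whichever JAXP-compliant processor is configured.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<greeting to='Eclipse'/>"));
    }
}
```

Because the code talks only to the javax.xml.transform interfaces, a different JAXP-compliant stylesheet processor can be swapped in without changing it.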
- Join the Eclipse Platform community and download the Platform at eclipse.org. The Eclipse Platform source code is licensed under the Common Public License. At eclipse.org, you'll also find a glossary of terms and descriptions of Eclipse projects, along with technical articles and newsgroups. The Eclipse Platform white paper details the major components and functions of Eclipse.
- Download and learn more about XMLBuddy at Bocaloco Software's XMLBuddy Web site.
- Download the Eclipse Platform Plug-in SDK at the eclipse project downloads page.
- For information on Apache XML projects (including the Xerces parser), see Apache.org.
- Download the XSD-XML Infoset Browser for Java at eclipse.org.
- Download the XML Schema Quality Checker at IBM's alphaWorks site.
- For an introductory article on the Eclipse Platform and how it works, see the developerWorks article "Working the Eclipse Platform" by Greg Adams and Marc Erickson.
- To get started with developing applications using the Eclipse Platform, see the developerWorks article "Getting started with the Eclipse Platform" by David Gallardo.
- If you're interested in creating your own Eclipse plug-ins, see the developerWorks article "Developing Eclipse plug-ins" by David Gallardo.
- For a look at how one developer integrated XM (a simple content management and publishing solution based on XML and XSLT) and Eclipse, see the developerWorks article "Integrating XM and Eclipse" by Benoit Marchal.
- See the wealth of XML resources available on the W3C consortium Web site.
- Download the Transclipse plug-in at SourceForge.net.
- Get the Eclipse Tidy Project, also at SourceForge.net.
- Learn about the j2h (Java to HTML) plug-in.
- Browse the Eclipse plug-in registry.
- Find more resources for Eclipse users and XML developers on developerWorks.
Pawel Leszek, a Studio B author, is an independent software consultant and author specializing in Linux/Win/Mac OS system architecture and administration. He has experience with many operating systems, programming languages, and network protocols, especially Lotus Domino and DB2. Pawel is also the author of a series of articles for LinuxWorld and a Linux columnist for the Polish edition of PC World. Pawel lives in Warsaw with his wife and sweet little daughter. Questions and comments are welcome; you can contact Pawel at pawel.leszek@ipgate.pl.



