Using the developerWorks XML validation tools

Optional tools for authors creating content for publication on developerWorks

If you can't find a validating XML editor you like, or prefer not to take the time now to learn how to use one, you can edit the XML for your developerWorks content using your preferred text editor. Ian Shields has created some great tools to help you validate, transform, and preview your content. This article shows you how easy it is to use those tools on Microsoft® Windows® or Linux®.

Ian Shields, Senior Programmer, IBM developerWorks

Ian Shields works on a multitude of Linux projects for the developerWorks Linux zone. He is a Senior Programmer at IBM in Research Triangle Park, NC. He joined IBM in Canberra, Australia, as a Systems Engineer in 1973, and has since worked on communications systems and pervasive computing in Montreal, Canada, and RTP, NC. He has several patents and has published several papers. His undergraduate degree is in pure mathematics and philosophy from the Australian National University. He has an M.S. and Ph.D. in computer science from North Carolina State University. Learn more about Ian in Ian's profile on developerWorks Community.



23 August 2013 (First published 28 July 2005)


Getting started

Creating content for publication on developerWorks involves these steps:

  1. Download the author package and unzip the file.
  2. Copy the XML template for articles.
  3. Edit the XML template to add your content, then validate that XML against the schema, and correct any errors.
  4. Preview your content in a browser to get an idea of how it will appear on developerWorks.

Start by reading and completing Step 1 and Step 2 in the article, "Authoring with the developerWorks XML templates." That article shows you how to download the author package that contains the tools and all files you'll need. It also includes tips for composing and submitting your content.

Then return to this article after you have completed Step 1 and Step 2. You will have downloaded the author package, unzipped the contents to your hard drive, and created a copy of the template using the new-article tools described in that article. You are now ready to do the remaining steps: edit, validate, and preview your content using the instructions below.


Using Microsoft Windows

You will need:

  • A text editor or word processor that can save in plain text format. Although not necessary, you will find it convenient to have an editor that can display line numbers, because any errors detected in the validation phase will be reported by line number. The Notepad application on Windows XP will display line numbers if you select the View > Status Bar menu option. Note that this option is not available in Notepad in earlier versions of Windows.
  • The latest version of Microsoft's XML parser, which at this writing is Microsoft Core XML Services (MSXML) 6.0. MSXML 4.0 Service Pack 2 will also work. See the Resources section of this article for a link to msxml6.msi, which installs MSXML 6.0 on your computer. You may also save msxml6.msi to your local disk for later installation.
  • Internet Explorer Version 6, 7, or 8, or another web browser such as Mozilla, Firefox, or Opera.

Step 1. Edit the XML file

Navigate to your new folder and edit your file (index.xml) using your favorite text editor. Notepad will suffice if you don't have another preferred editor. Follow the detailed comments in the index.xml template file. They will help you understand what you need to do.

Be sure to save your file as plain text if you are using a word processor. Similarly, if you cut and paste from a file with embedded formatting, such as a Microsoft Word file, either use your editor's capabilities to paste (or paste special) as text, or be sure to save your XML file as plain text. Do not change the file name from index.xml, and do not edit the HTML file (index.html) that you may generate using our tools; your developerWorks editor will work from the XML version. Save any images, such as photos or screen shots, in the directory you created for your article (my-article, in our example).
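If you have pasted text from a word processor, a quick scan for typographic punctuation can show where formatting may have crept in. The following Python sketch is not part of the author package; the function name and the character list are our own illustrative choices. Such characters are not necessarily errors, so treat the output as a list of places to check by eye.

```python
# Characters that word processors commonly substitute for plain ASCII
# punctuation. This list is illustrative (an assumption), not exhaustive.
SUSPECT = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
    "\u00a0": " ",   # no-break space
}


def find_suspect_characters(text):
    """Return (line, column, character) for each suspect character.

    Line and column numbers are 1-based, counted in characters.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in SUSPECT:
                hits.append((line_no, col_no, ch))
    return hits
```

To try it, read your index.xml as UTF-8 text and pass the contents to find_suspect_characters.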

Step 2. Validate the XML

When editing your content, you will need to validate the changes against our schema. If you are new to XML, we suggest validating your file as you go along. It will help you identify the errors more easily.

A tool for validating your XML was installed in your project directory when you created your new article. Click the dw-transform.vbs script in your article directory. Depending on your Windows settings, this may simply show as dw-transform. You should see a window like Figure 1.

Figure 1. Validating your article with the dw-transform script

Step 3. Correct validation problems

If your XML contains a coding error, the dw-transform script displays an error window like Figure 2 instead.

Figure 2. An invalid article (VB script checker)

In this case, we introduced a deliberate error by including <u>Underline error.</u>. Because links are underlined, we do not use underlined text for other purposes in developerWorks content, so the <u> and </u> tags are not permitted by the schema. The tool will identify the location of the first error and give a reason for it. The reasons are generated by the MSXML parser. Although they are somewhat cryptic, they will usually help you locate the problem. If the reason contains Expecting a, b, br, ... with a long list of other tag names, you've probably mistyped a tag name or attempted to use a tag that isn't supported by the developerWorks schema. An editor that displays line numbers will help you find errors quickly. See Figure 3, where we have marked the invalid text starting at the offending line and column number.
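To make the idea of an unsupported tag concrete, here is a minimal Python sketch that reports elements outside an allowed set, along with the line on which each one starts. It is not how dw-transform works: the real check is done by the parser against the developerWorks schema, which also governs where each tag may appear. The ALLOWED set, the class name, and the function name below are hypothetical and exist only for illustration.

```python
import xml.sax

# Hypothetical allow-list for illustration only. The real developerWorks
# schema defines many more elements and where each may be used.
ALLOWED = {"article", "p", "a", "b", "br"}


class TagChecker(xml.sax.ContentHandler):
    """Collect (tag name, line number) for elements not in the allowed set."""

    def __init__(self, allowed):
        super().__init__()
        self.allowed = allowed
        self.locator = None
        self.problems = []

    def setDocumentLocator(self, locator):
        # The parser hands us a locator that tracks the current position.
        self.locator = locator

    def startElement(self, name, attrs):
        if name not in self.allowed:
            self.problems.append((name, self.locator.getLineNumber()))


def find_disallowed(xml_text, allowed=ALLOWED):
    """Parse xml_text and return a list of (tag name, line number) pairs."""
    handler = TagChecker(allowed)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.problems
```

Run against a fragment that contains <u>Underline error.</u> on its third line, find_disallowed reports the u element at line 3, which is the same kind of location information the validation tool gives you.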

Figure 3. Locating an error in your XML with Notepad

After you have located and corrected your error, save the file and run dw-transform.vbs again to recheck it. Repeat this process until you have no more errors.

Step 4. Preview your content

When you have no errors, you are ready to see a preview of how your content will look on developerWorks. You should have a file called index.html in your directory. Open this file with your preferred browser to preview your content. If you are using Internet Explorer with Windows XP Service Pack 2, you may see a pop-up window and an information bar advising that Internet Explorer has restricted the file from showing active content. Click the information bar and select Allow blocked content... in order to preview your content.

Figure 4. Previewing your article

Note: Some of the stylesheets and some dynamic elements are included by the server, so the formatting in your preview won't appear exactly as it will when generated by the developerWorks staff and published on the server.

Next steps

Congratulations! You've edited, validated, and previewed your article. Now, return to "Authoring with the developerWorks XML templates" for tips on finishing and submitting your content to your developerWorks editor.


Using Linux or another operating system

You will need:

  • A text editor or word processor that can save in plain text format. Although not necessary, you will find it convenient to have an editor that can display line numbers, because any errors detected in the validation phase will be reported by line number. Many Linux and UNIX® editors (including vi and emacs) will display line numbers, either always or as a user option. If your editor can also display column numbers, that will help you even more.
  • An IBM Developer Kit for Java Version 5.0 or later. If you install from the tarball rather than the RPM, unpack in /opt/ibm. If you do not have root access to your computer, you can unpack the tarball version into a subfolder of the developerworks folder that contains this package. The IBM Runtime Environment for Java 2 (JRE) includes the necessary Xalan and Xerces functions and can be installed alongside other Java implementations. See the Resources section of this article for download links. If you use another Java runtime or developer kit, you may also need the Apache Xalan Version 2.7 or later package (which includes the required Xerces functions) if it is not included with your Java version. You may also need to modify the developerworks/tools/dwxmlxslt.sh script or set a CLASSPATH environment variable.
  • The appropriate zenity, gdialog, or kdialog package for your GNOME or KDE desktop if you are using a graphical environment, or the dialog package if you are using a non-graphical environment.
  • A graphical browser such as Mozilla, Firefox, or Opera.

Step 1. Edit your XML file

Navigate to your new folder and edit your XML file using your favorite text editor. Follow the detailed comments in the index.xml file. Be sure to save your file as plain text if you are using a word processor. Do not change the file name from index.xml. Save any images, such as photos or screen shots, in the directory you created for your article (my-article, in our example).

Step 2. Validate your XML

When editing your content, you will need to validate the changes against our schema. If you are new to XML, we suggest validating your file as you go along. It will help you identify the errors more easily.

The tool for validating your content was installed in your directory when you created your new content. Run the dw-transform.sh script in your directory. If you are running the KDE or GNOME desktops, you may run this from a graphical manager, such as Nautilus or Konqueror; otherwise, you should run the script in a terminal window.

The first time you run the validation script, it searches for a suitable Java version. This may take a few moments. After a suitable Java executable is found, its path is saved in the tools directory in a file called dwjava.txt. This path is checked first in the future to improve speed. If you remove this file, or if the path in the file is no longer valid, a new search will be performed. If you install the IBM developer kit in a location other than in the /opt, /usr, or developerworks (author package) tree, you can manually edit this file to tell the scripts where to find the Java executable. If you are using version 6 of the IBM developer kit on a 64-bit system and it is installed under /opt, your developerworks/tools/dwjava.txt will look like Listing 1.

Listing 1. Typical dwjava.txt
/opt/ibm/java-x86_64-60/jre/bin/java
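The caching behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the logic of the actual shell scripts: the function name and parameters are hypothetical, and the sketch accepts any executable file named java, whereas the real script also checks that the Java version is suitable.

```python
import os


def find_java(cache_file, search_roots):
    """Return the path to a java executable, or None if none is found.

    A cached path is tried first; a fresh search runs only when the cache
    file is missing or the path it holds is no longer valid.
    """
    # 1. Try the cached path first to avoid a slow search.
    if os.path.isfile(cache_file):
        with open(cache_file, encoding="utf-8") as f:
            cached = f.read().strip()
        if cached and os.path.isfile(cached) and os.access(cached, os.X_OK):
            return cached

    # 2. Otherwise walk each search root looking for an executable "java".
    #    (Assumption: no version check is made in this sketch.)
    for root in search_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            if "java" in filenames:
                candidate = os.path.join(dirpath, "java")
                if os.access(candidate, os.X_OK):
                    # 3. Save the result so the next run can skip the search.
                    with open(cache_file, "w", encoding="utf-8") as f:
                        f.write(candidate + "\n")
                    return candidate
    return None
```

Calling find_java("dwjava.txt", ["/opt", "/usr"]) would mirror the search locations named in the text. Editing the cache file by hand has the same effect as in the real tools: the stored path is used as long as it points to an executable file.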

When the script completes, you should see a message box like Figure 5 if all is well.

Figure 5. Validating with the dw-transform shell script

Step 3. Correct validation problems

If you happened to make an XML coding error, your message box will show an error similar to Figure 6.

Figure 6. An invalid article (shell script checker)

In this case, we introduced a deliberate error by including <u>Underline error.</u>. Because links are underlined, we do not use underlined text for other purposes in developerWorks content, so the <u> and </u> tags are not permitted by the schema. The tool will identify the location of the first error and give a reason for it. The reasons are generated by the Java parser. Although they are somewhat cryptic, they will usually help you locate the problem. If the reason contains One of '{"" with a long list of other tag names, you've probably mistyped a tag name or attempted to use a tag that isn't supported by the developerWorks schema (as in our example here). An editor that displays line numbers will help you find errors quickly. See Figure 7, where we have marked the invalid text starting at the offending line and column number.

Figure 7. Locating an error in your content with the gedit editor

After you have located and corrected your error, save the file and rerun the dw-transform.sh script to recheck your file. Repeat this process until you have no more errors.
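Between runs of the validation script, you may find it handy to check that your file is at least well-formed XML. The Python sketch below, which uses the standard xml.etree.ElementTree module, is our own illustration and not part of the author package; the function name is hypothetical. It checks well-formedness only. It knows nothing about the developerWorks schema, so it would not catch a disallowed tag such as <u>, and it does not replace dw-transform.sh.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses, else (line, column, message).

    Both line and column in the result are 1-based.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # line is 1-based, column is 0-based
        return (line, column + 1, str(err))
    return None
```

For example, a paragraph opened with <p> but closed with </b> on line 3 of the input produces a result whose first item is 3 and whose message mentions a mismatched tag.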

Notes:

  • Some errors, such as a legitimate opening tag without a matching closing tag, may result in an error without a line number. Validate often.
  • If you use tabs to indent your content, the column number shown in your editor may not match the column number reported in an error message.
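The tab mismatch in the second note is easy to see with a small calculation. The sketch below assumes that the parser counts each tab as a single character while your editor expands tabs to the next tab stop; both the assumption and the function name are ours, and your editor's tab width may differ from the default of 8 used here.

```python
def display_column(line, char_column, tab_width=8):
    """Convert a 1-based character column to a 1-based on-screen column.

    Assumes tabs advance the display to the next multiple of tab_width.
    """
    col = 0  # 0-based display position reached so far
    for ch in line[:char_column - 1]:
        if ch == "\t":
            # Jump to the next tab stop.
            col += tab_width - (col % tab_width)
        else:
            col += 1
    return col + 1
```

With a line indented by two tabs, an error reported at character column 3 appears in display column 17 in an editor with 8-column tab stops, or display column 9 with 4-column tab stops.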

Step 4. Preview your content

When you have no more errors, you are ready to see a preview of how your content will look on developerWorks. You should have a file called index.html in your directory. Open this file with your preferred browser to preview your content.

Figure 8. Previewing your content

Note: Some of the stylesheets and some dynamic elements are included by the server, so the formatting in your preview won't appear exactly as it will when generated by the developerWorks staff and published on the server.

Next steps

Congratulations! You've edited, validated, and previewed your content. Now, return to "Authoring with the developerWorks XML templates" for tips on finishing and submitting your content to your developerWorks editor.

Resources

Get products and technologies

  • Microsoft Core XML Services (MSXML) 6.0: To use the dw-transform.vbs script to transform your content, you need the latest version of the MSXML parser. This is currently version 6. The file you need is msxml6.msi.
  • IBM Developer Kit for Java, Version 6: To use the dw-transform.sh script on Linux to transform your content, you need the IBM Developer Kit for Java, Version 5.0 or later.
  • Apache Xalan: If you are using the Linux tools (dw-transform.sh) and not using the IBM Developer Kit for Java, you may need Apache Xalan.
