If you've ever thought about writing documentation for an open source project, or if you've browsed the Linux Documentation Project or any page dedicated to documentation for Linux or other open source projects, you've probably heard of DocBook. But you may not be quite sure what it is.
DocBook is a markup language defined by an SGML or XML document type definition (DTD). Basically, DocBook is a set of tags that describe a document's structure. DocBook tags are similar to HTML tags, so if you've done any HTML, DocBook won't be entirely foreign to you. DocBook is a bit more involved than HTML, but it is also much more useful than plain HTML because it lets you render multiple output formats from a single source document. This article will describe how to create a simple article (document) in DocBook and how to use SGML-tools Lite to render several file formats from that document.
DocBook has been around, in one form or another, since 1991. Originally DocBook was created to help exchange UNIX documentation. Since then DocBook has gone through four major versions and is now under the guidance of the Organization for the Advancement of Structured Information Standards, better known as OASIS (see Resources).
Occasionally, the term DocBook is also used as a catchall term to describe the markup language and the tools used to convert DocBook documents into other formats. Technically, DocBook is only the DTD, but without SGML-tools Lite and other conversion tools DocBook isn't quite as useful.
The main selling point for DocBook is its portability. A document written in DocBook markup can be converted into HTML, PostScript, PDF, RTF, DVI, and plain ASCII text easily and quickly without any expensive tools. In fact, DocBook and all of the tools used to work with DocBook are freely available under open source licenses. DocBook documents are plain text, and can be edited with any text editor or word processor that can save documents as plain ASCII text. Note that if you use a word processor, take extra care to save DocBook documents as plain text; otherwise they will not parse correctly. If you want to use your documentation in more than one format, like print and online, you'll find DocBook is a great solution.
Another advantage of DocBook is that it frees the author from worrying about the formatting and layout of a document. DocBook is only concerned with the structure of a document. For instance, an author simply wraps text that should be emphasized in the <emphasis> element. Depending on what format the document is converted to, the emphasized portion may be italicized, underlined, or set in boldface type. This is one less thing for the author to worry about while writing a document.
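As a small illustration of this structure-only approach, the fragment below marks one word as emphasized inside a paragraph. The sentence itself is invented for the example; how the word actually looks in the output depends on the stylesheet used for each target format.

```sgml
<!-- The markup says only that the word is emphasized, not how it should look -->
<para>
  Save your work <emphasis>before</emphasis> running the conversion tools.
</para>
```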
Creating a document with DocBook
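Creating a document with DocBook is easy. We'll focus on creating a document using the SGML DTD. With the exception of the document declaration, nearly everything in this article should apply to the XML DTD as well, although XML is stricter about syntax: element names must be lowercase, and empty elements such as <imagedata> must be closed explicitly.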
To begin, fire up your favorite text editor and create a new document. The first line of a DocBook document is the document declaration.
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.1//EN">
Every DocBook document requires a document declaration to be considered valid. This lets SGML-tools Lite, or whatever tool you're using, know what version of the DTD you are using and the type of document that you are creating.
Now we'll start adding a little meat to the document. We'll start with a title, author information, and a short paragraph. This brief example shows a few of the basic DocBook tags, or elements, in use. While DocBook elements may look similar to HTML tags, remember that DocBook parsers are much more demanding than your average Web browser. While you can get away with not declaring an HTML document, or even skipping some "required" tags, DocBook is not quite so forgiving. Be careful to include all required elements and use them in their proper order.
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.1//EN">
<article>
  <articleinfo>
    <title>A gentle guide to DocBook</title>
    <author>
      <firstname>Joe</firstname>
      <surname>Brockmeier</surname>
    </author>
  </articleinfo>
  <sect1 label="1.0">
    <title>A brief introduction to DocBook</title>
    <para>
      If you've ever thought about writing documentation for an open source project...
    </para>
  </sect1>
</article>
Most of the elements used here are self-explanatory. Some of the elements, such as the <firstname> and <surname> elements, are only valid when nested inside their parent elements. The <firstname> and <surname> elements, for example, are valid nested within their parent element <author> but would not be valid if used within the <para> element.
There are five levels of the sect element: <sect1> through <sect5>. Unlike HTML, where you can skip from an <h1> tag to an <h3> tag, in DocBook you cannot skip from a <sect1> to a <sect3> element. DocBook is much stricter than HTML.
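A minimal sketch of what this nesting rule looks like in practice: each deeper section level sits inside the level directly above it. The titles and paragraph text here are placeholders invented for the example.

```sgml
<sect1>
  <title>Installing the tools</title>
  <para>Top-level section text.</para>
  <!-- A sect2 may only appear inside a sect1 -->
  <sect2>
    <title>Installing from RPMs</title>
    <para>Second-level section text.</para>
    <!-- A sect3 may only appear inside a sect2, never directly inside a sect1 -->
    <sect3>
      <title>Troubleshooting</title>
      <para>Third-level section text.</para>
    </sect3>
  </sect2>
</sect1>
```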
The next element is the <para> element. The <para> element is easy to remember because it stands for paragraph. You will probably find that the majority of DocBook elements make sense, and you probably won't need to look up the common elements after writing one or two documents with DocBook.
Some elements in DocBook can also include attributes that further describe the element. The <sect1> element in the above example includes a label attribute. Generally, attributes are optional. However, some elements require one; the <ulink> element, for example, requires a url attribute. If you're unsure, check the official DocBook documentation to see what attributes are applicable to the elements you are using (see Resources).
To include an image in your document, you can use any of several elements. The <graphic> element is being phased out in favor of <mediaobject> and its child <imageobject>, so we will focus on that approach. It uses three elements: the <mediaobject> element, the <imageobject> element, and the <imagedata> element.
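For instance, a link might be written as in the sketch below. The url attribute carries the link target; the surrounding sentence is made up for illustration, and the address is simply the DocBook.org site listed in Resources.

```sgml
<!-- The url attribute is required on ulink; a parser rejects the element without it -->
<para>
  The reference documentation is available at
  <ulink url="http://www.docbook.org/">DocBook.org</ulink>.
</para>
```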
<mediaobject>
  <imageobject>
    <imagedata fileref="images.d/fuzzy.eps" format="eps">
  </imageobject>
</mediaobject>
This code includes the fuzzy.eps image in your document when you render it using SGML-tools Lite or one of the other DocBook conversion tools. Note that EPS is a good format for printing documents or converting to PDF, but you would want to use a JPEG, GIF, or PNG file for Web publication. The DocBook rendering tools do not convert image file formats, so you'll need to supply each image in a format suited to the type of output you want to produce.
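One way to prepare a single source file for both print and Web output is to list more than one <imageobject> inside the same <mediaobject>, giving the same picture in different formats. This is only a sketch: fuzzy.gif is a hypothetical companion file, and whether a given tool picks the right alternative for each output format depends on the stylesheets it uses, so test it with your own setup.

```sgml
<mediaobject>
  <!-- EPS alternative, intended for PostScript and other print output -->
  <imageobject>
    <imagedata fileref="images.d/fuzzy.eps" format="eps">
  </imageobject>
  <!-- GIF alternative, intended for HTML output; fuzzy.gif is a hypothetical file -->
  <imageobject>
    <imagedata fileref="images.d/fuzzy.gif" format="gif">
  </imageobject>
</mediaobject>
```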
Often when writing documentation it is useful to quote source code within the document. The <programlisting> element is used to display source code as is. This is similar to the <pre> tag in HTML; however, the <programlisting> element can include other DocBook elements, which will be interpreted.
<para>
<programlisting>
function F_pollList() {
global $db,$G_URL;
$sql = "SELECT *,DATE_FORMAT(Birthstamp,'%c/%e/%y @ %h:%i %p') ";
$sql .= "AS Date FROM T_PollQuestions";
$question = @mysql_query($sql,$db);
$nquestion = mysql_num_rows($question);
F_start("List of Polls");
if ($nquestion > 0) {
} else {
print "There are no polls available at this time.";
}
F_end();
}
</programlisting>
</para>
Note that the above code uses the > character, which, like <, would cause problems if placed unescaped in an HTML document. SGML-tools Lite converts < and > to their appropriate escape codes when converting to HTML.
One thing that you'll probably want to do as well is make lists -- either bulleted lists or numbered lists -- using the <itemizedlist> and <orderedlist> elements:
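Because markup inside <programlisting> is still interpreted, a literal < or & in quoted code can be mistaken for the start of a tag or an entity. A minimal sketch of two common ways to guard against that, using standard SGML and XML syntax: entity references, or a CDATA marked section. The line of code being quoted is invented for the example.

```sgml
<!-- Option 1: replace the special characters with entity references -->
<programlisting>
if ($a &lt; $b &amp;&amp; $b &gt; 0) { print "ok"; }
</programlisting>

<!-- Option 2: wrap the listing in a CDATA marked section, so nothing inside it is parsed as markup -->
<programlisting><![CDATA[
if ($a < $b && $b > 0) { print "ok"; }
]]></programlisting>
```

The second form is easier to paste code into, but it also means no DocBook elements can be used inside the listing.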
<para>
  This is how you create a non-numbered list.
</para>
<itemizedlist>
  <listitem>
    <para>This is a list entry</para>
  </listitem>
  <listitem>
    <para>This is another list entry</para>
  </listitem>
</itemizedlist>
<para>
  This is how you create a numbered list.
</para>
<orderedlist>
  <listitem>
    <para>This is a list entry</para>
  </listitem>
  <listitem>
    <para>This is another list entry</para>
  </listitem>
</orderedlist>
The <listitem> element can be used inside either an <orderedlist> or an <itemizedlist> element. It cannot be used by itself, however.
When I started learning DocBook I found it was easier to create a basic template rather than trying to remember all of the necessary elements off the top of my head. This is a basic article template, which may help you get started on your first DocBook documents. You should be able to simply cut and paste this from your browser into your favorite text editor and get started. You should check to be sure that the version of DocBook you are using is the same as the version in the declaration in the template. If not, be sure to change it to the proper version.
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.1//EN">
<article>
  <articleinfo>
    <title>Sample DocBook Template</title>
    <author>
      <firstname>firstname here</firstname>
      <surname>lastname here</surname>
    </author>
  </articleinfo>
  <sect1>
    <title>Section 1 Title</title>
    <para>
      This is a paragraph.
    </para>
    <sect2>
      <title>Section 2 Title</title>
      <para>
        This is another paragraph...
      </para>
    </sect2>
    <sect2>
      <title>Section 2 pt. 2</title>
      <para>
        Yet another paragraph...
      </para>
    </sect2>
  </sect1>
</article>
Converting to other file formats
Often authors will use DocBook and never actually need to render the documents into other formats themselves. However, if you do need to, you can convert a finished DocBook file into several other types of files using SGML-tools Lite. If you're using a Linux distribution, you may already have SGML-tools Lite or SGML-tools installed. Check whether they're installed and working, or whether they are on your installation CDs.
If you get errors when trying to parse your DocBook files, check the syntax of your document. Often something as simple as a forgotten "/" or misplaced element is the only problem.
If you don't have them installed and you want to be able to render documents yourself, go to the SGML-tools Lite home page hosted on SourceForge (see Resources), download the latest version as either source or RPMs, and follow the installation instructions there.
To export a DocBook document named filename.sgml to HTML, simply type the following:
sgmltools -b html filename.sgml
If the document has no major errors, SGML-tools Lite will produce a "filename" directory with the resulting HTML files inside.
DocBook also supports plain-text documents. To render a plain ASCII-text document, use the following command:
sgmltools -b txt filename.sgml
This is the same as the command used to convert to HTML, except we've replaced "html" with "txt". The -b argument tells SGML-tools to use "txt" as the "backend". Currently there are several backends that are available: html, txt, rtf, ps, and dvi.
DocBook is capable of producing output suitable for commercial printing through PostScript. When converting to PostScript the command syntax is the same as for text or HTML:
sgmltools -b ps filename.sgml
However, it is worth noting that if you are including images in the document, you will need to have them saved in EPS format to have them included in the PostScript document. SGML-tools Lite will not convert GIF or JPEG to PostScript for inclusion in the final document.
This has just been a brief overview of using DocBook. It is by no means an exhaustive look at all of the elements or potential that DocBook has. Hopefully, however, this article will suffice to get you started learning more about DocBook. After following along with this article you should be able to create basic DocBook documents and use SGML-tools Lite to produce usable output from DocBook files. For more information on DocBook, you can consult the online documentation at DocBook.org (see Resources).
If you would like to tinker a bit more with DocBook, a good place to start might be the Linux Documentation Project. Most of the documents in the LDP have a DocBook version available online that you could examine for more detailed usage of DocBook.
Resources
- Visit DocBook.org, the main DocBook site.
- The Linux Documentation Project contains many documents written in DocBook. The LDP Author Guide has some tips on getting started with DocBook.
- Download either the source or RPMs at SGML-tools Lite and follow the instructions to install them. This site has the tools you need to convert DocBook documents to HTML, PDF, PostScript, RTF, or plain text.
- At the OASIS DocBook Pages, you'll find the DocBook Technical Committee home page.
- For help getting started with any SGML DTD, see the W3C Overview of SGML Resources.
- For more details, see the General SGML/XML Applications, OASIS' guide to SGML/XML apps.
Joe "Zonker" Brockmeier is a contributing editor for Linux Magazine and has written Install, Configure and Customize Slackware Linux for Prima Publishing. Joe welcomes your questions, comments, or ideas for future articles on DocBook and can be reached at jbrockmeier@earthlink.net.