Level: Intermediate
Daniel Robbins (drobbins@gentoo.org), President/CEO, Gentoo Technologies, Inc.
01 May 2001
Have you ever woken up in the morning to the realization that your personal development Web site isn't really that great? If so, you're in good company. In this series, Daniel Robbins shares his experiences as he redesigns the www.gentoo.org Web site using technologies like XML, XSLT, and Python. Along the way, you may find some excellent approaches to use in your next Web site redesign. In this, the second installment, Daniel shows off the new documentation system and sets up a daily CVS-log mailing list.
If you've read the first installment of my series on the gentoo.org redesign, then
you know that I'm the Chief Architect of Gentoo Linux, making me responsible for
the Gentoo Linux Web site. And right now, the site leaves a lot to be
desired. Yes, it does look somewhat attractive, but when you look beyond the cute
graphics you will see that it really doesn't serve the needs of its primary target audience:
Gentoo Linux developers, users, and potential users.
Last time, I used a user-centric design approach to
create a set of priorities for the site, and then used these priorities to
create an action plan for revamping gentoo.org. Two things were at the
top of the priority list: new developer documentation and a new mailing list
to notify developers of changes made to our CVS repository. While
adding the new CVS mailing list was relatively easy (though, as you will see, it was
more difficult than I thought), the new developer documentation required
a lot of planning and work.
Not only did I need to create some actual documentation (a task that I had
been ignoring for too long), but I also had to choose an official XML
syntax that our new master documentation would use. You see, until a few
weeks ago, I was creating the documentation in raw HTML. This was
definitely a naughty thing to do, because it mixed
content (the actual information) with presentation (the display-related HTML
tags). And what did I end up with? An inflexible mess, that's what. It
was hard to edit the actual documentation and extremely difficult to make
site-wide HTML improvements.
In this article, I'll proudly demonstrate the site's new flexible XML documentation
solution. But first, I'll recap my experiences in adding the CVS log mailing
list to our site.
Adding the CVS log mailing list
The goal of the CVS log mailing list is to inform developers of new commits
made to our CVS repository. Since I already had the mailman mailing list manager
(see Resources) installed, I
thought that creating this new list would be easy. First, I would simply
create the mailing list, then add the proper "hook" to the CVS repository so
that e-mails would be automatically generated and sent out, describing the changes to
our sources as they happened.
I started by researching a special file in my repository's CVSROOT called
"loginfo". Theoretically, by modifying this file, I could instruct CVS to
execute a script when any commit (and thus, modification) was made to the
repository. So I created a special loginfo script and plugged it into my
existing repository. And it did indeed send out e-mails to the new "gentoo-cvs"
mailing list whenever modifications were made to our sources.
Unfortunately, this solution wasn't all I'd hoped it would be. First of all, it generated lots of e-mail messages -- one for each modified file -- and secondly, the messages
were cryptic and sometimes even empty! I quickly removed my loginfo
script and put the gentoo-cvs mailing list project on hold. It was clear that
CVS's loginfo hook wasn't appropriate for my needs, and I had a hard time tracking
down any loginfo-related documentation that could help me solve my problem.
cvs2cl.pl
Several weeks later I started looking for an alternative to loginfo. This
time I did the smart thing and headed over to freshmeat.net. There I quickly found just what I
was looking for: the incredibly wonderful cvs2cl.pl Perl script available from
red-bean.com (see Resources). Instead of
using the loginfo hook, cvs2cl.pl uses the "cvs log" command to connect directly to the
repository and extract the relevant log information. Also, rather
than spitting out relatively cryptic CVS log messages, it does a great job of
reformatting everything into a readable ChangeLog format (see Listing 1):
Listing 1: Output generated by cvs2cl.pl
2001-04-09 20:58 drobbins
* app-doc/gentoo-web/files/xml/dev.xml: new fixes
2001-04-09 20:47 drobbins
* app-doc/gentoo-web/: gentoo-web-1.0.ebuild,
files/pyhtml/index.pyhtml, files/xml/gentoo-howto.xml: new
gentoo-howto fixes
2001-04-09 20:03 drobbins
* app-doc/gentoo-web/files/xml/dev.xml: typo fix
2001-04-09 20:02 drobbins
* app-doc/gentoo-web/files/pyhtml/index.pyhtml: little update
cvs2cl.pl can also be instructed to generate output in XML format, and in my
next article I'll take advantage of this by incorporating an up-to-date ChangeLog
into the new developer section of our site.
The cvslog.sh script
Here's the script I now use to generate the daily ChangeLog e-mails.
First, it changes the current working directory to the location of my
checked-out CVS repository and updates it. Then, it creates $yesterday and
$today shell variables that contain the appropriate dates
in RFC 822 format. Notice that both date variables have the time set to
"00:00", or midnight. These
variables are, in turn, used to create a $cvsdate variable that
is then passed to cvs2cl.pl to specify the date range that I'm interested in --
the span of time from yesterday at midnight to today at midnight. Thus, the
$cvsdate variable contains a datespec that tells cvs2cl.pl to
log only the changes made yesterday.
The script also creates a $nicedate variable (used in the mail subject
line) and uses the mutt mailer (in mailx
compatibility mode [see Resources]) to send the e-mail to the gentoo-cvs mailing list (see Listing 2):
Listing 2: cvslog.sh
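#!/bin/bash
# Refresh the checked-out copy of the repository.
cd /usr/portage || exit 1
cvs -q update -dP
# Yesterday and today at midnight, in RFC 822 format.
yesterday=$(date -d "1 day ago 00:00" -R)
today=$(date -d "00:00" -R)
# Date range: from yesterday at midnight up to today at midnight.
cvsdate=-d\'${yesterday}\<${today}\'
# Human-readable date for the mail subject line.
nicedate=$(date -d yesterday +"%d %b %Y %Z (%z)")
/home/drobbins/gentoo/cvs2cl.pl -f /home/drobbins/gentoo/cvslog.txt -l "${cvsdate}"
# Mail the ChangeLog to the list, using mutt in mailx compatibility mode.
mutt -x gentoo-cvs -s "cvs log for $nicedate" <\
/home/drobbins/gentoo/cvslog.txt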
Using cron, I run this script every night at midnight. Thanks to cvs2cl.pl,
my developers now get accurate and readable daily CVS updates.
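The escaped quotes in $cvsdate can be hard to read, so here is a minimal sketch that builds the same kind of datespec and prints it for inspection. The make_datespec helper is a hypothetical name introduced only for this illustration (it is not part of cvslog.sh), and the sketch assumes GNU date, as Listing 2 does.

```shell
#!/bin/bash
# make_datespec is a hypothetical helper, not part of cvslog.sh.
# It wraps two date strings in the -d'START<END' form that Listing 2
# hands to cvs2cl.pl through the -l option.
make_datespec() {
    printf "%s\n" "-d'${1}<${2}'"
}

# Assumes GNU date: midnight yesterday and midnight today, as in Listing 2.
yesterday=$(date -d "1 day ago 00:00" -R)
today=$(date -d "00:00" -R)
make_datespec "$yesterday" "$today"

# With fixed (illustrative) dates, the result is easy to check by eye:
make_datespec "Mon, 09 Apr 2001 00:00:00 -0600" "Tue, 10 Apr 2001 00:00:00 -0600"
# prints -d'Mon, 09 Apr 2001 00:00:00 -0600<Tue, 10 Apr 2001 00:00:00 -0600'
```

For the nightly run, a crontab entry along the lines of `0 0 * * * /home/drobbins/gentoo/cvslog.sh` would do the job; the location of the script in that entry is an assumption, since only the paths inside Listing 2 are given.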
The documentation project
Now, for the Gentoo Linux documentation project. Our new documentation system
involves two groups of people or target audiences: the documentation
creators and the documentation readers. The creators need a
well-designed XML syntax that doesn't get in their way; the readers, who
couldn't care less about the XML, want
generated HTML documentation that is both functional and attractive. The
implementation challenge is to put together a complete system that addresses
the needs of both audiences. Oh, and I suppose there is a third "audience" --
me, the webmaster and the person designing the new system. Since I'm going to
be interacting with the new doc system whenever the site is upgraded, I need it
to be reliable and flexible.
The Web-ready HTML
First, let's talk a bit about the Web-ready HTML that'll be generated from my
master XML files. To make great, readable documentation, I'll need to have
support for the proper XML tags. For example, the ability to insert notes,
important messages, and warnings into the body of the document (and have them
prominently displayed in the resultant HTML) is a must. Also, I must be
able to insert blocks of code, and it would be great if actual user input could
somehow be offset from program output. I could even add tags that highlight the
source code comments in an alternate color so that the code blocks are
more readable.
The documents should have a table of contents (with hyperlinks to the
appropriate chapters), a synopsis, a revision date, version, and an authors
list at the top of the document. And, of course, every document should have a
header at the extreme top of the page containing a small Gentoo Linux logo.
Clicking on this logo should bring you back to the main Gentoo Linux page.
Last but not least, every document should have a footer that contains copyright
information, along with a contact e-mail address.
The spiffy new logo
This was a hefty list of requirements, and I decided to focus on the most
entertaining part first, the new Gentoo Linux logo that would appear in the
upper-left corner of every Gentoo Linux document. I used the "g" from the
"gentoo" graphic (created using the excellent and free Blender 3D program) on our main page as the
basis for the new smaller logo. I tweaked the extrusion settings a bit and
then added a chrome environment map. Finally, I positioned the lights and
camera just so, and the new logo was complete. I then imported it into Xara X (see Resources) and added some text, with this result:
The new Gentoo Linux logo
I used this new logo as inspiration for the rest of the HTML
color scheme, using a purplish theme throughout. I made heavy use of
cascading style sheets (CSS) to control font attributes and spacing.
Once I had a decent HTML prototype
in place, I started focusing on the guts of the new documentation -- the new
XML syntax. I wanted the syntax to be as simple as possible, so I created just
enough XML tags to allow for the proper organization of the document, but no
more. Then I started working on the XSLT to transform the XML into the target
HTML.
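To give a feel for the XML-to-HTML step, here is a rough sketch of a tiny guide-style document and a matching stylesheet. It is not the actual guide.xsl: the element names (guide, title, chapter, p, note) are hypothetical stand-ins for the real syntax, and xsltproc is an assumed XSLT processor chosen only for illustration. The script writes both files to a temporary directory and runs the transform only if xsltproc is installed.

```shell
#!/bin/bash
# Sketch only: element names are hypothetical, and xsltproc is an assumed
# processor, not necessarily the one used to build the gentoo.org docs.
workdir=$(mktemp -d)

# A tiny source document: content only, no presentation markup.
cat > "$workdir/sample.xml" <<'EOF'
<?xml version="1.0"?>
<guide>
  <title>Sample Guide</title>
  <chapter>
    <title>Getting started</title>
    <p>Content lives here, free of presentation markup.</p>
    <note>Notes are marked up by meaning, not by appearance.</note>
  </chapter>
</guide>
EOF

# A minimal stylesheet: one template per element.
cat > "$workdir/sample.xsl" <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/guide">
    <html>
      <head><title><xsl:value-of select="title"/></title></head>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="chapter"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="chapter">
    <h2><xsl:value-of select="title"/></h2>
    <xsl:apply-templates select="p|note"/>
  </xsl:template>
  <xsl:template match="p">
    <p><xsl:value-of select="."/></p>
  </xsl:template>
  <xsl:template match="note">
    <p class="note"><b>Note: </b><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>
EOF

# Run the transform if the assumed processor is available.
if command -v xsltproc >/dev/null 2>&1; then
    xsltproc "$workdir/sample.xsl" "$workdir/sample.xml" > "$workdir/sample.html"
    cat "$workdir/sample.html"
else
    echo "xsltproc not found; sample files written to $workdir"
fi
```

Because each element gets its own template, a site-wide change to how notes look means editing one template (or one CSS class) rather than every document.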
The result!
After much tweaking and a good amount of feedback from one of my developers,
the new documentation system reached the point where it was ready for use.
I immediately began work on our first new development guide, "The Gentoo Linux
Documentation Guide" (xml-guide.html), which contains a complete description of the
new XML format. Not only
did this allow other developers to begin work on the new-style documentation,
but it also served as an excellent example of the new documentation system in
action. Be sure to read this guide to get a complete understanding of our new
XML syntax. Download and check out the other components of our new documentation engine:
- The gentoo-doc.css stylesheet, which specifies font,
color, and style information for CSS-enabled browsers
- The guide.xsl stylesheet, which provides the XSLT transforms -- the rules used to convert
guide XML to Web-ready HTML
- The master XML version (xml-guide.xml) of the documentation guide
DocBook vs. Guide
If you're working on your own documentation solution, you may also want to
consider the DocBook XML and SGML formats (see Resources).
DocBook is well-suited for large-scale technical documentation and book
projects, is very flexible, and has many (maybe too many) features. In
addition, there are a number of existing packages that can be used to convert
DocBook XML/SGML to man pages, Texinfo files, PostScript, PDF, and, of course,
HTML formats.
I didn't choose DocBook because a lightweight XML syntax worked best for
Gentoo's needs. Right now, our XML guide syntax has around 20 tags and about
10 attributes. The limited tagset makes guide XML easy to transform into
other formats such as HTML, and also ensures a certain level of consistency
throughout our entire documentation set, since the format is so simple. Because
I have my own XML format, I'll be able to extend the format with new tags as
needed. I like having that level of control. I view XML as a technology that
should be used by people to structure their data in ways that they find
most helpful. In other words, the ability to define our own elements and
attributes is a precious thing, and we should take full advantage of it. After all,
it's the defining feature of XML.
Of course, creating your own XML syntax is not always the best solution,
especially when data interchange is important to you. Amid all the XML
hype, one thing that is often overlooked is that conversion to and from
different XML formats can be extremely difficult. In many cases, the two
formats won't be 100% compatible, and you'll have the unpleasant choice of
throwing away data and/or metadata, intentionally avoiding use of
certain elements or attributes, or creating a "super-format" that will
accommodate the data and metadata from both XML formats. In the documentation
world, DocBook is a pretty good choice as a "super-format" because
it's so flexible; it can easily accommodate documentation imported from a
variety of sources.
However, DocBook's richness and flexibility can also create problems. For
example, there may be hundreds of tags that you may never need, and supporting
all these tags in your XSLT can make conversion to other formats more
difficult. So, while DocBook is a great container for documentation converted
from other formats, your own minimal XML syntax will almost always be
easier to convert to other formats.
The most important thing is to carefully evaluate any potential solution while
keeping the needs of your target audience(s) in mind.
Wrapping it up
With the new doc system in place, I converted all our docs to the new format
and posted the new docs on our existing site. In addition, I created a link
to the gentoo-cvs mailing list subscription page. The key point here is that I
integrated these features into the existing site so that users could benefit
from the improvements right away.
Download
| Description | Name | Size | Download method |
|---|---|---|---|
| gentoo.css, guide.xsl, xml-guide.html and xml | us-gent.zip | 8 KB | HTTP |
Resources
- Read the other articles in this developerWorks series about the redesign of the www.gentoo.org Web site using technologies like XML, XSLT, and Python:
  - In Part 1, the author creates a user-centric action plan and introduces pytext, an embedded Python interpreter (March 2001).
  - In Part 3, he creates a new look for the site (July 2001).
  - In Part 4, Daniel completes the conversion to XML/XSLT, fixes a host of Netscape 4.x browser compatibility bugs, and adds an auto-generated XML Changelog to the site (Aug 2001).
- Visit the Gentoo mainpage to read the latest documentation or find out more about Gentoo Linux.
- If you haven't started using Python yet, you're only hurting yourself. Go check it out.
- Xara.com is the home of Xara X -- an excellent vector drawing package for Windows. With virtually no bloat and blazing speed, it has my personal recommendation.
- Learn more about XSLT.
- When you wake up, check out Sablotron, a fast XSLT processor
available from Gingerall.
- You can find the wonderful cvs2cl.pl CVS-to-ChangeLog script at Red-Bean.
- Learn more about DocBook at http://www.docbook.org.
- If you're looking for a great mailing list manager, be sure to take a look at Mailman.
- Check out www.mutt.org for the most current
version of the Mutt e-mail client.
- Visit IBM's Ease of Use site for the latest in design guidelines, from designing for the Web to out-of-box-experience.
- IBM offers training courses in user-centered design -- find out more here.
About the author
Residing in Albuquerque, New Mexico, Daniel Robbins is the
President/CEO of Gentoo Technologies,
Inc., the creator of Gentoo Linux (http://www.gentoo.org),
an advanced Linux for the PC, and the Portage system, a next-generation
ports system for Linux.
He has also served as a contributing author for the Macmillan books
Caldera OpenLinux Unleashed, SuSE Linux Unleashed, and Samba Unleashed.
Daniel has been involved with computers in some fashion since the
second grade, when he was first exposed to the Logo programming
language as well as a potentially dangerous dose of Pac Man. This
probably explains why he has since served as a Lead Graphic Artist at
SONY Electronic Publishing/Psygnosis. Daniel enjoys spending
time with his wife, Mary, and his new baby daughter, Hadassah.