
The road to better programming: Chapter 2

Commenting your code

Teodor Zlatanov (tzz@iglou.com), Programmer, Gold Software Systems
Teodor Zlatanov graduated with an M.S. in computer engineering from Boston University in 1999. He has worked as a programmer since 1992, using Perl, Java, C, and C++. His interests are in open source work on text parsing, 3-tier client-server database architectures, UNIX system administration, CORBA, and project management.

Summary:  This series of articles on developerWorks comprises a complete guide to better programming in Perl. In this second installment, Teodor dissects comments in code. The comments in a program's code are perhaps as important to the long-range goals of a software team as the actual code itself. Unfortunately, they are also often the most neglected. Through tips, quips, examples, and anecdotes, Teodor takes an in-depth look at the imperative nature of commenting a program's language from beginning to end.

Date:  01 Nov 2001
Level:  Introductory

There is no such thing as too much documentation. Being clear often means repeating yourself. Think of your code as something you present to the world. There are a lot of people in the world. The one comment you thought was redundant could make someone's day. It could even be your day, five years from now, when you are adding a new feature.

Basic commenting

Use good planning when writing your programs. You don't have to determine every detail in advance, but you should break up the program into component parts, and use comments to fill in the gaps.

The following is my personal coding style. You may not like it, but try to look at it objectively and see what you can use for yourself or for your team.

First of all, think of the intended audience for the comments. Try to make the comments clear enough for a third-party consultant to follow. The more complex the code, the more comments you should add to clarify intent. Don't leave comments for later; make them a part of your thought process: problem, solution, comment, then debug. Especially important is creating comments before debugging. The comments in your own code will help you debug better and faster.

It is helpful sometimes to state not only the solution to the problem, but also the problem itself. For example:


Listing 1

# function: do_hosts
#
# purpose: to process every host in the /etc/hosts table and see if it
# resolves to a valid IP
#
# solution: read the list of hosts as keys in a hash, then go through
# the list of keys (hosts) and store the IP address for each host as
# the value for that key, or undef() if it doesn't resolve properly.
# Return a reference to the hash, or undef if the /etc/hosts file was
# not accessible.

Some prefer brevity:


Listing 2

# function: do_hosts:  process every host in the /etc/hosts table and
# see if it resolves to a valid IP; return a reference to the hash
# (key=host, value=IP or undef), or undef if the /etc/hosts file was
# not accessible.

And here's another way:


Listing 3

# do_hosts: returns a ref to hash of hosts (key=host, value=IP/undef)
# from /etc/hosts 

All of the above ways are valid, depending on the complexity of do_hosts(). If the function is two lines, don't waste your time writing three paragraphs of comments. If it's several pages, however, don't be frugal with explanations.
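
To show how such a comment sits on top of real code, here is a minimal sketch of what do_hosts() might look like, following the solution described in Listing 1. The article does not show the function body, so everything below the comment is an assumption; in particular, the optional file name argument is hypothetical and exists only to make the function easy to try on a test file.

```perl
use strict;
use warnings;
use Socket qw(inet_ntoa);

# do_hosts: returns a ref to hash of hosts (key=host, value=IP/undef)
# from the hosts file, or undef if the file was not accessible
sub do_hosts
{
  my $file = shift || '/etc/hosts';     # hypothetical optional argument

  open(my $fh, '<', $file) or return undef;

  # read the list of hosts as keys in a hash
  my %hosts;
  while (my $line = <$fh>)
  {
    $line =~ s/#.*//;                   # strip comments
    my ($ip, @names) = split ' ', $line;
    next unless @names;                 # skip blank or malformed lines
    $hosts{$_} = undef foreach @names;
  }
  close($fh);

  # store the IP address for each host, or undef if it doesn't resolve
  foreach my $host (keys %hosts)
  {
    my $packed = gethostbyname($host);
    $hosts{$host} = defined $packed ? inet_ntoa($packed) : undef;
  }

  return \%hosts;
}
```

A function of this length probably deserves the medium-sized comment from Listing 2 or the short one from Listing 3, not the three paragraphs of Listing 1.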


Commenting the beginning of the program

The program should begin with a brief explanation of its purpose. Don't make people scroll down several pages to figure out what you were doing. If you are using a version control system such as CVS, place the appropriate headers at the beginning of the file, such as the Id header. Be concise. Two lines, four at most, should be sufficient to describe a program briefly. Give a contact name, e-mail, telephone number, or team contact.


Listing 4

#!/usr/bin/perl -w
# whodunit.pl: A script to solve a murder mystery
# by joe@shmoe.com $Id: whodunit.pl,v 1.92 2000/08/08 19:08:50 joe Exp $

The comment on the first line is the standard way on most UNIX systems to indicate which program runs the script (everything after the '#!' is taken as the interpreter path and its arguments). The -w flag turns on warnings, which is always a good idea, even for an experienced programmer.

The second line (first comment line) is a brief description of the program and its purpose. The third line (second comment line) names the author, and gives an Id header that uniquely identifies the release date and version of the file. RCS and CVS specifically use the Id header, which updates automatically upon committing the script. For more on RCS and CVS, see Resources later in this article.


Commenting initialization sections

The initialization sections should be logically and physically separate from the beginning of the program, for example by setting them off with extra comments. Initialization sections, as opposed to the program's beginning described above, contain actual code that executes when the program starts up. In Perl, the initialization section should consist of the following (preferably in this order):

  • Modules and pragmas
  • Constants
  • BEGIN/END/INIT/CHECK subroutines
  • Initialization code

Modules and pragmas

The use keyword in Perl directs the interpreter to either load a module or turn a pragma on ("no pragma" turns a pragma off). Pragmas nudge the interpreter in the right direction. For example, use utf8 tells the interpreter that the source code of the script itself is encoded in UTF-8.

It's good to line up the comments for each module horizontally and to have one comment per module or pragma:


Listing 5

use Data::Dumper;               # for debugging printouts
use strict;                     # be strict - pragma for the interpreter
use POSIX;                      # use the POSIX functions

After the first time you do this, it's just a matter of copy and paste to get the modules and pragmas into a new program. I recommend the "strict" pragma. Among other things, it will ensure that you are honest about declaring your variables, which in my experience is as much a source of bugs in Perl as memory allocation is in C/C++.
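
As a small illustration of what "honest about declaring your variables" means in practice, the following sketch compiles a snippet with a misspelled variable name inside a string eval, so the failure can be captured instead of stopping the script. The variable names are made up for the example.

```perl
use strict;
use warnings;

my $total = 10;                         # the variable we meant to use

# compile a snippet that misspells $total as $totl; the string eval
# inherits the strict pragma, so compilation of the snippet fails
my $ok    = eval q{ my $sum = $totl + 1; 1 };
my $error = $@;                         # save the compile error

print "strict caught: $error" unless $ok;
```

Without the strict pragma, the misspelled $totl would quietly be treated as a new, empty global variable, and the bug would surface much later, if at all.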

See all module and pragma documentation with the perldoc command. For example, perldoc strict tells all about the strict pragma -- what it does, how to use it, and so on.

Some editors have the nice ability to always place comments at a certain position (in Emacs, the indent-for-comment command does this automatically). Thoroughly familiarize yourself with your editor's commands. It is time well spent.

Constants

Although you can view constants as just another Perl pragma, they deserve their own section. Commenting for them should be like that of modules and pragmas, but it looks nice if the arrows line up as well:


Listing 6

use constant ALPHA    => 1; # alpha code
use constant BETA     => 2; # beta code
use constant GAMMA    => 3; # gamma code
use constant USER     => 4; # user ID offset
use constant GROUP    => 5; # group ID offset
use constant DEPT     => 6; # dept. ID offset
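
The constant pragma also accepts a hash reference, which declares several constants at once and keeps the arrows lined up with less typing. The sketch below uses that form and shows the offsets in use; the @record array is hypothetical sample data, not part of the original program.

```perl
use strict;
use warnings;

use constant {
  USER  => 4,                           # user ID offset
  GROUP => 5,                           # group ID offset
  DEPT  => 6,                           # dept. ID offset
};

# hypothetical record; only offsets 4 through 6 matter here
my @record = ('a', 'b', 'c', 'd', 'tzz', 'staff', 'eng');

my $user = $record[USER];               # pick out the user ID
my $dept = $record[DEPT];               # pick out the dept. ID

print "user $user works in $dept\n";
```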

BEGIN/END/INIT/CHECK subroutines

Comment the BEGIN/END/INIT/CHECK subroutines (see perldoc perlmod for more information on them) just like regular subroutines. They can be defined anywhere in the file, and each can be defined more than once. I recommend placing them at the beginning or end of the file, where they are easy to find. Note that a one-line BEGIN block does not need extensive commenting.


Listing 7

# BEGIN: executed at startup, assigns 'root' to the USER environment
# variable
BEGIN 
{
  $ENV{USER} = 'root';
}
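
Because these blocks run at fixed points regardless of where they appear in the file, a comment stating when each one runs is worth having. The following sketch, with a hypothetical @order array used only for bookkeeping, records the order of execution:

```perl
use strict;
use warnings;

our @order;                             # records when each block runs

push @order, 'main code';               # runs at run time, after BEGIN

# BEGIN: executed at compile time, before the run-time statement above
BEGIN
{
  push @order, 'BEGIN';
}

# END: executed as the interpreter exits, prints the recorded order
END
{
  print join(' -> ', @order, 'END'), "\n";
}
```

Even though the BEGIN block comes second in the file, it is the first entry recorded, which is exactly the kind of surprise a comment should spare the reader.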

Initialization code

Last in the initialization section comes the actual code. Again, line the comments up if possible within individual blocks.


Listing 8

$| = 1;                                 # auto-flush the output
$Data::Dumper::Terse = 1;               # produce human-readable
                                        # Data::Dumper output
# define the configuration variables
my $config = AppConfig->new();
$config->define(
                # list of undo commands
                'UNDO'            => { ARGCOUNT => ARGCOUNT_LIST },
                # file to log data
                'LOG_FILE'        => { ARGCOUNT => ARGCOUNT_ONE  },
                );
$config->file('whodunit.conf');         # load the whodunit
                                        # configuration file

This initialization code turns auto-flushing on (so output will show up immediately), then tells the Data::Dumper module to produce human-readable output, and finally creates an AppConfig configuration.
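
A comment such as "produce human-readable Data::Dumper output" is easier to trust once you have seen what the setting does. This short sketch, using a made-up $suspect structure, dumps the same data before and after setting $Data::Dumper::Terse:

```perl
use strict;
use warnings;
use Data::Dumper;

my $suspect = { name => 'butler' };     # hypothetical sample data

my $verbose = Dumper($suspect);         # default: begins with '$VAR1 = '

$Data::Dumper::Terse = 1;               # leave out the '$VAR1 = ' prefix
my $terse = Dumper($suspect);           # just the data structure itself

print $verbose, $terse;
```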


Commenting regular code

Commenting regular code is pretty easy. Just line up the comments when possible, be concise, and don't be afraid to explain things in-depth when they are unclear.


Listing 9

print Dumper \%ENV;                     # print the full ENV hash

# get the environment variable names that begin with USER
my @user_vars = grep(/^USER/, keys %ENV);

# print the values in all the variables that begin with USER, using a
# hash slice
print Dumper @ENV{@user_vars};          

print "Done\n";                         # print "done" message

# TODO: find better method of sorting variables
# TODO: use Data::Dumper with variable names

Note that comments begin either at column 0 or column 40. Consistency makes comments more readable. Also, multi-line comments are fine when necessary. You can also use comments to note where functionality is missing, buggy, or incomplete. The "TODO" word is helpful in case you want to look through all your code and see what things are still incomplete -- a quick grep command will print out all the TODO items.
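
If grep is not available, or you want the list in a form your own tools can process, the same search is easy to sketch in Perl. The find_todos function below is a hypothetical helper written for this example; it assumes TODO items always appear inside # comments, as they do in Listing 9.

```perl
use strict;
use warnings;

# find_todos: returns a list of "file:line: comment" strings, one for
# each TODO comment found in the given files
sub find_todos
{
  my @found;
  foreach my $file (@_)
  {
    open(my $fh, '<', $file) or next;   # skip unreadable files
    while (my $line = <$fh>)
    {
      chomp $line;
      # $. holds the current line number of the file being read
      push @found, sprintf('%s:%d: %s', $file, $., $1)
        if $line =~ /#\s*(TODO.*)/;
    }
    close($fh);
  }
  return @found;
}

print "$_\n" foreach find_todos(@ARGV); # list the TODO items
```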

There's no need to comment every single line of code, but keep in mind that comments are the single best resource when debugging or extending programs. Any other source of programmer documentation is likely to be one step behind the actual code, unless the programmer has been very diligent.


Commenting loops and conditionals

Loops and conditionals should be commented like regular code and functions. Numbering loops to identify them later is somewhat extreme. A better approach is to use a folding editor, which can show a whole loop as one line upon folding the loop (lines between folding marks are hidden but not gone). Think of folding marks like XML/HTML begin/end tags, which are possible to nest. Your favorite editor may support folding already. (X)Emacs does, either with Outline or with folding.el modes.


Listing 10

# go through all the numbers between 2 and 200, and print a message
# for each one
foreach my $counter (2 .. 200) 
{
  print "Whoa, the counter is $counter!\n";
}

Always state the purpose and bounds of the loop. For example, "count from 2 to 200" is fine, but "process array" is not. If logical conditions affect the bounds, state them as well, but not at the top of the loop. The summary at the top of the loop should not note exceptions to the general iteration, unless they are very important to the loop. Let discretion be your guide.
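
Here is a sketch of that advice applied to the loop from Listing 10. The rule about multiples of 50 is invented for the example; the point is that the summary at the top states the purpose and bounds, while the exception is documented next to the line that implements it.

```perl
use strict;
use warnings;

# go through all the numbers between 2 and 200, and collect the ones
# to report
my @reported;
foreach my $counter (2 .. 200)
{
  # multiples of 50 are not reported (hypothetical exception)
  next if $counter % 50 == 0;

  push @reported, $counter;
}

print scalar(@reported), " numbers reported\n";
```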


Commenting the final stages of the program

In many ways, the end of the program is the most boring. The work has been done, the data structures have gone to sleep (there is no memory deallocation that you need to worry about in Perl), and now the end is just a few lines away. Don't let this fool you -- the finishing lines of a program can be just as perilous as the rest. Comment even the most trivial lines here, because the first thing a debugging programmer does is look at the program's exit behavior.


Listing 11

# delete old files, warn if they can't be removed
foreach (@myfiles)
{
 unlink $_ or warn "Couldn't remove $_: $!";
}
print "whodunit.pl is done!\n";         # tell the user we're done
exit;                                   # exit peacefully


Writing POD documentation and help for the program

Plain old documentation (POD) is a way to document a Perl script inside the script itself. The perldoc perlpod command will tell you more about POD documentation and its syntax. Good POD documentation means that users can access help for your program quickly and efficiently. Take the time to learn the POD syntax; writing manuals will be much easier. In addition, POD is compatible with a variety of manual formatters, so you can generate a plain text file, a UNIX-style man page, and a professional-looking LaTeX file from the same documentation. POD is a fairly limited format, but perfectly sufficient for most documentation needs.

Generally, the following sections should be present in POD documentation: NAME, SYNOPSIS, DESCRIPTION, OPTIONS, RETURN VALUE, ERRORS, DIAGNOSTICS, EXAMPLES, ENVIRONMENT, FILES, CAVEATS/WARNINGS, BUGS, RESTRICTIONS, NOTES, SEE ALSO, AUTHORS, HISTORY (from perldoc pod2man, where you can find more information on each section; keep in mind that these are suggestions rather than imperatives).
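
As a starting point, here is a minimal sketch of a script that carries a few of those sections. The wording of the sections is invented for the hypothetical whodunit.pl; a real program would fill in more of the list above.

```perl
#!/usr/bin/perl -w
# whodunit.pl: A script to solve a murder mystery

use strict;

=pod

=head1 NAME

whodunit.pl - solve a murder mystery

=head1 SYNOPSIS

  whodunit.pl -h
  whodunit.pl -show suspects

=head1 OPTIONS

=over 4

=item B<-h>

Print the help.

=item B<-show>

Show the suspects, victims, or detectives.

=item B<-quiet>

Print no information other than the killer's name.

=back

=cut

my $status = 'code resumes after =cut'; # POD above is skipped by perl
print "$status\n";
```

Running perldoc on a file like this shows the formatted NAME, SYNOPSIS, and OPTIONS sections, while running the file with perl executes only the code.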

Some programmers make the -h switch to their programs invoke perldoc on the program, so the POD documentation is printed out as if the user had typed perldoc whodunit.pl. The problem here is that users don't want that much information from the -h switch. They just want the synopsis and the list of options. Thus, it is better to write a separate help handler that runs when the -h switch is used:


Listing 12

# print_help: help handler, prints out help for whodunit.pl and exits
sub print_help
{
 # print the help itself
 print <<EOHIPPUS;
 This is help for the whodunit.pl program.

You can pass options to whodunit.pl as command-line arguments.  For
example:

..../whodunit.pl -h
..../whodunit.pl -show suspects

List of options:

-h     : print this help

-show  : show the suspects, victims, or detectives (all of them if no
         second argument is specified)

-quiet : print no information other than the killer's name

EOHIPPUS
 exit;                                  # do nothing else, just exit quietly
}

Note the documentation of print_help itself. Also, the appearance of the POD documentation and other online help is important. The first place a user goes is not the manual. It's much more convenient to use the -h flag, or look at the POD documentation. Note the alignment of the colons, the spacing between lines, and the overall neatness. Outward appearances do matter, often more than the actual functionality provided by the program. Well-written programs should have good documentation first and foremost.
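
The article does not show how the -h switch reaches the help handler. One possible way, sketched here with the standard Getopt::Long module, is to parse the options into a hash and call the handler when -h is present. The parse_options function is a hypothetical helper, and print_help is reduced to a one-line stand-in for the handler in Listing 12.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# print_help: stand-in for the help handler, prints a short message
# and exits
sub print_help
{
  print "This is help for the whodunit.pl program.\n";
  exit;                                 # do nothing else, just exit
}

# parse_options: returns a ref to hash of the options found in the
# argument list (keys: h, quiet, show), or undef on a bad option
sub parse_options
{
  my @args = @_;                        # copy, GetOptions modifies it
  my %opt;
  GetOptionsFromArray(\@args, \%opt,
                      'h',              # print the help
                      'quiet',          # print only the killer's name
                      'show:s',         # what to show (value optional)
                     ) or return undef;
  return \%opt;
}

my $opt = parse_options(@ARGV) or die "Bad options, try -h\n";
print_help() if $opt->{h};              # help requested, print and exit
```

Keeping the parsing in its own commented function means the option list is documented in one place, next to the code that defines it.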

Some programmers like to include POD documentation in their program instead of regular comments. Such POD comments begin with =pod on a line by itself (there are other options, explained in the perlpod documentation), and end with =cut on a line by itself. The =pod line tells the Perl compiler to ignore everything up to the =cut line, in effect excluding that block of text from the script itself. This is fine if your users are also programmers, but may confuse normal users who just want to look at the documentation for the script, not the comments for the code itself. This approach also scatters documentation throughout the code. Use it with restraint.


Resources
