You don't have to revolutionize your thinking (or coding!) to get increased clarity with Perl. Although complicated tasks are difficult to program in Perl, it can be done, and it can be done neatly. You don't have to be the only one who can understand and maintain your program once it's written. Using these nine tips, you can keep using Perl, keep your style, and still have an accessible and stable program.
You may be thinking: "I have years of C/C++/Ada/Assembler/Pascal/LISP/Java, and my code is perfect. Don't talk to me about improvements." My response is that there is no perfection in programming, only the pursuit of perfection. Good programmers learn something new every day and improve their technique constantly.
Perl as a language is very malleable. For example, you can print out your environment (starting with version 5.005) like this:
Printing the contents of %ENV with a one-liner
print "$_ => $ENV{$_}\n" foreach(sort keys %ENV);
Or you can do it like this:
Printing the contents of %ENV, broken up
foreach (sort keys %ENV)
{
    print "$_ => $ENV{$_}\n";
}
Or you can even use the Data::Dumper module:
Printing the contents of %ENV with Data::Dumper
use Data::Dumper;
print Dumper(\%ENV);
Every one of these approaches does the same thing in a debugging context. But which one is easiest to understand, document, and maintain? The third one, of course. If you have never used Data::Dumper, you should read the documentation ("perldoc Data::Dumper") and try it in your programs.
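To illustrate why Data::Dumper is convenient for debugging, here is a minimal sketch that dumps a nested structure, something the loops above would not print usefully. The %config hash is made up for the example. $Data::Dumper::Sortkeys and $Data::Dumper::Indent are configuration variables described in the module's documentation; Sortkeys assumes a reasonably recent Data::Dumper.

```perl
#!/usr/bin/perl -w
use strict;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # emit hash keys in sorted order
$Data::Dumper::Indent   = 1;   # simple fixed indentation

# hypothetical data, nested one level deeper than %ENV
my %config = (
    name  => 'reverser',
    paths => [ '/etc/hosts', '/tmp' ],
);

# Dumper returns a string of Perl code that describes the structure
my $dump = Dumper(\%config);
print $dump;
```

Because the result is an ordinary string, it can go to a log file just as easily as to the screen.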
Speed is not the only measure of improvement of a program's code. Ease of testing, documentation, and maintenance should be kept in mind as well in any software project. A language as flexible as Perl facilitates good coding in every stage of the software project, except the pre-coding stages (requirements gathering and architecture design).
Writing comments: when your script flashes before your eyes
There is no such thing as too much documentation. Being clear often means repeating yourself. Think of your code as something you present to the world. There are a lot of people in the world. The one comment you thought was redundant could make someone's day a little easier. It could be your day, five years from now, when you are adding a new feature.
Use good planning when writing your programs. You don't have to determine every detail in advance. But you should break up the program into component parts, and use comments to fill in the gaps.
Let's take a small example: a program that reverses all its input except for names found in the /etc/hosts file.
The input reverser, first version
#!/usr/bin/perl -w
# author, date, revision, etc.
# brief summary of program
# modules used summary
# pragmas used summary
# function A summary
# function B summary
# main loop summary
# POD documentation
Next, we elaborate on each piece:
The input reverser, second version
#!/usr/bin/perl -w
# author, date, revision, etc.
# This program will process user input with two functions.
# Command-line arguments will be treated as input file sources.
# No flags are allowed.
# modules used: none
# pragmas used: strict
use strict;
# function A: return a parameter with the letters reversed
# function B: return true if a word is in the /etc/hosts file
# main loop: go through input lines, passing each word to B
# and, if B returns false, to A
__END__
POD documentation
And finally we have the last version, with code and comments. See the script: reverser.pl.
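For readers without the script at hand, the outline can be fleshed out roughly as follows. This is not the author's reverser.pl, only a sketch under stated assumptions: the function names (load_host_names, reverse_word, is_host_name, process_line) are invented here, and the host names come from in-memory lines in hosts-file format, so the example runs without reading /etc/hosts or standard input.

```perl
#!/usr/bin/perl -w
use strict;

# hypothetical helper: collect every name found in hosts-format lines
sub load_host_names
{
    my @lines = @_;
    my %names;
    foreach my $line (@lines)
    {
        $line =~ s/#.*//;                           # strip comments
        my ($address, @hosts) = split ' ', $line;   # address, then names
        $names{$_} = 1 foreach @hosts;
    }
    return \%names;
}

# function A: return a parameter with the letters reversed
sub reverse_word
{
    my ($word) = @_;
    return scalar reverse $word;
}

# function B: return true if a word is a known host name
sub is_host_name
{
    my ($names, $word) = @_;
    return exists $names->{$word} ? 1 : 0;
}

# main loop body: pass each word to B and, if B returns false, to A
sub process_line
{
    my ($names, $line) = @_;
    return join ' ',
        map { is_host_name($names, $_) ? $_ : reverse_word($_) }
        split ' ', $line;
}

my $names = load_host_names('127.0.0.1 localhost loopback # local names');
print process_line($names, 'ping localhost now'), "\n";   # gnip localhost won
```

A complete script would presumably feed the lines of /etc/hosts to the first function and loop over the input with the <> operator, as the outline describes.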
Loops can be categorized as one-liners and multi-line loops. Avoid the temptation to squeeze everything into one-line loops, even though it may seem more natural. Compare:
A one-line loop to print all the environment variables in lowercase, with the first letter uppercase, and sorted
print ucfirst lc $_, "\n" foreach (sort keys %ENV);
With:
A multi-line loop to print all the environment variables in lowercase, with the first letter uppercase, and sorted
foreach (sort keys %ENV)
{
    print ucfirst lc $_, "\n";
}
The second loop does the exact same job as the first. Which one would you rather see in production code that you have to maintain? Right. Multi-line loops are much easier to read.
Try to forget the bad things about C-based languages. This is good advice in general. But regarding loops in particular, you should not have to negate the whole condition. And you should use the 'for' statement as little as possible.
Transform if(not condition) into unless(condition).
If becomes unless
print "Yes" if not $false;
print "Yes" if !$false;
# becomes...
print "Yes" unless $false;
Likewise, while(not condition) is equivalent to until(condition).
While becomes until
print "No signal" while !$signal;
print "No signal" while not $signal;
# becomes...
print "No signal" until $signal;
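The C-style 'for' loop is a bad idea. It condenses three things into one line: initial condition, ending condition, and increment. While this was necessary before Perl came along, the list-based 'foreach' loop has now eliminated most of the need for it. (Perl itself treats the keywords 'for' and 'foreach' as synonyms. In this article, 'for' means the three-part C-style form and 'foreach' means iteration over a list.)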
For example, going from 1 to 10 with the variable $i:
Iterating with foreach
foreach my $i (1 .. 10)
{
    # do something
}
Compare this with the equivalent 'for' loop:
Iterating with for
for (my $i = 1; $i <= 10; $i++)
{
    # do something
}
In some instances, the 'for' loop can be useful. But I suggest heavily documenting its occurrence, and providing a reason for why 'foreach' was not sufficient. For a beginner or intermediate Perl programmer, it can be more harmful than beneficial to use the 'for' construct.
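One case where the three-part form earns its place is stepping through a list more than one element at a time. The sketch below shows the kind of documented exception suggested above. The @pairs data is invented for the example.

```perl
#!/usr/bin/perl -w
use strict;

# hypothetical flat list of name/value pairs
my @pairs = ('HOME', '/home/ted', 'SHELL', '/bin/sh');
my @lines;

# A C-style 'for' is used here because the index has to advance two
# elements at a time; 'foreach' would visit every element in turn.
for (my $i = 0; $i < @pairs; $i += 2)
{
    push @lines, "$pairs[$i] => $pairs[$i + 1]";
}

print "$_\n" foreach @lines;
```

The comment above the loop states why 'foreach' was not sufficient, which is the point of the advice.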
If you miss lint and -Wall from your C days, there is still hope.
Use the -w flag to run your script. Use the 'use strict' pragma to make your programs more legible, better structured, and more reliable. You won't magically become a better programmer, but it will significantly improve your code. For example, variables won't be created on their first use by default. They have to be declared, so a misspelled variable name is caught at compile time, and you'll avoid many common bugs. (Hash keys are still created on first use; strict does not check them.)
See reverser.pl for an example of how to use strict and -w.
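As a small illustration of what strict buys you, the sketch below contains a misspelled variable name. The misspelled code is wrapped in a string eval only so that the failure can be captured and printed; in an ordinary script the same typo would stop compilation before anything ran. The variable names are made up for the example.

```perl
#!/usr/bin/perl -w
use strict;

my $total = 10;

# '$totl' is a typo for '$total'. The string is compiled at run time,
# under the same 'use strict', so the error can be caught and shown.
my $ok    = eval q{ $totl = $total + 1; 1 };
my $error = $ok ? '' : $@;

print "strict rejected the code: $error" unless $ok;
```

Without 'use strict', the assignment would quietly create a new global variable and the bug would surface much later, if at all.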
Don't define constants as variables or functions. The 'use constant' directive does a better job. If your Perl doesn't support it, upgrade.
Use constant example
use constant TIMESLICE => 15;
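A slightly longer sketch shows how such a constant behaves in practice. SECONDS_PER_MINUTE is a second constant added for the example, and treating TIMESLICE as a number of seconds is an assumption, since the listing does not name a unit.

```perl
#!/usr/bin/perl -w
use strict;
use constant TIMESLICE          => 15;   # assumed to be in seconds
use constant SECONDS_PER_MINUTE => 60;   # hypothetical second constant

# constants can be used anywhere an expression is expected
my $slices_per_minute = SECONDS_PER_MINUTE / TIMESLICE;

# constants do not interpolate in strings, so pass them as list items
print "Slices per minute: $slices_per_minute\n";
print "Timeslice: ", TIMESLICE, " seconds\n";

# assigning to a constant is rejected when the code is compiled
my $changed      = eval q{ TIMESLICE = 20; 1 };
my $change_error = $changed ? '' : $@;
print "assignment rejected: $change_error" unless $changed;
```

A plain variable would have accepted the assignment silently, which is the main argument for the directive.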
Use prototypes for your functions. Don't make up your functions' usage as you go along. Plan before you code, and your work will be all the better for the planning.
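Here is a minimal sketch of a prototyped function, with larger_of as a made-up name. Perl checks a prototype when it compiles a call, and only for calls that appear after the declaration and are not made with the & form or as methods, so treat it as a planning aid, not as full argument validation.

```perl
#!/usr/bin/perl -w
use strict;

# hypothetical function; the ($$) prototype asks for exactly two scalars
sub larger_of($$)
{
    my ($first, $second) = @_;
    return $first > $second ? $first : $second;
}

my $larger = larger_of(3, 7);
print "$larger\n";   # 7

# A call with too few arguments does not compile. The string eval is
# only there so that the error can be captured and displayed.
my $accepted    = eval q{ larger_of(3); 1 };
my $proto_error = $accepted ? '' : $@;
print "rejected: $proto_error" unless $accepted;
```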
Be aware of the differences between scalars, numbers, and strings. Read the FAQ and the Programming Perl (or Learning Perl) book. In general, do your homework on this one. Perl makes it easy to convert numbers into strings and vice versa, but that can create subtle bugs that take hours or days to find.
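Two of the most common traps are sketched below: comparing with the wrong operator and sorting numbers with the default string sort. The values are invented for the example.

```perl
#!/usr/bin/perl -w
use strict;

my $version = '1.10';   # hypothetical version string

my $same_number = ($version == 1.1)   ? 'yes' : 'no';   # numeric comparison
my $same_string = ($version eq '1.1') ? 'yes' : 'no';   # string comparison

my @sizes      = (10, 9, 100);
my @as_strings = sort @sizes;                 # default sort compares strings
my @as_numbers = sort { $a <=> $b } @sizes;   # explicit numeric sort

print "== says $same_number, eq says $same_string\n";   # == says yes, eq says no
print "string sort:  @as_strings\n";                     # 10 100 9
print "numeric sort: @as_numbers\n";                     # 9 10 100
```

Neither result is an error as far as Perl is concerned, which is why this kind of bug can take so long to find.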
The topic of cleanliness can never be exhausted. You should always strive to produce clean code, learning as you go along. These are only the very basics of good usage in Perl syntax. Look at the perlsyn and perlstyle pages as well. They contain pertinent information on this topic.
Functional Programmers Anonymous meets at 9 PM tonight
For the programmers who want Perl to be a more functional language, map and grep can satisfy the craving. Map and grep evaluate a block or an expression for each element of a list. They can be surprisingly useful with the block syntax. We will look at some more advanced examples of that syntax in a minute. Map and grep set $_ to the current element they are examining inside the block.
The often-quoted Schwartzian transform is really just a way to sort a temporary array by one of its fields. Here is an example:
The Schwartzian transform
@sorted_list =
    map { $_->[0] }
    sort { $a->[1] <=> $b->[1] } # note that this is a numeric sort
    map { [$_, index_function($_)] }
    @unsorted_list;
The unsorted list can contain any data you like. Write an index_function that will generate a key that can be used for sorting each element of the unsorted list. The transform will sort the elements by those keys numerically.
If you do not understand this code listing, you should read the documentation for the map and sort functions carefully. Learn to understand the Perl notation for anonymous array references and dereferencing array elements. The Effective Perl Programming site (see Resources) has a detailed explanation of the Schwartzian transform.
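A concrete run may help. In the sketch below, index_function simply returns the length of a word, which is a choice made for the example; any function that produces a numeric key would do. Read the chain from the bottom up.

```perl
#!/usr/bin/perl -w
use strict;

# hypothetical key function: sort words by their length
sub index_function
{
    my ($word) = @_;
    return length $word;
}

my @unsorted_list = qw(banana fig pear apple);

my @sorted_list =
    map  { $_->[0] }                      # 3. keep only the original element
    sort { $a->[1] <=> $b->[1] }          # 2. compare the precomputed keys
    map  { [ $_, index_function($_) ] }   # 1. pair each element with its key
    @unsorted_list;

print "@sorted_list\n";   # fig pear apple banana
```

The payoff is that index_function runs once per element and not once per comparison, which matters when computing the key is expensive.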
Map and grep allow a lot of neat techniques. For example:
Mapping an array to uppercase
@uc = map { uc } @values;
This will convert an array to uppercase by mapping the uc function to each element in the array. Have we seen that before? Yes! The loop to print environment variables could benefit from this approach to increase legibility:
A one-line loop to print all the environment variables in lowercase, with the first letter uppercase, and sorted, using map
print $_, "\n" foreach (map {ucfirst lc } sort keys %ENV);
Even better code, because it is clearer and separates form from function, would be:
A multi-line loop to print all the environment variables in lowercase, with the first letter uppercase, and sorted, using map
foreach (map {ucfirst lc } sort keys %ENV)
{
    print $_, "\n";
}
Grep works much like map, except that it returns the original elements, and an element only "passes through" if the expression or block returns true. Map returns the result of the block for every element. You can use grep to extract every array element that names a valid file, for example:
Filter array of filenames for validity
@valid = grep { -f } @names;
The -f operator returns true only if the name refers to an existing plain file, so directories and missing files are filtered out.
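The difference between the two functions is easiest to see side by side. The sketch below uses a pattern match in place of a file test so that it runs anywhere; the file names are invented.

```perl
#!/usr/bin/perl -w
use strict;

my @names = qw(alpha.txt beta.pl gamma.txt);   # hypothetical file names

# grep passes the original elements through when the block is true
my @text_files = grep { /\.txt$/ } @names;

# map returns whatever the block evaluates to for each element
my @is_text = map { /\.txt$/ ? 1 : 0 } @names;

# in scalar context, grep returns the number of matching elements
my $count = grep { /\.txt$/ } @names;

print "@text_files\n";   # alpha.txt gamma.txt
print "@is_text\n";      # 1 0 1
print "$count\n";        # 2
```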
When modules knock at your door
If you ever find more than one program using a particular functionality, it's time to put it in a module. You don't have to use object orientation in your module. Simply saying:
Using the Exporter module
package My::Package;
require Exporter;
@ISA = qw(Exporter);
@EXPORT = qw(my_function); # symbols to export by default
sub my_function()
{
}
1;
is enough to create your module My::Package and export the function my_function from it. Now, when other programs say "use My::Package" they will automatically import the function "my_function."
There is a lot more to modules. Read the perlmod documentation and consult the Resources section of this article. I strongly recommend learning about modules and making an effort to use them. They will make your life easier. As a general guideline, if you use a function in more than one program, it should be in a module.
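One of the first refinements worth knowing is @EXPORT_OK, which Exporter uses for symbols that are exported only on request. The sketch below keeps everything in one file so it can be run as is. My::Greeting and greet are made-up names, and the BEGIN blocks exist only to imitate what "use" does for a module stored in its own file.

```perl
#!/usr/bin/perl -w
use strict;

# Normally this package would live in its own file, My/Greeting.pm.
BEGIN
{
    package My::Greeting;          # hypothetical module name
    require Exporter;
    our @ISA       = qw(Exporter);
    our @EXPORT_OK = qw(greet);    # exported only when asked for

    sub greet
    {
        my ($name) = @_;
        return "Hello, $name";
    }
}

# Equivalent to "use My::Greeting qw(greet);" for a module in its own file
BEGIN { My::Greeting->import('greet') }

my $message = greet('world');
print "$message\n";   # Hello, world
```

Exporting on request keeps the caller's namespace clean, which matters more as the number of modules in a program grows.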
Object-oriented programming and other unsightly habits
Object-oriented programming (OOP) is a wonderful tool. Don't try to make it a religion, though. OOP does not:
- Solve the world's problems
- Shorten development schedules halfway through the project
- Require expensive tools
- Make planning unnecessary, because "objects take care of themselves"
OOP does:
- Make life easier when used correctly
- Allow abstract concepts to take shape quickly
- Interface nicely with Perl's modules
In short, use OOP, but don't rely on it to do everything. For an introduction to OOP with Perl, see Resources and the perltoot page. OOP can be as simple as data abstraction, or as complex as a company-wide methodology.
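At the data-abstraction end of that range, an object in Perl is just a reference that has been blessed into a package. The Counter class below is a made-up example and is not related to the PSI::ESP module.

```perl
#!/usr/bin/perl -w
use strict;

{
    package Counter;   # hypothetical class

    # constructor: store the state in a hash and bless it into the class
    sub new
    {
        my ($class, %args) = @_;
        my $self = { count => $args{start} || 0 };
        return bless $self, $class;
    }

    # add one to the counter and return the new value
    sub increment
    {
        my ($self) = @_;
        return ++$self->{count};
    }

    # accessor: callers never touch the hash directly
    sub value
    {
        my ($self) = @_;
        return $self->{count};
    }
}

my $counter = Counter->new(start => 5);
$counter->increment;
$counter->increment;
print $counter->value, "\n";   # 7
```

Callers use only new, increment, and value, so the internal hash can change later without breaking them.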
An example of a simple object can be found in the PSI::ESP module: ESP.pm.
Get the ptkdb debugger from CPAN (see Resources) as soon as you can. It requires a few modules, but it's well worth it. It will let you view data and code in your programs and in the loaded modules. ptkdb has saved me from hours of frustration and early hair loss many times.
In the Perl community, the PSI::ESP module is famous. Unfortunately, only a few have real access to its wisdom. The rest of us have to help each other as best we can without psychic abilities. If you need help with a Perl problem, or with your Perl style, remember to include a complete description of the situation, what you have already tried, and your complete code in the SOS.
I will now provide a preliminary version of the ESP module. As long as the Earth's magnetic field is aligned correctly, this will solve every Perl problem you may encounter. I hope you find it useful.
Place ESP.pm in a PSI subdirectory of your current directory, and you can use the module with the following syntax:
perl -I. -MPSI::ESP -e 'use PSI::ESP; $p = new PSI::ESP; print $p->reason'
I hope this article has helped you improve your Perl skills a bit. Much of the information presented here is complemented by what you will find in Resources. Enjoy your journey with Perl. I hope you find it to be a constant source of insight and excitement!
Resources
- Read Ted's other Perl articles in the "Cultured Perl" series on developerWorks.
- reverser.pl contains the final version of the script mentioned in this article, with code and comments
- An example of a simple object can be found in the PSI::ESP module
- For all the Perl FAQs you can get your hands on, start at Perl.com or Perl.org
- CPAN (the Comprehensive Perl Archive Network) contains all the Perl modules you ever wanted
- See the Parallel Module for Perl from CPAN
- EZ Perl provides Comprehensive Perl Database Scripts for database management
- Free Perl Code provides "tons" of free Perl code
- Effective Perl Programming has lots of useful information
- See the Lingua::EN::Fathom documentation for details on the module that analyzes the readability of English text
- Perl's Artistic License
- The Perl Monks
- Perl Power Tools: The Unix Reconstruction Project
- The Roth Consulting Perl Page
- Effective Perl Programming, by Joseph Hall and Randal Schwartz (Addison Wesley, 1998) is the definitive source of Perl tips and tricks in book form, relating to the language rather than specific tasks
- Object Oriented Perl, by Damian Conway (Manning Publications, 2000) is an excellent guide to modules and object orientation
- Programming Perl, 2nd Edition, by Larry Wall, Tom Christiansen, and Randal L. Schwartz (O'Reilly, 1996) is the best guide to Perl today, but a little outdated with 5.005 and 5.6.0 out now
- The Perl Journal
- The USENET comp.lang.perl.misc newsgroup is a wonderful place to learn. Please read the articles for a while before you post, check the FAQs, and generally behave nicely when you participate in the c.l.p.misc community. Look at section 9 in particular.

Teodor Zlatanov graduated with an M.S. in Computer Engineering from Boston University in 1999. He has worked as a programmer since 1992 using Perl, Java, C and C++. His interests are in open source work on text parsing, 3-tier client-server database architectures, UNIX system administration, CORBA, and project management. He can be contacted at tzz@bu.edu.