This article is a sequel to "Use IMAP with Perl, Part 1" and discusses the ifrom.pl tool again. Please read the introductory material in the previous article to understand the specific mechanics of ifrom.pl before tackling this article. The new topics in this article include the Maildir mail storage format and tunneling (also known as port forwarding).
The extensions to ifrom.pl have met my own needs as an IMAP user, so I hope they prove useful to you as well.
Tunneling, also known as port forwarding
Tunneling a network connection (also known as port forwarding, although technically the two are not exactly the same) is a common technique in computing environments. To tunnel a connection is to redirect it through one or more intermediate destinations (which are not necessarily remote machines). The tunneling application usually doesn't know what data it carries, and the program that uses the tunnel usually doesn't know that its connection is being tunneled. Common UNIX tunneling applications are stunnel and OpenSSH, among others. (OpenSSH has other uses of course, but tunneling fits nicely with them.) Any program that can use a network port can use a tunneled network port as well. Think of it as a telephone connection that can go next door just as easily as it can go across the world.
Versions of SSH other than OpenSSH also support tunneling, but because of the popularity and unrestricted availability of OpenSSH, the discussion and examples in this article will use it.
In restrictive environments, tunneling can be very useful. Say you are not allowed to connect to an outside IMAP server because your firewall is configured to restrict that activity. You can run OpenSSH in port-forwarding mode so it forwards port 143 (IMAP) on the outside IMAP server to port XYZ on your local machine. You don't have to log in to the IMAP server itself to do this! You just need an intermediate jump point that accepts SSH connections and can in turn connect to the IMAP server. Of course, if you can SSH into the IMAP server directly, the connection will be faster and more reliable because the intermediate jump point will be eliminated.
You can also tunnel from the IMAP server to your own machine, reversing the connection process.
Another use of tunneling is to encrypt the connection. When you tunnel with OpenSSH, you can use its cryptographic abilities to encrypt your IMAP traffic if your IMAP server doesn't provide a secure connection. You can even do it if the IMAP server provides a secure connection already, so you'll encrypt the connection twice. Go wild.
To provide tunneling, I added a TUNNEL option to ifrom.pl. It's embarrassingly simple. In fact, my 10-month-old daughter can write a better one in half the time (if she doesn't eat the keyboard first). When the -tunnel option is present, I pass its value to system(), which runs it as a shell command.
Listing 1. Using the TUNNEL option to ifrom.pl
# use -tunnel like so with OpenSSH:
# ifrom.pl -tunnel "ssh REMOTEHOST.COM -N -T -n -L 2002:127.0.0.1:143 &"
# [other options to follow...]
# This forwards the REMOTE port 143 on REMOTEHOST.COM to the LOCAL
# port 2002. See the OpenSSH documentation for details and more
# examples. There is no intermediate jump point.
if ($config->TUNNEL())
{
    system($config->TUNNEL());
}
Afterwards, I go on to do what ifrom.pl would have done otherwise. That is, I connect to the IMAP server and check for mail. That's why tunneling is handy -- the program that uses it doesn't have to do anything unusual because the magic is all in the data-transport layer. In my case, ifrom.pl had to be modified slightly, but the program's main logic was not touched.
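One practical wrinkle with a backgrounded tunnel is that system() returns before ssh has finished setting up the forward. The following is a minimal sketch of one way to deal with that, not the code in ifrom.pl: the helper names tunnel_command() and wait_for_port() are hypothetical, and the timeout values are arbitrary. It builds the same kind of OpenSSH command as Listing 1 and then polls the local port until something accepts connections on it.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Time::HiRes qw(sleep time);

# Build an OpenSSH port-forwarding command like the one in Listing 1.
# $ssh_host is the machine you log in to; $target is the IMAP server as
# seen from $ssh_host (127.0.0.1 when they are the same machine).
# (Hypothetical helper, not part of ifrom.pl.)
sub tunnel_command {
    my ($ssh_host, $local_port, $target, $remote_port) = @_;
    return "ssh $ssh_host -N -T -n -L $local_port:$target:$remote_port &";
}

# Poll until something accepts TCP connections on $host:$port, or give up
# after roughly $timeout seconds. Returns 1 on success, 0 on timeout.
sub wait_for_port {
    my ($host, $port, $timeout) = @_;
    my $deadline = time() + $timeout;
    while (1) {
        my $sock = IO::Socket::INET->new(
            PeerAddr => $host,
            PeerPort => $port,
            Proto    => 'tcp',
            Timeout  => 1,
        );
        if ($sock) {
            close $sock;
            return 1;
        }
        return 0 if time() >= $deadline;
        sleep 0.2;
    }
}
```

A caller might run system(tunnel_command('REMOTEHOST.COM', 2002, '127.0.0.1', 143)) and then wait_for_port('127.0.0.1', 2002, 10) before connecting to the IMAP server through local port 2002. With an intermediate jump point, $ssh_host would be the jump host and $target the IMAP server's name.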
Maildir is a mail storage format. It is used by Courier IMAP and qmail, among others, to store users' mail. A Maildir consists of a single directory with the subdirectories "cur," "new," and "tmp." There can be other subdirectories, but they are ignored. I will call the Maildir directory "the Maildir" for simplicity.
Of the three subdirectories mentioned in the previous paragraph, only "new" interests us directly. In each subdirectory, Maildir stores messages, one per file. The files are named uniquely.
The "new" subdirectory contains all the messages that are new to the Maildir. The semantics of what's new depend on the particular mail-delivery agent (MDA) and other programs that use the Maildir. For instance, Courier IMAP marks seen messages by moving them out of the "new" subdirectory of the Maildir.
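The directory layout described above is easy to check for. As a small sketch (the function name is_maildir() is hypothetical and does not appear in ifrom.pl), a script could verify that a path looks like a Maildir before trying to read mail from it:

```perl
use strict;
use warnings;
use File::Spec;

# Return 1 if $dir contains the three standard Maildir subdirectories
# ("cur", "new", and "tmp"), 0 otherwise. (Hypothetical helper.)
sub is_maildir {
    my ($dir) = @_;
    for my $sub (qw(cur new tmp)) {
        return 0 unless -d File::Spec->catdir($dir, $sub);
    }
    return 1;
}
```

This only tests the structure; it says nothing about whether the messages inside are readable.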
First, I added the -maildir switch to ifrom.pl. It's just a scalar that tells me not to try an IMAP connection and to do the Maildir logic instead.
The Maildir processing is done with a glob() call, which does the wildcard matching of "Maildir/new/*" with the familiar shell semantics. (In Perl 5.6 and later, glob() is implemented internally by the File::Glob module rather than by invoking the local shell.) I could have used opendir() and readdir() as well, but the glob() call is much simpler.
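For comparison, here is roughly what the opendir() and readdir() alternative might look like. This is a sketch rather than code from ifrom.pl, and the function name new_messages() is made up for illustration. One reason to consider it is that glob() splits its pattern on whitespace, so a Maildir path containing spaces would need extra care there.

```perl
use strict;
use warnings;
use File::Spec;

# List the message files in $maildir/new using opendir()/readdir()
# instead of glob(). Returns full paths sorted by name; names starting
# with a dot are skipped, as the "*" wildcard would skip them.
# (Hypothetical helper.)
sub new_messages {
    my ($maildir) = @_;
    my $new = File::Spec->catdir($maildir, 'new');
    opendir(my $dh, $new) or die "Can't open $new: $!";
    my @names = grep { !/^\./ } readdir $dh;
    closedir $dh;
    my @files = grep { -f $_ } map { File::Spec->catfile($new, $_) } @names;
    @files = sort @files;
    return @files;
}
```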
For each file found with the glob() call, I grab the sender and subject and print them. Then, if the -dump or -print switches are given, I print the requested messages whole by printing the corresponding files one by one. The order of the files is determined by glob(), which returns the names sorted alphabetically. I don't do a date sort because that ordering has been fine for Maildir files (when the files are delivered by qmail alone, their names already sort by date). With more mail-delivery agents, a date sort may be necessary.
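If a date sort does become necessary, one simple approach is to order the files by modification time. The sketch below assumes that the file's mtime reflects delivery time, which depends on the delivery agent and on nothing having touched the files since; by_mtime() is a hypothetical name, not a function in ifrom.pl.

```perl
use strict;
use warnings;

# Return the given files ordered oldest-first by modification time
# (field 9 of stat), with the file name as a tie-breaker.
# Files that can't be stat()ed are treated as having mtime 0.
sub by_mtime {
    my @files = @_;
    my %mtime;
    $mtime{$_} = (stat $_)[9] // 0 for @files;
    my @sorted = sort { $mtime{$a} <=> $mtime{$b} or $a cmp $b } @files;
    return @sorted;
}
```

In the Maildir loop this could wrap the glob() call, as in by_mtime(glob(...)).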
Listing 2. Maildir support in ifrom.pl
if ($config->MAILDIR())
{
    my $count = 0;
    foreach my $file (glob($config->MAILDIR() . '/new/*'))
    {
        $count++;
        my $fh;
        unless (open($fh, '<', $file))
        {
            warn "Can't open $file: $!\n";
            next;
        }
        my $address = 'UNKNOWN';
        my $subject = 'UNKNOWN';
        while (<$fh>)
        {
            $address = $1 if m/^From: (.*)/;     # the sender of the message
            $subject = $1 if m/^Subject: (.*)/;  # the subject of the message
            last if $_ eq "\n";                  # a blank line ends the headers
        }
        printf "%5d %-35.35s %s\n", $count, $address, $subject;
        if ($config->DUMP || grep {$_ == $count} @{$config->PRINT})
        {
            seek($fh, 0, 0);                     # rewind to print the whole message
            print MARKER();
            print while <$fh>;
            print MARKER();
        }
        close $fh;
    }
}
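The header loop in Listing 2 matches "From:" and "Subject:" exactly as written and reads one physical line per header. Mail headers are case-insensitive and may be folded onto continuation lines that start with whitespace, so a more tolerant reader could look something like the sketch below. It is an illustration under those assumptions, not the code in ifrom.pl; read_headers() is a hypothetical name, and keeping only the first occurrence of a repeated header is a design choice made here for simplicity.

```perl
use strict;
use warnings;

# Read headers from an open filehandle up to the first blank line.
# Returns a hash ref keyed by lowercased header name. Continuation
# lines (starting with whitespace) are appended to the previous header.
# Only the first occurrence of a repeated header is kept.
sub read_headers {
    my ($fh) = @_;
    my (%header, $last);
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        last if $line eq '';
        if ($line =~ /^\s+(.*)/) {
            $header{$last} .= " $1" if defined $last;
        }
        elsif ($line =~ /^([^:\s]+):\s*(.*)/) {
            my ($name, $value) = (lc $1, $2);
            if (exists $header{$name}) {
                $last = undef;          # ignore repeats and their continuations
            }
            else {
                $header{$name} = $value;
                $last = $name;
            }
        }
    }
    return \%header;
}
```

With this helper, the sender and subject would be $h->{from} and $h->{subject}, falling back to 'UNKNOWN' when a key is missing.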
As a mail-server administrator, I have had to migrate users' mail from the so-called mbox format (one big file with messages in it) to the IMAP server. Because ifrom.pl already had the necessary IMAP logic, I added the -import switch and used the Mail::Box module to handle reading mail from an mbox file.
This is a slow method because every message has to be moved over a network connection. For true batch imports of many users, you may have to consider a more sophisticated method that's custom-tailored to your site. For example, if your site uses Maildirs for mail storage and you are migrating from the mbox format, just use an mbox-to-Maildir converter such as safecat (safecat has many other uses, by the way). See the safecat Web site for a recipe on mbox-to-Maildir conversion (see Resources).
The auto_mailbox parameter to -mailbox (stored in the MAILBOX_AUTO constant) tells ifrom.pl to pick a mailbox name based on the name of the file. I use the basename() function on the value of the -import switch, combined with the PREFIX parameter. If auto_mailbox is not specified, then whatever -mailbox says is what the new mailbox will be called. So, -import A/B/C/FILE -mailbox auto_mailbox will create and populate an IMAP folder called "FILE," while -import A/B/C/FILE -mailbox XYZ will create and populate a folder called "XYZ."
Furthermore, if a file called "XYZ.msf" (XYZ can be anything, .msf is the Mozilla mbox index filename extension) is seen, then just "XYZ" is used as the filename. That lets you do something like find DIRECTORY -name "*.msf" -exec ifrom.pl -import {} ... \; which passes the "find" results to ifrom.pl one by one. It will find all the Mozilla index files and then ifrom.pl will import the corresponding mailboxes one by one.
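The naming rules from the last two paragraphs can be condensed into one small function. This is a sketch that mirrors the logic of ifrom.pl rather than quoting it: import_target() is a hypothetical name, and the value 'auto_mailbox' for the MAILBOX_AUTO constant is taken from the description above.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

use constant MAILBOX_AUTO => 'auto_mailbox';

# Given the -import value, the -mailbox value, and the IMAP prefix,
# return the mbox file to read and the IMAP folder to populate.
# A trailing ".msf" (Mozilla index file) is stripped from the file name.
# (Hypothetical helper.)
sub import_target {
    my ($import, $mailbox, $prefix) = @_;
    (my $file = $import) =~ s/\.msf\z//i;
    my $box = $mailbox eq MAILBOX_AUTO
            ? $prefix . basename($file)
            : $mailbox;
    return ($file, $box);
}
```

For example, import_target('A/B/C/FILE', 'auto_mailbox', '') yields the folder "FILE", while passing 'XYZ' as the mailbox yields "XYZ" regardless of the file name.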
PREFIX tells ifrom.pl the IMAP server's prefix. The prefix and the separator are actually available from the IMAP server with the namespace() function, but the logic for obtaining them was too complex for a simple script such as ifrom.pl. As an example, the common UW IMAP prefix and separator are "" and "/", so mailboxes look like "a/b/c" (note the similarity to file and directory names). The Courier IMAP server prefix and separator are "INBOX." and ".", so mailboxes look like "INBOX.a.b.c" (note the flat structure -- Courier folders are all under the top level Maildir, without subdirectories).
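To make the prefix and separator concrete, here is a tiny sketch that assembles a full mailbox name from its parts. The function mailbox_path() is hypothetical and only illustrates the two naming schemes mentioned above.

```perl
use strict;
use warnings;

# Join folder name components with the server's separator and put the
# server's prefix in front. (Hypothetical helper.)
sub mailbox_path {
    my ($prefix, $separator, @parts) = @_;
    return $prefix . join($separator, @parts);
}

print mailbox_path('', '/', qw(a b c)), "\n";        # UW IMAP style
print mailbox_path('INBOX.', '.', qw(a b c)), "\n";  # Courier IMAP style
```

The two print statements produce "a/b/c" and "INBOX.a.b.c" respectively.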
The -dryrun option to ifrom.pl is very useful. It lets you see everything ifrom.pl would do while importing, except the actual import doesn't happen.
Listing 3. Importing mail
elsif ($config->IMPORT)
{
    # eval puts any error from require into $@ (not $!)
    eval { require Mail::Box::Manager; };
    die "You need to install the Mail::Box module, exiting" if $@;
    my $file = $config->IMPORT;
    if ($file =~ m/^(.*)\.msf$/i)
    {
        $file = $1;
        print "MSF file detected, using $file as the file name\n";
    }
    die "Can't access import file $file" unless -r $file;
    if ($config->MAILBOX eq MAILBOX_AUTO)
    {
        $box = $config->PREFIX . basename($file);
    }
    my $mgr = Mail::Box::Manager->new;
    my $folder = $mgr->open(folder => $file);
    my $i = 0;
    if ($config->DRYRUN)
    {
        print "Skipping folder check and creation because a dry run was requested\n";
    }
    else
    {
        if ($imap->select($box))
        {
            print "Selected folder successfully, ready to import.\n"
                if $config->VERBOSE;
        }
        else
        {
            print "Could not select folder $box, trying to create it...\n";
            $imap->create($box)
                or die "Could not create import folder: $@\n";
        }
    }
    # Iterate over the messages.
    foreach ($folder->messages)
    {
        printf "Appending to mailbox %s, message %d: ID %s, %d lines\n",
            $box, ++$i, $_->messageId, $_->nrLines;
        if ($config->DRYRUN)
        {
            print "Skipping import because a dry run was requested\n";
        }
        else
        {
            $imap->append($box, $_->string);
        }
    }
}
I wrote a help section. It's short and informative and you can bring it up with ifrom.pl -help.
I added a MARKER() constant because the "\n===\n\n" divider was used frequently.
For Maildirs, I did not add all the fancy code that IMAP has, such as the -backup or -delete_mailbox_really switch. Because Maildir is a local feature, meaning you can access the data in the Maildir over the locally mounted filesystems, backups and mailbox deletion are simple shell operations. For example, to back up a Maildir, follow this example:
Listing 4. Backing up a local Maildir
rsync -avP --delete MaildirLocation Destination
# so, for example, to back up /var/qmail/maildirs/tzz
# to /home/tzz/backups
rsync -avP --delete /var/qmail/maildirs/tzz /home/tzz/backups
Read the rsync documentation -- it's a great tool for backups and directory synchronizations in general. If you don't use it, you are missing out on one of the best UNIX tools available today.
The logic of the Maildir and IMAP loops may seem similar, but the loops were not similar enough to be merged. If another mail source were added to ifrom.pl, maybe it would make sense to merge all the loops into one. As it is, merging the two loops would have made the program logic more confusing and gained nothing.
I added the ability to call -print with just numbers from the command line, so instead of ifrom.pl -print 1 -print 2 you can just say ifrom.pl 1 2 -- much better for the user, I think. The code is very simple; just make sure you call $config->args() first.
Listing 5. Take command-line arguments into the -print switch
# all non-switch command-line arguments are implied -print requests
if (scalar @ARGV)
{
    $config->print($_) foreach @ARGV;
}
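The same idea can be shown in isolation with the core Getopt::Long module. This is a stand-in for illustration only: ifrom.pl does its option handling through its $config object, and print_requests() is a hypothetical name. The sketch parses any -print switches first and then treats leftover arguments as implied -print requests, keeping only those that look like message numbers.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Return the list of message numbers requested on the command line,
# whether given as "-print N" or as bare numbers.
# (Hypothetical helper using Getopt::Long instead of ifrom.pl's $config.)
sub print_requests {
    my @args = @_;
    my @print;
    GetOptionsFromArray(\@args, 'print=i@' => \@print)
        or die "Bad command line\n";
    # Whatever is left after option parsing is an implied -print request.
    push @print, grep { /^\d+\z/ } @args;
    return @print;
}
```

Called as print_requests(@ARGV), both "-print 1 -print 2" and "1 2" on the command line produce the list (1, 2).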
ifrom.pl is one of the most useful tools I have ever written. It's simple and fast, and it can do everything I need. I hope you enjoyed reading this article, and that ifrom.pl will serve you well. Also, I hope that the knowledge you have gained about IMAP and Maildirs will be useful in the future.
Feel free to explore ifrom.pl and to send me your suggestions for improvements.
- "Use IMAP with Perl, Part 1" (developerWorks, June 2003) introduces you to accessing IMAP with the Mail::IMAPClient CPAN module and the ifrom utility.
- Get the ifrom.pl script referenced in this article.
- Read all of Ted's Perl articles in the Cultured Perl series on developerWorks.
- Visit CPAN, the CPAN module archive.
- Find out more about Mail::IMAPClient at the Mail::IMAPClient manual pages.
- IMAP4rev1 RFC 2060 is the document where today's version of the IMAP protocol was defined. This is required reading.
- TAoUP, or The Art of Unix Programming, is a great book by Eric Raymond that can help you understand why the -tunnel option is so simple yet so powerful in a UNIX environment.
- safecat is an excellent tool that can be used to convert mboxes to Maildirs.
- Stunnel is a program that allows you to encrypt arbitrary TCP connections inside SSL (Secure Sockets Layer), available on both UNIX and Windows.
- OpenSSH is a free version of the SSH protocol suite of network-connectivity tools that encrypts all traffic (including passwords) to effectively eliminate eavesdropping, connection hijacking, and other network-level attacks.
- Find more resources for Linux developers in the developerWorks Linux zone.
- Get involved in the developerWorks community by participating in developerWorks blogs.
- Browse for books on these and other technical topics.
- Order the SEK for Linux, a two-DVD set containing the latest IBM trial software for Linux from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
- Innovate your next Linux development project with IBM trial software, available for download directly from developerWorks.

Teodor Zlatanov graduated with an M.S. in computer engineering from Boston University in 1999. He has worked as a programmer since 1992, using Perl, Java, C, and C++. His interests are in open source work on text parsing, 3-tier client-server database architectures, UNIX system administration, CORBA, and project management.