Cultured Perl: Use IMAP with Perl, Part 2

Examining Maildir and tunneling with the ifrom.pl script and Mail::IMAPClient

Teodor Zlatanov (tzz@bu.edu), Programmer, Gold Software Systems
Teodor Zlatanov graduated with an M.S. in computer engineering from Boston University in 1999. He has worked as a programmer since 1992, using Perl, Java, C, and C++. His interests are in open source work on text parsing, 3-tier client-server database architectures, UNIX system administration, CORBA, and project management.

Summary:  Ted returns to the subject of accessing IMAP with the Mail::IMAPClient module by looking at ifrom.pl as an alternative to other IMAP and POP3 mail checkers. This time around, Ted covers tunneling (or "port forwarding," as it is sometimes called) and applies the script to the Maildir mail-storage format.

Date:  19 May 2005
Level:  Introductory

This article is a sequel to "Use IMAP with Perl, Part 1" and discusses the ifrom.pl tool again. Please read the introductory material in the previous article to understand the specific mechanics of ifrom.pl before tackling this article. The new topics in this article include the Maildir mail storage format and tunneling (also known as port forwarding).

The extensions to ifrom.pl have met my own needs as an IMAP user, so I hope they prove useful to you as well.

Stunnel, the universal SSL wrapper

Stunnel is a program, available on both UNIX® and Windows®, that lets you encrypt arbitrary TCP connections inside SSL (Secure Sockets Layer). With Stunnel you can secure non-SSL-aware daemons and protocols (POP, IMAP, LDAP) by having Stunnel provide the encryption, requiring no changes to the daemon's code. The source code (available under the GNU GPL) is not a complete product; you still need a functioning SSL library to compile Stunnel, which means that Stunnel can only support what your SSL library supports unless you change the code.

OpenSSH is a free version of the SSH protocol suite of network-connectivity tools; it encrypts all traffic (including passwords) to effectively eliminate the eavesdropping, connection hijacking, and other network-level attacks that can occur when a user password is transmitted across the Internet unencrypted (as happens with telnet, rlogin, ftp, and similar programs). OpenSSH also provides many secure tunneling abilities and a variety of authentication methods. OpenSSH -- which is primarily developed by the OpenBSD Project and supports SSH protocol 1.3, 1.5, and 2.0 -- includes ssh (replacing rlogin and telnet), scp (replacing rcp), sftp (replacing ftp), sshd (the server side of the package), and such basic utilities as ssh-add, ssh-agent, ssh-keysign, ssh-keyscan, ssh-keygen, and sftp-server.

Tunneling, also known as port forwarding

Tunneling a network connection (also known as port forwarding, although technically the two are not exactly the same) is a common technique in computing environments. To tunnel a connection is to redirect it through one or more intermediate points (which are not necessarily remote machines). The tunneling application usually doesn't know what data it carries, and the program that uses the tunnel doesn't know that its connection is being tunneled. Common UNIX tunneling applications are stunnel and OpenSSH, among others. (OpenSSH has other uses of course, but tunneling fits nicely with them.) Any program that can use a network port can use a tunneled network port as well. Think of it as a telephone connection that can go next door just as easily as it can go across the world.

SSH implementations other than OpenSSH also support tunneling, but because of the popularity and unrestricted availability of OpenSSH, the discussion and examples in this article use it.

In restrictive environments, tunneling can be very useful. Say you are not allowed to connect to an outside IMAP server because your firewall is configured to restrict that activity. You can run OpenSSH in port-forwarding mode so it forwards port 143 (IMAP) on the outside IMAP server to port XYZ on your local machine. You don't have to log in to the IMAP server itself to do this! You just need an intermediate jump point that accepts SSH connections and can in turn connect to the IMAP server. Of course, if you can SSH into the IMAP server directly, the connection will be faster and more reliable because the intermediate jump point will be eliminated.
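
The jump-point scenario described above can be sketched as a small helper that assembles the OpenSSH command line. This is only an illustration: the tunnel_command() function, its parameter names, and the example.com host names are hypothetical and are not part of ifrom.pl. Local port 2002 is borrowed from the listing later in this article.

```perl
use strict;
use warnings;

# Build an OpenSSH command line that forwards a local port to an IMAP
# server by way of a jump host.  All host names here are placeholders.
sub tunnel_command {
    my (%arg) = @_;
    my $local_port = $arg{local_port} || 2002;   # port the mail checker connects to
    my $imap_port  = $arg{imap_port}  || 143;    # standard IMAP port

    # -N: no remote command, -T: no terminal, -L: local port forward.
    # The IMAP host name is resolved on the jump host, not locally.
    return sprintf 'ssh -N -T -L %d:%s:%d %s',
        $local_port, $arg{imap_host}, $imap_port, $arg{jump_host};
}

print tunnel_command(
    jump_host => 'jump.example.com',
    imap_host => 'imap.example.com',
), "\n";
```

If you can SSH into the IMAP server directly, pass the same machine as jump_host and use 127.0.0.1 as imap_host; the forward then ends on the server itself.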

You can also tunnel from the IMAP server to your own machine, reversing the connection process.

Another use of tunneling is to encrypt the connection. When you tunnel with OpenSSH, you can use its cryptographic abilities to encrypt your IMAP traffic if your IMAP server doesn't provide a secure connection. You can even do it if the IMAP server provides a secure connection already, so you'll encrypt the connection twice. Go wild.

To provide tunneling, I added a -tunnel option (TUNNEL in the configuration) to ifrom.pl. It's embarrassingly simple. In fact, my 10-month-old daughter could write a better one in half the time (if she didn't eat the keyboard first). When the option is present, ifrom.pl runs its value as a command with system().


Listing 1. Using the TUNNEL option to ifrom.pl

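# use -tunnel like so with OpenSSH:

# ifrom.pl -tunnel "ssh REMOTEHOST.COM -N -T -n -L 2002:127.0.0.1:143 &"
#          [other options to follow...]

# This forwards the REMOTE port 143 on REMOTEHOST.COM to the LOCAL
# port 2002.  See the OpenSSH documentation for details and more
# examples.  There is no intermediate jump point: 127.0.0.1 is resolved
# on REMOTEHOST.COM, so the SSH host is the IMAP server itself.

if ($config->TUNNEL())
{
 # the trailing & backgrounds the tunnel, so system() returns at once
 system($config->TUNNEL()) == 0
  or warn "Tunnel command failed to start: $?\n";
}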

Afterwards, I go on to do what ifrom.pl would have done otherwise. That is, I connect to the IMAP server and check for mail. That's why tunneling is handy -- the program that uses it doesn't have to do anything unusual because the magic is all in the data-transport layer. In my case, ifrom.pl had to be modified slightly, but the program's main logic was not touched.
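
Because the tunnel command is started in the background, the local port may not be listening yet when the IMAP connection is attempted. One way to handle that, sketched here as an assumption rather than as something ifrom.pl does, is to poll the local port briefly before connecting. The wait_for_port() helper is hypothetical; it uses only the core IO::Socket::INET module.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Return true once something accepts TCP connections on $host:$port,
# or false if nothing does after $tries attempts about a second apart.
sub wait_for_port {
    my ($host, $port, $tries) = @_;
    for my $attempt (1 .. $tries) {
        my $sock = IO::Socket::INET->new(
            PeerAddr => $host,
            PeerPort => $port,
            Proto    => 'tcp',
            Timeout  => 1,
        );
        if ($sock) {
            close $sock;     # we only wanted to know the port is open
            return 1;
        }
        sleep 1 if $attempt < $tries;
    }
    return 0;
}
```

With the tunnel from Listing 1, a call such as wait_for_port('127.0.0.1', 2002, 10) or die "Tunnel did not come up\n"; would sit between the system() call and the IMAP connection.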


Maildir support

Maildir is a mail storage format. It is used by Courier IMAP and qmail, among others, to store users' mail. A Maildir consists of a single directory with the subdirectories "cur," "new," and "tmp." There can be other subdirectories, but they are ignored. I will call the Maildir directory "the Maildir" for simplicity.

Of the three subdirectories mentioned in the previous paragraph, only "new" interests us directly. In each subdirectory, Maildir stores messages, one per file. The files are named uniquely.

The "new" subdirectory contains all the messages that are new to the Maildir. The semantics of what's new depend on the particular mail-delivery agent (MDA) and other programs that use the Maildir. For instance, Courier IMAP marks seen messages by moving them out of the "new" subdirectory of the Maildir.

First, I added the -maildir switch to ifrom.pl. It's just a scalar that tells me not to try an IMAP connection and to do the Maildir logic instead.

The Maildir processing is done with a glob() call, which expands the wildcard pattern "Maildir/new/*" into a list of file names. (Since Perl 5.6, glob() is implemented inside Perl by the File::Glob module, so no external shell is started.) I could have used opendir() and readdir() as well, but the glob() call is much simpler.
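
For comparison, here is roughly what the opendir() and readdir() alternative could look like. It is a sketch, not code from ifrom.pl, and the new_messages() name is made up for the illustration. Like a "*" glob, it skips dot files and returns the names sorted.

```perl
use strict;
use warnings;

# List the message files in the "new" subdirectory of a Maildir using
# opendir()/readdir() instead of glob().
sub new_messages {
    my ($maildir) = @_;
    my $dir = "$maildir/new";

    opendir my $dh, $dir or die "Can't open $dir: $!";
    my @names = grep { !/^\./ } readdir $dh;    # drop ".", ".." and dot files
    closedir $dh;

    # Keep plain files only and return full paths in name order.
    my @files = grep { -f $_ } map { "$dir/$_" } @names;
    return sort @files;
}
```

The extra bookkeeping (opening the handle, filtering dot entries, rebuilding full paths) is what the one-line glob() call avoids.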

For each file found with the glob() call, I grab the sender and subject and print them. Then, if the -dump or -print switches are given, I print the requested messages whole by printing the corresponding files one by one. The order of the files is the order in which glob() returns them, which is sorted by name. I don't do a date sort because Maildir file names written by qmail begin with the delivery time, so name order has matched date order well enough. With other mail-delivery agents, a date sort may be necessary.
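
If a date sort does become necessary, one possible approach is to sort on the numeric delivery time at the start of each file name. The sketch below assumes the qmail-style naming in which the name begins with the delivery time in seconds; both function names are invented for the example. A numeric sort also handles names whose timestamps have different numbers of digits, which a plain name sort gets wrong.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Extract the leading delivery timestamp from a Maildir file name,
# or 0 if the name does not start with digits.
sub delivery_time {
    my ($path) = @_;
    return basename($path) =~ /^(\d+)/ ? $1 : 0;
}

# Sort message paths oldest first; ties fall back to name order.
sub sort_by_delivery {
    my @paths = @_;
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] }
           map  { [ delivery_time($_), $_ ] } @paths;
}
```

In the Maildir loop, this would wrap the glob() call: foreach my $file (sort_by_delivery(glob(...))).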


Listing 2. Maildir support in ifrom.pl

if ($config->MAILDIR())
{
 my $count = 0;
 foreach my $file (glob($config->MAILDIR() . '/new/*'))
 {
  # another mail client may move a file away between glob() and open()
  open my $fh, '<', $file
   or do { warn "Can't open $file: $!\n"; next; };
  $count++;

  my $address = 'UNKNOWN';
  my $subject = 'UNKNOWN';

  # read only the headers, which end at the first empty line
  while (<$fh>)
  {
   $address = $1 if m/^From:\s*(.*)/i;     # the sender of the message
   $subject = $1 if m/^Subject:\s*(.*)/i;  # the subject of the message
   last if m/^\r?$/;
  }
  printf "%5d %-35.35s %s\n", $count, $address, $subject;

  if ($config->DUMP || grep {$_ == $count} @{$config->PRINT})
  {
   seek $fh, 0, 0;                         # rewind to print the whole file
   print MARKER();
   print while <$fh>;
   print MARKER();
  }
  close $fh;
 }
}


Importing mail

As a mail-server administrator, I have had to migrate users' mail from the so-called mbox format (one big file with messages in it) to the IMAP server. Because ifrom.pl already had the necessary IMAP logic, I added the -import switch and used the Mail::Box module to handle reading mail from an mbox file.

This is a slow method because every message has to be moved over a network connection. For true batch imports of many users, you may have to consider a more sophisticated method that's custom-tailored to your site. For example, if your site uses Maildirs for mail storage and you are migrating from the mbox format, just use an mbox-to-Maildir converter such as safecat (safecat has many other uses, by the way). See the safecat Web site for a recipe on mbox-to-Maildir conversion (see Resources).

The auto_mailbox parameter to -mailbox (stored in the MAILBOX_AUTO constant) tells ifrom.pl to pick a mailbox name based on the name of the file. I use the basename() function on the value of the -import switch, combined with the PREFIX parameter. If auto_mailbox is not specified, then whatever -mailbox says is what the new mailbox will be called. So, -import A/B/C/FILE -mailbox auto_mailbox will create and populate an IMAP folder called "FILE," while -import A/B/C/FILE -mailbox XYZ will create and populate a folder called "XYZ."

Furthermore, if a file called "XYZ.msf" (XYZ can be anything, .msf is the Mozilla mbox index filename extension) is seen, then just "XYZ" is used as the filename. That lets you do something like find DIRECTORY -name "*.msf" -exec ifrom.pl -import {} ... \; which passes the "find" results to ifrom.pl one by one. It will find all the Mozilla index files and then ifrom.pl will import the corresponding mailboxes one by one.
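
The naming rules from the last two paragraphs can be condensed into one small function. This is a sketch under stated assumptions: import_target() is a hypothetical name, the switch values are passed in as plain arguments rather than read from the configuration object, and an explicit -mailbox value is returned unchanged.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

use constant MAILBOX_AUTO => 'auto_mailbox';

# Given the values of -import, -mailbox, and the prefix, return the mbox
# file to read and the IMAP folder to fill.
sub import_target {
    my ($import, $mailbox, $prefix) = @_;

    # A Mozilla index file "XYZ.msf" stands for the mbox file "XYZ".
    (my $file = $import) =~ s/\.msf$//i;

    my $box = $mailbox eq MAILBOX_AUTO()
        ? $prefix . basename($file)    # name the folder after the file
        : $mailbox;                    # use the name given on the command line
    return ($file, $box);
}

my ($file, $box) = import_target('A/B/C/FILE.msf', 'auto_mailbox', 'INBOX.');
print "$file -> $box\n";    # prints "A/B/C/FILE -> INBOX.FILE"
```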

PREFIX tells ifrom.pl the IMAP server's prefix. Both the prefix and the separator are actually available from the IMAP server with the namespace() function, but the logic for obtaining them was too complex for a simple script such as ifrom.pl. As an example, the common UW IMAP prefix and separator are "" and "/", so mailboxes look like "a/b/c" (note the similarity to file and directory names). The Courier IMAP server prefix and separator are "INBOX." and ".", so mailboxes look like "INBOX.a.b.c" (note the flat structure -- Courier folders are all under the top level Maildir, without subdirectories).
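
To make the two naming schemes concrete, here is a minimal sketch that builds a mailbox name from a prefix, a separator, and the folder path components. The mailbox_name() function is hypothetical; the prefix and separator values are the two examples given above.

```perl
use strict;
use warnings;

# Join folder path components into a full IMAP mailbox name.
sub mailbox_name {
    my ($prefix, $separator, @parts) = @_;
    return $prefix . join($separator, @parts);
}

print mailbox_name('',       '/', qw(a b c)), "\n";   # UW IMAP:  a/b/c
print mailbox_name('INBOX.', '.', qw(a b c)), "\n";   # Courier:  INBOX.a.b.c
```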

The -dryrun option to ifrom.pl is very useful. It lets you see everything ifrom.pl would do while importing, except the actual import doesn't happen.


Listing 3. Importing mail

elsif ($config->IMPORT)
{
 # eval reports a failed require in $@, not $!, so test its return value
 eval { require Mail::Box::Manager; 1 }
  or die "You need to install the Mail::Box module, exiting";

 my $file = $config->IMPORT;
 if ($file =~ m/^(.*)\.msf$/i)
 {
  $file = $1;
  print "MSF file detected, using $file as the file name\n";
 }
 die "Can't access import file $file" unless -r $file;

 if ($config->MAILBOX eq MAILBOX_AUTO)
 {
  $box = $config->PREFIX . basename($file);
 }

 my $mgr    = Mail::Box::Manager->new;
 my $folder = $mgr->open(folder => $file);
 my $i;
 if ($config->DRYRUN)
 {
  print "Skipping folder check and creation because a dry run was requested\n";
 }
 else
 {
  if ($imap->select($box))
  {
   print "Selected folder successfully, ready to import.\n"
    if $config->VERBOSE;
  }
  else
  {
   print "Could not select folder $box, trying to create it...\n";
   $imap->create($box)
    or die "Could not create import folder: $@\n";
  }
 }
 # Iterate over the messages.
 foreach ($folder->messages)
 {
  printf "Appending to mailbox %s, message %d: ID %s, %d lines\n",
   $box, ++$i, $_->messageId, $_->nrLines;
  if ($config->DRYRUN)
  {
   print "Skipping import because a dry run was requested\n";
  }
  else
  {
   $imap->append($box, $_->string)
    or warn "Could not append message $i to $box: $@\n";
  }
 }
}


Other improvements and notes

I wrote a help section. It's short and informative and you can bring it up with ifrom.pl -help.

I added a MARKER() constant because the "\n===\n\n" divider was used frequently.
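
A constant like this is typically declared with the core constant pragma. The sketch below shows the idea; the framed() helper is made up for the example and is not taken from ifrom.pl.

```perl
use strict;
use warnings;

# The divider printed before and after each dumped message.
use constant MARKER => "\n===\n\n";

# Wrap a piece of text in the divider, as the dump code does.
sub framed {
    my ($text) = @_;
    return MARKER() . $text . MARKER();
}

print framed("Subject: test\n");
```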

For Maildirs, I did not add all the fancy code that IMAP has, such as the -backup and -delete_mailbox_really switches. Because a Maildir is local, meaning you can access its data through the locally mounted filesystems, backups and mailbox deletion are simple shell operations. For example, to back up a Maildir, follow this example:


Listing 4. Backing up a local Maildir

rsync -avP --delete MaildirLocation Destination
# so, for example, to back up /var/qmail/maildirs/tzz
# to /home/tzz/backups (with no trailing slash on the source,
# the copy ends up in /home/tzz/backups/tzz)
rsync -avP --delete /var/qmail/maildirs/tzz /home/tzz/backups

Read the rsync documentation -- it's a great tool for backups and directory synchronizations in general. If you don't use it, you are missing out on one of the best UNIX tools available today.

The logic of the Maildir and IMAP loops may seem similar, but the two were not similar enough to be merged. Merging them would have made the program logic more confusing and gained nothing. If another mail source were added to ifrom.pl, it might make sense to merge all the loops into one.

I added the ability to call -print with just numbers from the command line, so instead of ifrom.pl -print 1 -print 2 you can just say ifrom.pl 1 2 -- much better for the user, I think. The code is very simple; just make sure you call $config->args() first.


Listing 5. Take command-line arguments into the -print switch

# all non-switch command-line arguments are implied -print requests
if (scalar @ARGV)
{
 $config->print($_) foreach @ARGV;
}
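
The same idea can be tried in isolation with the core Getopt::Long module. Note that this is a stand-in chosen for the illustration, not the configuration object that ifrom.pl uses, and print_requests() is a hypothetical name. Switches are parsed first, and whatever remains is treated as an implied -print request.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Return the list of message numbers requested on a command line,
# whether given as "-print N" or as bare arguments.
sub print_requests {
    my @args = @_;
    my @print;

    # Parse the switches first; recognized options are removed from @args.
    GetOptionsFromArray(\@args, 'print=i@' => \@print)
        or die "Bad command line\n";

    # All non-switch command-line arguments are implied -print requests.
    push @print, @args;
    return @print;
}

print join(' ', print_requests(qw(-print 1 -print 2))), "\n";   # prints "1 2"
print join(' ', print_requests(qw(1 2))), "\n";                 # prints "1 2"
```

The ordering matters for the same reason as in ifrom.pl: until the switches have been parsed and removed, the argument list still contains them.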


Conclusion

ifrom.pl is one of the most useful tools I have ever written. It's simple and fast, and it can do everything I need. I hope you enjoyed reading this article, and that ifrom.pl will serve you well. Also, I hope that the knowledge you have gained about IMAP and Maildirs will be useful in the future.

Feel free to explore ifrom.pl and to send me your suggestions for improvements.


Resources

  • "Use IMAP with Perl, Part 1" (developerWorks, June 2003) introduces you to accessing IMAP with the Mail::IMAPClient CPAN module and the ifrom utility.

  • Get the ifrom.pl script referenced in this article.

  • Read all of Ted's Perl articles in the Cultured Perl series on developerWorks.

  • Visit CPAN, the CPAN module archive.

  • Find out more about Mail::IMAPClient at the Mail::IMAPClient manual pages.

  • IMAP4rev1 RFC 2060 is the document where today's version of the IMAP protocol was defined. This is required reading.

  • TAoUP, or The Art of Unix Programming, is a great book by Eric Raymond that can help you understand why the -tunnel option is so simple yet so powerful in a UNIX environment.

  • safecat is an excellent tool that can be used to convert mboxes to Maildirs.

  • Stunnel is a program that allows you to encrypt arbitrary TCP connections inside SSL (Secure Sockets Layer), available on both UNIX and Windows.

  • OpenSSH is a free version of the SSH protocol suite of network-connectivity tools that encrypts all traffic (including passwords) to effectively eliminate eavesdropping, connection hijacking, and other network-level attacks.

  • Find more resources for Linux developers in the developerWorks Linux zone.

  • Get involved in the developerWorks community by participating in developerWorks blogs.

  • Browse for books on these and other technical topics.

  • Order the SEK for Linux, a two-DVD set containing the latest IBM trial software for Linux from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

  • Innovate your next Linux development project with IBM trial software, available for download directly from developerWorks.
