Using cron to automate maintenance

The cron subsystem schedules tasks to run any hour of the day or night, making regular upkeep a breeze

Martin Streicher (martin.streicher@gmail.com), Chief Technology Officer, McClatchy Interactive
Martin Streicher is a freelance Ruby on Rails developer and the former Editor-in-Chief of Linux Magazine. Martin holds a Masters of Science degree in computer science from Purdue University and has programmed UNIX-like systems since 1986. He collects art and toys. You can reach Martin at martin.streicher@gmail.com.
(An IBM developerWorks Contributing Author)

Summary:  To leverage round-the-clock computing, tasks must run at all hours of the day. You could punctuate your sleep with waking interludes to log in and run this command or that command on dozens of machines, or you can enjoy your forty winks and turn the work over to the ubiquitous cron, a daemon, or perennial process, to execute commands on a schedule. From very often to every so often, cron happily minds the clock and runs jobs day or night. Learn how to configure and maintain cron, and discover just some of its many uses.

Date:  07 Oct 2008
Level:  Intermediate

Alternatives to cron

As useful as cron is, there are a couple of alternatives that you should be aware of.

Anacron

If your system is often off or in hibernation (for example, if you use a UNIX laptop), consider adding anacron to your system. Anacron is similar to cron in that it schedules jobs to run in the future; but unlike cron, anacron runs jobs even if the job's scheduled time has passed.

For example, if you scheduled a file-system backup to run Sunday but the system is switched off from Friday to Monday, anacron runs the Sunday job as soon as the system is reactivated on Monday. In contrast, cron merely checks whether a job is to run right now; hence, if the system is off when the job is scheduled, the job doesn't run.

Anacron has far fewer scheduling options than cron. It can only schedule jobs in whole-day intervals, such as one, seven, or 30 days, but it's the better choice for jobs that must run regularly and reliably on a machine that isn't always on.

Also, anacron is not a daemon: something else, typically cron or a boot script, must launch it. Each time anacron runs, it reads its own configuration file, in which each entry names a job and gives its period in days. If a job hasn't run within its period, anacron runs the job and records the date the job ran. When all jobs finish running, anacron exits.
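
The catch-up rule just described can be sketched in a few lines of Python. This is only an illustration of the idea, not anacron's actual code: the names is_due and run_pending are hypothetical, and the timestamps dictionary stands in for the per-job records that the real anacron keeps on disk.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple


def is_due(last_run: Optional[date], period_days: int, today: date) -> bool:
    """Return True if the job has not run within its period."""
    if last_run is None:
        # A job that has never run is always due.
        return True
    return (today - last_run).days >= period_days


def run_pending(jobs: List[Tuple[str, int]],
                timestamps: Dict[str, date],
                today: date) -> List[str]:
    """Pretend to run every due job and record the date it ran.

    jobs is a list of (name, period_days) pairs. The function returns the
    names of the jobs that were due and updates timestamps in place.
    """
    ran = []
    for name, period_days in jobs:
        if is_due(timestamps.get(name), period_days, today):
            ran.append(name)          # a real scheduler would execute the job here
            timestamps[name] = today  # remember when the job last ran
    return ran


# A weekly backup last ran on Sunday, 28 September 2008. The machine is off
# over the following weekend and comes back on Monday, 6 October.
stamps = {"backup": date(2008, 9, 28)}
print(run_pending([("backup", 7)], stamps, date(2008, 10, 6)))
```

Because the check compares the current date with the date of the last run, rather than with a fixed point on the clock, a job missed while the system was off becomes due as soon as the check runs again.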

Anacron is available on most Linux distributions, but you can also easily download and build the source code yourself. Visit the anacron project page to get the latest release.

Anacron's primary configuration file is /etc/anacrontab. You can set environment variables just as you do in a crontab, but the job entries are simpler:

SHELL=/bin/zsh
PATH=/usr/bin:/bin:/usr/local/bin
# format: frequency delay name job
1 10 day-to-day daily.chores.sh 

The first number is the period in days, so 1 means run once every day. A 7 would mean run once every seven days, and so on. The second number is the delay, which is the number of minutes to wait after anacron launches before starting this job. Giving each job a distinct delay prevents all jobs from starting at the same time. The name day-to-day is just a helpful nickname. The rest of the line specifies the job; here, the shell script daily.chores.sh, found in one of the directories named in PATH, runs every day.
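
To make the four-field layout concrete, here is a small Python sketch that splits the sample file above into environment settings and job entries. The Job type and the parse_anacrontab function are hypothetical names used for illustration, and the sketch handles only numeric periods and simple NAME=value lines; it is not how anacron itself reads the file.

```python
from typing import Dict, List, NamedTuple, Tuple


class Job(NamedTuple):
    period_days: int     # run once every this many days
    delay_minutes: int   # minutes to wait after anacron starts
    name: str            # nickname for the job
    command: str         # the rest of the line


def parse_anacrontab(text: str) -> Tuple[Dict[str, str], List[Job]]:
    """Split anacrontab-style text into environment settings and jobs."""
    env: Dict[str, str] = {}
    jobs: List[Job] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 3)  # at most four fields; the command keeps its spaces
        if len(fields) == 4 and fields[0].isdigit() and fields[1].isdigit():
            jobs.append(Job(int(fields[0]), int(fields[1]), fields[2], fields[3]))
        elif "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env, jobs


SAMPLE = """\
SHELL=/bin/zsh
PATH=/usr/bin:/bin:/usr/local/bin
# format: frequency delay name job
1 10 day-to-day daily.chores.sh
"""

env, jobs = parse_anacrontab(SAMPLE)
print(env["SHELL"], jobs[0].name, jobs[0].command)
```

Splitting on whitespace at most three times keeps any arguments in the command field intact, which matters for jobs that are more than a bare script name.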

Anacron has good documentation in the form of man pages, and you can find very good tips on anacron on the Web. (Check out Rod Smith's Linux Magazine article, which I edited in October 2007.) Anacron is ideal for UNIX road warriors or for any systems administrator who wants a bit of extra insurance.


Launchd: A modern alternative to cron

Cron is certainly a capable and venerable utility, as evidenced by its widespread usage. Recent additions in Vixie cron, such as the @reboot shorthand, make it even easier to administer. However, cron does have some shortcomings:

  • Although cron jobs are defined in crontab files, you cannot start and stop a cron job from the command line. Moreover, you cannot create an ad hoc job at the command line and submit it to the calendar.
  • Cron does not enforce resource limits. A job run as root can consume unlimited CPU cycles and memory. You may, instead, want to rein in a job so that it cannot interfere with other cron jobs or with the overall quality of system operations.
  • Cron jobs adhere rigidly to a schedule. There is no way, for instance, to have a job launch only when an event occurs, such as the creation of a file.
  • In a larger context, UNIX-like systems have many core components capable of launching other programs on demand, including cron, xinetd (or inetd) for networking daemons, and init, the progenitor of all system processes. Each core component has its own set of configuration files, making it difficult to know which components to tailor to make a change.

To address these shortcomings, Apple Computer created a unified launch facility, aptly named launchd, to start processes on boot, on demand, and at specified intervals. In fact, launchd replaced cron (and init and several other system utilities used to boot and initialize the system) in Mac OS X 10.4 Tiger. (Apple left cron on the system, though, as a convenience and because Vixie cron has more flexible scheduling options.) Indeed, the phenomenal boot speed of Mac OS X can be attributed to launchd: It enumerates what to launch at boot but executes the programs only when first needed.

The code for launchd is available as open source from its home page on Mac OS Forge. To date, launchd has been ported to FreeBSD but not to other UNIX or Linux systems. However, various projects are actively implementing the equivalent of launchd, so a brief survey of its features is worthwhile:

  • Rather than create a job to poll a directory for new files, launchd can automatically monitor a directory for new files or monitor an empty directory for any files and launch your job on demand. Launchd does not poll; instead, it uses the kqueue facility to have the kernel alert it when a directory is changed. (Linux has a similar event facility called inotify, which will be covered in a separate developerWorks article in the coming months.)
  • If specified, launchd uses chroot to confine your job to a new root directory. Pronounced "cha-root," chroot is a system call that changes the directory a process sees as the root directory (/). Thus, if you use chroot to confine a job to /opt/root, all files outside /opt/root are inaccessible to it (after all, /opt/root is now /, the top-level directory of the file system), and all directories within /opt/root become top-level directories. You most commonly use chroot to secure jobs so that code cannot wander into the larger file system to wreak havoc.
  • You can set resource limits for a job. Resources you can constrain include memory, stack size, and the maximum number of open files.
  • When a task is defined and loaded into launchd, you can start and stop the job by name from the command line.
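
To give a feel for what a per-job resource limit means at the system level, the following Python sketch starts a child process with a lowered limit on open files. It is a rough, UNIX-only illustration built on the standard setrlimit mechanism, not a description of how launchd is implemented; the function name run_limited and the limit of 64 are arbitrary choices made for the example.

```python
import resource
import subprocess
import sys


def run_limited(argv, max_open_files):
    """Run argv as a child process with a reduced soft limit on open files."""

    def apply_limits():
        # Runs in the child after fork() and before exec(), so only the
        # job itself is constrained, not the process that launches it.
        _soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        new_soft = max_open_files
        if hard != resource.RLIM_INFINITY and new_soft > hard:
            new_soft = hard  # the soft limit may never exceed the hard limit
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))

    return subprocess.run(argv, preexec_fn=apply_limits,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # The child simply reports the soft limit it was started with.
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"
    result = run_limited([sys.executable, "-c", probe], 64)
    print(result.stdout.strip())
```

A small wrapper along these lines is one way to rein in a job started from cron, which applies no limits of its own; memory and stack size can be limited through the same call with resource.RLIMIT_AS and resource.RLIMIT_STACK.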

Launchd is made of three components: the launchd daemon itself; the launchctl utility used to add, alter, and remove jobs and otherwise control launchd; and one or more configuration files, where each file defines one or more jobs. Given its origin on Mac OS X, launchd configuration files are simply property list (plist) files, which can be expressed as Extensible Markup Language (XML).

Briefly, here is how you would use launchd on Mac OS X—say, to monitor a directory for incoming files and run a job on demand:

  1. Create a property list file to express the job and all its attributes.

    You can use the Mac's Property List Editor, or you can edit the XML by hand. In either case, the resulting file looks something like Listing 1.



    Listing 1. A sample launchd job to monitor a file system directory for changes
    	
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
    	<key>Label</key>
    	<string>com.example.processor</string>
    	<key>OnDemand</key>
    	<true/>
    	<key>Program</key>
    	<string>/Users/strike/bin/processor</string>
    	<key>ProgramArguments</key>
    	<array>
    		<string>processor</string>
    	</array>
    	<key>WatchPaths</key>
    	<array>
    		<string>/Users/strike/data/incoming</string>
    	</array>
    </dict>
    </plist>
    

    In a nutshell, this file runs the program found in /Users/strike/bin/processor whenever the contents of the directory /Users/strike/data/incoming change. Setting OnDemand to true tells launchd to start this job only as needed. Save the file to ~/Library/LaunchAgents/com.example.processor.plist.

  2. Load the job into launchd with launchctl:
    % launchctl load ~/Library/LaunchAgents/com.example.processor.plist
    

    If you want to verify the last operation or see your list of saved jobs, simply type launchctl list.

  3. To remove a job, again use launchctl with unload:
    % launchctl unload -w ~/Library/LaunchAgents/com.example.processor.plist
    

    What does -w do? It marks the job as disabled so that launchd does not load it again. Without it, the job is unloaded only for the current session and re-loads automatically at the next login (because the job is in the per-user collection of launch agents).
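
If you would rather generate a file like Listing 1 than type it by hand, Python's standard plistlib module can write the XML for you. The sketch below is one possible approach, not part of launchd or its tools: the helper name build_watch_job is hypothetical, and the keys and paths are simply those from Listing 1.

```python
import plistlib


def build_watch_job(label, program, watch_paths):
    """Return XML property list bytes for a job that runs when a path changes."""
    job = {
        "Label": label,
        "OnDemand": True,
        "Program": program,
        # As in Listing 1, the first argument is the program's own name.
        "ProgramArguments": [program.rsplit("/", 1)[-1]],
        "WatchPaths": list(watch_paths),
    }
    # sort_keys=False keeps the keys in the order written above.
    return plistlib.dumps(job, fmt=plistlib.FMT_XML, sort_keys=False)


if __name__ == "__main__":
    xml = build_watch_job(
        "com.example.processor",
        "/Users/strike/bin/processor",
        ["/Users/strike/data/incoming"],
    )
    print(xml.decode("utf-8"))
```

Write the result to ~/Library/LaunchAgents/com.example.processor.plist, and load it with launchctl exactly as in step 2.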

The launchd man pages have lots of information, and if you're a Mac OS X user, you can find any number of applications for launchd. Hopefully, some clever developer will port launchd more widely.
