The typical UNIX® administrator has a key range of utilities, tricks, and systems he or she uses regularly to aid in the process of administration. There are key utilities, command-line chains, and scripts that are used to simplify different processes. Some of these tools come with the operating system, but a majority of the tricks come through years of experience and a desire to ease the system administrator's life. The focus of this series is on getting the most from the available tools across a range of different UNIX environments, including methods of simplifying administration in a heterogeneous environment.
Scheduling over in-time management
The typical system administrator spends a lot of time doing repetitive tasks. At least they will if they don't have a task scheduling system that automatically runs various tasks for them at suitable points in time.
Classic examples range from daily tasks, like running a backup, to weekly and monthly tasks, like clearing up logs and generating reports, along with a host of other commands across a wide range of different situations.
There are also other tasks that you might want to run at particular intervals, such as commands that monitor a list of the currently running processes, or the current disk usage, all of which can be used to help diagnose and identify problems in the event of a failure or other issue. Alternatively, there might be commands that you want to execute at a specific time when you will not be available. For example, you might want to power off a machine at night in preparation for work the next day, even though you probably won't want to wait around until midnight just for the purpose of switching off a machine.
Solutions are available for all of these situations, but before you leap into performing these operations, you should be aware of some of the downsides and pitfalls of the scheduled approach.
Because everything is automated, one of the primary issues with scheduled execution is that if anything goes wrong or something unexpected happens, nobody is on hand to resolve the problem. You are also at the mercy of the system and its ability to execute certain tasks at particular times. There are some limitations to when a command can be executed, and handling unpredictable situations like "if X happens, then do Y, otherwise do Z" requires scripting experience and a lot of trial and error.
Scheduling can, however, save you a significant amount of time, so it is worth investigating the options.
The cron system handles all of the time-based scheduling of commands and provides two different solutions for running commands at a specific time. The at command schedules work for a specific time to be executed once. The crontab system enables you to specify a schedule for the execution of the command, either at specified times, on specific days, or a combination of the two.
The simplest way to use the at command is to type at and the time (and optionally date) you want the command to run. For example:
$ at 17:20
echo "It's 17:20!"
job 1 at Tue Apr 11 17:20:00 2006
Once you have entered the at command, it waits for you to enter commands to be executed at the specified time. You can enter as many commands as you like, and they will be executed as a shell script. To terminate the input, use the end-of-file character (generally Control-D).
The commands you type will be executed within a copy of the environment in which the at command was called. This means that your active PATH, library, and other environment settings will be recorded and used to execute the script you generate. The results will generally be emailed to you when the command completes.
When specifying the time, you can use the standard time format (as shown in the earlier example) and a variety of other shorthand techniques. If you specify a time, then the next occurrence of that time is used. For example, if it is 17:00 and you specify a time of 17:20, then the command will execute in 20 minutes. If you specify 09:00, then the command will be executed at 9 a.m. the next day.
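The "next occurrence" rule above is simple clock arithmetic, which can be sketched in shell. This is only an illustration of the rule, not how at itself is implemented; the function names to_minutes and minutes_until are hypothetical, and the case where both times are identical (which prints 0 here) is not covered by the description above.

```shell
#!/bin/sh
# Hypothetical helper: converts HH:MM into minutes since midnight.
to_minutes() {
    h=${1%%:*}
    m=${1##*:}
    # Strip one leading zero so "09" is not read as an octal number.
    h=${h#0}
    m=${m#0}
    echo $(( ${h:-0} * 60 + ${m:-0} ))
}

# Hypothetical helper: prints the number of minutes from NOW until the
# next occurrence of TARGET, wrapping past midnight when needed.
# Usage: minutes_until NOW TARGET   (both as HH:MM)
minutes_until() {
    now=$(to_minutes "$1")
    target=$(to_minutes "$2")
    echo $(( (target - now + 1440) % 1440 ))
}
```

For the two cases in the text, minutes_until 17:00 17:20 prints 20, and minutes_until 17:00 09:00 prints 960 (16 hours, that is, 9 a.m. the next day).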
For special alternatives, you can generally use:
- midnight: 12:00 a.m. (00:00)
- noon: 12:00 p.m.
- now: immediate execution
You can also use today and tomorrow. Some environments (BSD and Linux®) might also support options that add time to a specification by adding a number of minutes, hours, days, weeks, months, and years. For example, you could schedule a job to run at the current time, but a week later, using:
$ at now + 1 week
This specification is useful if you want to re-schedule a job for a period after the previous execution. For example, you might run a report that takes a few hours to run, but you want the report to run again a week later.
To obtain a list of the currently scheduled jobs, use the -l command line option:
$ at -l
1 Tue Apr 11 17:20:00 2006
2 Wed Apr 12 09:00:00 2006
The number in the output is the job ID. Unfortunately, this list does not tell you what each job will actually do.
You can remove scheduled jobs by using the -r option and specifying the job number as generated when the job was first submitted, or from the list of jobs shown when listing the entries in the scheduler. For example, to delete the job that will run at 09:00 on Wed April 12th in the example above, you would use:
$ at -r 2
Note that most systems will not provide any visual indication that the job has been removed from the queue, so you might need to list the jobs again to ensure the job has been cancelled:
$ at -l
1 Tue Apr 11 17:20:00 2006
Regular execution is handled by setting up a cron table (called a crontab) that defines the interval and sequence for each command. The format of the file is a single line for each command (with six fields):
minute hour day month dayofweek command
The time specification uses numbers according to the following rules:
- Minute: 0-59
- Hour: 0-23
- Day of month: 1-31
- Month: 1-12
- Day of week: 0-6 (where 0 is Sunday)
For any field, you can specify either a single number, a comma-separated list of numbers, or the asterisk, which indicates that any value should match.
The time specification leads to the command being executed whenever the current time matches. For example, with a specification of: 0 * * * * do-something, the command would be executed whenever the current time has a minutes value of 0 (that is, every hour, on the hour).
The specification: 0 23 * * * do-something would run the command at 11 p.m. each evening.
If you specify multiple values, then these also match the appropriate time. For example, to execute a command every 15 minutes, try:
0,15,30,45 * * * * do-something
Or you could specify that command runs every six hours, but only Monday through Friday, using:
0 0,6,12,18 * * 1,2,3,4,5 do-something
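The matching rules described above can be sketched as a small shell function. This only illustrates how a single field is compared against the current time, and is not how cron itself is implemented; the function name field_matches is hypothetical, and only the three forms covered so far (asterisk, single number, comma-separated list) are handled.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if VALUE matches a
# crontab field SPEC, where SPEC is "*", a single number, or a
# comma-separated list of numbers.
# Usage: field_matches SPEC VALUE
field_matches() {
    spec=$1
    value=$2

    # An asterisk matches any value.
    [ "$spec" = "*" ] && return 0

    # Split the specification on commas and compare each item numerically.
    old_ifs=$IFS
    IFS=,
    for item in $spec; do
        if [ "$item" -eq "$value" ]; then
            IFS=$old_ifs
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

For the specification 0 0,6,12,18 * * 1,2,3,4,5 at 06:00 on a Tuesday (day of week 2), the minute, hour, and day-of-week checks all succeed: field_matches 0 0 && field_matches 0,6,12,18 6 && field_matches 1,2,3,4,5 2.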
You can have as many lines in your crontab as you like, and the same command can be referenced multiple times if you want schedules that would be otherwise difficult to define. For example, to run a command Monday through Thursday at 6 p.m., but lunchtime on Friday, you would have to use two lines:
0 18 * * 1,2,3,4 do-something
0 12 * * 5 do-something
You should be careful with the first two fields (minutes and hours); leaving these unspecified (using the asterisk) results in the command running every minute when the rest of the specification matches. For example, a common mistake is to want to run a command on the first of the month and use:
* * 1 * * do-something
The problem is that the above specification actually runs the command for every minute of the first day of the month. If you want the command to run only once, you must specify both the minute and hour when the command should execute:
0 12 1 * * do-something
Omitting the minutes will at the very least cause the command to run every minute for the specified hours (and date).
Even with all this flexibility, there are still occasions when the times you want to execute a command are difficult, and even impossible, using the crontab system.
The problem with cron is that although all of the different options available to you in crontab provide a wide range of different possibilities, there are some irritating limitations.
For example, running a command or script on the last day of the month is difficult within cron, as there is no straightforward way to specify that information. Instead, you must list the last day for each group of months individually. For example, in a non-leap year, you would need all three of these lines:
59 23 31 1,3,5,7,8,10,12 * do-something
59 23 30 4,6,9,11 * do-something
59 23 28 2 * do-something
The above example manually selects the last day of each month, but managing three rows can be cumbersome, and you would have to manually modify the crontab definition in a leap year to ensure that the information was calculated on the right date.
The solution is to perform the date check in the shell, rather than in cron. To achieve this, use cal, which outputs the calendar for the current month, and awk to determine what the last day of the month is. If you run the following command, you should obtain the last day of the month:
$ echo `cal`|awk '{print $NF}'
The command works by passing the calendar through the echo command (which collapses the normal multiline output into a single line); awk then prints the final field on that line, which is the last day of the current month.
To use it within a crontab, you would use:
59 23 * * * [ `echo \`cal\`|awk '{print $NF}'` -eq `date +\%d` ] && do-something
The square brackets initiate a test within the shell used to run the command; note that the brackets must be separated from the rest of the test by spaces, and that the whole entry must be on a single line. Also note that cron treats an unescaped % sign as a newline, so it must be escaped with a backslash when used in crontab. The first part of the test is the trick demonstrated earlier, and the second uses the date command to output the current day. The && ensures that the command on its right is only executed if the test on its left returns true.
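The leap-year problem described earlier can also be handled with plain shell arithmetic instead of parsing the output of cal. The following is a minimal sketch of that alternative, intended for use inside a wrapper script; the function names days_in_month and is_last_day are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the number of days in a month.
# Usage: days_in_month YEAR MONTH
days_in_month() {
    year=$1
    month=${2#0}    # strip a leading zero so "04" is treated as 4
    case $month in
        4|6|9|11)
            echo 30
            ;;
        2)
            # Leap years: divisible by 4, except centuries not divisible by 400.
            if [ $((year % 4)) -eq 0 ] && { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
                echo 29
            else
                echo 28
            fi
            ;;
        *)
            echo 31
            ;;
    esac
}

# Hypothetical helper: succeeds (exit status 0) only when the given
# date is the last day of its month.
# Usage: is_last_day YEAR MONTH DAY
is_last_day() {
    [ "${3#0}" -eq "$(days_in_month "$1" "$2")" ]
}
```

Inside a script (where the % sign needs no escaping), the check could be used as: is_last_day `date +%Y` `date +%m` `date +%d` && do-something.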
Another common request is to run a command only on a particular weekday in a particular week of the month. For example, you might want to run a report on the first Monday of each month, or the last Friday. To achieve this, you can use a similar process to the one above. For any particular day in a given week, it must fall within one of the following dates:
- Week 1: 1st to the 7th
- Week 2: 8th to the 14th
- Week 3: 15th to the 21st
- Week 4: 22nd to the 28th
To determine whether the current date is within a given range, for example, the fourth week, you would use a test like this:
[ `date +\%e` -gt 21 -a `date +\%e` -lt 29 ]
The %e is used to return a number for the day, where numbers less than 10 are prefixed with a space, rather than a zero, which ensures that numbers, not strings, are compared.
You can now combine this with a crontab definition that attempts to run the command every Friday:
59 23 * * 5 [ `date +\%e` -gt 21 -a `date +\%e` -lt 29 ] && do-something
The command will be run every Friday, but because the test will only return true in the fourth week of the month, the real command will be executed on the fourth Friday.
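The week ranges listed above follow a simple pattern, so the week number can be calculated rather than tested range by range. The following is a minimal sketch for use in a wrapper script; the function name week_of_month is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints which week of the month a day number
# falls in, using the ranges above (1-7 is week 1, 8-14 is week 2,
# and so on; days 29-31 are reported as week 5).
# Usage: week_of_month DAY
week_of_month() {
    day=${1# }      # drop the leading space that date +%e can produce
    day=${day#0}    # drop a leading zero, as produced by date +%d
    echo $(( (day - 1) / 7 + 1 ))
}
```

In a script run every Friday by cron, the fourth-Friday check then becomes: [ "$(week_of_month "$(date +%e)")" -eq 4 ] && do-something.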
Cron job execution environment
Although it is possible to change the environment used when executing cron jobs, it is often better to create a wrapper script that defines any of the environment variables -- such as PATH -- before running the command you actually need.
The reason is partly security; the more areas you open up to cron jobs, the more likely you are to end up including some that could contain suspicious content. The other reason is that it ensures that your cron jobs will execute even if you change one of the dependencies in the environment.
By using a separate wrapper script, you can also take advantage of the extensions and abilities of a different shell, rather than the standard Bourne shell often used to run most cron jobs.
Finally, using a separate wrapper script also allows you to define different environments for different commands. That can be particularly useful if you want to run commands as different users who might use different versions of the same application or tool.
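A wrapper of the kind described above can be very small. The following is a minimal sketch, not a recommended configuration: the function name run_in_clean_env is hypothetical, and the PATH and LANG values are assumptions that you would replace with whatever your command actually needs.

```shell
#!/bin/sh
# Hypothetical wrapper: runs the given command with an explicitly
# defined environment, leaving the caller's environment untouched.
# Usage: run_in_clean_env COMMAND [ARGS...]
run_in_clean_env() {
    (
        # Assumed values; adjust to suit the command being run.
        PATH=/usr/local/bin:/usr/bin:/bin
        LANG=C
        export PATH LANG
        "$@"
    )
}
```

Because the settings are made inside a subshell, different commands in the same script can each be given their own environment, for example run_in_clean_env do-something.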
Tricks for logging and recording output
By default, commands run by crontab that generate output (both to standard output and standard error) have the output emailed to the user for that job. However, this isn't always a convenient solution and, for some results, you might only want part of the output, or you might want to ignore standard output and only have errors reported. You might even want the output to be sent to a different user or email alias.
You can use redirection within a crontab specification to output information either to a specific file, or to ignore output from different sources. To simply log the output to a file, you might use:
1 1 * * * do-something >/var/logs/do-something.log
The above would overwrite the file on each run, so use the append operator (>>) if you want to keep a longer term record:
1 1 * * * do-something >>/var/logs/do-something.log
To ignore output, redirect to the special /dev/null device. For just standard output, try:
1 1 * * * do-something >/dev/null
For both standard output and errors, try:
1 1 * * * do-something >/dev/null 2>&1
If you want to collect logs organized by date, then use the date command in combination with a logfile specification, for example:
1 1 * * * do-something >/var/logs/something.`date +\%Y\%m\%d`.log
To pick and choose the output from a range of commands in a cronjob, or to create a custom email based on the content, use a wrapper script, writing the information you want to keep to a temporary file and ignoring the rest. You can then email the contents of the file to any user you want.
To create a temporary file, use the date and process ID to generate a unique filename, as shown here:
# Build a filename from today's date and the process ID of this script
LOGFILE=/tmp/`date +%Y%m%d`.$$.tempfile
# Capture both standard output and standard error
do-something >$LOGFILE 2>&1
# Mail the captured output, then clean up
mailx -s "Results of do-something report" reportees <$LOGFILE
rm -f $LOGFILE
Remember to delete the file once it has been sent to the appropriate person. In the above example, mailx, rather than mail, has been used to enable the setting of a subject.
By using a combination of crontab and the at command, you can specify the time or interval to execute just about any command you want. When using at, you can run a command, or script, at a given time just once. With crontab, you can specify an interval for the execution that can be as specific, or loose, as you like. But care should be taken to ensure that your commands run exactly when you want them to. Omitting minutes or hours can cause problems, or might cause your command to be run at times or intervals you didn't expect.
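The same temporary-file technique can be used to keep only part of the output. The following is a minimal sketch that discards standard output and reports only what a command wrote to standard error; the function name errors_only and the temporary filename are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: runs a command, discards its standard output,
# and prints whatever it wrote to standard error (if anything).
# The command's own exit status is passed back to the caller.
# Usage: errors_only COMMAND [ARGS...]
errors_only() {
    errfile=/tmp/errors_only.$$.tempfile
    "$@" >/dev/null 2>"$errfile"
    status=$?
    # Only produce output when the error file is not empty.
    if [ -s "$errfile" ]; then
        cat "$errfile"
    fi
    rm -f "$errfile"
    return $status
}
```

In a wrapper script, the result could be captured and mailed only when there is something to report, for example by storing the output of errors_only do-something in a variable and passing it to mailx when the variable is not empty.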
When crontab is not specific or flexible enough, there are some other alternatives available that can handle more complex situations, such as running a command on the last day of the month or on a specific day within a specific week.
Scheduling can save you time and, with some careful organization, can help to reduce the load and repetitiveness of the work that you do.
Martin Brown has been a professional writer for more than seven years. He is the author of numerous books and articles across a range of topics. His expertise spans myriad development languages and platforms -- Perl, Python, Java™, JavaScript, Basic, Pascal, Modula-2, C, C++, Rebol, Gawk, Shellscript, Windows®, Solaris, Linux, BeOS, Mac OS X and more -- as well as Web programming, systems management, and integration. He is a Subject Matter Expert (SME) for Microsoft and regular contributor to ServerWatch.com, LinuxToday.com, and IBM developerWorks. He is also a regular blogger at Computerworld, The Apple Blog, and other sites. You can contact him through his Web site.





