The road to better programming: Chapter 11. Crontab management with cfperl

Adding and deleting crontab entries easily with cfperl

Teodor Zlatanov (tzz@iglou.com), Programmer, Gold Software Systems
Teodor Zlatanov graduated with an M.S. in computer engineering from Boston University in 1999. He has worked as a programmer since 1992, using Perl, Java, C, and C++. His interests are in open source work on text parsing, three-tier client-server database architectures, UNIX system administration, CORBA, and project management.

Summary:  In this series, Ted has been developing the cfperl project -- which is simply a cfengine interpreter written in Perl -- from the top down. In this installment, he discusses the "cron" section, where crontab entries can be added or deleted easily.

Date:  12 Jun 2003
Level:  Intermediate

As the title indicates, this is an installment in an ongoing series. We recommend that you read the preceding chapters for the background, rationale, and structure of cfperl.

cfperl manages crontabs so that crontab tasks can be described in plain English. The cfperl syntax can describe in one line what would take multiple lines in a standard crontab (for example, tasks that run at 5:30 and 6:00). cfperl uses no non-standard cron extensions to implement its crontab functionality.

The cfperl crontab section is parsed in the normal cfperl way, through a "cron" section that can be targeted to specific hosts, groups, and so on through the top-level parser.

What is the crontab?

The cron package consists of a pair of programs: a cron daemon that executes tasks periodically, and a crontab program to modify an individual user's list of cron tasks. They are extremely useful, and you have probably heard about crontabs if you use a UNIX system. The user's list of cron tasks is called a crontab, just like the program you use to modify that list. I'll discuss the standard cron package, without the extensions added by cron rewrites. The improved cron packages, such as anacron, fcron, and ucron, are better than the standard cron, but unfortunately, they all have non-standard extensions to the classic crontab format.

In typical UNIX fashion, the standard crontab format is very efficient but not very readable. Six fields are separated by spaces: minutes, hours, days of the month, months, weekdays, and command. Each of the first five is a list of numbers or "*"; for instance "0,1,2" in the hour field means that the command should be run at midnight, 1 a.m., and 2 a.m. When "*" is in a field, that means cron should match anything for that field -- any hour, any day, any month, etc. An additional complication is that weekdays are 0-based (Sunday being 0), while days of the month are 1-based (1 through 31).
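As a rough illustration of the six-field layout, the following plain Perl sketch splits one crontab line into named fields. The function name split_cron_line() and the hash keys are made up for this example and are not part of cfperl.

```perl
use strict;
use warnings;

# Hypothetical helper (not from cfperl): split a crontab line into its
# five time fields plus the command. The limit of 6 keeps any spaces
# inside the command intact.
sub split_cron_line {
    my ($line) = @_;
    my %entry;
    @entry{qw(minute hour mday month wday command)} = split ' ', $line, 6;
    return \%entry;
}

my $entry = split_cron_line('0,10,20 0,1,2 * * * echo hello world');
print "hours:   $entry->{hour}\n";
print "command: $entry->{command}\n";
```

Splitting on ' ' with a limit is only one way to do this; it assumes the line is a plain job entry rather than a comment or an environment setting.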

The weekday field is a little tricky, because it interacts with the day-of-month field. If day-of-month is "*" and weekday is "0," then the command will run only on Sundays. If day-of-month is "1" and weekday is "*," then the command will run on the first of the month only. If day-of-month is "5" and weekday is "2," however, the command will run on the 5th of the month and also on every Tuesday: when both fields are restricted, a match on either one is enough.
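The day-of-month and weekday behavior can be sketched in a few lines of Perl. This is only an illustration of the rule as described here, not code from cron or cfperl; expand_field() and runs_on_day() are hypothetical names, and only "*", single numbers, lists, and ranges are handled.

```perl
use strict;
use warnings;

# Expand a field such as "5", "1,15" or "1-5" into a hash of allowed
# values. Returns undef for "*", meaning the field is unrestricted.
sub expand_field {
    my ($field) = @_;
    return if $field eq '*';
    my %allowed;
    for my $part (split /,/, $field) {
        my ($from, $to) = $part =~ /^(\d+)(?:-(\d+))?$/
            or die "bad field part: $part";
        $to = $from unless defined $to;
        $allowed{$_} = 1 for $from .. $to;
    }
    return \%allowed;
}

# Decide whether a job runs on a given day, looking only at the
# day-of-month and weekday fields ($wday: 0 = Sunday .. 6 = Saturday).
sub runs_on_day {
    my ($mday_field, $wday_field, $mday, $wday) = @_;
    my $mdays = expand_field($mday_field);
    my $wdays = expand_field($wday_field);

    return 1 if !$mdays && !$wdays;               # both "*": every day
    return $wdays->{$wday} ? 1 : 0 if !$mdays;    # only weekday restricted
    return $mdays->{$mday} ? 1 : 0 if !$wdays;    # only day-of-month restricted
    # Both restricted: a match on either field is enough.
    return ($mdays->{$mday} || $wdays->{$wday}) ? 1 : 0;
}

# Day-of-month "5", weekday "2": a Tuesday the 9th still qualifies.
print runs_on_day('5', '2', 9, 2) ? "runs\n" : "does not run\n";
```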

Fortunately, cfperl will save you from having to remember all that stuff -- unless you are into that sort of thing, in which case cfperl's crontab functionality will just leave you alone and close the door quietly on its way out. It's not a pushy sort of software.


What does cfperl do to make crontabs easier to manage?

With cfperl, you can generate crontabs with readable specifications. For instance:

hourly at 0,10,20,30,40,50 do as cftest /usr/bin/synchronize

Simple, right? All you have to do is specify the minute at which the job will be executed and the name of the user (in this case, "cftest") whose crontab will be edited. This translates readily into one crontab line:

0,10,20,30,40,50 * * * * CFPERL=1 /usr/bin/synchronize

Note the "CFPERL=1" part; that indicates to cfperl in subsequent runs that this line was generated by cfperl and can be regenerated.
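One way such a marker could be used is sketched below: the crontab lines are split into those that carry "CFPERL=1" at the start of the command field and everything else. This is an assumption about how the marker might be checked, not cfperl's actual code; partition_cron_lines() and the backup path are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: separate marked (generated) lines from the rest.
sub partition_cron_lines {
    my (@lines) = @_;
    my (@generated, @manual);
    for my $line (@lines) {
        # The marker sits where the command field starts, after five
        # whitespace-separated time fields.
        if ($line =~ /^\s*(?:\S+\s+){5}CFPERL=1\s/) {
            push @generated, $line;
        } else {
            push @manual, $line;
        }
    }
    return (\@generated, \@manual);
}

my ($generated, $manual) = partition_cron_lines(
    '0,10,20,30,40,50 * * * * CFPERL=1 /usr/bin/synchronize',
    '# a comment typed in by the user',
    '15 3 * * * /usr/local/bin/backup',    # made-up manual entry
);
print scalar(@$generated), " generated, ", scalar(@$manual), " manual\n";
```

Lines in the second group would be left untouched, while the first group could be discarded and rebuilt from the cfperl configuration.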

How about ranges? A minute range of "23-43,44-49" given to cfperl, for instance, will produce "23-49" in the crontab.
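The range merging can be pictured with a short Perl sketch that turns a list of numbers into the shortest comma-and-range notation. It is a stand-alone illustration; compact_ranges() is a hypothetical name, and cfperl's own routine may work differently.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse numbers into crontab list/range notation.
sub compact_ranges {
    my (@numbers) = @_;
    my %seen;
    my @sorted = sort { $a <=> $b } grep { !$seen{$_}++ } @numbers;
    return '' unless @sorted;

    my @runs;                           # each run is [ first, last ]
    for my $n (@sorted) {
        if (@runs && $n == $runs[-1][1] + 1) {
            $runs[-1][1] = $n;          # extend the current run
        } else {
            push @runs, [ $n, $n ];     # start a new run
        }
    }
    return join ',',
        map { $_->[0] == $_->[1] ? $_->[0] : join('-', @$_) } @runs;
}

print compact_ranges(23 .. 43, 44 .. 49), "\n";    # prints 23-49
```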


The user specification parser

The parser for the user specification of crontab entries is a member of the %parsers hash with key "cron." As usual for cfperl, the "cron" section in a cfperl configuration will invoke this parser.

The parser can handle three kinds of requests: delete, deletefull, and regular crontab requests. The deletions are available for those cases where cfperl should clean out its old entries or all crontab entries from a user's crontab.

Crontab requests in the cfperl "cron" parser are built from a frequency specification ("yearly," "monthly," "weekly," "daily," and "hourly").


Listing 1. Top-level "cron" parser frequency specifications
frequency: yearly | monthly | weekly | daily | hourly

hourly: /hourly/i /at/i minute { { minute => $item{minute} } }

daily: /daily/i /at/i hour { { hour => $item{hour} } }

weekly: /weekly/i weektime_spec_list { { weektime => $item{weektime_spec_list} } }

monthly: /monthly/i monthtime_spec_list { { monthtime => $item{monthtime_spec_list} } }

yearly: /yearly/i yeartime_spec_list { { yeartime => $item{yeartime_spec_list} } }

The rest of the rules are fairly simple, providing for an exhaustive declaration of almost any kind of recurring event. The specification for the numeric items and ranges, however, is a little bit more complex and merits its own section.


Numeric items and ranges

The "cron" section parser needs to understand any numeric range or item. This includes hours, which are a little bit harder to parse because an hour can carry minutes as extra baggage (as in "5:30").


Listing 2. Numeric items and ranges specifications
minute: numeric_list
hour:   numeric_hour_list
day:    numeric_list
month:  numeric_list
weekday: numeric_list

numeric: numeric_range | numeric_single

numeric_single: /\d+/

numeric_hour: /\d+/ ':' /\d\d/ 
              { { hour => $item[1], minutes => $item[3] } }
              | numeric_single 
              { { hour => $item{numeric_single}, minutes => 0 } }

numeric_range: numeric_single '-' numeric_single
               { $return = [ ($item[1] .. $item[3]) ]; 1; }

numeric_hour_list: numeric_hour ',' numeric_hour_list
                 { $return = [ @{$item{numeric_hour_list}},
                               (ref $item{numeric_hour} eq 'ARRAY') ? 
                                @{$item{numeric_hour}} : $item{numeric_hour} ];
                    1; }
               | numeric_hour
                 { $return = [ (ref $item{numeric_hour} eq 'ARRAY') ? 
                               @{$item{numeric_hour}} : $item{numeric_hour} ];
                 1; }
               | '*'

numeric_list: numeric ',' numeric_list
                 { $return = [ @{$item{numeric_list}},
                               (ref $item{numeric} eq 'ARRAY') ? 
                                @{$item{numeric}} : $item{numeric} ]; 
                    1; }
               | numeric
                 { $return = [ (ref $item{numeric} eq 'ARRAY') ? 
                                @{$item{numeric}} : $item{numeric} ]; 1; }
               | '*'

The parser specifications for the "numeric_list" and "numeric_hour_list" rules are recursive. Each rule depends on itself or a terminator, meaning that eventually it must find text that matches a rule alternative that does not depend on the rule itself.

If this is confusing, think of a list of numbers such as "5,6,7." You can define that list as "(1) a number followed by a comma followed by a number list, or (2) a number by itself." Rules (1) and (2) are alternate solutions to the general rule for parsing a list of numbers. But rule (2) alone is useless, since it only understands a single number. Rule (1), on the other hand, is not sufficient at the end of the list, where there are no commas. Rules (1) and (2) must be alternates if a number list is to be parsed recursively.
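Rules (1) and (2) can be written out by hand in plain Perl to make the recursion visible. This sketch does not use Parse::RecDescent, and parse_number_list() is a name invented for the example; it also keeps the numbers in their original order.

```perl
use strict;
use warnings;

# number_list: number ',' number_list   (rule 1)
#            | number                   (rule 2)
# Returns an array reference of numbers, or undef if $text is not a list.
sub parse_number_list {
    my ($text) = @_;

    # Rule (1): a number, a comma, and the rest parsed recursively.
    if (my ($head, $tail) = $text =~ /^(\d+),(.+)\z/s) {
        my $rest = parse_number_list($tail);
        return defined $rest ? [ $head, @$rest ] : undef;
    }

    # Rule (2): a lone number ends the recursion.
    return [ $text ] if $text =~ /^\d+\z/;

    return undef;    # neither alternative matched
}

my $list = parse_number_list('5,6,7');
print join(' ', @$list), "\n";    # prints 5 6 7
```

Every call either consumes one number and recurses on a shorter string or stops at rule (2), so the recursion always terminates.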

There are other ways to parse a list of items, but with Parse::RecDescent (which is a recursive descent parser), the recursive approach works best. It is also a very elegant approach, summarizing in a few lines all the possible interpretations of a list. Consult the Parse::RecDescent documentation for more on recursive parsers and rules (see Resources for a link).

The other piece of complexity is the code that flattens array references. The "numeric_range" rule returns an array reference holding every number in the range, so the list rules check whether each item is an array reference and, if it is, expand it in place. The result is a single flat list.

The array references could have been left in the parser output, to be interpreted by the code that uses that parser. Flattening the references into a list, however, makes the parser only slightly more complex, while the code that would have to handle the references as output would have been considerably more complex.
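The flattening step on its own looks like this in plain Perl. It is a minimal sketch that assumes a range such as "1-3" has already been expanded into an array reference; flatten_items() is a hypothetical name.

```perl
use strict;
use warnings;

# Flatten a mix of plain numbers and array references (expanded ranges)
# into one flat list.
sub flatten_items {
    return map { ref($_) eq 'ARRAY' ? @$_ : $_ } @_;
}

# "1-3" is assumed to have been expanded to [1, 2, 3] already.
my @flat = flatten_items(0, [ 1 .. 3 ], 10);
print "@flat\n";    # prints 0 1 2 3 10
```

Code that receives the flat list never has to ask whether an element is a number or a range.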

A point of interest is the weekday rule.


Listing 3. Weekdays
weekday_name: 'Monday' { 1 } | 'Tuesday' { 2 } | 'Wednesday' { 3 } | 
              'Thursday' { 4 } | 'Friday' { 5 } | 'Saturday' { 6 } | 
              'Sunday' { $return = 0; 1; }

This rule treats Sunday differently, because it is 0 in the crontab format. That's great for crontabs, but bad for the parser. If we had simply said 0 for Sunday, the parser would have considered that 0 an indication of a parsing error. Thus, we set the $return variable to 0, but we return 1 from the rule. This is a common and dangerous Parse::RecDescent pitfall, where a properly parsed 0 is considered an error if returned directly by the rule.
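The hazard of a valid result that happens to be 0 is not limited to parsers. The plain Perl sketch below, which does not use Parse::RecDescent and whose function names are invented for the example, shows how a truth test rejects Sunday while a definedness test accepts it.

```perl
use strict;
use warnings;

my %weekday_number = (
    Sunday   => 0, Monday => 1, Tuesday  => 2, Wednesday => 3,
    Thursday => 4, Friday => 5, Saturday => 6,
);

# Returns the crontab weekday number, or undef for an unknown name.
sub weekday_to_number {
    my ($name) = @_;
    return $weekday_number{$name};
}

# Truth test: wrongly rejects Sunday, because 0 is false in Perl.
sub is_weekday_by_truth {
    return weekday_to_number($_[0]) ? 1 : 0;
}

# Definedness test: accepts 0 and still rejects unknown names.
sub is_weekday_by_defined {
    return defined(weekday_to_number($_[0])) ? 1 : 0;
}

printf "truth: %d, defined: %d\n",
    is_weekday_by_truth('Sunday'), is_weekday_by_defined('Sunday');
```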


Perl functionality for crontab generation

All the output of the "cron" parser is passed to the cron_op() function. The auxiliary functions get_cron_lines(), put_cron_lines(), and get_cron_command() are not interesting; these functions simply read or write the crontab and obtain the proper way to call the crontab command (the program that modifies a user's crontab, as discussed earlier).

The construct_cron_line() function builds an actual cron entry line, using make_cron_line() to do the mechanical work.

The extract_yearly_intervals(), extract_weekly_or_monthly_intervals(), and extract_daily_intervals() functions use reduce_interval() to obtain the shortest representation of their interval, except that they don't use "*" when they could. This is not a feature, and it's not a bug either, but it will be fixed at a later point.

Additionally, extract_daily_intervals() tries to merge intervals. This means that if a user asks for a command to be executed at 5:30 a.m. and 3:30 p.m. daily, extract_daily_intervals() will understand that only one crontab line ("30 5,15 * * * COMMAND") is sufficient to implement the user's request. Incompatible intervals such as 3:30 a.m. and 4:20 p.m. are written as separate crontab lines. There is no way to fix that; it's a fundamental problem with the standard crontab format.
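The merging idea can be sketched in plain Perl by grouping the requested times by minute. This is an illustration under that one assumption, not the body of extract_daily_intervals(); daily_cron_lines() is a hypothetical name, and the output lines omit the "CFPERL=1" marker for brevity.

```perl
use strict;
use warnings;

# Merge daily times ([hour, minute] pairs) into crontab lines: times that
# share a minute collapse into one line with a comma-separated hour list.
sub daily_cron_lines {
    my ($command, @times) = @_;
    my %hours_for_minute;
    for my $time (@times) {
        my ($hour, $minute) = @$time;
        $hours_for_minute{$minute}{$hour} = 1;
    }

    my @lines;
    for my $minute (sort { $a <=> $b } keys %hours_for_minute) {
        my $hours = join ',',
            sort { $a <=> $b } keys %{ $hours_for_minute{$minute} };
        push @lines, "$minute $hours * * * $command";
    }
    return @lines;
}

print "$_\n" for daily_cron_lines('COMMAND', [ 5, 30 ], [ 15, 30 ]);
# prints: 30 5,15 * * * COMMAND
```

Given 3:30 a.m. and 4:20 p.m., the same function returns two lines, because no single minute field and hour field pair can express both times without also matching 3:20 and 16:30.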

Only daily intervals will be merged. While weekly, monthly, and yearly intervals could be merged as well, I did not feel the extra work was worthwhile at this time. It could certainly be done, though.


Conclusion

The crontab capabilities of cfperl are a straightforward combination of parsing user input and producing crontab entries in a standard format. Through the parsing capabilities of Parse::RecDescent, the cfperl "cron" section grammar supports a wide range of free-form user specifications for recurring events. The grammar is extensible and easy to maintain.

Two compromises were made. One was to throw away crontab lines, previously generated by cfperl, when a new crontab was generated. In effect, the cfperl-generated portion of the crontab is regenerated every time. Other than some extra processing time when cfperl is run, this does not impact the user significantly. The alternative approach of using sequential tags, checksums, or other identifying measures to recognize the crontab lines in need of regeneration has two problems. The first is that this is a lot of extra work for a small gain. The other is that old cfperl-generated entries can clutter the crontab, in effect becoming legacy entries that may not be needed anymore but which no one dares remove.

The second compromise is that weekly, monthly, and yearly intervals are not merged; only daily intervals are. This means that a specification such as "monthly on 1 at 2; on 4 at 2" will not be merged into a single "monthly on 1,4 at 2" specification automatically. This functionality can be added, but its only benefit is a reduction in the number of crontab lines produced, which seems like a small gain for the extra work necessary.


Resources

