One of the first requirements of a program, from the lowly directory lister to the Web browser, is that it should be configurable. The combination of configuration files and command-line options has proven to be a long-lived and flexible solution to configurability needs. Perl programs generally embrace this approach, although they tend to include their own configuration file and command-line parsing routines.
The command-line parsing used in this article can become a little complex. To avoid confusion, I recommend that you use Parse::RecDescent (or an equivalent parsing module) for any parsing more involved than simple arguments. For a discussion of complex command-line parsing, see my previous article on Perl programs that speak English.
Please make sure at the outset that you have Perl 5.005 or later and the CPAN AppConfig module installed on your system. You'll also need Persistent::MySQL or the appropriate Persistent class for your particular database. These can also be found on CPAN (see Resources later in this article).
The simple approach: Do it yourself (DIY)
Theoretically (and with the right tools!) anyone can build a configuration parser, right? The Perl Cookbook, for one, shows a quick implementation that provides a good start. So how hard can it be to write a configuration file parser if you begin with this kind of implementation?
Quite hard, actually, because this kind of project raises several more complex issues like these:
- Blank lines and comments in the configuration file
- Erroneous lines (like misspelled keywords), and the question of which are critical and which can be ignored
- The probability that you may have to write your own parser, because you are likely to need a variety of different data structures (booleans, scalars, arrays, and hashes, for example)
- Multiple configuration files
- Variable defaults
- Integrating command-line options with the file configuration and controlling how they interact
- Educating users in yet another DIY configuration file format (This usually goes something like: "This will work, as long as you have no '=' on a line by itself. Oh, and comments begin with '#' but they have to be by themselves. Don't forget to use uppercase for the keywords and lowercase for the values. Come back! Come back! I didn't tell you about the mandatory keywords!")
- Rewriting or copying possibly buggy configuration code instead of reusing a module
- Making the configuration an object with a consistent interface instead of the usual DIY haphazard hash of keywords
Scared yet? That's why we have AppConfig. It can handle all these concerns. It's more than likely that DIY is not what you should be using.
While the AppConfig CPAN module by Andy Wardley helps resolve all the issues listed above, it is not a panacea. It will not magically improve your programs. Sometimes a rewrite is needed to use AppConfig. (There is also a slight learning curve, which this article will attempt to diminish.)
Clearly, if you are uncertain whether you should use DIY or AppConfig, you should base your decision on your experience and on what you are writing. But I believe there are very few situations where AppConfig won't be able to do the job as well as or better than the DIY approach.
Here is a point-by-point explanation of what AppConfig can do for you (following the list of issues from the previous section):
- Blank lines and comments in the configuration file: AppConfig knows about blank lines and comments, and will ignore them.
- What's critical and what can be ignored in erroneous lines (like misspelled keywords): You can set the sensitivity of AppConfig to ignore bad settings or to abort the program. Keywords can also be aliased, in case alternate spellings are possible (such as in an international setting).
- Writing your own parser because you need different data structures (booleans, scalars, arrays, and hashes): AppConfig handles all these data structures, but it does so without nesting. If you need nested hashes or arrays, you may have to do it yourself or assist AppConfig.
- Multiple configuration files: AppConfig will handle as many configuration files as necessary, loading settings from each one in turn. You can also assist AppConfig in resetting arrays and hashes so that values inserted at the bottom of the stack don't have to come out at the top.
- Variable defaults: AppConfig provides for variable defaults. The "-variable" syntax resets a variable to its default state in a configuration file.
- Controlling command-line options and integrating them with the file configuration: AppConfig provides its own simple command-line parser, available through the args() method, as well as an interface to Getopt::Long through the getopt() method. The parsing can be done before or after the configuration file reading.
- Educating users in yet another DIY configuration file format: AppConfig uses a standard, flexible format. Both "KEYWORD value" and "KEYWORD=value" are acceptable for scalars. Because arrays add up, "ARR=1" followed by "ARR=2" would yield an array ARR with elements 1 and 2. You can also specify boolean options as "bool", "nobool", "!bool", "bool on", "bool off", "bool yes" (although "bool no" does the wrong thing), "bool=1", "bool=0". (Clearly the famous inventor of symbolic logic, Dr. George Boole -- see Resources -- would feel right at home.) Hash options are specified as "KEYWORD PARAMETER = value", where the value will become the hash entry with key PARAMETER.
- Rewriting or copying possibly buggy configuration code, instead of reusing a module: AppConfig is quite stable at this point, and the interface to the module is not likely to change. It has also been tested by thousands of other programmers, so why not use it?
- Making the configuration an object with a consistent interface, instead of the usual DIY haphazard hash: A consistent API abstracts configuration handlers out of the main program and simplifies interfacing with the handlers (AppConfig in our case). This approach also introduces fewer bugs since, instead of directly working with data structures, only methods are used.
Now that we've seen a lot of good reasons to opt for using AppConfig, let's look at a complete annotated example of AppConfig usage. For now, we'll omit the more advanced features (discussed in the next section). Variables can be set from the command line with "-varname value" for scalars, booleans, and arrays, and with "-varname key=value" for hashes. The configuration file for this is config.pl, and here's an example:
# blank lines are ignored
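As a minimal sketch of the ideas above (not the article's config.pl listing), the script below defines one variable of each kind, reads a small configuration from an in-memory filehandle that stands in for a real file, and then applies command-line style arguments. The variable names (name, verbose, debug, path, color) and all values are assumptions made up for illustration. The hash is set only from the argument list here, using the "-varname key=value" form described above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use AppConfig qw(:argcount);

# Hypothetical variables, one per data structure AppConfig supports.
my $config = AppConfig->new();
$config->define(
    name    => { ARGCOUNT => ARGCOUNT_ONE  },                # scalar
    verbose => { ARGCOUNT => ARGCOUNT_NONE },                # boolean
    debug   => { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => 1 },  # boolean, on by default
    path    => { ARGCOUNT => ARGCOUNT_LIST },                # array
    color   => { ARGCOUNT => ARGCOUNT_HASH },                # hash
);

# Stand-in for a configuration file on disk.
my $text = <<'END_CONFIG';
# blank lines and comment lines are ignored

name = accounts.txt
verbose
debug off
path = /usr/lib
path = /usr/local/lib
END_CONFIG

open my $fh, '<', \$text or die "cannot open in-memory file: $!";
$config->file($fh);    # file() accepts a filename or a filehandle reference

# Arguments are parsed after the file, so they override file settings.
my @args = ('-name', 'ledger.txt', '-color', 'fg=white', '-color', 'bg=black');
$config->args(\@args);

printf "name    = %s\n", $config->get('name');
printf "verbose = %s\n", $config->get('verbose');
printf "debug   = %s\n", $config->get('debug');
printf "path    = %s\n", join(', ', @{ $config->get('path') });
my $color = $config->get('color');
printf "color   = %s\n", join(', ', map { "$_=$color->{$_}" } sort keys %$color);
```

Swapping the order of the file() and args() calls would let the file override the command line instead, which is the kind of interaction the list above refers to.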
AppConfig can do variable expansions on several levels, depending on the EXPAND setting. See the AppConfig documentation for more details.
use AppConfig qw(:expand);   # imports EXPAND_ALL and the other EXPAND_* constants
# expand all variables, globally
my $config = AppConfig->new({ GLOBAL => { EXPAND => EXPAND_ALL } });
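To make the effect of expansion concrete, here is a small sketch that restricts expansion to references to other AppConfig variables (EXPAND_VAR) so the result does not depend on the environment or on user accounts. The variable names home and libdir and their values are assumptions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use AppConfig qw(:argcount :expand);

# Expand references to other AppConfig variables, for every variable.
my $config = AppConfig->new({ GLOBAL => { EXPAND => EXPAND_VAR } });
$config->define(
    home   => { ARGCOUNT => ARGCOUNT_ONE },   # hypothetical variable
    libdir => { ARGCOUNT => ARGCOUNT_ONE },   # hypothetical variable
);

# Single-quoted here-document, so Perl itself leaves "$home" alone.
my $text = <<'END_CONFIG';
home   = /opt/app
libdir = $home/lib
END_CONFIG

open my $fh, '<', \$text or die "cannot open in-memory file: $!";
$config->file($fh);

# The $home reference is expanded while the file is being read.
print "libdir = ", $config->get('libdir'), "\n";
```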
INI-style sections are another AppConfig feature that you may find useful. By using [section] in a configuration file on a line by itself, you can prefix all keywords used up to the end of the file or the next section with the section name and an underscore '_'. For example:
[file]
location = /tmp
type = txt
name = accounts.txt

[database]
host = wyrm
user = slayer
password = amethyst
is equivalent to:
file_location = /tmp
file_type = txt
file_name = accounts.txt
database_host = wyrm
database_user = slayer
database_password = amethyst
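The equivalence can be checked with a short sketch. It assumes the variables are defined in advance under their prefixed names and reads the sectioned form from an in-memory filehandle in place of a real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use AppConfig qw(:argcount);

my $config = AppConfig->new();

# Each variable is declared under its section-prefixed name.
$config->define($_ => { ARGCOUNT => ARGCOUNT_ONE })
    for qw(file_location file_type file_name
           database_host database_user database_password);

my $text = <<'END_CONFIG';
[file]
location = /tmp
type = txt
name = accounts.txt

[database]
host = wyrm
user = slayer
password = amethyst
END_CONFIG

open my $fh, '<', \$text or die "cannot open in-memory file: $!";
$config->file($fh);

# Values are retrieved by the prefixed names, as if no sections were used.
print "file_name     = ", $config->get('file_name'), "\n";
print "database_host = ", $config->get('database_host'), "\n";
```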
An AppConfig configuration object can be examined with the varlist() function. The code below prints out the contents of every variable in an AppConfig object. Note that the varlist() function can be a little tricky because it must take a regular expression (an empty string will NOT work).
use Data::Dumper; # for hash and array references
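A possible way to dump every variable is sketched below; it is not the original listing. The variable names and values are assumptions, and the pattern '.' is used simply because it matches any non-empty name. The second call shows the optional second argument to varlist(), which strips the matched part from the returned names.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use AppConfig qw(:argcount);
use Data::Dumper;    # for hash and array references

$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Indent   = 1;

# Hypothetical variables to have something to list.
my $config = AppConfig->new();
$config->define(
    file_name     => { ARGCOUNT => ARGCOUNT_ONE  },
    file_location => { ARGCOUNT => ARGCOUNT_ONE  },
    path          => { ARGCOUNT => ARGCOUNT_LIST },
);
$config->set(file_name     => 'accounts.txt');
$config->set(file_location => '/tmp');
$config->set(path => $_) for qw(/usr/lib /usr/local/lib);

# varlist() returns a hash of variable names and values matching the regex.
my %all = $config->varlist('.');
for my $name (sort keys %all) {
    my $value = $all{$name};
    print "$name = ", (ref $value ? Dumper($value) : "$value\n");
}

# Only the file_* variables, with the "file_" prefix stripped from the names.
my %file = $config->varlist('^file_', 1);
print "stripped names: ", join(', ', sort keys %file), "\n";
```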
There is a Getopt::Long interface in AppConfig that allows access to the full power of the Getopt::Long module. The code below defines variable parameters for Getopt::Long, which is invoked to parse the parameters from the command line. Invalid values will cause errors.
$config->define("help|h|!"); # define a boolean
$config->define("code|c|=i"); # define a scalar integer
$config->define("list|l|=f@"); # define an array of floating point values only
$config->define("uids|u|=f%"); # define a hash of floating point values only
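The sketch below shows one way the definitions might be put to work with getopt(). It is an illustration under assumptions: the argument list is made up, the definitions are written in the "name|alias" form followed directly by the Getopt::Long type, and only the boolean, integer, and array cases are exercised.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use AppConfig;

my $config = AppConfig->new();
$config->define('help|h!');     # boolean
$config->define('code|c=i');    # scalar integer
$config->define('list|l=f@');   # array of floating point values

# A made-up argument list standing in for @ARGV.
my @args = qw(--code 42 -l 1.5 --list 2.5 --help report.txt);

# Getopt::Long does the parsing; anything that is not an option stays in @args.
$config->getopt(\@args);

print "help = ", $config->get('help'), "\n";
print "code = ", $config->get('code'), "\n";
print "list = ", join(', ', @{ $config->get('list') }), "\n";
print "left = @args\n";
```

Passing a reference to a copy of the arguments, rather than relying on @ARGV directly, keeps the example repeatable and makes the leftover arguments easy to inspect.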
Variable validation is also available in AppConfig. This means that a variable may reject attempts to set its value to something malicious or simply nonsensical by referring to a regular expression (or even a piece of code).
# the username validation succeeds only when it is exactly "joe"
# the password validation succeeds when it contains "joe" or "joE"
$config->define(
    'USERNAME' => { ARGCOUNT => ARGCOUNT_ONE,
                    VALIDATE => sub    # subroutine validation
                    {
                        my $varname = shift @_;
                        my $value   = shift @_;
                        print "$varname = $value\n";
                        return ($value eq "joe");
                    }
                  },
    'PASSWORD' => { ARGCOUNT => ARGCOUNT_ONE,
                    VALIDATE => "jo[Ee]"    # regex validation
                  }
);
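To see validation reject and accept values, the following sketch repeats the two definitions in a self-contained script and tries a few assignments. The test values are made up, and the ERROR handler that collects messages instead of printing warnings is an optional choice for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use AppConfig qw(:argcount);

my @errors;    # messages collected by the ERROR handler below

my $config = AppConfig->new({
    # The handler receives a printf-style format followed by its arguments.
    ERROR => sub { my $format = shift; push @errors, sprintf($format, @_) },
});
$config->define(
    username => { ARGCOUNT => ARGCOUNT_ONE,
                  VALIDATE => sub {               # subroutine validation
                      my ($varname, $value) = @_;
                      return $value eq 'joe';
                  } },
    password => { ARGCOUNT => ARGCOUNT_ONE,
                  VALIDATE => 'jo[Ee]' },         # regex validation
);

# Made-up attempts: two should be rejected, two accepted.
my @attempts = ([username => 'bob'], [username => 'joe'],
                [password => 'JOE'], [password => 'banjoEs']);
my @results;
for my $attempt (@attempts) {
    my ($name, $value) = @$attempt;
    my $ok = $config->set($name, $value) ? 1 : 0;   # set() is false on rejection
    push @results, $ok;
    printf "%-8s <- %-8s %s\n", $name, $value, $ok ? 'accepted' : 'rejected';
}
print "error: $_\n" for @errors;
```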
AppConfig makes auto-triggered actions possible so that every time a variable's value changes, the action will be performed. Note that a reference to the underlying AppConfig::State object is also passed to the subroutine, so a single change can trigger other variable changes.
$config->define(
    'USERNAME' => { ARGCOUNT => ARGCOUNT_ONE,
                    ACTION => sub    # autoaction
                    {
                        my $config  = shift @_;
                        my $varname = shift @_;
                        my $value   = shift @_;
                        print "$varname = $value\n";
                    }
                  }
);
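As an illustration of one change triggering another, the sketch below makes a hypothetical debug flag switch on a hypothetical verbose flag whenever it is set to a true value. Both variable names and the @log array are assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use AppConfig qw(:argcount);

my @log;    # records every triggered action

my $config = AppConfig->new();
$config->define(
    verbose => { ARGCOUNT => ARGCOUNT_NONE },
    debug   => { ARGCOUNT => ARGCOUNT_NONE,
                 ACTION   => sub {
                     # arguments: state object, variable name, new value
                     my ($state, $varname, $value) = @_;
                     push @log, "$varname = $value";
                     # turning debug on also turns verbose on
                     $state->set('verbose', 1) if $value;
                 } },
);

$config->set('debug', 1);

print "$_\n" for @log;
print "verbose = ", $config->get('verbose'), "\n";
```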
AppConfig does not handle code embedded in variables. In my opinion, code has no place in configuration files anyway, and allowing users to execute arbitrary code is a bad idea. Accordingly, AppConfig does not evaluate variables automatically, although the validation and auto-action subroutines associated with a variable could conceivably be coordinated to perform this task. If you really feel the urge, just pick the variables in question and run eval() on them yourself (in the manner illustrated below). Needless to say, this is not to be done unless you are absolutely sure you want to give your users that level of control over your programs.
foreach my $varname ('username', 'password')
{
    $config->set($varname, eval $config->get($varname));
}
The INI-style sections in AppConfig are neither here nor there. They define code sections, but the sections must be known beforehand to be useful. It may have been better to design sections so that they create new AppConfig objects nested inside the parent object, but this is a minor issue.
In simple tests, loading and execution speed don't seem to suffer from using AppConfig. It is a fairly small module, and the size/speed penalty is usually negligible. If your program is timing-sensitive, of course, you should time it with and without AppConfig and decide for yourself if the module is worth the expense.
Largely because of the nice API, AppConfig's level of complexity and learning curve are much lower than you might otherwise expect. There are confusing areas, especially for novice programmers, but by and large this is not a significant concern for anyone with prior Perl experience.
The parsing limitations of AppConfig are a consequence of its usability. If you need advanced parsing of command-line options, see my previous article on Perl programs that speak English. (It is of particular relevance that AppConfig cannot do context-sensitive parsing.)
Uploading configurations to a database with AppConfig and Persistent::DBI
I suggest reading my article on saving data with the Persistent modules before you tackle this section. You should also have some understanding of Perl references and SQL databases. My particular code example uses the MySQL database and the corresponding Persistent module. If you are using another database (Postgres or Oracle, for example), you ought to look for other Persistent modules.
Before a configuration can be used in a database context, the database schema must already be designed. In other words, you need to decide what data you want to save before you start writing the code to save and restore that data. This example will store booleans, scalars, arrays, and hashes in separate tables.
This is not necessarily the best approach. You can also have one table for all data types, or tables divided by purpose. My example is just one of the many ways to implement persistent configurations. It is certainly not the only way.
The schema I present here is probably sufficient for most purposes. It does have some limitations on value and key lengths, but those are easily adjustable in the code. However, the way that array and hash element IDs are created might cause problems. There are no perfect solutions in these cases, only different approaches to solving the problem. Storing arbitrary structured data in a relational database will always be tricky.
We will be using the _argcount() method in AppConfig::State. For more detailed information on this method, you can read the manual page for AppConfig::State. In a nutshell, the method tells us what kind of variable we are dealing with, once we know the name.
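The sketch below shows how _argcount() might be used to sort variables into per-type rows before anything is written to a database. It is not the persistent-config.pl listing: the table names, the three-column row layout, and the variables are all hypothetical, and no database is touched.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use AppConfig qw(:argcount);

# Hypothetical configuration with one variable of each kind.
my $config = AppConfig->new();
$config->define(
    verbose => { ARGCOUNT => ARGCOUNT_NONE },
    name    => { ARGCOUNT => ARGCOUNT_ONE  },
    path    => { ARGCOUNT => ARGCOUNT_LIST },
    color   => { ARGCOUNT => ARGCOUNT_HASH },
);
$config->set(verbose => 1);
$config->set(name    => 'accounts.txt');
$config->set(path    => $_) for qw(/usr/lib /usr/local/lib);
$config->set(color   => 'fg=white');

# Hypothetical table names, one per variable kind.
my %table_for = (
    ARGCOUNT_NONE() => 'booleans',
    ARGCOUNT_ONE()  => 'scalars',
    ARGCOUNT_LIST() => 'arrays',
    ARGCOUNT_HASH() => 'hashes',
);

# table name => list of [variable name, element id, value] rows
my %rows;
my %vars = $config->varlist('.');
for my $name (sort keys %vars) {
    my $kind  = $config->_argcount($name);   # which kind of variable is this?
    my $table = $table_for{$kind};
    my $value = $vars{$name};
    if ($kind == ARGCOUNT_LIST) {
        # one row per array element, keyed by its index
        push @{ $rows{$table} }, [$name, $_, $value->[$_]] for 0 .. $#$value;
    }
    elsif ($kind == ARGCOUNT_HASH) {
        # one row per hash entry, keyed by its hash key
        push @{ $rows{$table} }, [$name, $_, $value->{$_}] for sort keys %$value;
    }
    else {
        # booleans and scalars need no element id
        push @{ $rows{$table} }, [$name, undef, $value];
    }
}

for my $table (sort keys %rows) {
    for my $row (@{ $rows{$table} }) {
        print "$table: ", join(' | ', map { defined $_ ? $_ : 'NULL' } @$row), "\n";
    }
}
```

Using the array index or hash key as the element ID is only one of the possible approaches mentioned above, and it is the part most likely to need adjusting for a real schema.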
Here is my code example using the MySQL database and the corresponding Persistent module: persistent-config.pl.
The AppConfig and Persistent classes can play nicely together with a little bit of work. The persistent-config script shown in the previous section can handle most configurations with short keys and variable names, but there is room for improvement. Once the script is done to your liking, you can start it anywhere on the network and have it load the current configuration from a central host. At the very least, making improvements will teach you about database configurations and will help you look at your applications in a new network-centric way.
Code reuse is the single greatest advantage of modules in general, and of AppConfig in particular. In cases where the DIY (Do It Yourself) approach creates bugs and delays, AppConfig provides an efficient single solution that will more than likely satisfy most configuration needs.
The limitations listed in the section "AppConfig limitations" are few and far between. Of course, you should decide what's best for your project before committing to AppConfig. It's best to keep in mind how the information presented here applies to your particular project and to look at the AppConfig manual page.
What now? You should take only what you need from AppConfig. Don't try to make it do all the work for you. Putting half of your program in a configuration file may seem like fun, but your users will be storming your office soon thereafter. Make your configuration files logical and simple. Write detailed explanations of the configuration syntax accepted by the program, including the command-line options that AppConfig thoughtfully provides.
- Read Ted's other Perl articles in the "Cultured Perl" series on developerWorks.
- Check out CPAN.org for all the Perl modules you ever wanted.
- Go to Perl.com for Perl information and related resources.
- Get the AppConfig module, by Andy Wardley, at CPAN.
- Other configuration management CPAN modules include:
  - XRDB-style resources
  - ConfigReader::Simple (little documentation, use at own risk)
  - Config::IniFiles (different from AppConfig, some common functionality)
  - Parse::PerlConfig (more complex than AppConfig, not as easy for the users)
- Programming Perl Third Edition, by Larry Wall, Tom Christiansen, and Jon Orwant (O'Reilly & Associates, 2000) is the best guide to Perl today, up-to-date with 5.005 and 5.6.0 now.
- Perl Cookbook, by Tom Christiansen and Nathan Torkington (O'Reilly & Associates, 1998) is the definitive how-to for all Perl issues. It's somewhat out of date with 5.6.0 out, but still worth every cent. A DIY approach to configurations is presented in recipe 8.16.
- Read more about Dr. George Boole, the inventor of symbolic logic, after whom booleans are named.

Teodor Zlatanov graduated with an M.S. in computer engineering from Boston University in 1999. He has worked as a programmer since 1992, using Perl, Java, C, and C++. His interests are in open source work on text parsing, 3-tier client-server database architectures, UNIX system administration, CORBA, and project management. You can contact him at tzz@bu.edu.




