Beginning and intermediate system administrators will benefit most from this article; the explanations and examples assume you are comfortable with basic system administration concepts.
To use the examples in this article, your system should be a recent (2000 or later) mainstream UNIX (Linux, Solaris, BSD) installation. The examples may work with earlier versions of cfengine and UNIX, as well as with other operating systems, but their failure to function should be considered an exercise for you to solve.
Cfengine is under development. Version 1.6.3 is stable, and is the basis for this article. Plans for version 2.0, currently in alpha testing, include reworking the architecture, adding lots of great new features, and leaving the syntax largely unchanged. To get a sense of the full functionality of cfengine -- as well as the enhancements in version 2.0 -- visit the cfengine home page (see Resources later in this article).
Cfengine will change the way you do system administration. You will run a command and watch the system converge to a stable state. It will seem like magic, I promise. Cfengine will edit files, run commands, and make symlinks while you sip your tea.
Cfengine, however, will not think for you. You still have to write a configuration file and test it before putting it into production. On the other hand, very few things that cfengine does are harmful.
Cfengine makes system convergence possible. But why is convergence necessary? Well, consider that the first thing a system administrator wants is peace of mind; everything else is secondary. Stability, reliability, and predictability are the means of achieving peace of mind. This may seem simplistic, but ask any good sysadmin and you'll get this answer every time. Convergence helps achieve stability, reliability, and predictability. While it's not the only way to achieve those goals, it's the least painful, in my experience.
Stability can be defined as resistance to unintended changes. When cfengine implements convergence, it does so through sets of rules. Well-designed rules (and rest assured, cfengine makes it easy to design well) will only move the system to an ideal state (ideal as defined in the cfengine rules). For instance, crucial system symlinks can be recreated every time cfengine runs. Or init.d startup scripts can be copied from a trusted repository, whenever the local copies are modified. Processes can be restarted if cfengine doesn't see them running.
Reliability is the ability of a machine to survive problems. A network outage or a failed disk is a great reliability test. Can your systems survive those problems? With convergence to an ideal state, you can expect the system to be in or near that ideal state. While reliability is not achieved by cfengine alone, convergence facilitates it. Stability is also a great asset to reliability, since a stable system is less likely to be affected by problems. Finally, cfengine's convergence makes it possible to take a "blank" system and move it to a desired state, enabling quick stand-in replacements for crucial systems.
More needs to be said about desired and ideal states. So far, we have assumed that there is only one ideal state, and that is all that's desired. In reality, there is no ideal state for all machines. Machines are classified by task, location, connectivity, users, operating system type and version, and so on. Every sysadmin classifies his machines by some means, usually by all of the above and more (we'll talk more about classes later). It follows, then, that while there is no ideal state for all machines, there is an ideal state to be achieved for a given machine class. This is the design goal of cfengine. Cfengine makes it practical to seek an ideal state, and easy to converge to that state.
Predictability is the ability of a machine to behave as expected. Convergence achieves predictability by making a system stable and reliable. Furthermore, a new machine can be expected to behave as the old one it replaced, once it has converged to an ideal state. It's very easy to estimate the scheduling cost of adding systems or replacing them when your machines will converge to a known state shortly. Finally, software written for your systems can expect them to be in a near-ideal state. System resources, then, are in a predictable state, so the software can concentrate more on features and less on treating every system as hostile unknown territory.
Cfengine is made up of several programs. The main one is called
cfengine in version 1.6.3. The cfengine program interprets sets of
rules from a file and executes the actions demanded by those rules.
Strictly speaking, the cfengine program is just an interpreter of the
cfengine language, and any cfengine programs are just scripts for that
interpreter.
There's also a daemon program called cfd in version 1.6.3, and its
companion cfrun. Cfd will be beefed up in version 2.0, but the
1.6.3 version left much to be desired. Fortunately, I was able to achieve the
tasks I needed (signalled runs of
cfengine and remote copying of files) without cfd. I prefer starting cfengine
through an explicit script over ssh. It's a little slower than cfd,
but easier to monitor. Errors emitted by cfengine and classes defined
remotely didn't show up reliably when started through cfd. For remote
file copying, I found cfd unreliable and a security hazard, so I use
rsync instead. Mark Burgess, the author of cfengine, states that
cfd will be much improved in version 2.0, and that rsync will not be
necessary, but until 2.0 comes around, I suggest avoiding
cfd.
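The ssh-based approach described above can be sketched as a small shell function. This is only an illustration under stated assumptions, not the script actually used at the author's site: the function name, the host names, and the rsync source master.example.com::cfengine/ are all hypothetical. The rsync step mirrors the practice, described later in this article, of refreshing /etc/cfengine from a trusted location before each run.

```shell
#!/bin/sh
# Sketch: refresh /etc/cfengine from a trusted master, then run cfengine,
# all in one ssh session. MASTER (host and rsync module) is hypothetical.
MASTER="master.example.com::cfengine/"

signalled_pull() {
    host="$1"
    # One remote command line: sync first, and run cfengine only if the
    # sync succeeded. cfengine's output and errors come back over ssh.
    # Assumes rsync and cfengine are on the remote non-interactive PATH.
    ssh "$host" "rsync -a '$MASTER' /etc/cfengine/ && cfengine"
}

# Usage (not run here):
#   for h in web1 web2; do signalled_pull "$h"; done
```

Because everything runs inside one ssh session, whatever cfengine prints on the remote machine shows up directly in the calling terminal or log, which is the monitoring advantage mentioned above.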
To get started with cfengine, you should compile and install it. RPMs are available for systems that can use them, and there's a Solaris package available (see Resources). If you want to store permanent checksums of files, similar to what Tripwire does, you should compile with the Berkeley DB support. You should then start creating your configuration files. The main one, which is run when cfengine is invoked without a file name, is /etc/cfengine/cfengine.conf (you can specify a different default configuration directory when compiling in 1.6.3, but in 2.0 and later, /etc/cfengine will be the only location checked, so you should probably stick with that).
Below is a starting configuration for cfengine. It is not a
finished product, and you should read the cfengine reference and tutorial (see Resources) carefully before putting it in action. Try cfengine with the -v -n
(verbose dry run) options to see what that configuration would do.
Nothing will be affected on the system when the -n (dry run) option is
used.
Listing 1. Starting configuration file for cfengine
/etc/cfengine/cfengine.conf
# note that only some of the possible sections are used here;
# refer to the cfengine documentation for the full list of sections
# you can have. Comments, as you can see, are like shell or Perl
# comments.
# see the tutorial and reference for any unexplained phenomena
import:
   any::
      cf.groups
groups:
   # all groups are defined in cf.groups, imported above, but you can
   # define extras here. The format is simple:
   class = ( machine1 machine2 )
   # and then any machine named machine1 or machine2 will have that class
   # defined.
# the control section sets up how cfengine will behave
control:
   any::
      # you have to state in AddInstallable what classes unknown to cfengine
      # by default you will be using. Run cfengine as "cfengine -v" to see
      # the built-in classes you don't have to define. Here we divide
      # machines into the ones that run inetd and the ones that run xinetd,
      # as an example.
      AddInstallable = ( inetd xinetd )
      editfilesize = ( 300000 )
      moduledirectory = ( /etc/cfengine/modules )
      domain = ( yourdomain.com )
   any::
      LogDirectory = ( /etc/cfengine/log )
      netmask = ( 255.255.255.0 )
      Repository = ( /etc/cfengine/repository )
      sysadm = ( "tzz@iglou.com" )
      # Bug in cfengine: actionsequence must follow LogDirectory and Repository
      actionsequence = ( directories files editfiles copy links processes disable
                         shellcommands )
directories:
   # this ensures that these directories will be created when cfengine runs
   /etc/cfengine/log
   /etc/cfengine/repository
   /etc/cfengine/cfcollector
files:
   any::
      # set the permissions for these files
      /etc/sudoers mode=0440 owner=root group=root action=fixall
      /etc/hosts.allow mode=0644 owner=root group=root action=fixall
      /etc/hosts.deny mode=0644 owner=root group=root action=fixall
      # just warn if this file's permissions are wrong
      /etc/shadow mode=0400 owner=root action=warnall inform=true
   # CERT advisory CA-2001-05, for Solaris only
   solaris::
      /usr/lib/dmi/snmpXdmid mode=0000 owner=root group=root action=fixall
   # example of setting permissions differently for different OS types
   # (not Linux and Linux), and negating classes
   !linux::
      /.ssh mode=0700 owner=root action=fixall inform=true
   linux::
      /root/.ssh mode=0700 owner=root action=fixall inform=true
editfiles:
   any::
      # add the rsync service to /etc/services and /etc/inetd.conf
      { /etc/services
         SetLine "rsync 873/tcp # rsync"
         AppendIfNoLineMatching "rsync.*"
      }
      { /etc/inetd.conf
         # add rsync
         SetLine "rsync stream tcp nowait root /usr/local/bin/rsync rsyncd --daemon"
         AppendIfNoLineMatching "rsync.*"
      }
copy:
   # set up sshd startup script, from trusted master distribution in /etc/cfengine
   /etc/cfengine/sshd dest=/etc/init.d/sshd repository=/etc/cfengine/repository
links:
   any::
      # link the sshd init.d script to /etc/rc3.d, overwriting existing
      # links if they exist
      /etc/rc3.d/S72local_sshd ->! /etc/init.d/sshd
processes:
   # invoke cfengine with "cfengine -DHupInetd" to define this class and
   # send inetd the HUP signal (the machine has to be in the inetd class
   # discussed above, too). This is an example of compound classes.
   inetd.HupInetd::
      "inetd" signal=hup
disable:
   # empty this file (this can also be used to rotate logs, with
   # different rotate options)
   /etc/rc3.d/S77dmi rotate=empty
shellcommands:
   any::
      # always put the contents of the $domain variable in this file.
      # note that all the cfengine variables can be interpolated inside strings.
      "/bin/echo $(domain) > /etc/cfengine/cfdomainname"
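Before trusting a configuration like Listing 1, it helps to make the dry run a habit. The helper below is a minimal sketch of that habit; cf_preview is a hypothetical name, while -v, -n, and -f are the cfengine options mentioned in this article.

```shell
#!/bin/sh
# Sketch: preview what a candidate configuration would do, without
# changing anything on the system. cf_preview is a hypothetical helper.
cf_preview() {
    conf="$1"
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 1
    fi
    # -v: verbose; -n: dry run, nothing is changed; -f: use this file
    cfengine -v -n -f "$conf"
}

# Usage (not run here):
#   cf_preview /etc/cfengine/cfengine.conf | less
```

Keeping the options inside one function makes it harder to forget -n while a configuration is still being tested.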
Simple usage: Editing and copying files
To edit files, use the editfiles section. The syntax is quite
complex, but we'll use only a few of the possible
commands in our examples. See the cfengine reference (see Resources) for the full range of commands.
Listing 2. Editing files
editfiles:
   development::
      { /etc/sudoers
         DeleteLinesContaining "cpa "
      }
   accounting::
      { /etc/sudoers
         SetLine "cpa ALL=(ALL) ALL"
         AppendIfNoLineMatching "cpa .*"
      }
Listing 2 shows what happened when user "cpa" moved from the
Development department to the Accounting department. His sudo access
on development machines needed to be revoked, and he needed sudo
access on the accounting machines. The rules use machine classes you should
have declared in the control: section, under AddInstallable. Based
on the machine class, the appropriate action is taken. Note that a
user named "cpa1" would not be affected by these rules, since we
specifically ask for a space after the "a".
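To see why the rules in Listing 2 are safe to run over and over, here is a rough shell model of the two edit operations. It is not cfengine's own code, and the function names are made up for this sketch. It assumes that a "Matching" test compares the regular expression against the whole line, while a "Containing" test looks for a plain substring.

```shell
#!/bin/sh
# Rough shell model of two editfiles operations (not cfengine's own code).
# Assumption: "Matching" tests the whole line against a regular expression,
# while "Containing" is a plain substring test.

# Append LINE to FILE unless some line already matches REGEX.
append_if_no_line_matching() {
    file="$1"; regex="$2"; line="$3"
    grep -q -x -E -e "$regex" "$file" || printf '%s\n' "$line" >> "$file"
}

# Remove every line of FILE that contains TEXT.
delete_lines_containing() {
    file="$1"; text="$2"
    tmp="$file.tmp.$$"
    # grep exits non-zero when no lines are left; that is not an error here
    grep -v -F -e "$text" "$file" > "$tmp" || true
    cat "$tmp" > "$file"    # rewrite in place, keeping owner and mode
    rm -f "$tmp"
}

# Usage (not run here):
#   append_if_no_line_matching /etc/sudoers 'cpa .*' 'cpa ALL=(ALL) ALL'
#   delete_lines_containing /etc/sudoers 'cpa '
```

Running either function a second time leaves the file untouched, which is the convergent behavior the editfiles rules rely on. A line for "cpa1" neither blocks the append nor gets deleted, because of the trailing space in the pattern.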
Another simple common usage is copying files. In the skeleton
example in Listing 1, we saw the copy: and links: section that set up
sshd, but what do they do?
Listing 3. Copying files
copy:
   # set up the sshd startup script
   /etc/cfengine/sshd dest=/etc/init.d/sshd repository=/etc/cfengine/repository
links:
   any::
      # link the sshd init.d script to /etc/rc3.d
      /etc/rc3.d/S72local_sshd ->! /etc/init.d/sshd
At our site, we rsync /etc/cfengine with a central trusted location
before running cfengine. That way, /etc/cfengine is a local copy of
trusted files we want to use. One of those files is
/etc/cfengine/sshd, which is a startup script for the SSH daemon. The
copy: section, then, is just copying the trusted startup script over
the one in /etc/init.d if they differ. That way a malicious attacker
would see his changes disappear (we maintain most /etc/init.d scripts
this way). Also, changes in the startup scripts can be easily
propagated this way with a "signalled pull" method.
The repository option to copy: just tells cfengine where to put the
old sshd script, if it had to be overwritten. That option is good
insurance against accidents.
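The behavior of the copy: rule and its repository option can be pictured with a short shell model. Again, this is a sketch of the idea and not how cfengine is implemented; converge_copy is a hypothetical name, and the ".saved" backup file name is an assumption made for the example.

```shell
#!/bin/sh
# Rough shell model of a convergent copy with a repository (not cfengine's
# own code). The backup file name used here is an assumption.
converge_copy() {
    src="$1"; dest="$2"; repo="$3"
    # Already identical: nothing to do, so repeated runs change nothing.
    if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
        return 0
    fi
    # Keep the version that is about to be overwritten.
    if [ -f "$dest" ]; then
        mkdir -p "$repo" && cp -p "$dest" "$repo/$(basename "$dest").saved"
    fi
    cp -p "$src" "$dest"
}

# Usage (not run here):
#   converge_copy /etc/cfengine/sshd /etc/init.d/sshd /etc/cfengine/repository
```

The copy happens only when the two files differ, and the overwritten version is set aside first, so a mistaken master file can still be undone by hand.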
The links: section makes a link to the right startup directory. Note
that this is a Solaris or System V startup hierarchy. The Linux
equivalent would be handled in a solaris::/linux:: class division, but
this snippet takes the easy road to keep the example simple. The
exclamation mark (!) is optional; when present, it tells cfengine to
overwrite existing links (existing files are never overwritten).
The contents of an entire directory can be copied or linked at once.
For instance, you could copy or link all the files in
/etc/cfengine/sbin to /usr/local/sbin in one shot. Only the necessary
copies or links will be made. Cfengine offers a remote copy
option for the copy: commands, but those copies go through cfd,
and I have found them too unreliable for a production environment
because of problems with cfd itself. Running rsync beforehand is a better
option with cfengine 1.6.3 and earlier.
Advanced usage: Restarting processes
Processes are best handled with a regularly run cfengine
sub-configuration, usually stored in /etc/cfengine/cf.minute or
something similar. Let cf.minute include the same cf.groups group
definitions that cfengine.conf uses, and just invoke it with
cfengine -f /etc/cfengine/cf.minute.
The example below shows how processes can be restarted or signalled.
Listing 4. Restarting or signalling processes
processes:
   any::
      # restart cfd if it's not running already
      "/usr/local/sbin/cfd" restart "/etc/init.d/start-cfd start"
      # restart sshd if it's not running already
      "/usr/local/sbin/sshd" restart "/etc/init.d/sshd start"
   # HUP inetd if the HupInetd class is defined, see listing 1
   inetd.HupInetd::
      "inetd" signal=hup
Cfd is the cfengine daemon. It proved to be quite unstable despite the frequent restarts, and that's why I abandoned it as of version 1.6.3. The 2.0 and above daemon is supposed to be much better.
Like any computer software, daemon software as a group suffers from a wide variety of implementations. As far as cfengine is concerned, there are two types of daemons: the ones that fork and exit (inetd, sshd, and cfd are good examples of this type), and the ones that don't (the qmail startup daemons, for instance).
I personally believe that daemons should offer both behaviors as options, so that we don't have two very different types of programs to monitor. But that's not an option for most daemons today, so we have to cope.
For the fork and exit daemons, cfengine works great. The daemon
forks, and cfengine continues on its merry way. For the non-forking
daemons, however (and this includes most in-house daemon software),
cfengine needs special help. In version 2.0 this is supposed to be
resolved somehow, but in 1.6.3 the software has to print out
"cfengine-die" followed by a carriage return to let cfengine know it's
safe to leave it be. You can also monitor the non-forking daemons
with software like daemontools by Dan Bernstein (see Resources), but that's an
enterprise in itself. It's usually easier to put a print statement in
the program as a temporary fix that won't affect anything else.
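When the daemon's source cannot be changed, one possible variation is to print the marker from a small start wrapper. The sketch below rests on two assumptions that this article does not confirm: that cfengine 1.6.3 accepts the marker from a wrapper as readily as from the daemon itself, and that "carriage return" above means an ordinary end of line. The function name and the daemon path are hypothetical.

```shell
#!/bin/sh
# Sketch of a start wrapper for a daemon that does not fork on its own.
# Assumptions: the wrapper (rather than the daemon) prints the marker, and
# an ordinary end of line is what cfengine expects after it.
start_nonforking() {
    # Run the daemon in the background, detached from this script's output.
    nohup "$@" > /dev/null 2>&1 &
    # Tell cfengine 1.6.3 that it can stop waiting.
    echo "cfengine-die"
}

# Usage (hypothetical path, not run here):
#   start_nonforking /usr/local/sbin/mydaemon --foreground
```

A restart rule in the processes: section would then call the script containing this wrapper instead of the daemon binary.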
This article skims the surface of what cfengine can do. You should try it out yourself, especially after you have browsed the cfengine reference, to see how cfengine can be useful to you.
Version 1.6.3 is very stable, except for the cfd daemon, in my experience. The syntax will carry to the next version (2.0), possibly with some additions, so the time you'll spend learning it will be well spent.
Cfengine is a unique system administration tool. Even if you decide not to use it at your site, the concepts and execution will help you at your job. If you do decide to use it, you will find cfengine to be infinitely flexible and incredibly useful.
Resources
- Read Ted's other Perl articles in the "Cultured Perl" series on developerWorks.
- To download cfengine, visit the cfengine home page.
- For additional information on cfengine, consult the cfengine reference and cfengine tutorial.
- RPMs for cfengine are available for systems that can use them; there's a Solaris
package available from Sunfreeware.com.
- Check out Dan Bernstein's daemontools home page.
- Browse more Linux resources on developerWorks.
- Browse more Open source resources on developerWorks.
Teodor Zlatanov graduated with an M.S. in computer engineering from Boston University in 1999. He has worked as a programmer since 1992, using Perl, Java, C, and C++. His interests are in open source work on text parsing, 3-tier client-server database architectures, UNIX system administration, CORBA, and project management.