Contents


Common threads

Advanced filesystem implementor's guide, Part 6

Implementing devfs (using the init wrapper)

Comments

Content series:

This content is part # of # in the series: Common threads

Stay tuned for additional content in this series.

This content is part of the series:Common threads

Stay tuned for additional content in this series.

Are you ready?

In this article, we'll complete converting our Linux system over to devfs, or the Device Filesystem. For those who are just joining the series on devfs, read Part 4 of this series, where I explained how devfs solves device registration headaches at the kernel level. Then read Part 5 of this series, where I covered all the steps needed to make your Linux system devfs-compatible so that you would be ready to do the final conversion to devfs.

If you haven't read Part 5, it's very important that you do so now before following the instructions here. If you skip the steps in Part 5, it's almost guaranteed that the init wrapper that we will be installing won't work correctly and you'll end up with system that doesn't boot, requiring emergency resuscitation. That's not a good thing. However, if you have already read Part 5, then you are ready to go.

The init wrapper

Beginnings

I ended Part 5 by introducing the concept of an init wrapper, and explained why it was such a good fit for solving several devfs initialization problems. Without further ado, let's step through the full version of the init wrapper and take a look at what each piece does. We'll start at the top:

#!/bin/bash
# Copyright 2001 Daniel Robbins <drobbins@gentoo.org>, Gentoo Technologies, Inc.
# Distributed under the GNU General Public License, version 2.0 or later.

trap ":" INT QUIT TSTP	
export PATH=/sbin:/bin:/usr/sbin:/usr/bin
umask 022

if [ $$ -ne 1 ]
then
	exec /sbin/init.system $*
fi

As you can see, the init wrapper is a true bash script, since we have a #!/bin/bash at the top of the script. This would be a good time to mention that our init wrapper requires bash 2.0 or greater to run; type /bin/bash --version to see if your bash shell is recent enough. If not, you may want to see if you have a /bin/bash2 executable installed. If so, change the first line of the script to read #!/bin/bash2.

Now, let's walk throught the script. The trap command prevents the script from being interrupted by the user (by pressing control-C during boot, for example) while the script executes. Then, we export a reasonable default path and set a default umask of 022. It's always a good idea to set a default umask as early as possible in the boot process, since a good number of the earlier 2.4 kernel releases had a bug that resulted in a default umask of 0, which can pose a security threat.

Next, we encounter our first conditional statement, if [ $$ -ne 1 ]. bash expands $$ to the process ID of the currently-running process, so you can see that we're really asking the question "is our process ID anything but 1?" What's the significance of this? Well, if we are being started by the kernel during the boot process, we'll always have a PID of 1, since PID 1 is reserved for the init process. If our PID isn't 1, then we know that we're being run from the command-line after the system has already booted. This is not unusual, since the /sbin/init command has the dual purpose of allowing the superuser to change the runlevel of an already-booted system. If this is the case, we simply exec the original /sbin/init, now renamed to /sbin/init.system. We pass any of our command-line arguments to init.system by using the $* variable, our init wrapper terminates, and init.system begins execution.

Kernel boot options

However, if our wrapper is being started by the kernel at boot-time, bash's PID will be 1 and this conditional will be skipped altogether as bash continues executing our wrapper. Speaking of which, here are the next few lines:

mount -n /proc
devfs="yes"
for copt in `cat /proc/cmdline`
do
	if [ "${copt%=*}" = "wrapper" ]
	then
		parms=${copt##*=}
		#parse wrapper option
		if [ "${parms/nodevfs//}" != "${parms}" ]
		then
			devfs="no"
		fi
	fi
done

If we've gotten to this chunk of code, it means that we're being run by the kernel during the boot process; and as our first order of business, we mount /proc to our root filesystem, which is currently read-only. After that, we execute a big, complicated chunk of bash code that takes advantage of a very handy Linux feature. You may not know this, but the kernel allows us to see what options were passed to it by LILO or GRUB by looking at the contents of /proc/cmdline. On my development box, the contents of /proc/cmdline are as follows:

# cat /proc/cmdline
root=/dev/hda6 hda=89355,16,63 mem=524224K

Above, we take advantage of the existence of /proc/cmdline by scanning it for a kernel boot variable that we created ourselves, called wrapper. If wrapper=nodevfs appears among the kernel boot options, then the script knows not to enabled devfs. However, if this variable doesn't appear in /proc/cmdline, then our wrapper will proceed with devfs initialization. The moral of this story is that you can easily disable devfs by booting with the wrapper=nodevfs kernel boot option. If you do, the devfs variable will be set to no; otherwise, it'll be yes.

Wrapping it up

Here's the rest of the wrapper:

if [ "$devfs" = "yes" ]
then
 if [ -e /dev/.devfsd ] 
 then
	clear
	echo
	echo "The init wrapper has detected that /dev has been automatically mounted by"
	echo "the kernel. This will prevent devfs from automatically saving and"
	echo "restoring device permissions. While not optimal, your system will still"
	echo "be able to boot, but any perm/ownership changes or creation of new compat."
	echo "device nodes will not be persistent across reboots until you fix this"
	echo "problem."
	echo
	echo "Fortunately, the fix for this problem is quite simple; all you need to"
	echo "do is pass the \"devfs=nomount\" boot option to the kernel (via GRUB"
	echo "or LILO) the next time you boot.  Then /dev will not be auto-mounted."
	echo "The next time you compile your kernel, be sure that you do not"
	echo "enable the \"Automatically mount filesystem at boot\" devfs kernel"
	echo "configuration option.  Then the \"devfs=nomount\" hack will no longer be"
	echo "needed."
	echo
     read -t 15 -p "(hit Enter to continue or wait 15 seconds...)" 
 else	
	mount -n /dev /dev-state -o bind
	mount -n -t devfs none /dev
	if [ -d /dev-state/compat ]
	then
			echo Copying devices from /dev-state/compat to /dev
			cp -ax /dev-state/compat/* /dev
	fi
 fi
 /sbin/devfsd /dev >/dev/null 2>&1;
fi 

exec /sbin/init.system $*

We now arrive at a large conditional statement that only executes if devfs is set to yes. If this isn't the case, devfs initialization is skipped completely and devfs doesn't even get mounted. This will result in a traditional non-devfs boot.

However, if we are setting up devfs then we dive inside the conditional. Inside, we check to see if devfs has already been mounted by the kernel; we do this by checking to see if the /dev/.devfsd character device exists. When devfs is mounted, this device is automatically created by the kernel, and our future devfsd process will use it to communicate with the kernel. If devfs is already mounted (because the user selected the "Automatically mount devfs at boot" kernel option), we print out an informational message letting the user know that we won't be able to set up the persistence features of devfs, since we can only do that if devfs has not been mounted by the kernel.

Device persistence

However, if everything is OK, we perform the devfs setup that I covered at the end of my last article: /dev is bind-mounted to /dev-state and a devfs filesystem is mounted at /dev. Then, we perform a step that I didn't mention last article; we check for the existence of a /dev-state/compat directory and recursively copy its contents to /dev. While this procedure may seem a bit redundant at first (we're going to be taking advantage of devfsd's device persistence features, aren't we?) it turns out to be necessary and useful. The reason why we need a compat directory is that devfsd's persistence features only work with devfs-enabled drivers.

So, if you happen to be using a non-devfs kernel module, you'll need to create a device node in /dev manually. The problem with this approach is that this new device node will be ignored by devfsd, meaning that the next time you reboot, it will disappear. Our solution to this problem is to have the /dev-state/compat directory; if you have a non-devfs module, simply create your old-style device nodes in /dev-state/compat and they will be manually added to the devfs filesystem at boot time, thanks to the considerate steps of our handy init wrapper.

Finally, we start up devfsd, and then exit the conditional and exec our real init, /sbin/init.system to begin the standard system boot process. Well, everything's standard except for the fact that we now have a devfs-enabled system! :)

Init wrapper installation

Here's how we get the init wrapper installed. First, grab the source for wrapper.sh, and save it somewhere on your system. Then, do the following:

# cd /sbin
# cp init init.system
# cp /path/to/wrapper.sh init
# chmod +x init

The init wrapper is now in place.

Tweaking umount

By using the init wrapper, we've avoided a good amount of complicated initscript tweaking. Nevertheless, there is likely to be one tweak that we can't avoid. Your rc scripts will probably have a hard time umounting your root filesystem now that we have a devfs filesystem mounted at /dev. Fortunately, there's an easy fix for this. Simply grep your rc script directory all occurrences of umount by typing cd /etc/rc.d; grep -r umount * or cd /etc/init.d; grep -r umount * depending on where your particular distributions' rc scripts are installed. Then, in every script that makes reference to umount, make sure that it is being called with the -r option. Of particular importance is the specific umount command that umounts the root filesystem, although sprinkling umount -r's all over the place will also work. :)

The -r option tells umount to try to remount the filesystem as read-only if unmounting is unsuccessful. This is sufficient for putting the root filesystem into a consistent state and get it ready for rebooting, even if it can't be unmounted due to an existing mount at /dev that can't be unmounted itself due to open device nodes.

Now, we're almost ready to reboot; but before we do, let's look at devfsd and whip /etc/devfsd.conf into shape so that compatibility devices and device persistence is enabled. Don't fear, we're just one step away from completing our transition to devfs.

devfsd.conf

Load /etc/devfsd.conf into your favorite editor. Here are the first four lines of my recommended devfsd.conf:

REGISTER        .*              MKOLDCOMPAT
UNREGISTER      .*              RMOLDCOMPAT
REGISTER        .*              MKNEWCOMPAT
UNREGISTER      .*              RMNEWCOMPAT

Each of the above four lines consists of an event (REGISTER or UNREGISTER), a regular expression (.*) and an action (the *COMPAT strings). So, what do they all mean? The first line tells devfsd to perform the MKOLDCOMPAT action when any device (.* is a regular expression that will match any device) is registered with the kernel. The MKOLDCOMPAT action is built-in to devfsd and is understood to mean "make any old compatibility devices that correspond to the device being registered thru devfs". As you've probably figured out, the RM*COMPAT actions that get run at device unregistration cause these special compatibility devices to magically disappear. Taken as a whole, these four lines instruct devfsd to create compatibility devices (if any) when a device is registered, and to remove the compatibility devices when the device is unregistered. Thanks to these lines, when the IDE device driver registers the /dev/ide/host0/bus0/target0/lun0/disc devfs-style device with the system, devfs automatically creates a matching /dev/hda compatibility-style device. This is extremely helpful for commands such as mount and fsck who may be reading an /etc/fstab that contains old-style device names. Generally, the creation of compatibility devices makes the transition to devfs a seamless one. The next line in my devfsd.conf is:

Module auto-loading

LOOKUP          .*              MODLOAD

This entry tells devfsd to execute the MODLOAD action whenever any device (.*) is "looked up", which is what happens when a program looks for the existence of a particular device node. The MODLOAD action will cause modprobe /dev/mydev to be executed, where /dev/mydev is the name of the device that a particular process is trying to find. Thanks to this feature (along with a properly-configured /etc/modules.conf), it's possible for your sound card drivers to be auto-loaded on demand when you start up your music player, and other neat things.

Device persistence

Here are the next few lines of my devfsd.conf:

REGISTER        ^pt[sy]/.*      IGNORE
CHANGE          ^pt[sy]/.*      IGNORE
REGISTER        .*              COPY    /dev-state/$devname $devpath
CHANGE          .*              COPY    $devpath /dev-state/$devname
CREATE          .*              COPY    $devpath /dev-state/$devname

These next few lines tell devfsd to use /dev-state as a repository for any device permission or ownership changes, as well as any new compatibility devices that the user may create. On the first two lines, we explicitly tell devfsd to not perform any special actions when any pseudo-terminal devices are registered with the kernel, or when their attributes are changed. Without these lines, the permissions and ownership of our pseudo-terminals would be preserved across reboots. This isn't optimal since we should always have a fresh set of default perms on our pseudo-terminal devices right after the system starts up.

The next three lines turn on /dev-state persistence for all other devices. Specifically, we will restore any attributes from /dev-state when a device is registered or devfsd itself is started (as well as copying over any existing compatibility devices), and we will immediately back up any changes to attributes, as well as any newly-created compatibility devices to /dev-state.

CFUNCTION and symlinks

And to complete my devfsd.conf, I have these lines:

REGISTER        ^cdrom/cdrom0$          CFUNCTION GLOBAL symlink cdroms/cdrom0 cdrom
UNREGISTER      ^cdrom/cdrom0$          CFUNCTION GLOBAL unlink cdrom
REGISTER        ^misc/psaux$            CFUNCTION GLOBAL symlink misc/psaux mouse
UNREGISTER      ^misc/psaux$            CFUNCTION GLOBAL unlink mouse

These last four lines are optional, but they are worth taking a look at. While /dev-state persistence works wonderfully for device nodes, it has no effect at all on symbolic links, which are ignored. So, this raises the question: how does one go about ensuring that /dev/mouse or /dev/cdrom symlinks not only exist, but are persistent across reboots? Fortunately for us, devfsd is extremely configurable, and these four lines (or something similar, customized to your particular system) will do the trick. The first two instruct devfsd to make a /dev/cdrom symlink appear when the /dev/cdrom/cdrom0 device is registered. To do this, devfsd actually performs a dynamic call to the libc function you specify, in this case symlink() and unlink(). The last two lines of the file use an identical approach to create a /dev/mouse symlink when the /dev/misc/psaux (PS/2 mouse) device is registered with devfs. Customize these lines to your system, and then save this file. If you'd like, you can download this devfsd.conf for use on your own system.

Notes before rebooting

Before rebooting, you may want to take a look at Richard Gooch's devfs FAQ; you may find the information about the devfs naming scheme particularly helpful as you get acquainted with the new-style device names (see Related topics below). I also recommend that you print out a copy of Part 5 of this series in case you need to make use of my "emergency bash rescue" instructions in order to fix a boot-related problem. Remember that if for some reason the new init wrapper bombs out, you can always remove it by following my emergency rescue instructions, remounting the root filesystem as read-write, and then performing the following steps:

# cd /sbin
# mv init wrapper.sh
# mv init.system init

After performing these steps, remounting your filesystem(s) as read-only and rebooting, your system will be back in its pre-wrapper state. Now go ahead, reboot, and enjoy devfs!


Downloadable resources


Related topics


Comments

Sign in or register to add and subscribe to comments.

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Linux, Open source
ArticleID=11155
ArticleTitle=Common threads: Advanced filesystem implementor's guide, Part 6
publish-date=10012001