A common complaint about GNU/Linux (other than its lack of a reasonable kernel debugger) is the amount of time the operating system takes to start. You could sum up this process as booting, but in fact several independent tasks are involved to evolve from a cold system to one that you can interact with through a shell or window manager. Let's review the Linux boot and initialization process.
While booting Linux involves many steps, you can partition the process into three basic stages that I call BIOS, kernel boot, and system initialization, as shown in Figure 1.
Figure 1. A temporal view of the Linux boot process
When you first turn on a computer or reset it, the computer's processor begins execution at a well-known location in what's called the basic input/output system (BIOS). BIOS is typically stored in a flash memory device on the system's motherboard. The BIOS has many jobs, such as initial testing of basic components (such as the system's memory) and determining how to boot the operating system. As PC-based computers are extremely flexible, the boot device can be one of many individual devices attached to the motherboard, including hard disks, CD-ROMs, or other devices such as the network interface.
You can optimize the process of determining the boot device by selecting the device from which you'll most commonly boot (typically, the hard disk). But by far, the most time-consuming aspect of the BIOS stage is the memory test. Disabling certain aspects of this test (such as a full memory test) can help boot speed, but at the cost of a boot-time system integrity check.
When a boot device is found, the Linux kernel boot process begins. This process occurs in (approximately) two stages -- first-stage boot and second-stage boot. The first stage consists of a simple boot loader (found on the boot device's master boot record, or MBR), whose job is to load the second-stage boot loader. The first-stage boot loader finds the second-stage boot loader using the partition table. The first-stage boot loader scans the table, looking for the active partition; when the loader locates the partition, it loads the second-stage boot loader into RAM and invokes it.
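The partition-table scan described above can be sketched in Python. This is a minimal illustration, not boot-loader code. It assumes the classic MBR layout: four 16-byte entries starting at byte offset 446, a boot flag of 0x80 marking the active partition, and the 0x55AA signature in the last two bytes of the sector. The helper names find_active_partition and build_sample_mbr, and the sample partition values, are hypothetical.

```python
import struct

PARTITION_TABLE_OFFSET = 446   # partition table starts here in a classic MBR
ENTRY_SIZE = 16                # each of the four entries is 16 bytes
ACTIVE_FLAG = 0x80             # boot indicator value for the active partition


def find_active_partition(mbr: bytes):
    """Return (index, type, start_lba, sector_count) of the active entry, or None."""
    if len(mbr) != 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid 512-byte MBR sector")
    for index in range(4):
        offset = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        entry = mbr[offset:offset + ENTRY_SIZE]
        boot_flag, part_type = entry[0], entry[4]
        # Start sector and length are little-endian 32-bit values at bytes 8..15.
        start_lba, sector_count = struct.unpack_from("<II", entry, 8)
        if boot_flag == ACTIVE_FLAG:
            return index, part_type, start_lba, sector_count
    return None


def build_sample_mbr() -> bytes:
    """Build a synthetic MBR whose second entry is an active Linux partition."""
    mbr = bytearray(512)
    entry = bytearray(ENTRY_SIZE)
    entry[0] = ACTIVE_FLAG
    entry[4] = 0x83                                   # partition type 0x83 (Linux)
    struct.pack_into("<II", entry, 8, 2048, 204800)   # hypothetical start and size
    offset = PARTITION_TABLE_OFFSET + 1 * ENTRY_SIZE
    mbr[offset:offset + ENTRY_SIZE] = entry
    mbr[510:512] = b"\x55\xaa"
    return bytes(mbr)


if __name__ == "__main__":
    print(find_active_partition(build_sample_mbr()))
```

A real first-stage loader does this work in a few hundred bytes of machine code; the sketch only shows which bytes it inspects.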
With the second-stage boot loader in RAM, the Linux kernel image and the initial
RAM disk image (initrd) are loaded into RAM. When the
kernel is invoked, it decompresses itself into high memory and copies the
initrd for later mounting and use.
The kernel boot process is fairly complicated but very fast, as most of the code
is written in the system's machine language. At the end of the kernel boot
sequence, the init process starts. As
init is the first process created in a Linux system,
it's the mother of all other processes (all processes are descendants of
init).
The init process -- the focus of this article -- is
the first process created as the kernel boot sequence is completed. Linux uses
the init process to initialize the services and
applications that make Linux useful.
When the init process starts, it opens a file called
/etc/inittab. This file is the configuration file for
init and defines how to initialize the system. This
file also contains information about what to do when a power failure occurs (if
the system supports it) and how to react when it detects the Ctrl-Alt-Delete key
sequence. Look at the short segment of this file shown in Listing 1
to understand what it provides.
The inittab configuration file defines several entries
with a common format: id:runlevels:action:process. The id is a
sequence of characters that uniquely identifies the entry. The runlevels
define the runlevels for which the action should be taken. The action
specifies the particular action to take. Finally, process defines the
process to be executed.
Listing 1. Excerpt from the inittab file
# The default runlevel
id:2:initdefault:
# Boot-time system configuration/initialization script
si::sysinit:/etc/init.d/rcS
# Runlevels
l0:0:wait:/etc/init.d/rc 0
l1:1:wait:/etc/init.d/rc 1
l2:2:wait:/etc/init.d/rc 2
l3:3:wait:/etc/init.d/rc 3
l4:4:wait:/etc/init.d/rc 4
l5:5:wait:/etc/init.d/rc 5
l6:6:wait:/etc/init.d/rc 6
z6:6:respawn:/sbin/sulogin
# How to react to ctrl-alt-del
ca:12345:ctrlaltdel:/sbin/shutdown -t1 -a -r now
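To make the id:runlevels:action:process format concrete, here is a minimal Python sketch that parses inittab-style text and looks up the default runlevel. It is an illustration only and not how init itself reads the file; the names InittabEntry, parse_inittab, and default_runlevel are hypothetical, and the sample text is a shortened copy of Listing 1.

```python
from typing import NamedTuple, Optional


class InittabEntry(NamedTuple):
    id: str
    runlevels: str
    action: str
    process: str


def parse_inittab(text: str) -> list:
    """Parse inittab text into entries, skipping blank lines and comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first three colons only; the process field may contain colons.
        fields = line.split(":", 3)
        if len(fields) == 3:
            fields.append("")   # tolerate a missing (empty) process field
        if len(fields) != 4:
            raise ValueError(f"malformed inittab line: {line!r}")
        entries.append(InittabEntry(*fields))
    return entries


def default_runlevel(entries) -> Optional[str]:
    """Return the runlevel named by the initdefault entry, if any."""
    for entry in entries:
        if entry.action == "initdefault":
            return entry.runlevels
    return None


SAMPLE = """\
# The default runlevel
id:2:initdefault:
si::sysinit:/etc/init.d/rcS
l2:2:wait:/etc/init.d/rc 2
ca:12345:ctrlaltdel:/sbin/shutdown -t1 -a -r now
"""

if __name__ == "__main__":
    entries = parse_inittab(SAMPLE)
    print("default runlevel:", default_runlevel(entries))
    for entry in entries:
        if "2" in entry.runlevels and entry.action == "wait":
            print("runs at runlevel 2:", entry.process)
```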
After init loads /etc/inittab, it brings the system
up to the runlevel that the initdefault action
defines. As Listing 1 showed, this is runlevel 2. Think of
a runlevel as the state of the system. For example, runlevel 0 defines the
system halt state, and runlevel 1 is single-user mode. Runlevels 2 through 5
are multi-user states, and runlevel 6 indicates reboot. (Note that some
distributions differ on the runlevel representations.) Another way to think of
a runlevel is as a way to define which processes may execute (processes
that define the state of the system).
Note: To see the current runlevel of your system, use the command
runlevel.
As defined in Listing 1, initdefault
specifies that the default init level is 2
(multi-user mode). After the initial runlevel is defined, the script
rc is invoked with the argument 2
(the runlevel) to bring the system up to that runlevel. This script then invokes a
variety of service and application scripts to start or stop each
service. In this case, the scripts are found in /etc/rc2.d/. For example, if
the MySQL application were to be started (such as at system startup), it would be
invoked as /etc/rc2.d/S20mysql start. When the
system is shut down, the same set of scripts is invoked with the
stop argument.
In the end, many scripts are executed serially to start the various services that are required (which you can typically see as part of the boot screen with Linux). Even when services are unrelated to one another, they're still started one after the other. The result is that this process can take time (especially in a large system with many services).
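The serial behavior described above can be sketched as follows. This is a simplified model in Python rather than the actual rc script. It assumes the common SysV convention that K* links are run with stop before S* links are run with start, each group in lexical order. The function name plan_runlevel and every script name except S20mysql are hypothetical.

```python
def plan_runlevel(rc_dir: str, script_names) -> list:
    """Return the commands an rc-style script would run, in order.

    Assumes the common SysV convention: K* links are run with "stop" first,
    then S* links with "start", each group in lexical (two-digit priority) order.
    Names that start with neither letter are ignored.
    """
    kills = sorted(n for n in script_names if n.startswith("K"))
    starts = sorted(n for n in script_names if n.startswith("S"))
    commands = [f"{rc_dir}/{name} stop" for name in kills]
    commands += [f"{rc_dir}/{name} start" for name in starts]
    return commands


if __name__ == "__main__":
    # Hypothetical directory contents, apart from S20mysql.
    names = ["S20mysql", "S10sysklogd", "K20apache", "S89cron", "README"]
    for command in plan_runlevel("/etc/rc2.d", names):
        # A serial init runs each command to completion before the next one.
        print(command)
```

Because the plan is a flat list, a slow script delays everything after it, even services that have nothing to do with it.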
An obvious solution to this problem is to remove the serial nature of the
init command and replace it with something that
operates more in parallel. You can find this metaphor in more places than just
multi-processing systems. For example, socket striping, or using two or
more sockets to move data in parallel, is a solution based on this theme.
Redundant array of independent disks (RAID) systems also stripe across disks
(typically in parallel) to increase I/O performance.
Because the traditional init process
(sysvinit) is a serial process, this portion of the
system is ripe for optimization. In fact, you can use any of several approaches
to optimize the init process. Let's look at a few of
these approaches and how they solve the problem. The first two approaches are
dependency based (that is, they use dependencies to provide the parallelization),
and the third is an event-based system (that is, processes depend on events to
indicate when they can start or stop).
The first option, initng (for
init
next generation), is a full replacement for init
that asynchronously starts processes to more quickly complete the
init process. At the time of this writing,
initng, created by Jimmy Wennlund, is a beta
product.
The fundamental idea behind initng is that services are
started as soon as their dependencies are met. This system results in a better
balance of CPU versus I/O. While one script is being loaded from disk or waiting
for a hardware device to start, another script can be running to start another
service.
As a dependency-based solution, initng uses its own
set of initialization scripts that encode the service and daemon dependencies.
An example is shown in Listing 2. This script specifies a
service that is to be started for the given runlevel. The service has two
dependencies, as defined by the need keyword, for
system/initial and net/all. These services must be available before
system/my_service can be started. When these services are available, the
exec keyword comes into play. The
exec keyword (with the start
option) defines how to start the service, with any available options. When the
service is to stop, the exec keyword with the
stop option is used.
Listing 2. Defining a service for initng
service system/my_service {
need = system/initial net/all;
exec start = /sbin/my_service --start --option;
exec stop = /sbin/my_service --stop --option;
}
You can encode an entire system with service definitions, as shown in Listing 2.
Those without dependencies can then be started immediately (and in parallel),
while those that have dependencies must wait to start safely. You can think of
initng as a goals-based system. The goals are the
services to be started. No explicit planning occurs; instead, the dependencies
simply define the flow of service initiation, with parallelization implicit in
the process.
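The way dependencies imply a start order can be sketched in Python. This is a simplified model, not initng's scheduler: it groups services into waves, where every service in a wave has all of its dependencies satisfied by earlier waves and could therefore start in parallel. initng itself starts each service as soon as its own dependencies are met rather than waiting for a whole wave. The function name start_waves, the daemon/logger service, and the dependency of net/all on system/initial are assumptions made for the example.

```python
def start_waves(needs: dict) -> list:
    """Group services into waves; every service in a wave can start in parallel.

    `needs` maps each service name to the set of services it depends on.
    """
    remaining = {name: set(deps) for name, deps in needs.items()}
    started = set()
    waves = []
    while remaining:
        # A service is ready once all of its dependencies have started.
        ready = sorted(n for n, deps in remaining.items() if deps <= started)
        if not ready:
            raise ValueError(
                f"unsatisfiable or circular dependencies: {sorted(remaining)}"
            )
        waves.append(ready)
        started.update(ready)
        for name in ready:
            del remaining[name]
    return waves


if __name__ == "__main__":
    needs = {
        "system/initial": set(),
        "net/all": {"system/initial"},            # assumed dependency
        "system/my_service": {"system/initial", "net/all"},
        "daemon/logger": {"system/initial"},      # hypothetical extra service
    }
    for number, wave in enumerate(start_waves(needs), start=1):
        print(number, wave)
```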
The initng package is relatively simple to install
for typical uses. For systems that use non-standard packages (not present in
the default configuration), some assembly may be required.
A typical installation of initng requires the
initng distribution (source or binary) and the ifiles
distribution. You can build the initng distribution
with ./configure, make, and
make install. You must build the ifiles (which are
the script files) with cmake. Depending on your
system requirements, you may be required to create new service/daemon definitions
(although it's likely that someone in the initng
community has done so already). You must then modify the LILO or GRUB configuration
to point to the new /sbin/initng.
To control initng, you use ngc
(as compared to telinit with traditional
init). The syntax differs somewhat, but the
capabilities remain the same.
Another option to replace init, upstart
takes a somewhat different approach to what you just saw with
initng. Upstart is an
event-based init replacement, which means that the
starting and stopping of services is based on the communication of events.
Upstart is being developed for the Ubuntu distribution
by Scott James Remnant but is intended as a general replacement for
init with any Linux distribution.
Upstart requires that you update the initialization
scripts to support the event-based mode of operation. Upstart
maintains its own init process that starts on system
start (as for all other approaches). First, init
emits the startup event -- one of the core events. The
startup event is emitted by init when the system
starts, and the shutdown event is emitted when the system is to be shut down.
Other core events include ctrlaltdel, which indicates that you pressed
Ctrl-Alt-Delete, and kbdrequest, which indicates that you
pressed the Alt-Up arrow key combination.
You can create new events for other uses. For example, you can create an
arbitrary event called myevent and indicate its receipt by using the
echo command. Take the following short job:
on myevent
exec echo myevent received
console output
This code specifies that the job is triggered when the myevent event is
received. The code then performs the actions specified (emitting text to the
console). With this file present in the upstart
configuration (/etc/event.d), you can trigger it using the
initctl utility:
initctl emit myevent
The script files for upstart work similarly to the
traditional rc init files, except that they operate
autonomously based on asynchronous events. Listing 3
provides a simple example script that accepts three events: startup,
which causes the job to start, and shutdown and runlevel-3, which
cause the job to stop. The shell executes the contents of the
script portion of the job (using the
-e option to terminate the script on error).
Listing 3. Simplified upstart script for the sysvinit rc 2 script
start on startup
stop on shutdown
stop on runlevel-3
script
set $(runlevel --set 2 || true)
exec /etc/init.d/rc 2
end script
The initctl utility offers functionality similar to
telinit but with some additional features specific
to upstart. As you saw above, you can use
initctl with the emit
option to generate an event to upstart. The
list option gives you insight into the operation
of the system by identifying the state of the jobs. It tells you which are
currently waiting and which are active. The initctl
utility can also display the events that are received for debugging purposes.
Upstart is an interesting replacement for
init and has some distinct advantages over it. There
really is no reason for runlevels any longer, as a system will boot as far as
it can go with the available hardware. Any hardware that's not present will not
trigger the jobs that would require it. Upstart also
handles hot-plugging devices well. For example, if you plugged a PCMCIA network
card in long after the system was booted, the network-interface-added event
would be generated. This event would cause the Dynamic Host Configuration
Protocol (DHCP) job to configure it, generating a network-interface-up
event. When a default route was assigned to the new interface, a
default-route-up event would result. From here, jobs that required a
network interface (such as a mail server or Web server) would start automatically
(and stop, if the interface disappeared).
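The hot-plug chain described above can be modeled with a small event dispatcher. This is a toy sketch in Python, not upstart's implementation. The Job class, the emit and build_jobs functions, the emits field, and the network-interface-removed event name are assumptions made for illustration. For brevity, the DHCP job is modeled as announcing both the network-interface-up and default-route-up events.

```python
from collections import deque


class Job:
    """A job that starts or stops in response to named events."""

    def __init__(self, name, start_on=(), stop_on=(), emits=()):
        self.name = name
        self.start_on = set(start_on)
        self.stop_on = set(stop_on)
        self.emits = tuple(emits)   # follow-up events announced after starting
        self.running = False


def emit(jobs, event):
    """Deliver an event and any follow-up events; return a log of actions."""
    log = []
    pending = deque([event])
    while pending:
        current = pending.popleft()
        for job in jobs:
            if job.running and current in job.stop_on:
                job.running = False
                log.append(f"stop {job.name}")
            elif not job.running and current in job.start_on:
                job.running = True
                log.append(f"start {job.name}")
                pending.extend(job.emits)
    return log


def build_jobs():
    # "network-interface-removed" is a hypothetical event name.
    removed = "network-interface-removed"
    return [
        Job("dhcp", ["network-interface-added"], [removed],
            emits=["network-interface-up", "default-route-up"]),
        Job("mail-server", ["default-route-up"], [removed]),
        Job("web-server", ["default-route-up"], [removed]),
    ]


if __name__ == "__main__":
    jobs = build_jobs()
    print(emit(jobs, "network-interface-added"))
    print(emit(jobs, "network-interface-removed"))
```

No job names another job directly; each one only names the events it cares about, which is why hardware that never appears never triggers the jobs that need it.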
Building and installing upstart is simple and follows
the typical configure, make,
and make install pattern.
Upstart provides a set of example jobs that are
compatible with the typical init configuration
runlevels. Like initng, new applications must have
their own jobs written based upon their requirements (with the potential for
adding new events). In either case, deploying a new init
system carries some risk. But the advantages that upstart
provides certainly outweigh the risks and additional work that may be necessary.
As illustrated above, the initctl utility provides
the functionality that one would expect of telinit.
But initctl also provides additional functionality
for tracing and debugging.
The two options this article explores -- initng and
upstart -- are not the only two games in town. You can
also find init replacements such as
runit, pardus,
minit, and einit. All have
supporters and some amount of momentum in the Linux community. At this point,
upstart is probably the one to watch, because it has
been adopted as the init replacement for the popular
Ubuntu distribution. See Resources for more information.
Monitoring init performance with bootchart
As you change the landscape of the system boot process, it's useful to understand
what changed and how it affects the overall time to boot. Ziga Mahkovec has built
a very useful tool called bootchart to visualize the
makeup of the boot process. This tool consists of several elements, including a
data logger utility and a visualization utility.
The data logger (bootchartd) runs in the place of the
init process (usually, specified in the grub or
lilo.conf files). After bootchartd has initialized, it
surrenders control back to the real init process
(typically, /sbin/init). Bootchartd is essentially a
profiler that samples the environment at a periodic interval (by default, once
every 200 ms). By sampling the environment, I mean that it reads the current
CPU statistics, I/O and idle times, disk usage, and information about every active
process (through the proc file system). This data is
stored in a temporary file (/var/log/bootchart.tgz) for later post-processing.
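The kind of sampling described above can be illustrated with a short Python sketch that compares two snapshots of the aggregate cpu line from /proc/stat. This is not bootchartd's code. The function names parse_cpu_line and usage_between are hypothetical, the sample counter values are made up, and only the first five counters (user, nice, system, idle, iowait) are considered.

```python
def parse_cpu_line(stat_text: str) -> dict:
    """Extract the aggregate 'cpu' counters from /proc/stat-style text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            names = ("user", "nice", "system", "idle", "iowait")
            return dict(zip(names, map(int, fields[1:6])))
    raise ValueError("no aggregate cpu line found")


def usage_between(first: dict, second: dict) -> dict:
    """Return the share of elapsed ticks spent busy, waiting on I/O, and idle."""
    delta = {key: second[key] - first[key] for key in first}
    total = sum(delta.values())
    if total == 0:
        return {"busy": 0.0, "iowait": 0.0, "idle": 0.0}
    busy = delta["user"] + delta["nice"] + delta["system"]
    return {
        "busy": busy / total,
        "iowait": delta["iowait"] / total,
        "idle": delta["idle"] / total,
    }


# Two made-up snapshots, as if taken one sampling interval apart.
SAMPLE_A = "cpu  100 0 50 800 50 0 0\ncpu0 100 0 50 800 50 0 0\n"
SAMPLE_B = "cpu  130 0 60 805 55 0 0\ncpu0 130 0 60 805 55 0 0\n"

if __name__ == "__main__":
    print(usage_between(parse_cpu_line(SAMPLE_A), parse_cpu_line(SAMPLE_B)))
```

On a Linux system, the same functions can be fed the contents of /proc/stat read at two points in time instead of the sample strings.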
Bootchart then uses a post-processing tool to transform
the raw data into a boot chart. This process can occur locally using a Java™
application (part of the bootchart distribution), but an
easier method is through a Web form located at the bootchart
home page. An example piece of a boot chart is shown in Figure 2.
Note that these charts tend to be quite large (depending on the services and
applications started). For links to complete examples, see
Resources.
Figure 2. Snippet of a boot chart created by bootchartd
As with Linux itself, there are plenty of options and lots of flexibility for boot time
optimization. From dependency-based solutions like initng
to event-based solutions like upstart, there's an
optimization solution that should fit your needs. Using the bootchart
package, you can dig in further to understand where your system is spending its
boot time to optimize even more.
Learn
- For a perspective on system administration from various distributions of Linux, check out this developerWorks resource on differentiating UNIX and Linux.
- While a bit dated, this Red Hat resource explores the various runlevels for Linux.
- In the developerWorks Linux zone, find more resources for Linux developers.
- Stay current with developerWorks technical events and Webcasts.
Get products and technologies
- The next generation init system (initng) is a dependency-based approach for init system replacement.
- The Ubuntu upstart system is an event-based approach for init system replacement.
- Bootchart is a performance-analysis and visualization tool for the boot process. It collects performance data during the system initialization process, and then post-processes the data into a time line.
- The einit package is another approach to the initialization scripts that uses Extensible Markup Language (XML) for the configuration file.
- Another interesting init parallelization scheme is Pardus. This approach not only removes the serial nature of Linux boot but also adds flexibility by using the Python language.
- The runit package is a replacement init scheme with service supervision.
- The minit package is a small but complete version of the init system. You can also explore the sysvinit source.
- Order the SEK for Linux, a two-DVD set containing the latest IBM trial software for Linux from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
- With IBM trial software, available for download directly from developerWorks, build your next development project on Linux.
Discuss
- Check out developerWorks blogs and get involved in the developerWorks community.

M. Tim Jones is an embedded software architect and the author of GNU/Linux Application Programming, AI Application Programming, and BSD Sockets Programming from a Multilanguage Perspective. His engineering background ranges from the development of kernels for geosynchronous spacecraft to embedded systems architecture and networking protocols development. Tim is a Consultant Engineer for Emulex Corp. in Longmont, Colorado.



