Parallelize applications for faster Linux booting

Working with initng and upstart

M. Tim Jones (mtj@mtjones.com), Consultant Engineer, Emulex
M. Tim Jones is an embedded software architect and the author of GNU/Linux Application Programming, AI Application Programming, and BSD Sockets Programming from a Multilanguage Perspective. His engineering background ranges from the development of kernels for geosynchronous spacecraft to embedded systems architecture and networking protocols development. Tim is a Consultant Engineer for Emulex Corp. in Longmont, Colorado.

Summary:  One of the biggest complaints about Linux®, particularly from developers, is the speed with which Linux boots. By default, Linux is a general-purpose operating system that can serve as a client desktop or server right out of the box. Because of this flexibility, Linux serves a wide base but is suboptimal for any particular configuration. This article shows you options to increase the speed with which Linux boots, including two options for parallelizing the initialization process. It also shows you how to visualize graphically the performance of the boot process.

Date:  07 Mar 2007
Level:  Intermediate

A common complaint about GNU/Linux (other than its lack of a reasonable kernel debugger) is the amount of time the operating system takes to start. You could sum up this process as booting, but in fact several independent tasks are involved to evolve from a cold system to one that you can interact with through a shell or window manager. Let's review the Linux boot and initialization process.

Major stages of Linux boot

While booting Linux involves many steps, you can partition the process into three basic steps that I call BIOS, kernel boot, and system initialization, as shown in Figure 1.


Figure 1. A temporal view of the Linux boot process

BIOS

When you first turn on a computer or reset it, the computer's processor begins execution at a well-known location in what's called the basic input/output system (BIOS). BIOS is typically stored in a flash memory device on the system's motherboard. The BIOS has many jobs, such as initial testing of basic components (such as the system's memory) and determining how to boot the operating system. As PC-based computers are extremely flexible, the boot device can be one of many individual devices attached to the motherboard, including hard disks, CD-ROMs, or other devices such as the network interface.

You can optimize the process of determining the boot device by selecting the device from which you'll most commonly boot (typically, the hard disk). But by far the most time-consuming aspect of the BIOS stage is the memory test. Disabling certain aspects of this test (such as a full memory test) can certainly help boot speed but at the cost of a boot-time system integrity test.

Kernel boot

When a boot device is found, the Linux kernel boot process begins. This process occurs in (approximately) two stages -- first-stage boot and second-stage boot. The first stage consists of a simple boot loader (found on the boot device's master boot record, or MBR), whose job is to load the second-stage boot loader. The first-stage boot loader finds the second-stage boot loader using the partition table. The first-stage boot loader scans the table, looking for the active partition; when the loader locates the partition, it loads the second-stage boot loader into RAM and invokes it.

With the second-stage boot loader in RAM, the Linux kernel image and the initial RAM disk image (initrd) are loaded into RAM. When the kernel is invoked, it decompresses itself into high memory and copies the initrd for later mounting and use.

LILO and GRUB

The first-stage and second-stage boot loaders are better known as LInux LOader (LILO) or GRand Unified Bootloader (GRUB), depending on which boot loader your system uses.

The kernel boot process is fairly complicated but very fast, as most of the code is written in the system's machine language. At the end of the kernel boot sequence, the init process starts. As init is the first process created in a Linux system, it's the mother of all other processes (all processes are descendants of init).

System init

The init process -- the focus of this article -- is the first process created as the kernel boot sequence is completed. Linux uses the init process to initialize the services and applications that make Linux useful.

When the init process starts, it opens a file called /etc/inittab. This file is the configuration file for init and defines how to initialize the system. This file also contains information about what to do when a power failure occurs (if the system supports it) and how to react when it detects the Ctrl-Alt-Delete key sequence. Look at the short segment of this file shown in Listing 1 to understand what it provides.

The inittab configuration file defines several entries with a common format: id:runlevels:action:process. The id is a sequence of characters that uniquely identifies the entry. The runlevels define the runlevels for which the action should be taken. The action specifies the particular action to take. Finally, process defines the process to be executed.


Listing 1. Excerpt from the inittab file
                
# The default runlevel
id:2:initdefault:

# Boot-time system configuration/initialization script
si::sysinit:/etc/init.d/rcS

# Runlevels
l0:0:wait:/etc/init.d/rc 0
l1:1:wait:/etc/init.d/rc 1
l2:2:wait:/etc/init.d/rc 2
l3:3:wait:/etc/init.d/rc 3
l4:4:wait:/etc/init.d/rc 4
l5:5:wait:/etc/init.d/rc 5
l6:6:wait:/etc/init.d/rc 6
z6:6:respawn:/sbin/sulogin

# How to react to ctrl-alt-del
ca:12345:ctrlaltdel:/sbin/shutdown -t1 -a -r now
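
To make the id:runlevels:action:process format concrete, here is a minimal Python sketch that splits inittab-style lines into their four fields and looks up the default runlevel. The names InittabEntry, parse_inittab, and default_runlevel are illustrative assumptions, not part of init, and the sketch ignores details that the real init handles.

```python
from typing import List, NamedTuple, Optional


class InittabEntry(NamedTuple):
    id: str
    runlevels: str
    action: str
    process: str


def parse_inittab(text: str) -> List[InittabEntry]:
    """Split inittab-style text into entries, skipping comments and blanks."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first three colons only, so the process field
        # keeps any colons or spaces it contains.
        fields = line.split(":", 3)
        # Pad short lines so a missing trailing field becomes "".
        fields += [""] * (4 - len(fields))
        entries.append(InittabEntry(*fields))
    return entries


def default_runlevel(entries: List[InittabEntry]) -> Optional[str]:
    """Return the runlevels field of the initdefault entry, if any."""
    for entry in entries:
        if entry.action == "initdefault":
            return entry.runlevels
    return None


SAMPLE = """\
# The default runlevel
id:2:initdefault:

si::sysinit:/etc/init.d/rcS
l2:2:wait:/etc/init.d/rc 2
ca:12345:ctrlaltdel:/sbin/shutdown -t1 -a -r now
"""

entries = parse_inittab(SAMPLE)
print(default_runlevel(entries))                       # 2
print([e.id for e in entries if "2" in e.runlevels])   # ['id', 'l2', 'ca']
```

The second line of output lists the entries that apply in runlevel 2; the sysinit entry has an empty runlevels field, so it is not matched by this simple test.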

Init and telinit

You can communicate with the init process by using the telinit utility (which is a link to the init utility). For example, if you're in multi-user mode (runlevel 2) and want to go to single-user mode (runlevel 1), simply use the command telinit 1 (in super-user mode).

After init loads /etc/inittab, it brings the system up to the runlevel that the initdefault action defines. As Listing 1 showed, this is runlevel 2. Think of a runlevel as the state of the system. For example, runlevel 0 defines the system halt state, and runlevel 1 is single-user mode. Runlevels 2 through 5 are multi-user states, and runlevel 6 indicates reboot. (Note that some distributions differ on the runlevel representations.) Another way to think of a runlevel is as a way to define which processes may execute (processes that define the state of the system).

Note: To see the current runlevel of your system, use the command runlevel.

As defined in Listing 1, initdefault specifies that the default init level is 2 (multi-user mode). After the initial runlevel is defined, the script rc is invoked with the argument 2 (the runlevel) to bring the system up. This script then invokes a variety of service and application scripts to start or stop each element. In this case, the scripts are defined in /etc/rc2.d/. For example, if the MySQL application were to be started (such as at system startup), it would be invoked as /etc/rc2.d/S20mysql start. When the system is shut down, the same set of scripts is invoked with the stop argument.
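
The serial ordering that rc imposes can be sketched in a few lines of Python. This is a simplified model, not the rc script itself: it assumes the common convention that script names begin with S or K followed by a two-digit sequence number, and that K scripts run (with stop) before S scripts (with start). The function name rc_plan and all script names other than S20mysql are made up for illustration.

```python
import re
from typing import List, Tuple

# Names look like S20mysql or K11atd: S or K, a two-digit order, a service name.
_RC_NAME = re.compile(r"^([SK])(\d\d)(.+)$")


def rc_plan(script_names: List[str]) -> List[Tuple[str, str]]:
    """Return (script, argument) pairs in the order a serial rc run would use.

    K scripts come first with "stop", then S scripts with "start". Each
    group is in lexical order, so the two-digit number sets the sequence.
    """
    kills, starts = [], []
    for name in sorted(script_names):
        match = _RC_NAME.match(name)
        if not match:
            continue  # ignore files that do not follow the naming scheme
        if match.group(1) == "K":
            kills.append((name, "stop"))
        else:
            starts.append((name, "start"))
    return kills + starts


plan = rc_plan(["S91apache2", "S20mysql", "README", "K11atd", "S10sysklogd"])
for script, argument in plan:
    print(script, argument)
```

Every entry in the plan runs only after the previous one has finished, which is exactly the behavior the parallel init replacements try to avoid.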

Altering the init process

Altering the initialization process is quite simple. At boot time (using LILO or GRUB), specify a new process to start to handle system initialization. Specify init=/sbin/mynewinit as part of the kernel boot line to invoke that process instead of the default init. You can see this in the kernel source in ./linux/init/main.c. If you provide an init command on the kernel boot line, it is used. If not, the kernel attempts to start one of four alternatives (the first being /sbin/init).
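
The fallback logic can be modeled as follows. This Python sketch is an approximation of the behavior described above, not kernel code: the candidate list after /sbin/init (/etc/init, /bin/init, /bin/sh) is taken from my reading of 2.6-era main.c and should be checked against your kernel source, and the function names and the exists parameter are assumptions made so the example can run without touching the real file system.

```python
import os
from typing import Callable, Optional

# Defaults tried in order when no usable init= is given (assumed from
# 2.6-era init/main.c; verify against your own kernel tree).
FALLBACK_INITS = ("/sbin/init", "/etc/init", "/bin/init", "/bin/sh")


def init_from_cmdline(cmdline: str) -> Optional[str]:
    """Return the value of the init= argument on a kernel boot line, if any."""
    for token in cmdline.split():
        if token.startswith("init="):
            return token[len("init="):]
    return None


def choose_init(cmdline: str,
                exists: Callable[[str], bool] = os.path.exists) -> Optional[str]:
    """Pick the first candidate that exists: init= first, then the defaults."""
    candidates = []
    requested = init_from_cmdline(cmdline)
    if requested:
        candidates.append(requested)
    candidates.extend(FALLBACK_INITS)
    for path in candidates:
        if exists(path):
            return path
    return None  # the kernel would panic at this point


# A pretend file system so the example is deterministic.
fake_fs = {"/sbin/init", "/sbin/mynewinit", "/bin/sh"}
print(choose_init("root=/dev/sda1 ro init=/sbin/mynewinit", fake_fs.__contains__))
print(choose_init("root=/dev/sda1 ro", fake_fs.__contains__))
print(choose_init("root=/dev/sda1 ro init=/sbin/missing", fake_fs.__contains__))
```

The three calls print /sbin/mynewinit, /sbin/init, and /sbin/init: a missing init= target falls through to the defaults.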

In the end, many scripts are executed serially to start the various services that are required (which you can typically see as part of the boot screen with Linux). Even when services are unrelated to one another, they're still started one after the other. The result is that this process can take time (especially in a large system with many services).

An obvious solution to this problem is to remove the serial nature of the init process and replace it with something that operates more in parallel. You can find this idea in more places than just multi-processing systems. For example, socket striping, or using two or more sockets to move data in parallel, is a solution based on this theme. Redundant array of independent disks (RAID) systems also stripe data across disks (typically in parallel) to increase I/O performance.


Init daemon replacements

Simple init optimization

The simplest way to optimize the init process is to disable unnecessary services. For example, if you're running a desktop (rather than a server), you could disable services such as apache, sendmail, and mysql, thereby shortening the init sequence.

Because the traditional init process (sysvinit) is a serial process, this portion of the system is ripe for optimization. In fact, you can use any of several approaches to optimize the init process. Let's look at two of these approaches and how they solve the problem. The first approach is dependency based (that is, it uses dependencies to provide the parallelization), and the second is an event-based system (that is, processes depend on events to indicate when they can start or stop).


Initng

The first option, initng (for init next generation), is a full replacement for init that starts processes asynchronously to complete the init process more quickly. At the time of this writing, initng is a beta product created by Jimmy Wennlund.

The fundamental idea behind initng is that services are started as soon as their dependencies are met. This system results in a better balance of CPU versus I/O. While one script is being loaded from disk or waiting for a hardware device to start, another script can be running to start another service.
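
A rough way to see the potential gain is to compare the serial boot time (the sum of all start-up times) with the time needed if each service starts the moment its dependencies finish. The Python sketch below does that arithmetic. The service names, the durations in seconds, and the assumption of unlimited parallelism are all hypothetical; real systems are limited by CPU and disk contention.

```python
from typing import Dict, List


def serial_time(durations: Dict[str, int]) -> int:
    """Total time when services start one after another."""
    return sum(durations.values())


def parallel_time(durations: Dict[str, int], needs: Dict[str, List[str]]) -> int:
    """Finish time when each service starts as soon as its needs are met."""
    finish: Dict[str, int] = {}

    def finish_of(name: str, path: tuple = ()) -> int:
        if name in finish:
            return finish[name]
        if name in path:
            raise ValueError("circular dependency involving " + name)
        # A service can start once its slowest dependency has finished.
        start = max(
            (finish_of(dep, path + (name,)) for dep in needs.get(name, [])),
            default=0,
        )
        finish[name] = start + durations[name]
        return finish[name]

    return max((finish_of(name) for name in durations), default=0)


# Hypothetical start-up times in seconds.
durations = {"mount": 2, "network": 3, "syslog": 4, "mail": 1}
needs = {"network": ["mount"], "syslog": ["mount"], "mail": ["network"]}

print(serial_time(durations))           # 10
print(parallel_time(durations, needs))  # 6
```

In this made-up case the parallel boot is bounded by the longest dependency chain (mount followed by syslog, or mount, network, and mail), not by the total amount of work.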

How initng works

As a dependency-based solution, initng uses its own set of initialization scripts that encode the service and daemon dependencies. An example is shown in Listing 2. This script specifies a service that is to be started for the given runlevel. The service has two dependencies, as defined by the need keyword, for system/initial and net/all. These services must be available before system/my_service can be started. When these services are available, the exec keyword comes into play. The exec keyword (with the start option) defines how to start the service, with any available options. When the service is to stop, the exec keyword with the stop option is used.


Listing 2. Defining a service for initng
                
service system/my_service {

  need = system/initial net/all;

  exec start = /sbin/my_service --start --option;
  exec stop = /sbin/my_service --stop --option;

}

You can encode an entire system with service definitions, as shown in Listing 2. Those without dependencies can then be started immediately (and in parallel), while those that have dependencies must wait to start safely. You can think of initng as a goals-based system. The goals are the services to be started. No explicit planning occurs; instead, the dependencies simply define the flow of service initiation, with parallelization implicit in the process.
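
The way dependencies alone determine the start order can be sketched in Python. This is a simplification rather than initng's algorithm: it groups services into "waves" that could start together, whereas a real dependency-based init starts each service individually as soon as its own needs are met. The function name start_waves, the daemon/syslogd service, and the assumption that net/all needs system/initial are illustrative only.

```python
from typing import Dict, List


def start_waves(needs: Dict[str, List[str]]) -> List[List[str]]:
    """Group services into waves; every service in a wave can start in parallel."""
    pending = {name: set(deps) for name, deps in needs.items()}
    started = set()
    waves = []
    while pending:
        # A service is ready when all of its needs have already started.
        ready = sorted(name for name, deps in pending.items() if deps <= started)
        if not ready:
            raise ValueError("unmet or circular dependencies: %s" % sorted(pending))
        waves.append(ready)
        started.update(ready)
        for name in ready:
            del pending[name]
    return waves


needs = {
    "system/initial": [],
    "net/all": ["system/initial"],            # assumed dependency
    "daemon/syslogd": ["system/initial"],     # hypothetical service
    "system/my_service": ["system/initial", "net/all"],
}

for number, wave in enumerate(start_waves(needs), start=1):
    print(number, wave)
```

Here system/initial starts alone, net/all and daemon/syslogd can then start in parallel, and system/my_service starts last because it needs net/all.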

Using initng

The initng package is relatively simple to install for typical uses. For systems that use non-standard packages (those not present in the default configuration), some assembly may be required.

A typical installation of initng requires the initng distribution (source or binary) and the ifiles distribution. You can build the initng distribution with ./configure, make, and make install. You must build the ifiles (which are the script files) with cmake. Depending on your system requirements, you may be required to create new service/daemon definitions (although it's likely that someone in the initng community has done so already). You must then modify the LILO or GRUB configuration to point to the new /sbin/initng.

To control initng, you use ngc (as compared to telinit with traditional init). The syntax differs somewhat, but the capabilities remain the same.


Upstart

Upstart, another option for replacing init, takes a somewhat different approach from the one you just saw with initng. Upstart is an event-based init replacement, which means that the starting and stopping of services is based on the communication of events. Upstart is being developed for the Ubuntu distribution by Scott James Remnant but is intended as a general replacement for init with any Linux distribution.

How upstart works

Upstart requires that you update the initialization scripts to support the event-based mode of operation. Upstart provides its own init process, which starts at system start (as in all other approaches). First, init emits the startup event, one of its two principal core events: init emits startup when the system starts and shutdown when the system is about to shut down. Other core events include ctrlaltdel, which indicates that you pressed Ctrl-Alt-Delete, and kbdrequest, which indicates that you pressed the Alt-Up arrow key combination.

You can create new events for other uses. For example, you can create an arbitrary event called myevent and indicate its receipt by using the echo command. Take the following short job:

on myevent
exec echo myevent received
console output

This code specifies that the job is triggered when the myevent event is received. The code then performs the actions specified (emitting text to the console). With this file present in the upstart configuration (/etc/event.d), you can trigger it using the initctl utility:

initctl emit myevent

The script files for upstart work similarly to the traditional rc init files, except that they operate autonomously based on asynchronous events. Listing 3 provides a simple example script that accepts three events: startup, which causes the job to start, and shutdown and runlevel-3, which cause the job to stop. The shell executes the contents of the script portion of the job (using the -e option to terminate the script on error).


Listing 3. Simplified upstart script for the sysvinit rc 2 script
                
start on startup
stop on shutdown
stop on runlevel-3

script
	set $(runlevel --set 2 || true)
	exec /etc/init.d/rc 2
end script
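
How a job like the one in Listing 3 reacts to events can be modeled with a small Python class. This is a sketch of the idea only: the class name Job, its handle method, and the simple running flag are assumptions, the parser understands just the "start on" and "stop on" lines shown above, and upstart's real job states are richer than a single flag.

```python
class Job:
    """Track whether a job is running, based on its start/stop events."""

    def __init__(self, name: str, definition: str) -> None:
        self.name = name
        self.start_on = set()
        self.stop_on = set()
        self.running = False
        for line in definition.splitlines():
            words = line.split()
            # Only "start on <event>" and "stop on <event>" are understood;
            # everything else (such as the script body) is ignored here.
            if len(words) == 3 and words[:2] == ["start", "on"]:
                self.start_on.add(words[2])
            elif len(words) == 3 and words[:2] == ["stop", "on"]:
                self.stop_on.add(words[2])

    def handle(self, event: str) -> bool:
        """Apply one event and return whether the job is now running."""
        if event in self.stop_on and self.running:
            self.running = False
        elif event in self.start_on and not self.running:
            self.running = True
        return self.running


LISTING_3 = """\
start on startup
stop on shutdown
stop on runlevel-3

script
    set $(runlevel --set 2 || true)
    exec /etc/init.d/rc 2
end script
"""

job = Job("rc2", LISTING_3)
states = [job.handle(e) for e in ("myevent", "startup", "runlevel-3", "startup", "shutdown")]
print(states)  # [False, True, False, True, False]
```

An unrelated event such as myevent leaves the job untouched, while startup and either of the stop events toggle it.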

The initctl utility offers functionality similar to telinit but with some additional features specific to upstart. As you saw above, you can use initctl with the emit option to generate an event to upstart. The list option gives you insight into the operation of the system by identifying the state of the jobs. It tells you which are currently waiting and which are active. The initctl utility can also display the events that are received for debugging purposes.

Upstart is an interesting replacement for init and has some distinct advantages over it. There really is no reason for runlevels any longer, as a system will boot as far as it can go with the available hardware. Any hardware that's not present will not trigger the jobs that would require it. Upstart also handles hot-plugging devices well. For example, if you plugged a PCMCIA network card in long after the system was booted, the network-interface-added event would be generated. This event would cause the Dynamic Host Configuration Protocol (DHCP) job to configure it, generating a network-interface-up event. When a default route was assigned to the new interface, a default-route-up event would result. From here, jobs that required a network interface (such as a mail server or Web server) would start automatically (and stop, if the interface disappeared).
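
The hot-plug scenario above is a chain of events in which each job's completion emits the next event. The Python sketch below models that cascade with a simple queue. The event names come from the text; the job names (dhcp, routing, mail-server, web-server), the run_events function, and the max_events guard are hypothetical and only illustrate the flow.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Each event maps to the jobs it triggers and the event each job emits when done.
JobTable = Dict[str, List[Tuple[str, Optional[str]]]]


def run_events(first_event: str, jobs: JobTable,
               max_events: int = 100) -> List[Tuple[str, str]]:
    """Process events in order and return (event, job) pairs as they trigger."""
    queue = deque([first_event])
    log = []
    handled = 0
    while queue:
        handled += 1
        if handled > max_events:
            raise RuntimeError("too many events: possible cycle in job definitions")
        event = queue.popleft()
        for job_name, emits in jobs.get(event, []):
            log.append((event, job_name))
            if emits is not None:
                queue.append(emits)
    return log


jobs = {
    "network-interface-added": [("dhcp", "network-interface-up")],
    "network-interface-up": [("routing", "default-route-up")],
    "default-route-up": [("mail-server", None), ("web-server", None)],
}

for event, job_name in run_events("network-interface-added", jobs):
    print(event, "->", job_name)
```

If the network card is never plugged in, the first event is never emitted and none of these jobs run, which is the behavior the text describes for missing hardware.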

Using upstart

Building and installing upstart is simple and follows the typical configure, make, and make install pattern. Upstart provides a set of example jobs that are compatible with the typical init configuration runlevels. As with initng, new applications must have their own jobs written based on their requirements (with the potential for adding new events). In either case, deploying a new init system carries some risk. But the advantages that upstart provides certainly outweigh the risks and additional work that may be necessary.

As illustrated above, the initctl utility provides the functionality that one would expect of telinit. But initctl also provides additional functionality for tracing and debugging.


Other options

The two options this article explores -- initng and upstart -- are not the only two games in town. You can also find init replacements such as runit, pardus, minit, and einit. All have supporters and some amount of momentum in the Linux community. At this point, upstart is probably the one to watch, because it has been adopted as the init replacement for the popular Ubuntu distribution. See Resources for more information.


Monitoring init performance with bootchart

As you change the landscape of the system boot process, it's useful to understand what changed and how it affects the overall time to boot. Ziga Mahkovec has built a very useful tool called bootchart to visualize the makeup of the boot process. This tool consists of several elements, including a data logger utility and a visualization utility.

The data logger (bootchartd) runs in the place of the init process (usually, specified in the grub or lilo.conf files). After bootchartd has initialized, it surrenders control back to the real init process (typically, /sbin/init). Bootchartd is essentially a profiler that samples the environment at a periodic interval (by default, once every 200 ms). By sampling the environment, I mean that it reads the current CPU statistics, I/O and idle times, disk usage, and information about every active process (through the proc file system). This data is stored in a temporary file (/var/log/bootchart.tgz) for later post-processing.
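
The kind of sampling described here can be illustrated with the aggregate cpu line of /proc/stat. The Python sketch below is not bootchartd's code: it only shows how two samples of that line, taken some interval apart, yield a CPU busy fraction. The function names are assumptions, and the field order (user, nice, system, idle, iowait, irq, softirq, steal) follows the proc documentation as I understand it; iowait is treated as idle time here.

```python
from typing import Tuple


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (total, idle) tick counts from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("not an aggregate cpu line: %r" % line)
    # Keep at most the first eight counters (user through steal); any later
    # guest counters are left out because they overlap with user time.
    values = [int(v) for v in fields[1:9]]
    if len(values) < 4:
        raise ValueError("too few counters in: %r" % line)
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    return sum(values), idle


def busy_fraction(previous: str, current: str) -> float:
    """Fraction of elapsed ticks spent busy between two samples."""
    total0, idle0 = parse_cpu_line(previous)
    total1, idle1 = parse_cpu_line(current)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return (elapsed - (idle1 - idle0)) / elapsed


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the aggregate cpu line (Linux only; not called in this example)."""
    with open(path) as stat:
        return stat.readline()


# Two made-up samples, as if taken one interval apart.
before = "cpu  100 0 50 800 50 0 0 0"
after = "cpu  130 0 60 850 60 0 0 0"
print(busy_fraction(before, after))  # 0.4
```

On a live system you would call read_cpu_line twice with a pause between the calls (for example, the 200 ms interval mentioned above) and pass both results to busy_fraction.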

Bootchart then uses a post-processing tool to transform the raw data into a boot chart. This process can occur locally using a Java™ application (part of the bootchart distribution), but an easier method is through a Web form located at the bootchart home page. An example piece of a boot chart is shown in Figure 2. Note that these charts tend to be quite large (depending on the services and applications started). For links to complete examples, see Resources.


Figure 2. Snippet of a boot chart created by bootchartd

Summary

As with Linux itself, there are plenty of options and lots of flexibility for boot-time optimization. From dependency-based solutions like initng to event-based solutions like upstart, there's an optimization solution that should fit your needs. Using the bootchart package, you can dig in further to understand where your system is spending its boot time and optimize even more.


Resources

Learn

Get products and technologies

  • The next generation init system (initng) is a dependency-based approach for init system replacement.

  • The Ubuntu upstart system is an event-based approach for init system replacement.

  • Bootchart is a performance-analysis and visualization tool for the boot process. It collects performance data during the system initialization process, and then post-processes the data into a time line.

  • The einit package is another approach to the initialization scripts that uses Extensible Markup Language (XML) for the configuration file.

  • Another interesting init parallelization scheme is Pardus. This approach not only removes the serial nature of Linux boot but also adds flexibility by using the Python language.

  • The runit package is a replacement init scheme with service supervision.

  • The minit package is a small but complete version of the init system. You can also explore the sysvinit source.

  • Order the SEK for Linux, a two-DVD set containing the latest IBM trial software for Linux from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

  • With IBM trial software, available for download directly from developerWorks, build your next development project on Linux.
