Administer Linux on the fly
Use the /proc filesystem to get a handle on your system
Anyone who has administered a system of commercial importance knows the value of uptime -- or, conversely, knows the headaches you get from users because of downtime. One of the main reasons a company will run a UNIX server is its reliability and stability. If managed carefully, these servers usually don't need to be restarted for long periods of time. Better still, there are administrative tasks -- even at the kernel level -- that you can perform on the fly, keeping your servers available. While you may still need to restart a system to upgrade hardware or if someone trips over the power cord, it's good to know that many administrative tasks can be performed without disrupting service.
This article includes hints and tips for performing various administrative tasks and changing your system without rebooting. Linux provides various ways to change underlying operating system values and settings while keeping the system up and running. These come in two basic forms: those that are general to all Linux systems and are provided in the Linux kernel, and those that are distribution specific and provided by the vendor. (You can find more information about the Linux kernel and download kernel source at the Linux Kernel Archives; see Related topics for a link.) This article deals with both types.
Note: This article is written for a 2.4-level kernel. Some of the options and features may be different for other kernel versions.
Changing running kernel parameters
Linux provides a really neat way for administrators to change the kernel
while the system is running and without the need to reboot the
kernel/system. This is done with a virtual filesystem called
/proc. Linux Gazette provides one of the
simplest and easiest references on
/proc I have
seen. (See Related topics for a link.) Very basically, the
/proc filesystem gives you a view into the
running kernel, which can be useful for monitoring performance,
discovering system information, finding out how the system is configured,
and changing that configuration. This filesystem is called a virtual
filesystem, because it is not really a filesystem at all. It's
just a map provided by the kernel that is attached to your usual
filesystem structure to give you access to it.
The fact that we have some way of changing the running kernel parameters
while the system is up and running gives the system administrator great
power and flexibility in changing kernel settings. This sort of
implementation was an inspired idea on the part of the Linux kernel
developers. But can too much power be a bad thing? Sometimes. If you are
going to change anything in the /proc
filesystem, you must make sure that you know what you are changing
and what effect this will have on the system. These are really useful
techniques, but a wrong move can give you some rather undesired
consequences. If you are new to this sort of thing or are not sure what
effect one of your changes will have, practice on a machine that is not
important to you or your business.
How to make changes
First, think about how not to make changes to the kernel. There are
two good reasons why you should not just jump into the
/proc filesystem, open a file in your text
editor, make a bunch of changes, and save the file back out again. These
are:
- Data integrity: All of these files represent the running system, and the kernel may change any of them at any time. If you open an editor and change some data while the system is changing it underneath you, whatever you save back is unlikely to be what the kernel is expecting.
- Virtual files: None of these files actually exist on disk, so an editor has nothing to synchronize its saved data with.
The answer to making changes to any of these files, therefore, is not to
use an editor. When making changes to anything at all in the
/proc filesystem, you should use the
echo command and redirect the output from the
command line into your chosen files under
/proc. For example:
echo "Your-New-Kernel-Value" > /proc/your/file
Similarly, if you wish to view information from
/proc, you should either use a command that is
designed for the purpose or use the cat command on the command line.
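The write-then-check pattern described above can be sketched as a small shell function. This is only an illustration: set_proc_value is a hypothetical helper name, not a standard tool, and it assumes you run it as a user who is allowed to write to the target file.

```shell
#!/bin/sh
# Sketch: change a value with echo and redirection, then read it back.
# set_proc_value is a hypothetical helper name, not a standard command.
set_proc_value() {
    file=$1
    value=$2
    if [ ! -w "$file" ]; then
        echo "cannot write to $file (are you root?)" >&2
        return 1
    fi
    # Same technique as above: echo plus redirection, never an editor.
    echo "$value" > "$file" || return 1
    # Read the file back to see the value that is now stored.
    cat "$file"
}
```

As root, a call such as set_proc_value /proc/sys/fs/file-max 16384 would write the new limit and print the value the kernel reports afterward.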
What to change
You do not need to be a kernel hacker to get good use out of
/proc, and a basic understanding of the
structure of this filesystem will aid you greatly. You may find that you
don't need to know about anything in /proc
until the day a user asks you for a certain bit of functionality that
makes you glad you bothered to learn where to look to make changes. The
/proc filesystem helps the system administrator
in this respect via its structure and file permissions.
Each file in
/proc has a very particular set of
file permissions assigned to it and will be owned by a particular user ID.
This is very carefully done so that the correct functionality is presented
to the administrator and to the users. The following list summarizes the
permissions you may find on individual files:
- Read-only: File is not changeable by any user; used for presenting system information
- Root-write: If a file is writeable in /proc, it is usually writeable only by the root user
- Root-read: Some files may not be viewable to normal system users, only to root
- Other: You may find combinations other than the common three, above, for various reasons
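As a rough illustration of the list above, the following sketch maps an octal file mode to one of those categories. The function name classify_proc_mode is hypothetical, and the mapping assumes the file is owned by root, which is an assumption rather than something the kernel guarantees for every entry.

```shell
#!/bin/sh
# Sketch: map an octal mode (as printed by GNU "stat -c %a") to the
# permission categories listed above. Assumes the file's owner is root.
classify_proc_mode() {
    case "$1" in
        444) echo "read-only" ;;    # readable by all, writable by none
        644) echo "root-write" ;;   # readable by all, writable by owner
        400) echo "root-read" ;;    # readable by owner only
        *)   echo "other" ;;        # any other combination
    esac
}
```

On a system with GNU stat, classify_proc_mode "$(stat -c %a /proc/sys/fs/file-max)" shows which category a given file falls into.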
A very broad generalization about
/proc is that
you will find most of it read-only except for the
/proc/sys directory. This directory is the one
that holds most kernel parameters (rather than information) and is the one
that is designed to be changed while the system is running. As a result,
this is the directory that this article mainly looks at.
The last thing to know about learning what to change in
/proc is what you should actually be writing to
these files. You will notice as you look at various files in
/proc that some of them are human readable and
some are data files. The data files can still be read by using specific
utilities designed for that purpose. You
will also notice that the human-readable files take two different formats:
some are binary switches and others contain more information. The binary
switch files only contain a zero (off) or a one (on) for that particular
setting.
Detailing the exact information and usage of each file in
/proc is outside the scope of this article. For
more information about any
/proc files not
discussed in this article, one of the best sources is the Linux kernel
source itself, which contains some very good documentation. The following
files in /proc are the more useful ones for a system
administrator. This is not meant to be an exhaustive treatment but an
easy-access reference for day-to-day use.
One of the most useful things to learn as a system administrator is how to add more disk space if you have hot-swap drives available to you, without rebooting the system. Without using
/proc, you could insert your drive, but you
would then have to reboot in order to get the system to recognize the new
disk. Here, you can get the system to recognize the new drive with the
following command:
echo "scsi add-single-device w x y z" > /proc/scsi/scsi
For this command to work properly, you must get the parameter values w, x, y, and z correct, as follows:
- w is the host adapter ID, where the first adapter is zero (0)
- x is the SCSI channel on the host adapter, where the first channel is zero (0)
- y is the SCSI ID of the device
- z is the logical unit number (LUN), where the first LUN is zero (0)
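Because a mistyped value is easy to miss, it can help to build the command string with a small function that checks the four values first. The sketch below is one way to do that; scsi_command is a hypothetical helper name, and the only validation it performs is that each value is a non-negative integer.

```shell
#!/bin/sh
# Sketch: build the string that is echoed into /proc/scsi/scsi.
# scsi_command is a hypothetical helper; usage:
#   scsi_command add|remove HOST CHANNEL ID LUN
scsi_command() {
    action=$1
    shift
    case "$action" in
        add|remove) ;;
        *) echo "action must be add or remove" >&2; return 1 ;;
    esac
    if [ $# -ne 4 ]; then
        echo "expected four values: host channel id lun" >&2
        return 1
    fi
    for n in "$@"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "not a non-negative integer: $n" >&2
                return 1 ;;
        esac
    done
    echo "scsi $action-single-device $1 $2 $3 $4"
}
```

As root, scsi_command add 0 0 3 0 > /proc/scsi/scsi would then register the device with SCSI ID 3 on the first channel of the first adapter.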
Once your disk has been added to the system, you can mount any previously
formatted filesystems or you can start formatting it, and so on. If you
are not sure about what device the disk will be, or you want to check any
pre-existing partitions, for example, you can use a command such as
fdisk -l, which will report this information
back to you.
Conversely, the command to remove a device from your system without a reboot would be:
echo "scsi remove-single-device w x y z" > /proc/scsi/scsi
Before you enter this command and remove your hot-swap SCSI disk from your system, make sure you have unmounted any filesystems from this disk first.
/proc/sys/fs/file-max
This specifies the maximum number of file handles that can be allocated. You may need to increase this value if users get error messages stating that they cannot open more files because the maximum number of open files has been reached. The limit can be set to any number of files and is changed by writing a new numeric value to the file.
Default setting: 4096
This file is related to file-max and holds three values:
- Number of allocated file handles
- Number of used file handles
- Maximum number of file handles
This file is read-only and for informational purposes only.
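To make the three values easier to read, you could label them and work out how much of the limit is already allocated. The sketch below reads the values from standard input so it can be tried on sample data; report_file_handles is a hypothetical name, and the usage note assumes the file in question is /proc/sys/fs/file-nr, as it is named in the 2.4 kernel documentation.

```shell
#!/bin/sh
# Sketch: label the three values described above and show what
# percentage of the maximum is currently allocated.
report_file_handles() {
    # Fields are separated by spaces or tabs.
    read -r allocated used maximum || return 1
    [ "$maximum" -gt 0 ] || return 1
    echo "allocated=$allocated used=$used max=$maximum"
    # Integer arithmetic: the result is rounded down.
    echo "allocated_pct=$(( allocated * 100 / maximum ))"
}
```

Under that assumption, report_file_handles < /proc/sys/fs/file-nr prints the live figures for the running system.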
Any files starting with the name "inode" will perform the same operation as files starting with the name "file" as above, but perform their operation relative to inodes instead of file handles.
/proc/sys/fs/overflowuid and /proc/sys/fs/overflowgid
These files hold the fixed user ID (UID) and group ID (GID) that are used on filesystems that support only 16-bit user and group IDs. These values can be changed, but if you really do find the need to do this, you might find it easier to change your group and password file entries instead.
Default setting: 65534
This specifies the maximum number of super block handlers. Any filesystem you mount needs to use a super block, so you could possibly run out if you mount a lot of filesystems.
Default setting: 256
This shows the currently allocated number of super blocks. This file is read-only and for informational purposes only.
This holds three configurable values that control when process accounting takes place based on the amount of free space (as a percentage) on the filesystem that contains the log:
- If free space goes below this percentage value then process accounting stops
- If free space goes above this percentage value then process accounting starts
- The frequency (in seconds) at which the other two values will be checked
To change a value in this file, you should echo a space-separated list of numbers.
Default setting: 2 4 30
These values stop accounting if there is less than 2 percent free space on the filesystem that contains the log and start it again if there is 4 percent or more free space. Checks are made every 30 seconds.
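Before writing a new list, it may be worth checking that the three numbers make sense together. The following sketch does that and prints the space-separated list to echo into the file. The name acct_values is hypothetical, and the rule that the stop threshold must be lower than the start threshold is a sanity check assumed here, not something the kernel is known to enforce.

```shell
#!/bin/sh
# Sketch: validate "stop%  start%  check-interval" and print the list.
# The low < high rule is an assumed sanity check, not a kernel rule.
acct_values() {
    low=$1
    high=$2
    freq=$3
    for n in "$low" "$high" "$freq"; do
        case "$n" in
            ''|*[!0-9]*) return 1 ;;   # reject anything that is not a number
        esac
    done
    [ "$low" -le 100 ] && [ "$high" -le 100 ] || return 1
    [ "$low" -lt "$high" ] || return 1
    [ "$freq" -gt 0 ] || return 1
    echo "$low $high $freq"
}
```

The printed list can then be redirected, as root, into the accounting file described above.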
This file holds a binary value that controls how the system reacts when it receives the ctrl+alt+delete key combination. The two values represent:
- A zero (0) value means the ctrl+alt+delete is trapped and sent to the init program. This will allow the system to have a graceful shutdown and restart, as if you typed the shutdown command.
- A one (1) value means the ctrl+alt+delete is not trapped and no clean shutdown will be performed, as if you just turned the power off.
Default setting: 0
This allows you to configure your network domain name. This has no default value and may or may not already be set.
This allows you to configure your network host name. This has no default value and may or may not already be set.
This specifies the maximum size of a message that can be sent from one process to another process. Messages are passed between processes in kernel memory that is not swapped out to disk, so if you increase this value, you will increase the amount of memory used by the operating system.
Default setting: 8192
This specifies the maximum number of bytes in a single message queue.
Default setting: 16384
This specifies the maximum number of message queue identifiers.
Default setting: 16
This represents the amount of time (in seconds) the kernel will wait before rebooting if it reaches a "kernel panic." A setting of zero (0) seconds will disable rebooting on kernel panic.
Default setting: 0
This holds four numeric values that define where logging messages are sent, depending on their importance. For more information on different log levels, read the manpage for syslog(2). The four values of the file are:
- Console Log Level: messages with a higher priority than this value will be printed to the console
- Default Message Log Level: messages without a priority will be printed with this priority
- Minimum Console Log Level: minimum (highest priority) value that the Console Log Level can be set to
- Default Console Log Level: default value for Console Log Level
Default setting: 6 4 1 7
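Since a higher priority means a numerically lower level (0 is the most urgent), the first rule above amounts to a simple comparison. The sketch below applies it to a four-value list; console_would_print is a hypothetical name, and the usage note assumes the file is /proc/sys/kernel/printk, as in the kernel documentation.

```shell
#!/bin/sh
# Sketch: would a message of the given priority reach the console?
# Usage: console_would_print "6 4 1 7" PRIORITY
console_would_print() {
    priority=$2
    # Unquoted on purpose: split the list on spaces or tabs.
    set -- $1
    console_level=$1   # first of the four values
    if [ "$priority" -lt "$console_level" ]; then
        echo yes
    else
        echo no
    fi
}
```

Under that assumption, console_would_print "$(cat /proc/sys/kernel/printk)" 4 tells you whether warning-level messages currently appear on the console.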
This is the total amount of shared memory (in pages) that can be used on the system at any given point.
Default setting: 2097152
This specifies the largest shared memory segment size (in bytes) allowed by the kernel.
Default setting: 33554432
This represents the maximum number of shared memory segments for the whole system.
Default setting: 4096
This activates the System Request Key, if non-zero.
Default setting: 0
This is the maximum number of threads that can be used by the kernel.
Default setting: 2048
This is the time required (in 1/10 seconds) to write a new warning message; other warning messages received during this time will be dropped. This is used to prevent Denial of Service attacks by someone attempting to flood your system with messages.
Default setting: 50 (5 seconds)
This holds a cost value associated with every warning message. The higher the value, the more likely the warning message is to be ignored.
Default setting: 5
This gives the maximum number of packets allowed to queue when an interface receives packets faster than the kernel can process them.
Default setting: 300
This specifies the maximum buffer size allowed per socket.
This is the receive socket buffer's default size (in bytes).
This is the receive socket buffer's maximum size (in bytes).
This is the send socket buffer's default size (in bytes).
This is the send socket buffer's maximum size (in bytes).
All of the IPv4 and IPv6 parameters are fully documented in the kernel source documentation. See the file
Same as IPv4.
/proc/sys/vm/buffermem
This controls the amount of the total system memory (as a percent) that will be used for buffer memory. It holds three values that can be set by writing a space-separated list to the file:
- Minimum percentage of memory that should be used for buffers
- The system will try to maintain this amount of buffer memory when system memory is being pruned because little system memory remains
- Maximum percentage of memory that should be used for buffers
Default setting: 2 10 60
This controls how the system reacts to different levels of free memory. It holds three values that can be set by writing a space-separated list to the file:
- If the number of free pages in the system reaches this minimum limit, only the kernel will be permitted to allocate any more memory.
- If the number of free pages in the system falls below this limit, the kernel will start swapping more aggressively to free memory and maintain system performance.
- The kernel will try to keep this amount of system memory free. Falling below this value will start the kernel swapping.
Default setting: 512 768 1024
This controls how the kernel is allowed to swap memory. It holds three values that can be set by writing a space-separated list to the file:
- Maximum number of pages the kernel tries to free at one time. If you want to increase bandwidth to/from swap, you will need to increase this number.
- Minimum number of times the kernel tries to free a page on each swap.
- The number of pages the kernel can write in one swap. This has the greatest impact on system performance. The larger the value, the more data can be swapped and the less time is spent disk seeking. However, a value that is too large will adversely affect system performance by flooding the request queue.
Default setting: 512 32 8
This does the same job as
/proc/sys/vm/buffermem, but it does it for
memory mapping and generic caching of files.
Making your kernel settings persistent
A handy utility is provided for making changes to any kernel parameters
in the /proc/sys directory. It allows you to
make changes to the running kernel (similarly to the echo and redirection
method used above), but it also has a configuration file that is executed
on system boot. This lets you make changes to the running kernel and add
them to the configuration file so that any changes you make will remain
after a system reboot.
The utility is called
sysctl and is fully
documented in the man pages at sysctl(8). The configuration file for
sysctl is /etc/sysctl.conf, which can be edited and is
documented under sysctl.conf(5).
sysctl treats the files under
/proc/sys as individual
variables that can be changed. So, for example, the file under
/proc/sys that represents the maximum number of
file handles allowed on the system,
/proc/sys/fs/file-max, is represented as
fs.file-max.
This example reveals some oddities in
sysctl notation. Because sysctl can only change
variables under the
/proc/sys directory, that
part of the variable name is missing as the variables are always assumed
to be under that directory. The next change to note is that the directory
separators (slash, /) have changed to periods (dot, .).
There are two simple rules for converting between files in
/proc/sys and variables in
sysctl:
- Drop the /proc/sys from the beginning.
- Swap slashes for dots in the filenames.
These two rules will let you swap any file name in
/proc/sys for any variable name in
sysctl. The general file-to-variable conversion, in both directions, is:
/proc/sys/dir/file --> dir.file
dir1.dir2.file --> /proc/sys/dir1/dir2/file
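The two rules are mechanical enough to express as a pair of shell functions. This is a minimal sketch: proc_to_sysctl and sysctl_to_proc are hypothetical helper names, and the sketch assumes no path component itself contains a dot or a slash.

```shell
#!/bin/sh
# Sketch: convert between /proc/sys paths and sysctl variable names.
# Assumes no path component contains a dot or a slash of its own.
proc_to_sysctl() {
    # Rule 1: drop the leading /proc/sys/.
    name=${1#/proc/sys/}
    # Rule 2: swap slashes for dots.
    printf '%s\n' "$name" | tr '/' '.'
}

sysctl_to_proc() {
    # Reverse: swap dots for slashes and put /proc/sys/ back in front.
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}
```

For example, proc_to_sysctl /proc/sys/fs/file-max prints fs.file-max, and sysctl_to_proc fs.file-max prints the path again.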
You can view all the variables that are available to be changed, along with
their current setting, using the command
sysctl -a.
Variables can also be changed using sysctl,
which does exactly the same job as the echo method used above. This
notation is as follows:
sysctl -w dir.file="value"
Using the file-max example again, we could change this value to 16384 using one of two methods as follows:
sysctl -w fs.file-max="16384"
echo "16384" > /proc/sys/fs/file-max
Don't forget that
sysctl does not add the changes you make
to the configuration file; this is left for you to do manually. If
you want your changes to persist after a reboot, you must maintain this
file yourself.
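One way to keep the configuration file in step is a small function that replaces any existing line for a variable and appends the new setting. The sketch below assumes the common "name = value" line format; persist_sysctl is a hypothetical helper, and the file path is passed in so that you can try it on a copy first.

```shell
#!/bin/sh
# Sketch: record "key = value" in a sysctl-style configuration file,
# replacing any earlier assignment to the same key.
# Usage: persist_sysctl CONF_FILE KEY VALUE
persist_sysctl() {
    conf=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if [ -f "$conf" ]; then
        # Keep every line except an existing assignment to this key.
        awk -F= -v k="$key" '{
            name = $1
            gsub(/[ \t]/, "", name)
            if (name != k) print
        }' "$conf" > "$tmp" || return 1
    fi
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Copy over the original so its ownership and permissions are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

Run as root, persist_sysctl /etc/sysctl.conf fs.file-max 16384 would store the setting used in the example above; comments and unrelated lines are left in place.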
Note: Not all distributions provide
sysctl support. If this is the case for your
particular system, then you can use the echo and redirect method described
above and add these commands to a start-up script so they are executed
every time the system boots.
Commands for setting the system
It is possible to change other non-kernel system parameters while the
system is running and also get these settings to take effect without
rebooting. These can mainly be classified as services, daemons, and
servers that will be listed in the /etc/init.d
directory. Since there is an increasingly wide range of scripts that can
be listed in this directory, it is not possible to go through all the
different configurations here. However, below are a few examples of how
/etc/init.d scripts can be manipulated on
different distributions of Linux. Examples of where changes to a daemon
and a reload of the configuration without rebooting might be useful are:
- Changing your Web server configuration and reloading Apache
- Removing an inetd login service that you don't require
- Manipulating your network settings
- Exporting new filesystems via NFS
- Starting/stopping your firewall
First, the generic way of manipulating system services is directly, via the
scripts in /etc/init.d. These scripts take
parameters to manipulate the services that they control; you can type the
script name without any parameters to see what the valid options are.
Common parameters are:
- start: Starts a stopped service
- stop: Stops a running service
- restart: Stops and then starts a running service; will start a stopped service
- reload: Reloads service configuration without breaking any connection(s)
- status: Outputs whether the service is running or not
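Scripts of this kind usually dispatch on their first parameter with a case statement. The skeleton below is only a sketch of that idea, not a copy of any distribution's script: "demo" is a made-up service, and a state file stands in for a real daemon process.

```shell
#!/bin/sh
# Sketch of an init-style control function for a hypothetical "demo"
# service. A state file stands in for a real running daemon.
STATE_FILE=${STATE_FILE:-/tmp/demo-service.state}

demo_service() {
    case "$1" in
        start)
            echo running > "$STATE_FILE"
            echo "Starting demo" ;;
        stop)
            rm -f "$STATE_FILE"
            echo "Stopping demo" ;;
        restart)
            # Stop, then start; also starts a service that was stopped.
            demo_service stop
            demo_service start ;;
        reload)
            # A real script would ask the daemon to reread its config.
            echo "Reloading demo configuration" ;;
        status)
            if [ -f "$STATE_FILE" ]; then
                echo "demo is running"
            else
                echo "demo is stopped"
            fi ;;
        *)
            # No parameter, or an unknown one: list the valid options.
            echo "Usage: demo {start|stop|restart|reload|status}"
            return 1 ;;
    esac
}
```

Calling the function with no parameter prints the usage line, which mirrors how real scripts list their valid options.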
As an example, the following command would reload your xinetd configuration without terminating any connected user's sessions (useful if you make a change to /etc/xinetd.conf):
/etc/init.d/xinetd reload
Red Hat provides a command,
service, that will
manipulate services for you. The service
command provides the same functionality as typing the script name itself.
The syntax is as follows:
service script-name [parameter]
service xinetd reload
SuSE also provides a command called
rc. This is
similar to the
service command above, but has
no space between the command and the script name. The syntax is as
follows:
rcscript-name [parameter]
Similarly to changing kernel parameters, once you reboot your system, any
changes made to services will be lost. More and more distributions are
adopting the use of the chkconfig command,
which manages the services that are started at various run levels
(including on boot). At the time of this writing, the
chkconfig command syntax differs slightly on
different versions of Linux, but if you enter the command
chkconfig without any parameters, you will get
a list of how to use it. More information about
chkconfig can also be found via the man pages.
Configuring the Linux kernel on the fly using the
/proc filesystem isn't to be taken lightly, but
once you understand its structure and how to manipulate the various files
and parameters, you've gained the use of a powerful tool for keeping your
servers available around the clock.
I would like to thank Mr. Adrian Fewings for proofreading this article.
- Learn about the Linux kernel at the Linux Kernel Archives.
- Refer to the kernel documentation in the Documentation directory where you installed the kernel source.
Read the man pages for
- Find more Linux documents at the Linux Documentation Project homepage.
- "Understanding Linux configuration files" (developerWorks, December 2001) gives an overview of files that control permissions, system applications, daemons, and more.
- If you want to know what "high availability" means from a mainframe perspective, read the IBM Redpaper "Linux on IBM zSeries and S/390: High Availability for z/VM and Linux."
- In the developerWorks Linux zone, find more resources for Linux developers, and scan our most popular articles and tutorials.
- See all Linux tips and Linux tutorials on developerWorks.