Anyone who has administered a system of commercial importance knows the value of uptime -- or, conversely, the complaints from users that downtime brings. One of the main reasons a company runs a UNIX server is its reliability and stability. If managed carefully, these servers can usually run for long periods without a restart. Better still, there are administrative tasks -- even at the kernel level -- that you can perform on the fly, keeping your servers available. While you may still need to restart a system to upgrade hardware or because someone trips over the power cord, it's good to know that many administrative tasks can be performed without disrupting service.
This article includes hints and tips for performing various administrative tasks and changing your system without rebooting. Linux provides various ways to change underlying operating system values and settings while keeping the system up and running. These come in two basic forms: those that are general to all Linux systems and are provided in the Linux kernel (you can find more information about the Linux kernel and download kernel source at the Linux Kernel Archives; see Resources for a link), and those that are distribution-specific and provided by the vendor. This article deals with both types.
Note: This article is written for a 2.4-level kernel. Some of the options and features may be different for other kernel versions.
Changing running kernel parameters
Linux provides a neat way for administrators to change kernel settings
while the system is running, without the need to reboot. This is done
with a virtual filesystem called
/proc. Linux Gazette provides one of the
simplest and easiest references on /proc I have
seen. (See Resources for a link.) Very basically,
the /proc filesystem gives you a view into the
running kernel, which can be useful for monitoring performance,
discovering system information, finding out how the system is configured,
and changing that configuration. It is called a virtual filesystem
because it is not stored on any disk: it is an interface provided by the
kernel and attached to your usual directory tree so that you can read and
change kernel data as if it were held in ordinary files.
Being able to change kernel parameters while the system is up and running
gives the system administrator great power and flexibility. This
implementation was an inspired idea on the part of the Linux kernel
developers. But can too much power be a bad thing? Sometimes. Before you
change anything in the /proc
filesystem, make sure that you know what you are changing
and what effect it will have on the system. These are really useful
techniques, but a wrong move can have undesired
consequences. If you are new to this sort of thing or are not sure what
effect one of your changes will have, practice on a machine that is not
important to you or your business.
First, think about how not to make changes to the kernel. There are
two good reasons why you should not just jump into the
/proc filesystem, open a file in your text
editor, make a bunch of changes, and save the file back out again. These
are:
- Data integrity: These files represent the running system, and the kernel may change any of them at any time. If you open one in an editor and the kernel changes it underneath you, whatever you save back is unlikely to be what the kernel expects.
- Virtual files: None of these files actually exist on disk, so there is no defined way for data saved by an editor to be synchronized with the kernel.
The answer to making changes to any of these files, therefore, is not to
use an editor. When making changes to anything at all in the
/proc filesystem, you should use the
echo command and redirect the output from the
command line into your chosen files under
/proc. For example:
echo "Your-New-Kernel-Value" > /proc/your/file
Similarly, if you wish to view information from
/proc, you should either use a command that is
designed for the purpose or use the command line
cat command.
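The echo-and-cat approach described above can be wrapped in a small helper. This is only a sketch: proc_set is a hypothetical function name, not a standard command, and it assumes the target file accepts a single value written in one go.

```sh
#!/bin/sh
# Sketch: change a /proc value with echo and confirm it with cat.
# proc_set is a hypothetical helper name, not a standard tool.
proc_set() {
    file=$1
    value=$2
    if [ ! -e "$file" ]; then
        echo "proc_set: $file does not exist" >&2
        return 1
    fi
    # Remember the old value so the change can be reported (or undone by hand).
    old=$(cat "$file") || return 1
    # One echo with redirection hands the new value over in a single write.
    echo "$value" > "$file" || return 1
    echo "$file: $old -> $(cat "$file")"
}
```

Run as root, a call such as proc_set /proc/sys/fs/file-max 16384 would print the old and new values on one line.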
You do not need to be a kernel hacker to get good use out of
/proc, and a basic understanding of the
structure of this filesystem will aid you greatly. You may find that you
don't need to know about anything in /proc,
until the day a user asks you for a certain bit of functionality that
makes you glad you bothered to learn where to look to make changes. The
/proc filesystem helps the system administrator
in this respect via its structure and file permissions.
Each file in /proc has a very particular set of
file permissions assigned to it and will be owned by a particular user ID.
This is very carefully done so that the correct functionality is presented
to the administrator and to the users. The following list summarizes what
particular permissions may do on individual files:
- Read-only: File is not changeable by any user; used for presenting system information
- Root-write: If a file is writeable in /proc, it is usually writeable only by the root user
- Root-read: Some files may not be viewable to normal system users, only to root
- Other: You may find combinations other than the common three, above, for various reasons
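To see which of these categories a given file falls into, you can inspect its mode. The sketch below assumes GNU stat (the -c option); proc_access_class and its output labels are hypothetical names chosen for illustration. It looks only at the permission bits, so check the owner separately with ls -l.

```sh
#!/bin/sh
# Sketch: classify a file by its permission string, as printed by GNU stat.
# proc_access_class and the labels it prints are illustrative only.
proc_access_class() {
    mode=$(stat -c '%A' "$1") || return 1
    case $mode in
        ?r-?r-?r-?) echo "read-only" ;;    # readable by all, writeable by none
        ?rw?r-?r-?) echo "owner-write" ;;  # owner (usually root) may write
        ?r-?------) echo "owner-read" ;;   # only the owner may read
        *)          echo "other" ;;        # any other combination
    esac
}
```

For example, proc_access_class /proc/sys/fs/file-max would be expected to report owner-write on a typical system, but verify this on your own machine.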
A very broad generalization about /proc is that
you will find most of it read-only except for the
/proc/sys directory. This directory is the one
that holds most kernel parameters (rather than information) and is the one
that is designed to be changed while the system is running. As a result,
this article concentrates mainly on this directory.
The last thing you need to know before changing anything in
/proc is what you should actually write to
these files. You will notice as you look at various files in
/proc that some of them are human readable and
some are data files. The data files can still be read by using specific
utilities such as top,
lspci, and free. You
will also notice that the human-readable files take two different formats:
some are binary switches and others contain more information. The binary
switch files only contain a zero (off) or a one (on) for that particular
kernel feature.
Detailing the exact information and usage of each file in
/proc is outside the scope of this article. For
more information about any /proc files not
discussed in this article, one of the best sources is the Linux kernel
source itself, which contains some very good documentation. The following
files in /proc are among the most useful to a system
administrator. This is not meant to be an exhaustive treatment but an
easy-access reference for day-to-day use.
/proc/scsi/scsi
One of the most useful things to learn as a
system administrator is how to add more disk space if you have hot-swap
drives available to you, without rebooting the system. Without using
/proc, you could insert your drive, but you
would then have to reboot in order to get the system to recognize the new
disk. Here, you can get the system to recognize the new drive with the
following command:
echo "scsi add-single-device w x y z" > /proc/scsi/scsi
For this command to work properly, you must get the parameter values w, x, y, and z correct, as follows:
- w is the host adapter ID, where the first adapter is zero (0)
- x is the SCSI channel on the host adapter, where the first channel is zero (0)
- y is the SCSI ID of the device
- z is the LUN (logical unit number), where the first LUN is zero (0)
Once your disk has been added to the system, you can mount any previously
formatted filesystems or you can start formatting it, and so on. If you
are not sure about what device the disk will be, or you want to check any
pre-existing partitions, for example, you can use a command such as
fdisk -l, which will report this information
back to you.
Conversely, the command to remove a device from your system without a reboot would be:
echo "scsi remove-single-device w x y z" > /proc/scsi/scsi
Before you enter this command and remove your hot-swap SCSI disk from your system, make sure you have unmounted any filesystems from this disk first.
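Because a mistyped parameter here could address the wrong device, it can help to build the command string with a helper that checks the four values first. The following is a minimal sketch; scsi_proc_cmd is a hypothetical name, and the function only prints the string rather than writing to /proc/scsi/scsi itself.

```sh
#!/bin/sh
# Sketch: build the string written to /proc/scsi/scsi.
# scsi_proc_cmd is a hypothetical helper; it prints the command, nothing more.
scsi_proc_cmd() {
    action=$1
    case $action in
        add|remove) shift ;;
        *) echo "usage: scsi_proc_cmd add|remove host channel id lun" >&2
           return 1 ;;
    esac
    if [ $# -ne 4 ]; then
        echo "scsi_proc_cmd: expected host, channel, id, and lun" >&2
        return 1
    fi
    # All four parameters must be non-negative integers.
    for n in "$@"; do
        case $n in
            ''|*[!0-9]*) echo "scsi_proc_cmd: not a number: $n" >&2
                         return 1 ;;
        esac
    done
    echo "scsi $action-single-device $1 $2 $3 $4"
}
```

As root, you might then run something like: line=$(scsi_proc_cmd add 0 0 3 0) && echo "$line" > /proc/scsi/scsi. The values 0 0 3 0 are placeholders; use the ones that match your hardware.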
/proc/sys/fs/file-max
This specifies the maximum number of
file handles that can be allocated. You may need to increase this value if
users get error messages stating that they cannot open more files because
the maximum number of open files has been reached. This can be set to any
number of files and can be changed by writing a new numeric value to the
file.
Default setting: 4096
/proc/sys/fs/file-nr
This file is related to file-max and
holds three values:
- Number of allocated file handles
- Number of used file handles
- Maximum number of file handles
This file is read-only and for informational purposes only.
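To judge whether file-max needs raising, you can compare the first and last values in file-nr. This sketch follows the field meanings given above for 2.4 kernels (other kernel versions may define the second field differently); file_nr_report is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch: summarize /proc/sys/fs/file-nr using the field order described above.
# file_nr_report is a hypothetical helper; pass a path to read another file.
file_nr_report() {
    src=${1:-/proc/sys/fs/file-nr}
    allocated= used= max=
    read -r allocated used max < "$src"
    if [ -z "$max" ] || [ "$max" -le 0 ]; then
        echo "file_nr_report: could not read three values from $src" >&2
        return 1
    fi
    echo "allocated=$allocated used=$used max=$max"
    # Integer percentage of the limit that is currently allocated.
    echo "allocated_pct=$(( $allocated * 100 / $max ))"
}
```

If the reported percentage stays close to 100, raising file-max is probably worth considering.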
/proc/sys/fs/inode-*
Files whose names start with "inode" work in the same way as the
files whose names start with "file" described above, but they apply
to inodes instead of file handles.
/proc/sys/fs/overflowuid and /proc/sys/fs/overflowgid
These hold the user ID (UID) and group ID (GID) that are substituted on
filesystems that support only 16-bit user and group IDs when a larger ID
cannot be stored. These values can be changed, but if you really do find
the need to do this, you might find it easier to change your group and
password file entries instead.
Default Setting: 65534
/proc/sys/fs/super-max
This specifies the maximum number of
super block handlers. Any filesystem you mount needs to use a super block,
so you could possibly run out if you mount a lot of filesystems.
Default setting: 256
/proc/sys/fs/super-nr
This shows the currently allocated
number of super blocks. This file is read-only and for informational
purposes only.
/proc/sys/kernel/acct
This holds three configurable values
that control when process accounting takes place based on the amount of
free space (as a percentage) on the filesystem that contains the log:
- If free space goes below this percentage value then process accounting stops
- If free space goes above this percentage value then process accounting starts
- The frequency (in seconds) at which the other two values will be checked
To change a value in this file, you should echo a space-separated list of numbers.
Default setting: 2 4 30
These values stop accounting if there is less than 2 percent free space on the filesystem that contains the log and start it again if there is 4 percent or more free space. Checks are made every 30 seconds.
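Since this file takes three related numbers, a quick sanity check before writing them can catch an inverted pair. The sketch below is an assumption-laden illustration: acct_line is a hypothetical helper, and the rule that the stop threshold must be lower than the resume threshold is a common-sense check added here, not something the kernel documentation is quoted as requiring.

```sh
#!/bin/sh
# Sketch: validate the three values for /proc/sys/kernel/acct before use.
# acct_line is a hypothetical helper; it prints the line, it does not write it.
acct_line() {
    if [ $# -ne 3 ]; then
        echo "usage: acct_line stop-percent resume-percent frequency" >&2
        return 1
    fi
    for n in "$@"; do
        case $n in
            ''|*[!0-9]*) echo "acct_line: not a number: $n" >&2
                         return 1 ;;
        esac
    done
    # Assumed sanity rule: accounting should stop at a lower free-space
    # percentage than the one at which it starts again.
    if [ "$1" -ge "$2" ]; then
        echo "acct_line: stop value ($1) should be below resume value ($2)" >&2
        return 1
    fi
    echo "$1 $2 $3"
}
```

As root, a possible use is: line=$(acct_line 5 10 60) && echo "$line" > /proc/sys/kernel/acct.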
/proc/sys/kernel/ctrl-alt-del
This file holds a binary value
that controls how the system reacts when it receives the ctrl+alt+delete
key combination. The two values represent:
- A zero (0) value means the ctrl+alt+delete is trapped and sent to the init program. This will allow the system to have a graceful shutdown and restart, as if you typed the shutdown command.
- A one (1) value means the ctrl+alt+delete is not trapped and no clean shutdown will be performed, as if you just turned the power off.
Default setting: 0
/proc/sys/kernel/domainname
This allows you to configure your
network domain name. This has no default value and may or may not already
be set.
/proc/sys/kernel/hostname
This allows you to configure your
network host name. This has no default value and may or may not already be
set.
/proc/sys/kernel/msgmax
This specifies the maximum size of a
message that can be sent from one process to another process. Messages are
passed between processes in kernel memory that is not swapped out to disk,
so if you increase this value, you will increase the amount of memory used
by the operating system.
Default setting: 8192
/proc/sys/kernel/msgmnb
This specifies the maximum number of
bytes in a single message queue.
Default setting: 16384
/proc/sys/kernel/msgmni
This specifies the maximum number of
message queue identifiers.
Default setting: 16
/proc/sys/kernel/panic
This represents the amount of time (in
seconds) the kernel will wait before rebooting if it reaches a "kernel
panic." A setting of zero (0) seconds will disable rebooting on kernel
panic.
Default setting: 0
/proc/sys/kernel/printk
This holds four numeric values that
define where logging messages are sent, depending on their importance. For
more information on different log levels, read the manpage for syslog(2).
The four values of the file are:
- Console Log Level: messages with a higher priority than this value will be printed to the console
- Default Message Log Level: messages without a priority will be printed with this priority
- Minimum Console Log Level: minimum (highest priority) value that the Console Log Level can be set to
- Default Console Log Level: default value for Console Log Level
Default setting: 6 4 1 7
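Four bare numbers are easy to misread, so a labeled view can help. This sketch simply reads the file and prints each value next to the name used in the list above; printk_levels and the label names are hypothetical.

```sh
#!/bin/sh
# Sketch: print the four printk values with descriptive labels.
# printk_levels is a hypothetical helper; pass a path to read another file.
printk_levels() {
    src=${1:-/proc/sys/kernel/printk}
    console= default_msg= min_console= default_console=
    read -r console default_msg min_console default_console < "$src"
    if [ -z "$default_console" ]; then
        echo "printk_levels: could not read four values from $src" >&2
        return 1
    fi
    echo "console_log_level=$console"
    echo "default_message_log_level=$default_msg"
    echo "minimum_console_log_level=$min_console"
    echo "default_console_log_level=$default_console"
}
```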
/proc/sys/kernel/shmall
This is the total amount of shared
memory (in pages) that can be used on the system at any given point.
Default setting: 2097152
/proc/sys/kernel/shmmax
This specifies the largest shared
memory segment size (in bytes) allowed by the kernel.
Default setting: 33554432
/proc/sys/kernel/shmmni
This represents the maximum number of
shared memory segments for the whole system.
Default setting: 4096
/proc/sys/kernel/sysrq
This activates the System Request Key,
if non-zero.
Default setting: 0
/proc/sys/kernel/threads-max
This is the maximum number of
threads that can be used by the kernel.
Default setting: 2048
/proc/sys/net/core/message_burst
This is the time required (in
1/10 seconds) to write a new warning message; other warning messages
received during this time will be dropped. This is used to prevent Denial
of Service attacks by someone attempting to flood your system with
messages.
Default setting: 50 (5 seconds)
/proc/sys/net/core/message_cost
This holds a cost value
associated with every warning message. The higher the value, the more
likely the warning message is to be ignored.
Default setting: 5
/proc/sys/net/core/netdev_max_backlog
This gives the maximum
number of packets allowed to queue when an interface receives packets
faster than the kernel can process them.
Default setting: 300
/proc/sys/net/core/optmem_max
This specifies the maximum
buffer size allowed per socket.
/proc/sys/net/core/rmem_default
This is the receive socket
buffer's default size (in bytes).
/proc/sys/net/core/rmem_max
This is the receive socket
buffer's maximum size (in bytes).
/proc/sys/net/core/wmem_default
This is the send socket
buffer's default size (in bytes).
/proc/sys/net/core/wmem_max
This is the send socket buffer's
maximum size (in bytes).
/proc/sys/net/ipv4
All of the IPv4 and IPv6 parameters are
fully documented in the kernel source documentation. See the file
/usr/src/linux/Documentation/networking/ip-sysctl.txt.
/proc/sys/net/ipv6
Same as IPv4.
/proc/sys/vm/buffermem
This controls the amount of the total
system memory (as a percent) that will be used for buffer memory. It holds
three values that can be set by writing a space-separated list to the
file:
- Minimum percentage of memory that should be used for buffers
- Percentage of buffer memory the system will try to maintain when memory is being pruned because little system memory remains
- Maximum percentage of memory that should be used for buffers
Default setting: 2 10 60
/proc/sys/vm/freepages
This controls how the system reacts to
different levels of free memory. It holds three values that can be set by
writing a space-separated list to the file:
- If the number of free pages in the system reaches this minimum limit, only the kernel will be permitted to allocate any more memory.
- If the number of free pages in the system falls below this limit, the kernel will start swapping more aggressively to free memory and maintain system performance.
- The kernel will try to keep this amount of system memory free. Falling below this value will start the kernel swapping.
Default setting: 512 768 1024
/proc/sys/vm/kswapd
This controls how the kernel is allowed to
swap memory. It holds three values that can be set by writing a
space-separated list to the file:
- Maximum number of pages the kernel tries to free at one time. If you want to increase bandwidth to/from swap, you will need to increase this number.
- Minimum number of times the kernel tries to free a page on each swap.
- The number of pages the kernel can write in one swap. This has the greatest impact on system performance. The larger the value, the more data can be swapped and the less time is spent disk seeking. However, a value that is too large will adversely affect system performance by flooding the request queue.
Default setting: 512 32 8
/proc/sys/vm/pagecache
This does the same job as
/proc/sys/vm/buffermem, but it does it for
memory mapping and generic caching of files.
Making your kernel settings persistent
A handy utility is provided for making changes to any kernel parameters
under the /proc/sys directory. It allows you to
make changes to the running kernel (similar to the echo and redirection
method used above), but it also has a configuration file that is read
at system boot. This lets you make changes to the running kernel and add
them to the configuration file so that any changes you make will remain
after a system reboot.
The utility is called sysctl and is fully
documented in the man pages at sysctl(8). The configuration file for
sysctl is
/etc/sysctl.conf, which can be edited and is
documented under sysctl.conf(5). Sysctl treats
the files under /proc/sys as individual
variables that can be changed. So, for example, the file under
/proc/sys that represents the maximum number of
file handles allowed on the system,
/proc/sys/fs/file-max, is represented as
fs.file-max.
This example reveals some oddities in sysctl
notation. Since sysctl can only change
variables under the /proc/sys directory, that
part of the variable name is missing as the variables are always assumed
to be under that directory. The next change to note is that the directory
separators (slash, /) have changed to periods (dot, .).
There are two simple rules for converting between files in
/proc/sys and variables in
sysctl:
- Drop the /proc/sys/ from the beginning.
- Swap slashes for dots in the filenames.
These two rules will let you swap any file name in
/proc/sys for any variable name in
sysctl. The general file to variable conversion
is:
/proc/sys/dir/file --> dir.file
dir1.dir2.file --> /proc/sys/dir1/dir2/file
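The two conversion rules are mechanical enough to script. The helpers below are a sketch with hypothetical names (proc_to_sysctl and sysctl_to_proc); they apply the rules literally, so they do not handle the rare case of a name that itself contains a dot.

```sh
#!/bin/sh
# Sketch: convert between /proc/sys paths and sysctl variable names.
# Both function names are hypothetical; the mapping is the two rules above.
proc_to_sysctl() {
    case $1 in
        /proc/sys/*) ;;
        *) echo "proc_to_sysctl: not under /proc/sys: $1" >&2
           return 1 ;;
    esac
    # Rule 1: drop the /proc/sys/ prefix. Rule 2: swap slashes for dots.
    echo "${1#/proc/sys/}" | tr '/' '.'
}

sysctl_to_proc() {
    # Reverse mapping: swap dots for slashes and add the prefix back.
    echo "/proc/sys/$(echo "$1" | tr '.' '/')"
}
```

For example, proc_to_sysctl /proc/sys/fs/file-max prints fs.file-max, and sysctl_to_proc kernel.msgmax prints /proc/sys/kernel/msgmax.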
You can view all the variables that are available to be changed, along with
their current setting, using the command
sysctl -a.
Variables can also be changed using sysctl,
which does exactly the same job as the echo method used above. This
notation is as follows:
sysctl -w dir.file="value"
Using the file-max example again, we could change this value to 16384 using one of two methods as follows:
sysctl -w fs.file-max="16384"
Or:
echo "16384" > /proc/sys/fs/file-max
Don't forget that sysctl does not add the changes
you make to the configuration file; this is left for you to do manually. If
you want your changes to persist after a reboot, you must maintain this
file.
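Maintaining the configuration file by hand usually means adding a line, or replacing one that is already there. The following sketch shows one way to do that; sysctl_persist is a hypothetical helper, and it assumes the file uses simple "name = value" lines. Try it on a copy of your configuration file first.

```sh
#!/bin/sh
# Sketch: set or replace a "name = value" line in a sysctl configuration file.
# sysctl_persist is a hypothetical helper; the third argument is optional.
sysctl_persist() {
    key=$1
    value=$2
    conf=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    if [ -f "$conf" ]; then
        # Copy every line except an existing assignment to the same variable.
        awk -v k="$key" '
            { line = $0; sub(/^[ \t]+/, "", line); split(line, parts, /[ \t]*=/) }
            parts[1] != k { print }
        ' "$conf" > "$tmp"
    fi
    echo "$key = $value" >> "$tmp"
    # Write back through the existing file so its owner and mode are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

After a call such as sysctl_persist fs.file-max 16384 (as root), running sysctl -p loads the settings from the configuration file into the running kernel.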
Note: Not all distributions provide
sysctl support. If this is the case for your
particular system, then you can use the echo and redirect method described
above and add these commands to a start-up script so they are executed
every time the system boots.
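A start-up script for this purpose can be as simple as a loop over a list of files and values. The sketch below assumes a hypothetical list file with one "path value" pair per line; apply_proc_settings is an illustrative name, and its optional second argument (a directory prefix) exists only so the function can be tried against a scratch directory instead of the real /proc.

```sh
#!/bin/sh
# Sketch: apply settings from a list file using the echo-and-redirect method.
# apply_proc_settings and the list format are hypothetical.
# Each line of the list: <file under /proc/sys> <value or values>
# Blank lines and lines starting with # are skipped.
apply_proc_settings() {
    list=$1
    root=${2:-}    # optional prefix for testing; empty means the real /proc
    while read -r path value; do
        case $path in
            ''|'#'*) continue ;;
        esac
        if [ -w "$root$path" ]; then
            echo "$value" > "$root$path"
        else
            echo "apply_proc_settings: skipping $path (not writeable)" >&2
        fi
    done < "$list"
}
```

A list file for this helper might contain lines such as "/proc/sys/fs/file-max 16384" and "/proc/sys/kernel/acct 2 4 30".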
Commands for setting the system
It is possible to change other non-kernel system parameters while the
system is running and also get these settings to take effect without
rebooting. These can mainly be classified as services, daemons, and
servers that will be listed in the /etc/init.d
directory. Since there is an increasingly wide range of scripts that can
be listed in this directory, it is not possible to go through all the
different configurations here. However, below are a few examples of how
the /etc/init.d scripts can be manipulated on
different distributions of Linux. Examples of where changes to a daemon
and a reload of the configuration without rebooting might be useful are:
- Changing your Web server configuration and reloading Apache
- Removing an inetd login service that you don't require
- Manipulating your network settings
- Exporting new filesystems via NFS
- Starting/stopping your firewall
First, the generic way of manipulating system services is directly, via the
scripts in /etc/init.d. These scripts take
parameters to manipulate the services that they control; you can type the
script name without any parameters to see what the valid options are.
Common parameters are:
- start: Starts a stopped service
- stop: Stops a running service
- restart: Stops and then starts a running service; will start a stopped service
- reload: Reloads service configuration without breaking any connection(s)
- status: Outputs whether the service is running or not
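Scripts in /etc/init.d typically handle these parameters with a case statement. The skeleton below is a sketch only: the daemon name exampled is made up, service_action is a hypothetical function name, and the echo lines stand in for the real commands a script would run.

```sh
#!/bin/sh
# Sketch: parameter dispatch in the style of an /etc/init.d script.
# "exampled" is a made-up daemon; the echo lines are placeholders.
service_action() {
    case $1 in
        start)   echo "starting exampled" ;;
        stop)    echo "stopping exampled" ;;
        restart) service_action stop && service_action start ;;
        reload)  echo "reloading exampled configuration" ;;
        status)  echo "exampled status check goes here" ;;
        *)       echo "Usage: exampled {start|stop|restart|reload|status}" >&2
                 return 1 ;;
    esac
}
# A real script would end with a call such as: service_action "$1"
```

Calling the function without a parameter prints the usage line, which mirrors the behavior described above for finding out the valid options.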
As an example, the following command would reload your xinetd configuration without terminating any connected users' sessions (useful if you make a change to /etc/xinetd.conf):
/etc/init.d/xinetd reload
Red Hat provides a command, service, that will
manipulate services for you. The service
command provides the same functionality as typing the script name itself.
The syntax is as follows:
service script-name [parameter]
For example:
service xinetd reload
SuSE also provides a command called rc. This is
similar to the service command above, but has
no space between the command and the script name. The syntax is as
follows:
rc{script-name} parameter
For example:
rcapache start
As with changes to kernel parameters, any changes made to services will
be lost once you reboot your system. More and more distributions are
adopting the use of the chkconfig command,
which manages the services that are started at various run levels
(including on boot). At the time of this writing, the
chkconfig command syntax differs slightly on
different versions of Linux, but if you enter the command
chkconfig without any parameters, you will get
a list of how to use it. More information about
chkconfig can also be found via the man pages
at chkconfig(8).
Configuring the Linux kernel on the fly using the
/proc filesystem isn't to be taken lightly, but
once you understand its structure and how to manipulate the various files
and parameters, you've gained the use of a powerful tool for keeping your
servers available around the clock.
I would like to thank Mr. Adrian Fewings for proofreading this article.
Resources
Learn
- Download, get involved with, or just learn about the Linux kernel at the Linux Kernel Archives.
- Refer to the kernel documentation in the Documentation directory where you installed the kernel source.
- Read the man pages for sysctl(8) and sysctl.conf(5).
- Find more Linux documents at the Linux Documentation Project homepage.
- "Understanding Linux configuration files" (developerWorks, December 2001) gives an overview of files that control permissions, system applications, daemons, and more.
- If you want to know what "high availability" means from a mainframe perspective, read the IBM Redpaper "Linux on IBM zSeries and S/390: High Availability for z/VM and Linux."
- In the developerWorks Linux zone, find more resources for Linux developers, and scan our most popular articles and tutorials.
- See all Linux tips and Linux tutorials on developerWorks.
- Stay current with developerWorks technical events and Webcasts.
Get products and technologies
- With IBM trial software, available for download directly from developerWorks, build your next development project on Linux.
Discuss
- Get involved in the My developerWorks community; with your personal profile and custom home page, you can tailor developerWorks to your interests and interact with other developerWorks users.

Graham graduated from The University of Exeter with a B.Sc. (Honors) degree in Computer Science with Management Science in July 2000. He joined IBM as an IT support worker in September 2000 with no previous experience and started learning Linux. One year later, in September 2001, he was certified as a Red Hat Certified Engineer. His work and personal interests gave him experience with many different versions of Linux running on a wide range of platforms to support the development community at the IBM Hursley Laboratory in the UK. Recently, he has taken up writing articles about Linux, his first and only other publication being a guide for the Linux Documentation Project. Contact Graham at gwhite at uk.ibm.com.