As with so many aspects of Linux, you have choices for the file system type. Most likely, you'll work with Linux partitions that have been designed with one of the extended file systems, as they are universally supported across Linux distributions and offer a robust solution out of the box.
The extended file system goes back to the early days of Linux. It raised the 64MB size limit of the Minix file system that Linux first used to 2GB, but it suffered from excessive fragmentation. Thus, shortly after the first extended file system's release, the second extended file system (ext2) was developed to address some of these limitations and, among other things, increased the maximum size to 4TB. Ext2 quickly became the de facto standard for Linux file systems. As Linux has evolved, so have its file systems, leading to what we have today: the third extended file system (ext3) and the latest, the fourth extended file system (ext4).
Ext3 is the next evolution of the older ext2 and is still widely used
today. One of its principal advantages over ext2 is journaling. Ext3 is
backwards compatible with ext2, so you can convert an ext2 installation to
ext3 without repartitioning. Using an account with root privileges,
running tune2fs -j against the partition will usually do the job. For
example, if your ext2 file system is located on the second partition of
the first hard disk, you can type
tune2fs -j /dev/sda2 to convert it.
In addition to journaling, ext3 offers improvements over ext2 such as better write speed and robustness. Without journaling, an ext2 file system can be left in an inconsistent state by an unclean shutdown, such as an unexpected power failure or system crash. On the next boot, each such ext2 file system has to be checked in full before it's mounted. With today's large file systems, the time this consistency check takes is not acceptable in many environments, as it severely limits availability. With journaling (which the Windows NTFS file system also provides), pending changes are first recorded in a journal and marked as either complete or incomplete. If an unclean system shutdown occurs, only the changes marked incomplete need to be checked, which eliminates the need to check the whole file system. With ext3, you have the option of one of three journaling modes:
- Journal. Performs full data journaling. All data, not just metadata, is written to the journal first (slowest mode).
- Ordered. Journals only metadata, but avoids the corruption issue of writeback mode by writing data blocks to disk before the related metadata is committed to the journal.
- Writeback. No data journaling, only journals metadata (fastest mode).
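The journaling mode is normally chosen with the data= mount option (data=journal, data=ordered, or data=writeback). As a rough sketch, the following shell function lists the mode requested for each ext3 or ext4 mount. The function name journal_modes is made up for illustration, and the sketch assumes input in /proc/mounts or /etc/fstab format. When no data= option appears, the file system's default mode applies, which the sketch reports simply as "default".

```shell
# Print the journaling mode requested for each ext3/ext4 mount.
# Reads lines in /proc/mounts (or /etc/fstab) format from standard input.
# journal_modes is an illustrative name, not a standard tool.
journal_modes() {
    awk '$3 == "ext3" || $3 == "ext4" {
        mode = "default"                      # no data= option seen yet
        n = split($4, opts, ",")              # fourth field holds the mount options
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^data=/) mode = substr(opts[i], 6)
        print $2 ": " mode
    }'
}

# Inspect the currently mounted file systems, if /proc is available.
if [ -r /proc/mounts ]; then
    journal_modes < /proc/mounts
fi
```

The same function can be pointed at /etc/fstab to check what will be requested at the next boot.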
Ext4 is the current evolution of the extended file system and is backwards compatible with ext2 and ext3. Ext4 offers improvements over ext3 primarily in robustness and speed. Ext4 has been available since Linux kernel version 2.6.28.
Table 1 shows some of the main characteristics of the most popular Linux file systems. Understanding these characteristics can help you if you plan to design partition schemes or convert an existing partition.
Table 1. Evolution of the extended file system
| File system | Characteristics |
|---|---|
| Extended file system | (circa 1991) The earliest Linux file system; suffered from excessive fragmentation |
| Ext2 | (circa 1993) Highly robust but no journaling; runs fsck on the entire file system after a system crash or unexpected shutdown |
| Ext3 | (circa 2001) Can contain up to 32,000 subdirectories; introduced journaling capabilities; backwards compatible with ext2 |
| Ext4 | (circa 2008) Can contain up to 64,000 subdirectories; improvements over existing ext3 with the option of turning journaling completely off; backwards compatible with ext3 and ext2 |
Understanding how data is stored
Your Linux file system stores two types of data. One is the user data, which is the normal files and directories that users (yourself included) work with. These files come in several types, including regular files, links, FIFOs (named pipes), and sockets.
You may have heard "Everything in Linux is a file or a process." This expression alludes to the fact that there is no registry concept in Linux. Instead, everything is stored in one of the file types. The other type of data your file system stores is metadata, which is held in the index node, commonly called the inode. The inode is Linux's way of indexing attributes about a file. Every file has an inode, and these inodes commonly contain information on the file such as:
- File size
- User and group owners
- File permissions
- Number of hard links
- File access and modification time
- Access control list (ACL) information
- Any additional attributes defined on the file, such as immutability
The stat command can provide you with this inode
information, as Listing 1 shows.
Listing 1. Using the stat command
$ stat /etc/services
File: `/etc/services'
Size: 362031 Blocks: 728 IO Block: 4096 regular file
Device: fd00h/64768d Inode: 1638437 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2011-12-19 00:01:25.000000000 -0600
Modify: 2006-02-23 07:09:23.000000000 -0600
Change: 2011-09-18 17:29:37.000000000 -0500
Listing 1 uses the stat command on the
/etc/services file. All the inode information and file attributes are
provided in a usable format.
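When you need only a few inode fields, for example in a script, GNU stat accepts a format string through its -c switch. The following is a minimal sketch that assumes GNU coreutils. The function name inode_demo and the scratch file are purely illustrative. It creates a temporary file, gives it a second hard link, and prints the link count, size, and permissions.

```shell
# Show selected inode fields for a scratch file using GNU stat.
# %h = hard-link count, %s = size in bytes, %a = octal permissions, %i = inode number
# inode_demo is an illustrative name; the scratch file is removed at the end.
inode_demo() {
    scratch=$(mktemp) || return 1
    printf 'hello\n' > "$scratch"            # 6 bytes of content
    chmod 640 "$scratch"
    ln "$scratch" "$scratch.link"            # a second name for the same inode
    stat -c 'links=%h size=%s mode=%a' "$scratch"
    # Both names should report the same inode number.
    if [ "$(stat -c %i "$scratch")" = "$(stat -c %i "$scratch.link")" ]; then
        echo "same inode"
    fi
    rm -f "$scratch" "$scratch.link"
}

inode_demo
```

Because a hard link is just another directory entry pointing at the same inode, the link count rises to 2 while the size and permissions are shared by both names.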
When working from the command line in Linux, you'll often see folders referred to as directories. Directories serve the same purpose as folders in Windows or in a graphical user interface (GUI) environment in Linux. But really, directories are just special files that list the names of the files, or even other directories, they contain.
All directories are categorized in a hierarchy, with the root (/) directory being at the top of the hierarchy. This is actually a logical categorization, as not all directories reside in the same partition or file system. In fact, even if you are mounting a network file system such as NFS, the mount point will reside somewhere in the hierarchy under the root directory. This is a significant difference from Windows, where you may be accustomed to drive C typically containing the disk file system, while additional file systems such as mapped network shares, CD-ROMs, and USB devices each appear as their own drive, such as D, E, F, or G.
At the highest level of the file system, the superblock contains
information about the file system itself. Although working with the
superblock may not be of much interest, exploring this concept with
the dumpe2fs command can help you get a picture
of the file system's storage concepts.
The command in Listing 2 obtains information on a
partition on /dev/sda1—in this case, a /boot partition. The
grep -i superblock command uses
grep in a case-insensitive fashion to output
only information related to the string
superblock.
Listing 2. Using dumpe2fs to get superblock information
# dumpe2fs /dev/sda1 | grep -i superblock
Primary superblock at 1, Group descriptors at 2-2
Backup superblock at 8193, Group descriptors at 8194-8194
Backup superblock at 24577, Group descriptors at 24578-24578
Backup superblock at 40961, Group descriptors at 40962-40962
Backup superblock at 57345, Group descriptors at 57346-57346
Backup superblock at 73729, Group descriptors at 73730-73730
Naturally, you'll want to establish a baseline for your file system for
growth allocation, security check points, and performance expectations.
The GNU arsenal contains many tools for working with a file system.
Popular tools include df,
du, fsck, and
fdisk. Useful but less common tools are
iostat and sar.
You can use the df and
du commands to get an idea of disk usage and
free space. The du -csh /var command displays
directory size information for /var: -s summarizes the directory as a
single figure, -c adds a grand total, and -h prints sizes in
human-readable units. If you're interested in the sizes of the
subdirectories located in /var, the
du -h command is sufficient.
# du -csh /var
73M /var
73M total
The df -h command reports disk file system usage
across mount points on the Linux computer in human-readable
(-h) format:
# df -h
File System Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00 37G 3.2G 32G 10% /
/dev/sda1 99M 12M 82M 13% /boot
tmpfs 506M 0 506M 0% /dev/shm
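For a growth baseline, it can help to flag file systems that cross a usage threshold. The sketch below is one possible approach, not a standard tool. The function name over_threshold and the 80% limit are arbitrary choices for illustration. It reads df -P output, because the -P (POSIX) format keeps each file system on a single line, and it assumes mount points contain no spaces.

```shell
# Print mount points whose usage is at or above a threshold (in percent).
# Expects `df -P` output on standard input; over_threshold is an illustrative name.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {               # skip the header line
        use = $5
        sub(/%/, "", use)                     # "92%" -> "92"
        if (use + 0 >= limit + 0) print $6 " " $5
    }'
}

# Report every file system that is at least 80% full.
df -P | over_threshold 80
```

Run from cron, a check like this gives an early warning before a partition such as /var fills up.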
To check (and possibly repair) the file system for errors, use the
fsck command. For example, if you want to
check for errors on a partition located on /dev/sda2, type the command
fsck /dev/sda2. The following example checks the /var partition instead:
# umount /var
# fsck /var
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
/dev/sda3: clean, 702/192000 files, 52661/768000 blocks
Note: Use this command on a file system that is not mounted.
In the above examples, the tasks are performed in single user mode. The
/var partition located at /dev/sda3 is first unmounted. The
fsck command found no errors, but if it had, an
attempt would be made to fix them.
iostat can provide disk input/output
activity:
$ iostat
Linux 2.6.18-164.el5 (DemoServer) 12/19/2011
avg-cpu: %user %nice %system %iowait %steal %idle
0.25 1.74 1.26 2.89 0.00 93.86
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 10.69 351.52 227.60 1759192 1139038
sda1 0.06 0.45 0.00 2254 22
sda2 10.62 351.01 227.60 1756658 1139016
dm-0 40.06 350.72 227.60 1755178 1139016
dm-1 0.02 0.18 0.00 920 0
hdc 0.00 0.03 0.00 144 0
fd0 0.00 0.00 0.00 16 0
This example demonstrates how the iostat command
is useful for providing Read/Write and overall system usage information.
Notice that by default, the command returns Read/Write usage on all
devices and a cumulative usage snapshot on the top line.
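If you capture iostat output regularly, a small filter can pick out the device with the most activity. The following sketch is only an illustration: the function name busiest_device is invented, it assumes the default report layout in which the second column of the device table is tps, and the sample rows are taken from the output above.

```shell
# Print the device with the highest tps (transfers per second) from iostat output.
# Reads an iostat report on standard input; busiest_device is an illustrative name.
busiest_device() {
    awk '
        /^Device/ { in_table = 1; next }      # header row of the device table
        in_table && NF >= 2 {
            if ($2 + 0 > max) { max = $2 + 0; name = $1 }
        }
        END { if (name != "") print name, max }
    '
}

# Sample rows copied from the iostat output shown above.
busiest_device <<'EOF'
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 10.69 351.52 227.60 1759192 1139038
sda1 0.06 0.45 0.00 2254 22
dm-0 40.06 350.72 227.60 1755178 1139016
EOF
```

On a live system, the same function could be fed directly with iostat -d.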
sar can provide system counter information
similar to the Windows Performance Monitor. You can use
sar to display past counters, or you can use it
to display counters in real time:
$ sar 4 5
Linux 2.6.18-164.el5 (DemoServer) 12/19/2011
12:20:20 AM CPU %user %nice %system %iowait %steal %idle
12:20:24 AM all 0.00 0.00 0.00 0.00 0.00 100.00
12:20:28 AM all 0.00 0.00 1.01 0.00 0.00 98.99
12:20:32 AM all 0.00 0.00 0.50 0.00 0.00 99.50
12:20:36 AM all 0.00 0.00 0.00 0.00 0.00 100.00
12:20:40 AM all 0.25 0.00 1.01 0.00 0.00 98.74
Average: all 0.05 0.00 0.50 0.00 0.00 99.45
The sar command in this example samples the counters
at an interval of 4 seconds and repeats five times to display the counter
information.
Optimizing and tuning the file system
One of your primary responsibilities as a systems administrator is to ensure that your users' data is accessible within a satisfactory time frame. As on Windows, monitoring the system's performance is a primary task on a Linux server. Outside of network performance, the file system's read/write performance can become a bottleneck and is a candidate for tuning and optimization.
Ways to tune the file system include:
- Using the tune2fs tool
- Changing mount options in the /etc/fstab file
- Changing kernel parameters
Use the command-line tool tune2fs to tune the
volume parameters on the hard disk. For example, if you have large
directories on an ext3 partition, you can speed up lookups by using hashed
b-trees, which you enable by turning on the dir_index feature with the
tune2fs -O switch:
# tune2fs -O dir_index /dev/sda5
You should run the tune2fs command with root
privileges. The -O switch sets (or clears) the named file system feature
on the specified partition.
When a file system is made available for use, the process is referred to as
mounting the file system. In fact, there is a
mount command to do just that. When you turn on
the Linux computer, it needs to know how to mount the available file
systems. The /etc/fstab file serves this purpose. Like all configuration
files in Linux, you can edit this file with a text editor such as
vi or vim. Inside
this file, you'll see the mount points for the various file systems. When
tuning mounting options, you edit the fourth column of a file system's
entry. For example, you can add noatime to
stop the file system from updating last-accessed timestamps, which could
potentially improve performance. If you have a file
system for archives (for example) and users shouldn't write to the data,
you could mount the partition as ro (read only).
To change the mounting options in /etc/fstab, edit the file system's entry so that it resembles the following line:
UUID=97ee2cc4-8a26-41e9-9da1 /archives ext4 defaults,ro,noatime 1 2
Any changes you make to the /etc/fstab file do not take effect immediately.
To apply the changes without a reboot, you can use the
mount command with the remount option, which remounts the
changed file system in place with its new options:
# mount -o remount /archives
If you have a partition that can be unmounted in your current working
environment, the mount -o remount command is
useful to avoid a reboot after modifying the /etc/fstab file.
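Before remounting, it can be worth confirming what the options column actually contains for a given mount point. The following is a small sketch. The function name fstab_options is hypothetical, and the sample entry is the one from this article with its UUID left exactly as printed.

```shell
# Print the mount options (fourth field) recorded for a given mount point.
# Reads fstab-format lines on standard input and skips comment lines.
# fstab_options is an illustrative name, not a standard tool.
fstab_options() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { print $4 }'
}

# Try it on a sample entry; on a real system, redirect from /etc/fstab instead.
fstab_options /archives <<'EOF'
# device                        mount point  type  options              dump pass
UUID=97ee2cc4-8a26-41e9-9da1    /archives    ext4  defaults,ro,noatime  1    2
EOF
```

On a real system, the call would be fstab_options /archives < /etc/fstab, and the result can be compared with the options shown by the mount command after a remount.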
You can use the sysctl command to view and
change running kernel parameters. To get a listing of the file
system-related parameters and their current values, type the command
sysctl -a | grep fs, as shown in Listing 3.
Listing 3. Viewing the file system-related kernel parameters
# sysctl -a | grep fs. | less
....
fs.quota.warnings = 1
fs.quota.syncs = 23
fs.quota.free_dquots = 0
fs.quota.allocated_dquots = 0
fs.quota.cache_hits = 0
fs.quota.writes = 0
fs.quota.reads = 0
fs.quota.drops = 0
fs.quota.lookups = 0
fs.suid_dumpable = 0
fs.inotify.max_queued_events = 16384
fs.inotify.max_user_watches = 8192
fs.inotify.max_user_instances = 128
fs.aio-max-nr = 65536
fs.aio-nr = 0
fs.lease-break-time = 45
fs.dir-notify-enable = 1
fs.leases-enable = 1
fs.overflowgid = 65534
fs.overflowuid = 65534
fs.dentry-state = 26674 23765 45 0 0 0
fs.file-max = 102263
.........
Listing 3 shows a partial listing of file system-related parameters for the
kernel and uses the grep command to filter for
file system-related parameters only. You can change these parameters using
the sysctl -w command. For example, if your
server handles a lot of small files and you are increasingly getting error
messages about "running out of file handles," you can increase the maximum
number of open file handles with the command
sysctl -w fs.file-max=xxxxxx, with
xxxxxx being your desired maximum number of file handles. As
with making any changes to default parameters, there are trade-offs, so be
sure your computer has enough memory to handle the increase in
the number of file handles.
Any changes you make using sysctl are not
persistent across reboots. To persist across reboots, you'll need to open
the /etc/sysctl.conf file in a text editor and make the change. Not all the
possible kernel parameters are listed in this file, so if you want to make a
change and don't see the parameter listed, simply add it to the file with your
desired value.
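Before raising fs.file-max, it helps to know how close the system actually is to the limit. On Linux, /proc/sys/fs/file-nr holds three numbers: the allocated file handles, the allocated but unused handles, and the maximum. The sketch below turns those into a percentage. The function name handle_usage is made up for illustration.

```shell
# Report file-handle usage relative to fs.file-max.
# Arguments: the three fields of /proc/sys/fs/file-nr (allocated, unused, maximum).
# handle_usage is an illustrative name, not a standard tool.
handle_usage() {
    echo "$1 $2 $3" | awk '{
        printf "%d of %d file handles allocated (%.1f%%)\n", $1, $3, ($1 / $3) * 100
    }'
}

# Use the live counters when /proc is available.
if [ -r /proc/sys/fs/file-nr ]; then
    handle_usage $(cat /proc/sys/fs/file-nr)
fi
```

If the percentage stays low, "running out of file handles" errors are more likely to come from a per-process limit than from the system-wide maximum.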
Typically, you'll want to "defrag" when fragmentation is at 20% or more on
an operating system. When an extended file system is created, it reserves
about 5% of its disk space for system use, which helps it avoid the
fragmentation issue. So in short, in normal scenarios, you shouldn't have to
worry about defragmentation. However, this doesn't mean the successive
generations of extended file systems are completely immune to fragmentation.
If you suspect a file of being fragmented, you can check it with the
filefrag command. The
-v switch provides more detailed information.
The concept of virtual memory in Linux is not much different than virtual memory in Windows. Your Windows operating system uses the page file when the RAM hardware is expended. Virtual memory provides for a relatively inexpensive way to increase performance during the times when RAM may be used to its maximum.
Linux's virtual memory allocation space is located on the swap "file
system." The disk file system needs to be of type
swap. Several command-line (and graphical)
tools are available to provide information about the system's swap usage:
- free
- top
- vmstat
- sar
The free -m command can provide a view of memory
utilization, including swap. The top command
provides a real-time view of processes, CPU, and memory utilization, while
the vmstat command provides system memory and
CPU activity with the added benefit of block input/output. However, I have
found the sar tools similar to Windows
Performance Monitor for getting an analysis of how swap is being used on a
server:
$ sar -W
$ sar -B
In this example, sar -W reports swapping activity
(pages swapped in and out per second), while the sar -B command
reports paging statistics, including the rate at which pages are read
from and written to disk.
Consult the sar documentation for configuration
of sar.
If you need more swap space, you have two options: create a swap partition, or create a file in an existing partition for swap space. If you have the partition space, the recommended approach is to allocate swap space on a dedicated swap partition. However, creating a file the size of a needed swap on an existing working partition such as ext3 is possible.
If you happen to create a new swap area, such as from resizing a partition
or even by adding a new hard disk, you should create the swap file system
type with the mkswap command. The sequence of
steps to create new swap space is as follows:
- Use fdisk to create the partition, and set the type to 82 (Linux swap).
- Use mkswap to create the swap volume.
- Add the new swap mount to the /etc/fstab file.
- Turn on the swap space with the swapon -a command, which activates the swap areas listed in /etc/fstab.
- Reboot, and use the swapon -s command to verify that the new swap is available.
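For the other option mentioned earlier, a swap file on an existing partition, the commands might look like the following sketch. It is deliberately a dry run by default: with DRY_RUN left at 1, it only prints the commands it would execute. The helper names run and make_swapfile, the /swapfile path, and the 512MB size are all placeholder choices for illustration, and actually executing the commands requires root privileges.

```shell
# Sketch: set up a swap file instead of a swap partition.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 and
# run as root to execute them. run and make_swapfile are illustrative names.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

make_swapfile() {
    file=$1
    size_mb=$2
    run dd if=/dev/zero of="$file" bs=1M count="$size_mb"   # allocate the file
    run chmod 600 "$file"                                   # keep it private to root
    run mkswap "$file"                                      # write the swap signature
    run swapon "$file"                                      # activate it immediately
}

# Placeholder path and size; adjust both before running for real.
make_swapfile /swapfile 512
```

To keep the swap file active across reboots, it also needs an /etc/fstab entry, typically along the lines of /swapfile none swap sw 0 0.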
Resource usage increases over time, which is why you manage the system. Whether the increase is the result of data growth over normal organizational growth or from an unexpected surge such as from a merger, you can resize or even change the file system type of your existing partitions. Of course, these tasks come with their risks and should be planned carefully with backups.
fdisk, parted, and
its GUI cousin, the GNOME Partition Editor (GParted), are common Linux
tools used for modifying partitions. However, whenever you modify an
existing partition, plan ahead, as the risk of data loss does exist. If
you design your partitions using Logical Volume Manager (LVM), the tasks
should be more seamless than with traditional methods, because LVM
lets you resize logical volumes without repartitioning the underlying
disks, although backups remain advisable.
This article acquainted you with the capabilities and management options of ext2, ext3, and ext4, while introducing tools to monitor file system usage. In the absence of Windows Performance Monitor, the tools described in this article can give you the statistics you need to effectively manage a Linux file system on a variety of hardware platforms.
Learn
- Learn more about partitions and file systems in Linux in "Windows-to-Linux roadmap: Part 6. Working with partitions and file systems" (developerWorks, November 2003).
- Learn more about resizing conventional partitions using GParted in "Resizing Linux partitions, Part 1: Basics" (developerWorks, August 2010).
- Learn more about advanced partition resizing issues, including using LVM features, troubleshooting, and alternatives to partition resizing in "Resizing Linux partitions, Part 2: Advanced resizing" (developerWorks, September 2010).
- Discover how sar helps you pinpoint performance bottlenecks in "Easy system monitoring with SAR" (developerWorks, February 2006).
- Learn more about LVM.
Get products and technologies
- Download GParted.
- Download Parted Magic.

Tracy Bost is a seasoned software developer and systems engineer. He is also a lecturer and trainer for the Linux operating system. Tracy has been certified as both a Red Hat Certified Engineer (RHCE) and a Microsoft Certified Systems Engineer (MCSE), along with being an active member of the Linux Foundation. He has worked in several industries, including mortgage, real estate, and the nonprofit sector.



