Linux for Windows systems administrators: Managing and monitoring the extended file system

Use your Windows skills to gain insight on Linux's most popular disk file systems

Windows and Linux use different file system architectures. Fortunately, your Windows experience can put you on the fast track to being comfortable managing and monitoring the Linux extended file systems. This article helps you learn your way around the extended disk file system family on Linux.

Tracy Bost, Consultant and Trainer, Freelance

Tracy Bost is a seasoned software developer and systems engineer. He is also a lecturer and trainer for the Linux operating system. Tracy has been certified as both a Red Hat Certified Engineer (RHCE) and a Microsoft Certified Systems Engineer (MCSE), along with being an active member of the Linux Foundation. He has worked in several industries, including mortgage, real estate, and the nonprofit sector.



17 January 2012

About this series

This series of articles builds on your knowledge of working in a Windows environment to develop your Linux systems administration familiarity and skills. To get the most from the articles in this series, you should have experience working with the NTFS file system in a Windows® environment and a basic understanding of the GNU/Linux® shell environment. A working Linux computer on which to explore the concepts and examples in these articles will also be helpful.

As with so many aspects of Linux, you have choices for the file system type. Most likely, you'll work with Linux partitions that have been designed with one of the extended file systems, as they are universally supported across Linux distributions and offer a robust solution out of the box.

The extended file system goes back to the early days of Linux. It raised the maximum file system size to 2GB, well beyond what the Minix file system it replaced allowed, but it also suffered from excessive fragmentation. Thus, shortly after the first extended file system's release, the second extended file system (ext2) was developed to address some of these limitations, including raising the maximum file system size to 4TB. Ext2 quickly became the de facto standard for Linux file systems. As Linux has evolved, so have its file systems, leading to what we have today: the third extended file system (ext3) and the latest, the fourth extended file system (ext4).

Ext3 and ext4

Disk file systems vs. extended file systems

This article discusses primarily working with the extended file system family on Linux. Linux supports many disk file system types, such as XFS, ReiserFS, the B-tree file system (Btrfs), and IBM Journaled File System (JFS), to name a few. And you may find one of these file systems more suitable than the extended file system, depending on your environment and system use. But learning about the extended file systems is a good start, as most Linux distributions use either ext3 or ext4 by default.

Ext3 is the next evolution of the older ext2 and is still widely used today. One of its principal advantages over ext2 is journaling. Ext3 is backwards compatible with ext2, so you can convert an ext2 installation to ext3 without re-partitioning. Using an account with root privileges, running tune2fs -j against the partition will usually do the job. For example, if your ext2 file system is located on the second partition of the first hard disk, you can type tune2fs -j /dev/sda2 to convert it.
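One way to try the conversion without touching a real partition is to rehearse it on a scratch image file. The sketch below is only an illustration: it assumes the e2fsprogs tools (mke2fs, tune2fs, dumpe2fs) are installed, and convert_demo is a made-up helper name. Because nothing is mounted, it should not need root privileges.

```shell
#!/bin/sh
# Rehearse an ext2 -> ext3 conversion on a scratch image file.
# convert_demo is a hypothetical helper, not part of e2fsprogs.
PATH=$PATH:/sbin:/usr/sbin   # the tools often live outside a normal user's PATH

convert_demo() {
    # Skip politely when the e2fsprogs tools are not available.
    for tool in mke2fs tune2fs dumpe2fs; do
        command -v "$tool" >/dev/null 2>&1 || { echo "e2fsprogs not found"; return 0; }
    done
    img=$(mktemp) || return 1
    truncate -s 16M "$img"                               # 16 MB stand-in for a disk
    mke2fs -q -F -t ext2 "$img" >/dev/null || return 1   # -F: target is a file, not a device
    tune2fs -j "$img" >/dev/null 2>&1                    # add a journal: ext2 becomes ext3
    # The feature list should now include has_journal.
    dumpe2fs -h "$img" 2>/dev/null | grep -i '^Filesystem features'
    rm -f "$img"
}

convert_demo
```

If has_journal appears in the feature list, the image carries an ext3 journal. The same tune2fs -j call is what you would point at a real, backed-up partition such as /dev/sda2.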

In addition to journaling, ext3 offers improvements over ext2 such as better write speed and robustness. Without journaling, ext2 cannot recover quickly from an unclean shutdown caused by an unexpected power failure or system crash: upon boot, each ext2 file system has to be checked in full before it's mounted. With the large file systems in use today, the time needed for that consistency check is not acceptable in many environments, as it severely limits availability. With journaling (which the NTFS file system also has), pending changes are recorded in the journal and marked as either complete or incomplete. If an unclean system shutdown occurs, only the changes marked incomplete need attention, which eliminates the need to check the whole file system. With ext3, you have the option of one of three journaling modes:

  • Journal. Performs full data journaling. All data, not just metadata, is written to the journal first (slowest mode).
  • Ordered. Journals only metadata, but avoids the corruption risk of writeback mode by writing data blocks to disk before the related metadata is committed (the default mode).
  • Writeback. No data journaling, only journals metadata (fastest mode).

Ext4 is the current evolution of the extended file system and is backwards compatible with ext2 and ext3. Ext4 offers improvements over ext3 primarily in robustness and speed. Ext4 has been available since Linux kernel version 2.6.28.

Table 1 shows some of the main characteristics of the most popular Linux file systems. Understanding these characteristics can help you if you plan to design partition schemes or convert an existing partition.

Table 1. Evolution of the extended file system
File system | Introduced | Characteristics
Extended file system | circa 1991 | The earliest Linux file system; suffered from excessive fragmentation
Ext2 | circa 1993 | Highly robust but no journaling; runs fsck on the entire file system after a system crash or unexpected shutdown
Ext3 | circa 2001 | Can contain up to 32,000 subdirectories; introduced journaling capabilities; backwards compatible with ext2
Ext4 | circa 2008 | Can contain up to 64,000 subdirectories; improvements over existing ext3 with the option of turning journaling completely off; backwards compatible with ext3 and ext2

Understanding how data is stored

Your Linux file system stores two types of data. One is the user data, which is the normal files and directories that users (yourself included) work with. Files can vary among four types: regular, links, FIFOs (named pipes), and sockets.

You may have heard "Everything in Linux is a file or a process." This expression alludes to the fact that there is no registry concept in Linux. Instead, everything is stored in one of the file types. The other type of data your file system stores is metadata, which is kept in the index node, commonly called the inode. The inode is Linux's way of indexing attributes about a file. Every file has an inode, and these inodes commonly contain information on the file such as:

  • File size
  • User and group owners
  • File permissions
  • Number of hard links
  • File access and modification time
  • Access control list (ACL) information
  • Any additional attributes defined on the file, such as immutability

Standard user account and root privilege commands

For the listings in this article, notice that each command begins with either $ or #. In the Linux shell, these symbols have meaning. The $ symbol at the shell prompt denotes that the user has standard account privileges, while # denotes root (Administrator) privileges. When executing commands in listings that have the #, you'll need sudo access or access to the root account directly to perform the command.

The stat command can provide you with this inode information, as Listing 1 shows.

Listing 1. Using the stat command
$ stat /etc/services  
File: `/etc/services'
Size: 362031    	Blocks: 728        IO Block: 4096   regular file
Device: fd00h/64768d	Inode: 1638437     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2011-12-19 00:01:25.000000000 -0600
Modify: 2006-02-23 07:09:23.000000000 -0600
Change: 2011-09-18 17:29:37.000000000 -0500

Listing 1 uses the stat command on the /etc/services file. All the inode information and file attributes are provided in a usable format.
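When you only need a few of these fields, for example in a script, the format option of stat can pick them out. The following is a minimal sketch that assumes the GNU coreutils version of stat (the -c option differs on other systems); inode_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Print selected inode fields for a file using GNU stat format sequences.
# inode_summary is a hypothetical helper name used only for this example.
inode_summary() {
    # %i inode number, %h hard link count, %s size in bytes,
    # %a permissions in octal, %U owner name, %G group name
    stat -c 'inode=%i links=%h size=%s mode=%a owner=%U group=%G' "$1"
}

# Demonstrate on a scratch file so nothing on the system is touched.
demo=$(mktemp)
printf 'hello' > "$demo"
inode_summary "$demo"
rm -f "$demo"
```

Running inode_summary /etc/services should report the same inode number, link count, and size that Listing 1 shows in its longer form.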

Directories

When working from the command line in Linux, you'll see folders often referred to as directories. Directories serve the same purpose as folders in Windows or in a graphical user interface (GUI) environment in Linux. But really, directories are just special files that list the names of the files, or even other directories, that they contain.

All directories are categorized in a hierarchy, with the root (/) directory being at the top of the hierarchy. This is actually a logical categorization, as not all directories reside in the same partition or file system. In fact, even if you are mounting a network file system such as NFS, the mount point will reside somewhere in the hierarchy under the root directory. This is a significant difference from Windows, where you may be accustomed to drive C typically containing the disk file system, while additional file systems such as network shares, CD-ROMs, and USB drives are each given their own drive letter, such as D, E, F, or G.

Superblock

At the highest level of the file system, the superblock contains information about the file system itself. Although working with the superblock may not be of much interest, exploring it with the dumpe2fs command can help you get a picture of the file system's storage concepts.

The command in Listing 2 obtains information on a partition on /dev/sda1—in this case, a /boot partition. The grep -i superblock command uses grep in a case-insensitive fashion to output only information related to the string superblock.

Listing 2. Using dumpe2fs to get superblock information
# dumpe2fs  /dev/sda1 | grep -i superblock 
  Primary superblock at 1, Group descriptors at 2-2
  Backup superblock at 8193, Group descriptors at 8194-8194
  Backup superblock at 24577, Group descriptors at 24578-24578
  Backup superblock at 40961, Group descriptors at 40962-40962
  Backup superblock at 57345, Group descriptors at 57346-57346
  Backup superblock at 73729, Group descriptors at 73730-73730

Viewing file system status

Naturally, you'll want to establish a baseline for your file system for growth allocation, security check points, and performance expectations. The GNU arsenal contains many tools for working with a file system. Popular tools include df, du, fsck, and fdisk. Useful but less common tools are iostat and sar.

The du and df commands

You can use the df and du commands to get an idea of disk usage and free space. The du -csh /var command displays the total size of the /var directory tree: -s summarizes instead of listing every subdirectory, -c adds a grand total, and -h prints human-readable sizes. If you're interested in the size of each subdirectory located in /var, the du -h /var command is sufficient.

# du -csh  /var 
73M	/var
73M total
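To find out which subdirectories account for most of that space, the du output can be sorted. This is a sketch rather than a standard tool: largest_subdirs is a hypothetical helper, and the --max-depth option assumes GNU du.

```shell
#!/bin/sh
# largest_subdirs DIR [COUNT]: show DIR and its COUNT largest immediate
# subdirectories, sizes in KB. Hypothetical helper for illustration.
largest_subdirs() {
    dir=$1
    count=${2:-5}
    # -k: sizes in 1 KB blocks; --max-depth=1: DIR plus its direct children.
    # sort -rn puts the largest first, so line 1 is the total for DIR itself.
    du -k --max-depth=1 "$dir" 2>/dev/null | sort -rn | head -n "$((count + 1))"
}

largest_subdirs /var 5
```

Errors from unreadable directories are discarded here, so run it with root privileges if you need complete totals.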

The df -h command reports disk file system usage across mount points on the Linux computer in human-readable (-h) format:

# df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00   37G  3.2G   32G  10% /
/dev/sda1                         99M   12M   82M  13% /boot
tmpfs                            506M     0  506M   0% /dev/shm
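For a baseline check that can run unattended, the Use% column can be compared against a limit. The sketch below assumes the POSIX output format of df -P (one line per file system) and mount points without spaces; over_threshold and the limit of 90 are illustrative choices.

```shell
#!/bin/sh
# over_threshold LIMIT: read `df -P` output on stdin and print each mount
# point whose Use% is at or above LIMIT. Hypothetical helper name.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                      # strip the percent sign
        if (use + 0 >= limit) print $6, use "%"
    }'
}

# -P keeps long device names on one line, which makes the columns reliable.
df -P | over_threshold 90
```

No output means every mounted file system is below the limit.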

The fsck command

To check (and possibly repair) the file system for errors, use the fsck command. For example, to check the /var file system located on /dev/sda3, unmount it and then run fsck against it:

# umount  /var
# fsck /var
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
/dev/sda3: clean, 702/192000 files, 52661/768000 blocks

Note: Use this command on a file system that is not mounted.

In the example above, the tasks are performed in single user mode. The /var partition located at /dev/sda3 is first unmounted; because fsck is given the mount point, it looks up the matching device in /etc/fstab. The fsck command found no errors, but if it had, it would have prompted you before attempting to fix them.

The iostat command

iostat can provide disk input/output activity:

$ iostat
Linux 2.6.18-164.el5 (DemoServer) 	12/19/2011

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.25    1.74    1.26    2.89    0.00   93.86

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              10.69       351.52       227.60    1759192    1139038
sda1              0.06         0.45         0.00       2254         22
sda2             10.62       351.01       227.60    1756658    1139016
dm-0             40.06       350.72       227.60    1755178    1139016
dm-1              0.02         0.18         0.00        920          0
hdc                0.00         0.03         0.00        144          0
fd0                0.00         0.00         0.00         16          0

This example demonstrates how the iostat command is useful for providing read/write and overall system usage information. Notice that by default, the command returns read/write statistics for all devices, averaged since the system was booted, with a CPU usage summary at the top.

The sar command

sar can provide system counter information similar to the Windows Performance Monitor. You can use sar to display past counters, or you can use it to display counters in real time:

$ sar 4 5
Linux 2.6.18-164.el5 (DemoServer) 	12/19/2011

12:20:20 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
12:20:24 AM       all      0.00      0.00      0.00      0.00      0.00    100.00
12:20:28 AM       all      0.00      0.00      1.01      0.00      0.00     98.99
12:20:32 AM       all      0.00      0.00      0.50      0.00      0.00     99.50
12:20:36 AM       all      0.00      0.00      0.00      0.00      0.00    100.00
12:20:40 AM       all      0.25      0.00      1.01      0.00      0.00     98.74
Average:          all      0.05      0.00      0.50      0.00      0.00     99.45

The sar command in this example samples the counters every 4 seconds, five times in total, and then prints the average.


Optimizing and tuning the file system

One of your primary responsibilities as a systems administrator is to ensure that your users' data is accessible within a satisfactory time frame. As on Windows, monitoring the system's performance on a Linux server is a primary task. Apart from network performance, file system read/write performance can become a bottleneck and is a candidate for tuning and optimization.

Ways to tune the file system include:

  • Using the tune2fs tool
  • Changing mount options in the /etc/fstab file
  • Changing kernel parameters

Tuning with tune2fs

Use the command-line tool tune2fs to tune the volume parameters on the hard disk. For example, if you have large directories on an ext3 partition, you can speed up lookups by using hashed b-trees, which can be done with the tune2fs dir_index switch:

# tune2fs  -O dir_index  /dev/sda5

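You should run the tune2fs command with root privileges. The -O switch sets (or clears) a file system feature on the specified partition, in this case dir_index. Directories that already exist are converted to the hashed format only when e2fsck is next run with the -D option on the unmounted file system.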

Mounting with special options

When a file system is made available for use, the process is referred to as mounting the file system. In fact, there is a mount command to do just that. When you turn on the Linux computer, it needs to know how to mount the available file systems. The /etc/fstab file serves this purpose. Like most configuration files in Linux, you can edit this file with a text editor such as vi or vim. Inside this file, you'll see the mount points for the various file systems. When tuning mounting options, you edit the fourth column of the file system's line. For example, you can add noatime to stop the kernel from updating last-accessed timestamps on a particular file system, which could potentially improve performance. If you have a file system for archives (for example) and users shouldn't write to the data, you could mount a partition as ro (read only).

To change the mounting options, edit the file system's entry in /etc/fstab. For example:

UUID=97ee2cc4-8a26-41e9-9da1	/archives	ext4	 defaults,ro,noatime	1 2

Any changes you make to the /etc/fstab file do not take effect immediately. To apply the changes without a reboot, you can use the mount command to remount the changed file system with its new options:

# mount -o remount  /archives

Because mount -o remount changes the options of a file system that is already mounted, it is a useful way to avoid a reboot after modifying the /etc/fstab file.
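Before remounting, it can be worth confirming which options an entry actually carries. The following sketch reads the fourth column for a given mount point; fstab_options is a hypothetical helper, and the /dev/sdb1 entry in the sample file is invented for the demonstration.

```shell
#!/bin/sh
# fstab_options MOUNTPOINT [FILE]: print the mount options (fourth field) of
# the first non-comment entry whose mount point (second field) matches.
# FILE defaults to /etc/fstab. Hypothetical helper for illustration.
fstab_options() {
    awk -v mp="$1" '
        /^[[:space:]]*#/ { next }               # skip comment lines
        NF >= 4 && $2 == mp { print $4; exit }  # first match wins
    ' "${2:-/etc/fstab}"
}

# Demonstrate against a scratch copy; /dev/sdb1 is a made-up device.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# device      mount point  type  options              dump pass
/dev/sdb1     /archives    ext4  defaults,ro,noatime  1    2
EOF
fstab_options /archives "$sample"    # prints: defaults,ro,noatime
rm -f "$sample"
```

On a live system, calling fstab_options /archives with no second argument reads /etc/fstab itself. After the remount, findmnt -n -o OPTIONS /archives (from util-linux, where available) should show the options that are actually in effect.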

Tuning the kernel parameters

You can use the sysctl command to view and change running kernel parameters. To get a listing of the file system-related parameters and their current values, type the command sysctl -a | grep fs, as shown in Listing 3.

Listing 3. Viewing the file system-related kernel parameters
# sysctl -a | grep '^fs\.' | less
....
fs.quota.warnings = 1
fs.quota.syncs = 23
fs.quota.free_dquots = 0
fs.quota.allocated_dquots = 0
fs.quota.cache_hits = 0
fs.quota.writes = 0
fs.quota.reads = 0
fs.quota.drops = 0
fs.quota.lookups = 0
fs.suid_dumpable = 0
fs.inotify.max_queued_events = 16384
fs.inotify.max_user_watches = 8192
fs.inotify.max_user_instances = 128
fs.aio-max-nr = 65536
fs.aio-nr = 0
fs.lease-break-time = 45
fs.dir-notify-enable = 1
fs.leases-enable = 1
fs.overflowgid = 65534
fs.overflowuid = 65534
fs.dentry-state = 26674	23765	45	0	0	0
fs.file-max = 102263
.........

Listing 3 shows a partial listing of file system-related parameters for the kernel and uses the grep command to filter for file system-related parameters only. You can change these parameters using the sysctl -w command. For example, if your server handles a lot of small files and you are increasingly getting error messages about "running out of file handles," you can increase the maximum number of open file handles with the command sysctl -w fs.file-max=xxxxxx, with xxxxxx being your desired maximum number of file handles. As with making any changes to default parameters, there are trade-offs, so be sure your computer has enough memory to handle the additional file handles.

Any changes you make using sysctl -w are not persistent across reboots. To persist across reboots, you'll need to open the /etc/sysctl.conf file in a text editor and make the change. Not all the possible kernel parameters are listed in this file, so if you make a change and don't see the parameter listed, simply add it to the file with your desired value. Running sysctl -p then loads the values from the file without a reboot.
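Before raising fs.file-max, it helps to know how close the system is to the current limit. The kernel exposes this in /proc/sys/fs/file-nr as three numbers: allocated handles, allocated but unused handles, and the maximum. The sketch below turns them into a percentage; file_handle_usage is a hypothetical helper name.

```shell
#!/bin/sh
# file_handle_usage: read the three fields of fs.file-nr on stdin
# (allocated, unused, maximum) and report how much of the limit is allocated.
# Hypothetical helper for illustration.
file_handle_usage() {
    awk '{ printf "%s of %s file handles allocated (%.1f%%)\n", $1, $3, ($1 / $3) * 100 }'
}

# On a live Linux system, feed it the kernel's own counters.
if [ -r /proc/sys/fs/file-nr ]; then
    file_handle_usage < /proc/sys/fs/file-nr
fi
```

A percentage that stays high under normal load suggests the limit is worth raising; a low one suggests the "out of file handles" messages may come from a per-process limit instead.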

Fragmentation

Typically, you'll want to "defrag" when fragmentation is at 20% or more on an operating system. When an extended file system is created, it reserves about 5% of its disk space for system use (the root user), which also leaves the allocator room to keep files contiguous and helps avoid the fragmentation issue. So in short, in normal scenarios, you shouldn't have to worry about defragmenting. However, this doesn't mean that any generation of the extended file system is completely immune to fragmentation. If you suspect a file of being fragmented, you can check it with the filefrag command. The -v switch provides more detailed information.
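To scan several files at once, the one-line summaries that filefrag prints (in the form "FILE: N extents found") can be filtered. This is a sketch under that assumption about the summary format; fragmented_files and the limit of 10 extents are illustrative, and file names containing ": " are not handled.

```shell
#!/bin/sh
# fragmented_files LIMIT: read filefrag summary lines on stdin and print the
# files that have more than LIMIT extents. Hypothetical helper name.
fragmented_files() {
    awk -v limit="$1" -F': ' '{
        n = $NF + 0                 # "12 extents found" converts to 12
        if (n > limit) print $1 ": " n
    }'
}

# Typical use (may need root privileges on some systems):
#   filefrag /var/log/* 2>/dev/null | fragmented_files 10
```

A file with a handful of extents is normal, especially for large files, so treat the limit as a rough screening value rather than a hard rule.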


Working with virtual memory

The concept of virtual memory in Linux is not much different from virtual memory in Windows. Your Windows operating system uses the page file when physical RAM is exhausted. Virtual memory provides a relatively inexpensive way to keep the system responsive during the times when RAM is used to its maximum.

Linux swap

Linux's virtual memory allocation space is located on the swap "file system." The disk file system needs to be of type swap. Several command-line (and graphical) tools are available to provide information about the system's swap usage:

  • free
  • top
  • vmstat
  • sar

The free -m command can provide a view of memory utilization, including swap. The top command provides a real-time view of processes, CPU, and memory utilization, while the vmstat command provides system memory and CPU activity with the added benefit of block input/output. However, I have found the sar tools to be the closest match to Windows Performance Monitor for analyzing how swap is being used on a server:

$ sar  -W
$ sar  -B

In this example, sar -W reports swapping activity (pages swapped in and out per second), while sar -B reports paging statistics. Consult the sar documentation for configuration of sar.
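For a quick point-in-time figure without sar, the SwapTotal and SwapFree lines of /proc/meminfo carry the same totals that free reports. The helper below is a minimal sketch; swap_usage is a hypothetical name.

```shell
#!/bin/sh
# swap_usage: read /proc/meminfo-style text on stdin and summarize swap use.
# Hypothetical helper for illustration.
swap_usage() {
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free = $2 }
        END {
            if (total == 0) { print "no swap configured"; exit }
            used = total - free
            printf "swap used: %d of %d kB (%.0f%%)\n", used, total, used / total * 100
        }'
}

if [ -r /proc/meminfo ]; then
    swap_usage < /proc/meminfo
fi
```

A single reading says little on its own; swap that is heavily used and still showing activity in sar -W is the pattern worth investigating.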

Creating new swap space

If you need more swap space, you have two options: create a swap partition, or create a file in an existing partition for swap space. If you have the partition space, the recommended approach is to allocate swap space on a dedicated swap partition. However, creating a file the size of a needed swap on an existing working partition such as ext3 is possible.

If you happen to create a new swap area, such as from resizing a partition or even by adding a new hard disk, you should create the swap file system type with the mkswap command. The sequence of steps to create new swap space is as follows:

  1. Use fdisk to create the partition, and set the type to 82 (Linux swap).
  2. Use mkswap to create the swap volume.
  3. Add the new swap entry to the /etc/fstab file.
  4. Turn on the swap space with the swapon -a command, which activates every swap entry listed in /etc/fstab.
  5. Use the swapon -s command to verify that the new swap is available. Reboot if you also want to confirm that it is activated at startup.
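The second option, a swap file on an existing partition, follows a similar sequence. The sketch below only prints the commands unless DRY_RUN is set to 0, so it is safe to run as-is; make_swapfile, the /swapfile path, and the 512MB size are illustrative assumptions, and the real run needs root privileges.

```shell
#!/bin/sh
# make_swapfile PATH SIZE_MB: create and enable a swap file.
# With DRY_RUN=1 (the default here) each command is printed, not executed.
# make_swapfile and run are hypothetical helpers for illustration.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

make_swapfile() {
    path=$1
    size_mb=$2
    run dd if=/dev/zero of="$path" bs=1M count="$size_mb"   # allocate the file
    run chmod 600 "$path"                                   # readable by root only
    run mkswap "$path"                                      # write the swap signature
    run swapon "$path"                                      # enable it immediately
    run swapon -s                                           # confirm it is listed
}

make_swapfile /swapfile 512
```

To keep the swap file across reboots, it would also need an /etc/fstab entry along the lines of "/swapfile swap swap defaults 0 0".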

Modifying file systems

Resource usage increases over time, and managing that growth is part of administering the system. Whether the increase is the result of data growth from normal organizational growth or from an unexpected surge such as from a merger, you can resize or even change the file system type of your existing partitions. Of course, these tasks come with their risks and should be planned carefully with backups.

fdisk, parted, and its GUI cousin, the GNOME Partition Editor (GParted), are common Linux tools used for modifying partitions. However, whenever you modify an existing partition, plan ahead, as the risk of data loss does exist. If you design your storage using Logical Volume Manager (LVM), these tasks should be more seamless than with traditional methods, because LVM lets you resize logical volumes without repartitioning the underlying disks. Backups are still advisable.


Conclusion

This article acquainted you with the capabilities and management options of ext2, ext3, and ext4, while introducing tools to monitor file system usage. In the absence of Windows Performance Monitor, the tools described in this article can give you the statistics you need to effectively manage a Linux file system on a variety of hardware platforms.
