Setting up UNIX file systems

Designing a file system layout to improve system performance and safety

Learn how you can improve your computer's performance and help protect it from harm by setting up your file systems in an optimal manner.

Roderick W. Smith, Consultant and author

Roderick W. Smith is a consultant and author of more than a dozen books on UNIX and Linux, including The Definitive Guide to Samba 3, Linux in a Windows World and Linux Professional Institute Certification Study Guide. He resides in Woonsocket, Rhode Island, and you can reach him at rodsmith@rodsbooks.com.



17 February 2009

The UNIX® operating system enables you to split up your disk data into multiple volumes. Knowing how to do this is only half the battle, though; to make effective use of this ability, you must understand how the files on a UNIX system are organized as well as why they're organized in this way. This article addresses the issue of why you should use multiple volumes.

Terminology notes

Before proceeding further, a couple of terminology issues must be addressed. First, file system can mean either the low-level disk structures used to organize data on the disk or the higher-level structure of files and directories on the computer. The operating system you run supports one or more low-level file systems and therefore determines your available choices. High-level file system structure, by contrast, is a matter of convention.

The second terminology issue relates to the words volume, partition, and slice. In the past, low-level file systems were written to fixed-size hard disk segments called partitions (also known as slices). Computers could have multiple partitions on the disk, but they were inflexible, because they were difficult to change once created. Most modern UNIX implementations support a newer system known as logical volume management (LVM), in which low-level file systems are created within logical volumes that can be moved and resized. A single logical volume can span multiple disks. For brevity's sake, I use the term volume to refer to any of these low-level file system containers. When the distinctions are important, I use their specific names, such as partition or logical volume.

Creating partitions, logical volumes, and file systems

The details of how to create partitions and logical volumes vary greatly from one UNIX variant to another, so you should consult your system-specific documentation. Briefly, though, most x86 systems use master boot record (MBR) partitions, which are typically created with a program called fdisk. The GNU Parted program (see Resources for a link) is a more flexible tool for creating partitions; it supports advanced options such as partition resizing, and it can also create partitions using newer schemes, such as the GUID Partition Table (GPT), that older fdisk programs don't support.

On some systems, such as Linux®, logical volumes exist atop ordinary partitions and therefore add a great deal of complexity to the process. You may need to create one or more partitions to hold logical volumes, then create the logical volumes in which you'll ultimately create file systems.

File system creation varies from one UNIX flavor to another, but the most common names for the tool that does this job are mkfs and newfs. Typically, you pass the program the name of the volume on which you want to create a new file system, as in:

mkfs /dev/sda3

or

newfs /dev/da0s4e
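As a rough sketch of how these steps fit together on a Linux system that uses LVM, the following script prints the commands in the order you would typically run them. It only prints them, because the real commands require root privileges and destroy any data on the partition. The partition /dev/sdb1, the volume group name vg0, the logical volume name home, and the 20G size are all hypothetical values chosen for illustration; other UNIX variants use different tools.

```shell
#!/bin/sh
# Sketch only: prints Linux LVM2 commands instead of running them.
# All device names, volume names, and sizes passed in are hypothetical.

plan_lvm() {
    part=$1   # partition that will hold the logical volumes
    vg=$2     # volume group name
    lv=$3     # logical volume name
    size=$4   # logical volume size, such as 20G

    echo "pvcreate $part"                # mark the partition as an LVM physical volume
    echo "vgcreate $vg $part"            # build a volume group on that physical volume
    echo "lvcreate -L $size -n $lv $vg"  # carve a logical volume out of the group
    echo "mkfs /dev/$vg/$lv"             # create a file system in the logical volume
}

plan_lvm /dev/sdb1 vg0 home 20G
```

Review the printed commands against your own system's documentation before running any of them by hand.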

The UNIX file system tree

UNIX uses a unified directory tree. The root directory, denoted by a forward slash (/), is at the base of this tree, and each subdirectory off the root is a branch of the tree. Although there are differences among the various UNIX flavors, the most critical features of the UNIX directory tree are similar across all variants, as summarized in Table 1.

Table 1. Common UNIX directories
Directory         Purpose
/etc              System configuration files are stored here.
/bin              This directory holds binaries that must be accessible at all times and that ordinary users are likely to run.
/sbin             This directory is similar to /bin, but these binaries are likely to be used only by the system administrator.
/lib              Critical library files reside here.
/boot             This directory holds system boot files. These files may include the kernel, the boot loader, and similar files.
/usr              This directory tree holds extended system files, including its own /usr/bin, /usr/sbin, and /usr/lib directories. These files aren't necessary for basic system operation, but they may include program files for word processors, Web browsers, graphics programs, server programs, and other tools important to users.
/usr/local        This directory tree holds locally compiled programs in a directory that can be protected from package-management tools or system reinstallation.
/opt              Third-party commercial applications typically reside in this directory.
/var              System files of a transient or variable nature reside here, such as log files, mail queues, and databases.
/home or /users   Each user receives a separate subdirectory of this directory as a home directory.
/root             This directory is the root user's home directory.
/tmp              This directory is temporary "scratch space" for all users.
/mnt or /media    These directories or their subdirectories hold removable media, such as DVD-ROMs or flash disks, although some systems place removable media elsewhere.
/dev              UNIX device files reside in this directory, enabling programs to access hardware devices.

Table 1 isn't comprehensive, but it covers the most important directories that exist on most UNIX systems. Some of these directories can be separated into their own volumes. However, some (in particular, /etc, /bin, /sbin, /lib, and /dev) should never be placed on separate volumes. UNIX relies on the contents of these directories to perform critical tasks, including mounting other volumes. For instance, the mount command is likely to reside in /bin or /sbin, and /dev holds the device files needed to mount a volume. (Some UNIX variants create a dynamic /dev file system, so it may be a separate file system but not a separate volume.)
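To see how an existing system is laid out, you can ask df which volume holds each directory. The following is a minimal sketch; the function name volume_of is my own, and it assumes that device names contain no spaces. Directories that report the same device share a volume.

```shell
#!/bin/sh
# Print the device (or other file system source) that holds a directory.
# df -P requests the POSIX output format: one header line, then one
# data line for the file system that contains the named path.
volume_of() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Report the volume behind several directories from Table 1.
for dir in / /etc /bin /usr /var /home /tmp; do
    if [ -d "$dir" ]; then
        printf '%-8s %s\n' "$dir" "$(volume_of "$dir")"
    fi
done
```

On a correctly configured system, /etc and /bin should report the same device as the root directory.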

Every other directory in Table 1 might reasonably be placed in its own volume. However, that doesn't mean that you should necessarily do so for every one. The following sections describe when it makes the most sense to separate file systems for specific purposes.

Creating volumes for performance

One reason to create separate volumes is to improve system performance. As defined here, performance can refer to system speed, efficient use of storage space, or other factors. Volumes can improve performance in several ways:

  • Some file systems are much faster than others, but performance varies from task to task. For instance, the fastest file system for accessing a few large files might not be the best one for accessing many small files.
  • File systems vary in how efficiently they use space. Both overall file system overhead and efficiency issues relating to individual file storage can play a role here.
  • All file systems have size limits. Although several file systems exist with multi-terabyte partition and file size limits, others are more, well, limiting. If you must use such a file system, employing multiple volumes is a practical necessity.
  • Many older file systems lack journals, which often means that they require lengthy disk checks after a system crash. Newer journaling file systems simplify the disk-check process when the system starts up again. If possible, you should use journaling file systems exclusively. If you need to use a non-journaling file system for any volume, try to keep that volume as small as possible.
  • If your computer boots multiple operating systems, you may want to create a volume explicitly for data exchange between the operating systems or to provide access to one operating system's files from the other one.
  • If your computer has multiple physical hard disks, you may be forced to create at least one volume per disk. (Some redundant array of independent disks [RAID] and LVM implementations enable you to create a single volume that spans multiple disks, though.) If the disks vary in size or speed, you can place volumes to take best advantage of the disks' characteristics, such as putting seldom-accessed data on a slower physical disk.

These factors interact with one another, and finding the optimum volume configuration to take advantage of these features can be difficult. You should probably start by doing some basic research on the characteristics of the low-level file systems available to you. You can then consider how these characteristics interact with the data you'll be storing. Do you have directories that store lots of small files? Do you have directories that store lots of large sequential-access files, such as MPEG video files? Are some of your files, such as databases, accessed frequently and in time-critical ways? Can you isolate seldom-used files for storage on slower physical disks?
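One way to start answering these questions is to count how many small and large files a directory tree holds. The sketch below does this with find. The boundary of roughly 1 MiB is an arbitrary value picked for illustration, the function name size_profile is hypothetical, and the count assumes that file names contain no newline characters.

```shell
#!/bin/sh
# Count small and large regular files under a directory.
# find -size measures in 512-byte blocks (rounded up), so 2048 blocks
# is about 1 MiB. Every file falls into exactly one of the two groups.
size_profile() {
    dir=$1
    small=$(find "$dir" -type f -size -2048 | wc -l)  # fewer than 2048 blocks
    large=$(find "$dir" -type f -size +2047 | wc -l)  # 2048 blocks or more
    # Arithmetic expansion strips the padding some wc versions add.
    echo "small=$((small)) large=$((large))"
}

# Profile the directory named on the command line, or the current one.
size_profile "${1:-.}"
```

A tree dominated by small files and a tree dominated by large ones are candidates for different low-level file systems or different file system options.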

Creating volumes for safety

Safety is one of the prime reasons for creating multiple volumes. In this context, file system safety can go two ways: You can protect your file systems from broader system problems, and you can protect your computer as a whole from problems that the file system causes.

Several types of system problems can affect your file systems, and using multiple volumes can help in these cases:

  • If an operating system bug, bad disk sector, or even certain superuser errors cause corruption to a file system, using multiple volumes can limit that damage to just one volume. Losing one volume's data is likely to be better than losing all the data on the computer.
  • If you split your volumes up such that directories that don't need to be written very often are on their own volumes, then you can mount those volumes read only, thus limiting the risk of accidental damage to their files. For instance, in most UNIX implementations, the /usr directory tree seldom needs to be written, so it can be mounted read only.
  • In extreme cases, you can unmount a volume to block access to it. Some administrators like to do this with the /boot directory. Once the operating system has booted, it doesn't normally need to access these files, so unmounting the /boot volume can help protect it.

One important caveat is that these security improvements aren't perfect. Disk errors, administrator mistakes, and other problems can affect all the volumes on a disk by wiping out critical disk structures, such as the partition table. Use of volumes is no substitute for keeping proper disk backups!
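If you mount volumes read only (for instance, with the ro mount option), it's worth verifying that the setting is actually in effect. The following sketch reports the mode of a mount point. It assumes a Linux-style mount table such as /proc/mounts, in which each line lists the device, mount point, file system type, and options; other UNIX variants present this information differently. The function name mount_mode is my own.

```shell
#!/bin/sh
# Report whether a mount point is mounted read only (ro) or read/write (rw).
# Reads mount table lines on standard input in the assumed Linux format:
#   device mount-point type options dump pass
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")
            mode = "rw"
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") mode = "ro"
        }
        # If several lines match, the last one (the most recent mount) wins.
        END { print (mode == "" ? "not-mounted" : mode) }
    '
}

# Check a few mount points if this system has a Linux-style mount table.
if [ -r /proc/mounts ]; then
    for mp in / /usr /boot; do
        printf '%-6s %s\n' "$mp" "$(mount_mode "$mp" < /proc/mounts)"
    done
fi
```

A result of not-mounted means only that the directory isn't a separate mount point; it may still exist as part of another volume.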

The other safety advantage of volumes is that they help protect the system as a whole from problems that can develop on a single volume. Examples include:

  • If an individual file (or set of files) grows too large, it can fill the disk and destabilize the entire computer. Using volumes can minimize this risk. If you set aside a /home volume for user files, for instance, and it fills up, the computer's ability to write its log files will be unaffected.
  • Some file systems support advanced security features, such as access control lists (ACLs). By using these features, you can improve the security on the volumes that use them.

Taken together, the safety advantages of volumes constitute a powerful reason for using multiple volumes. You should carefully consider how best to use these features. Typically, separating user files from the rest of the system provides the most important advantages. You may have system-specific reasons to split off one or more additional volumes, though, such as one that holds mail spool files or a database.
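Separate volumes contain a disk-full problem, but you still want to know when one is close to filling up. The following sketch lists volumes at or above a usage threshold. It reads df -P output, the 90 percent threshold is an arbitrary example, the function name full_volumes is hypothetical, and it assumes that device names and mount points contain no spaces.

```shell
#!/bin/sh
# List volumes whose usage is at or above a percentage threshold.
# Reads `df -P` output on standard input. After the header line, the
# POSIX format has six columns; column 5 is the capacity used (such as
# 95%) and column 6 is the mount point.
full_volumes() {
    awk -v limit="$1" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)                       # drop the percent sign
            if (pct + 0 >= limit + 0) print $6, pct "%"
        }
    '
}

# Show every mounted volume that is at least 90 percent full.
df -P | full_volumes 90
```

You could run a check like this periodically and mail the result to the administrator whenever it produces output.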

Using network volumes

So far, I've described UNIX file storage as if it were confined to local hard disks. This isn't always the case, though; all UNIX variants support network disk storage. The Network File System (NFS) is the traditional UNIX file-sharing solution, but others, such as the Server Message Block/Common Internet File System (SMB/CIFS) protocol, which is most strongly associated with Microsoft® Windows® and the UNIX Samba server, are also quite common. For the most part, you can treat network volumes as if they were local volumes; however, network volumes have certain unique characteristics.

The greatest advantage of network volumes is that they enable easy access to the same files from multiple computers. For instance, you might set aside an NFS or Samba server to hold data files that are to be used by several people collaborating on a project. You can even store users' home directories on such a server.

Network volumes need not be limited to user data, however. If your network includes a number of identical or nearly identical computers, you can export the /usr directory (or parts of it) from one computer for all the computers to share. This can be a good way to simplify system maintenance: upgrading software on a single server computer automatically updates the software on all the clients. There are problems with this approach, though. For one thing, software upgrades that affect more than the exported file systems will require special attention, because you'll still need to patch the software on the clients. Another problem is that access to the shared file system is slower than access to a local disk. This approach also creates a single point of failure: If the server dies, all the network's computers that rely on it will go down, too.
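Because a client depends on every server whose volumes it mounts, it helps to know which of a computer's volumes are network volumes. The sketch below filters a Linux-style mount table (such as /proc/mounts) for common network file system types. The list of type names is an assumption that you may need to adjust for your system, and the function name network_mounts is my own.

```shell
#!/bin/sh
# Print the mount point, type, and source of each network volume.
# Reads mount table lines on standard input in the assumed Linux format:
#   source mount-point type options dump pass
# The set of type names below is an assumption, not a complete list.
network_mounts() {
    awk '$3 ~ /^(nfs|nfs4|cifs|smbfs|smb3)$/ { print $2, $3, $1 }'
}

# List this computer's network volumes if a Linux-style table exists.
if [ -r /proc/mounts ]; then
    network_mounts < /proc/mounts
fi
```

Each line of output identifies a server whose failure would affect this computer.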

Implementing a volume plan

Using the information just presented, you can begin planning how to partition a new UNIX installation or how to rework an existing installation. (Reconfiguring an existing system is likely to be tedious, because it requires backing up and restoring data, transferring data to a new physical disk, or using tools to dynamically adjust volumes or partitions in place.)

The best way to create volumes depends on your needs. I recommend that you determine where your important data reside and whether particular directories might benefit from the features of specific file systems or from isolation from the rest of the system. Do you have directories with many very large or small files that might benefit from particular low-level file systems? Do you have sensitive data that might benefit from file systems with extra security features? Do you have directories that should be protected from disk-full errors that might occur because of files stored in other directories? In any of these cases, consider splitting the directories into separate volumes.

Most commonly, users' home directories (in /home or /users) and the /usr directory are separated into their own volumes. Other frequently isolated directories include /boot, /tmp, /var or some of its subdirectories, /usr/local, and /opt. This list isn't exhaustive, though; you may have quite valid reasons to isolate some other specific directory.

Don't go overboard creating volumes, though, especially if you use partitions rather than logical volumes. If you create too many, chances are you'll make one or more much too small or too large for your needs, which will require either resizing the volumes or creating symbolic links to store data on a volume other than the intended one. If you're in doubt, it's generally better to err on the side of using too few volumes, particularly if you're a novice administrator. Once you gain experience and confidence, though, volumes are a powerful tool.
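To reduce the risk of sizing a volume badly, you can measure how much space the candidate directories use on an existing system. The following sketch prints their sizes in kilobytes, largest first. The function name dir_sizes is hypothetical. Keep in mind that the figures show current usage only, so a new volume needs room for growth beyond them.

```shell
#!/bin/sh
# Print the disk usage of each named directory in kilobytes, largest first.
# du -s gives one total per directory and -k reports it in 1024-byte units.
dir_sizes() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            du -sk "$dir" 2>/dev/null   # hide errors from unreadable subdirectories
        fi
    done | sort -rn
}

# Measure the directories named on the command line.
dir_sizes "$@"
```

For example, if you save the script under a name of your choosing, you might run it as root with the arguments /home /usr /var /opt /tmp to compare the directories that are most often given their own volumes.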

Summing up

UNIX's method of handling file systems and volumes provides you with an opportunity to improve your computers' security and performance. You can create separate volumes to optimize file system selections and options. Creating separate volumes also enables you to protect your data from your system and to protect your system from data problems. Adding network volumes to the mix provides a way to enable easy user-to-user data exchange or to simplify maintenance of similar networked computers. Understanding how a UNIX system places its data in its directories should be helpful in planning a volume configuration for your system to optimize performance and security.

Resources

Get products and technologies

  • The GNU Parted program is a flexible tool for creating partitions.
