OpenAFS helps corral distributed data

Next-generation NFS-like file system might be the answer to data headaches

Frank Pohlmann (frank@linuxuser.co.uk), U.K. Technical Editor, Linuxuser and Developer
Frank Pohlmann dabbled in the history of Middle Eastern religions before various funding committees decided that research in the history of religious polemics was quite irrelevant to the modern world. He has focused on his hobby -- free software -- ever since. He admits to being the technical editor of the U.K.-based Linuxuser & Developer and came to Linux and FreeBSD through a strong interest in UNIX kernel internals and Linux applications for writers and artists.

Summary:  Distributed file systems haven't had much press lately because it's mostly corporate and educational networks that use them, adding up to only thousands of users. Conceptually, it isn't always clear how such systems fit into the open source file system puzzle. The Open Andrew File System (OpenAFS) is a mature alternative to the Network File System (NFS) that scales to large numbers of users and relieves management pain.

Date:  17 May 2005
Level:  Introductory

Users understand the concept of a file system in two ways. First, a file system is a way to organize files, the directories that contain them, and the partitions holding a directory structure. Second, a file system is the way in which files are organized and mapped to the raw metal. Naturally, further layers exist in between, like the virtual file system (VFS) layer and the actual memory management routines, but when it comes to managing structured information accessible to users, it makes sense for power users to peer into file system internals and get just a sulfurous whiff of the kernel's infernal recesses.

The metal might consist of RAM or hard disks, but in either case, file system data structures organize the sectors and bytes formatted by the hardware manufacturer. Although rather crude, users can sustain this conceptual split fairly comfortably in their working lives. Tools are available that increase, for example, the speed with which users can access files greater than a certain size. Tools are also available to help reorganize directories and files, but these tools keep us safe from bits, bytes, and sectors.

File system metaconcepts

A classic case of this conceptual distinction is the way that FreeBSD -- harking back to the BSD UNIX® world -- uses UNIX File System V2 (UFS2) to organize data on the disk and the Fast File System (FFS) to organize files into directories and optimize directory access. Linux® systems work a bit differently because Linux permits much more than just one or two file systems natively. Thus, the VFS layer makes it possible for Linux users to add new file system support without worrying too much about the way in which Linux manages memory.

When I talk about further distinctions like static and journal file systems, I'm emphasizing the consistency and, to some extent, security of file system contents. Again, in terms that the BSD UNIX world used to view things, static and journal file systems relate to the way in which the UNIX File System (UFS) organizes and secures files. Although Linux file systems have encompassed journal file systems since the Journal File System (JFS), the next-generation file system (XFS), and the early ReiserFS were made available, another area in which neither technical journalism nor corporate publicity sheds much light is distributed file systems.


What we learned from NFS

This state of affairs is related to the fact that today, it would be judged imprudent to make networkwide file system layers available via TCP or User Datagram Protocol (UDP) to a large number of users. Horror scenarios surrounding pre-V3 NFS put off many administrators managing networks with more than a few dozen users. In addition, the appearance of multiple-processor architectures supported by extremely fast motherboards seems to make distributed file system issues a lesser priority. Speed seems guaranteed by hardware, rather than by intelligently implemented distributed systems. Given that distributed file systems tend to rely on underlying file system implementations -- for example, the existing ext2, ext3, and ReiserFS file system drivers -- distributed file systems appear to be confined to the realms of large university networks and the occasional scientific or corporate network.

So, are distributed file systems a third layer on top of the two we have mentioned? One large issue in modern networking is getting heterogeneous networks to cooperate. (Samba is a prominent example.) But you need to understand that today, we have three major players in the file system puzzle: the group of Microsoft® Windows® file systems (FAT16, FAT32, and NTFS file system); Apple Mac OS X (HFS+); and native Linux journal file systems (mostly ReiserFS and ext3). Samba helps get Windows and Linux file systems to cooperate, but it is not meant to make access to files on all major file systems uniformly quick and easy to administer.

One could cite NFS V4 as an attempt to resolve this problem, but given that Request for Comments (RFC) 3530 dealing with NFS V4 is only two years old and NFS4 for kernel V2.6 is fairly new, I'd hesitate to recommend it for production servers. Fedora Core 2 and 3 provide NFS4 patches and NFS4 utilities that demonstrate the rather impressive progress developers have made since the days when NFS forced suffering network administrators to open more ports and configure separate clients for each namespace exported to nervous users. RFC 3530 addresses most security concerns. Still, NFS directories have to be mounted individually. You can make things secure using unified sign-ons and Kerberos, but it all needs work.


OpenAFS rationale

OpenAFS tries to take the pain out of installing and administering software that makes differing file systems cooperate. OpenAFS also works to make differing file systems cooperate efficiently. Although the original metaphor for UNIX and its fascinating successor, Plan 9, was the file, commercial realities dictated that rather than rearchitect modern networked file systems completely, another distributed file system layer had to be added.

Carnegie Mellon University programmers developed AFS in 1983. In 1989, the university set up a company called Transarc to sell services based on AFS. IBM acquired Transarc in 1994 and, in 2000, made AFS available as an open source product under the name OpenAFS. The saga does not end there, however, because AFS spawned other distributed file systems like Coda and Arla, which I cover later. Clients exist for all major operating systems, and documentation is plentiful, if somewhat dated. Gentoo.org made a special effort to make OpenAFS accessible to Linux users, even though other organizations still seem to turn to NFS when they need distributed file systems.

OpenAFS architecture

OpenAFS is organized around a group of file servers, known as a cell. Each server's identity is usually hidden behind the file system itself. Users logging in from an AFS client cannot tell which server they are working on because, from their point of view, they work on a single system with recognizable UNIX file system semantics. File system content is usually replicated across the cell so that the failure of one hard disk does not interrupt work at the OpenAFS client. OpenAFS relies on a large client-side cache, of up to 1 GB, to speed up access to frequently used files. It also uses Kerberos for authentication and access control lists (ACLs) for fine-grained access control that is not based on the usual Linux and UNIX security models.
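
To make the ACL idea concrete, here is a minimal sketch of per-directory access control lists in Python. It is a toy model, not OpenAFS code: the class name, the principals (alice, bob, project-team), and the method names are all invented for illustration. The only detail taken from AFS itself is the conventional set of seven rights letters, rlidwka.

```python
# Toy model of AFS-style directory ACLs (illustrative only, not OpenAFS code).
# The seven rights letters follow the conventional AFS set "rlidwka".
AFS_RIGHTS = {
    "r": "read",
    "l": "lookup",
    "i": "insert",
    "d": "delete",
    "w": "write",
    "k": "lock",
    "a": "administer",
}


class DirectoryACL:
    """Maps user or group names to a set of rights letters for one directory."""

    def __init__(self):
        self.entries = {}

    def grant(self, principal, rights):
        unknown = set(rights) - set(AFS_RIGHTS)
        if unknown:
            raise ValueError(f"unknown rights: {sorted(unknown)}")
        self.entries[principal] = set(rights)

    def allows(self, principal, groups, right):
        """True if the principal, or any group it belongs to, holds the right."""
        names = [principal, *groups]
        return any(right in self.entries.get(name, set()) for name in names)


acl = DirectoryACL()
acl.grant("alice", "rlidwka")    # hypothetical owner with full rights
acl.grant("project-team", "rl")  # hypothetical group: read and lookup only

print(acl.allows("alice", [], "w"))              # True
print(acl.allows("bob", ["project-team"], "r"))  # True
print(acl.allows("bob", ["project-team"], "w"))  # False
```

The point of the sketch is that rights attach to a directory and to named users or groups, rather than to the owner/group/other bits of the classic UNIX model.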

Except for the cache manager, which happens to be part of OpenAFS -- curiously only running with ext2 as an underlying file system -- the basic superficial structure of OpenAFS resembles modern NFS implementations. The basic architectures do not look alike at all, though, and you must view any parallels with a large dose of skepticism. For those of us who still prefer to use NFS, but would like to take advantage of OpenAFS facilities, it is possible to use a so-called NFS/AFS translator. As long as an OpenAFS client machine is configured as an NFS server machine, you should be able to enjoy the advantages of both file systems.


How OpenAFS manages its world

NFS is location-dependent, mapping local directories to remote file system locations. OpenAFS hides file locations from users. Because all source files are likely to be saved in read-write copies at various replicated file server locations, you must keep the replicated copies in sync. You do so through a technology known as Ubik, a play on the word ubiquitous in an Eastern European spelling. Ubik processes keep the files, directories, and volumes on the AFS file system in sync, but usually systems with more than three file server processes running benefit the most. A system administrator can group several AFS cells -- the old AFS abbreviation has been retained within OpenAFS file system semantics -- into an AFS site. The administrator decides on the number of AFS cells and the extent to which the cells can make storage and files available to other AFS cells within the site.
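
One way to see why the number of synchronizing servers matters is a little majority arithmetic. The Python sketch below assumes that Ubik-style synchronization needs a strict majority of servers to be reachable before it accepts updates. The function names are invented, and the real protocol has further details that are not modeled here.

```python
# Simplified majority arithmetic; an assumption-laden sketch, not Ubik itself.

def has_quorum(reachable: int, total: int) -> bool:
    """True if strictly more than half of the servers are reachable."""
    return 2 * reachable > total


def tolerated_failures(total: int) -> int:
    """Largest number of servers that can fail while a majority remains."""
    return (total - 1) // 2


for total in range(1, 6):
    print(total, "servers tolerate", tolerated_failures(total), "failure(s)")
```

Under this assumption, one or two servers tolerate no failures, three or four tolerate one, and five tolerate two.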

Partitions, volumes, and directories

AFS administrators divide cells into so-called volumes. Although volumes can be co-extensive with hard-disk partitions, most administrators would not fill a complete partition with a single volume. AFS volumes are actually managed by a separate UNIX-type process called the Volume Manager. You can mount a volume in a manner familiar from a UNIX file system directory. However, you can move an AFS volume from file server to file server -- again, a UNIX-type process -- but a UNIX directory cannot be physically moved from partition to partition. AFS automatically tracks the location of volumes and directories via the Volume Location Manager and keeps track of replicated volumes and files. Therefore, the user never needs to worry whenever a file server ceases operation unexpectedly because AFS would just switch the user to a replicated volume on a different file server machine without the user likely noticing.
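
The location lookup and failover just described can be sketched as a small Python model. Everything here is hypothetical: the class, the server names (fs1, fs2, fs3), and the volume name are invented, and the real Volume Location Manager is a network service rather than an in-memory dictionary.

```python
# Toy volume location database (illustrative only).

class VolumeLocationDB:
    """Tracks which file servers hold a copy of each volume."""

    def __init__(self):
        self._sites = {}  # volume name -> ordered list of server names

    def register(self, volume, servers):
        self._sites[volume] = list(servers)

    def move(self, volume, old_server, new_server):
        """Record that a copy moved; clients keep using the same volume name."""
        sites = self._sites[volume]
        sites[sites.index(old_server)] = new_server

    def locate(self, volume, is_up):
        """Return the first server holding the volume that is reachable."""
        for server in self._sites.get(volume, []):
            if is_up(server):
                return server
        raise LookupError(f"no reachable server holds volume {volume!r}")


vldb = VolumeLocationDB()
vldb.register("user.alice", ["fs1", "fs2"])  # hypothetical volume and servers

down = {"fs1"}                               # pretend fs1 has just failed
print(vldb.locate("user.alice", lambda s: s not in down))  # fs2

vldb.move("user.alice", "fs2", "fs3")        # administrator moves one copy
print(vldb.locate("user.alice", lambda s: s not in down))  # fs3
```

The client only ever names the volume. Which machine answers is decided at lookup time, which is what lets an administrator move volumes without users noticing.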

Users never work on files located on AFS servers. They work on files that have been fetched from file servers by the client-side cache managers. The Cache Manager is a rather interesting beast that lives in the client's operating system kernel. In the case of Linux, a patch would be added to the kernel. (You can run the Cache Manager on any kernel from 2.4 onward.)

Cache Manager

The Cache Manager can respond to requests from a local application to fetch a file from across the AFS file system. Of course, if the file is a source file you change often, it might not be ideal that the file is likely to exist in several replicated versions. Because users are likely to change an often-requested source file frequently, you have two sets of problems: First, the file is likely to be kept in the client cache, as well as on several replicated volumes on several file server machines; and second, the Cache Manager has to update all volumes. The file server process sends the file to the client cache with a callback attached to it so that the system can deal with any changes happening somewhere else. If a user adds changes to a replicated file cached somewhere else, the original file server will activate the callback and remind the original cached version that it needs to be updated.
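
The callback mechanism can be illustrated with a short Python sketch. It assumes a single file server and in-memory caches. The class and method names, and the path under /afs, are hypothetical, and the model ignores replication, authentication, and network failures.

```python
# Toy model of callback-based cache invalidation (not OpenAFS code).

class FileServer:
    def __init__(self):
        self.files = {}      # path -> contents
        self.callbacks = {}  # path -> caches that were promised a callback

    def fetch(self, path, cache):
        """Hand out the file and remember that this cache holds a copy."""
        self.callbacks.setdefault(path, set()).add(cache)
        return self.files[path]

    def store(self, path, data, writer):
        """Accept new contents and notify every other cache holding a copy."""
        self.files[path] = data
        for cache in self.callbacks.pop(path, set()):
            if cache is not writer:
                cache.break_callback(path)
        # The writer's cached copy is current, so it keeps a callback.
        self.callbacks.setdefault(path, set()).add(writer)


class CacheManager:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, path):
        if path not in self.cache:  # cache miss: fetch from the server
            self.cache[path] = self.server.fetch(path, self)
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data
        self.server.store(path, data, self)

    def break_callback(self, path):
        """The server says our copy is stale: drop it and refetch on next read."""
        self.cache.pop(path, None)


path = "/afs/example/notes.txt"  # hypothetical path
server = FileServer()
server.files[path] = "v1"
a, b = CacheManager(server), CacheManager(server)

print(a.read(path))     # v1
print(b.read(path))     # v1
b.write(path, "v2")     # the server notifies a's cache
print(path in a.cache)  # False
print(a.read(path))     # v2
```

Note the design choice: the server does the remembering. A client can trust its cached copy without polling until the server tells it otherwise.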

Distributed version control systems face this classic problem, but with an important difference: Distributed version control systems work perfectly well when disconnected, while AFS cannot have part of its file system cut off. The separated AFS section would not be able to reconnect with the original file system. File server processes that fail have to resynchronize with the still-running AFS file servers, but cannot add new changes that might have been preserved locally after they were cut off.


AFS descendants

AFS has provided an obvious point of departure for several attempts at new file systems. Two such systems incorporate lessons developers learned from the original distributed file system architecture: Coda and the Swedish open source volunteer effort, Arla.

The Coda file system was the first attempt at improving the original AFS. Starting in 1987 at Carnegie Mellon University, developers meant for Coda to be a conscious improvement on AFS, which had reached V2.0 by that time. In the late 1980s and early '90s, the Coda file system premiered a different cache manager: Venus. Although the basic feature set of Coda resembles that of AFS, Venus enables continued operation for the Coda-enabled client even if the client has been disconnected from the distributed file system. Venus has exactly the same function as the AFS Cache Manager, which takes its file system jobs from the VFS layer inside the kernel.

Connection breakdowns between Coda servers and the Venus cache manager are not always detrimental to network function: A laptop client must be able to work away from the central servers. Thus, Venus stores all updates made while disconnected in the client modification log. When the cache manager reconnects to the central servers, the system reintegrates the client modification log, replaying the client's updates on the servers so that they become available across the file system.
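
The logging and reintegration steps can be sketched in a few lines of Python. This is a deliberately naive model: the class and method names are invented, the log is a plain list, and there is no conflict detection, which a real implementation would need when two clients change the same file while disconnected.

```python
# Toy model of a client modification log (CML); not Coda code.

class CodaServer:
    def __init__(self):
        self.files = {}

    def apply(self, op, path, data=None):
        if op == "store":
            self.files[path] = data
        elif op == "remove":
            self.files.pop(path, None)


class VenusClient:
    def __init__(self, server):
        self.server = server
        self.connected = True
        self.cache = {}
        self.cml = []  # updates made while disconnected, in order

    def store(self, path, data):
        self.cache[path] = data
        self._record("store", path, data)

    def remove(self, path):
        self.cache.pop(path, None)
        self._record("remove", path)

    def _record(self, op, path, data=None):
        if self.connected:
            self.server.apply(op, path, data)  # write straight through
        else:
            self.cml.append((op, path, data))  # log for later replay

    def reconnect(self):
        """Reintegration: replay the log on the server, then clear it."""
        self.connected = True
        for op, path, data in self.cml:
            self.server.apply(op, path, data)
        self.cml.clear()


server = CodaServer()
laptop = VenusClient(server)

laptop.connected = False             # the laptop leaves the network
laptop.store("draft.txt", "hello")   # hypothetical file names
laptop.store("scratch.txt", "tmp")
laptop.remove("scratch.txt")
print(server.files)                  # {}

laptop.reconnect()
print(server.files)                  # {'draft.txt': 'hello'}
```

Replaying the operations in their original order is what lets the create-then-remove pair cancel out cleanly on the server.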

Disconnected operation can create other problems, but the Venus cache manager illustrates that distributed file systems can be extended to encompass much more than complex networks that are always running in a connected fashion.

Programmers have been developing Arla, a Swedish project that provides a GPLed implementation of OpenAFS, since 1993, even though most of the development and ports have taken place since 1997. Arla imitates OpenAFS fairly well, except that the XFS file system must function on all operating systems that Arla runs on. Arla has reached V0.39 and, just like OpenAFS, runs on all BSD flavors, a good number of Linux kernels since kernel V2.0x, and Sun Solaris. Arla does partly implement a feature for AFS that was not originally in the AFS code: disconnected operation. Mileage may vary, however, and developers have not completed testing.

Other AFS-type file systems are available, like the GPLed InterMezzo, but they do not replicate AFS command-line semantics or its architecture. The world of open source distributed file systems is very much alive, and other distributed file systems have found applications in the mobile computing world.


Resources

  • Check out OpenAFS for sources, binaries, and documentation.

  • NFS has progressed, and you can find the RFC and other documentation on the NFS Version 4 Web site.

  • Find information about the original Andrew File System, although many commands are identical to the OpenAFS version.

  • Carnegie Mellon University still maintains the Coda file system.

  • Find Coda file system documentation, even though this version is somewhat dated.

  • Arla provides an entry point. Documentation tends to be between terse and nonexistent.

  • A fairly popular attempt at writing a new distributed file system is the InterMezzo distributed file system.

  • Gentoo offers downloads, documentation, and news about this compile-it-from-scratch version of Linux.

