Storing configuration data is a central concern of modern application development. Most users expect to be able to set their individual preferences for using an application, and that information has to be stored somewhere, preferably in some kind of reasonable format. Likewise, most applications rely on various data files to perform routine operations, and these files must be stored and retrievable. Even a simple Space Invaders clone needs some kind of system for storing high scores.
In addition to figuring out where to store all this boring but essential data, developers have, over the years, decided on various conventions for how it should be stored. UNIX® systems favor rc files (such as foo.rc) stored in the user's home directory. Older Windows™ systems denoted configuration files with INI (foo.ini) and stored them in the same directory as the program. The early Macs stored preferences in the Preferences folder within the System folder. Today, you'll find Windows config data stored in the registry, although the old INI files still show up now and again. The Mac has moved preferences into a subdirectory of the user's home directory.
Wading through all these options is tough enough as a user, but what about when you start to develop or customize your own applications? In this month's column, I'll shed some light on the various formats and conventions for storing configuration data. I'll also explain the essential principles involved: human readability, separation of data for different programs, and making configuration data easy to find.
Human-readable formats are an unequivocal win when it comes to storing configuration data. One of the most annoying aspects of the Classic Mac OS was that it made preferences almost completely opaque to users. Windows fared far better with its INI files, which had a clear and easy-to-read format and included such niceties as sections and variables, support for comments, and a convention of selecting reasonably clear names. UNIX rc files have always been a potluck; some are brilliant, some are horrible. UNIX loses points in human readability by not mandating consistent file formats. The lack of consistency is a serious problem.
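To make those INI niceties concrete, here is a minimal sketch in Python, using the standard configparser module. The file contents, section names, and key names are invented for illustration and do not come from any real program.

```python
import configparser

# Hypothetical INI content; the section and key names are illustrative only.
SAMPLE_INI = """\
; Settings for a hypothetical Space Invaders clone
[display]
fullscreen = yes
width = 800

[scores]
; The best score recorded so far
high_score = 12400
"""

def load_settings(text):
    """Parse INI text into a plain dict of {section: {key: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}

settings = load_settings(SAMPLE_INI)
print(settings["display"]["width"])
print(settings["scores"]["high_score"])
```

Sections, named values, and comments are all visible at a glance, and a user with nothing but a text editor can change any of them. Note that configparser returns every value as a string, so the program still has to convert and check them.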
Consistency is more important, and more closely related to readability, than a lot of developers realize. It's tempting to try to invent a better file format, but it's not such a good idea. There are enough formats already without bringing more into the mix. It's one thing for an operating system to introduce a standard format for all applications on that platform; it's another for every application to introduce a new format.
The modern Mac has done especially well by moving preferences data into its ever-so-elegant property list XML files, which actively encourage developers to use clear names for values. The code in Listing 1 shows just how well Apple understood this one.
Listing 1. Saving a preference on the Mac
[[NSUserDefaults standardUserDefaults] setObject:regCode forKey:@"RegCode"];
That's just one line of code (it's Objective-C, if you're wondering) and it does all the work for you -- opening the file, saving the value, the whole nine yards. Making that code so easy was no simple trick, but Apple scored by recognizing that it was important. The best way to ensure a usable system is to make it easy for developers to do the right thing. On the downside, you can't keep comments in property list files because any time the system updates them the comments are erased.
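The comment problem is easy to reproduce outside the Mac's own preference system. The sketch below uses Python's standard plistlib module rather than Apple's API, so treat it as an analogy and not as a description of what the system does internally. The file name, the save_preference helper, and the sample values are hypothetical.

```python
import os
import plistlib
import tempfile

def save_preference(path, key, value):
    """Set one key in a property list file, creating the file if needed."""
    try:
        with open(path, "rb") as f:
            prefs = plistlib.load(f)
    except FileNotFoundError:
        prefs = {}
    prefs[key] = value
    with open(path, "wb") as f:
        plistlib.dump(prefs, f)  # writes the XML format by default

# A hand-edited property list that contains a comment (hypothetical content).
HAND_EDITED = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <!-- registration code entered by the user -->
    <key>RegCode</key>
    <string>OLD-CODE</string>
</dict>
</plist>
"""

demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "com.example.invaders.plist")
with open(demo_path, "wb") as f:
    f.write(HAND_EDITED)

# Updating one value regenerates the whole file from the parsed data.
save_preference(demo_path, "RegCode", "NEW-CODE")

with open(demo_path, "rb") as f:
    rewritten = f.read()
print(rewritten.decode("utf-8"))
```

Because the file is rebuilt from the parsed keys and values, anything that is not a key or a value, comments included, does not survive the update.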
The Windows registry, as you might expect, does not so successfully support human-usability factors. Not only do you have to use a special program to edit your registry settings, but the registry settings themselves have names that seem calculated to discourage users. For instance, the setting to disable the annoying pop-ups that encourage you to "clean up your desktop" every so often is shown in Listing 2.
Listing 2. A variable name in the Windows registry
Note: The code in Listing 2 normally appears as a single line of code. It is on two lines of code in this article to meet printing requirements.
The names are egregiously long because the registry has to hold every setting for every kind of program for every user, as well as system-global settings. That leads me to my second consideration: the importance of separating data for different programs.
One of the most significant and consistent principles in storing configuration data is that config files should be separate from each other. You must be able to delete the preferences or settings for a single program without affecting the others in your system.
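A rough sketch of what that separation buys you, in Python: with one settings file per program, resetting a single program is a single file deletion. The directory layout, the program names, and the helper functions are all hypothetical.

```python
import os
import tempfile

def settings_path(config_root, program):
    """One file per program, named after the program (hypothetical layout)."""
    return os.path.join(config_root, program + ".ini")

def reset_settings(config_root, program):
    """Delete one program's settings; return True if a file was removed."""
    try:
        os.remove(settings_path(config_root, program))
        return True
    except FileNotFoundError:
        return False

# Demo: two programs keep their settings in separate files.
config_root = tempfile.mkdtemp()
for program in ("invaders", "ledger"):
    with open(settings_path(config_root, program), "w") as f:
        f.write("[general]\nfirst_run = no\n")

# Resetting one program leaves the other program's file untouched.
removed = reset_settings(config_root, "invaders")
remaining = sorted(os.listdir(config_root))
print(removed, remaining)
```

Nothing comparable is possible when every program's settings share one file or one database, because a reset then means editing shared data in place.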
The big losers here are Windows and PalmOS. The Windows registry -- oft-touted as a solution to all sorts of problems -- has created a nightmarish single point of failure for configuration data. A troubled registry can easily cost you the settings and configuration data for every program in your system. Also, since detailed system internals are stored in the registry along with user preferences, you can't copy them over to a new system.
PalmOS has the same problem: you can't copy the Saved Preferences file from one Palm system to another without risking catastrophic failure. On the Windows side, the registry's poor portability likely explains the continued existence of .INI files on modern systems; .INI files allow programs to store their settings in a way that is more convenient to the user.
Where to store data is a tricky question. The convention of storing preferences in the application's directory works fairly well on Windows, but is inconvenient on the Mac, where applications are generally self-contained bundles. One obvious weakness of storing preferences and other changeable data with the application is that the data is no longer part of the user's files. This becomes even more important if multiple users share a machine; each user should have an individual copy of preference files. (The same issue can apply to saved games, documents, and other data. I once lost a couple of months' worth of accounting information because an accounting package kept its data files in a subdirectory of its application directory, rather than in the directory with all my other personal files.)
A centralized location for user data is preferable in most cases. The leap from a centralized directory to a centralized file seems reasonable, until you consider the most terrifying phrase in all of reliable computing: single point of failure. If any program on your computer can corrupt the One Global Database, it poses a threat to every program on your computer.
UNIX doesn't do as well here as it could. Most UNIX systems have system-wide settings in a reasonable-sounding location (such as /etc), and personal settings in home directories. But then you'll find another half-dozen application settings scattered where you'd least expect to find them. Wouldn't it be nice if application developers just stored configuration files in a standard location?
So what do you do when you are the application developer? The first thing, if at all possible, is to use some kind of system-standard feature or library for configuration data. On Windows, you might consider making an .INI file in the user's directory. If you make it in your own directory, it will get lost when the user has to reinstall the system. If you use the registry, it will get lost when the user has to reinstall the system. On the Mac, the default preference stuff is pure gold, so don't mess with it. On UNIX-like systems, you should probably go with a file, or even subdirectory, in the user's home directory. The convention of .programname is a good one, in general, but if your program name is likely to be common you should use something a little more distinctive.
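As a rough illustration of that advice, the Python sketch below picks a per-user location for a program's settings on each platform. The user_config_path function and its parameters are hypothetical, and the exact file names are assumptions. On the Mac in particular, real code should go through the system's preference API; the path appears here only to show where per-user files end up.

```python
import os
import sys

def user_config_path(program, platform=None, home=None):
    """Return a per-user settings path for `program`.

    `platform` and `home` default to the running system's values; they are
    parameters only so the choice can be examined for other platforms.
    """
    platform = platform or sys.platform
    home = home or os.path.expanduser("~")
    if platform.startswith("win"):
        # An .INI file in the user's directory, not in the program's own
        # directory and not in the registry.
        return os.path.join(home, program + ".ini")
    if platform == "darwin":
        # Per-user preferences live under the home directory on the Mac.
        return os.path.join(home, "Library", "Preferences", program + ".plist")
    # UNIX-like systems: the .programname convention in the home directory.
    return os.path.join(home, "." + program)

print(user_config_path("invaders"))
```

Whatever the platform, the result sits inside the user's own files, so it is kept separately for each user of a shared machine and travels with the rest of that user's data.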
If you have to define your own format, favor human-readable formats. The Mac property list format is great, and nothing prevents you from using it elsewhere, although it does require you to have at least most of a functioning XML parser. Something similar to .INI files is easy and cheap to implement, and might be a good choice for most purposes; do keep an eye on input validation, though, as it's potentially vulnerable to insertion attacks. (Of course, users can just edit the file anyway.)
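To show what that validation might look like, here is a minimal Python sketch of a function that formats one line of an INI-style file and refuses input that could smuggle in extra lines or sections. The format_ini_line function and its rules are assumptions for illustration, not a complete defense.

```python
def format_ini_line(key, value):
    """Return 'key = value', refusing input that could inject extra lines."""
    key, value = str(key), str(value)
    # A key must not look like a comment or a section header, and must not
    # contain the separator or a line break.
    if not key or key[0] in ";#" or any(ch in key for ch in "=[]\r\n"):
        raise ValueError(f"unsafe key: {key!r}")
    # A line break in a value would let it add keys or sections of its own.
    if "\r" in value or "\n" in value:
        raise ValueError(f"unsafe value for {key!r}")
    return f"{key} = {value}"

print(format_ini_line("player_name", "Pat"))

try:
    # A value from an untrusted source that tries to add a second setting.
    format_ini_line("player_name", "Pat\nadmin = yes")
except ValueError as err:
    print("rejected:", err)
```

The check matters most when a value comes from somewhere other than the user's own keyboard, such as a network peer or an imported file, and ends up in a settings file that the program later trusts.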
While the need to move an application from one computer platform to another isn't especially common, it can be a huge source of stress when it comes up. The more you can do to make this process easy, the better. Reinstallation isn't just a hassle; it can also create serious security holes for some programs. If you make it both easier and safer for users to copy applications, you will keep your users happy while you also secure your apps.
As it turns out, storing configuration data isn't so tricky, and avoiding the temptation to re-invent the wheel is half the battle. In this month's column, I've explained the importance of consistency in handling configuration data: consistent (and human-readable) naming conventions, consistent data formats, and a consistent storage location go a long way toward improving an application, from both the user's perspective and the developer's.
This week's action item: Try to move an installed application from one machine to another on at least two platforms; for instance, Mac and UNIX, or Linux® and Windows. How much time do you spend trying to figure out where the settings are stored?