Discover the Boost Filesystem Library
Creating platform-independent code
One of the most common issues with the C++ language, and indeed with the C++ standard, is the lack of a well-defined library that helps deal with file system queries and manipulation. This absence leads programmers to use the native application program interfaces (APIs) provided by the operating system, which makes for code that isn't portable across platforms.
Consider a simple situation: You need to figure out whether a file is of type Directory. On the Microsoft® Windows® platform, you could do this by calling the GetFileAttributes library function, declared in the windows.h header file:
DWORD GetFileAttributes (LPCTSTR lpFileName);
For directories, the returned attribute mask has the FILE_ATTRIBUTE_DIRECTORY bit set, and your code must check for it. On UNIX® and Linux® platforms, the same functionality could be achieved by using the stat or fstat routines and the S_ISDIR macro defined in sys/stat.h. You must also understand the stat structure. Here's the corresponding code:
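#include <sys/stat.h>
#include <stdio.h>

int main()
{
    const char* pathname = "/usr/local/include"; /* any path name */
    struct stat s1;
    int status = stat(pathname, &s1);
    if (status != 0) {
        perror("stat"); /* the path could not be queried */
        return 1;
    }
    printf("Path is a directory : %d\n", S_ISDIR(s1.st_mode));
    return 0;
}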
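To make the porting burden concrete, the two native approaches above can be hidden behind a single function with a preprocessor switch. This is only a sketch: the function name is_directory_native is hypothetical, and it assumes that the compiler defines _WIN32 when targeting Windows.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/stat.h>
#endif

// Hypothetical helper: returns true if pathname names an existing directory.
bool is_directory_native(const std::string& pathname)
{
#ifdef _WIN32
    // GetFileAttributesA returns a bit mask, or INVALID_FILE_ATTRIBUTES on failure.
    DWORD attrs = GetFileAttributesA(pathname.c_str());
    return attrs != INVALID_FILE_ATTRIBUTES &&
           (attrs & FILE_ATTRIBUTE_DIRECTORY) != 0;
#else
    // stat fills in the structure and returns 0 on success.
    struct stat s1;
    return stat(pathname.c_str(), &s1) == 0 && S_ISDIR(s1.st_mode);
#endif
}
```

Calling is_directory_native(".") should report true on either platform, but every new query (file size, timestamps, links) would need another pair of branches like this one.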
For programs with heavy-duty I/O, such inconsistencies imply a significant engineering effort for porting the code across platforms. It is in this light that this article introduces the Boost Filesystem Library. This widely used library provides a safe, portable, and easy-to-use C++ interface for performing file system operations. It's available as a free download from the Boost site.
Your first program using boost::filesystem
Before delving into the finer nuances of the Boost Filesystem Library, Listing 1 shows the code to figure out whether a file is of type Directory using Boost APIs.
Listing 1. Code to determine whether a file is of type Directory
#include <stdio.h>
#include "boost/filesystem.hpp"

int main()
{
    boost::filesystem::path path("/usr/local/include"); // random pathname
    bool result = boost::filesystem::is_directory(path);
    printf("Path is a directory : %d\n", result);
    return 0;
}
The code is self-explanatory, and you don't need to know any system-specific routines. The code is verified to compile on both gcc-3.4.4 and cl-13.10.3077 without modification.
Understanding the Boost path object
The key to understanding the Boost Filesystem Library is the path object, as several routines defined in the Filesystem Library are only meant to work with a proper path object. File system paths are often operating system dependent. For example, it's well known that UNIX and Linux systems use the virgule (/) character as a directory separator, while Windows uses a backslash (\) for the same purpose. The boost::filesystem::path class is meant to abstract precisely this difference. A path object can be initialized in several ways, the most common being an initialization with a char* or std::string, as shown in Listing 2.
Listing 2. Ways to create a Boost path object
path(); // empty path
path(const char* pathname);
path(const std::string& pathname);
path(const char* pathname, boost::filesystem::path::name_check checker);
path(const std::string& pathname, boost::filesystem::path::name_check checker);
While initializing the path object, the PATHNAME variable could be provided in either the native format or in the portable format defined by the Portable Operating System Interface (POSIX) committee. Both approaches have their relative trade-offs in practice. Consider a situation in which you want to manipulate a directory that your software created, located in /tmp/mywork on UNIX and Linux systems and in C:\tmp\mywork on Windows. There are several approaches to this problem. Listing 3 shows the native format-oriented approach.
Listing 3. Initialize the path using the native format
#ifdef UNIX
boost::filesystem::path path("/tmp/mywork");
#else
boost::filesystem::path path("C:\\tmp\\mywork");
#endif
A single #ifdef is needed to initialize the path on a per-operating system basis. However, if you prefer using the portable format, take a look at Listing 4.
Listing 4. Initialize the path using a portable format
boost::filesystem::path path("/tmp/mywork");
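The idea of writing the path once, in the portable format, can be sketched as follows. Note the assumptions: this sketch uses std::filesystem from C++17, the standardized library modeled on Boost.Filesystem, because it needs no separately built library; the alias fs and the function names work_dir and work_dir_portable are hypothetical. With Boost, the alias would point at boost::filesystem instead.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem; // assumption: C++17; Boost users would alias boost::filesystem

// Hypothetical helper: the working directory, written once in the portable format.
fs::path work_dir()
{
    return fs::path("/tmp/mywork");
}

// Returns the path spelled with forward slashes, regardless of the platform.
std::string work_dir_portable()
{
    return work_dir().generic_string();
}
```

No preprocessor switch is needed here; the path object takes care of converting between the portable spelling and whatever the operating system expects.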
Note that path::name_check refers to a name-checking function prototype. The name-check function returns true if its argument PATHNAME is valid for a particular operating system or file system. Several name-checking functions come with the Boost Filesystem Library, and you're welcome to provide your own variants as well. Some of the more commonly used name-check functions are the Boost-provided portable_posix_name and windows_name.
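A name-check function is simply a predicate over a name. The following sketch shows what a user-provided variant might look like. It is a hypothetical stand-in written in plain C++, loosely modeled on the idea behind portable_posix_name; it is not the Boost implementation, and the name is_portable_name and the exact character set are assumptions.

```cpp
#include <string>

// Hypothetical checker: accept only non-empty names made of
// letters, digits, '.', '_' and '-'.
bool is_portable_name(const std::string& name)
{
    static const std::string allowed =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789._-";
    return !name.empty() &&
           name.find_first_not_of(allowed) == std::string::npos;
}
```

With this checker, a name such as mywork passes, while my*work or an empty string is rejected.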
Overview of the path member functions
The path object comes with several member functions. These member routines don't modify the file system but do provide useful information based on the path name. This section provides an overview of several of these routines:
- const std::string& string( ): This routine returns the string with which the path was initialized, formatted per the path grammar rules.
- std::string root_directory( ): Given a path, this API returns the root directory if there is one; otherwise, it returns an empty string. For example, if the path consists of /tmp/var1, then this routine returns /, which is the root of the UNIX file system. However, if the path is a relative path such as ../mywork/bin, this routine returns an empty string.
- std::string root_name( ): Given a path that begins from the root of the file system, this routine returns the root name if the path has one (for example, the drive specifier C: on Windows); otherwise, it returns an empty string.
- std::string leaf( ): Given an absolute path name (for example, /home/user1/file2), this routine provides only the string corresponding to the file name (that is, file2).
- std::string branch_path( ): This is the complementary routine to leaf. Given a path, it returns all the elements used to construct it except the last. For example, for a path initialized with /a/b/c, path.branch_path( ) returns /a/b. For paths with a single element such as c, this routine returns an empty string.
- bool empty( ): If the path object contains an empty string (for example, path path1("")), this routine returns true; otherwise, it returns false.
- boost::filesystem::path::iterator: This iterator type is used to traverse the individual elements of the path. Consider Listing 5.
Listing 5. Using the path::iterator (begin and end interface)
#include <iostream>
#include "boost/filesystem.hpp"

int main()
{
    boost::filesystem::path path1("/usr/local/include"); // random pathname
    boost::filesystem::path::iterator pathI = path1.begin();
    while (pathI != path1.end())
    {
        std::cout << *pathI << std::endl; // one path element per line
        ++pathI;
    }
    return 0;
}
The output of the above program is /, usr, local, include, in that order, signifying the directory hierarchy.
- path operator / (char* lhs, const path& rhs): This routine is a non-member function of path. It returns a concatenation of the path formed using lhs and rhs. It automatically inserts / as the path separator, as Listing 6 shows.
Listing 6. Concatenation of path strings
#include <iostream>
#include "boost/filesystem.hpp"

int main()
{
    boost::filesystem::path path1("local/include");
    // operator/ joins the two operands with a path separator
    boost::filesystem::path path2 = "/usr" / path1;
    std::cout << "Path: " << path2.string() << std::endl;
    return 0;
}
// result: Path: /usr/local/include
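The decomposition routines described in this section can be tried out together in a short sketch. The assumptions are stated up front: it uses std::filesystem from C++17 rather than Boost, where filename( ) and parent_path( ) play the roles that this article describes for leaf( ) and branch_path( ), and the helper names elements, leaf_of, branch_of, and join are hypothetical.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem; // assumption: C++17 standard library

// Collects the elements of a path in iteration order.
std::vector<std::string> elements(const fs::path& p)
{
    std::vector<std::string> out;
    for (const fs::path& element : p) {
        out.push_back(element.generic_string());
    }
    return out;
}

// filename() corresponds to leaf(): the last element of the path.
std::string leaf_of(const fs::path& p)
{
    return p.filename().generic_string();
}

// parent_path() corresponds to branch_path(): everything except the last element.
std::string branch_of(const fs::path& p)
{
    return p.parent_path().generic_string();
}

// operator/ inserts the separator between the two operands.
std::string join(const fs::path& lhs, const fs::path& rhs)
{
    return (lhs / rhs).generic_string();
}
```

For /usr/local/include, elements should yield /, usr, local, and include, matching the output described for Listing 5, and join("/usr", "local/include") should rebuild the full path.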
Error handling
File system operations often encounter unexpected issues, and the Boost Filesystem Library reports run-time errors using C++ exceptions. The boost::filesystem::filesystem_error class is derived from the std::runtime_error class. Functions in the library use filesystem_error exceptions to report operational errors. Corresponding to the different possible error types, there are error codes that are defined in the Boost headers. User code typically resides within try...catch blocks that catch filesystem_error exceptions and report relevant error messages. Listing 7 provides a small example in which a file is being renamed, except that the file in the from path doesn't exist.
Listing 7. Error handling in Boost
#include <iostream>
#include "boost/filesystem.hpp"

int main()
{
    try {
        boost::filesystem::path path("C:\\src\\hdbase\\j1");
        boost::filesystem::path path2("C:\\src\\hdbase\\j2");
        boost::filesystem::rename(path, path2);
    }
    catch (const boost::filesystem::filesystem_error& e) {
        // report the failure; what() describes the failed operation
        std::cerr << e.what() << std::endl;
    }
    return 0;
}
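The same pattern can be wrapped in a reusable function so that callers receive a simple success flag. This is a minimal sketch under stated assumptions: it uses std::filesystem from C++17, whose filesystem_error class follows the Boost design, and the helper name try_rename is hypothetical.

```cpp
#include <filesystem>
#include <iostream>

namespace fs = std::filesystem; // assumption: C++17 standard library

// Hypothetical helper: returns true on success and reports the failure otherwise.
bool try_rename(const fs::path& from, const fs::path& to)
{
    try {
        fs::rename(from, to);
        return true;
    } catch (const fs::filesystem_error& e) {
        // what() describes the failure; code() carries the underlying error code.
        std::cerr << e.what() << " (error code " << e.code().value() << ")\n";
        return false;
    }
}
```

Catching the exception by const reference avoids copying and slicing the exception object, which is why the sketch does not catch by value.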
Categories of function in the Boost Filesystem Library
The boost::filesystem namespace provides different categories of function: Some, like is_directory, are meant to query the file system, while others, like create_directory, actively modify it. Based on their functionality, the functions are broadly classified in the following categories:
- Attribute functions: Provide for miscellaneous information such as file size and disk usage.
- File system-manipulation functions: Meant to create regular files, directories, and symbolic links; copy and rename files; and provide for deletion.
- Utility/predicate functions: Test for the existence of a file, and so on.
- Miscellaneous convenience functions: Change file name extensions programmatically, and so on.
Attribute functions
The Boost Filesystem Library consists of the following attribute functions:
- uintmax_t file_size(const path&): Returns the size of a regular file in bytes.
- boost::filesystem::space_info space(const path&): Takes in a path and returns a space_info structure defined as:
struct space_info
{
    uintmax_t capacity;
    uintmax_t free;
    uintmax_t available;
};
Based on the disk partition to which the file system belongs, this routine returns the same disk usage statistics in bytes for all directories in that partition. For example, both C:\src\dir1 and C:\src\dir2 would return the same disk use data.
- std::time_t last_write_time(const path&): Returns the last modification time of a file.
- void last_write_time(const path&, std::time_t new_time): Modifies the last modification time of a file.
- const path& current_path( ): Returns the full path for the current working directory of the program. (Note that this path may not be the same directory from which the program was originally run, because it is programmatically possible to change directories.)
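A short sketch can show the attribute functions at work. The assumptions: it uses std::filesystem from C++17, where file_size and space carry the same names as in this section; it places a scratch file in the directory returned by temp_directory_path, a function that this article does not cover; and the helper names size_after_writing and available_bytes are hypothetical.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem; // assumption: C++17 standard library

// Hypothetical helper: writes text to a scratch file and returns the size
// that file_size reports for it, removing the file afterwards.
std::uintmax_t size_after_writing(const std::string& text)
{
    const fs::path scratch = fs::temp_directory_path() / "attribute_sketch.txt";
    {
        std::ofstream out(scratch, std::ios::binary);
        out << text;
    } // the stream is closed here, so the size on disk is final
    const std::uintmax_t size = fs::file_size(scratch);
    fs::remove(scratch);
    return size;
}

// Hypothetical helper: bytes available on the partition that holds p.
std::uintmax_t available_bytes(const fs::path& p)
{
    const fs::space_info info = fs::space(p);
    return info.available;
}
```

Because space reports on the whole partition, available_bytes should return the same figure for any two directories that live on the same partition.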
File system-manipulation functions
This set of functions is responsible for creating new files and directories, removing files, and so on:
- bool create_directory(const path&): This function creates a directory with the given path name. (Note that if the PATHNAME itself contains invalid characters, the result is platform defined. For example, on Windows the asterisk (*) and the question mark (?) cannot be in a directory name; UNIX file systems accept them, but shells treat them as wildcards, so they are best avoided.)
- bool create_directories(const path&): You can create a directory tree as opposed to a single directory using this API. For example, consider the directory tree /a/b/c, which must be created inside the /tmp folder. Calling this API accomplishes the task, while create_directory, with the same argument, will throw an exception.
- bool create_hard_link(const path& frompath, const path& topath): This function creates a hard link between frompath and topath.
- bool create_symlink(const path& frompath, const path& topath): This function creates a symbolic (soft) link between frompath and topath.
- void copy_file(const path& frompath, const path& topath): The contents and attributes of the file referred to by frompath are copied to the file referred to by topath. This routine expects the destination file to be absent; if the destination file is present, it throws an exception. It is therefore not equivalent to the cp command in UNIX. It is also expected that the frompath variable refers to a proper regular file. Consider this example: frompath refers to a symbolic link /tmp/file1, which in turn refers to a file /tmp/file2; topath is, say, /tmp/file3. In this situation, copy_file will fail. This is yet another difference between this API and the cp command.
- void rename(const path& frompath, const path& topath): This function is the API for renaming a file. It is also possible to simultaneously rename and change the location of the file by specifying the full path name in the topath argument, as Listing 8 shows.
Listing 8. The rename functionality in Boost
#include <stdio.h>
#include "boost/filesystem.hpp"

int main()
{
    boost::filesystem::path path("/home/user1/abc");
    boost::filesystem::rename(path, "/tmp/def");
    return 0;
}
// abc is renamed def and moved to /tmp folder
- bool remove(const path& p): This routine attempts to remove the file or directory referred to by the path p. In the case of a directory, if the directory is not empty, this routine throws an exception. A word of caution: This routine does not care what it's deleting, even if the same file is being simultaneously accessed by other programs!
- unsigned long remove_all(const path& p): This API attempts to remove a file or directory referred to by path p. Unlike remove, it makes no special consideration for directories that are not empty. This function is the Boost equivalent of the UNIX rm -rf command.
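The manipulation functions can be walked through end to end in a scratch directory. This is a sketch under stated assumptions, not a definitive recipe: it uses std::filesystem from C++17, where create_directories, copy_file, rename, remove, and remove_all carry the same names as in this section; the scratch location comes from temp_directory_path; and the function name manipulation_sketch is hypothetical.

```cpp
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem; // assumption: C++17 standard library

// Hypothetical walk-through: returns true if every step behaves as described.
bool manipulation_sketch()
{
    const fs::path root = fs::temp_directory_path() / "manipulation_sketch";
    fs::remove_all(root); // start from a clean slate

    // create_directories builds the whole tree a/b/c in one call.
    const fs::path tree = root / "a" / "b" / "c";
    if (!fs::create_directories(tree)) return false;

    // Create a small regular file to work with.
    const fs::path original = tree / "abc";
    { std::ofstream out(original); out << "hello"; }

    // copy_file expects the destination to be absent.
    const fs::path copy = tree / "copy";
    fs::copy_file(original, copy);

    // rename can move a file to another directory at the same time.
    const fs::path moved = root / "def";
    fs::rename(original, moved);
    const bool renamed = fs::exists(moved) && !fs::exists(original);

    // remove deletes a single file; remove_all deletes the rest of the tree.
    const bool removed = fs::remove(copy);
    fs::remove_all(root);
    return renamed && removed && !fs::exists(root);
}
```

Everything happens below one scratch directory, so the final remove_all call cleans up whatever the earlier steps created.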
Utility and predicate functions
The Boost Filesystem Library contains the following utility and predicate functions:
- bool exists(const path&): This function checks for the existence of a file. The file could be anything: a regular file, a directory, a symbolic link, and so on.
- bool is_directory(const path&): This function checks whether a path corresponds to a directory.
- bool is_regular(const path&): This function checks for a normal file (that is, not a directory, a symbolic link, a socket, or a device file).
- bool is_other(const path&): Typically, this function checks for device files such as /dev/tty0 or socket files.
- bool is_empty(const path&): If the path corresponds to a folder, this function checks whether the folder is empty and returns true accordingly. If the path corresponds to a file, this function checks whether the file size equals 0. In the case of hard or symbolic links to a file, this API checks whether the original file is empty.
- bool equivalent(const path1& p1, const path2& p2): This extremely useful API checks whether two path names, which may be written in relative or absolute style, refer to the same file or directory. Consider Listing 9:
Listing 9. Testing for the equivalence of two paths
#include <stdio.h>
#include "boost/filesystem.hpp"

int main()
{
    boost::filesystem::path path1("/usr/local/include"); // random pathname
    boost::filesystem::path path2("/tmp/../usr/local/include");
    bool result = boost::filesystem::equivalent(path1, path2);
    printf("Paths are equivalent : %d\n", result);
    return 0;
}
// result: 1
- path system_complete(const path&): This function is another API along the same lines as bool equivalent(const path1& p1, const path2& p2). Given an arbitrary file path relative to the current working directory, this API returns the absolute path name of the file. For example, if the user is in the directory /home/user1 and queries for a file ../user2/file2, this function returns an absolute path that identifies /home/user2/file2, the complete PATHNAME of the file file2.
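The predicates combine naturally into a small classifier, and equivalent can be exercised with two spellings of the same directory. The following is a minimal sketch with these assumptions: it uses std::filesystem from C++17, where the regular-file test is spelled is_regular_file rather than is_regular, and the helper names classify and detour_is_equivalent are hypothetical.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem; // assumption: C++17 standard library

// Hypothetical helper: names the kind of file system entry that p refers to.
std::string classify(const fs::path& p)
{
    if (!fs::exists(p))         return "missing";
    if (fs::is_directory(p))    return "directory";
    if (fs::is_regular_file(p)) return "regular file";
    return "other";
}

// Compares a directory with a second spelling that takes a detour through "..".
// Both paths must exist, or equivalent reports an error.
bool detour_is_equivalent(const fs::path& dir)
{
    const fs::path detour = dir / ".." / dir.filename();
    return fs::equivalent(dir, detour);
}
```

Passing the current working directory to detour_is_equivalent compares, for instance, /home/user1 with /home/user1/../user1, which name the same directory even though the strings differ.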
Miscellaneous functions
The Boost Filesystem Library consists of the following miscellaneous functions:
- std::string extension(const path&): This function returns the extension of a given file name, prefixed with a period (.). For example, for a file with the name test.cpp, extension returns .cpp. If a file does not have an extension, the function returns an empty string. In the case of hidden files (that is, files whose names begin with a . in UNIX systems), the function calculates the extension in the same way or returns an empty string (so, for .test.profile, this routine returns .profile).
- std::string basename(const path&): This is the complementary routine to extension: It returns the string before the last . in the file name. Note that even in cases where an absolute file name is provided, this API still returns only the string that directly forms part of the file name, as shown in Listing 10.
Listing 10. Using boost::basename
#include <stdio.h>
#include <string>
#include "boost/filesystem.hpp"

using namespace std;

int main()
{
    boost::filesystem::path path1("/tmp/dir1/test1.c");
    boost::filesystem::path path2("/tmp/dir1/.test1.profile");
    string result1 = boost::filesystem::basename(path1);
    string result2 = boost::filesystem::basename(path2);
    printf("Basename 1: %s Basename2 : %s\n", result1.c_str(), result2.c_str());
    return 0;
}
// result: Basename 1: test1 Basename2 : .test1
- std::string change_extension(const path& oldpath, const std::string new_extension): This API returns a new string that reflects the changed name. Note that the file corresponding to oldpath remains unchanged. This is just a convenience function. Also note that you must explicitly specify the dot in the extension. For example, change_extension("test.c", "so") results in testso as opposed to test.so.
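For comparison, the same three operations can be sketched with std::filesystem from C++17, which is an assumption of this example rather than the API described above. There, the member functions stem( ), extension( ), and replace_extension( ) cover the ground of basename, extension, and change_extension. One difference is worth noting: the standard replace_extension adds the leading dot itself when the new extension lacks one. The helper names base_of, extension_of, and with_extension are hypothetical.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem; // assumption: C++17 standard library

// stem() plays the role of basename: the file name without its extension.
std::string base_of(const fs::path& p)
{
    return p.stem().string();
}

// extension() includes the leading period, or is empty if there is none.
std::string extension_of(const fs::path& p)
{
    return p.extension().string();
}

// replace_extension works on a copy here, so the caller's path is untouched
// and no file on disk is renamed.
std::string with_extension(fs::path p, const std::string& new_extension)
{
    return p.replace_extension(new_extension).generic_string();
}
```

As with change_extension, only the path string changes; renaming the file on disk would still require a separate call to rename.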
Conclusion
This article provided a brief overview of the Boost Filesystem Library. It's not to be treated as a comprehensive document of the entire file system interface in Boost. The internals of the API set are not discussed, nor are the subtleties of these APIs on platforms other than UNIX and Windows, such as VMS. For more information about the file system, see Related topics.
Related topics
- Boost Filesystem Library documentation: Read the Boost Filesystem Library documentation (somewhat biased in favor of developers as opposed to users).
- Detailed Filesystem Library documentation: Find information on the PATHNAME check functions.
- IBM trial software: Build your next development project with software for download directly from developerWorks.