As a caretaker of cats, I've learned that the key to looking cool is to pretend that you make mistakes on purpose. Or, as many a teacher has put it, "Now, which of you spotted my deliberate mistake?"
Throughout the pseudo project, there have been a number of interesting false starts, strange bugs, and other learning experiences. Some of these are weird corner cases; some of them are things I can't believe ever worked. Mercifully, I do not recount the hundreds of occasions I ended up with code that wouldn't even compile. I have elevated the simple typo to an art form.
In the initial working release of pseudo, rename operations didn't actually work. This didn't produce any real-world consequences, because the pseudo daemon fixed up the database entries automatically on further access.
One of the limitations of the SQLite SQL engine is that it generally cannot use
indexes for LIKE comparisons. When renaming a
directory, obviously, you want to rename the files within the directory.
So, if you are renaming the directory /foo to
/bar, you want to replace the string
/foo with /bar in
every path that starts with /foo/.
If you do this using the SQL clause
(path LIKE ? || '/%'), though, it can't use the
index, and it's horribly slow. Browsing around, I found a delightfully
perverse workaround:
(path > (? || '/') AND path < (? || '0')).
Assuming an ASCII system, where 0 is the character immediately after /,
this is precisely equivalent to
path/ followed by anything whatsoever, but
because it's just relational operators, it uses the index. This produced a
factor of roughly twenty thousand speedup even on small file systems.
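The equivalence of the two clauses can be checked with a minimal sketch using Python's standard sqlite3 module. The table name files, the column name path, and the helper names are assumptions made for illustration, not pseudo's actual schema. Note that LIKE is case-insensitive for ASCII by default and treats % and _ in the directory name as wildcards, so the two forms agree only for names where neither matters.

```python
import sqlite3

LIKE_SQL = "SELECT path FROM files WHERE path LIKE ? || '/%' ORDER BY path"
RANGE_SQL = (
    "SELECT path FROM files "
    "WHERE path > (? || '/') AND path < (? || '0') ORDER BY path"
)

def children_like(conn, directory):
    """Paths under `directory`, selected with LIKE (the slow form)."""
    return [row[0] for row in conn.execute(LIKE_SQL, (directory,))]

def children_range(conn, directory):
    """The same selection using only relational operators."""
    return [row[0] for row in conn.execute(RANGE_SQL, (directory, directory))]

def query_plan(conn, sql, params):
    """Return SQLite's own description of how it would run `sql`."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

def demo_db():
    """Build a small in-memory table (hypothetical schema) with tricky names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY)")
    names = ["/foo", "/foo/a", "/foo/b/c", "/foo0", "/foobar", "/fop", "/bar/x"]
    conn.executemany("INSERT INTO files VALUES (?)", [(n,) for n in names])
    return conn
```

Calling query_plan(conn, RANGE_SQL, ("/foo", "/foo")) and query_plan(conn, LIKE_SQL, ("/foo",)) on your own SQLite build shows whether each form is planned as an index search or a full scan.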
However, while converting to this, I made a tiny little mistake: I changed
the order of the parameter bindings. The net result was that,
if you renamed /foo to
/bar, I ended up replacing
/bar with /foo in
all paths starting with /foo/. Which did
nothing, but at least it did nothing quickly.
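One way to make this kind of slip harder is to bind parameters by name rather than by position. The following is a minimal sketch with Python's sqlite3 module, not pseudo's actual code; the schema and function names are assumptions, and replace() here simply mirrors the "replace the string" wording above (it rewrites every occurrence of the old name in a path).

```python
import sqlite3

# Named placeholders: each value is bound by name, so there is no
# positional order to get wrong.
RENAME_NAMED = (
    "UPDATE files SET path = replace(path, :old, :new) "
    "WHERE path > (:old || '/') AND path < (:old || '0')"
)

# Positional form of the same statement, for comparison.
RENAME_POSITIONAL = (
    "UPDATE files SET path = replace(path, ?, ?) "
    "WHERE path > (? || '/') AND path < (? || '0')"
)

def rename_named(conn, old, new):
    """Rename everything under `old` to live under `new`."""
    conn.execute(RENAME_NAMED, {"old": old, "new": new})

def rename_swapped(conn, old, new):
    """Reproduce the mistake: the first two positional bindings are swapped."""
    conn.execute(RENAME_POSITIONAL, (new, old, old, old))

def all_paths(conn):
    return [row[0] for row in conn.execute("SELECT path FROM files ORDER BY path")]

def make_db():
    """Small in-memory table using a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO files VALUES (?)", [("/foo/a",), ("/foo/b",), ("/other",)]
    )
    return conn
```

With the swapped bindings the WHERE clause still selects the right rows, but replace() looks for a string that is not there, so every path comes back unchanged.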
Because of pseudo's paranoia and sanity checks, this never actually caused bogus results, just a bunch of warnings in log files.
An early assumption about pseudo was that there would be no serialization problems because all operations were serialized at the server, and you couldn't get two consecutive operations from a given client out of order. This isn't quite at the level of "640K should be enough for anybody," but it certainly was a serious mistake.
In the original design, underlying operations were attempted, then reported to the server if they succeeded. For a single program, this always worked. However, with multiple programs, there was a possible race condition.
Process A creates a temporary file, with inode number 12345. Process A then removes this temporary file. After it is removed, Process B creates a new file, which reuses inode number 12345. However, as it happens, the pseudo daemon sees the creation message from Process B before it sees the unlink message from Process A. What happens?
On receiving the creation message from B, the pseudo daemon notices that there's an old entry in the database (A's temporary file) with the same inode number; it logs the discrepancy and removes the entry. It then creates the new database entry. However, it gets worse. When the deletion message comes in, the daemon notices that there's an old entry in the database (B's file) with the same inode number. It removes the spurious entry, then goes ahead and tries to remove A's temporary file from the database too. At the end of this, B's file is no longer recorded in the database.
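The sequence above can be replayed with a toy model. This is only a sketch of the behavior as described, assuming a table keyed by path; the class, method names, and paths are hypothetical and are not pseudo's real data structures.

```python
class NaiveDaemon:
    """Toy model of the original design: clients report after the fact."""

    def __init__(self):
        self.by_path = {}  # path -> inode
        self.log = []

    def _drop_inode_clashes(self, path, inode):
        """Remove entries that claim `inode` under a different path."""
        for other, other_inode in list(self.by_path.items()):
            if other_inode == inode and other != path:
                self.log.append(f"inode {inode} also recorded as {other}")
                del self.by_path[other]

    def created(self, path, inode):
        self._drop_inode_clashes(path, inode)
        self.by_path[path] = inode

    def unlinked(self, path, inode):
        self._drop_inode_clashes(path, inode)
        self.by_path.pop(path, None)


def replay(messages):
    """Feed (operation, path, inode) messages to a fresh daemon in order."""
    daemon = NaiveDaemon()
    for operation, path, inode in messages:
        getattr(daemon, operation)(path, inode)
    return daemon


# The order in which things really happened on disk.
IN_ORDER = [
    ("created", "/tmp/a.tmp", 12345),
    ("unlinked", "/tmp/a.tmp", 12345),
    ("created", "/out/b", 12345),
]

# The order in which the daemon happened to see the messages.
OUT_OF_ORDER = [
    ("created", "/tmp/a.tmp", 12345),
    ("created", "/out/b", 12345),
    ("unlinked", "/tmp/a.tmp", 12345),
]
```

Replaying IN_ORDER leaves B's file recorded; replaying OUT_OF_ORDER leaves the table empty, which is the lost entry described above.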
My first attempt to fix this was a dismal failure; I modified the
UNLINK operation to return the previous
database entry for the file, and had the client send an
UNLINK message, then relink a file if the
underlying system call failed. This did eliminate the race condition.
However, it created an even worse failure mode:
rmdir(2) on a directory with files in it
deleted the database entries for all of the files (as removing a directory
implies removing all of its contents).
Adding a "deleting" flag to files, and adding
MAY_UNLINK,
DID_UNLINK, and
CANCEL_UNLINK messages finally fixed this.
These messages allow the database to record that a file is believed to be
about to be deleted, so creation messages for it don't generate errors.
Then, a DID_UNLINK message deletes a file only
if the file has the deleting flag set. Thus, I think this one's finally
dead.
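A minimal sketch of how the three messages could interact, assuming MAY_UNLINK is processed before the underlying unlink happens. Only the message names and the "deleting" flag come from the text; the class, the path-keyed table, and the client helper are hypothetical.

```python
class Daemon:
    """Toy model of the file table with a per-entry "deleting" flag."""

    def __init__(self):
        self.files = {}  # path -> {"inode": int, "deleting": bool}
        self.warnings = []

    def create(self, path, inode):
        for other, entry in list(self.files.items()):
            if other != path and entry["inode"] == inode:
                # A clash with an entry already marked as deleting is
                # expected, so only unmarked clashes produce a warning.
                if not entry["deleting"]:
                    self.warnings.append(f"unexpected stale entry {other}")
                del self.files[other]
        self.files[path] = {"inode": inode, "deleting": False}

    def may_unlink(self, path):
        if path in self.files:
            self.files[path]["deleting"] = True

    def cancel_unlink(self, path):
        if path in self.files:
            self.files[path]["deleting"] = False

    def did_unlink(self, path):
        entry = self.files.get(path)
        # Only entries still marked as deleting are removed.
        if entry is not None and entry["deleting"]:
            del self.files[path]


def client_unlink(daemon, path, real_unlink):
    """Client side: announce, attempt the real operation, then confirm or cancel."""
    daemon.may_unlink(path)
    try:
        real_unlink(path)
    except OSError:
        daemon.cancel_unlink(path)
        raise
    daemon.did_unlink(path)
```

In the racy ordering (MAY_UNLINK for A's file, then B's creation with the reused inode, then A's late DID_UNLINK), the late message finds nothing marked as deleting under A's path and leaves B's entry alone.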
Three, three, three bugs in one!
We experienced a mysterious problem when renaming a directory caused files in that directory to be forgotten. This problem was the result of three distinct bugs; fixing any of them corrected the problematic behavior.
When renaming a directory, pseudo checks whether the directory is already known in the pseudo database, and if it isn't, creates a directory of that name so that the rename operation can occur normally (and thus rename any files that were contained in that directory, and were known to pseudo already). This could happen if, for example, you created a directory outside of the pseudo environment, then created files inside that directory while running in the pseudo environment.
The problem came from a combination of three choices. The first was that, when linking a file, pseudo helpfully unlinked any existing file of the same name. The second was that, when unlinking a directory, pseudo helpfully unlinked the contents of that directory. Combining these with the implicit link from a rename meant that, when renaming a directory not previously recorded in the database, pseudo would lose all the entries for files in that directory which had been recorded in the database.
This alone wouldn't have come up in our build system. What triggered it was
the third choice: the completely inexplicable decision on my part to try to
improve handling for the case where rename(2) renames a file
across file systems. In fact, this can't happen (the call simply fails with
EXDEV), yet for some reason, not
only did I try to implement support, but I did it very badly, such that
the rename wrapper ended up always trying to link the old name in the
database before renaming. The net result was that when you moved a directory
that contained files, the files were always removed from
the database.
We fixed these major errors. The implicit unlink done by a
LINK operation now removes only the named file,
not any files that look like they're contained in it. Rename operations no
longer spuriously try to create links. The net result is that renaming a
directory doesn't blow things up anymore.
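The interaction between the implicit link and the recursive unlink can be sketched with a toy database that is just a set of paths. Everything here (function names, the recursive_unlink switch) is hypothetical and only models the behavior described above.

```python
def unlink_entry(db, path, recursive):
    """Forget `path`; in the old behavior, also forget everything beneath it."""
    db.discard(path)
    if recursive:
        for child in [p for p in db if p.startswith(path + "/")]:
            db.discard(child)

def link_entry(db, path, recursive_unlink):
    """Record `path`, first dropping whatever was recorded under that name."""
    unlink_entry(db, path, recursive_unlink)
    db.add(path)

def rename_dir(db, old, new, recursive_unlink):
    """Rename a directory, creating its entry first if it was unknown."""
    if old not in db:
        link_entry(db, old, recursive_unlink)
    for path in sorted(db):
        if path == old or path.startswith(old + "/"):
            db.discard(path)
            db.add(new + path[len(old):])
    return db

def known_files():
    """A directory made outside pseudo, holding files made inside it."""
    return {"/d/a", "/d/b"}
```

With recursive_unlink=True (the old behavior) the implicit link wipes the children before the rename ever sees them; with recursive_unlink=False they survive and are renamed.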
A five-dimensional vertex case
You've heard of edge cases, and corner cases. This is, in all the time I've been doing software, the only five-dimensional vertex case I've ever seen.
When the "deleting" flag was added, this meant a change in the data structure used by pseudo for IPC. Because I never versioned that structure, it is theoretically possible for a client and server to disagree on the version of the IPC message that they're using. However, that never happens; our build system ensures that you always rebuild the components at the same time.
Yet, we had a very odd problem where a single program would sometimes fail at a specific point in the build. By "fail" I mean "hang indefinitely waiting for a response from the daemon". Meanwhile, the daemon was waiting for input from the socket.
A little more detail about the pseudo protocol is perhaps in order. When a
client starts up, the first thing it does is send a
PSEUDO_MSG_PING message to the server. The
information in that message includes the client's PID, the name of the
client binary, and an optional "tag" message for use in logging events
from that client. If there's no tag message, it's simply omitted. (The
name and tag are sent as the "path", with their length indicated in the
pathlen field.)
The hang was occurring during the ping. It happened only on one developer's machine, and only temporarily. However, we did eventually track it down.
The change we'd made increased the length of the pseudo message structure by four bytes. The server is smart about reads past the base structure size, but for the initial part, it just assumes it will always get a full read. (I haven't fixed this bug yet.)
If you were somehow able to arrange to run the new pseudo daemon, with its four-byte-longer structure, against an old pseudo client, the daemon wouldn't get as much data as it expected. However, the client is also sending the path name and tag, which normally more than make up the difference. Therefore, the failure in question could only happen if the executable running had a name under four characters (it was sed), and had no tag set. Even then, how do you get the old pseudo client and the new pseudo daemon?
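The arithmetic can be demonstrated with a socket pair, using a timeout as a stand-in for "hang indefinitely". This is a sketch only: the two header layouts are hypothetical (just the four-byte size difference comes from the text), and it assumes the name and tag are sent without any terminator.

```python
import socket
import struct

# Hypothetical layouts; only the four-byte difference is from the text.
OLD_HEADER = struct.Struct("<7I")  # 28 bytes
NEW_HEADER = struct.Struct("<8I")  # 32 bytes

def read_exact(sock, count):
    """Read exactly `count` bytes; raise socket.timeout if they never arrive."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise EOFError("peer closed the connection")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)

def ping_hangs(program_name, tag=b"", timeout=0.2):
    """True if a new daemon would stall reading an old client's ping."""
    client, server = socket.socketpair()
    try:
        server.settimeout(timeout)
        payload = program_name + tag
        # Old client: short header followed by name and tag.
        header = OLD_HEADER.pack(0, 0, 0, 0, 0, 0, len(payload))
        client.sendall(header + payload)
        try:
            # New daemon: insists on a full new-size header first.
            read_exact(server, NEW_HEADER.size)
        except socket.timeout:
            return True
        return False
    finally:
        client.close()
        server.close()
```

In this model a three-character name with no tag leaves the daemon one byte short of a full header, while a longer name or any tag lets the read complete (and the daemon then misparses the message instead of hanging).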
In our build system, you can have prebuilt host tools, which are mirrored into the build directory as a tree of symlinks (using lndir), and then any tools that need to be rebuilt to get new versions are rebuilt. The developer in question had old host tools, including the pseudo daemon and client library, which were mirrored in, and then built new tools, including a new daemon and new client library, in the project directory.
Since we set LD_LIBRARY_PATH to point to the
project directory, we consistently picked up the new libraries, and all
was well. Yet, there was one tiny flaw. You can set a linker search path
in an executable, and there are two ways to do it. The modern and friendly
RUNPATH setting is used the way you'd expect it
to be. However, the older and less friendly
RPATH setting has the unusual trait that it is
processed before
LD_LIBRARY_PATH. The binary in question had
been built with an RPATH set to
$ORIGIN/../lib:$ORIGIN/../lib64. The
$ORIGIN magic cookie expands to the directory
containing the binary.
Remember how I said the tools were mirrored with symlinks? The processing
of the $ORIGIN cookie follows symlinks. Thus,
when running this particular executable, the dynamic linker ended up
looking, not in LD_LIBRARY_PATH, but in the
library directory for the prebuilts, causing it to get the old pseudo
client library. Because the executable name was under four characters,
this resulted in a hang rather than a crash or a diagnostic.
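The effect of resolving symlinks before expanding $ORIGIN can be mimicked in a few lines. This sketch does not invoke the dynamic linker; it only reproduces the path arithmetic described above, and the directory layout and function names are invented for the demonstration.

```python
import os
import tempfile

def expand_rpath(binary, rpath):
    """Expand $ORIGIN in a colon-separated path, resolving symlinks first."""
    origin = os.path.dirname(os.path.realpath(binary))
    return [
        os.path.normpath(entry.replace("$ORIGIN", origin))
        for entry in rpath.split(":")
    ]

def build_demo_tree(root):
    """Create prebuilt/bin/sed plus a project/bin/sed symlink pointing at it."""
    prebuilt_bin = os.path.join(root, "prebuilt", "bin")
    project_bin = os.path.join(root, "project", "bin")
    os.makedirs(prebuilt_bin)
    os.makedirs(project_bin)
    real_binary = os.path.join(prebuilt_bin, "sed")
    with open(real_binary, "w"):
        pass  # an empty file is enough for path resolution
    mirrored = os.path.join(project_bin, "sed")
    os.symlink(real_binary, mirrored)
    return mirrored
```

Expanding $ORIGIN/../lib for the mirrored project/bin/sed yields prebuilt/lib, not project/lib, which is how the old client library ended up being loaded.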
To reproduce this bug, you had to have:
- A prebuilt version of pseudo which was at least a week old
- A source tree that would rebuild the newer version
- An executable in the prebuilt tree which didn't need to be rebuilt
- ... with a name no more than three characters long
- ... which specified a library search path containing
$ORIGIN, using RPATH
Tracking this down took some time. The long-term fix involves adding
versioning to the messages (ideally using some indicator that can never
occur in current messages), and a number of other improvements. It also
involves ceasing to use RPATH to indicate link
paths, and possibly copying binaries in rather than symlinking them.
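One possible shape for the versioning fix is a fixed prefix carrying a marker and a version number. The sketch below is an assumption-laden illustration, not pseudo's protocol: the marker value, the prefix layout, and the idea that the first word of an existing message is always a small number are all hypothetical.

```python
import struct

MAGIC = 0x50534555            # hypothetical marker value
PROTOCOL_VERSION = 2          # hypothetical version number
PREFIX = struct.Struct("<II")  # marker, version

class VersionMismatch(Exception):
    """Raised when a peer speaks an unknown or unversioned protocol."""

def pack_message(body):
    """Prepend the marker and version to an already-encoded message body."""
    return PREFIX.pack(MAGIC, PROTOCOL_VERSION) + body

def unpack_message(data):
    """Check the prefix and return the body, or raise VersionMismatch."""
    if len(data) < PREFIX.size:
        raise VersionMismatch("message shorter than the version prefix")
    magic, version = PREFIX.unpack_from(data)
    if magic != MAGIC:
        raise VersionMismatch("no version marker; peer may predate versioning")
    if version != PROTOCOL_VERSION:
        raise VersionMismatch(f"peer uses version {version}")
    return data[PREFIX.size:]
```

The point of the marker is that a mismatch produces a diagnostic immediately, rather than a read that waits for bytes that will never come.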
On some recent Linux machines, files copied with plain old
/bin/cp were ending up with incorrect
permission bits. It turns out that the
getxattr()/setxattr()
family of functions can be used to query or set POSIX modes, not just
extended attributes. On one particular system, this is done
instead of using plain
chmod(). Conveniently, the spec requires
falling back to chmod() if the
*xattr() functions fail, so for now, pseudo
intercepts them and fails, setting errno to
ENOTSUP. This may need to be fixed later.
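The fallback that this workaround relies on looks roughly like the following from the caller's side. This is a sketch under stated assumptions: the attribute name and value encoding are placeholders, and intercepted_setxattr merely stands in for a wrapped call that always refuses.

```python
import errno
import os
import stat
import tempfile

def intercepted_setxattr(path, name, value):
    """Stand-in for the wrapped call: always refuses with ENOTSUP."""
    raise OSError(errno.ENOTSUP, os.strerror(errno.ENOTSUP), path)

def set_mode(path, mode, setxattr=intercepted_setxattr):
    """Try the xattr route first; fall back to chmod() when it is unsupported."""
    try:
        # Placeholder attribute name and encoding, for illustration only.
        setxattr(path, "example.mode", mode.to_bytes(2, "little"))
        return "xattr"
    except OSError as exc:
        if exc.errno != errno.ENOTSUP:
            raise
    os.chmod(path, mode)
    return "chmod"

def current_mode(path):
    """Return just the permission bits of `path`."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

Because the refusal is reported as ENOTSUP, the caller takes the chmod() path, which pseudo already knows how to intercept.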
Similarly, during a major refactoring phase, many of pseudo's wrappers were
re-implemented as trivial functions which just called other functions; for
instance, using open() with
O_CREAT to implement
creat(). In particular, many functions which
had *at() variants were implemented by calling
the corresponding *at() function with
AT_FDCWD as the
dirfd parameter. This worked beautifully until
we tried it on a machine that didn't provide
openat().
It is likely that as time goes on, we'll have to develop more complete handling for systems that offer a different range of API support.
Lessons learned and future directions
Many of the problems we've encountered during the initial development and ongoing maintenance of pseudo have been relatively easy to track down and diagnose. The decision to focus on robustness and good logging early on has definitely paid off. On the other hand, the test suite we keep planning to write "soon" has been a major and noticeable gap; building more testing support earlier and using it more could have saved a lot of time.
While it's always a good thing to use existing code and projects when they do match what you're doing, don't be afraid to conclude that the problem you're solving really is a new problem. It happens. Not very often (I think this is the first time it's happened to me), but it does happen, and when it does, be ready for it.
For future work, we still have more robustness and diagnostic improvements to do, but the next major field of inquiry may well be performance; for all that pseudo does what it's supposed to do fairly well, it is undeniably quite a bit slower than fakeroot, and we can probably improve it a fair bit. It'll never be as fast to store things in a stable database format on disk as to keep them only in memory, but there's still plenty of room to speed things up.
Learn
- The pseudo project was
developed entirely to meet internal needs, but was released as open
source.