All about pseudo, Part 3: Lessons learned

I meant to do that

Peter Seebach (dw-nospam@seebs.net), member of technical staff, Wind River Systems
Peter Seebach has been messing about with source code since before it was fashionable. He has worked on everything from language standardization to mouse drivers.

Summary:  In this article, third in a series, Peter Seebach looks at a few of the many mistakes he made while developing pseudo. This is not only educational, but a great chance to make fun of someone's mistakes without feeling guilty.

Date:  24 May 2011
Level:  Intermediate

As a caretaker of cats, I've learned that the key to looking cool is to pretend that you make mistakes on purpose. Or, as many a teacher has put it, "Now, which of you spotted my deliberate mistake?"

Throughout the pseudo project, there have been a number of interesting false starts, strange bugs, and other learning experiences. Some of these are weird corner cases; some of them are things I can't believe ever worked. Mercifully, I do not recount the hundreds of occasions I ended up with code that wouldn't even compile. I have elevated the simple typo to an art form.

Let's not and say we did

In the initial working release of pseudo, rename operations didn't actually work. This didn't produce any real-world consequences, because the pseudo daemon fixed up the database entries automatically on further access.

The first lesson learned

The guy who says "I think it would be worth it to replace this piece of infrastructure with a new one" gets to do it, even if his schedule is already full. This isn't necessarily a bad thing, but keep in mind that, for many people, proposing a course of action and volunteering to do it are inextricably linked. If you are advocating change, be prepared to implement it. If you want to implement something, be prepared to support it.

It's been a great experience. I didn't know what I was signing up for at the time. If I had, I might not have suggested it. Yet, I'm glad I did.

One of the limitations of the SQLite SQL engine is that it cannot use indexes for LIKE comparisons. When renaming a directory, obviously, you want to rename the files within the directory. So, if you are renaming the directory /foo to /bar, you want to replace the string /foo with /bar in every path that starts with /foo/.

If you do this using the SQL clause (path LIKE ? || '/%'), though, it can't use the index, and it's horribly slow. Browsing around, I found a delightfully perverse workaround: (path > (? || '/') AND path < (? || '0')).

Assuming an ASCII system, where '0' is the character immediately after '/', this matches precisely the paths that consist of path/ followed by anything whatsoever, but because it uses only relational operators, it can use the index. This produced a factor of roughly twenty thousand speedup even on small file systems.
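The equivalence is easy to check on a toy table. The following is a minimal sketch using Python's sqlite3 module; the files table and the function names are assumptions made for illustration and are not pseudo's actual schema or code.

```python
import sqlite3


def children_like(conn, directory):
    """Prefix match written with LIKE, the slow form described above."""
    rows = conn.execute(
        "SELECT path FROM files WHERE path LIKE ? || '/%' ORDER BY path",
        (directory,),
    )
    return [row[0] for row in rows]


def children_range(conn, directory):
    """The same prefix match written with relational operators only."""
    # In ASCII, '0' comes right after '/', so every string that starts with
    # directory + '/' sorts after directory + '/' and before directory + '0'.
    rows = conn.execute(
        "SELECT path FROM files "
        "WHERE path > (? || '/') AND path < (? || '0') ORDER BY path",
        (directory, directory),
    )
    return [row[0] for row in rows]


def demo_database():
    """Build a small in-memory table with a few look-alike paths."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO files VALUES (?)",
        [("/foo",), ("/foo/a",), ("/foo/b/c",), ("/foo0",), ("/foobar",), ("/bar/x",)],
    )
    return conn
```

With this data, both functions return only /foo/a and /foo/b/c for the directory /foo; neighbors such as /foo0 and /foobar fall outside the range. Note that LIKE also treats % and _ in the directory name as wildcards and ignores ASCII case by default, which the range form does not.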

However, while converting to this, I made a tiny little mistake. I changed the order of the parameter bindings so that, if you renamed /foo to /bar, I ended up replacing /bar with /foo in all paths starting with /foo/. Which did nothing, but at least it did nothing quickly.

Because of pseudo's paranoia and sanity checks, this never actually caused bogus results, just a bunch of warnings in log files.
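To make the binding mix-up concrete, here is a small sketch, again with Python's sqlite3 module and a hypothetical files table. The statement text and function names are assumptions for illustration, not pseudo's code. Named parameters are one way to make this kind of swap harder to commit.

```python
import sqlite3

# Named parameters: each value is bound by name, so the order cannot drift.
# Note that replace() rewrites every occurrence of :old in the path, which is
# good enough for this illustration.
RENAME_SQL = (
    "UPDATE files SET path = replace(path, :old, :new) "
    "WHERE path > (:old || '/') AND path < (:old || '0')"
)


def rename_children(conn, old, new):
    """Rewrite the paths of everything below the directory `old`."""
    conn.execute(RENAME_SQL, {"old": old, "new": new})


def rename_children_swapped(conn, old, new):
    """The mistake described above, reproduced with positional bindings."""
    conn.execute(
        "UPDATE files SET path = replace(path, ?, ?) "
        "WHERE path > (? || '/') AND path < (? || '0')",
        (new, old, old, old),  # the first two values are in the wrong order
    )


def all_paths(conn):
    """Return every recorded path, sorted."""
    return [row[0] for row in conn.execute("SELECT path FROM files ORDER BY path")]
```

The swapped version selects the right rows and then looks for /bar inside paths that only contain /foo, so nothing changes.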


Serialization is not enough

An early assumption about pseudo was that there would be no serialization problems because all operations were serialized at the server, and you couldn't get two consecutive operations from a given client out of order. This isn't quite at the level of "640K should be enough for anybody," but it certainly was a serious mistake.

In the original design, underlying operations were attempted, then reported to the server if they succeeded. For a single program, this always worked. However, with multiple programs, there was a possible race condition.

Process A creates a temporary file, with inode number 12345. Process A then removes this temporary file. After it is removed, Process B creates a new file, which reuses inode number 12345. However, as it happens, the pseudo daemon sees the creation message from Process B before it sees the unlink message from Process A. What happens?

On receiving the creation message from B, the pseudo daemon notices that there's an old entry in the database (A's temporary file) with the same inode number; it logs the discrepancy and removes the entry. It then creates the new database entry. However, it gets worse. When the deletion message comes in, the daemon notices that there's an old entry in the database (B's file) with the same inode number. It removes the spurious entry, then goes ahead and tries to remove A's temporary file from the database too. At the end of this, B's file is no longer recorded in the database.
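The sequence is easier to follow when replayed against a toy model. The sketch below is an assumption-laden stand-in for the daemon (a dictionary from inode number to path, with hypothetical method names); it only mirrors the behavior described in the paragraph above.

```python
class ToyDaemon:
    """Toy stand-in for the daemon's database: inode number -> path."""

    def __init__(self):
        self.by_inode = {}
        self.log = []

    def creat(self, path, inode):
        old = self.by_inode.get(inode)
        if old is not None and old != path:
            # Stale entry with the same inode: log it, then overwrite it.
            self.log.append(f"creat {path}: inode {inode} was {old}, dropping it")
        self.by_inode[inode] = path

    def unlink(self, path, inode):
        old = self.by_inode.get(inode)
        if old is not None and old != path:
            # Looks like a stale entry, so it is removed as well.
            self.log.append(f"unlink {path}: inode {inode} was {old}, dropping it")
        # Whatever was recorded under this inode is gone afterwards.
        self.by_inode.pop(inode, None)

    def knows(self, path):
        return path in self.by_inode.values()


def replay(messages):
    """Feed (operation, path, inode) messages to a fresh toy daemon."""
    daemon = ToyDaemon()
    for operation, path, inode in messages:
        getattr(daemon, operation)(path, inode)
    return daemon


# The order in which things really happened.
IN_ORDER = [("creat", "/tmp/a", 12345), ("unlink", "/tmp/a", 12345), ("creat", "/tmp/b", 12345)]
# The order in which the daemon happened to see them.
RACED = [("creat", "/tmp/a", 12345), ("creat", "/tmp/b", 12345), ("unlink", "/tmp/a", 12345)]
```

Replaying IN_ORDER leaves /tmp/b recorded; replaying RACED logs two discrepancies and leaves the database without B's file.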

My first attempt to fix this was a dismal failure; I modified the UNLINK operation to return the previous database entry for the file, and had the client send an UNLINK message, then relink a file if the underlying system call failed. This did eliminate the race condition. However, it created an even worse failure mode: rmdir(2) on a directory with files in it deleted the database entries for all of the files (as removing a directory implies removing all of its contents).

Adding a "deleting" flag to files, and adding MAY_UNLINK, DID_UNLINK, and CANCEL_UNLINK messages, finally fixed this. These messages allow the database to record that a file is believed to be about to be deleted, so creation messages for it don't generate errors. Then, a DID_UNLINK message deletes a file only if the file has the deleting flag set. Thus, I think this one's finally dead.
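A rough sketch of how the flag changes the outcome follows. The message names come from the text, but the data structure, the method names, and the assumption that a client sends MAY_UNLINK before the real unlink and DID_UNLINK or CANCEL_UNLINK afterwards are illustrative guesses, not pseudo's implementation.

```python
class ToyDaemon:
    """Toy database: inode number -> {"path": ..., "deleting": ...}."""

    def __init__(self):
        self.entries = {}
        self.log = []

    def creat(self, path, inode):
        old = self.entries.get(inode)
        if old is not None and old["path"] != path and not old["deleting"]:
            # Only an unexpected leftover entry counts as a discrepancy.
            self.log.append(f"creat {path}: inode {inode} was {old['path']}")
        self.entries[inode] = {"path": path, "deleting": False}

    def may_unlink(self, path, inode):
        entry = self.entries.get(inode)
        if entry is not None and entry["path"] == path:
            entry["deleting"] = True

    def cancel_unlink(self, path, inode):
        entry = self.entries.get(inode)
        if entry is not None and entry["path"] == path:
            entry["deleting"] = False

    def did_unlink(self, path, inode):
        entry = self.entries.get(inode)
        # Delete only the entry that was marked; a newer file that reused
        # the inode has a different path and no deleting flag.
        if entry is not None and entry["path"] == path and entry["deleting"]:
            del self.entries[inode]

    def knows(self, path):
        return any(entry["path"] == path for entry in self.entries.values())
```

If B's creation message arrives between A's MAY_UNLINK and DID_UNLINK, it replaces the marked entry without complaint, and the late DID_UNLINK for A's file finds nothing marked under that name to delete.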


Three, three, three bugs in one!

We experienced a mysterious problem in which renaming a directory caused files in that directory to be forgotten. This problem was the result of three distinct bugs; fixing any one of them corrected the problematic behavior.

When renaming a directory, pseudo checks whether the directory is already known in the pseudo database, and if it isn't, creates a directory of that name so that the rename operation can occur normally (and thus rename any files that were contained in that directory, and were known to pseudo already). This could happen if, for example, you created a directory outside of the pseudo environment, then created files inside that directory while running in the pseudo environment.

The problem came from a combination of three choices. The first was that, when linking a file, pseudo helpfully unlinked any existing file of the same name. The second was that, when unlinking a directory, pseudo helpfully unlinked the contents of that directory. Combining these with the implicit link from a rename meant that, when renaming a directory not previously recorded in the database, pseudo would lose all the entries for files in that directory which had been recorded in the database.

This alone wouldn't have come up in our build system. What triggered it was the third choice: the completely inexplicable decision on my part to try to improve handling for the case where rename(2) renames a file across file systems. In fact, this can't happen (the call simply fails with EXDEV), yet for some reason, not only did I try to implement support, but I did it very badly, such that the rename wrapper ended up always trying to link the old name in the database before renaming. The net result was that when you moved a directory that contained files, the files were always removed from the database.

We fixed these major errors. The implicit unlink done by a LINK operation now removes only the named file, not any files that look like they're contained in it. Rename operations no longer spuriously try to create links. The net result is that renaming a directory doesn't blow things up anymore.


A five-dimensional vertex case

You've heard of edge cases, and corner cases. This is, in all the time I've been doing software, the only five-dimensional vertex case I've ever seen.

When the "deleting" flag was added, this meant a change in the data structure used by pseudo for IPC. Because I never versioned that structure, it is theoretically possible for a client and server to disagree on the version of the IPC message that they're using. However, that never happens; our build system ensures that you always rebuild the components at the same time.

Yet, we had a very odd problem where a single program would sometimes fail at a specific point in the build. By "fail" I mean "hang indefinitely waiting for a response from the daemon". Meanwhile, the daemon was waiting for input from the socket.

Setting the stage

A little more detail about the pseudo protocol is perhaps in order. When a client starts up, the first thing it does is send a PSEUDO_MSG_PING message to the server. The information in that message includes the client's PID, the name of the client binary, and an optional "tag" message for use in logging events from that client. If there's no tag message, it's simply omitted. (The name and tag are sent as the "path", with their length indicated in the pathlen field.)

The hang was occurring during the ping. It happened only on one developer's machine, and only temporarily. However, we did eventually track it down.

The change we'd made increased the length of the pseudo message structure by four bytes. The server is smart about reads past the base structure size, but for the initial part, it just assumes it will always get a full read. (I haven't fixed this bug yet.)

If you were somehow able to arrange to run the new four-byte-longer structure pseudo daemon with an old pseudo client, it wouldn't get as much data as it expected. The client is also sending the path name and tag. Therefore, the failure in question could only happen if the executable running had a name under four characters (it was sed), and had no tag set. Even then, how do you get the old pseudo client and the new pseudo daemon?

The mystery revealed

In our build system, you can have prebuilt host tools, which are mirrored into the build directory as a tree of symlinks (using lndir), and then any tools that need to be rebuilt to get new versions are rebuilt. The developer in question had old host tools, including the pseudo daemon and client library, which were mirrored in, and then built new tools, including a new daemon and new client library, in the project directory.

Since we set LD_LIBRARY_PATH to point to the project directory, we consistently picked up the new libraries, and all was well. Yet, there was one tiny flaw. You can set a linker search path in an executable, and there are two ways to do it. The modern and friendly RUNPATH setting is used the way you'd expect it to be. However, the older and less friendly RPATH setting has the unusual trait that it is processed before LD_LIBRARY_PATH. The binary in question had been built with an RPATH set to $ORIGIN/../lib:$ORIGIN/../lib64. The $ORIGIN magic cookie expands to the directory containing the binary.

Remember how I said the tools were mirrored with symlinks? The processing of the $ORIGIN cookie follows symlinks. Thus, when running this particular executable, the dynamic linker ended up looking, not in LD_LIBRARY_PATH, but in the library directory for the prebuilts, causing it to get the old pseudo client library. Because the executable name was under four characters, this resulted in a hang rather than a crash or a diagnostic.
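The effect of resolving the symlink first can be mimicked in a few lines. This is only a model of the lookup described above, not the dynamic linker's code; the directory names and the helper function are made up for the demonstration.

```python
import os
import tempfile


def origin_lib_dir(executable):
    """Directory that an RPATH entry of $ORIGIN/../lib would name, assuming
    $ORIGIN is taken from the fully resolved path of the executable."""
    origin = os.path.dirname(os.path.realpath(executable))
    return os.path.normpath(os.path.join(origin, "..", "lib"))


def demo():
    """Mirror a prebuilt tool into a project tree with a symlink, then ask
    where $ORIGIN/../lib points when the tool is run through the symlink."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.realpath(tmp)
        prebuilt_bin = os.path.join(base, "prebuilt", "bin")
        project_bin = os.path.join(base, "project", "bin")
        os.makedirs(prebuilt_bin)
        os.makedirs(project_bin)
        real_tool = os.path.join(prebuilt_bin, "sed")
        open(real_tool, "w").close()
        mirrored_tool = os.path.join(project_bin, "sed")
        os.symlink(real_tool, mirrored_tool)
        return os.path.relpath(origin_lib_dir(mirrored_tool), base)
```

Even though the tool is invoked from the project tree, the computed library directory lies under the prebuilt tree, which is how the old client library ends up being loaded.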

To reproduce this bug, you had to have:

  • A prebuilt version of pseudo which was at least a week old
  • A source tree that would rebuild the newer version
  • An executable in the prebuilt tree which didn't need to be rebuilt
  • ... with a name no more than three characters long
  • ... which specified a library search path using $ORIGIN, using RPATH

Tracking this down took some time. The long-term fix involves adding versioning to the messages (ideally using some indicator that can never occur in current messages), and a number of other improvements. It also involves ceasing to use RPATH to indicate link paths, and possibly copying binaries in rather than symlinking them.
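As a sketch of what versioned messages might look like, the following packs a marker and a version number ahead of the payload and rejects anything else. The marker value, the field layout, and all names here are hypothetical; the real pseudo message structure is different and is not reproduced.

```python
import struct

MAGIC = b"PSv\x00"  # hypothetical marker, assumed never to start an old message
VERSION = 2  # hypothetical protocol version
HEADER = struct.Struct("!4sII")  # marker, version, path length


class VersionMismatch(Exception):
    """Raised when the peer does not speak the expected message format."""


def pack_message(path):
    """Prefix the payload with the marker, the version, and its length."""
    return HEADER.pack(MAGIC, VERSION, len(path)) + path


def unpack_message(data):
    """Return the payload, or raise instead of waiting for missing bytes."""
    if len(data) < HEADER.size:
        raise VersionMismatch(f"short message: {len(data)} bytes")
    magic, version, pathlen = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise VersionMismatch("no version marker; peer predates versioning")
    if version != VERSION:
        raise VersionMismatch(f"peer uses version {version}, expected {VERSION}")
    path = data[HEADER.size:HEADER.size + pathlen]
    if len(path) != pathlen:
        raise VersionMismatch("truncated path")
    return path
```

The point of the design is that a mismatched peer produces a diagnostic right away, rather than leaving both sides waiting on the socket.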


API drift

API drift, redux

After this article was drafted, pseudo was partially ported to work on Mac OS X, where core system utilities require a working getxattr()/setxattr(). Therefore, on OS X, pseudo just quietly passes the calls along. Fortunately, the utilities in question don't use setxattr() to implement file modes. I'll burn that bridge when I get to it.

On some recent Linux machines, files copied with plain old /bin/cp were ending up with incorrect permission bits. It turns out that the getxattr()/setxattr() family of functions can be used to query or set POSIX modes, not just extended attributes. On one particular system, this is done instead of using plain chmod(). Conveniently, the spec requires falling back to chmod() if the *xattr() functions fail, so for now, pseudo intercepts them and fails, setting errno to ENOTSUP. This may need to be fixed later.
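The fallback behavior can be illustrated with a stand-in for the intercepted call. Everything in this sketch is an assumption made for illustration: refusing_setxattr() plays the part of a wrapper that always fails, apply_mode() plays the part of a caller such as cp, and the attribute value is a placeholder rather than a real encoded ACL.

```python
import errno
import os
import stat
import tempfile

ACL_XATTR = "system.posix_acl_access"  # Linux name for a file's access ACL


def refusing_setxattr(path, name, value):
    """Stand-in for an intercepted setxattr() that always fails."""
    raise OSError(errno.ENOTSUP, os.strerror(errno.ENOTSUP), path)


def apply_mode(path, mode, setxattr=refusing_setxattr):
    """Try the xattr route first, then fall back to chmod() on ENOTSUP."""
    try:
        # Placeholder value; a real caller would encode an ACL here.
        setxattr(path, ACL_XATTR, b"placeholder")
        return "xattr"
    except OSError as exc:
        if exc.errno != errno.ENOTSUP:
            raise
    os.chmod(path, mode)
    return "chmod"


def current_mode(path):
    """Return only the permission bits of the file."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

Because the refusal is reported as ENOTSUP, the caller takes the chmod() path, which is the one route that the rest of the wrappers already track.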

Similarly, during a major refactoring phase, many of pseudo's wrappers were re-implemented as trivial functions which just called other functions; for instance, using open() with O_CREAT to implement creat(). In particular, many functions which had *at() variants were implemented by calling the corresponding *at() function with AT_FDCWD as the dirfd parameter. This worked beautifully until we tried it on a machine that didn't provide openat().
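One way to keep the shared code path without requiring openat() everywhere is to take the *at() route only when a directory descriptor is actually supplied. The sketch below shows the idea in Python, where os.supports_dir_fd reports whether the platform offers the *at() variant; pseudo itself is written in C, and the function names here are hypothetical.

```python
import os
import tempfile

# True when the platform provides openat(), as reported by Python.
HAVE_OPENAT = os.open in os.supports_dir_fd


def open_at(dir_fd, path, flags, mode=0o644):
    """Open path relative to dir_fd; dir_fd=None plays the role of AT_FDCWD."""
    if dir_fd is None:
        # Plain open(): available everywhere, no openat() needed.
        return os.open(path, flags, mode)
    if not HAVE_OPENAT:
        raise NotImplementedError("openat() is not available on this platform")
    return os.open(path, flags, mode, dir_fd=dir_fd)


def creat(path, mode=0o644):
    """creat() expressed through the shared helper."""
    return open_at(None, path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
```

With this arrangement, the simple wrappers still share one implementation, but a system without openat() loses only the calls that genuinely need a directory descriptor.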

It is likely that as time goes on, we'll have to develop more complete handling for systems that offer a different range of API support.


Lessons learned and future directions

Many of the problems we've encountered during the initial development and ongoing maintenance of pseudo have been relatively easy to track down and diagnose. The decision to focus on robustness and good logging early on has definitely paid off. On the other hand, the test suite we keep planning to write "soon" has been a major and noticeable gap; building more testing support earlier and using it more could have saved a lot of time.

While it's always a good thing to use existing code and projects when they do match what you're doing, don't be afraid to conclude that the problem you're solving really is a new problem. It happens. Not very often (I think this is the first time it's happened to me), but it does happen, and when it does, be ready for it.

For future work, we still have more robustness and diagnostic improvements to do, but the next major field of inquiry may well be performance; for all that pseudo does what it's supposed to do fairly well, it is undeniably quite a bit slower than fakeroot, and we can probably improve it a fair bit. It'll never be as fast to store things in a stable database format on disk as to keep them only in memory, but there's still plenty of room to speed things up.

