Contents


Basic use of pthreads

An introduction to POSIX threads

Comments

Outside of Anne McCaffrey's writing in the Dragonriders of Pern series of novels, the word "thread" is most feared by programmers. Threads, sometimes called lightweight processes, are associated with gigantic, complicated projects. Library functions carry dire warnings that they may not be "thread safe." But what are these "threads"? What do you use them for? What are the risks?

This article introduces threading with a simple threaded application. The threading model used is the POSIX threading interface, often called pthreads. Instructions are based on SuSE Linux 8.2. The code has been tested on both SuSE Linux 8.2 and a recent build of NetBSD-current.

What are threads?

Threads are just like processes, only smaller. The idea of threads is that multiple threads of execution can share a lot of resources; for instance, they generally operate in the same address space. Switching from one thread to another is generally cheaper than switching from one process to another. Furthermore, for processes using a lot of memory, threads may allow substantially more efficient use of memory.

Threads often need to interact with each other, and in doing so, they run into the same problems that would be faced by multiple processes communicating using some kind of inter-process communication (IPC). Some of these issues are covered briefly in this article, because the POSIX thread API provides tools to help deal with these problems, such as deadlocks and race conditions. This article mainly covers problems and solutions specific to multithreaded programming, leaving the general questions of multiprogramming for another day.

Threaded programs also run into a few problems that are not normally seen with multiple processes and IPC. For instance, if two threads both call a function, such as asctime() (which uses a single static data area), they can produce surprising results. This is where the concept of "thread safety" comes from.

A simple program

The sample program used for this article is a die-rolling program. People who play role-playing games, or war games, often spend a lot of time rolling dice. A network-accessible die roller has many features that make it suitable as a demonstration program. Most importantly, the actual code for the program is very simple, so the program's logic doesn't get in the way of understanding the threading.

The biggest distraction, in fact, is networking code. In the hopes of keeping things fairly simple, that's all been hidden away in a couple of subroutines. The first one, socket_setup(), returns a socket which is ready to accept connections. The second one, get_sockets(), takes the socket, accepts a connection, and creates a struct sockets object.

The struct sockets object is just a quick abstraction of an input/output port. One FILE * in, one out, and a flag to remind us to shut the socket down properly later.
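As a rough idea of what that abstraction looks like, here is a minimal sketch of such a structure; the actual field names in dthread1.c may differ:

```c
/* A sketch of the struct sockets abstraction described above;
 * field names here are illustrative, not copied from dthread1.c. */
#include <stdio.h>

struct sockets {
    FILE *in;      /* buffered stream for reading from the client */
    FILE *out;     /* buffered stream for writing to the client */
    int is_socket; /* nonzero if these wrap a real socket that must
                      be shut down properly later; zero for console I/O */
};
```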

You can download the different versions of the program as a tarball from Related topics and view or run them in a separate shell, or view them in a separate browser window. The first is dthread1.c.

Experiment a bit with the provided program, dthread1. It has several options. The first (currently unused but convenient to have around) is a debugging flag, specified by the -d option. The second, -s, determines whether or not it will run on the console. If -s is not provided, the program will listen for connections and talk over them. The third, -t, determines whether or not to run multithreaded. If the -t option is not specified, the program will handle exactly one connection and then exit. Finally, there's the -S option, which causes the program to sleep for one second between each die rolled. This is only interesting because it makes it easier to see the interleaving between multiple connections.

Play with this program a bit. To connect to it once it's running, try telnet localhost 6173. If you're not used to the sorts of games where people roll dice, start with 2d6. The general syntax supported is "NdX", which rolls N dice, each producing a range from 1 through X. (Technically, it rolls one X-sided die N times, as you can see in the code.)
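The "NdX" logic might be sketched as follows; this is a simplified stand-in, and the actual parsing in dthread1.c may differ:

```c
/* A minimal sketch of parsing "NdX" and rolling an X-sided die N
 * times, as described above. Not copied from dthread1.c. */
#include <stdio.h>
#include <stdlib.h>

/* Roll one X-sided die, yielding a value from 1 through sides. */
int roll_die(int sides) {
    return rand() % sides + 1;
}

/* Parse a spec like "2d6" and return the total of the rolls,
 * or -1 if the specification doesn't parse. */
int roll_spec(const char *spec) {
    int n, sides, total = 0;
    if (sscanf(spec, "%dd%d", &n, &sides) != 2 || n < 1 || sides < 1)
        return -1;
    while (n-- > 0)
        total += roll_die(sides);
    return total;
}
```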

Here we see the simplest form of a threaded application. An important theme to notice: each thread manipulates only data that's unique to that thread. (This isn't quite true; award yourself a gold star if you spot the error.) This keeps threads from stepping on each other. To that end, pthread_create() takes an argument of type void * that is passed in to the function where the thread starts execution. This allows you to create an arbitrarily complicated data structure for the thread to work on and pass it in as a single pointer. When the function whose address was passed to pthread_create() returns, that thread is done, but other threads keep running.
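The per-thread-data pattern can be sketched in a few lines; the struct and function names below are invented for illustration, not taken from dthread1.c:

```c
/* Passing per-thread data through pthread_create()'s void * argument.
 * The names here (struct job, worker, run_job) are hypothetical. */
#include <pthread.h>

struct job {
    int input;
    int result;
};

static void *worker(void *arg) {
    struct job *j = arg;        /* recover this thread's private data */
    j->result = j->input * 2;   /* operate only on data unique to us */
    return NULL;                /* returning ends this thread only */
}

int run_job(struct job *j) {
    pthread_t tid;
    if (pthread_create(&tid, NULL, worker, j) != 0)
        return -1;
    return pthread_join(tid, NULL); /* wait for the thread to finish */
}
```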

Conspicuous by its absence is any way for the multithreaded version to terminate cleanly. (In the meantime, you can just kill it with Ctrl-C.) It's easy to see what to do with a single thread: exit when the user says to quit. But if you have four users connected, when should you exit? An obvious answer would be "after the last user exits." But how do we check that? One way would be to increment a variable every time a thread is created, and decrement it whenever a thread terminates. When it reaches zero again, close the whole process.

This sounds wonderful, but there is, of course, a catch.

Race conditions and mutexes

Imagine, if you will, what happens if one thread is exiting just as another is being created. If the thread scheduler happens to interleave their execution, we can end up being very much surprised. Thread 1 is executing code equivalent to i = i + 1;. Thread 2 is executing code equivalent to i = i - 1;. For the sake of argument, imagine that the variable i starts out with the value 2.

Thread 1: fetch value of i (2)
Thread 1: add 1 to value (yielding 3)
Thread 2: fetch value of i (2)
Thread 2: subtract one from value (yielding 1)
Thread 2: store value in i (i = 1)
Thread 1: store value in i (i = 3)

Oops.

What you have here is called a "race condition," a condition where something can go wrong only during a small window, meaning that you have to be pretty fast, or pretty unlucky, to see the problem. These can be nightmarish to debug, since they may not occur more than one time in a few million.

We need some way to keep thread 1 and thread 2 from doing this: some way for thread 1 to say "no one else can touch i until I'm done with it." You could have a great deal of fun trying to invent such a mechanism yourself, a way for two threads to ensure that they don't clash. You could have even more fun just using the existing mechanism and working on your own code.

The next thing to understand is the concept of a mutex. A mutex (short for MUTual EXclusion) is a way of keeping threads from stepping on each other. Think of it as a unique object that must be picked up. You can only pick it up if no one else has it. Once you have it, no one can take it away from you until you put it down. The process of picking up the unique object is called locking, or acquiring, the mutex. Different people will have learned different terminology for this, so don't be surprised if someone you talk to has another word for it. Conveniently, the POSIX thread interface uses fairly consistent terminology, so "locking" it is.

The procedure for creating and using a mutex is a bit more complicated than the procedure for starting a thread. The mutex object must be declared; once declared, it must be initialized. After this, it can be locked and unlocked.
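That life cycle, declare, initialize, lock, unlock, destroy, looks like this in outline (all of these calls return 0 on success):

```c
/* The mutex life cycle just described. */
#include <pthread.h>

pthread_mutex_t count_mutex;   /* declaration */

int mutex_demo(void) {
    if (pthread_mutex_init(&count_mutex, NULL) != 0)  /* run-time setup */
        return -1;
    pthread_mutex_lock(&count_mutex);     /* acquire; blocks if held */
    /* ... touch shared data here ... */
    pthread_mutex_unlock(&count_mutex);   /* release */
    return pthread_mutex_destroy(&count_mutex);  /* free any resources */
}
```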

Mutex code: First example

Now view dthread2.c in your browser (or view it in your expanded pth directory).

For convenience, the pthread_create() calls are separated out into a new routine called spawn(). This way, the change to add mutex code only has to happen in one place. This routine does something new; it locks a mutex called, creatively enough, count_mutex. After locking that, it creates a new thread and increments threadcount; after it's done, it unlocks the mutex. Then, when a thread is ready to terminate, it locks the mutex, decrements threadcount, and unlocks the mutex. If the decrement leaves threadcount at zero, we know that there are no threads running, so it's time to exit. This isn't quite true; there's one thread still running, the one the process started with.
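The pattern just described might be sketched like this; the names follow the article's description, but the details of dthread2.c may well differ:

```c
/* Sketch of the spawn()/threadcount pattern described above. */
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
static int threadcount = 0;

/* Example worker used below; real threads would handle a connection. */
static void *noop_worker(void *arg) { (void) arg; return NULL; }

/* Create a thread and count it, holding the mutex across both steps
 * so the new thread can't finish before it has been counted. */
void spawn(void *(*func)(void *), void *arg) {
    pthread_t tid;
    pthread_mutex_lock(&count_mutex);
    if (pthread_create(&tid, NULL, func, arg) == 0)
        ++threadcount;
    pthread_mutex_unlock(&count_mutex);
}

/* Called by each thread as it finishes; ends the process when the
 * last counted thread is gone. */
void thread_done(void) {
    pthread_mutex_lock(&count_mutex);
    if (--threadcount == 0)
        exit(0);
    pthread_mutex_unlock(&count_mutex);
}
```

Each connection-handling thread would call thread_done() as its last act before returning.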

You will notice a call to a function called pthread_mutex_init(). This function handles any run-time preparation needed for a mutex. When you're done with a mutex, you can release any resources allocated during initialization with a call to pthread_mutex_destroy(). Some implementations don't allocate or release any resources; follow the API anyway. If you don't, you can bet that sometime when you least expect it, a production system will use a new implementation that does depend on these calls being made -- and you'll find out about it at 3 a.m.

It might seem reasonable to put the mutex locking code only immediately around the change to threadcount in spawn(). Sounds great; unfortunately, it might actually introduce a very frustrating bug. That program would, every so often, print a prompt, then hang up on you. Sound surprising? Remember, as soon as pthread_create() is called, the new thread starts execution. So, the sequence of events looks sort of like this:

Main thread: Call pthread_create()
New subthread: Print prompt
Old subthread: exit, decrementing threadcount to 0
Main thread: Lock mutex and increment threadcount

Except, of course, that you never get to that last step.

If you moved the change in threadcount in front of the call to pthread_create(), then the mutex code could just surround the actual decrement. The sample program wasn't written that way, because the example was more interesting this way.

Untangling deadlocks

Earlier, this article referred to a subtle potential race condition in the original program. Now, the mystery is revealed. The hidden race condition is that rand() has internal state. If calls to rand() from two threads overlap, it is possible for the wrong numbers to be returned. For this program, that's probably no big loss, but for a serious simulation that depends on the reproducibility of its random numbers, it could be a real problem.

So, let's say we go ahead and add a mutex that surrounds the random number generator. No problem. We've easily fixed our race condition.

See dthread3.c or open the same file in your pth directory.

Unfortunately, there's another potential problem looming, called a deadlock. A deadlock happens when two (or more) threads are each stuck waiting for another. Imagine, if you will, a pair of mutexes; we'll call them count_mutex and rand_mutex. Now, imagine that two threads want to use them. Thread one does this:

pthread_mutex_lock(&count_mutex);
pthread_mutex_lock(&rand_mutex);
pthread_mutex_unlock(&rand_mutex);
pthread_mutex_unlock(&count_mutex);

Thread two, however, does them in the other order:

pthread_mutex_lock(&rand_mutex);
pthread_mutex_lock(&count_mutex);
pthread_mutex_unlock(&count_mutex);
pthread_mutex_unlock(&rand_mutex);

This is a deadlock waiting to happen. If these two threads start on these code paths at the same time, they'll start to do their locking:

Thread 1: pthread_mutex_lock(&count_mutex);
Thread 2: pthread_mutex_lock(&rand_mutex);

What happens next? If thread one runs, it'll want to lock rand_mutex, which is held by thread two. If thread two runs, it'll want to lock count_mutex, which is held by thread one. To quote an apocryphal Texas law, "When two trains meet each other at a railroad crossing, each shall come to a full stop, and neither shall proceed until the other has gone."

One simple solution to problems like this is to ensure that locks are always acquired in the same order. Similarly, one simple way to drive safely is to never lose control of the vehicle. In real programs, it's very unlikely that the calls that result in a deadlock condition are all nicely lined up. Even in our simple program (which, if I've done my job correctly, is free of deadlocks), the mutex calls aren't all adjacent.
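When lock acquisitions can't all be lined up by inspection, one way to impose a consistent order anyway is a small helper that always locks the two mutexes in the same order, here by address. This is a sketch of the general idea, not code from the article (comparing unrelated pointers is a common idiom, though not strictly portable C):

```c
/* Acquire two mutexes in a globally consistent order (by address),
 * so no two threads can ever hold them in opposite orders. */
#include <pthread.h>

void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    if (a < b) {
        pthread_mutex_lock(a);
        pthread_mutex_lock(b);
    } else {
        pthread_mutex_lock(b);
        pthread_mutex_lock(a);
    }
}

void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}
```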

Now, turn your attention to the next example, dthread4.

This version of the program demonstrates a common source of deadlocks: programmer stupidity.

This version allows multiple specifications on a line. It also limits dice to the common polyhedral types most often seen in role-playing games. The hapless developer of this monstrosity came up with the clever idea of only locking things when they are actually about to be used, but deferring unlocking until the end of a run. So, if the user enters "2d6 2d8," the program locks the six-sided die, rolls it twice, then locks the eight-sided die, and rolls that twice. Only when it's done rolling all the dice does it unlock any of them.

This, unlike previous versions, is vulnerable to a deadlock state. Imagine, if you will, that two users simultaneously request die rolls. One asks for "2d6 2d8." The other asks for "2d8 2d6." What happens?

Thread 1: lock the six-sided die; roll it
Thread 2: lock the eight-sided die; roll it
Thread 1: try to lock the eight-sided die
Thread 2: try to lock the six-sided die

The "clever" solution is, in fact, no solution at all. If the die roller unlocked each die as soon as it was done with a given roll, there would be no problem.
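That fix might look something like the following sketch, which holds each die's lock only while that die is being rolled; the names and die set are invented here, not taken from dthread4:

```c
/* Lock each die only for the duration of its own rolls. Because no
 * thread ever holds two die locks at once, the "2d6 2d8" versus
 * "2d8 2d6" deadlock cannot happen. Names are hypothetical. */
#include <pthread.h>
#include <stdlib.h>

#define NUM_DICE 6  /* the common polyhedral set: d4 d6 d8 d10 d12 d20 */

static const int die_sides[NUM_DICE] = { 4, 6, 8, 10, 12, 20 };
static pthread_mutex_t die_mutex[NUM_DICE] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};

/* Roll one die n times, releasing its lock before the next die. */
int roll_ntimes(int die, int n) {
    int total = 0;
    pthread_mutex_lock(&die_mutex[die]);
    while (n-- > 0)
        total += rand() % die_sides[die] + 1;
    pthread_mutex_unlock(&die_mutex[die]);  /* unlock as soon as done */
    return total;
}
```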

The first lesson from this is that you can go too far in simulation; locking access to individual dice is, frankly, just silly. However, the second lesson is that it is not necessarily possible to see the deadlock inside the code. The decision about which mutexes to lock is made based on data only available at runtime. This is the essence of the problem.

If you want to see this bug in action, try using the -S option, which will give you enough time to swap from one terminal to another and investigate. Now, how would you fix this? Imagine, for the sake of argument, that it is necessary to lock individual dice. How would you go about it? One naive solution is suggested in dthread5.c. Having done this, you could then go ahead and give each die its own random number generator. All good gamers are aware that dice have good and bad rolls in them; you wouldn't want to waste the really good rolls on the small dice, right?

Deadlocks can also happen within a single thread. The default mutex type used is the "fast" mutex type, which has the delightful feature that, if you try to lock it while you already have it locked, it deadlocks. If possible, just design your program never to re-lock a mutex it already has. Otherwise, you can use the "recursive" mutex type, which allows the holder of the lock to lock the same mutex multiple times. Another kind of mutex checks for common errors, such as unlocking a mutex that is already unlocked. Note that a recursive mutex won't help you if you actually have a locking bug. An early version of dthread3.c contained the following code fragment:

Listing 1. Fragment from dthread3.c, containing bug
   int
   roll_die(int n) {
      pthread_mutex_lock(&rand_mutex);
      return rand() % n + 1;
      pthread_mutex_unlock(&rand_mutex);
   }

See if you can spot the bug faster than I did (about five minutes). For extra credit, try to spot it at 4:30 a.m. Check your answer in the sidebar at the end of this article.

A full discussion of deadlocks is beyond the scope of this article, but now you know what to look for. And you can check the references listed in the Related topics section for more on mutexes and deadlocks.

Condition variables

Condition variables are another feature of particular interest. A condition variable is used to allow a thread to sleep until a condition is true. The function primarily used for this is pthread_cond_wait(). It takes two arguments; the first is a pointer to a condition variable, and the second is a locked mutex. Condition variables, like mutexes, must be initialized using an API call, in this case, pthread_cond_init(). When you're done with a condition variable, you can release any resources allocated during initialization with a call to pthread_cond_destroy(). As with mutexes, it's possible that these calls won't do very much in some implementations, but you should make them anyway.

When invoked, pthread_cond_wait() unlocks the mutex and then pauses execution of its thread. It will remain paused until some other thread wakes it up. These operations are "atomic"; they always happen together, without any other thread executing in between them. After this, other threads run. If another thread calls pthread_cond_signal() on a condition variable, then a thread that was waiting on that variable is woken up. If another thread calls pthread_cond_broadcast(), then all threads waiting on that variable are woken up.

Finally, the first thing any thread tries to do when waking up from pthread_cond_wait() is re-lock the mutex it unlocked when initially called. This may take some time. In fact, it may take so much time that the condition you were waiting for is no longer the case. For instance, a thread waiting until items are added to a linked list may wake up to find the list empty. In some implementations, threads may occasionally wake up without a signal sent to their condition variable. Thread programming is not always an exact science: always program defensively.
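Defensive programming here means re-testing the condition in a loop after every wakeup, since the predicate may no longer hold by the time the mutex is re-acquired. The sketch below uses static initializers for brevity; pthread_cond_init()/pthread_mutex_init() work equally well:

```c
/* The standard condition-variable wait loop. */
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;

/* Block until 'ready' is set. The while loop, not an if, is the
 * defensive part: after re-locking the mutex, the condition may no
 * longer hold (another thread got there first, or the wakeup was
 * spurious), so always re-test it. */
void wait_for_ready(void) {
    pthread_mutex_lock(&lock);
    while (!ready)
        pthread_cond_wait(&ready_cond, &lock); /* unlock, sleep, re-lock */
    pthread_mutex_unlock(&lock);
}

void set_ready(void) {
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_signal(&ready_cond);  /* wake one waiting thread */
    pthread_mutex_unlock(&lock);
}
```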

Further reading

This is, of course, only a surface look at the POSIX threading API. Some users may find a need to use the pthread_join() call, which allows one thread to wait until another is completed. There are attributes which can be set, and scheduling can be controlled. There are a number of resources on POSIX threads out on the Web. Don't forget to read the man pages. You can get a list of pthread-related man pages with apropos pthread or man -k pthread.


Downloadable resources


Related topics


Comments


ArticleTitle=Basic use of pthreads
publish-date=01212004