Many mistakes are made in the name of multi-processing. Most academic programs and many programming texts explain the concepts of concurrency clearly, but it's a difficult topic, and nearly all of us can use a refresher.
Concurrency labels situations where more than one "application" is running at a time. I quote "application" here because the meaning is context dependent. Linux hosts always fill their process tables with a bunch of more-or-less simultaneous programs: network protocol daemons, cron managers, the kernel itself, and often much more. Linux is a multi-tasking operating system. It's built for such duty.
On a typical uniprocessor host, tasks don't really execute simultaneously. The part of the kernel called the scheduler swaps jobs in and out so that they all get a turn. Your browser downloads a file during the same interval in which you're editing program source and playing a music track. Concurrency most often has to do with this appearance of simultaneity.
Keep in mind the "user view" or "programming model" of concurrency as a matter of scheduling access to a unitary resource. Complementing this, though, is a second, "back end," meaning of concurrency. People out for raw performance emphasize this different aspect. In their context, "multi-processing" generally means dividing up a single task into parts on which different central processing units (CPUs) can collaborate. The idea is to finish a job in a shorter time as measured by an external clock, even if at the cost of more hardware and programming complexity.
Both aspects of concurrency have to do with scheduling, or assignment of tasks to CPUs. Both bear on usability. Confusing the two aspects, though, is a common and troublesome error. Beginning programmers seem particularly prone to false beliefs about one of the most important concurrency methods, "multi-threading," often abbreviated as "threading." Common misconceptions include the ideas that:
- Threading makes programs run faster.
- Threading is the only concurrency construct, or the only practical one.
- N-way hosts work approximately N times as fast as uni-processing hosts.
Just a little conceptual clarity helps correct these mistakes quickly.
Naive developers often ask, "My program is too slow; how can I make it threaded so it'll be faster?" The answer often is, "You can't." A straightforward transformation of an existing single-tasking application to break it into multi-tasking parts always demands more computations. In general, "threading" such a program makes it take longer.
There's a reason the falsehood persists, of course. Many programs can be factored into parts in a way that eases bottlenecks. A compute-intensive job -- simulation of a Space Shuttle re-entry, say -- that can be spread over eight CPUs rather than one probably will finish much faster. Even more common is to restructure a program to avoid input/output (I/O) "blocks." If your consumer-level application can do useful work while waiting for keyboard input, for data to swap in from disk, or for messages to arrive through the network, it will appear to have "sped up" for free.
It is hazardous, though, to credit threading for these accelerations. They all depend on a deeper analysis; speed-up is possible only when an under-utilized resource is available. Moreover, threading isn't the only way to achieve these concurrencies, and frequently it's not the best one.
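To make the point about under-utilized resources concrete, here is a minimal Python sketch. It assumes that `time.sleep` is an acceptable stand-in for a blocking read; the job names and delays are invented for illustration. The threaded version finishes sooner only because the CPU would otherwise sit idle during each wait, not because threads make computation faster.

```python
import threading
import time

def fake_io(name, delay, results):
    """Stand-in for a blocking read: the CPU is idle while we wait."""
    time.sleep(delay)              # simulated disk or network wait (assumption)
    results[name] = delay

def run_sequential(jobs):
    """Perform each wait one after another and report the elapsed time."""
    results = {}
    start = time.perf_counter()
    for name, delay in jobs:
        fake_io(name, delay, results)
    return results, time.perf_counter() - start

def run_threaded(jobs):
    """Overlap the waits by giving each job its own thread."""
    results = {}
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(name, delay, results))
               for name, delay in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start

# Hypothetical jobs: three independent waits of 0.2 seconds each.
JOBS = [("disk", 0.2), ("network", 0.2), ("keyboard", 0.2)]

seq_results, seq_time = run_sequential(JOBS)
thr_results, thr_time = run_threaded(JOBS)
print(f"sequential: {seq_time:.2f}s  threaded: {thr_time:.2f}s")
```

Replace the sleeps with arithmetic that keeps a single CPU busy and the advantage disappears, which is the distinction the paragraph above draws.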
Academic literature studies at least a dozen concurrency models important enough to go into production. Along with threading, you're likely to have heard of multi-processing (in a programmatic sense), co-routines, event-based programming, and perhaps continuations, generators, and several of the more esoteric constructs. All of these methods have a rough formal equivalence in the sense that if you have a language that supports, say, generators but not threads, you can write an emulation for threads in terms of generators (and vice-versa).
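As an illustration of that rough equivalence, the following Python sketch emulates cooperative threads with generators. The names `worker` and `run_round_robin` are hypothetical, and the scheduler is a deliberately simple round-robin queue, not a model of any production threading library.

```python
from collections import deque

def worker(name, steps, log):
    """A task that hands control back to the scheduler after each step."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                      # cooperative "context switch"

def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)             # resume the task until its next yield
        except StopIteration:
            continue               # task finished; drop it
        ready.append(task)         # not finished; back of the queue

log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

Because each task gives up control only at an explicit `yield`, there is no pre-emption, and the order of entries in `log` is fully determined by the program.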
Programming appropriateness is different from abstract equivalence, though. There are real differences between the concurrency models when you're working to deliver reliable applications on schedule. Threading, for example, has frailties that have been known for many years. It's a relatively low-level programming construct. It's hard to program safely; programs that manipulate threads are prone to inconsistent data, deadlocks, unscalable locking, and inverted priorities. Java recently abandoned its initial intent to support only multi-threading as a core concurrency concept because of the performance problems threading has. Thread-savvy debuggers have been notoriously expensive.
Not all the news is bad, though. If you make the time to understand basic concepts clearly, you can work with threads as reliably as you do XML, LDAP, or any other specialized domain. More immediately, there are safer -- and sometimes faster! -- concurrency models for many situations.
In many, many situations, the best way to multi-task an application is to decompose it into collaborating processes rather than threads. Programmers commonly resist this reality. One reason is history: processes used to be far "heavier" than threads, and, under most flavors of Windows, still are. With modern Linux, though, a context switch between distinct processes might take only 15% more time than the corresponding context switch between same-process threads. What you gain for that cost in time is a far better understood and more robust programming model. Many programmers can safely write independent processes. Relatively few are safe with threads.
When is it good to multi-process rather than multi-thread? Suppose, for example, that you have a "control panel" graphical user interface (GUI) that monitors results of several large calculations, retrieves and updates database records, and perhaps even reports on the status of external physical devices. You could put all this in one process, with a separate thread for each task. That's often the preferred course under Windows.
My development practice, though, generally is to put each task in its own process, communicating through sockets, pipes, or occasionally shared memory. This enormously simplifies unit testing, as you can use all your usual command-line tools for automation of the separate processes. A crash in one process doesn't harm any of the others. Performance is usually about as good as with multi-threading, and, depending on hardware and programming details, occasionally better.
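A minimal sketch of this style in Python might look like the following. The squaring worker is a hypothetical stand-in for one of the control panel's tasks, and the parent talks to it through ordinary pipes on standard input and output.

```python
import subprocess
import sys

# Hypothetical worker: reads integers on stdin, writes their squares to stdout.
WORKER_SOURCE = """
import sys
for line in sys.stdin:
    print(int(line) ** 2, flush=True)
"""

def square_in_child(numbers):
    """Send numbers to a separate process over a pipe and collect replies."""
    child = subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
    payload = "".join(f"{n}\n" for n in numbers)
    out, _ = child.communicate(payload)   # write input, close stdin, read output
    return [int(line) for line in out.split()], child.returncode

values, exit_code = square_in_child([1, 2, 3])
print(values, exit_code)
```

Because the worker reads from standard input and writes to standard output, it can also be exercised on its own from a shell, which is what makes this arrangement convenient for unit testing.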
Such a multi-process implementation frequently depends on event-based programming. Events are a distinct concurrency concept useful for managing I/O and related multi-tasking responsibilities. Events relate asynchronous "externalities" to programmed callbacks (also called signals, bindings, and so on). Think of the GUI control panel; a high-performance way to program this with Unix is to update the display only when the select() system call detects arriving data. C-oriented programmers often label event-based methods in terms of select.
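The pattern can be sketched in Python, whose standard `select` module wraps the same system call. The function `run_event_loop`, the sample message, and the use of `socket.socketpair()` in place of a real data source are all assumptions made for illustration.

```python
import select
import socket

def run_event_loop(callbacks, timeout=1.0):
    """Dispatch callbacks for readable sockets until all have closed.

    callbacks maps each socket to a function called with arriving data.
    """
    watched = dict(callbacks)
    while watched:
        readable, _, _ = select.select(list(watched), [], [], timeout)
        if not readable:
            break                      # nothing arrived within the timeout
        for sock in readable:
            data = sock.recv(4096)
            if data:
                watched[sock](data)    # the "event": data arrived
            else:
                del watched[sock]      # peer closed; stop watching

# Hypothetical data source: one end of a socket pair plays the external device.
received = []
reader, writer = socket.socketpair()
writer.sendall(b"temperature=21")
writer.close()
run_event_loop({reader: received.append})
reader.close()
print(received)
```

In a real control panel the callback would update the display, and the loop would do no work at all while every source is quiet.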
You might regard "co-routines" or "generators" as classroom exotica. They were built into the definitions of such languages as Modula and Icon, though, because they make for multi-tasking programming that is expressively powerful while remaining comprehensible and therefore safe. If you have complex performance requirements, if your applications are best modeled in terms of hundreds of subtasks, and especially if your server room is home to a large number of multi-way hosts, you ought to study more of the range of concurrency models. You'll find that each one has applications where it's ideal. Some of these might match your own needs.
Also, be aware that you can probably find support for any of the models you might want to use with Linux. The references below point, among other resources, to implementations and experiments with a wide variety of concurrency models.
One final caution: don't assume that your multitasking software works sensibly on your multiprocessing (often "symmetric multiprocessing" -- SMP) hardware. Especially with older versions of Linux, expertise was often involved in getting useful results from an SMP box. A default Linux 2.4 installation does a good job of using up to four (and sometimes more) processors for distinct processes. Threads within a process, though, can be bottlenecked on a single processor, while other processors sit idle. Other concurrency methods sometimes suffer similarly.
Avoiding this waste of resources depends on the details of your platform. With Linux 2.4 and popular multi-way hardware, you can reasonably expect default "kernel threads" (see the Linux threading FAQ in Resources) to schedule threads properly, that is, to share them among different CPUs. Use top and other system management tools to verify that scheduling appears to be correct, and ask your Linux vendor or users' group specific questions about practical thread scheduling.
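As a first sanity check before turning to top, a program can ask how many CPUs it is allowed to run on. This is a small Python sketch; `os.sched_getaffinity` is available on Linux only, so the fallback to `os.cpu_count()` is an assumption for other platforms, and `usable_cpu_count` is a hypothetical helper name.

```python
import os

def usable_cpu_count():
    """Return how many CPUs the current process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux-specific: the set of CPUs this process may be scheduled on.
        return len(os.sched_getaffinity(0))
    # Fallback for other platforms: total CPUs reported by the OS.
    return os.cpu_count() or 1

cpus = usable_cpu_count()
print(f"This process may be scheduled on {cpus} CPU(s)")
```

A result of 1 on a multi-way host suggests that the process has been pinned to a single CPU, whatever concurrency model the program uses. The number says nothing about how threads are actually distributed at run time; that still calls for observation with system tools.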
Much of your programming is likely to have a natural decomposition into distinct logical tasks. Understand basic concurrency concepts clearly, and you can apply them to meet your own requirements. Remember that concurrency has both a forward- and backward-facing aspect: the "user view" or "programming model" controls the functionality of how you interact with an application, while the "back end" manages assignment of tasks to hardware. Rigorously distinguish your functional and performance requirements. Finally, keep in mind that there's more to concurrency than just threading. You often can make best use of your servers by programming with a model that is not overtly multi-threading.
- Participate in the discussion forum.
- Check out the other installments of Server clinic.
- I have stylistic disputes with the comp.programming.threads FAQ. There's no doubt, though, that it's a valuable resource, particularly if you're already elbow-deep in practical threaded programming challenges.
- My first major complaint about the previous FAQ is that it's so narrowly oriented toward C/C++ programming that it doesn't acknowledge other languages or concurrency models. Quite a few other languages, including Java, have their own language-specific threading FAQs.
- The Linux Threads Home Page does recognize other languages than C/C++. Its major problem is that it's been nearly unmaintained for a couple of years. Still, it has valuable discussions of user- and kernel-level multi-threading, co-operative vs. pre-emptive scheduling, and more.
- "Modern Concurrency Abstractions for C#" illustrates the ferment that still exists in concurrency theory. Researchers and engineers continue to invent and apply new models for better multi-tasking.
- Several members of the IBM Linux Technology Center are working on the Next Generation POSIX Threading project. More information is available at the NGPT home page.
- Also on developerWorks, read other viewpoints in:
  - Runtime: Context switching (developerWorks, July 2002)
  - Charming Python: Generator-based state machines (developerWorks, July 2002)
- Find more Linux articles in the developerWorks Linux zone.