Java concurrency bug patterns for multicore systems

Six lesser known Java concurrency bug patterns

By studying concurrency bug patterns, you both increase your general awareness of concurrency programming and learn to recognize coding idioms that don't, or might not, work. In this article, authors Zhi Da Luo, Yarden Nir-Buchbinder, and Raja Das unpack six lesser-known concurrency bugs that threaten the thread-safety and performance of Java™ applications running on multicore systems.

Zhi Da Luo, Software Engineer, IBM China

Zhi Da LuoZhi Da Luo is a software engineer at the Emerging Technology Institute, IBM China Development Lab. Mr. Luo joined IBM in 2008. He has experience in program analysis, bytecode instrumentation, and Java concurrency programming. He is currently working on a static/run time analysis tool for Java parallel software. Mr. Luo holds a Master's degree in Software Engineering from Peking University in Beijing, China.



Yarden Nir-Buchbinder, Research Scientist, IBM

Yarden Nir-BuchbinderYarden Nir-Buchbinder completed his BSc in Computer Science at the Technion, Israel, and earned his MA in philosophy from Haifa University. Since 2000 he has worked at the IBM Haifa Research Lab, with a research focus on concurrency and test coverage. Mr. Nir-Buchbinder is author and co-author of several publications and patents.



Raja Das, Software Architect, IBM

Raja DasRaja Das is a software architect in the IBM Software Group. His current focus is developing libraries and frameworks for multicore/manycore systems. Previously, he was product architect for the WebSphere Partner Gateway. Dr. Das's interests include programming languages, parallel software, and systems.



21 December 2010

Also available in Chinese Russian Japanese Portuguese

Develop skills on this topic

This content is part of a progressive knowledge path for advancing your skills. See Java concurrency

For programmers inexperienced with multithreaded programming, the challenge of adapting software for multicore systems is twofold: First, concurrency introduces a new category of bugs to Java programs, such as data race and deadlock, that are very difficult to reproduce and diagnose. Second, many programmers are unaware of the subtleties of certain multithreaded programming idioms, which can lead to errors in code.

In order to avoid introducing bugs to concurrent programs, Java programmers must learn to recognize the critical junctures where bugs are likely to show up in multithreaded code, and then write software that is bug proof. In this article, we address Java developers in the early and intermediate stages of understanding what makes concurrent programming different. Rather than focusing on well-known Java concurrency bug patterns like double-checked locking, spin wait, and wait-not-in-loop, we introduce six patterns that are less known, but which do show up frequently in real Java applications. In fact, our first two examples are real bugs found in two popular web servers.

1. An antipattern from Jetty

Our first concurrency bug is found in the widely used open source HTTP server, Jetty. It's a real bug that has been confirmed by the Jetty community (see Resources for the bug report).

Listing 1. Non-atomic operations on a volatile field without a lock held
// Jetty 7.1.0,
// org.eclipse.jetty.io.nio,
// SelectorManager.java, line 105

private volatile int _set;
......
public void register(SocketChannel channel, Object att)
{
   int s=_set++;
   ......
}
......
public void addChange(Object point)
{
   synchronized (_changes)
   {
      ......
   }
}

The error in Listing 1 is the sum of its parts:

  • First, _set is declared volatile, which implies that the field can be accessed by multiple threads.
  • But _set++ is not atomic, which means that it doesn't necessarily execute as a single, indivisible operation. Instead, it is shorthand for a sequence of three discrete operations: read-modify-write.
  • Finally, _set++ is not protected by a lock. If method register is invoked simultaneously by multiple threads, it could result in a race condition leading to an incorrect _set value.

An error of this type could show up just as easily in your code as it did in Jetty's, so let's take a closer look at how it happened.

Elements of a bug pattern

Following the code through its logical sequence helps clarify this bug pattern. Operations on a variable i, such as

i++
--i
i += 1
i -= 1
i *= 2

and so on are non-atomic (that is, read-modify-write). If you know that the volatile keyword in the Java language only guarantees variable visibility, but not atomicity, this should make you pause. A non-atomic operation on a volatile field that is not protected by a lock could result in a race condition — but only if the non-atomic operation is concurrently accessed by multiple threads.

In a thread-safe program, only one writing thread can modify the variable; other threads can read the up-to-date value by declaring the variable volatile.

So, whether the code is buggy depends on how many threads can access the operation concurrently. If the non-atomic operation is invoked by only one thread, due to a start-join relation or external locking, then the code idiom will be thread safe.

Bear in mind that the volatile keyword in Java code only guarantees that a variable is visible: it does not guarantee atomicity. In cases where the variable operation is non-atomic and can be accessed by multiple threads, don't rely on volatile synchronization facilities. Instead, use synchronized blocks, lock classes, and atomic classes from the java.util.concurrent package. These are designed to ensure the thread-safety of programs.


2. Synchronization on mutable fields

In the Java language, we use synchronized blocks to acquire mutual-exclusion locks, which protect shared-resource access in multithreaded systems. There's a loophole when synchronizing on a mutable field, however, which can break down mutual exclusion. The solution is to always declare the synchronized field as private final. Let's take a closer look at the problem to understand why.

Simultaneous locking on updated fields

Synchronized blocks are protected by the object referenced from the synchronized field rather than by the field itself. If a synchronized field is mutable (meaning the field can be assigned anywhere in the program other than its initialization), it is unlikely to have useful semantics, because different threads can synchronize on different objects.

You can see the problem in Listing 2, which is a code snippet from Tomcat, the open source web application server:

Listing 2. Tomcat in error
96: public void addInstanceListener(InstanceListener listener) {
97:
98:    synchronized (listeners) {
99:       InstanceListener results[] =
100:        new InstanceListener[listeners.length + 1];
101:      for (int i = 0; i < listeners.length; i++)
102:          results[i] = listeners[i];
103:      results[listeners.length] = listener;
104:      listeners = results;
105:   }
106:
107:}

Suppose that listeners refers to Array A, and that the thread T1 first acquires the lock on Array A and is then busy with creating Array B. In the meantime, T2 comes along and blocks to lock on Array A. When T1 finishes setting listeners on Array B and exits the block, T2 locks on Array A and starts making a copy of Array B. T3 then comes along and locks on Array B. Because they have acquired different locks, T2 and T3 are now simultaneously making copies of Array B.

Figure 1 further illustrates this sequence:

Figure 1. Lack of mutual exclusion due to synchronization on a mutable field
A diagram illustrating lack of mutual exclusion due to synchronization on a mutable field.

Numerous undesirable behaviors could come out of a setup like this one. At the least, one of the new listeners could get lost, or one of the threads could get an ArrayIndexOutOfBoundsException (due to the fact that the listeners reference and its length can change at any point in the method).

It's good practice to always declare the synchronized field as private final, which ensures that the lock object remains unchanged and mutex is guaranteed.


3. java.util.concurrent lock leak

A lock that implements the java.util.concurrent.locks.Lock interface controls how multiple threads access a shared resource. These locks do not require block structures, so they are more flexible than synchronized methods or statements. However, this flexibility can lead to coding errors because a lock without a block is never automatically released. If a Lock.lock() invocation does not have a corresponding unlock() invocation on the same instance, the result could be a lock leak.

It's easy to introduce the java.util.concurrent lock leak bug by overlooking method behavior in critical code, such as exceptions that can be thrown. You can see this in Listing 3, where the accessResource method throws an InterruptedException while accessing the shared resource. As a result, the unlock() is not invoked.

Listing 3. Anatomy of a lock leak
private final Lock lock = new ReentrantLock();

public void lockLeak() {
   lock.lock();
   try {
      // access the shared resource
      accessResource();
      lock.unlock();
   } catch (Exception e) {}

public void accessResource() throws InterruptedException {...}

To ensure that locks are released, simply pair every lock method with an unlock method, which should be placed in a try-finally block. This is illustrated in Listing 4:

Listing 4. Always put the unlock invocation in the finally block
private final Lock lock = new ReentrantLock();

public void lockLeak() {
   lock.lock();
   try {
      // access the shared resource
      accessResource();
   } catch (Exception e) {}
   finally {
      lock.unlock();
   }

public void accessResource() throws InterruptedException {...}

4. Performance tuning synchronized blocks

Some concurrency bugs won't break your code, but they can lead to poor application performance. Consider the synchronized block in Listing 5:

Listing 5. Synchronized block invariant code
public class Operator {
   private int generation = 0; //shared variable
   private float totalAmount = 0; //shared variable
   private final Object lock = new Object();

   public void workOn(List<Operand> operands) {
      synchronized (lock) {
         int curGeneration = generation; //requires synch
         float amountForThisWork = 0;
         for (Operand o : operands) {
            o.setGeneration(curGeneration);
            amountForThisWork += o.amount;
         }
         totalAmount += amountForThisWork; //requires synch
         generation++; //requires synch
      }
   }
}

Access to the two shared variables in Listing 5 is correctly synchronized, but if you look closely, you'll notice that the synchronized block requires more computation than it should. We can fix it by reordering the lines, as shown in Listing 6:

Listing 6. Synchronized block without the invariant code
public void workOn(List<Operand> operands) {
   int curGeneration;
   float amountForThisWork = 0;
   synchronized (lock) {
      int curGeneration = generation++;
   }
   for (Operand o : operands) {
      o.setGeneration(curGeneration);
      amountForThisWork += o.amount;
   }
   synchronized (lock)
      totalAmount += amountForThisWork;
   }
}

What about ...

You might wonder if it would be better to use AtomicInteger and AtomicFloat for the two shared variables in Listings 5 and 6, and get rid of synchronization altogether. Whether it's possible depends on what the other methods do with these variables, and whether there is a dependency between them.

The second version will perform much better on multicore machines. The reason is that the synchronized block in Listing 5 prevents parallel execution. Computation time in this method is likely to be spent on the loop. In Listing 6, the loop is out of a synchronized block, so it can be executed by several threads in parallel. In general, strive to make your synchronized blocks as lean as possible, without harming thread safety.


5. Multi-stage access

Suppose you're working on an application that holds two tables: One maps an employee name to a serial number and the other maps the serial number to a salary. This data needs to support concurrent accesses and updates, which you enable via the thread-safe ConcurrentHashMap, as shown in Listing 7:

Listing 7. Two-stage access
public class Employees {
   private final ConcurrentHashMap<String,Integer> nameToNumber;
   private final ConcurrentHashMap<Integer,Salary> numberToSalary;

   ... various methods for adding, removing, getting, etc...

   public int geBonusFor(String name) {
      Integer serialNum = nameToNumber.get(name);
      Salary salary = numberToSalary.get(serialNum);
      return salary.getBonus();
   }
}

This solution looks thread-safe, but in fact it isn't. The problem is that the getBonusFor method is not thread-safe. Between obtaining the serial number and using it to obtain the salary, another thread could remove the employee from both tables. In that case, the second map access would return null and an exception would be thrown.

Making each map in itself thread-safe is not sufficient. There is a dependency between them, and some operations that access both maps require atomic access. You could achieve thread-safety in this case by using containers that were not thread-safe (such as java.util.HashMap), and then employing explicit synchronization to protect each access. The synchronized block could then encompass both accesses if necessary.


6. Symmetric lock deadlock

Consider a thread-safe container class — a data structure that guarantees thread safety to its clients. (This is unlike most containers in java.util, which require their client to synchronize around usages of the container.) In Listing 8, a modifiable member stores the data and a lock object protects all accesses to it.

Listing 8. A thread-safe container
public <E> class ConcurrentHeap {
   private E[] elements;
   private final Object lock = new Object(); //protects elements

   public void add (E newElement) {
      synchronized(lock) {
         ... //manipulate elements
      }
   }

   public E removeTop() {
      synchronized(lock) {
         E top = elements[0];
         ... //manipulate elements
         return top;
      }
   }
}

Now let's add a method that takes another instance and adds all of its elements to the current instance. This method needs to access the elements member in both instances, so it takes both locks, as shown in Listing 9:

Listing 9. This road leads to deadlock
public void addAll(ConcurrentHeap other) {
   synchronized(other.lock) {
      synchronized(this.lock) {
         ... //manipulate other.elements and this.elements
      }
   }
}

Not just for containers

The symmetric lock deadlock scenario is somewhat well-known because it happened in a release of Java 1.4., where some of the synchronized containers returned by the Collections.synchronized methods deadlocked. Not only containers are susceptible to symmetric lock deadlock, however. All you need is a class with a method that takes another instance of the same class as its argument, that also needs to perform an operation atomically on members of the two instances. The compareTo and equals methods are two good examples.

Do you see the potential for deadlock? Suppose a program holds two instances, heap1 and heap2. If one thread calls heap1.addAll(heap2), and another thread concurrently calls heap2.addAll(heap1), then the threads could end up deadlocked. To put it differently: Say the first thread takes the lock of heap2, but before it does that the second thread starts executing the method, also taking the lock of heap1. As a result, each thread ends up waiting for a lock held by the other thread.

You can prevent the symmetric lock deadlock by determining an order among instances, so that when locks of two instances need to be taken together, the order is computed dynamically and determines which lock to take first. Brian Goetz discusses this workaround in detail in his book Java Concurrency in Practice (see Resources).


In conclusion

Many Java developers are in the early stages of learning to write concurrent programs for multicore environments. In this process, we are relinquishing the single-threaded programming idioms we've mastered in favor of multithreaded ones, which are inherently more complex. Studying concurrency bug patterns is a good way to discover the pitfalls of multithreaded programming and will help you master the subtleties of its idioms.

You can learn to recognize bug patterns as the sum of their parts, so that certain signals will act as red flags when you're writing code or during code reviews. You can also use static analysis tools for this purpose. FindBugs is an open source static analysis tool that looks for likely bug patterns in code. In fact, FindBugs could be used to detect the second and third bug patterns discussed in this article.

One known drawback of static analysis tools is that they generate false alarms, so that you may spend more time than you would like checking code patterns that are not buggy. An emerging class of dynamic analysis tools is more specifically suited to testing concurrent programs. Two such tools, the IBM® Multicore Software Development Kit (MSDK) and ConcurrentTesting (ConTest), are available at no charge from alphaWorks.

Resources

Learn

Get products and technologies

  • IBM Multicore SDK: This toolkit can be used to find data race, deadlock, and lock contention in Java multithreaded programs.
  • IBM ConcurrentTesting tool: Used for unit testing multithreaded applications, ConcurrentTesting can help eliminate concurrency-related bugs in parallel and distributed Java programs.

Discuss

  • Get involved in the My developerWorks community. Connect with other developerWorks users while exploring the developer-driven blogs, forums, groups, and wikis.

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Java technology on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Java technology
ArticleID=603438
ArticleTitle=Java concurrency bug patterns for multicore systems
publish-date=12212010