One of the best ways to simplify the often difficult task of writing thread-safe Java programs is to rely on immutable objects. The primary mechanisms supporting objects' immutability are final fields. Starting with Java 5, the semantics of final fields were strengthened as part of a thorough revision of the Java Memory Model (JMM) (see Resources). For example, contrary to intuition, declaring a field private and never changing its value outside of constructors is not sufficient to ensure thread-safe access. To avoid synchronizing access to it, a field must be declared final. In some cases, though, the immutability model imposed by the JMM is too restrictive. This article shows you a way to cache computed values in normal fields instead, without compromising a class's immutability.
Under the JMM, final fields must be set to their ultimate values in constructors. After the execution of the constructor ends (and provided no reference to the object itself escapes during construction), the JMM ensures that all threads read the same value of a final field, even without synchronizing access to it. (This is important, for example, in the String class to prevent security holes that might emerge if final fields could be perceived to change values as a result of visibility issues in different threads.)
Moreover, if the final field is a reference, the state of the entire graph of objects rooted at it (as seen by other threads) will be read as being at least as up-to-date as the final field itself, whether or not the fields of objects in the graph are declared final. If the graph's state isn't modified after construction of the object containing the final root reference, all threads will read the graph identically in any part, as long as they traverse it starting from that field. (Conversely, if some part of the graph can also be reached through normal references, nothing is guaranteed about what a thread traversing the graph from outside will read.)
When a less restrictive immutability model is needed
In some cases, the model of immutability implicitly imposed by the JMM is too restrictive. As an example, suppose you have an immutable Point class for geometry on the Euclidean plane. x and y are final fields denoting the Cartesian coordinates, and getRho() and getPhi() methods get the polar coordinates. The values returned by getRho() and getPhi() never change because they functionally depend only on x and y, which are immutable.
You can implement getRho() and getPhi() in three ways:
-
Recompute on demand: A common approach is to recompute the values at each invocation. This preserves thread safety but might lose performance if frequently requested or if the computation is expensive.
-
Eager initialization: Another approach is to precompute the values during construction and store them in
finalfieldsrhoandphi. The methods then merely return the values in the fields, the same way normal getters would do. This still preserves thread safety, but the precomputation might be useless or expensive if the values are never needed. Note that it's necessary to declarerhoandphiasfinal(orvolatile) to avoid declaringgetRho()andgetPhi()assynchronized. -
Initialization on demand: If performance is critical and you want to avoid the issues associated with recompute on demand and eager initialization, another strategy is to compute the values lazily the first time they are needed and cache them in
rhoandphifields to reuse them on subsequent requests. The fields cannot befinal, however, because you set them outside of constructors. It seems, then, thatgetRho()andgetPhi()need to be declaredsynchronized(or the fields asvolatile) to ensure thread-safe access. But observe thatrho, for example, while mutable from the perspective of the JMM, is logically written only once: Its value, once computed and set, never changes. It is as if the precomputation part of eager initialization were deferred until the value inrhois needed, when it is ultimately initialized with its definitive value.
It's important to stress that the values returned by getRho() and getPhi() functionally depend on the final fields x and y alone. That is, they can be computed deterministically from the immutable values in x and y and do not depend on any other state.
In the initialization-on-demand setting, both rho and phi are examples of what I call write-once fields. Such a field has a value that is either the initial default value (null, 0, false) for its type or a value that never changes once it has been written for the first time. (As I discuss more deeply below, it can be written more than once, but always with the same value.)
You can efficiently implement the initialization-on-demand strategy, where write-once fields are set to their ultimate values outside constructors. It requires some careful programming and sometimes some additional objects, but it explicitly avoids any other form of synchronization. Whenever an immutable class exposes values that functionally depend on its (immutable) state alone, they can be lazily computed, cached in normal fields, and accessed by means of nonsynchronized methods. As a general guideline, whenever eager initialization is viable, so is initialization on demand.
Be warned, however: Lack of synchronization can lead to serious safety problems (similar to those present in the double-check antipattern; see Resources). The requirements for this technique to work are that the cache fields are logically write-once and that if the field is physically written more than once, no one can tell the difference.
As an appetizer, take a look at a possible implementation of Point, shown in Listing 1. The implementation of getPhi() (not shown) is similar.
Listing 1. Initialization on demand of write-once fields
@Immutable
public class Point {
private final double x;
private final double y;
private double rho;
private double phi;
public Point(double x, double y) {
this.x = x;
this.y = y;
}
public double getRho() {
double r = rho;
if (r == 0) {
r = Math.hypot(x, y); // never returns -0.0
if (r != 0)
rho = r;
}
return r;
}
...
}
|
The nonstandard @Immutable annotation is meant to document the class better and to help static code analyzers. (Java Concurrency in Practice covers annotations for concurrency; see Resources.)
Now here's a slightly more complicated example: the rational numbers (fractions). This section introduces the Rational class and uses it as to illustrate some practical techniques for achieving synchronization-free access to the caching fields. I propose several approaches, from the most naive to a more-sophisticated implementation. You'll see the technique I used to implement getRho() reappear in several forms.
Listing 2 shows a straightforward implementation of Rational. It defines two fields: n and d for the numerator and denominator, respectively.
Listing 2. Naive implementation of the Rational class
import java.math.BigInteger;
@Immutable
public class Rational {
private final BigInteger n;
private final BigInteger d;
private static boolean isZero(BigInteger val) {
return val.signum() == 0;
}
public Rational(BigInteger n, BigInteger d) {
if (isZero(d))
throw new IllegalArgumentException("denominator is 0");
this.n = n;
this.d = d;
}
public boolean equals(Object o) { ... }
public int hashCode() { ... }
public Rational multiply(Rational val) { ... }
// other operations
...
}
|
In Listing 2, the fields are declared final to make Rational immutable and therefore thread-safe. (Unfortunately, as of this writing, BigInteger isn't immutable -- see Resources for a link to the relevant Bug Database entry -- but this will be fixed sooner or later.)
The first problem that arises is how to implement a proper hashCode() method. Remember the general contract of hashCode(): Among other properties, the most important is that whenever an object equals() another, both must also have the same hashCode(). You want -4/6 to equals() 6/-9, so you also need the hashCode() of -4/6 to equal that of 6/-9. You could simply always return 42, for example, but this would be a poor implementation of hashCode(). A good one must return possibly different values for objects that are unequal, so it needs to rely on the object's state, somehow.
Before reading further, try to implement a hashCode() based on the values of the n and d fields in Rational. There's no good way to do it correctly without reducing n and d to some canonical and unique form. The most natural canonical representation for Rational has n and d relatively prime (gcd(n, d) = 1) and d > 0. This can be enforced during construction, as shown in Listing 3:
Listing 3. Expensive implementation of
Rational
@Immutable
public class Rational {
private final BigInteger n;
private final BigInteger d;
private static boolean isZero(BigInteger val) { ... }
private static boolean isOne(BigInteger val) {
return val.signum() > 0 && val.bitLength() == 1;
}
public Rational(BigInteger n, BigInteger d) {
if (isZero(d))
throw new IllegalArgumentException("denominator is 0");
BigInteger gcd = n.gcd(d);
if (!isOne(gcd)) {
n = n.divide(gcd);
d = d.divide(gcd);
}
if (d.signum() < 0) {
n = n.negate();
d = d.negate();
}
this.n = n;
this.d = d;
}
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof Rational))
return false;
Rational other = (Rational) o;
return n.equals(other.n) && d.equals(other.d);
}
public int hashCode() {
return n.hashCode() ^ d.hashCode();
}
public Rational multiply(Rational val) {
return new Rational(n.multiply(val.n), d.multiply(val.d));
}
// other operations
...
}
|
The constructor guarantees that an instance of Rational is always in its reduced form, with n and d relatively prime and d > 0. This permits straightforward implementations of equals() and hashCode(). This is the sole constructor, and Rational is immutable because all methods operate only on final fields. Moreover, hashCode() works well: it fulfills the contract and produces reasonably well-distributed values. But this Rational is by no means efficient.
It requires every newly constructed Rational to be in its reduced form. This is quite costly: it entails computing the gcd, which requires many costly divisions or bit shifts. Further, it needs to perform two full divisions if the gcd is not 1. All invocations of the constructor need to compute the gcd, and about two invocations out of five also need the reducing divisions. (A theorem in number theory states that the probability for two integers to be relatively prime is about 61%, or 6/pi2 to be precise.)
A better strategy is to allow noncanonical representations explicitly and to expose a getReduced() method, leaving it to Rational's clients to decide when reduction is appropriate. Equipped with a getReduced() method, Rational can be reimplemented as shown in Listing 4:
Listing 4. Rational with
getReduced() but no caches
@Immutable
public class Rational {
private final BigInteger n;
private final BigInteger d;
private static boolean isZero(BigInteger val) { ... }
private static boolean isOne(BigInteger val) { ... }
public Rational(BigInteger n, BigInteger d) {
if (isZero(d))
throw new IllegalArgumentException("denominator is 0");
this.n = n;
this.d = d;
}
private Rational newReduced(BigInteger gcd) {
BigInteger nRed = isOne(gcd) ? n : n.divide(gcd);
BigInteger dRed = isOne(gcd) ? d : d.divide(gcd);
return dRed.signum() > 0 ?
new Rational(nRed, dRed) :
new Rational(nRed.negate(), dRed.negate());
}
public Rational getReduced() {
BigInteger gcd = n.gcd(d);
return isOne(gcd) && d.signum() > 0 ?
this :
newReduced(gcd);
}
public boolean equals(Object o) { ... }
public int hashCode() {
Rational thiz = getReduced();
return thiz.n.hashCode() ^ thiz.d.hashCode();
}
public Rational multiply(Rational val) { ... }
// other operations
...
}
|
Although you can implement equals() without resorting to getReduced(), hashCode() needs the object's reduced representation to meet hashCode()'s contract. But once a reduced value has been computed -- either because the client requested it or because hashCode() needed it -- it would be useful to cache it for the future: Arithmetic operations compute faster if you use reduced values. But you can't modify the final fields or make them normal fields without enforcing synchronization in every method.
You'd like to implement a Rational class that:
- Is immutable and thread-safe.
- Is not required to instantiate only reduced representations.
- Caches its reduced value as soon as it is needed for external or internal reasons.
- Uses the cached reduced value whenever possible, once it has been computed.
Listing 5 shows an attempt:
Listing 5. Implementation with write-once fields
@Immutable
public class Rational {
private final BigInteger n;
private final BigInteger d;
private Rational red; // the cached reduced representation of this
private int hash; // the cached hashCode()
private static boolean isZero(BigInteger val) { ... }
private static boolean isOne(BigInteger val) { ... }
public Rational(BigInteger n, BigInteger d) {
if (isZero(d))
throw new IllegalArgumentException("denominator is 0");
this.n = n;
this.d = d;
}
private Rational newReduced(BigInteger gcd) { ... }
public Rational getReduced() {
Rational r = red;
if (r == null) {
BigInteger gcd = n.gcd(d);
if (isOne(gcd) && d.signum() > 0)
r = this;
else {
r = newReduced(gcd);
r.red = r; // red of an already reduced r is itself
}
red = r;
}
return r;
}
public boolean equals(Object o) { ... }
public int hashCode() {
Rational thiz = getReduced();
int h = thiz.hash;
if (h == 0) {
h = thiz.n.hashCode() ^ thiz.d.hashCode();
if (h != 0)
thiz.hash = h;
}
return h;
}
public Rational multiply(Rational val) {
Rational thiz = red;
if (thiz == null)
thiz = this;
Rational other = val.red;
if (other == null)
other = val;
return new Rational(thiz.n.multiply(other.n), thiz.d.multiply(other.d));
}
// other operations
...
}
|
Evidently, red cannot be made final because it is computed lazily. So Rational is potentially not immutable because it declares non-final fields. However, like rho in class Point in Listing 1, red isn't really part of the
object's state: it's here only to improve its performance. This is a general principle: Whenever a value functionally depends on an object's immutable state, it is redundant and not logically part of the object's state. From a logical perspective, the Rational's state is defined by n and d alone.
This implementation depends on the JMM guarantee of not-out-of-thin-air safety: When getting the value of a field, a thread reads the default value or a value set by some thread -- never a random value.
Consider hashCode(), a client of getReduced(). You can suppose that computing BigInteger.hashCode() is expensive, especially for large integers. So you cache Rational.hashCode() once it's computed. The right place to save it is in the cached reduced instance. Is this implementation of hashCode() thread-safe? Two different threads could operate on the same hash field, one reading it and the other writing to it. Because access to the field isn't synchronized, this access to hash is a data race, but -- to answer my question -- one that turns out to be harmless.
In fact, the value of hash is fully determined by the reduced value. The reduced value, in turn, depends only on the immutable state of the object referenced by this. A thread will read only the default 0 value or the value written by some thread. And the value written by any thread will always be the same, no matter how threads interact to compute and set it. Once a thread reads a nonzero value from hash, it will subsequently always see the same value, even if in the meantime the field has been visibly written by some other thread. This is called a benign data race. Write-once fields like hash work because the data races they are subject to are of a benign nature. (A similar implementation is used in String.hashCode().)
So far so good for primitives like hash. (For floating-points, however, you need to take care that -0.0 and the default initial value +0.0 are distinct but that == considers them equal.) For references, it turns out that the issue is a little bit more complicated.
To see why, I'll analyze what happens with red on a particular instance -- rat -- of Rational. The only method that writes to red is getReduced(). Because multiple threads can update the same field without synchronizing access to it (getReduced() isn't synchronized), you're facing a data race. However, the value referenced by red is fully determined by the immutable state of rat. No matter how many times red is written to, once a thread reads a nonnull, it will subsequently always see the same reduced value, although not necessarily the same object. Again, the data race is a benign one.
It's not guaranteed that a thread will continue to see the same instance of Rational in red. What is ensured, instead, is that the value is indistinguishable, even if the reference itself changes. In fact, suppose that a write to red done by a thread doesn't become visible to a second thread. This can happen because the write isn't synchronized. The second thread, executing getReduced() for the first time on the same rat, would then set red to a new instance of the same reduced value. Later, this write may happen to be seen by the first thread. From the point of view of the first thread, the reference in red would have changed, although the value would still be the same.
Also, for a particular rat, a thread computes its reduced value at most once, and it might never need to if writes to red by other threads soon become visible. This defines an upper bound on how many times the reduced value is recomputed for the same rat. It will be computed at most as many times as there are threads operating on rat.
Now, consider what happens with multiply(). Again, it is thread-safe. Observe that multiply() first carefully snapshots this or its reduced value in the local variable thiz. Once more, these steps are subject to benign data races, and thiz always references a correct arithmetic value: either the original value of this or its reduced value. The same happens for the val argument and its local counterpart other. So, through thiz and through other, multiply() always sees correct arithmetic values, and that's sufficient to compute the product correctly. You can write similar code for all other operations.
There's a little disturbing point in hashCode(), though. When this method happens to return 0, all threads will recompute it again and again. As implemented, hashCode() has no way to distinguish a hash initialized with its default value of 0 from a truly computed 0 hash. You could introduce a boolean isComputed field to indicate whether a 0 hash value is the result of a computation or if it's the initial default value. But you can't do this without coordinating access to isComputed and hash by some form of synchronization; try this as an exercise to convince yourself.
It would be rare to need to do this for performance reasons, but you could fix this issue without synchronization by relying on immutable wrapper objects that internally make use of final fields, as shown in Listing 6:
Listing 6. Another form of
hashCode()
@Immutable
public class Rational {
private final BigInteger n;
private final BigInteger d;
private Rational red; // the cached reduced representation of this
private Integer hash; // the cached hashCode()
// autoboxing and unboxing as in Java 5
public int hashCode() {
Rational thiz = getReduced();
Integer h = thiz.hash;
if (h == null) {
h = thiz.n.hashCode() ^ thiz.d.hashCode();
thiz.hash = h;
}
return h;
}
...
}
|
Now hashCode() can reliably distinguish if hash has been computed or not. The additional null value helps here, but it costs one object more. Provided BigInteger.hashCode() distributes its results over the 232 (4 billion or so) different int values reasonably well, the probability of a computed hashCode() being 0 is relatively small. The solution with the primitive hash field is therefore more efficient and space-savvy. The most obvious Rational for which hashCode() = 0 is 1/1, so no performance bottleneck occurs here if this hash is recomputed repeatedly. So, it makes sense to adopt wrapper objects only if the computation is expensive and if it returns the default value with high probability.
As you've seen, the computed values of the write-once fields rho and phi of the Point class, and red and hash of the Rational class, functionally depend only on the immutable state of their instances. So they are logically immutable in turn. However, they don't need to be stored in final fields and can be computed lazily.
They illustrate a general principle, valid for values that functionally depend only on the immutable state. By careful programming, you can cache the computed values in normal fields without compromising the class's immutability. If such a value rarely or never takes the default value (null, 0, false) for its type, a normal field of the same type can directly cache it. Otherwise, a normal field that references an immutable wrapper object (Integer, Double, Rational itself, and so on) will do the job, because you can always test if the computed value has been assigned to the field or not. (But values of type long and double always need to be wrapped; see Resources for details.)
The author would like to express many thanks to Brian Goetz and Jeremy Manson for clarifying some points about the JMM. Brian also reviewed and appropriately commented on earlier versions of this article and proposed the "write-once" term.
Learn
-
Java Concurrency in Practice (Brian Goetz et al., Addison Wesley Professional, 2006): Visit the Web site for this comprehensive book on Java concurrency and explore the online supplements.
-
Concurrent Programming in Java: Design Principles and Patterns, 2nd ed. (Doug Lea, Addison Wesley Professional, 2000): Another good concurrency book with supplementary online materials.
-
"Concurrency in JDK 5.0" (Brian Goetz, developerWorks, November 2004): This tutorial explains the major new support for developing concurrent applications added in JDK 5.0.
-
The Java Language Specification, Third Edition: Read about the Java Memory Model, in particular the section 17.5 Final Field Semantics.
-
"Double-checked locking and the Singleton pattern" (Peter Haggar, developerWorks, May 2002): An examination of the roots of the double-checked locking idiom, why it was developed, and why it doesn't work.
-
BigInteger: Check out theBigIntegerentry in the Java Bug Database. -
The Java technology zone: Hundreds of articles about every aspect of Java programming.
Discuss
-
Check out developerWorks
blogs and get involved in the
developerWorks community.





