Level: Advanced
Brian Goetz (brian@quiotix.com), Software Consultant, Quiotix Corp.
16 Oct 2001

The ThreadLocal class appeared with little fanfare in version 1.2 of the Java platform. While support for thread-local variables has long been a part of many threading facilities, such as the Posix pthreads facility, the initial design of the Java Threads API lacked this useful feature. Further, the initial implementation was quite inefficient. For these reasons, ThreadLocal gets relatively little attention, but it can be very handy for simplifying the development of thread-safe concurrent programs. In this third installment of Threading lightly, Java software consultant Brian Goetz examines ThreadLocal and offers tips for exploiting its power.
Writing thread-safe classes is difficult. It requires a careful
analysis of not only the conditions under which variables will be read
or written, but also of how the class might be used by other classes.
Sometimes, it is very difficult to make a class thread-safe without compromising its functionality, ease of use, or performance. Some classes retain state information from one method invocation to the next, and it is difficult to make such classes
thread-safe in any practical way.

It may be easier to manage the use of a non-thread-safe class
than to try to make the class thread-safe. A class that is not
thread-safe can often be used safely in a multithreaded program as
long as you ensure that instances of that class used by one thread are
not used by other threads. For example, the JDBC
Connection class is not thread-safe -- two threads cannot
safely share a Connection at a fine level of granularity
-- but if each thread has its own Connection, then
multiple threads can safely perform database operations
simultaneously.

It is certainly possible to maintain a separate JDBC connection (or
any other object) for each thread without the use of
ThreadLocal; the Thread API gives us all the tools we
need to associate objects with threads. However, the
ThreadLocal class makes it much easier for us to manage
the process of associating a thread with its per-thread data.

What is a thread-local variable?
A thread-local variable effectively provides a separate copy of
its value for each thread that uses it. Each thread can see only the
value associated with that thread, and is unaware that other threads
may be using or modifying their own copies.

Some compilers (such as the Microsoft Visual C++ compiler or the IBM XL FORTRAN compiler) have incorporated support for
thread-local variables into the language using a storage-class modifier
(like static or volatile). Java compilers
offer no special language support for thread-local variables; instead,
they are implemented with the ThreadLocal class, which has
special support in the core Thread class.

Because thread-local variables are implemented through a class,
rather than as part of the Java language itself, the syntax for using
thread-local variables is a bit more clumsy than for language dialects
where thread-local variables are built in. To create a thread-local
variable, you instantiate an object of class ThreadLocal.
The ThreadLocal class behaves much like the various
Reference classes in java.lang.ref; it acts
as an indirect handle for storing or retrieving a value. Listing 1
shows the ThreadLocal interface.

Listing 1. The ThreadLocal interface
public class ThreadLocal {
    public Object get();
    public void set(Object newValue);
    protected Object initialValue();
}
The get() accessor retrieves the current thread's
value of the variable; the set() accessor modifies the
current thread's value. The initialValue() method is an
optional method that lets you set the initial value of the variable if
it has not yet been used in this thread; it allows for a form of lazy
initialization. How ThreadLocal behaves is best
illustrated by an example implementation. Listing 2 shows one way to
implement ThreadLocal. It isn't a particularly good implementation (although it is quite similar to the initial implementation), as it would likely perform poorly, but it illustrates clearly how ThreadLocal behaves.

Listing 2. Bad implementation of ThreadLocal
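import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ThreadLocal {
    // One shared map for all threads, keyed by Thread; every access synchronizes on it.
    private Map values = Collections.synchronizedMap(new HashMap());
    public Object get() {
        Thread curThread = Thread.currentThread();
        Object o = values.get(curThread);
        // A null result may mean "no entry yet" or "entry explicitly set to null".
        if (o == null && !values.containsKey(curThread)) {
            o = initialValue();
            values.put(curThread, o);
        }
        return o;
    }
    public void set(Object newValue) {
        values.put(Thread.currentThread(), newValue);
    }
    public Object initialValue() {
        return null;
    }
}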
This implementation will perform poorly because it requires
synchronization on the values map for each
get() and set() operation, and if multiple
threads are accessing the same ThreadLocal at once, there
will be contention. Additionally, this implementation is impractical
because using Thread objects as the key in the
values map will prevent the Thread from
being garbage collected after the thread exits, and the
thread-specific values of the ThreadLocal for deceased
threads will also not be garbage collected.
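To make the per-thread behavior described in this section concrete, here is a minimal sketch that uses the real java.lang.ThreadLocal. The class and method names (PerThreadCounterDemo, increment, countOnNewThread) are invented for illustration, and the generic and lambda syntax assumes Java 8 or later, which postdates the listings in this article.

```java
final class PerThreadCounterDemo {
    // Each thread that touches COUNTER gets its own Integer, starting at 0.
    private static final ThreadLocal<Integer> COUNTER = new ThreadLocal<Integer>() {
        @Override
        protected Integer initialValue() {
            return 0;
        }
    };

    /** Increments the calling thread's copy and returns the new value. */
    static int increment() {
        int next = COUNTER.get() + 1;
        COUNTER.set(next);
        return next;
    }

    /** Runs the given number of increments on a fresh thread and returns that thread's final count. */
    static int countOnNewThread(int increments) {
        int[] result = new int[1];
        Thread worker = new Thread(() -> {
            for (int i = 0; i < increments; i++) {
                result[0] = increment();
            }
        });
        worker.start();
        try {
            worker.join();   // join() makes the worker's write to result visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        increment();                               // main thread's copy is now 1
        System.out.println(countOnNewThread(3));   // prints 3: the worker started from 0
        System.out.println(increment());           // prints 2: main's copy was untouched
    }
}
```

The worker never sees the main thread's count, and the main thread's count is unaffected by the worker, even though both go through the same static COUNTER field.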
Using ThreadLocal to implement a per-thread Singleton
Thread-local variables are commonly used to render stateful Singleton
or shared objects thread-safe, either by encapsulating the entire
unsafe object in a ThreadLocal or by encapsulating the
object's thread-specific state in a ThreadLocal. For
example, in an application that is tightly tied to a
database, many methods may need to access the database. It could be
inconvenient to include a Connection as an argument to
every method in the system -- a sloppier, but significantly more
convenient technique would be to access the connection with a
Singleton. However, multiple threads cannot safely share a JDBC
Connection. By using a ThreadLocal in our
Singleton, as shown in Listing 3, we can allow any class in our
program to easily acquire a reference to a per-thread
Connection. In this way, we can think of a
ThreadLocal as allowing us to create a
per-thread-singleton.

Listing 3. Storing a JDBC Connection in a per-thread Singleton
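import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class ConnectionDispenser {
    private static class ThreadLocalConnection extends ThreadLocal {
        // Runs on a thread's first call to get(), giving that thread its own Connection.
        public Object initialValue() {
            try {
                return DriverManager.getConnection(ConfigurationSingleton.getDbUrl());
            } catch (SQLException e) {
                // initialValue() cannot declare a checked exception, so rethrow unchecked.
                throw new RuntimeException(e);
            }
        }
    }
    private static ThreadLocalConnection conn = new ThreadLocalConnection();
    public static Connection getConnection() {
        return (Connection) conn.get();
    }
}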
Any stateful or non-thread-safe object that is relatively more
expensive to create than to use, such as a JDBC
Connection or a regular-expression matcher, is a good
candidate for the per-thread-singleton technique. Of
course, for situations like this, you can use other approaches, like
pooling, for safely managing shared access. However, even pooling has
some potential drawbacks from a scalability perspective. Because pool
implementations must synchronize to maintain the integrity of the pool
data structures, if all threads are using the same pool, program
performance may suffer due to contention in a system with many threads
accessing the pool frequently.
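The same per-thread-singleton idea applies to the regular-expression matcher mentioned above. The following is a sketch only: PerThreadMatcher and firstNumber are hypothetical names, and it relies on java.util.regex (added in version 1.4) and ThreadLocal.withInitial (added in Java 8), neither of which the listings in this article use.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PerThreadMatcher {
    // Pattern is immutable and safe to share; Matcher holds match state and is not.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // One Matcher per thread, created lazily on that thread's first get().
    private static final ThreadLocal<Matcher> MATCHER =
        ThreadLocal.withInitial(() -> DIGITS.matcher(""));

    /** Returns the first run of digits in the input, or null if there is none. */
    static String firstNumber(CharSequence input) {
        Matcher m = MATCHER.get().reset(input);   // reuse this thread's Matcher
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("order 4711 shipped"));   // prints 4711
    }
}
```

Callers on any thread can use firstNumber() without synchronizing, because no Matcher instance is ever visible to more than one thread.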
Using ThreadLocal to simplify debug logging
Other applications for ThreadLocal in which pooling would
not be a useful alternative include storing or accumulating per-thread
context information for later retrieval. For example, suppose you
wanted to create a facility for managing debugging information in a
multithreaded application. You could accumulate debugging information
in a thread-local container as shown by the DebugLogger
class in Listing 4. At the beginning of a unit of work, you empty the
container, and when an error occurs, you query the container to
retrieve all the debugging information that has been generated so far
by this unit of work.

Listing 4. Using ThreadLocal for managing a per-thread debugging log
import java.util.ArrayList;
import java.util.List;

public class DebugLogger {
    private static class ThreadLocalList extends ThreadLocal {
        // Each thread gets its own empty list on first use.
        public Object initialValue() {
            return new ArrayList();
        }
        public List getList() {
            return (List) super.get();
        }
    }
    private ThreadLocalList list = new ThreadLocalList();
    private static String[] stringArray = new String[0];
    public void clear() {
        list.getList().clear();
    }
    public void put(String text) {
        list.getList().add(text);
    }
    public String[] get() {
        // The raw List returns Object[] here, so the cast is required.
        return (String[]) list.getList().toArray(stringArray);
    }
}
Throughout your code, you can call put() on a shared DebugLogger instance,
saving information about what your program is doing, and you can
easily retrieve the debugging information relevant to a particular
thread later when necessary (such as when an error has occurred).
This technique is a lot more convenient and efficient than simply
dumping everything to a log file and then trying to sort out which log
records come from which thread (and worrying about contention for the
logging object between threads).
ThreadLocal is also useful in servlet-based
applications or any multithreaded server application in which the unit
of work is an entire request, because then a single thread will be
used during the entire course of handling the request. You can use
ThreadLocal variables to store any sort of per-request
context information using the per-thread-singleton technique described
earlier.
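As a rough illustration of the unit-of-work pattern described in this section, the sketch below clears a per-thread buffer when a request starts and reports its contents only if the request fails. Everything here is hypothetical (RequestLogDemo, handle, debug, and the use of Integer.parseInt as stand-in "work"), and the syntax assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;

final class RequestLogDemo {
    // Per-thread buffer of debug messages for the unit of work in progress.
    private static final ThreadLocal<List<String>> LOG =
        ThreadLocal.withInitial(() -> new ArrayList<String>());

    static void debug(String message) {
        LOG.get().add(message);
    }

    /**
     * Handles one hypothetical request: clears the buffer, does the work, and
     * returns the buffered messages only if the work failed.
     */
    static List<String> handle(String request) {
        LOG.get().clear();                              // start of a new unit of work
        try {
            debug("received " + request);
            Integer.parseInt(request);                  // stand-in for real work
            debug("parsed " + request);
            return new ArrayList<String>();             // success: nothing to report
        } catch (NumberFormatException e) {
            debug("failed: " + e.getMessage());
            return new ArrayList<String>(LOG.get());    // snapshot for the error report
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("42"));     // prints []
        System.out.println(handle("oops"));   // two entries: "received oops" and the failure
    }
}
```

Because the buffer is thread-local, requests handled concurrently on other threads cannot interleave their messages with this one, and no locking is needed around debug().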
ThreadLocal's less thread-safe cousin, InheritableThreadLocal
The ThreadLocal class has a relative,
InheritableThreadLocal, which functions in a similar
manner, but is suitable for an entirely different sort of application.
When a thread creates a child thread, any values the parent holds for
InheritableThreadLocal objects are
automatically passed on to the child thread as well. If a child
thread calls get() on an
InheritableThreadLocal, it sees the same object as the
parent would. To preserve thread-safety, you should use
InheritableThreadLocal only for immutable objects
(objects whose state will not ever be changed once created), because the
object is shared between multiple
threads. InheritableThreadLocal is useful for passing
data from a parent thread to a child thread, such as a user id, or a
transaction id, but not for stateful objects like JDBC
Connections.
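The inheritance behavior can be sketched as follows, using an immutable String as the inherited value, as recommended above. InheritedUserIdDemo, seenByChild, and current are hypothetical names chosen for this example, and the lambda syntax assumes Java 8 or later.

```java
final class InheritedUserIdDemo {
    // Holds an immutable String, so sharing the reference with child threads is safe.
    private static final InheritableThreadLocal<String> USER_ID =
        new InheritableThreadLocal<String>();

    /** Sets USER_ID in the calling thread, then reports what a newly created child thread sees. */
    static String seenByChild(String parentValue) {
        USER_ID.set(parentValue);
        String[] seen = new String[1];
        Thread child = new Thread(() -> {
            seen[0] = USER_ID.get();             // value inherited when the child was created
            USER_ID.set("changed-in-child");     // rebinding affects only the child's copy
        });
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for child", e);
        }
        return seen[0];
    }

    /** Returns the calling thread's current value. */
    static String current() {
        return USER_ID.get();
    }

    public static void main(String[] args) {
        System.out.println(seenByChild("alice"));   // prints alice
        System.out.println(current());              // prints alice: the child's set() did not leak back
    }
}
```

Note that the child rebinds the variable to a different String rather than mutating a shared object; mutating an inherited mutable object would be visible to the parent, which is why immutable values are the safe choice here.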
ThreadLocal performance
While the concept of a thread-local variable has been around for a
long time and is supported by many threading frameworks including
the Posix pthreads specification, thread-local support
was omitted from the initial Java Threads design and only added in
version 1.2 of the Java platform. In many ways,
ThreadLocal is still a work in progress; it was rewritten
for version 1.3 and again for version 1.4, both times to address
performance problems.

In JDK 1.2, ThreadLocal was implemented in a manner
very similar to Listing 2, except that a synchronized
WeakHashMap was used to store the values instead of a
HashMap. (Using WeakHashMap solves the
problem of Thread objects not getting garbage collected,
at some additional performance cost.) Needless to say, the
performance of ThreadLocal was quite poor.

The version of ThreadLocal provided with version 1.3
of the Java platform is substantially better; it does not use any
synchronization and so does not present a scalability problem, and it
does not use weak references either. Instead, the Thread
class was modified to support ThreadLocal by adding an
instance variable to Thread that holds a
HashMap mapping thread-local variables to their values
for the current thread. Because the process of retrieving or setting
a thread-local variable does not involve reading or writing data that
might be read or written by another thread, you can implement
ThreadLocal.get() and set() without any
synchronization. Also, because the references to the per-thread
values are stored in the owning Thread object, when the
Thread gets garbage collected, so can its per-thread
values.

Unfortunately, even with these improvements, the performance of
ThreadLocal under Java 1.3 is still surprisingly slow.
My rough benchmarks running the Sun 1.3 JDK on a two-processor Linux
system show that a ThreadLocal.get() operation takes
about twice as long as an uncontended synchronization. The reason for
this poor performance is that the Thread.currentThread()
method is quite expensive, accounting for more than two-thirds of the
ThreadLocal.get() run time. Even with these weaknesses,
the JDK 1.3
ThreadLocal.get() is still much faster than a contended
synchronization, so if there is any significant chance of contention
at all (perhaps there is a large number of threads, or the
synchronized block is executed frequently, or the synchronized block
is large), ThreadLocal may still be more efficient
overall.

Under the newest version of the Java platform, version 1.4b2, performance of
ThreadLocal and Thread.currentThread() has
been improved significantly. With these new improvements,
ThreadLocal should be faster than other techniques such
as pooling. Because it is simpler and often less error-prone than
those other techniques, it will eventually be discovered as
an effective way to prevent undesired interactions between
threads.
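The JDK 1.3 approach described in this section, where each Thread carries its own map from thread-local variables to values, can be imitated outside the JDK to show why no synchronization is needed. This is an illustrative sketch, not the JDK source: LocalsThread, SimpleLocal, and SimpleLocalDemo are hypothetical classes, and because the real Thread class cannot be modified, the sketch works only on threads that are instances of LocalsThread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical Thread subclass that carries its own variable-to-value map. */
class LocalsThread extends Thread {
    // Read and written only by this thread, so it needs no synchronization.
    final Map<SimpleLocal, Object> locals = new HashMap<SimpleLocal, Object>();

    LocalsThread(Runnable body) {
        super(body);
    }
}

/** Hypothetical thread-local variable whose values live in the current LocalsThread. */
class SimpleLocal {
    private static Map<SimpleLocal, Object> currentLocals() {
        Thread current = Thread.currentThread();
        if (!(current instanceof LocalsThread)) {
            throw new IllegalStateException("this sketch only works on a LocalsThread");
        }
        return ((LocalsThread) current).locals;
    }

    public Object get() {
        Map<SimpleLocal, Object> locals = currentLocals();
        if (!locals.containsKey(this)) {
            locals.put(this, initialValue());   // lazy per-thread initialization
        }
        return locals.get(this);
    }

    public void set(Object newValue) {
        currentLocals().put(this, newValue);
    }

    protected Object initialValue() {
        return null;
    }
}

final class SimpleLocalDemo {
    /** Two threads set and read back the same variable; returns what each one saw. */
    static Object[] readBack(Object first, Object second) {
        SimpleLocal local = new SimpleLocal();
        Object[] seen = new Object[2];
        Thread a = new LocalsThread(() -> { local.set(first); seen[0] = local.get(); });
        Thread b = new LocalsThread(() -> { local.set(second); seen[1] = local.get(); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(readBack("a", "b")));   // prints [a, b]
    }
}
```

Compared with Listing 2, the map has moved from the variable into the thread: each map is reachable only through its owning thread, so it becomes garbage along with that thread and is never contended.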
The benefits of ThreadLocal
ThreadLocal offers a number of benefits. It is often the
easiest way to render a stateful class thread-safe, or to encapsulate
non-thread-safe classes so that they can safely be used in
multithreaded environments. Using ThreadLocal allows us
to bypass the complexity of determining when to synchronize in order
to achieve thread-safety, and it improves scalability because it
doesn't require any synchronization. In addition to simplicity, using
ThreadLocal to store a per-thread-singleton or per-thread
context information has a valuable documentation perk -- by using a
ThreadLocal, it's clear that the object stored in the
ThreadLocal is not shared between threads,
simplifying the task of determining whether a class is thread-safe or
not.
I hope you've enjoyed and learned from this series, and I encourage
you to follow up on your multithreading issues in my discussion forum.
About the author
Brian Goetz is a software consultant and has been a professional software developer for the past 15 years. He is a Principal Consultant at Quiotix, a software development and consulting firm located in Los Altos, CA. See a list of Brian's published and upcoming articles in popular industry publications.