Level: Introductory Eric Allen (eallen@cyc.com), Lead Java Developer, Cycorp, Inc.
01 May 2001 Many algorithms are expressed most concisely as tail-recursive methods. Compilers can automatically transform such methods into loops and thereby improve program performance, but this transformation is not required by the Java language specification, so not all JVMs will perform it. This means that tail-recursive methods in the Java language can result in unexpectedly large memory usage. In this article, Eric Allen demonstrates that dynamic compilation maintains the language's semantics while static compilation often doesn't. He shows why this matters and offers a bit of code to help you determine whether your just-in-time (JIT) compiler can transform tail recursion on code while preserving semantics.
Tail recursion and transformations
Quite a few programs contain loops that account for a substantial portion of their total processing time. These loops will often iteratively update more than one variable, and the updates for each variable will often depend upon the values of the others.
When iterations are represented as tail-recursive functions, these
variables will be represented as parameters to the functions. A quick reminder: a
recursive call is tail recursive if the return value of that call is
immediately returned as the value of the calling function; it isn't necessary
to remember the context of the calling function when invoking the call.
Because of this attribute, there is a nice correspondence between tail-recursive
functions and loops: each recursive call can be thought of simply as one more
iteration through a loop. But because the variable parameter values are
sent to the recursive call all at once, it is much easier to get the updated
values right than it is in a loop. Furthermore, awkward break statements are
often replaced by simple returns from the function.
But representing iteration this way in Java programming can be inefficient because numerous recursive calls risk overflowing the stack.
The solution is relatively simple: because these tail-recursive functions are
really just easier ways of writing loops, have the compiler automatically
transform them to loops. Then you get the best of both worlds.
But although it is well known how to automatically transform a
tail-recursive function into a simple loop, the Java specification doesn't
require that this transformation be made. Presumably, one reason it is not a
requirement is that, in general, the transformation can't be made statically
in an object-oriented language. Instead, the transformation from
tail-recursive function to simple loop must be done dynamically by a JIT
compiler.
To understand why this is the case, consider the following failed attempt to
multiply the elements of an iterator over a collection of integers.
Because the following program contains a bug, it throws an exception when run. But, as many of the earlier articles in this column have demonstrated, the precise exception that a program throws is an invaluable clue as to where the bug in the program resides (as well as an excellent identifier for the type of bug), and we wouldn't want the compiler to alter the program in such a way that the resulting code throws a different exception.
Listing 1. A failed attempt to multiply the elements of an Iterator of a collection of Integers
import java.util.Iterator;
public class Example {
public int product(Iterator i) {
return productHelp(i, 0);
}
int productHelp(Iterator i, int accumulator) {
if (i.hasNext()) {
return productHelp(i, accumulator * ((Integer)i.next()).intValue());
}
else {
return accumulator;
}
}
}
|
Notice the bug in the product method. It calls productHelp with an accumulator value of "0". It should be "1". Otherwise, a call to product on any instance of class Example will produce "0", regardless of the Iterator passed to it.
Suppose that this bug is eventually fixed, but, in the meantime, a subclass of class Example has been created, as shown in Listing 2: Listing 2. Attempts to catch Listing 1-type improper calls
import java.util.*;
class Example {
public int product(Iterator i) {
return productHelp(i, 1);
}
int productHelp(Iterator i, int accumulator) {
if (i.hasNext()) {
return productHelp(i, accumulator * ((Integer)i.next()).intValue());
}
else {
return accumulator;
}
}
}
// And, in a separate file:
import java.util.*;
public class Example2 extends Example {
int productHelp(Iterator i, int accumulator) {
if (accumulator < 1) {
throw new RuntimeException("accumulator to productHelp must be >= 1");
}
else {
return super.productHelp(i, accumulator);
}
}
public static void main(String[] args) {
LinkedList l = new LinkedList();
l.add(new Integer(0));
new Example2().product(l.listIterator());
}
}
|
The overridden productHelp method in class Example2 tries to catch such improper calls to productHelp by throwing a run-time exception if the accumulator is less than "1". Unfortunately, in doing so, it introduces a new bug. If the Iterator contains any instances of "0", then productHelp will crash on its own recursive call.
Now notice that, in the main method of class Example2, an instance of Example2 is created and its product method is called. Because the Iterator passed to this method contains a "0", the program should crash.
However, you can see that productHelp in class Example is strictly tail recursive. Suppose a static compiler were to convert the body of this method to a loop, as shown in Listing 3:
Listing 3. An example: Static compilation can't optimize the tail call
int productHelp(Iterator i, int accumulator) {
while (i.hasNext()) {
accumulator *= ((Integer)i.next()).intValue();
}
return accumulator;
}
|
Then the initial call to productHelp would result in a call to the super method. The super method would simply loop through the iterator to compute its result. No exception would be thrown.
Imagine how confusing it would be to compile this code with two different static compilers, only to find that the resulting code threw an exception in one case and not in the other.
Will your JIT do the job?
So, as the example in Listing 3 shows, we cannot expect static compilers to perform transformation of tail recursion on Java code while preserving the semantics of the language. Instead, we must rely on dynamic compilation by the JIT. Depending on the JVM, the JIT may or
may not do this.
One way to check whether your JIT transforms your tail-recursive methods is to compile and run the following small test class:
Listing 4. Determining whether your JIT transforms tail recursion
public class TailRecursionTest {
private static int loop(int i) {
return loop(i);
}
public static void main(String[] args) {
loop(0);
}
}
|
Let's consider for a moment the loop method in this class. This method will simply call itself recursively for as long as it can. Because it will never return, and it doesn't affect any outside variables in any way, replacing its body as shown in Listing 5 preserves the semantics of the program:
Listing 5. A dynamic transformation
public class TailRecursionTest {
private static int loop(int i) {
while (true) {
}
}
public static void main(String[] args) {
loop(0);
}
}
|
And, in fact, this is exactly the transformation that a sufficiently sophisticated compiler would make.
If your JIT compiler converts tail-recursive calls to iteration, this program will continue to run indefinitely. Its memory requirements are small and they don't grow with time.
On the other hand, if the JIT doesn't make this transformation, the program will quickly exhaust the stack space and report a stack overflow error.
I ran this program with a couple of the Java SDKs, and the results were surprising. Running on Sun's Hotspot JVM for version 1.3 reveals that Hotspot doesn't perform the transformation. At default settings, the stack space is exhausted in less than a second on my machine.
On the other hand, IBM's JVM for version 1.3 purrs along without a problem, indicating that it does transform the code in this way.
Wrapup
Remember: we can't rely on our code to always be running on a JVM that will transform tail-recursive calls. Therefore, to guarantee reasonable performance across JVMs, you should always try to write code that most naturally fits a tail-recursive pattern in an iterative style instead.
But be careful. As our example demonstrates, it is very easy to introduce bugs by transforming code in this way, whether it is altered by human hands or by software.
Resources
About the author  | |  | Eric Allen has an A.B. in computer science and mathematics from Cornell University. He is currently the lead Java software developer at Cycorp, Inc., the deputy manager of the Cycorp programming department, and a moderator for the Java Beginner discussion forum at
JavaWorld
magazine. He is also a part-time graduate student on the programming languages team at Rice University. His research concerns the development of formal semantic models and extensions of the Java language, both at the source and bytecode levels. Currently, Eric is implementing a source-to-bytecode compiler for the NextGen programming language, an extension of the Java language with generic run-time types. Contact Eric at eallen@cyc.com. |
Rate this page
|