ForkJoinPool and RecursiveTask lead to deadlock when join is called in a different thread than fork

6 days ago 7
ARTICLE AD BOX

The program below never ends:

void main() { try (Fibonacci fibonacci = new Fibonacci()) { IO.println(fibonacci.calculate(9)); } } public static class Fibonacci implements AutoCloseable { private final ConcurrentHashMap<Integer, FibonacciTask> cache; private final ForkJoinPool pool; public Fibonacci() { cache = new ConcurrentHashMap<>(); pool = new ForkJoinPool(2); } public long calculate(int n) { return pool.invoke(new FibonacciTask(n)); } private class FibonacciTask extends RecursiveTask<Long> { private final int n; private FibonacciTask(int n) { this.n = n; } @Override protected Long compute() { IO.println("[%d] Start".formatted(n)); if (n < 3) { IO.println("[%d] End".formatted(n)); return 1L; } FibonacciTask previousValue = cache.putIfAbsent(n, this); if (previousValue != null) { IO.println("[%d] Wait".formatted(n)); long cachedValue = previousValue.join(); // <-- PROBLEMATIC LINE IO.println("[%d] Got value".formatted(n)); return cachedValue; } FibonacciTask a = new FibonacciTask(n - 2); FibonacciTask b = new FibonacciTask(n - 1); a.fork(); b.fork(); long value = a.join() + b.join(); IO.println("[%d] End".formatted(n)); return value; } } @Override public void close() { if (pool != null) { pool.close(); } } }

Here is example log:

[9] Start [7] Start [8] Start [5] Start [6] Start [3] Start [4] Start [2] Start [1] Start [2] End [1] End [3] Start [2] Start [3] Wait [2] End [3] End [4] Start [3] Got value [4] Wait [4] End [7] Start [5] Start [7] Wait [5] Wait [6] Start [6] Wait

The problematic line is here:

long cachedValue = previousValue.join(); // <-- PROBLEMATIC LINE

In my logs you can see:

... [4] Start ... [4] Wait [4] End ...

but there is no [4] Got value. It means that FibonacciTask(4) was calculated but this line:

long cachedValue = previousValue.join(); // <-- PROBLEMATIC LINE

is still waiting.

If I change line:

pool = new ForkJoinPool(2);

to:

pool = new ForkJoinPool(1);

the deadlock disappears.

My only clue is that I shouldn't call join in different thread than fork, but I didn't see in the documentation that this is forbidden.

Read Entire Article