Why does a thread holding a lock need an acquire fence to observe its own plain write in the lock before it releases the lock? [closed]


I'm implementing flat combining (original paper).

When the combiner applies a node

    class Node {
        volatile Function<T, R> action; // Used through its VarHandle
        volatile Node next;
        R item; // plain field

        public void spItem(R item) { this.item = item; }
        public R lpItem() { return item; }
        void soAction(Function<T, R> action) { ACTION.setRelease(this, action); }
        boolean isApplied() { return ACTION.getAcquire(this) == null; }
    }

, it does:

    // In a lock
    Node curr;
    curr.spItem(a.apply(item)); // plain write to result
    curr.soAction(null); // release signals completion of the write

Non-combiners do an acquire on isApplied() before reading item, which makes sense: they need to synchronize with the combiner's release.
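For reference, here is a minimal sketch of that handoff as I understand it. It is not my real code: `HandoffSketch`, `demo`, and the `x -> x + 1` action are made up for illustration, and a second thread stands in for the combiner. The waiter only reads the plain `item` after `getAcquire` has observed the `null` stored by `setRelease`.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.function.Function;

// Hypothetical stand-in for the real structure; only the handoff is modeled.
class HandoffSketch {
    static final class Node<T, R> {
        private static final VarHandle ACTION;
        static {
            try {
                ACTION = MethodHandles.lookup()
                        .findVarHandle(Node.class, "action", Function.class);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        volatile Function<T, R> action; // accessed through ACTION
        R item;                         // plain field

        void spItem(R item) { this.item = item; }
        R lpItem() { return item; }
        void soAction(Function<T, R> action) { ACTION.setRelease(this, action); }
        boolean isApplied() { return ACTION.getAcquire(this) == null; }
    }

    static int demo() {
        Node<Integer, Integer> node = new Node<>();
        node.soAction(x -> x + 1); // publish a pending request

        // Stand-in for the combiner: plain write, then release store of null.
        Thread combiner = new Thread(() -> {
            Function<Integer, Integer> a = node.action;
            node.spItem(a.apply(41));
            node.soAction(null);
        });
        combiner.start();

        // Non-combiner: the acquire load pairs with the release store above,
        // so the plain read that follows sees the written item.
        while (!node.isApplied()) {
            Thread.onSpinWait();
        }
        return node.lpItem();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

The release/acquire pair is what orders the plain write before the plain read across the two threads; without it the waiter would have no such guarantee.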

What I don't get is why the combiner also needs that acquire fence when checking its own node:

    // Still inside the lock, after the combine loop
    return ours.lpItem(); // plain read, why does this return a stale value from earlier?

If I drop the acquire on isApplied(), only the combining path returns a wrong value; non-combiners are fine.

My current reasoning is that the combiner applies its own node during the scan with a plain write to item, and since it's the same thread reading it back, I'd expect program order to guarantee visibility without any sort of fence. Is there anything I am missing here?
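To make that expectation concrete, here is a small sketch of what I mean by program order. `ProgramOrderSketch`, `Box`, and `mismatches` are hypothetical names, and the sketch assumes no other thread writes the field: a thread that does a plain write and then a plain read of the same field should always see its own write.

```java
// Hypothetical illustration of same-thread read-your-own-write.
class ProgramOrderSketch {
    static final class Box {
        int item; // plain field, no volatile and no VarHandle
    }

    // Counts how often a plain read misses the plain write that
    // the same thread made immediately before it.
    static int mismatches(int rounds) {
        Box box = new Box();
        int bad = 0;
        for (int i = 1; i <= rounds; i++) {
            box.item = i;        // plain write
            if (box.item != i) { // plain read by the same thread
                bad++;
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        System.out.println(mismatches(1_000_000)); // prints 0
    }
}
```

So if the read-back is wrong on the combining path, one possibility under this reasoning is that the write I am assuming never actually ran for that node.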

Note: this only manifests under high concurrency (32+ threads in my JMH benchmark). Under lower thread counts the combiner always reads back the correct value.

It turns out this was a correctness bug, not a visibility issue. The combiner's node could land past the maxCombinePass cutoff if enough threads prepend before the combiner scans, so the combiner's node never got applied at all.
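A single-threaded sketch of how that happens, under some assumptions: `CutoffSketch`, `combine`, and `ownNodeApplied` are hypothetical names, the list is built by hand instead of by concurrent prepends, and the cutoff is modeled as "visit at most maxCombinePass nodes from the head", which may not match the real scan exactly.

```java
import java.util.function.Function;

// Hypothetical model of a bounded combine scan over a prepend-only list.
class CutoffSketch {
    static final class Node {
        Function<Integer, Integer> action; // null once applied
        Node next;
        Integer item;

        Node(Function<Integer, Integer> action, Node next) {
            this.action = action;
            this.next = next;
        }
    }

    // Applies pending actions, visiting at most maxCombinePass nodes from head.
    static void combine(Node head, int state, int maxCombinePass) {
        int visited = 0;
        for (Node curr = head;
             curr != null && visited < maxCombinePass;
             curr = curr.next, visited++) {
            if (curr.action != null) {
                curr.item = curr.action.apply(state);
                curr.action = null;
            }
        }
    }

    // Builds a list where `prepended` nodes sit in front of the combiner's
    // own node, scans it, and reports whether the own node was applied.
    static boolean ownNodeApplied(int prepended, int maxCombinePass) {
        Node ours = new Node(x -> x + 1, null);
        Node head = ours;
        for (int i = 0; i < prepended; i++) {
            head = new Node(x -> x + 1, head); // newer requests go in front
        }
        combine(head, 41, maxCombinePass);
        return ours.action == null;
    }

    public static void main(String[] args) {
        System.out.println(ownNodeApplied(3, 4)); // prints true
        System.out.println(ownNodeApplied(4, 4)); // prints false
    }
}
```

With four nodes prepended and a cutoff of four, the scan stops just before the combiner's own node, so `item` is never written and the read-back returns whatever was there before. This would also fit the bug only showing up at high thread counts, where enough prepends can pile up ahead of the scan.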