Week 16 - Lock-Free Patterns, VarHandle Memory Modes, and jcstress

Conceptual Core

You almost never need lock-free code. When you do - high-contention counters, single-producer single-consumer queues, latency-critical paths where the cost of acquiring a mutex is intolerable - VarHandle's memory-ordering modes are the right tool, and jcstress is the standard way to test the result empirically.

The decision rule: if synchronized or ReentrantLock gives you the latency and throughput you need, stop. Lock-free code is one to two orders of magnitude harder to get right than locked code. The few legitimate reasons to write it: documented hot-path contention (>100k ops/sec/thread on the same data), latency tail caused by lock convoy, or memory-ordering needs that the lock primitives don't express.

Mechanical Detail

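  • VarHandle modes, from weakest to strongest:

    • Plain - no ordering guarantees; access is atomic only for references and for primitives other than long and double (which may tear).
    • Opaque - coherent per variable (accesses to the same variable are not reordered with each other, and writes eventually become visible); no ordering relative to other variables.
    • Acquire / Release - one-way ordering: an acquire-load that observes a release-store also sees every write made before that store.
    • Volatile - Acquire/Release plus a single total order over all Volatile-mode accesses (sequential consistency).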

    These map roughly onto C++ atomics: Opaque ≈ memory_order_relaxed, Acquire/Release ≈ memory_order_acquire/memory_order_release, Volatile ≈ memory_order_seq_cst. Plain corresponds to an ordinary non-atomic access. Use the weakest mode under which you can prove correctness.

  • The classical patterns: lock-free SPSC queue (single-producer single-consumer - LMAX Disruptor is the famous example), Treiber stack (CAS on head), Michael-Scott queue (lock-free FIFO), hazard pointers (safe memory reclamation), RCU-style epoch reclamation.
  • LongAdder vs AtomicLong: under contention, AtomicLong thrashes the cache line; LongAdder is striped across multiple cells, summed on read. LongAdder is the right choice for high-contention counters (request counts, metrics) where the sum is read rarely and does not need to be an atomic snapshot.
  • jcstress (org.openjdk.jcstress) is Shipilëv's harness: declare small concurrent fragments with @JCStressTest, annotate expected outcomes (@Outcome(id = "1, 1", expect = ACCEPTABLE)), and the harness runs them many millions of times across different JVM configurations, tallying every outcome it observes. It is an empirical stress harness, not an exhaustive model checker: a forbidden outcome that shows up proves a bug, while a clean run only raises confidence. The unit-test framework for the JMM.
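The Acquire/Release pairing described above can be sketched as a one-shot message hand-off. This is a minimal illustration, not a production queue: the class name `ReleaseAcquireMailbox` and its methods are hypothetical, and the busy-wait exists only to keep the example short.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical example class: a one-shot mailbox published with Release/Acquire.
class ReleaseAcquireMailbox {
    private int payload;    // plain field, published through the 'ready' flag
    private boolean ready;  // accessed only through the READY VarHandle

    private static final VarHandle READY;
    static {
        try {
            READY = MethodHandles.lookup()
                    .findVarHandle(ReleaseAcquireMailbox.class, "ready", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void publish(int value) {
        payload = value;               // plain write
        READY.setRelease(this, true);  // Release: the payload write cannot sink below this store
    }

    int awaitPayload() {
        // Acquire: once this load observes 'true', the payload write is visible too.
        while (!(boolean) READY.getAcquire(this)) {
            Thread.onSpinWait();
        }
        return payload;                // plain read, ordered after the acquire-load
    }

    // Runs one producer thread and consumes on the calling thread.
    static int demo() {
        ReleaseAcquireMailbox box = new ReleaseAcquireMailbox();
        Thread producer = new Thread(() -> box.publish(42));
        producer.start();
        int seen = box.awaitPayload();
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Volatile mode would also be correct here; Release/Acquire is enough because only one flag orders one payload, and no total order across several variables is required.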

The trap

Using Plain mode and hoping it's "atomic enough." It isn't. Plain reads can return stale values forever; plain writes can be reordered with anything. Reach for Volatile or Acquire/Release; drop to Opaque only with a documented justification; never use Plain for shared state.
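A stop flag is one of the few cases where Opaque is commonly considered sufficient, because the flag publishes no other data and the reader only needs to notice the write eventually. The sketch below assumes that reading of Opaque mode; `StopFlagWorker` and its methods are hypothetical names. The comments spell out the kind of justification the rule above asks for.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical example class: a spin loop controlled by an Opaque-mode flag.
class StopFlagWorker {
    private boolean stop;  // shared; accessed only through the STOP VarHandle

    private static final VarHandle STOP;
    static {
        try {
            STOP = MethodHandles.lookup()
                    .findVarHandle(StopFlagWorker.class, "stop", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Memory mode: Opaque. Justification: the flag guards no other data,
    // so no happens-before edge is needed, only eventual visibility.
    void requestStop() {
        STOP.setOpaque(this, true);
    }

    long spinUntilStopped() {
        long spins = 0;
        // A plain `while (!stop)` read may be hoisted out of the loop by the JIT,
        // turning this into an infinite loop. The Opaque read must be redone each time.
        while (!(boolean) STOP.getOpaque(this)) {
            spins++;
            Thread.onSpinWait();
        }
        return spins;
    }

    // Starts a worker, requests a stop, and reports whether the worker exited.
    static boolean demo() {
        StopFlagWorker worker = new StopFlagWorker();
        Thread t = new Thread(worker::spinUntilStopped);
        t.setDaemon(true);  // do not keep the JVM alive if something goes wrong
        t.start();
        try {
            Thread.sleep(50);
            worker.requestStop();
            t.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(demo() ? "worker stopped" : "worker still running");
    }
}
```

If the flag ever starts guarding data (for example, "results are ready"), Opaque is no longer enough and the pair should move up to Release/Acquire.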

Lab

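Implement a Treiber stack with VarHandle.compareAndSet (single VarHandle on the head pointer). Write three jcstress tests:

  1. Linearizability - concurrent push + pop produces an ordering consistent with some serial schedule.
  2. No lost pops - every pushed element is popped exactly once.
  3. ABA exposure - under contention, a pop-then-push cycle that puts the same node back on the stack can let a stale CAS succeed and corrupt the list. In Java this requires node reuse (pooling or re-pushing a popped node), because the garbage collector will not recycle a node that another thread still references. Document the scenario even if you don't fix it (the standard fix is hazard pointers or versioned pointers; in Java, AtomicStampedReference provides the latter).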

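Run under the available jcstress -m presets (sanity, quick, default, tough, stress). These presets control how long and how aggressively the harness runs, not which memory model is in force.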
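As a starting point for the lab, here is a minimal sketch of a Treiber stack with a single VarHandle on the head, plus a plain multi-threaded smoke check. The smoke check is not a substitute for the jcstress tests; it only catches gross errors such as lost or duplicated elements. The names `TreiberStack` and `stressOnce` are illustrative. Every push allocates a fresh node, so this version does not exhibit ABA.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: a Treiber stack with one VarHandle on 'head'.
class TreiberStack<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;  // written before the publishing CAS, read after an acquire-load of head
        Node(T value) { this.value = value; }
    }

    private Node<T> head;  // accessed only through HEAD

    private static final VarHandle HEAD;
    static {
        try {
            HEAD = MethodHandles.lookup().findVarHandle(TreiberStack.class, "head", Node.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    @SuppressWarnings("unchecked")
    void push(T value) {
        Node<T> node = new Node<>(value);  // fresh node per push: no reuse, no ABA
        Node<T> current;
        do {
            current = (Node<T>) HEAD.getAcquire(this);      // memory mode: Acquire
            node.next = current;
        } while (!HEAD.compareAndSet(this, current, node)); // memory mode: Volatile CAS
    }

    @SuppressWarnings("unchecked")
    T pop() {
        Node<T> current;
        Node<T> next;
        do {
            current = (Node<T>) HEAD.getAcquire(this);      // memory mode: Acquire
            if (current == null) {
                return null;                                // empty stack
            }
            next = current.next;
        } while (!HEAD.compareAndSet(this, current, next)); // memory mode: Volatile CAS
        return current.value;
    }

    // Smoke check: each thread pushes distinct values and pops concurrently.
    // Returns true if every pushed value was popped exactly once.
    static boolean stressOnce(int threads, int perThread) {
        TreiberStack<Integer> stack = new TreiberStack<>();
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        AtomicBoolean duplicate = new AtomicBoolean();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    stack.push(base + i);
                    Integer v = stack.pop();
                    if (v != null && !seen.add(v)) duplicate.set(true);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        for (Integer v; (v = stack.pop()) != null; ) {  // drain whatever is left
            if (!seen.add(v)) duplicate.set(true);
        }
        return !duplicate.get() && seen.size() == threads * perThread;
    }

    public static void main(String[] args) {
        System.out.println(stressOnce(4, 10_000) ? "no lost or duplicated elements" : "FAILED");
    }
}
```

To expose ABA for the third test, one option is to add a variant that re-pushes the popped Node object itself instead of allocating a new one.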

Idiomatic Drill

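Read the source of ConcurrentHashMap (Doug Lea). You will not understand all of it. Understand enough - specifically, how contention is spread across the table (since Java 8, CAS into empty bins and a lock on the first node of each non-empty bin, rather than the segment striping of older versions), how resize hand-off works through forwarding nodes, and why the shared fields are either volatile or accessed through ordered primitives. Note that it is not purely lock-free: reads do not lock, but updates to a non-empty bin do. It is still the gold standard for production concurrent Java code.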

Production Hardening Slice

Add a "concurrency review" checklist to your hardening/ template:

  • Every mutable shared field is justified (why not local + return?).
  • Every lock has a documented invariant ("monitor protects cache.size <= maxSize").
  • Every volatile has a documented happens-before edge ("write in init() happens-before reads in process()").
  • Every Future / CompletableFuture chain has a defined cancellation path (no orphan continuations on the common pool).
  • Every VarHandle use names its memory mode in a comment.
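To make the checklist concrete, here is a small sketch of what the lock and volatile items might look like as comments in code. The class `ReviewedCache`, its fields, and the comment format are hypothetical; adapt the wording to your own template.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example: a tiny bounded cache annotated per the review checklist.
class ReviewedCache {
    private final int maxSize;

    // LOCK INVARIANT: 'lock' protects 'entries';
    // entries.size() <= maxSize whenever the lock is not held.
    private final ReentrantLock lock = new ReentrantLock();
    private final ArrayDeque<String> entries = new ArrayDeque<>();

    // HAPPENS-BEFORE: the write in init() happens-before reads in process().
    private volatile boolean initialized;

    // SHARED FIELD JUSTIFICATION: written once in init(), read by many threads;
    // published by the volatile write to 'initialized' above.
    private String prefix;

    ReviewedCache(int maxSize) {
        this.maxSize = maxSize;
    }

    void init(String prefix) {
        this.prefix = prefix;  // plain write, ordered before the volatile write below
        initialized = true;    // volatile write: publishes 'prefix'
    }

    void process(String item) {
        if (!initialized) {    // volatile read: pairs with the write in init()
            throw new IllegalStateException("init() has not been called");
        }
        lock.lock();
        try {
            entries.addLast(prefix + item);
            if (entries.size() > maxSize) {
                entries.removeFirst();  // restore the invariant before unlocking
            }
        } finally {
            lock.unlock();
        }
    }

    int size() {
        lock.lock();
        try {
            return entries.size();
        } finally {
            lock.unlock();
        }
    }

    String oldest() {
        lock.lock();
        try {
            return entries.peekFirst();
        } finally {
            lock.unlock();
        }
    }
}
```

The point of the comments is reviewability: a reader can check each access against the stated invariant or edge without reverse-engineering the intent.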
