Memory models¶
Why it matters¶
A memory model is the contract between the programmer and the runtime (+ compiler + CPU) about what one thread is allowed to observe of another thread's writes. Without a memory model, every multi-threaded program is undefined behavior. With one, a multi-threaded program has defined behavior - provided you respect the model.
The model is invisible until it isn't. The day you ship a program that works on x86 and corrupts data on ARM, you discover you needed to know this years ago.
The lens, per path¶
Java - the JMM, oldest and most-studied¶
Month 4 - Concurrency & Loom, Week 13. JSR-133 defined the Java Memory Model in 2004; it remains the model the rest of the industry borrows from.
Mechanism: happens-before edges from synchronization actions (synchronized unlock/lock, volatile write/read, Thread.start/join, final-field freeze at constructor exit, VarHandle memory modes).
The defining feature: final fields. The JMM gives them a freeze guarantee at constructor exit - a correctly-constructed immutable object is safely publishable without synchronization. No other major model offers this.
The trap
double-checked locking without volatile is broken on the JMM. The reordering allowed by the model lets a thread see a partially-constructed object. volatile (or final for the inner field) is mandatory.
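A minimal sketch of the correct form, with volatile carrying the happens-before edge; Config and Holder are hypothetical names for illustration:

```java
// Sketch: double-checked locking made correct with volatile under the JMM.
class Config {
    final int port; // final-field freeze: safe to publish once constructed
    Config(int port) { this.port = port; }
}

class Holder {
    // volatile is the load-bearing word: without it, a reader may observe
    // a non-null reference to a partially constructed Config.
    private static volatile Config instance;

    static Config get() {
        Config local = instance;              // one volatile read on the fast path
        if (local == null) {
            synchronized (Holder.class) {
                local = instance;
                if (local == null) {
                    local = new Config(8080);
                    instance = local;         // volatile write publishes the fully built object
                }
            }
        }
        return local;
    }
}

public class Main {
    public static void main(String[] args) {
        System.out.println(Holder.get().port); // prints 8080
    }
}
```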
Go - the Go Memory Model, simpler and stricter¶
Month 3 - Concurrency Mastery. The Go memory model is a published spec at go.dev/ref/mem. It is deliberately simpler than the JMM and stricter than C++'s - the design ethic is "make it hard to write subtly wrong code."
Mechanism: happens-before via channel sends/receives, sync.Mutex unlock/lock, sync/atomic operations (acquire/release ordering since 1.19), goroutine start.
The defining feature: channels are the primary synchronization primitive in the model. A send on a channel happens-before the corresponding receive. This makes channel-based code analyzable; mutex-based code less so.
The trap
assuming for x := range items { go func() { use(x) }() } captures a fresh x per goroutine. Before Go 1.22, the loop variable was shared across iterations, so every goroutine could observe the final value; from 1.22 on, it is per-iteration. Code built with older toolchains can still carry this bug.
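A minimal sketch of the classic pre-1.22 fix - re-declaring (shadowing) the loop variable so each goroutine captures its own copy - with a channel-free happens-before edge via sync.WaitGroup (collect is a hypothetical helper):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect runs one goroutine per item and gathers the values it saw.
func collect(items []string) []string {
	var mu sync.Mutex
	var got []string
	var wg sync.WaitGroup

	for _, x := range items {
		x := x // pre-1.22 fix: shadow the loop variable per iteration
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			got = append(got, x)
			mu.Unlock()
		}()
	}
	// wg.Wait happens-after every wg.Done: got is safely visible here.
	wg.Wait()

	sort.Strings(got) // goroutine order is nondeterministic; sort for stable output
	return got
}

func main() {
	fmt.Println(collect([]string{"a", "b", "c"})) // prints [a b c]
}
```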
Rust - orderings borrowed from C++20, enforced by types¶
Month 3 - Concurrency & Async. Rust's memory model is, for shared atomics, the C++20 memory model. For everything else, the type system makes most memory-model questions unaskable.
Mechanism: std::sync::atomic::Ordering - Relaxed, Acquire, Release, AcqRel, SeqCst. These map directly to C++20's orderings. Most non-atomic shared mutation is banned by Send/Sync before it can race.
The defining feature: the type system encodes "this is safe to send between threads" (Send) and "this is safe to share between threads" (Sync). The JMM tells you what's safe; Rust's compiler enforces it.
The trap
reaching for SeqCst because you're not sure. SeqCst forces a full memory barrier on every op and tanks scaling. Use the weakest ordering that proves correctness - usually Acquire/Release.
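A minimal sketch of the Release/Acquire pairing that usually replaces a reflexive SeqCst - publishing a payload through a flag (publish_and_read is a hypothetical helper):

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Release/Acquire is enough to publish data through a flag;
// SeqCst would add a full barrier this pattern does not need.
fn publish_and_read() -> u64 {
    let data = Arc::new(AtomicU64::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let (d2, r2) = (Arc::clone(&data), Arc::clone(&ready));
    let producer = thread::spawn(move || {
        d2.store(42, Ordering::Relaxed);   // plain payload write
        r2.store(true, Ordering::Release); // publish: everything before is visible...
    });

    // ...to whoever Acquire-loads the flag and sees true.
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    let v = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    v
}

fn main() {
    assert_eq!(publish_and_read(), 42);
    println!("ok"); // prints ok
}
```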
Linux kernel - LKMM, the most formal of all¶
Month 2 - Memory & Scheduling. The Linux Kernel Memory Model is the most rigorously specified of any in this list - it has an executable model in tools/memory-model/ that you can run against test cases.
Mechanism: explicit barriers (smp_mb(), smp_rmb(), smp_wmb(), smp_load_acquire(), smp_store_release()), RCU read-side primitives, per-architecture atomic ops with documented orderings.
The defining feature: RCU as a memory-model citizen. RCU read-side critical sections have specific ordering guarantees with respect to grace periods. No other model treats deferred reclamation as a primitive.
The trap
assuming x86's natural strong ordering (TSO) applies on ARM/POWER. Half of LKMM exists to make this concrete. A kernel patch that's correct on x86 and broken on ARM is the most common cross-arch bug class.
C / C++ - the ancestor¶
Not a path on this site, but worth knowing as context. The C11/C++11 memory model (refined in C++20) is what Rust and (informally) Go borrowed from: memory_order_relaxed / consume / acquire / release / acq_rel / seq_cst. The consume ordering is infamous - defined in the spec, implemented faithfully by no mainstream compiler, and treated as acquire in practice.
The contrasts that teach¶
| Aspect | Java JMM | Go MM | Rust (C++20) | Linux LKMM |
|---|---|---|---|---|
| Primary sync edge | synchronized / volatile | channel send/receive | Acquire/Release atomics | barriers + RCU |
| Default if no sync? | undefined visibility | undefined visibility | compile error (Send/Sync) | undefined; review required |
| Final/immutable guarantee | final fields (freeze) | none formal | Arc<T> + types | const + barriers |
| Tooling for verification | jcstress | -race | loom, Miri | herd7, klitmus |
| Formal spec quality | very good | good | inherited from C++20 (excellent) | excellent (executable) |
The most clarifying read across these: JMM happens-before + Rust Acquire/Release + LKMM smp_load_acquire side-by-side. Three syntaxes, one semantics. Once you see they're the same idea, every memory model collapses into one.
What to read first¶
- You write Java that runs on more than one thread → JMM section in Java Month 4. Mandatory.
- You write Go that uses sync/atomic → the Go memory model spec (linked from Go Month 3) is 8 pages and answers every question.
- You write Rust → the C++20 ordering chart, then Jon Gjengset's Rust for Rustaceans chapter on it.
- You patch the kernel → LKMM documentation in tools/memory-model/. Run the executable model.
- You write any of the above and want one mental model → start with the C++20 model. Everything else is a specialization.