Memory models

Why it matters

A memory model is the contract between the programmer and the runtime (+ compiler + CPU) about what one thread is allowed to observe of another thread's writes. Without a memory model, every multi-threaded program is undefined behavior. With one, every multi-threaded program is possibly defined behavior - if you respect the model.

The model is invisible until it isn't. The day you ship a program that works on x86 and corrupts data on ARM, you discover you needed to know this years ago.


The lens, per path

Java - the JMM, oldest and most-studied

Month 4 - Concurrency & Loom, Week 13. JSR-133 defined the Java Memory Model in 2004; it remains the model the rest of the industry borrows from.

Mechanism: happens-before edges from synchronization actions (synchronized unlock/lock, volatile write/read, Thread.start/join, final-field freeze at constructor exit, VarHandle memory modes).
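One of those edges in a runnable sketch: Thread.start/join alone is enough to publish a plain field, no volatile required (class and field names here are illustrative):

```java
public class JoinVisibility {
    static int data; // deliberately not volatile

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> data = 42); // write in the child thread
        t.start(); // start() happens-before every action in t
        t.join();  // every action in t happens-before join() returning
        System.out.println(data); // the join edge guarantees we see 42
    }
}
```

Remove the join() and the read becomes a data race: the main thread may legally see 0.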

The defining feature: final fields. The JMM gives them a freeze guarantee at constructor exit - a correctly-constructed immutable object is safely publishable without synchronization. No other major model offers this.

The trap

double-checked locking without volatile is broken on the JMM. The reordering allowed by the model lets a thread see a partially-constructed object. volatile (or final for the inner field) is mandatory.
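A corrected double-checked-locking sketch (class and field names are illustrative); the volatile on the static field is the load-bearing part:

```java
class Config {
    static volatile Config instance; // volatile is mandatory here
    final int port;                  // final fields also get the freeze guarantee

    private Config() { this.port = 8080; }

    static Config getInstance() {
        Config c = instance;         // single volatile read on the fast path
        if (c == null) {
            synchronized (Config.class) {
                c = instance;
                if (c == null) {
                    c = new Config();
                    instance = c;    // volatile write publishes the finished object
                }
            }
        }
        return c;
    }
}

public class Dcl {
    public static void main(String[] args) {
        System.out.println(Config.getInstance().port);
    }
}
```

Without volatile, the write to `instance` may become visible before the constructor's writes, which is exactly the partially-constructed-object bug.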

Go - the Go Memory Model, simpler and stricter

Month 3 - Concurrency Mastery. The Go memory model is a published spec at go.dev/ref/mem. It is deliberately simpler than the JMM and stricter than C++'s - the design ethic is "make it hard to write subtly wrong code."

Mechanism: happens-before via channel sends/receives, sync.Mutex unlock/lock, sync/atomic operations (specified as sequentially consistent in the Go 1.19 revision of the model), goroutine start.

The defining feature: channels are the primary synchronization primitive in the model. A send on a channel happens-before the corresponding receive. This makes channel-based code analyzable; mutex-based code less so.

The trap

assuming for x := range items { go func() { use(x) }() } captures each x separately. Before Go 1.22 the loop variable was shared across iterations; from 1.22 on it is per-iteration. Code built with older toolchains can still carry this bug.

Rust - orderings borrowed from C++20, enforced by types

Month 3 - Concurrency & Async. Rust's memory model is, for shared atomics, the C++20 memory model. For everything else, the type system makes most memory-model questions unaskable.

Mechanism: std::sync::atomic::Ordering - Relaxed, Acquire, Release, AcqRel, SeqCst - mapping directly to C++20's orderings (there is no Consume). Most non-atomic shared mutation is banned by Send/Sync before it can race.

The defining feature: the type system encodes "this is safe to send between threads" (Send) and "this is safe to share between threads" (Sync). The JMM tells you what's safe; Rust's compiler enforces it.

The trap

reaching for SeqCst because you're not sure. SeqCst forces a full memory barrier on every op and tanks scaling. Use the weakest ordering that proves correctness - usually Acquire/Release.

Linux kernel - LKMM, the most formal of all

Month 2 - Memory & Scheduling. The Linux Kernel Memory Model is the most rigorously specified of any in this list - it has an executable model in tools/memory-model/ that you can run against test cases.

Mechanism: explicit barriers (smp_mb(), smp_rmb(), smp_wmb(), smp_load_acquire(), smp_store_release()), RCU read-side primitives, per-architecture atomic ops with documented orderings.

The defining feature: RCU as a memory-model citizen. RCU read-side critical sections have specific ordering guarantees with respect to grace periods. No other model treats deferred reclamation as a primitive.

The trap

assuming x86's natural strong ordering (TSO) applies on ARM/POWER. Half of LKMM exists to make this concrete. A kernel patch that's correct on x86 and broken on ARM is the most common cross-arch bug class.

C / C++ - the ancestor

Not a path on this site, but worth knowing as context. The C11/C++11 memory model (refined by C++20) is what Rust and (informally) Go borrowed from: memory_order_relaxed / consume / acquire / release / acq_rel / seq_cst. The consume ordering is infamous - defined in the spec, implemented correctly by no mainstream compiler, treated as acquire in practice.


The contrasts that teach

Aspect                    | Java JMM                | Go MM                | Rust (C++20)                     | Linux LKMM
Primary sync edge         | synchronized / volatile | channel send/receive | Acquire/Release atomics          | barriers + RCU
Default if no sync?       | undefined visibility    | undefined visibility | compile error (Send/Sync)        | undefined; review required
Final/immutable guarantee | final fields (freeze)   | none formal          | Arc<T> + types                   | const + barriers
Tooling for verification  | jcstress                | -race                | loom, Miri                       | herd7, klitmus
Formal spec quality       | very good               | good                 | inherited from C++20 (excellent) | excellent (executable)

The most clarifying read across these: JMM happens-before + Rust Acquire/Release + LKMM smp_load_acquire side-by-side. Three syntaxes, one semantics. Once you see they're the same idea, every memory model collapses into one.


What to read first

  • You write Java that runs on more than one thread → JMM section in Java Month 4. Mandatory.
  • You write Go that uses sync/atomic → the Go memory model spec (linked from Go Month 3) is 8 pages and answers every question.
  • You write Rust → the C++20 ordering chart, then Jon Gjengset's Rust for Rustaceans chapter on it.
  • You patch the kernel → LKMM documentation in tools/memory-model/. Run the executable model.
  • You write any of the above and want one mental model → start with the C++20 model. Everything else is a specialization.