Week 10 - sync Primitives and sync/atomic

10.1 Conceptual Core

  • sync.Mutex: a fast mutex that is fair but not strictly FIFO. In starvation mode (since Go 1.9) it hands the lock directly to the waiter at the front of the queue once a waiter has failed to acquire it for more than 1 ms. Read src/sync/mutex.go.
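  • sync.RWMutex: reader-writer lock. Writer-preferring: a blocked Lock call keeps new readers from acquiring the lock. The read path is fast under low contention, but every RLock updates a shared reader count, so that cache line bounces between cores under heavy reading; consider sharding before reaching for RWMutex.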
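  • sync.Once: exactly-once initialization. The completion of f in once.Do(f) is synchronized before the return of any once.Do(f) call, so every caller sees the initialized state.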
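  • sync.WaitGroup: not a barrier; a counter with wait-on-zero. Misuse #1: wg.Add(1) inside the goroutine instead of before launching it (race with wg.Wait). Misuse #2: reusing a WaitGroup for a new batch of goroutines before every Wait from the previous batch has returned. There is no reset; new Add calls must happen after those Wait calls return.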
  • sync.Cond: Mesa-style condition variable. Almost always the wrong tool; channels or chan struct{} + atomic patterns are clearer.
  • sync.Map: optimized for the case where keys are written once and read many times across goroutines. Worse than map + RWMutex for read-modify-write patterns.
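  • sync/atomic: low-level atomic operations. Modern typed API (since Go 1.19): atomic.Int32, atomic.Int64, atomic.Uint32, atomic.Uint64, atomic.Uintptr, atomic.Bool, atomic.Pointer[T]. atomic.Value is older (Go 1.4) and stores an untyped interface value. Prefer the typed values over the legacy free functions; the typed API prevents most misuse.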
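
The WaitGroup, Once, and typed-atomic points above can be seen together in a small sketch. The function names countWithWaitGroup and initOnce are hypothetical, chosen only for illustration, and the sketch assumes Go 1.19 or later for atomic.Int64.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countWithWaitGroup launches n goroutines that each bump a shared counter.
// wg.Add is called before each go statement, so Wait can never observe a
// zero counter while work is still being launched.
func countWithWaitGroup(n int) int64 {
	var (
		wg    sync.WaitGroup
		total atomic.Int64 // typed atomic; no address-of or alignment pitfalls
	)
	for i := 0; i < n; i++ {
		wg.Add(1) // correct placement: before the goroutine starts
		go func() {
			defer wg.Done()
			total.Add(1)
		}()
	}
	wg.Wait()
	return total.Load()
}

// initOnce has many goroutines race to call once.Do and returns how many
// times the initializer actually ran.
func initOnce(callers int) int64 {
	var (
		once  sync.Once
		inits atomic.Int64
		wg    sync.WaitGroup
	)
	for i := 0; i < callers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			once.Do(func() { inits.Add(1) })
		}()
	}
	wg.Wait()
	return inits.Load()
}

func main() {
	fmt.Println(countWithWaitGroup(100)) // 100
	fmt.Println(initOnce(100))           // 1
}
```

Moving wg.Add(1) into the goroutine body would let wg.Wait return early on some schedules, which is the race described under Misuse #1.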

10.2 Mechanical Detail

  • The Go memory model: read go.dev/ref/mem once carefully. Key facts:
  • Happens-before is defined in terms of synchronizing operations: channel sends and receives, mutex Lock/Unlock, and atomic operations.
  • Atomic operations behave as though executed in some sequentially consistent order, including operations on different addresses. This matches C++'s seq_cst atomics and Java's volatile, not C++'s acquire/release.
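  • Unsynchronized concurrent reads and writes, even of word-sized values, are data races. Unlike C and C++, Go bounds the damage for word-sized values (a read observes some value that was actually written), but races on multiword values such as interfaces, slices, and strings can corrupt memory. The race detector reports only races that occur on executed paths, so a clean run is evidence, not proof.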
  • sync.Mutex source walk-through:
  • State is a 32-bit word: locked bit, woken bit, starvation bit, waiter count.
  • Fast path: a single CAS on the locked bit. A few nanoseconds uncontended on typical hardware; benchmark on your own machine.
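  • Slow path: spin briefly, then park. In normal mode waiters queue FIFO, but a woken waiter must compete with newly arriving goroutines, which usually win because they are already running on a CPU. In starvation mode, Unlock hands the mutex directly to the waiter at the front of the queue.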
  • Atomic patterns:
  • Counter: atomic.Int64.Add(1). Use for stats. Each counter is consistent on its own, but loading two separate counters back to back is not a consistent snapshot of both.
  • Read-only snapshot publish: atomic.Pointer[T].Store(newPtr) paired with Load(). The classic copy-on-write.
  • CAS loop for lock-free updates: for { old := p.Load(); newV := f(old); if p.CompareAndSwap(old, newV) { break } }. Every CAS retry is wasted work; bound the loop or back off.
  • Memory ordering in Go: since the memory-model revision that accompanied Go 1.19, sync/atomic operations are specified to be sequentially consistent. There are no weaker orderings to opt into; the API does not give you the knobs C++ does.
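
The snapshot-publish and CAS-loop patterns above combine naturally into copy-on-write. The following is a minimal sketch, not a definitive implementation: Config, Store, NewStore, Snapshot, and SetLimit are hypothetical names, and the sketch assumes every published *Config is treated as immutable.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Config is a hypothetical read-mostly snapshot. Once published, a *Config
// must never be mutated by readers or writers.
type Config struct {
	Limits map[string]int
}

// Store holds the current snapshot behind an atomic pointer.
type Store struct {
	cur atomic.Pointer[Config]
}

// NewStore publishes an empty snapshot so Load never returns nil.
func NewStore() *Store {
	s := &Store{}
	s.cur.Store(&Config{Limits: map[string]int{}})
	return s
}

// Snapshot is the read path: one atomic load, no locks.
func (s *Store) Snapshot() *Config { return s.cur.Load() }

// SetLimit is the write path: copy the old snapshot, modify the copy, and
// publish it with CompareAndSwap. If another writer published first, the
// CAS fails and the loop retries against the newer snapshot.
func (s *Store) SetLimit(key string, v int) {
	for {
		old := s.cur.Load()
		next := &Config{Limits: make(map[string]int, len(old.Limits)+1)}
		for k, val := range old.Limits {
			next.Limits[k] = val
		}
		next.Limits[key] = v
		if s.cur.CompareAndSwap(old, next) {
			return
		}
	}
}

func main() {
	s := NewStore()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.SetLimit(fmt.Sprintf("k%d", i), i)
		}(i)
	}
	wg.Wait()
	fmt.Println(len(s.Snapshot().Limits)) // 8: no concurrent update was lost
}
```

Each failed CAS discards a full map copy, which is the wasted work the CAS-loop bullet warns about. Under heavy write contention a mutex around the writer side is often the simpler choice, while readers keep using Load.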

10.3 Lab: "Lock-Free SPSC Ring"

Build a single-producer, single-consumer ring buffer using only atomic.Uint64 indices. Pad the indices to separate cache lines. Validate with go test -race -count=1000 running 1 producer and 1 consumer. Benchmark against chan T and against a sync.Mutex-protected slice. Document the cache-line padding's effect with a withoutPad variant; expect a 3–10× difference on modern x86.
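
One possible starting point for the lab is sketched below. It is a sketch under stated assumptions, not a reference solution: Ring, NewRing, Push, Pop, and transfer are hypothetical names, the capacity is assumed to be a power of two, and the 64-byte cache-line size is an assumption that holds on typical x86-64 parts but should be checked for your target.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

const cacheLine = 64 // assumed cache-line size in bytes

// Ring is a fixed-capacity single-producer, single-consumer queue.
// head and tail sit 64 bytes apart so they never share a cache line.
type Ring[T any] struct {
	head atomic.Uint64 // next slot to read; written only by the consumer
	_    [cacheLine - 8]byte
	tail atomic.Uint64 // next slot to write; written only by the producer
	_    [cacheLine - 8]byte
	mask uint64
	buf  []T
}

// NewRing allocates a ring; capacity must be a power of two.
func NewRing[T any](capacity uint64) *Ring[T] {
	if capacity == 0 || capacity&(capacity-1) != 0 {
		panic("capacity must be a power of two")
	}
	return &Ring[T]{mask: capacity - 1, buf: make([]T, capacity)}
}

// Push may be called by the single producer only. It reports false when full.
func (r *Ring[T]) Push(v T) bool {
	t := r.tail.Load()
	if t-r.head.Load() == uint64(len(r.buf)) {
		return false
	}
	r.buf[t&r.mask] = v
	r.tail.Store(t + 1) // publishes the slot write to the consumer
	return true
}

// Pop may be called by the single consumer only. It reports false when empty.
func (r *Ring[T]) Pop() (T, bool) {
	var zero T
	h := r.head.Load()
	if h == r.tail.Load() {
		return zero, false
	}
	v := r.buf[h&r.mask]
	r.head.Store(h + 1) // releases the slot back to the producer
	return v, true
}

// transfer sends 0..n-1 through a small ring with one producer and one
// consumer and reports whether every value arrived in order.
func transfer(n int) bool {
	r := NewRing[int](8)
	done := make(chan bool)
	go func() { // consumer
		inOrder := true
		for want := 0; want < n; {
			v, ok := r.Pop()
			if !ok {
				runtime.Gosched()
				continue
			}
			if v != want {
				inOrder = false
			}
			want++
		}
		done <- inOrder
	}()
	for i := 0; i < n; { // producer
		if r.Push(i) {
			i++
		} else {
			runtime.Gosched()
		}
	}
	return <-done
}

func main() {
	fmt.Println(transfer(100000)) // true
}
```

For the withoutPad variant, delete the two blank padding fields and rerun the same benchmark. Because each index has exactly one writer, plain Load and Store are sufficient here; no CAS is needed.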

10.4 Idiomatic & golangci-lint Drill

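  • govet: copylocks (mutexes must not be copied); staticcheck: SA2000 (sync.WaitGroup.Add called inside the goroutine, racing with Wait); gocritic: deferUnlambda (deferred function literals that can be simplified to a direct call).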
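
To make the three checks concrete, here is a small sketch of code written to pass them, with the flagged alternatives shown only in comments. Counter, Inc, and Value are hypothetical names; the exact diagnostics depend on the linter versions you run.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter guards n with a mutex. It must be passed by pointer: copying a
// Counter copies mu, which go vet's copylocks check reports.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc uses a pointer receiver. A value receiver, func (c Counter) Inc(),
// would lock a private copy of mu on every call and protect nothing.
func (c *Counter) Inc() {
	c.mu.Lock()
	// A wrapper such as: defer func() { c.mu.Unlock() }()
	// is the kind of pattern deferUnlambda is meant to simplify to this:
	defer c.mu.Unlock()
	c.n++
}

// Value returns the current count under the lock.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

func main() {
	var c Counter
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1) // SA2000 targets the variant where Add sits inside the goroutine
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	fmt.Println(c.Value()) // 50
}
```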

10.5 Production Hardening Slice

  • Run every test with -race in CI. Make this non-negotiable.
  • Add a CI step that runs critical concurrency tests under -race -count=100 to catch low-probability races. Budget the CI time accordingly.
