Week 10 - sync Primitives and sync/atomic
10.1 Conceptual Core
- `sync.Mutex`: a fast, fair-but-not-strict mutex with a starvation mode (since Go 1.9) that switches to FIFO handoff once a goroutine has waited more than 1ms. Read `src/sync/mutex.go`.
- `sync.RWMutex`: reader-writer lock. Writer-preferring: a blocked `Lock` call keeps new readers from acquiring the lock. The read path is fast under low contention, but every `RLock`/`RUnlock` updates a shared reader count, so its cache line bounces between cores under heavy reading; consider sharding before reaching for `RWMutex`.
- `sync.Once`: exactly-once initialization with a memory-barrier guarantee: everything the function wrote is visible to every caller once `Do` returns.
- `sync.WaitGroup`: not a barrier; a counter with wait-on-zero. Misuse #1: `wg.Add(1)` inside the goroutine instead of before launching it (race with `wg.Wait`). Misuse #2: reusing a `WaitGroup` for a new generation of goroutines before every previous `Wait` call has returned. There is no reset method; the counter simply has to reach zero and all waiters have to leave first.
- `sync.Cond`: Mesa-style condition variable. Almost always the wrong tool; channels or `chan struct{}` + atomic patterns are clearer.
- `sync.Map`: optimized for the case where keys are written once and read many times across goroutines. Worse than `map` + `RWMutex` for read-modify-write patterns.
- `sync/atomic`: low-level atomic operations. Modern typed API (since Go 1.19): `atomic.Int64`, `atomic.Pointer[T]`, `atomic.Bool`. `atomic.Value` is older (Go 1.4) and untyped. Prefer the typed values over the legacy free functions; the typed API prevents most misuse.
10.2 Mechanical Detail
- The Go memory model: read `go.dev/ref/mem` once carefully. Key facts:
    - Happens-before is defined per synchronizing operation: channel operations, mutex operations, and atomic operations.
    - `sync/atomic` operations behave as though executed in some sequentially consistent order, so there is a single total order even across different addresses. This corresponds to C++'s `seq_cst`, not to `acquire`/`release`.
    - Reads and writes that are not synchronized are data races, even for word-sized values. Treat a racy program as incorrect rather than reasoning about which values it might observe. The race detector is the source of truth, but only for the code paths a run actually executes.
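- `sync.Mutex` source walk-through:
    - State is a 32-bit word: locked bit, woken bit, starvation bit, waiter count.
    - Fast path: a single CAS that sets the locked bit; on the order of nanoseconds when uncontended.
    - Slow path: spin briefly, then park. In normal mode a woken waiter competes with newly arriving goroutines, which are already running on a CPU and usually win. In starvation mode the unlocker hands the mutex directly to the waiter at the front of the queue.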
- Atomic patterns:
    - Counter: `atomic.Int64.Add(1)`. Use for stats; do not assume that several counters read one after another form a consistent snapshot.
    - Read-only snapshot publish: `atomic.Pointer[T].Store(newPtr)` paired with `Load()`. The classic copy-on-write: never mutate a value after it has been published.
    - CAS loop for lock-free updates: `for { old := p.Load(); newV := f(old); if p.CompareAndSwap(old, newV) { break } }`. Every failed CAS throws away the work done by `f`; bound the loop or back off.
    - Memory ordering in Go: `sync/atomic` operations are sequentially consistent, and the memory model has stated this explicitly since Go 1.19. There are no weaker orderings to rely on; the spec does not give you the knobs C++ does.
10.3 Lab - "Lock-Free SPSC Ring"
Build a single-producer, single-consumer ring buffer using only `atomic.Uint64` indices. Pad the indices to separate cache lines. Validate with `go test -race -count=1000` running 1 producer and 1 consumer. Benchmark against `chan T` and against a `sync.Mutex`-protected slice. Document the cache-line padding's effect with a `withoutPad` variant; expect a 3–10× difference on modern x86.
10.4 Idiomatic & golangci-lint Drill
`govet: copylocks` (mutexes must not be copied), `staticcheck SA2000` (`WaitGroup.Add` called inside the goroutine, where it can race with `Wait`), `gocritic: deferUnlambda`.
10.5 Production Hardening Slice
- Run every test with `-race` in CI. Make this non-negotiable.
- Add a CI step that runs critical concurrency tests under `-race -count=100` to catch low-probability races. Budget the CI time accordingly.