Week 12 - Worker Pools, Leak Detection, Deadlock Prevention
12.1 Conceptual Core
- Worker pool is the canonical "bounded concurrency" pattern: N worker goroutines consuming from a shared task channel. Bounds CPU, memory, and downstream RPC concurrency simultaneously.
- Goroutine leaks are Go's silent OOM. Most common shapes:
  - Goroutine blocked on a channel that is never closed and never sent to.
  - Goroutine blocked on `<-ctx.Done()` of a context that nobody cancels.
  - Goroutine holding a reference (closure capture) to a request object that is now done.
  - `time.After` in a `select` loop (allocates a timer per iteration; the timer leaks until expiry); see the sketch after this list.
- Deadlocks in Go are detected only by the runtime's "all goroutines asleep" check, which fires only when every goroutine is blocked. Most production deadlocks are partial: a subsystem deadlocks while the rest of the program runs. The race detector does not catch these.
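To make the `time.After` shape concrete, here is a minimal sketch (function names are illustrative, not from the lab code) of the leaky loop and the conventional fix with a single reusable timer. On Go versions before 1.23, each `time.After` call allocates a timer that stays live until it fires.

```go
// Leaky: every iteration allocates a fresh timer via time.After; each one
// stays reachable until its full minute elapses, even when the message case
// wins the select.
func consumeLeaky(ch <-chan string, handle func(string)) {
	for {
		select {
		case msg, ok := <-ch:
			if !ok {
				return
			}
			handle(msg)
		case <-time.After(time.Minute): // new timer per iteration
			return // idle timeout
		}
	}
}

// Fixed: one timer, stopped/drained and reset after each message.
func consumeWithIdleTimeout(ch <-chan string, handle func(string)) {
	idle := time.NewTimer(time.Minute)
	defer idle.Stop()
	for {
		select {
		case msg, ok := <-ch:
			if !ok {
				return
			}
			handle(msg)
			if !idle.Stop() {
				<-idle.C // timer already fired; drain it before Reset
			}
			idle.Reset(time.Minute)
		case <-idle.C:
			return // idle timeout
		}
	}
}
```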
12.2 Mechanical Detail
- The canonical worker pool:

```go
// Result pairs a task's output with its error; the original listing uses
// Result[R] without showing its definition, so this shape is an assumption.
type Result[R any] struct {
	Val R
	Err error
}

func RunPool[T, R any](ctx context.Context, n int, in <-chan T, fn func(context.Context, T) (R, error)) <-chan Result[R] {
	out := make(chan Result[R])
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			for {
				select {
				case <-ctx.Done():
					return
				case task, ok := <-in:
					if !ok {
						return // input closed: worker exits
					}
					r, err := fn(ctx, task)
					select { // second select: never block forever on a dead consumer
					case out <- Result[R]{r, err}:
					case <-ctx.Done():
						return
					}
				}
			}
		}()
	}
	go func() { wg.Wait(); close(out) }() // closer goroutine: out closes once all workers return
	return out
}
```

Every line above is load-bearing: the double-select on input and output, the `wg.Done` in `defer`, the closer goroutine after `wg.Wait`.

- Leak detection tooling:
  - `goleak` for tests.
  - `pprof goroutine` for production: `curl /debug/pprof/goroutine?debug=2` dumps every goroutine's stack. Read it.
  - `runtime.NumGoroutine()` exported as a metric. A monotonically growing count is the leak signal.
- Deadlock detection:
  - `go-deadlock` (sasha-s/go-deadlock) wraps `sync.Mutex` with timing-based deadlock detection in dev builds.
  - For partial deadlocks: instrumentation on the lock acquisition path (lock contention metrics from `runtime/metrics`).
runtime/metrics). - Backpressure: when the worker pool is saturated, what should the caller see? Three strategies: block (default), drop (with metric), reject (return error). The choice is application-dependent; document it.
12.3 Lab-"Worker Pool Survival Test"¶
Build a worker pool that handles:
1. Backpressure: bounded input channel, drop-with-metric on overflow.
2. Graceful shutdown: on `ctx.Done()`, drain in-flight tasks within a deadline, then abandon the rest.
3. Per-task timeouts: `WithTimeout(ctx, 100ms)` per task.
4. Panic isolation: a panic in one task does not kill the worker; recover and report (see the sketch after this list).
5. Leak-clean: `goleak` passes after `cancel(); pool.Wait()`.
Stress-test with 1M tasks across 1000 workers under `-race`.
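One way items 3 and 4 might be wired inside the worker loop, sketched with an illustrative helper (`runTask` is not part of the `RunPool` signature from 12.2): each task gets its own deadline, and a panic inside `fn` becomes an error instead of killing the worker.

```go
// runTask executes one task with a per-task deadline and panic isolation.
func runTask[T, R any](ctx context.Context, fn func(context.Context, T) (R, error), task T) (r R, err error) {
	tctx, cancel := context.WithTimeout(ctx, 100*time.Millisecond)
	defer cancel()
	defer func() {
		if p := recover(); p != nil {
			// Report the panic as a task error; the worker goroutine survives.
			err = fmt.Errorf("task panicked: %v", p)
		}
	}()
	return fn(tctx, task)
}
```

The workers call `runTask` where the 12.2 pool calls `fn` directly; for item 5, `defer goleak.VerifyNone(t)` at the top of the stress test is the usual way to assert the pool exits clean.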
12.4 Idiomatic & golangci-lint Drill
- `bodyclose` (HTTP response bodies leaked), `rowserrcheck` (`sql.Rows.Err` unchecked), `sqlclosecheck`. All three are leak-class lints; enable all three in `.golangci.yml` and treat them as blocking in CI.
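For reference, a minimal sketch of the failure shapes the first two lints flag; the function names are illustrative.

```go
// bodyclose: the response body is never closed, so the underlying connection
// (and transport bookkeeping) is held open.
func leakyGet(url string) (int, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, err
	}
	// missing: defer resp.Body.Close()
	return resp.StatusCode, nil
}

// rowserrcheck: rows.Err() is never consulted, so an error that ended
// iteration early is silently swallowed.
func leakyQuery(db *sql.DB) ([]string, error) {
	rows, err := db.Query("SELECT name FROM users")
	if err != nil {
		return nil, err
	}
	defer rows.Close()
	var names []string
	for rows.Next() {
		var n string
		if err := rows.Scan(&n); err != nil {
			return nil, err
		}
		names = append(names, n)
	}
	// missing: if err := rows.Err(); err != nil { return nil, err }
	return names, nil
}
```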
12.5 Production Hardening Slice
- Add a `/debug/pprof/goroutine` periodic snapshot job to your service template: every 5 minutes, capture the goroutine count and the top-N stacks. Surface as a Prometheus gauge with stack-hash labels (low cardinality). On a leak, you will see which stack is growing without paging anyone.
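A minimal sketch of that job, assuming the service already exposes a Prometheus registry; the metric name, the `logStacks` sink, and the omitted stack-hash labelling are placeholders for what the bullet above describes.

```go
var goroutineCount = prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "goroutines_current",
	Help: "Goroutine count sampled by the periodic snapshot job.",
}) // register with prometheus.MustRegister(goroutineCount) at startup

// snapshotGoroutines samples the goroutine count and dumps all stacks
// (same debug=2 format as the pprof endpoint) on every tick.
func snapshotGoroutines(ctx context.Context, interval time.Duration, logStacks func([]byte)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			goroutineCount.Set(float64(runtime.NumGoroutine()))
			var buf bytes.Buffer
			_ = pprof.Lookup("goroutine").WriteTo(&buf, 2) // pprof here is runtime/pprof
			logStacks(buf.Bytes())
		}
	}
}
```

Start it from the service's startup path, e.g. `go snapshotGoroutines(ctx, 5*time.Minute, sink)`.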
Month 3 Capstone Deliverable
A `concurrency-lab/` workspace:
1. `chan-bench` (week 9): channel vs mutex vs atomic ring, with a markdown writeup.
2. `spsc-ring` (week 10): atomic-only, race-clean, with cache-pad ablation.
3. `context-discipline` (week 11): a refactored HTTP service plus a singleflight cache demo.
4. `survival-pool` (week 12): the worker pool that survives the five failure modes.
CI gate additions: `-race` on every test, `-race -count=100` on critical packages, a `goleak` baseline, and a 0-alloc regression guard on the SPSC ring's hot path. Open one upstream PR, even a doc fix to `errgroup` or `singleflight`, by month end.