
Week 12 - Worker Pools, Leak Detection, Deadlock Prevention

12.1 Conceptual Core

  • Worker pool is the canonical "bounded concurrency" pattern: N worker goroutines consuming from a shared task channel. It bounds CPU, memory, and downstream RPC concurrency simultaneously.
  • Goroutine leaks are Go's silent OOM. The most common shapes:
      • Goroutine blocked on a channel that is never closed and never sent to.
      • Goroutine blocked on <-ctx.Done() of a context that nobody cancels.
      • Goroutine holding a reference (closure capture) to a request object that is now done.
      • time.After in a select loop (allocates a timer per iteration; the timer leaks until expiry).
  • Deadlocks in Go are detected only by the runtime's "all goroutines asleep" check, which fires only when every goroutine is blocked. Most production deadlocks are partial: a subsystem deadlocks while the rest of the program runs. The race detector does not catch these.

12.2 Mechanical Detail

  • The canonical worker pool (Result is the pool's output envelope):
    type Result[R any] struct {
        Val R
        Err error
    }

    func RunPool[T, R any](ctx context.Context, n int, in <-chan T, fn func(context.Context, T) (R, error)) <-chan Result[R] {
        out := make(chan Result[R])
        var wg sync.WaitGroup
        wg.Add(n)
        for i := 0; i < n; i++ {
            go func() {
                defer wg.Done() // runs even if the worker exits early
                for {
                    select {
                    case <-ctx.Done():
                        return
                    case task, ok := <-in:
                        if !ok {
                            return // input closed: worker exits cleanly
                        }
                        r, err := fn(ctx, task)
                        select {
                        case out <- Result[R]{r, err}:
                        case <-ctx.Done():
                            return // nobody is reading out; don't block forever
                        }
                    }
                }
            }()
        }
        go func() { wg.Wait(); close(out) }()
        return out
    }

    Every line above is load-bearing: the double select on input and output, the wg.Done in a defer, and the closer goroutine that runs after wg.Wait.
  • Leak detection tooling:
      • goleak for tests.
      • pprof goroutine profiles for production: curl /debug/pprof/goroutine?debug=2 dumps every goroutine's stack. Read it.
      • runtime.NumGoroutine() exported as a metric. A monotonically growing count is the leak signal.
  • Deadlock detection:
      • go-deadlock (sasha-s/go-deadlock) wraps sync.Mutex with timing-based deadlock detection in dev builds.
      • For partial deadlocks: instrumentation on the lock acquisition path (lock contention metrics from runtime/metrics).
  • Backpressure: when the worker pool is saturated, what should the caller see? Three strategies: block (default), drop (with a metric), reject (return an error). The choice is application-dependent; document it.
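The drop and reject strategies both reduce to a non-blocking send on the bounded input channel. A minimal sketch (submitDrop, submitReject, and the dropped counter are hypothetical names; in production the counter would be a real metrics counter):

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// dropped stands in for a Prometheus/metrics counter.
var dropped atomic.Int64

// submitDrop: non-blocking send; on a full queue, count the drop
// and tell the caller the task was shed.
func submitDrop(q chan<- int, task int) bool {
	select {
	case q <- task:
		return true
	default:
		dropped.Add(1)
		return false
	}
}

var ErrSaturated = errors.New("pool saturated")

// submitReject: same shape, but surfaces the overflow as an error
// so the caller can back off or fail the request.
func submitReject(q chan<- int, task int) error {
	select {
	case q <- task:
		return nil
	default:
		return ErrSaturated
	}
}

func main() {
	q := make(chan int, 2) // bounded input channel
	for i := 0; i < 4; i++ {
		submitDrop(q, i)
	}
	fmt.Println(len(q), dropped.Load()) // 2 2
}
```

The blocking strategy is simply the plain `q <- task` send; the select/default turns the channel's capacity into an admission decision.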

12.3 Lab: "Worker Pool Survival Test"

Build a worker pool that handles:

  1. Backpressure: bounded input channel, drop-with-metric on overflow.
  2. Graceful shutdown: on ctx.Done(), drain in-flight tasks within a deadline, then abandon the rest.
  3. Per-task timeouts: context.WithTimeout(ctx, 100*time.Millisecond) per task.
  4. Panic isolation: a panic in one task does not kill the worker; recover and report.
  5. Leak-clean: goleak passes after cancel(); pool.Wait().

Stress-test with 1M tasks across 1000 workers under `-race`.
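Requirement 4, panic isolation, usually reduces to a recover inside a per-task helper so the worker's loop survives. A stdlib-only sketch, with runTask as a hypothetical name:

```go
package main

import (
	"fmt"
)

// runTask isolates a single task: a panic is converted into an
// error, so the worker goroutine that called it keeps running.
// Named return is required so the deferred func can set err.
func runTask(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("task panicked: %v", r)
		}
	}()
	return fn()
}

func main() {
	tasks := []func() error{
		func() error { return nil },
		func() error { panic("boom") },
		func() error { return nil },
	}
	done := 0
	for _, t := range tasks { // a real worker would range over a channel
		if err := runTask(t); err != nil {
			fmt.Println("recovered:", err)
			continue
		}
		done++
	}
	fmt.Println("completed:", done) // completed: 2
}
```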

12.4 Idiomatic & golangci-lint Drill

  • bodyclose (leaked HTTP response bodies), rowserrcheck (sql.Rows.Err unchecked), sqlclosecheck. All three are leak-class lints; enable all three in your golangci-lint configuration.

12.5 Production Hardening Slice

  • Add a /debug/pprof/goroutine periodic snapshot job to your service template: every 5 minutes, capture the goroutine count and the top-N stacks. Surface as a Prometheus gauge with stack-hash labels (low cardinality). On a leak, you will see which stack is growing without paging anyone.

Month 3 Capstone Deliverable

A concurrency-lab/ workspace:

  1. chan-bench (week 9): channel vs mutex vs atomic ring, with a markdown writeup.
  2. spsc-ring (week 10): atomic-only, race-clean, with a cache-pad ablation.
  3. context-discipline (week 11): a refactored HTTP service plus a singleflight cache demo.
  4. survival-pool (week 12): the worker pool that survives the five failure modes.

CI gate additions: `-race` on every test, `-race -count=100` on critical packages, a goleak baseline, and a 0-alloc regression guard on the SPSC ring's hot path. Open one upstream PR, even a doc fix to errgroup or singleflight, by month end.
