Week 8 - Allocation Profiling, sync.Pool, GC Tuning

8.1 Conceptual Core

  • The cheapest allocation is the one you do not make. The second cheapest is the one you reuse.
  • sync.Pool is a per-P cache of objects. Items can be reclaimed by the GC at any time (the cleanup runs at the start of each GC cycle), so it is a cache, not a resource pool. Use it for short-lived, frequently allocated objects (bytes.Buffer, []byte scratch space, parser nodes).
  • The two production-grade memory-tuning knobs are GOGC (heap growth ratio) and GOMEMLIMIT (absolute soft ceiling). For containerized services, pin GOMEMLIMIT to ~90% of the cgroup memory limit; leave GOGC at its default unless profiles say otherwise. See the sketch below.
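
A minimal sketch of wiring both knobs programmatically rather than through the environment (the 2 GiB cgroup figure is illustrative; setting GOMEMLIMIT=1843MiB in the container entrypoint would be equivalent):

    package main

    import "runtime/debug"

    func main() {
        // Pin the soft memory limit to ~90% of a hypothetical 2 GiB cgroup
        // allocation. Equivalent to GOMEMLIMIT=1843MiB in the environment.
        debug.SetMemoryLimit(1843 << 20) // bytes; Go 1.19+

        // GOGC stays at its default (100); change it only with profile evidence.
        // debug.SetGCPercent(50)
    }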

8.2 Mechanical Detail

  • sync.Pool mechanics (src/sync/pool.go):
      • Get() takes from the local-P cache, then falls back to the victim cache (objects that survived the previous GC), then falls back to New().
      • Put() stores into the local-P cache.
      • At each GC, the local caches are moved to the victim cache and the previous victim cache is freed.
      • Therefore: do not assume Pool.Get returns recently Put data. Always reset state on Get.
  • Common sync.Pool mistake: putting non-pointer values. The pool stores interface{} values, so putting a non-pointer boxes it, and the boxing is itself an allocation that defeats the pool. Always store pointers.
  • bytes.Buffer reuse pattern:
    var bufPool = sync.Pool{New: func() any { return new(bytes.Buffer) }}
    buf := bufPool.Get().(*bytes.Buffer)
    buf.Reset()
    defer bufPool.Put(buf)
    
  • Allocation-profile interpretation: pprof -alloc_objects (counts) tells you "where churn happens"; -alloc_space (bytes) tells you "where pressure happens"; -inuse_space tells you "what is currently retained." Use all three.
  • runtime/metrics (since Go 1.16): the modern API for runtime observability. It replaces ad-hoc runtime.MemStats reads and returns histograms for /gc/pauses:seconds, /sched/latencies:seconds, etc.; see the sketch below.
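
A minimal sketch of reading one of those histograms (the metric name is documented in the runtime/metrics package; the summary printed here is illustrative):

    package main

    import (
        "fmt"
        "runtime/metrics"
    )

    func main() {
        // Sample the GC pause histogram by its documented name.
        samples := []metrics.Sample{{Name: "/gc/pauses:seconds"}}
        metrics.Read(samples)

        if samples[0].Value.Kind() == metrics.KindFloat64Histogram {
            h := samples[0].Value.Float64Histogram()
            var total uint64
            for _, c := range h.Counts {
                total += c
            }
            fmt.Printf("observed %d GC pauses across %d buckets\n", total, len(h.Counts))
        }
    }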

8.3 Lab - "Pool the Hot Path"

  1. Take the JSON-handling hot path of any service. Run pprof -alloc_objects under load. Identify the top three allocation sites.
  2. Introduce a sync.Pool for the most appropriate one (typically bytes.Buffer or a decoder).
  3. Re-benchmark. The win should be visible in allocs/op and in p99 latency under load.
  4. Now intentionally misuse it: Pool.Put without resetting state. Detect the bug under -race or via a deliberately inserted assertion (see the sketch after this list).
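
For step 4, a deliberately inserted assertion might look like the sketch below. It assumes the opposite convention from the buffer pattern above, namely Reset-before-Put, so that a dirty buffer coming out of the pool is always a bug; all names are illustrative:

    package pooling

    import (
        "bytes"
        "sync"
    )

    var bufPool = sync.Pool{New: func() any { return new(bytes.Buffer) }}

    // getBuf asserts the invariant that buffers are Reset before being Put.
    // A buffer with leftover bytes means some caller skipped the Reset,
    // which is exactly the misuse injected in step 4.
    func getBuf() *bytes.Buffer {
        buf := bufPool.Get().(*bytes.Buffer)
        if buf.Len() != 0 {
            panic("sync.Pool misuse: buffer was Put without Reset")
        }
        return buf
    }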

8.4 Idiomatic & golangci-lint Drill

  • staticcheck SA6002; gocritic appendAssign; the prealloc linter. Re-read Dave Cheney's "High Performance Go Workshop" notes (a classic standing reference). A sketch of the patterns these checks flag follows below.
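
A minimal sketch of the patterns these checks target, as a concrete target for the drill (function names are hypothetical):

    package drill

    import "sync"

    var pool = sync.Pool{New: func() any { b := make([]byte, 0, 1024); return &b }}

    // SA6002: Put-ing a plain []byte boxes the slice header into an
    // interface{}, allocating on every Put. Store a pointer instead.
    func release(b *[]byte) {
        *b = (*b)[:0] // truncate, keep capacity
        pool.Put(b)
    }

    // prealloc: when the final length is known, set capacity up front
    // instead of growing the slice append by append.
    func double(in []int) []int {
        out := make([]int, 0, len(in))
        for _, v := range in {
            out = append(out, v*2)
        }
        return out
    }

    // appendAssign (gocritic) flags cross-slice appends such as
    // a = append(b, v), which usually indicate a copy/paste bug.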

8.5 Production Hardening Slice

  • Add a /debug/pprof HTTP endpoint behind an auth or build-tag gate (do not expose it on the public listener; see the sketch after this list). Document the on-call runbook for capturing CPU/heap profiles from a misbehaving production process.
  • Add runtime/metrics-based exporters for GC pause histograms and scheduler latencies. These are the signals an SRE wants when a Go service misbehaves.
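
A minimal sketch of the gating idea: mount pprof only on a private loopback listener and keep the public mux clean (addresses and the health handler are illustrative):

    package main

    import (
        "net/http"
        _ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
    )

    func main() {
        // Debug listener: loopback only, reached via port-forward or an
        // SSH tunnel, never exposed on the service's public address.
        go func() {
            _ = http.ListenAndServe("127.0.0.1:6060", nil) // nil = DefaultServeMux
        }()

        // Public listener: its own mux, so the pprof handlers are absent.
        public := http.NewServeMux()
        public.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
            w.WriteHeader(http.StatusOK)
        })
        _ = http.ListenAndServe(":8080", public)
    }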

Month 2 Capstone Deliverable

A memory-and-gc/ workspace:

  1. layout-forensics (week 5) - with fieldalignment enforced in CI.
  2. gc-forensics (week 6) - with annotated gctrace=1 logs and a tuning playbook.
  3. iface-bench (week 7) - concrete vs interface vs generic, a three-way benchmark.
  4. pool-the-hot-path (week 8) - before/after profile diff, baseline benchmark in CI.

Workspace-level CI must add: the fieldalignment analyzer, a 0-alloc regression guard on critical benchmarks (a sketch follows below), and pprof artifacts captured on demand via a make profile target.
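
A minimal sketch of such a regression guard using testing.AllocsPerRun; hotEncode is a placeholder for the pooled hot path under guard:

    package hotpath

    import "testing"

    // hotEncode stands in for the hot path being guarded.
    func hotEncode() { /* ... */ }

    // TestHotEncodeZeroAlloc fails CI if the hot path starts allocating again.
    func TestHotEncodeZeroAlloc(t *testing.T) {
        allocs := testing.AllocsPerRun(1000, hotEncode)
        if allocs != 0 {
            t.Fatalf("hot path allocated %.0f times per run, want 0", allocs)
        }
    }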
