
Week 6 - The Garbage Collector

6.1 Conceptual Core

  • Go's GC is a concurrent, tricolor, mark-sweep, non-generational, non-compacting collector. Each adjective is doing work:
  • Concurrent: marks happen while the application runs ("mutator").
  • Tricolor: every object is white (not yet reached), gray (reached but its children unscanned), or black (reached and fully scanned). The strong invariant the write barrier protects: a black object never points to a white object.
  • Mark-sweep: phase 1 marks reachable; phase 2 reclaims unmarked.
  • Non-generational: no separate young/old heap. (Escape analysis, which keeps many short-lived values off the heap entirely, and the pacer compensate.)
  • Non-compacting: objects never move. This is why interior pointers into objects and unsafe.Pointer values remain valid across collections (see the sketch after this list).
  • Why these choices: Go optimizes for predictable, low-pause behavior at the cost of some throughput. A compacting collector would achieve a smaller steady-state heap, but moving objects requires either stop-the-world pauses or read barriers.
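
The non-compacting guarantee is directly observable. A minimal sketch (the record type is invented for illustration): an interior pointer into a heap object stays valid across a forced collection, precisely because nothing moves.

```go
package main

import (
	"fmt"
	"runtime"
)

type record struct {
	id   int
	name string
}

func main() {
	r := &record{id: 42, name: "example"}
	name := &r.name // interior pointer into the middle of the object

	runtime.GC() // force a full collection cycle

	// Go's collector never moves objects, so the interior pointer
	// is still valid after the cycle completes.
	fmt.Println(*name, r.id)
}
```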

6.2 Mechanical Detail

  • Phases (runtime/mgc.go):
  • Sweep termination (STW, microseconds): finish previous cycle's sweep.
  • Mark setup (STW, microseconds): enable the write barrier and start mark workers.
  • Concurrent mark: dedicated workers plus mutator assists mark the heap; each goroutine's stack is scanned after stopping that goroutine briefly at a safe point (not the whole world). The write barrier intercepts pointer writes throughout.
  • Mark termination (STW, ~100 µs to ms on huge heaps): finalize.
  • Concurrent sweep: lazily reclaim white objects as the next allocation requests space.
  • Write barrier: since Go 1.8 the runtime uses a hybrid write barrier (Yuasa-style deletion combined with Dijkstra-style insertion) that records pointer writes during mark so the mutator cannot "hide" a white object behind a black one. It is implemented as a runtime call the compiler inserts around pointer stores. This is why pointer-heavy code is GC-expensive: every pointer write pays for the barrier check.
  • Mark assist: when a goroutine allocates during a mark phase, it is forced to do GC work proportional to its allocation rate. This couples allocation to GC progress and is the mechanism that prevents heap blowup.
  • The Pacer: targets next_gc = live_after_last_gc * (1 + GOGC/100). Default GOGC=100: collect when the live heap doubles. Tunable: GOGC=50 collects more often for lower memory, GOGC=200 less often for higher memory, GOGC=off disables collection entirely (see the sketch after this list).
  • GOMEMLIMIT (since Go 1.19): a soft total-memory ceiling. The GC adjusts pacing to stay under the limit even if GOGC would not have triggered. Use it as your primary memory control in containers; leave GOGC at default.
  • Stack scanning: every goroutine stack is a GC root. Stacks are scanned during the concurrent mark phase; each goroutine is paused briefly and individually at a safe point, so the work is per-goroutine and parallelized rather than a global stop-the-world.
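
The pacer's arithmetic can be checked empirically. A minimal sketch, assuming the classic heap-goal formula (newer runtimes also count stacks and globals in the goal, so expect "roughly", not "exactly", 2x):

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

func main() {
	debug.SetGCPercent(100) // equivalent to GOGC=100

	// Keep ~100 MiB live so the pacer has a meaningful baseline.
	live := make([][]byte, 100)
	for i := range live {
		live[i] = make([]byte, 1<<20)
	}

	runtime.GC() // complete a cycle so NextGC reflects the new goal

	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	// Expect NextGC ≈ live heap * (1 + GOGC/100), i.e. roughly 2x.
	fmt.Printf("live ≈ %d MiB, next trigger ≈ %d MiB\n",
		ms.HeapAlloc>>20, ms.NextGC>>20)

	runtime.KeepAlive(live)
}
```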

6.3 Lab: "GC Forensics"

  1. Write a service that allocates 100 MB/s of short-lived objects (a minimal sketch follows this list). Run it with GODEBUG=gctrace=1. Read each GC line and identify: total heap, live heap, pause times, and the pacer goal.
  2. Set GOMEMLIMIT=512MiB and GOGC=off. Re-run; observe how the GC is now driven entirely by the memory ceiling.
  3. Set GOGC=50 (no GOMEMLIMIT). Re-run; observe more frequent, smaller GCs.
  4. Capture a heap profile and open it with go tool pprof -alloc_objects. Identify the top five allocation sites. Refactor at least two using sync.Pool or pre-allocated buffers. Re-benchmark.
  5. Capture a go tool trace and locate the GC mark phases visually.
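
A minimal allocator for step 1, sketched under the assumption that 1 MiB every 10 ms approximates the target rate; run it as GODEBUG=gctrace=1 go run main.go:

```go
package main

import "time"

func main() {
	// ~100 MB/s of short-lived garbage: 1 MiB every 10 ms,
	// dropped on the next tick so every buffer dies young.
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()

	for range ticker.C {
		buf := make([]byte, 1<<20) // becomes garbage on the next tick
		buf[0] = 1                 // touch it so the allocation is not elided
	}
}
```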

6.4 Idiomatic & golangci-lint Drill

  • staticcheck SA6002 (putting non-pointer values into a sync.Pool), prealloc (slices that could be pre-allocated), gocritic's rangeValCopy (large struct copies in range loops). A compliant sync.Pool pattern is sketched below.
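
A sketch of the SA6002-compliant pattern: pool pointers, not values, and reset before returning to the pool (render is an invented helper for illustration):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// The pool stores *bytes.Buffer, not bytes.Buffer: putting a
// non-pointer value into a sync.Pool allocates on every Put,
// which is exactly what staticcheck SA6002 flags.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset() // return a clean buffer to the pool
		bufPool.Put(buf)
	}()
	fmt.Fprintf(buf, "hello, %s", name)
	return buf.String() // String copies, so reuse is safe
}

func main() {
	fmt.Println(render("gopher"))
}
```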

6.5 Production Hardening Slice

  • In your service template, set GOMEMLIMIT at container start, either from a MEMORY_LIMIT environment variable or directly from the cgroup ceiling, i.e. debug.SetMemoryLimit at roughly 90% of the cgroup memory limit (see the sketch after this list). This is the single most impactful production tuning knob.
  • Export GC metrics: go_gc_duration_seconds (a summary with quantiles), go_memstats_*. Use prometheus/client_golang's collectors.NewGoCollector(collectors.WithGoCollections(collectors.GoRuntimeMemStatsCollection | collectors.GoRuntimeMetricsCollection)) for the modern collector.
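
A sketch of the startup hook, assuming a cgroup v2 container where the ceiling is exposed at /sys/fs/cgroup/memory.max; the 90% factor and the silent fallbacks are this sketch's assumptions:

```go
package main

import (
	"log"
	"os"
	"runtime/debug"
	"strconv"
	"strings"
)

// setMemoryLimitFromCgroup reads the cgroup v2 memory ceiling and
// sets GOMEMLIMIT to ~90% of it, leaving headroom for non-heap
// memory (goroutine stacks, runtime overhead, cgo). Adapt the path
// for cgroup v1 or read a MEMORY_LIMIT env var instead.
func setMemoryLimitFromCgroup() {
	raw, err := os.ReadFile("/sys/fs/cgroup/memory.max")
	if err != nil {
		return // not in a cgroup v2 container: keep runtime defaults
	}
	s := strings.TrimSpace(string(raw))
	if s == "max" {
		return // no ceiling configured
	}
	limit, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return
	}
	debug.SetMemoryLimit(limit * 9 / 10)
	log.Printf("GOMEMLIMIT set to %d bytes", limit*9/10)
}

func main() {
	setMemoryLimitFromCgroup()
	// ... start the service as usual ...
}
```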
