Garbage collection¶
Why it matters¶
Every memory-safe runtime makes the same trade and resolves it differently. Tracing vs. reference counting. Generational vs. flat. Stop-the-world vs. concurrent. Compacting vs. non-compacting. Region-based vs. free-list. A working systems engineer can name where any production GC lands on each axis - and predict the failure mode that comes from that choice.
The four paths below show four resolutions of the same trade. Read at least two; the contrast is where the learning lives.
The lens, per path¶
Java - the most engineered GC ecosystem in existence¶
Month 3 - Memory & GC. Four weeks on object layout (incl. JEP 450 compact headers), the generational hypothesis, the GC family tree (Serial → Parallel → CMS (removed in JDK 14) → G1 → ZGC → Shenandoah → Generational ZGC), container-aware heap sizing, and JFR-driven tuning.
What's unique here: the menu. Java is the only mainstream platform where picking a GC is a real design decision - Generational ZGC for sub-ms pauses on multi-GB heaps, G1 for general-purpose throughput, Parallel for batch, Serial for tiny containers. Every other path has one collector and lives with it.
The trap: assuming -Xmx is the memory budget. It is not. Direct memory, metaspace, code cache, GC overhead, and thread stacks all live outside the heap. Week 11 walks the full accounting.
Go - the world's least-configurable GC, on purpose¶
Month 2 - Memory & GC. Four weeks on the heap/stack split via escape analysis, the tricolor concurrent mark-sweep collector, write barriers, GOGC and GOMEMLIMIT (since 1.19), and the deliberate non-generational design.
What's unique here: Go's GC has two knobs (GOGC, GOMEMLIMIT) and refuses to add a third. The design ethic is "tune the allocation rate, not the collector." Most Go GC bugs are allocation bugs - escape-analysis failures, pointer-rich data structures, accidental interface boxing.
The trap: assuming "Go has no generational GC because generational GC isn't worth it." The actual reason is that Go's compiler aggressively stack-allocates short-lived values, so the youngest generation effectively is the stack. The hypothesis still holds; the implementation just lives in a different place.
Python - refcount first, generational second¶
Month 3 - Runtime & Performance. CPython is reference-counted with a generational cycle collector on top. Every object has a refcount field; when it hits zero, deallocation is immediate. The cycle collector exists only to break unreachable reference cycles that refcounting alone leaks.
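That immediacy is easy to observe. A minimal sketch: `sys.getrefcount` shows the count, and a `weakref.finalize` callback fires the instant the last reference dies — no collector pass involved.

```python
import sys
import weakref

class Node:
    pass

n = Node()
# getrefcount reports one extra reference (its own argument),
# so a sole reference typically shows as 2.
print(sys.getrefcount(n))

freed = []
weakref.finalize(n, freed.append, "freed")
del n          # refcount hits zero -> deallocation happens right here
print(freed)   # ['freed'] -- no GC run was needed
```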
What's unique here: GC pauses are predictable (refcount drops are deterministic) until they aren't (a large object with many transitive references triggers a cascade on one decref). The "free-threaded" CPython work (PEP 703, going stable in 3.14) changes the cost model again - atomic refcount ops dominate cache traffic on multi-core.
The trap: assuming Python "doesn't have GC pauses." It does - they're just usually small and frequent rather than large and rare.
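The leak that refcounting alone cannot fix is a two-line demo with the `gc` module: build a cycle, drop it, and watch the cycle collector — not refcounting — reclaim it.

```python
import gc

class Box:
    def __init__(self):
        self.other = None

a, b = Box(), Box()
a.other, b.other = b, a   # reference cycle: neither refcount can reach zero
del a, b                  # unreachable now, but refcounting alone leaks the pair
found = gc.collect()      # cycle collector breaks the cycle; returns the
print(found)              # number of unreachable objects it found (>= 2 here)
```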
Linux kernel - manual allocators, not GC, but the same problems¶
Month 2 - Memory & Scheduling. No tracing GC, but the same fragmentation, locality, and pause problems show up in the buddy allocator (page-level) and slab/slub/slob allocators (object-level). Plus reverse-mapping, RCU reclamation, and the page cache lifecycle.
What's unique here: kernel allocation cannot fail-and-retry. Every allocator path has a fallback strategy (GFP flags), and the wrong choice deadlocks the kernel. RCU is garbage collection in disguise - readers are wait-free, writers defer reclamation until all readers exit a grace period.
The trap: thinking "this isn't GC." RCU absolutely is GC, just with a different reclamation trigger.
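The grace-period pattern fits in a few lines. This toy (hypothetical `ToyRCU` class, single writer; readers take a lock, so it is deliberately *not* wait-free like real RCU) shows only the core deferral: publish the new version immediately, reclaim the old one only after every pre-existing reader has exited.

```python
import threading

class ToyRCU:
    """Toy sketch of RCU-style deferred reclamation (NOT real RCU:
    real readers take no locks and are wait-free)."""
    def __init__(self, value):
        self._value = value                  # the published version
        self._readers = 0                    # readers inside a critical section
        self._lock = threading.Lock()
        self._no_readers = threading.Condition(self._lock)

    def read(self):
        with self._lock:
            self._readers += 1               # enter read-side critical section
        try:
            return self._value               # snapshot of the current version
        finally:
            with self._lock:
                self._readers -= 1
                if self._readers == 0:
                    self._no_readers.notify_all()

    def update(self, new_value, reclaim):
        old = self._value
        self._value = new_value              # publish: new readers see this
        with self._lock:                     # grace period: wait out readers
            while self._readers:             # that started before the swap
                self._no_readers.wait()
        reclaim(old)                         # now safe to free the old version
```

Usage: `ToyRCU([1, 2]).update([3], free_fn)` calls `free_fn([1, 2])` only once no reader can still hold the old list.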
The contrasts that teach¶
| Axis | Java | Go | Python | Linux |
|---|---|---|---|---|
| Mechanism | Tracing | Tracing | Refcount + tracing for cycles | Manual + RCU |
| Generational | Yes (G1, Gen ZGC) | No (stack-allocates instead) | Yes (3 generations) | N/A |
| Concurrent | Yes (G1 mostly, ZGC fully) | Yes (tricolor) | No (refcount is sync) | Yes (RCU) |
| Compacting | Yes (G1, ZGC) | No | No | N/A |
| Pause profile | Sub-ms (ZGC) to ~100ms (G1) | <1ms typical | Many small | No STW (direct-reclaim stalls instead) |
| Tuning surface | Dozens of -XX: flags | Two knobs | Cycle thresholds only | GFP flags, drop_caches |
| Failure mode | Long pauses on bad sizing | Allocation-rate spikes | Cycle leaks, refcount thrash | OOM-killer, fragmentation |
The single most clarifying read across these: Java's Generational ZGC + Go's tricolor concurrent collector side-by-side. Same problem (concurrent reclamation without stopping mutators), two completely different solutions (colored pointers + load barriers vs. write barriers + assist credit).
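The shared core of both collectors is the tricolor abstraction itself. A stop-the-world sketch of it (real collectors run this concurrently with mutators, which is exactly why they need barriers — the heap below never mutates mid-mark):

```python
def tricolor_mark(roots, heap):
    """heap: dict mapping object name -> names it references.
    White = not yet seen, gray = queued for scanning, black = scanned."""
    gray = list(roots)           # roots start gray
    black = set()
    while gray:
        obj = gray.pop()
        if obj in black:
            continue
        black.add(obj)           # scanned: all its children are now queued
        for child in heap.get(obj, []):
            if child not in black:
                gray.append(child)
    return black                 # live set; anything still white is garbage

heap = {"a": ["b"], "b": ["c"], "c": [], "d": ["d"]}  # "d": unreachable cycle
live = tricolor_mark({"a"}, heap)
print(sorted(live))              # ['a', 'b', 'c'] -- "d" stays white
```

The invariant both ZGC and Go protect is the same: no black object may point to a white one. Colored pointers catch violations on loads; Go's write barriers catch them on stores.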
What to read first¶
Pick by your current work:
- You debug GC pauses in production today → Java Month 3 weeks 10–12. Most directly applicable; the JFR workflow transfers to any JVM-shaped problem.
- You write services that need to scale memory cheaply → Go Month 2 plus Java week 9 (object layout). The contrast is the lesson.
- You write Python at scale → Python Month 3, then come back to Java's Month 3 for the generational-hypothesis derivation. Python's gen collector is the same idea, smaller and weirder.
- You hack the kernel → Linux Month 2, then read Go's tricolor section. RCU and tricolor solve adjacent problems.