Garbage collection

Why it matters

Every memory-safe runtime makes the same trade and resolves it differently. Tracing vs. reference counting. Generational vs. flat. Stop-the-world vs. concurrent. Compacting vs. non-compacting. Region-based vs. free-list. A working systems engineer can name where any production GC lands on each axis - and predict the failure mode that comes from that choice.

The four paths below show four resolutions of the same trade. Read at least two; the contrast is where the learning lives.


The lens, per path

Java - the most engineered GC ecosystem in existence

Month 3 - Memory & GC. Four weeks on object layout (incl. JEP 450 compact headers), the generational hypothesis, the GC family tree (Serial → Parallel → CMS (removed in JDK 14) → G1 → ZGC → Shenandoah → Generational ZGC), container-aware heap sizing, and JFR-driven tuning.

What's unique here: the menu. Java is the only mainstream platform where picking a GC is a real design decision - Generational ZGC for sub-ms pauses on multi-GB heaps, G1 for general-purpose throughput, Parallel for batch, Serial for tiny containers. Every other path has one collector and lives with it.

The trap

assuming -Xmx is the memory budget. It is not. Direct memory, metaspace, code cache, GC overhead, and thread stacks all live outside the heap. Week 11 walks through the full accounting.

Go - the world's least-configurable GC, on purpose

Month 2 - Memory & GC. Four weeks on the heap/stack split via escape analysis, the tricolor concurrent mark-sweep collector, write barriers, GOGC and GOMEMLIMIT (since 1.19), and the deliberate non-generational design.

What's unique here: Go's GC has two knobs (GOGC, GOMEMLIMIT) and refuses to add a third. The design ethic is "tune the allocation rate, not the collector." Most Go GC bugs are allocation bugs - escape-analysis failures, pointer-rich data structures, accidental interface boxing.

The trap

assuming "Go has no generational GC because generational GC isn't worth it." The actual reason is that Go's compiler aggressively stack-allocates short-lived values, so the youngest generation effectively is the stack. The hypothesis still holds; the implementation just lives in a different place.

Python - refcount first, generational second

Month 3 - Runtime & Performance. CPython is reference-counted with a generational cycle collector on top. Every object has a refcount field; when it hits zero, deallocation is immediate. The cycle collector exists only to break unreachable reference cycles that refcounting alone leaks.

What's unique here: GC pauses are predictable (refcount drops are deterministic) until they aren't (a large object with many transitive references triggers a cascade on one decref). The "free-threaded" CPython work (PEP 703, going stable in 3.14) changes the cost model again - atomic refcount ops dominate cache traffic on multi-core.

The trap

assuming Python "doesn't have GC pauses." It does - they're just usually small and frequent rather than large and rare.
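The refcount-first, cycles-second split above is directly observable from the standard library's gc and weakref modules. A minimal demo (class name Node is just illustrative):

```python
import gc
import weakref

gc.disable()  # keep the cycle collector out of the way so only refcounting acts

class Node:
    pass

# 1. Refcount hits zero -> deallocation is immediate, no collector run needed.
a = Node()
a_ref = weakref.ref(a)
del a
freed_immediately = a_ref() is None      # True: gone the moment the last ref died

# 2. A reference cycle never hits refcount zero; only the cycle collector frees it.
b = Node()
b.self_ref = b                           # cycle: b references itself
b_ref = weakref.ref(b)
del b
leaked_after_del = b_ref() is not None   # True: refcounting alone leaks the cycle
gc.collect()                             # cycle collector breaks the cycle
freed_after_collect = b_ref() is None    # True

gc.enable()
print(freed_immediately, leaked_after_del, freed_after_collect)  # True True True
```

The first case is the "predictable" pause profile; the second is exactly the leak the generational cycle collector exists to catch.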

Linux kernel - manual allocators, not GC, but the same problems

Month 2 - Memory & Scheduling. No tracing GC, but the same fragmentation, locality, and pause problems show up in the buddy allocator (page-level) and slab/slub/slob allocators (object-level). Plus reverse-mapping, RCU reclamation, and the page cache lifecycle.

What's unique here: kernel allocation cannot fail-and-retry. Every allocator path has a fallback strategy (GFP flags), and the wrong choice deadlocks the kernel. RCU is garbage collection in disguise - readers are wait-free, writers defer reclamation until all readers exit a grace period.
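The defer-until-grace-period idea is easier to see stripped of kernel machinery. Below is a single-threaded toy model (no real concurrency, no memory barriers; the class and method names mirror the kernel's read_lock / call_rcu vocabulary but are invented for this sketch): a writer's reclamation callback runs only once every reader that was active at publication time has exited its critical section.

```python
class ToyRCU:
    """Toy grace-period model: frees are deferred until all readers that were
    active when the free was requested have finished (a 'grace period')."""

    def __init__(self):
        self.active_readers = set()
        self.deferred = []        # (readers we are waiting on, callback)
        self._next_reader_id = 0

    def read_lock(self):
        rid = self._next_reader_id
        self._next_reader_id += 1
        self.active_readers.add(rid)
        return rid

    def read_unlock(self, rid):
        self.active_readers.discard(rid)
        self._run_ready()

    def call_rcu(self, callback):
        # Reclaim only after every reader currently in a critical section exits.
        self.deferred.append((set(self.active_readers), callback))
        self._run_ready()

    def _run_ready(self):
        still_waiting = []
        for waiting_on, cb in self.deferred:
            remaining = waiting_on & self.active_readers
            if remaining:
                still_waiting.append((remaining, cb))
            else:
                cb()  # grace period elapsed: safe to reclaim
        self.deferred = still_waiting

freed = []
rcu = ToyRCU()
r = rcu.read_lock()
rcu.call_rcu(lambda: freed.append("old-node"))
before_unlock = list(freed)   # [] : reader r is still inside its critical section
rcu.read_unlock(r)
print(before_unlock, freed)   # [] ['old-node']
```

This is the "GC in disguise" claim in miniature: reachability is approximated by "no reader that could still see the old version is running," rather than by tracing.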

The trap

thinking "this isn't GC." RCU absolutely is GC, just with a different reclamation trigger.


The contrasts that teach

| Axis | Java | Go | Python | Linux |
|---|---|---|---|---|
| Mechanism | Tracing | Tracing | Refcount + tracing for cycles | Manual + RCU |
| Generational | Yes (G1, Gen ZGC) | No (stack-allocates instead) | Yes (3 generations) | N/A |
| Concurrent | Yes (G1 mostly, ZGC fully) | Yes (tricolor) | No (refcount is sync) | Yes (RCU) |
| Compacting | Yes (G1, ZGC) | No | No | N/A |
| Pause profile | Sub-ms (ZGC) to 100ms (G1) | <1ms typical | Many small | None (no STW) |
| Tuning surface | Dozens of -XX: flags | Two knobs | gc thresholds only | GFP flags, drop_caches |
| Failure mode | Long pauses on bad sizing | Allocation-rate spikes | Cycle leaks, refcount thrash | OOM-killer, fragmentation |

The single most clarifying read across these: Java's Generational ZGC + Go's tricolor concurrent collector side-by-side. Same problem (concurrent reclamation without stopping mutators), two completely different solutions (colored pointers + load barriers vs. write barriers + assist credit).
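To make the tricolor vocabulary concrete before that read: here is the core marking loop as a stop-the-world toy in Python (white = unvisited, gray = queued, black = scanned; the graph and names are illustrative). The concurrent designs in both ZGC and Go are this exact loop plus barriers to keep it correct while mutators run.

```python
def tricolor_mark(roots, edges):
    """Return the set of reachable ('black') objects.
    edges: dict mapping each object to the objects it references."""
    gray = list(roots)   # worklist: discovered but not yet scanned
    black = set()        # scanned: all outgoing references queued
    while gray:
        node = gray.pop()
        if node in black:
            continue
        black.add(node)
        for child in edges.get(node, []):
            if child not in black:
                gray.append(child)  # shade reachable children gray
    return black  # anything never shaded stays white -> garbage

# 'D' is a self-referential cycle with no path from the root.
heap = {"A": ["B"], "B": ["C"], "C": [], "D": ["D"]}
live = tricolor_mark({"A"}, heap)
print(sorted(live))       # ['A', 'B', 'C']
print(set(heap) - live)   # {'D'}
```

The divergence the side-by-side read exposes: Go protects this invariant with write barriers on pointer stores; ZGC protects it with colored pointers checked by load barriers. Same loop, opposite interception point.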


What to read first

Pick by your current work:

  • You debug GC pauses in production today → Java Month 3 weeks 10–12. Most directly applicable; the JFR workflow transfers to any JVM-shaped problem.
  • You write services that need to scale memory cheaply → Go Month 2 plus Java week 9 (object layout). The contrast is the lesson.
  • You write Python at scale → Python Month 3, then come back to Java's Month 3 for the generational-hypothesis derivation. Python's gen collector is the same idea, smaller and weirder.
  • You hack the kernel → Linux Month 2, then read Go's tricolor section. RCU and tricolor solve adjacent problems.