Week 3 - Stack Management
3.1 Conceptual Core
- Every goroutine has its own stack, separate from the OS thread stack. Initial size: 2 KB. Stacks grow (and shrink) dynamically. There is no fixed per-goroutine maximum until you hit the limit set by `runtime/debug.SetMaxStack` (default 1 GB on 64-bit).
- Contiguous stacks (since Go 1.3): when a goroutine needs more stack, the runtime allocates a new, larger contiguous region, copies the old stack into it, and rewrites all internal pointers. This is what the compiler-emitted "stack guard" preamble enables.
- The relationship to escape analysis is direct: stack-allocated values are free; heap-allocated values cost an allocation, GC tracking, and a future scan. Go performance work is, in large part, the art of keeping values on the stack.
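To make the stack/heap split concrete, here is a minimal sketch (the function names `sum` and `newCounter` are mine, for illustration); `go build -gcflags=-m` reports the compiler's escape decisions:

```go
package main

import "fmt"

// sum's locals never outlive the call, so escape analysis
// keeps them on the goroutine stack: zero heap allocations.
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

// newCounter returns a pointer to a local, so c must outlive
// the frame: escape analysis promotes it to the heap.
func newCounter() *int {
	c := 0
	return &c
}

func main() {
	fmt.Println(sum([]int{1, 2, 3})) // stack only
	fmt.Println(*newCounter())       // one heap allocation
}
```

Building with `-gcflags=-m` prints a "moved to heap: c" diagnosis for the second function and nothing of the sort for the first.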
3.2 Mechanical Detail
- Stack growth flow (`src/runtime/stack.go`):
  - The function prologue checks `g.stackguard0` against `SP`.
  - If `SP < stackguard0`, jump to `runtime.morestack`.
  - `morestack` calls `newstack`, which allocates a new stack 2× the old size, copies the old stack over, and rewrites pointers (including pointers to local variables and function parameters).
  - Resume execution.
- Stack shrinking is performed by the GC when it observes the goroutine is using less than 1/4 of its stack.
- Pointer adjustment during copy: this is why Go does not let you take stable pointers to stack-allocated locals across goroutine boundaries: moving the stack invalidates them. Escape analysis catches this; values that escape are heap-promoted.
- Unsafe consequences: storing a `uintptr` (rather than an `unsafe.Pointer`) does not protect against stack moves; the runtime will not update the stored address. The `unsafe` package documentation makes the rules for `uintptr` conversions explicit.
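Both points can be made observable with a small experiment (a sketch of mine, not from the source; `grow` and `stackMoved` are hypothetical names). A `uintptr` snapshot of a local's address is taken, deep recursion forces a stack copy, and the snapshot is compared against the rewritten address:

```go
package main

import (
	"fmt"
	"unsafe"
)

// grow forces repeated stack growth: each frame carries a
// 128-byte buffer, so deep recursion outgrows the initial stack.
func grow(n int) {
	if n == 0 {
		return
	}
	var pad [128]byte
	_ = pad
	grow(n - 1)
}

// stackMoved records a local's address as a uintptr, triggers
// stack growth, and checks whether the frame was relocated.
// The runtime rewrote &x everywhere it tracks pointers, but the
// raw uintptr snapshot still holds the old, now-invalid address.
func stackMoved() bool {
	var x int
	before := uintptr(unsafe.Pointer(&x))
	grow(50_000)
	after := uintptr(unsafe.Pointer(&x))
	return before != after
}

func main() {
	fmt.Println("frame relocated:", stackMoved())
}
```

On current gc toolchains this reports a relocated frame: the recursion needs far more than the initial stack, so the whole stack (including `stackMoved`'s frame) is copied to a new region, and the stale `uintptr` no longer matches `&x`.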
3.3 Lab: "Stack Growth in the Wild"
- Write a recursive function: `func depth(n int) int { if n == 0 { return 0 }; var buf [256]byte; _ = buf; return 1 + depth(n-1) }`.
- Run with progressively larger `n`. Use `GODEBUG=gctrace=1,schedtrace=1000,scheddetail=1` and observe runtime activity as the stack grows.
- Re-run while taking `runtime.ReadMemStats` snapshots, recording `StackInuse` and `StackSys`.
- Now write the same function in a goroutine-per-call style and observe how stack churn changes.
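One way to wire up the measurement step, as a sketch (the loop structure, sizes, and `stackStats` helper are my choices):

```go
package main

import (
	"fmt"
	"runtime"
)

// depth is the lab's recursive function: each frame carries a
// 256-byte buffer so the stack grows quickly with n.
func depth(n int) int {
	if n == 0 {
		return 0
	}
	var buf [256]byte
	_ = buf
	return 1 + depth(n-1)
}

// stackStats snapshots the two stack-related MemStats fields:
// StackInuse (bytes in active stack spans) and StackSys
// (bytes obtained from the OS for stacks).
func stackStats() (inuse, sys uint64) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.StackInuse, m.StackSys
}

func main() {
	for _, n := range []int{1_000, 10_000, 100_000} {
		before, _ := stackStats()
		d := depth(n)
		inuse, sys := stackStats()
		fmt.Printf("n=%7d depth=%7d StackInuse=%8d StackSys=%8d (was %d)\n",
			n, d, inuse, sys, before)
	}
}
```

Note that `ReadMemStats` stops the world, so keep it out of hot paths; here it brackets each recursion to show the stack footprint ratcheting up with `n`.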
3.4 Idiomatic & golangci-lint Drill
gocritic: `deepEqualByteSlice`; plus the standalone `prealloc` linter. The latter flags ranged loops appending to a slice that could have been `make`'d with capacity up front; relevant to allocator pressure but not stack-specific.
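For the `prealloc` drill, a minimal before/after pair of the pattern it flags (the function names are hypothetical):

```go
package main

import "fmt"

// doubleAll appends in a loop with no capacity hint; prealloc
// flags this because the backing array may be reallocated and
// copied several times as the slice grows.
func doubleAll(xs []int) []int {
	var out []int
	for _, x := range xs {
		out = append(out, 2*x)
	}
	return out
}

// doubleAllPrealloc sizes the slice up front: one allocation,
// no growth copies. This is the fix prealloc suggests.
func doubleAllPrealloc(xs []int) []int {
	out := make([]int, 0, len(xs))
	for _, x := range xs {
		out = append(out, 2*x)
	}
	return out
}

func main() {
	fmt.Println(doubleAllPrealloc([]int{1, 2, 3})) // [2 4 6]
}
```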
3.5 Production Hardening Slice
- Add `runtime/debug.SetMaxStack(64 * 1024 * 1024)` (64 MiB) in your service binaries. The default of 1 GB is rarely what you want; bounding per-goroutine stacks catches runaway recursion early.