Week 12 - JFR, Heap Dumps, and Allocation Profiling
Conceptual Core
Java has the best production-grade profiler of any mainstream language: JFR (Java Flight Recorder), free and open source since JDK 11. Combined with async-profiler for CPU and allocation sampling and Eclipse MAT for heap-dump analysis, it lets you diagnose almost any memory problem post-mortem.
Mechanical Detail
- JFR:
  - Start at launch with `-XX:StartFlightRecording=duration=60s,filename=app.jfr,settings=profile`, or attach to a live process via `jcmd <pid> JFR.start name=foo duration=60s filename=foo.jfr`.
  - Open the recording in JDK Mission Control (JMC), or print events from the command line with `jfr print --events jdk.GCPhasePause app.jfr`.
  - Key event categories: GC (pauses, allocation), JIT (compilation, deopt), thread (parking, blocking), I/O, and allocation outside TLAB (a large-object signal; under G1 the biggest of these are humongous allocations).
- async-profiler (a single native package): `asprof -e cpu -d 30 -f cpu.html <pid>` for CPU and `asprof -e alloc -d 30 -f alloc.html <pid>` for allocations. A flame graph holds one event type at a time; to capture several events in one run, write JFR output instead (`-e cpu,alloc -f profile.jfr`). Flame graphs are the right default visualization.
- Heap dumps: `jcmd <pid> GC.heap_dump /tmp/heap.hprof`, or automatically on OOM with `-XX:+HeapDumpOnOutOfMemoryError`. Analyze in Eclipse MAT: "Dominator Tree" and "Leak Suspects" are the only two views you need 90% of the time.
- The pattern for a leak hunt: heap dump → MAT dominator tree → find the unexpected retainer → trace GC roots → fix.
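The `jcmd <pid> GC.heap_dump` step also has an in-process equivalent, which can be handy when the service itself should decide when to dump. The following is a minimal sketch, assuming a HotSpot JVM (the `com.sun.management.HotSpotDiagnosticMXBean` interface is HotSpot-specific); the class name `HeapDumper` and the output location are illustrative choices, not part of the course material.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: triggers a heap dump from inside the running JVM.
public class HeapDumper {

    /**
     * Writes a heap dump to {@code target}, which must end in ".hprof".
     * With live=true only reachable objects are dumped.
     */
    public static Path dump(Path target, boolean live) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap refuses to overwrite an existing file, so clear it first.
        Files.deleteIfExists(target);
        bean.dumpHeap(target.toAbsolutePath().toString(), live);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path out = dump(Path.of(System.getProperty("java.io.tmpdir"), "heap.hprof"), true);
        System.out.println("wrote " + out + " (" + Files.size(out) + " bytes)");
    }
}
```

The resulting file opens in Eclipse MAT exactly like one produced by `jcmd`.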
Lab
Write a deliberate memory leak (a static Map that accumulates request contexts). Run it, take a heap dump after some traffic, identify the leak in MAT. Then fix it, re-run, re-dump, confirm.
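One possible starting point for the lab is sketched below. Everything in it is hypothetical scaffolding: the names `LeakyService`, `RequestContext`, `handleLeaky`, and `handleFixed`, and the 64 KiB payload that is only there to make the retained size obvious in MAT.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class LeakyService {

    /** Hypothetical per-request state; the payload inflates the retained size. */
    static final class RequestContext {
        final String requestId;
        final byte[] payload = new byte[64 * 1024];

        RequestContext(String requestId) {
            this.requestId = requestId;
        }
    }

    // The leak: a static field keeps the map reachable for the lifetime of
    // the class, and the leaky handler never removes what it adds.
    public static final Map<String, RequestContext> CONTEXTS = new ConcurrentHashMap<>();

    public static void handleLeaky(String requestId) {
        CONTEXTS.put(requestId, new RequestContext(requestId));
        // ... request work would go here; the context is never removed ...
    }

    public static void handleFixed(String requestId) {
        CONTEXTS.put(requestId, new RequestContext(requestId));
        try {
            // ... request work would go here ...
        } finally {
            CONTEXTS.remove(requestId); // the fix: release the context when done
        }
    }

    public static void main(String[] args) throws Exception {
        int requests = args.length > 0 ? Integer.parseInt(args[0]) : 1_000;
        for (int i = 0; i < requests; i++) {
            handleLeaky(UUID.randomUUID().toString());
        }
        System.out.println("pid " + ProcessHandle.current().pid()
                + " retains " + CONTEXTS.size() + " contexts; press Enter to exit");
        System.in.read(); // keep the process alive so a heap dump can be taken
    }
}
```

With the leaky handler, the MAT dominator tree should show the `ConcurrentHashMap` behind `CONTEXTS` retaining most of the heap; after switching `main` to `handleFixed`, a second dump should show the map empty.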
Idiomatic Drill
Capture a 60-second JFR recording of a real local service (your week 4 lab, scaled up). Open it in JMC and identify the top three allocation sites.
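To cross-check what JMC shows, the same recording can be read with the `jdk.jfr.consumer` API. The sketch below is one rough way to do it, under two assumptions: it ranks sites by number of allocation samples rather than by bytes, and it accepts all three allocation event names because which ones are enabled depends on the JDK version and settings file. The class name `TopAllocationSites` is illustrative.

```java
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordedStackTrace;
import jdk.jfr.consumer.RecordingFile;
import java.io.IOException;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class TopAllocationSites {

    // Allocation events; older JDKs emit the TLAB pair, newer ones the sample event.
    static final Set<String> ALLOC_EVENTS = Set.of(
            "jdk.ObjectAllocationSample",
            "jdk.ObjectAllocationInNewTLAB",
            "jdk.ObjectAllocationOutsideTLAB");

    /** Returns the n most frequently sampled allocation sites (top stack frame). */
    public static List<Map.Entry<String, Long>> top(Path jfrFile, int n) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        try (RecordingFile file = new RecordingFile(jfrFile)) {
            while (file.hasMoreEvents()) {
                RecordedEvent event = file.readEvent();
                if (!ALLOC_EVENTS.contains(event.getEventType().getName())) {
                    continue;
                }
                RecordedStackTrace trace = event.getStackTrace();
                if (trace == null || trace.getFrames().isEmpty()) {
                    continue; // event recorded without a stack trace
                }
                RecordedFrame frame = trace.getFrames().get(0);
                if (frame.getMethod() == null) {
                    continue;
                }
                String site = frame.getMethod().getType().getName()
                        + "." + frame.getMethod().getName()
                        + ":" + frame.getLineNumber();
                counts.merge(site, 1L, Long::sum);
            }
        }
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        for (Map.Entry<String, Long> e : top(Path.of(args[0]), 3)) {
            System.out.println(e.getValue() + " samples  " + e.getKey());
        }
    }
}
```

Run it as `java TopAllocationSites.java app.jfr`. Sample counts are only a proxy for allocation pressure, so treat disagreements with JMC's byte-weighted view as a prompt to look closer rather than as an error.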
Production Hardening Slice
Add to hardening/: a script that starts a continuous JFR recording with chunked rotation and bounded retention (maxsize=200M,maxage=24h). This is "always-on profiling," now common practice at large Java shops.
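The script itself would typically wrap `jcmd` or a launch flag. As an alternative, the same retention limits can be set in-process through the `jdk.jfr` API; the sketch below assumes the built-in "default" settings are acceptable for always-on use, and the class name `ContinuousJfr` and recording name "continuous" are illustrative.

```java
import jdk.jfr.Configuration;
import jdk.jfr.Recording;
import java.io.IOException;
import java.nio.file.Path;
import java.text.ParseException;
import java.time.Duration;

public class ContinuousJfr {

    /** Starts an open-ended recording capped at 200 MiB and 24 hours of data. */
    public static Recording start() throws IOException, ParseException {
        // "default" is the low-overhead predefined configuration shipped with the JDK.
        Recording recording = new Recording(Configuration.getConfiguration("default"));
        recording.setName("continuous");
        recording.setToDisk(true);                    // spill chunks to the disk repository
        recording.setMaxSize(200L * 1024 * 1024);     // mirrors maxsize=200M
        recording.setMaxAge(Duration.ofHours(24));    // mirrors maxage=24h
        recording.start();
        return recording;
    }

    /** Copies whatever is currently retained to a file without stopping the recording. */
    public static Path snapshot(Recording recording, Path target) throws IOException {
        recording.dump(target);
        return target;
    }
}
```

Calling `ContinuousJfr.start()` once at service startup and `snapshot(...)` when an incident occurs gives a rolling window of the last day's events; JFR discards the oldest chunks as the limits are reached.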