08 - Memory for app developers
What this session is
About an hour. Not JVM internals - that's Java Mastery. This is the working model of Java memory that every application developer needs: where objects live, what a reference really is, how the garbage collector decides what to free, and - most importantly - how Java programs leak memory despite having a garbage collector. By the end you'll reason about object lifetimes, avoid the common leaks, and know what OutOfMemoryError is actually telling you.
"Java has garbage collection, so I don't think about memory" - the half-truth¶
It's true you never call free() or delete. The garbage collector (GC) reclaims memory automatically. But "automatic" is not "free of responsibility." Java programs absolutely leak memory, run out of it, and slow to a crawl under GC pressure. The difference from C is how you think about it: not "did I free this?" but "is something still holding a reference that shouldn't be?"
This chapter gives you that model.
Stack and heap: where things live
Java memory splits into two regions with different rules.
The stack holds method call frames. Each method call pushes a frame containing its local variables and parameters. When the method returns, its frame pops and that memory is reclaimed instantly - no GC involved. Each thread has its own stack.
The heap holds all objects (everything created with new, plus arrays). It's shared across all threads. Objects live here until the GC determines nothing references them.
The crucial distinction: a local variable on the stack often holds a reference to an object on the heap.
void example() {
    int count = 5;              // primitive: the value 5 lives on the stack
    String name = "Alice";      // reference on stack -> String object on heap
    Point p = new Point(1, 2);  // reference on stack -> Point object on heap
}   // when example() returns: stack frame popped. count, name, and p are gone.
    // The Point object is now unreferenced -> eligible for GC.
    // ("Alice" is a string literal, so the JVM's string pool still references it.)
Picture it:
STACK (per thread)              HEAP (shared)
┌─────────────────┐
│ count = 5       │             ┌──────────────┐
│ name  ──────────┼──────────>  │ "Alice"      │
│ p     ──────────┼──────────>  │ Point{1, 2}  │
└─────────────────┘             └──────────────┘
Two consequences you must internalize:
- Primitives (int, double, boolean, ...) hold their value directly. A local int is the number itself, on the stack. An Integer is a reference to an object on the heap (this is why autoboxing costs - chapter 12).
- Object variables hold references, not objects. When you write Point p2 = p1, you copy the reference - both point at the same heap object. This is the reference-types lesson from chapter 03's defensive-copy section, now with the memory picture behind it.
Point a = new Point(1, 2);
Point b = a; // b and a reference the SAME object
b.move(5, 5); // mutating through b
System.out.println(a); // a sees the change - one object, two references
What a reference is
A reference is a handle to an object on the heap - conceptually an address, though the JVM may move objects around (the GC compacts the heap), so you can't do pointer arithmetic like C. You can only: follow a reference (.field, .method()), compare references (==), assign references, and pass them.
null is a reference that points at nothing. Dereferencing it (null.field) is the NullPointerException you've met many times - the memory equivalent of following an address to nowhere.
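The operations listed above (follow, compare, assign, pass) can be seen in one small program. This is a minimal sketch: the nested Point class and the method names are made up for illustration, not taken from any library.

```java
class ReferenceDemo {
    // Hypothetical mutable class, for illustration only.
    static class Point {
        int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // The parameter is a copy of the caller's reference, so mutation is visible to the caller.
    static void moveRight(Point p) {
        p.x += 10;
    }

    // Reassigning the parameter only changes the local copy of the reference.
    static void replace(Point p) {
        p = new Point(99, 99);
    }

    // Following a null reference throws NullPointerException; report whether it happened.
    static boolean dereferencingNullThrows() {
        Point nothing = null;
        try {
            int ignored = nothing.x;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = a;                 // assign: copies the reference, not the object
        Point c = new Point(1, 2);   // same field values, different object

        System.out.println(a == b);  // true: same object
        System.out.println(a == c);  // false: two distinct objects

        moveRight(a);
        System.out.println(b.x);     // 11: b sees the mutation made through a

        replace(a);
        System.out.println(a.x);     // 11: the caller's reference is unchanged

        System.out.println(dereferencingNullThrows()); // true
    }
}
```

Note that == on references compares identity (same object), not contents.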
How the garbage collector decides what to free
The GC's job: find objects nothing references anymore, and reclaim their memory. The core concept is reachability.
An object is reachable if you can get to it by following references starting from a GC root. GC roots are the "anchors":
- Local variables on any thread's stack.
- Static fields of loaded classes.
- Active threads.
- (A few JVM-internal ones.)
Starting from the roots, the GC follows every reference, marking everything it can reach as live. Anything it can't reach is garbage - nothing in the running program can possibly use it - so its memory is reclaimed.
GC ROOTS             reachable (live)                unreachable (garbage)
┌───────────┐
│ stack var ├──────> Object A ──────> Object B       Object D <─────> Object E
│ static    ├──────> Object C                        (nothing reachable points here)
└───────────┘
Objects A, B, C are reachable from roots - kept. D and E reference each other, but nothing reachable references them, so both are collected even though each is still referenced by the other. (This is why Java can collect cycles that simple reference-counting can't.)
The practical takeaway, and the key to everything else in this chapter:
An object can be freed only once it becomes unreachable (the GC reclaims it at some later point of its choosing). A "memory leak" in Java is an object that's still reachable but will never be used again.
You don't free memory. You make objects unreachable, by dropping the references that keep them alive. When the last reference goes, the object becomes collectible.
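Reachability can be observed indirectly with a WeakReference, which does not itself keep its target alive. The sketch below builds the D/E-style cycle described above and checks whether it was collected. The Node class and method names are hypothetical, and because System.gc() is only a request, the "cycle collected" line is likely but not guaranteed to print true.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    // Hypothetical node type: each node can point at one other node.
    static class Node {
        Node other;
        final byte[] payload = new byte[1024];
    }

    // Builds two nodes that reference each other and returns only a weak
    // reference to one of them. Once this method returns, no strong
    // reference from any GC root reaches either node.
    static WeakReference<Node> makeUnreachableCycle() {
        Node d = new Node();
        Node e = new Node();
        d.other = e;
        e.other = d;
        return new WeakReference<>(d);
    }

    // Requests a collection a few times and reports whether the referent was cleared.
    // System.gc() is only a request, so a false result is possible on some JVMs.
    static boolean clearedAfterGc(WeakReference<?> ref) throws InterruptedException {
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc();
            Thread.sleep(50);
        }
        return ref.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        Node kept = new Node();   // strong reference held by a local variable (a GC root)
        WeakReference<Node> keptRef = new WeakReference<>(kept);
        WeakReference<Node> cycleRef = makeUnreachableCycle();

        System.out.println("cycle collected: " + clearedAfterGc(cycleRef));
        System.out.println("kept still alive: " + (keptRef.get() == kept));
    }
}
```

The second line always prints true: as long as the local variable kept is in use, the object is strongly reachable and cannot be collected.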
Generational GC, briefly
You don't need GC internals, but one fact shapes how you think about allocation: the JVM uses a generational collector based on a simple observation - most objects die young. A request handler creates dozens of short-lived objects that are garbage milliseconds later; a few objects (caches, config) live the whole program.
So the heap is split into a young generation (where new objects are born, collected frequently and cheaply) and an old generation (where long-lived survivors get promoted, collected rarely). This is why:
- Allocating many short-lived objects is cheap. They're born and die in the young generation, collected in fast minor GCs. Don't contort your code to avoid all allocation - the GC is optimized for exactly this pattern.
- The expensive collections are of the old generation (major/full GCs, longer pauses). Objects that get promoted there and then become garbage are the costly ones - which is exactly what leaks produce.
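If you want to see collections happening, the standard java.lang.management API reports per-collector counts. The sketch below churns through short-lived arrays and prints what the JVM reports; collector names and counts depend on the JVM and its flags, so no particular output is implied. The class and method names are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCountDemo {
    // Sums collection counts across all collectors the JVM reports.
    // getCollectionCount() returns -1 when the count is undefined, so those are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count;
        }
        return total;
    }

    // Allocates many small arrays that become garbage immediately.
    static long churn(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] shortLived = new byte[1024];
            shortLived[0] = (byte) i;
            checksum += shortLived[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        churn(5_000_000);
        long after = totalCollections();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("collections during churn: " + (after - before));
    }
}
```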
That's the entire model you need. The mechanics (G1, ZGC, regions, write barriers) are Java Mastery.
How Java programs leak memory
A garbage collector prevents use-after-free and double-free bugs. It does not prevent leaks. A leak in Java is: an object you're done with, but a reference still reaches it, so the GC can't collect it. Here are the four classic sources - learn to spot each.
1. Forgotten references in long-lived collections
The most common leak. You add to a collection that lives a long time, and never remove.
import java.util.HashMap;
import java.util.Map;

class Cache {
    // This map lives for the whole program. Anything added is reachable forever.
    private final Map<String, byte[]> entries = new HashMap<>();

    void put(String key, byte[] data) {
        entries.put(key, data); // added... but who removes? Nobody. Leak.
    }
}
Every put keeps the byte[] reachable forever, even after no one needs it. A cache without eviction is a memory leak with extra steps. Fix: bound it (an LRU via LinkedHashMap from chapter 05), evict explicitly, or use a cache library (Caffeine) that handles eviction.
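One way to bound the cache is the LinkedHashMap approach mentioned above: turn on access ordering and override removeEldestEntry. This is a minimal sketch; the BoundedCache name and the limit of 2 are arbitrary choices for the demo, and the class is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true);   // accessOrder = true: iteration goes least-recently-used first
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, byte[]> cache = new BoundedCache<>(2);
        cache.put("a", new byte[1024]);
        cache.put("b", new byte[1024]);
        cache.get("a");                      // touch "a", so "b" is now least recently used
        cache.put("c", new byte[1024]);      // exceeds the bound: "b" is evicted
        System.out.println(cache.keySet());  // [a, c]
    }
}
```

Once "b" is evicted, the map no longer references its byte[], so that array becomes collectible unless something else still holds it.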
2. Listeners and callbacks never unregistered
button.addListener(this::onClick); // registers a reference to `this`
// ... if `this` is never unregistered, the button keeps it alive forever
The event source holds a reference to your listener. If the listener (or its enclosing object) should be collected but you never removeListener, it leaks. Fix: unregister in a cleanup method, or use weak references (below).
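A sketch of the "unregister in a cleanup method" fix follows. EventSource and Screen are hypothetical stand-ins for a button and the component listening to it. One detail worth noticing: the listener is stored in a field, because evaluating this::onClick a second time is not guaranteed to yield the same object, so removing a freshly written method reference may not match what was registered.

```java
import java.util.ArrayList;
import java.util.List;

class ListenerDemo {
    // Hypothetical event source standing in for a button or other long-lived object.
    static class EventSource {
        private final List<Runnable> listeners = new ArrayList<>();
        void addListener(Runnable l) { listeners.add(l); }
        void removeListener(Runnable l) { listeners.remove(l); }
        int listenerCount() { return listeners.size(); }
        void fire() { for (Runnable l : new ArrayList<>(listeners)) l.run(); }
    }

    // A short-lived component that registers on creation and unregisters in close().
    static class Screen implements AutoCloseable {
        private final EventSource source;
        // Keep the exact listener object so close() can remove the same instance.
        private final Runnable listener = this::onClick;
        int clicks;

        Screen(EventSource source) {
            this.source = source;
            source.addListener(listener);
        }

        void onClick() { clicks++; }

        @Override
        public void close() { source.removeListener(listener); }
    }

    public static void main(String[] args) {
        EventSource button = new EventSource();
        try (Screen screen = new Screen(button)) {
            button.fire();
            System.out.println("clicks: " + screen.clicks);               // clicks: 1
        }
        System.out.println("listeners left: " + button.listenerCount());  // listeners left: 0
    }
}
```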
3. The classic: static collections
import java.util.ArrayList;
import java.util.List;

class Registry {
    // static = lives as long as the class is loaded = essentially forever.
    private static final List<Registry> ALL = new ArrayList<>();
    Registry() { ALL.add(this); } // never removed -> every Registry ever created leaks
}
Static fields are GC roots (see the list above) - reachable for as long as their class stays loaded, which usually means the program's life. Anything a static collection holds is effectively immortal. Be very deliberate about what you put in static state.
4. Unclosed resources holding buffers
A resource (stream, connection) you don't close may hold native buffers and references. This is the chapter 06 lesson with the memory angle: try-with-resources isn't just tidy, it prevents resource-and-memory leaks.
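As a reminder of the shape, here is a small sketch of try-with-resources. TrackedResource is a hypothetical class that only records whether close() ran; its buffer field stands in for whatever a real stream or connection holds internally.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ResourceDemo {
    // Hypothetical resource that records whether close() ran.
    static class TrackedResource implements AutoCloseable {
        boolean closed;
        final byte[] buffer = new byte[64 * 1024];   // stands in for an internal buffer

        @Override
        public void close() { closed = true; }
    }

    // Reads the first line; the reader is closed even if readLine() throws.
    static String firstLine(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Shows that close() runs even when the body throws.
    static boolean closedDespiteException() {
        TrackedResource resource = new TrackedResource();
        try (TrackedResource r = resource) {
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException e) {
            // by the time this catch block runs, the resource has already been closed
        }
        return resource.closed;
    }

    public static void main(String[] args) {
        System.out.println(firstLine("first\nsecond"));   // first
        System.out.println(closedDespiteException());     // true
    }
}
```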
The unifying diagnosis for all four: find the reference chain from a GC root to the object that should be dead. A heap profiler (chapter 13) shows you exactly this chain - "this 2 GB of byte arrays is retained by Cache.entries, which is held by a static field." That sentence is how every Java memory leak is solved.
Reference strength: strong, weak, soft
Not all references keep objects equally alive. Most are strong (an ordinary Point p = ...). Java offers weaker ones for cache and listener scenarios.
import java.lang.ref.WeakReference;
WeakReference<BigThing> ref = new WeakReference<>(new BigThing());
BigThing t = ref.get(); // returns the object, or null if it's been collected
- Strong reference (the default): keeps the object alive. As long as a strong reference exists, the object is never collected.
- Weak reference: does not prevent collection. If the only references to an object are weak, the GC collects it and ref.get() returns null. Used for caches and listener registries that shouldn't keep their keys alive - WeakHashMap is built on this (entries vanish when keys become otherwise-unreachable).
- Soft reference: like weak, but the GC keeps it until memory is tight, then collects it. Used for memory-sensitive caches.
You won't use these daily, but recognize them: WeakHashMap for listener registries that auto-clean, soft references for "cache this until we need the memory back." Reaching for them is a sign you're solving a real lifetime problem - use deliberately.
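The "cache this until we need the memory back" idea might look like the sketch below: a single value held through a SoftReference and rebuilt on demand if the GC has cleared it. SoftCachedValue, its loader, and the loads counter are all made up for illustration; a real cache would more likely use a library.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

class SoftCachedValue<T> {
    private final Supplier<T> loader;   // how to rebuild the value if it was collected
    private SoftReference<T> ref = new SoftReference<>(null);
    int loads;                          // counts how often the loader ran (demo only)

    SoftCachedValue(Supplier<T> loader) {
        this.loader = loader;
    }

    // Returns the cached value, reloading it if the GC has cleared the soft reference.
    synchronized T get() {
        T value = ref.get();            // take a strong reference before checking for null
        if (value == null) {
            value = loader.get();
            loads++;
            ref = new SoftReference<>(value);
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCachedValue<byte[]> table = new SoftCachedValue<>(() -> new byte[1024 * 1024]);
        byte[] first = table.get();
        byte[] second = table.get();
        System.out.println(first == second);   // true: `first` keeps the array strongly reachable
        System.out.println(table.loads);       // 1
    }
}
```

The important habit is in get(): copy the referent into a local (a strong reference) once, then test that local, rather than calling ref.get() twice and risking a collection in between.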
OutOfMemoryError: what it actually means
When the heap fills with reachable objects and the GC can't free enough, you get java.lang.OutOfMemoryError: Java heap space. It almost always means one of two things:
- A leak - reachable objects accumulating without bound (one of the four sources above). The fix is finding and breaking the reference chain.
- Genuinely too much data for the configured heap - you're trying to hold more than -Xmx allows. The fix is either more heap or processing data in chunks/streams instead of loading it all.
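For the second case, the difference between loading everything and streaming can be sketched with the standard java.nio.file.Files API. The file contents, the "ERROR" marker, and the method names are invented for the demo; the sample file is tiny, so both versions succeed here, but only the first one needs heap proportional to the file size.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class StreamingDemo {
    // Loads every line into memory at once: heap use grows with the file size.
    static long countErrorsAllAtOnce(Path file) {
        try {
            List<String> lines = Files.readAllLines(file);
            return lines.stream().filter(l -> l.contains("ERROR")).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads lines lazily: each line becomes garbage as soon as it has been checked.
    static long countErrorsStreaming(Path file) {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(l -> l.contains("ERROR")).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a small, made-up log file to a temporary location.
    static Path writeSample() {
        try {
            Path file = Files.createTempFile("sample", ".log");
            file.toFile().deleteOnExit();
            return Files.write(file, Arrays.asList(
                    "INFO start", "ERROR disk full", "INFO retry", "ERROR timeout"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeSample();
        System.out.println(countErrorsAllAtOnce(file));   // 2
        System.out.println(countErrorsStreaming(file));   // 2
    }
}
```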
The diagnostic move (chapter 13): enable -XX:+HeapDumpOnOutOfMemoryError, get a heap dump, open it in a tool (Eclipse MAT, VisualVM), and look at "what's retaining the most memory and what's the path to a GC root." That path is the bug.
Writing GC-friendly code (without obsessing)
The balance: don't micro-optimize allocation everywhere (the GC is good at short-lived objects), but be aware of the patterns that create pressure. The big ones, covered fully in chapter 12:
- Don't hold references longer than needed. Null out a reference in a long-lived object when you're done with what it points to, so the GC can reclaim it. (Don't do this for ordinary locals - they're freed when the method returns. Do it for fields of long-lived objects.)
- Bound your caches. Unbounded caches are the #1 leak.
- Prefer streaming over loading everything. Reading a 10 GB file into a List<String> will OOM; reading it line by line won't.
- Be aware of autoboxing in hot loops (chapter 12) - a million Integers is a million heap objects.
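The first bullet, dropping references held by a long-lived object, is easiest to see in an array-backed stack. In this sketch (SimpleStack is a made-up class, and slotAboveTopIsNull exists only so the demo can show the effect), pop() clears the slot it vacates; without that line the array would keep every popped object reachable for as long as the stack lives.

```java
import java.util.Arrays;

class SimpleStack<T> {
    private Object[] slots = new Object[8];
    private int size;

    void push(T item) {
        if (size == slots.length) slots = Arrays.copyOf(slots, size * 2);
        slots[size++] = item;
    }

    @SuppressWarnings("unchecked")
    T pop() {
        if (size == 0) throw new IllegalStateException("stack is empty");
        T item = (T) slots[--size];
        slots[size] = null;   // drop the stale reference so the popped object can be collected
        return item;
    }

    int size() { return size; }

    // Demo helper: is the slot just above the top of the stack cleared?
    boolean slotAboveTopIsNull() {
        return size == slots.length || slots[size] == null;
    }

    public static void main(String[] args) {
        SimpleStack<String> stack = new SimpleStack<>();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop());                 // b
        System.out.println(stack.size());                // 1
        System.out.println(stack.slotAboveTopIsNull());  // true
    }
}
```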
Try it
1. See reachability. Create an object, assign it to two references, null one - it stays alive (the other reference). Null both - now it's collectible. You can't observe the collection directly, but reason through each step: at which line does the object become unreachable?
2. Build a leak. Write a Cache with an unbounded static Map. In a loop, put a million 1 KB byte arrays with unique keys. Watch memory climb (totalMemory() minus freeMemory() on Runtime.getRuntime()), and eventually OutOfMemoryError (cap the heap with a small -Xmx, such as -Xmx64m, so it arrives quickly). Then bound it with an LRU LinkedHashMap (chapter 05) and watch memory stabilize. Feel the difference between "reachable forever" and "evicted."
3. WeakHashMap demo. Put entries in a WeakHashMap<Key, Value> keyed by objects you hold strong references to. Print the size. Null your strong references to the keys, call System.gc() (a hint, not a guarantee), and print the size again - entries vanish because nothing else reaches the keys. Compare with a normal HashMap where they persist.
4. Reference vs value semantics. Make a mutable Counter object. Assign a = new Counter(); b = a; b.increment(); and print a. Confirm both references see the change (one object). Then do the same with an int and confirm independence (value copy). This is the stack/heap distinction in action.
5. Heap dump on OOM. Run the leak from #2 with -Xmx64m -XX:+HeapDumpOnOutOfMemoryError. Open the resulting .hprof in VisualVM or Eclipse MAT. Find the dominator (HashMap/byte[]) and the path to the GC root. This is exactly how production leaks get solved.
What you might wonder
"Should I call System.gc()?" Almost never. It's a hint the JVM can ignore, it forces a full (expensive) collection, and needing it usually signals a design problem. The GC runs when it needs to. The only legitimate uses are niche (some benchmarking, some demos like exercise 3). In application code, don't.
"Does setting a variable to null help?" For local variables, no - they're freed when the method returns; nulling them early is noise. For fields of long-lived objects, sometimes yes - if a long-lived object holds a reference to something big it no longer needs, nulling that field lets the GC reclaim the big thing sooner. The JDK does this in a few places (ArrayList.clear nulls its slots). Don't sprinkle x = null everywhere; do it deliberately when a long-lived object outlives its need for a large referent.
"What's the difference between OutOfMemoryError: heap space and : Metaspace?" Heap space = too many objects (this chapter). Metaspace = too many loaded classes (usually a classloader leak in app servers that redeploy repeatedly). Different cause, different fix; the error message tells you which.
"How big should -Xmx be?" Enough for your working set plus headroom, but not so much that full GCs become long pauses. For containers, the JVM is container-aware (since Java 10+) and sizes the heap from the container memory limit by default. Setting -Xmx to ~75% of the container limit (leaving room for stacks, metaspace, and native memory) is a common starting point - measure and adjust.
"Is the stack ever a problem?" Yes - deep or infinite recursion overflows it (StackOverflowError). Each call adds a frame; too many frames and you run out of stack. Unlike the heap, you rarely tune the stack; you fix the recursion. Default stack size (~512KB-1MB per thread) handles thousands of frames.
"Do records/immutability help memory?" Immutability (chapter 03) helps correctness and lets objects be safely shared (so you can reuse one instance instead of copying). It doesn't inherently use less memory, but sharing immutable instances instead of defensive-copying can reduce allocation. The bigger memory win from immutability is indirect: it discourages the long-lived-mutable-state patterns that leak.
Done
- You know the stack (frames, locals, primitives) vs heap (all objects), and that variables hold references.
- You understand reachability: objects live while reachable from a GC root, and a leak is a still-reachable-but-unused object.
- You know the generational model enough to know short-lived allocation is cheap.
- You can identify the four leak sources: long-lived collections, unregistered listeners, static state, unclosed resources.
- You know strong/weak/soft references and WeakHashMap.
- You know what OutOfMemoryError means and how a heap dump finds the cause.
Next: the heart of this path. Concurrency I - threads, and the three problems that make shared state dangerous.