
Java Intermediate

The missing middle: from writing Java to writing good Java. Design judgment, concurrency, performance awareness. Bridges From Scratch to Mastery.



Java Intermediate - From Writing Java to Writing Good Java

From "I can write a class and a for loop" to "I write concurrent, performance-aware, idiomatic Java that a senior reviewer would approve."

This is the missing middle. Java From Scratch takes you from never-coded to your first pull request. Java Mastery is the senior reference - JVM internals, JIT, GC, bytecode. Between them is a wide gap: you can write working Java, but you don't yet know which collection to reach for, how to make code thread-safe, why your program allocates so much, or how to design with interfaces instead of inheritance. This path closes that gap.

Who this is for

  • You finished Java From Scratch (or you can already write classes, methods, collections, generics, and exceptions without looking up syntax).
  • You can read most of a small Java program but couldn't yet explain why it's written the way it is.
  • You've never written multithreaded code, or you have and it scared you.

If you're still learning what a class is, start with Java From Scratch first. If you're comfortable reading hotspot/share/gc/ source, you're past this - go to Java Mastery.

What you'll need

  • A working JDK 21+ (Java From Scratch's setup chapter covers this).
  • IntelliJ IDEA Community Edition or VS Code with the Java extensions.
  • About 5 hours per week. Sized for 3-4 months.

How this path differs from From Scratch

Same teaching voice - concrete, you-directed, runnable examples, a "Try it" on every concept, a "What you might wonder" Q&A. But pitched higher: we assume you know the syntax, so we spend the time on judgment. When does composition beat inheritance? Which Map implementation? Why is this code not thread-safe? When does an allocation actually matter?

Every chapter still ends with code you run yourself. Reading without doing won't stick - that's even more true here than in the beginner path.

The pages

| # | Title | What you'll know after |
|---|---|---|
| 00 | What "intermediate" means | The map, and how to use it |
| 01 | OOP done right | Composition over inheritance; when interfaces win |
| 02 | Interfaces and abstract classes in depth | Designing with contracts |
| 03 | Equality, hashing, immutability | The equals/hashCode/compareTo contracts |
| 04 | Generics in depth | Bounded types, wildcards, the truth about erasure |
| 05 | Collections deep | Which one when, Big-O in practice, the Map family |
| 06 | Exceptions done right | Checked-vs-unchecked strategy, custom hierarchies |
| 07 | Functional Java | Lambdas, method refs, streams used well, Optional |
| 08 | Memory for app developers | Heap/stack, references, GC awareness |
| 09 | Concurrency I | Threads, and the three problems |
| 10 | Concurrency II | synchronized, locks, atomics, thread-safe collections |
| 11 | Concurrency III | Executors, futures, CompletableFuture, virtual threads |
| 12 | Performance-aware coding | Allocation, boxing, strings, when it matters |
| 13 | Profiling basics | Reading a JFR recording, finding a hot spot |
| 14 | Testing at the next level | Mocking, parameterized, property-based |
| 15 | Bridging to mastery | Reading harder code, a more ambitious contribution |

Start with What "intermediate" means.

00 - What "intermediate" means

What this session is

Ten minutes. The map for this path: what changes between beginner and intermediate, what we'll cover, and how to get the most out of it.

The shift

Beginner Java is about syntax and "does it run." You learned what a class is, how a loop works, how to catch an exception. The question was always: how do I make the compiler accept this and produce the right output?

Intermediate Java is about judgment and "is this good." The syntax is settled. Now the questions are different:

  • There are five ways to do this. Which is right, and why?
  • This works on my machine with one user. Does it still work with a thousand users hitting it at once?
  • This is correct. Is it fast enough, and if not, where's the cost?
  • This class works. Will the next person who reads it understand it, or curse my name?

Nobody writes good Java by accident. Every senior engineer you admire learned the judgment in this path the hard way - by shipping something, watching it break or slow down or confuse a teammate, and learning the rule behind the mistake. This path gives you the rules without requiring the full set of scars.

The three things that separate intermediate from beginner

1. Design judgment

A beginner makes a class because they need to group some data. An intermediate engineer asks: should this be a class or a record? Should it use inheritance or composition? Should this method take a concrete type or an interface? These choices compound. Get them right and your code stays flexible for years; get them wrong and every change fights you.

Chapters 01–07 are about design judgment.

2. Concurrency

This is the big one, and it's almost entirely absent from beginner Java. The moment your code runs on a server handling many requests at once - which is most real Java - you're writing concurrent code, whether you meant to or not. Concurrency bugs are the worst kind: they appear randomly, vanish when you look closely, and pass every test until production. You can't be a competent Java engineer without understanding threads, shared state, and the tools for managing them.

Chapters 09–11 are about concurrency. They're the heart of this path.
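
To make "appear randomly" concrete, here is a minimal sketch using only the standard library. Two threads bump a shared counter; the class and field names (RaceDemo, unsafeCount, safeCount) are made up for this illustration. The plain int version typically loses updates because count++ is a read-modify-write, while the AtomicInteger version always lands on the expected total. The concurrency chapters explain why.

```java
import java.util.concurrent.atomic.AtomicInteger;

class RaceDemo {
    static int unsafeCount = 0;                                  // shared, unprotected
    static final AtomicInteger safeCount = new AtomicInteger();  // shared, atomic

    // Runs the same task on two threads and waits for both to finish.
    static void runTwice(Runnable task) throws InterruptedException {
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
    }

    public static void main(String[] args) throws InterruptedException {
        runTwice(() -> { for (int i = 0; i < 100_000; i++) unsafeCount++; });
        runTwice(() -> { for (int i = 0; i < 100_000; i++) safeCount.incrementAndGet(); });

        // Often below 200000, and it can differ from run to run.
        System.out.println("unsafe: " + unsafeCount);
        // Always exactly 200000.
        System.out.println("safe:   " + safeCount.get());
    }
}
```

Run it a few times. If the unsafe number happens to come out right on your machine, raise the loop count; the bug is still there.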

3. Performance awareness

Not premature optimization - awareness. Knowing that a String concatenation in a loop allocates a new object every iteration. Knowing that autoboxing an int into an Integer a million times has a cost. Knowing how to find the actual hot spot with a profiler instead of guessing. You won't optimize most code - but you'll recognize the 5% that matters and you'll have the tools to fix it.

Chapters 08, 12, and 13 are about performance awareness.
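
A small sketch of the two costs named above, so you can recognize them on sight. The class and method names are invented for this example, and the comments describe what the code asks for in principle; the JIT may optimize some of it away, which is exactly why you measure before optimizing.

```java
class AllocationDemo {
    // Each += produces a brand-new String that copies everything built so far.
    static String withConcat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // One growable buffer, one String created at the very end.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    // Boxed accumulator: every += unboxes, adds, and boxes the result again.
    static long boxedSum(long n) {
        Long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Primitive accumulator: plain arithmetic, no wrapper objects involved.
    static long primitiveSum(long n) {
        long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(withConcat(5));     // 01234
        System.out.println(withBuilder(5));    // 01234
        System.out.println(boxedSum(10));      // 45
        System.out.println(primitiveSum(10));  // 45
    }
}
```

Both pairs return the same result. The difference is in how much garbage they create along the way, and it only matters when the loop is hot.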

What this path is not

  • Not JVM internals. We'll talk about the garbage collector enough to write GC-friendly code, but we won't dissect G1's remembered sets or ZGC's colored pointers. That's Java Mastery.
  • Not a framework course. No Spring, no Hibernate as protagonists. We use the standard library so the concepts transfer. Frameworks are easier once the fundamentals are solid.
  • Not exhaustive. Java is vast. This path covers the intermediate concepts you'll use weekly, not every corner of the language.

How to use this path

  1. Do the exercises. Even more than in the beginner path. Concurrency especially cannot be learned by reading - you have to run code, see it race, and fix it.
  2. Use the race detector mindset. From chapter 09 on, get in the habit of asking "what if two threads ran this at the same time?" of every piece of shared state.
  3. Read real code alongside. Pick a small, well-regarded Java library (we'll suggest some). When a chapter teaches a concept, find it in the wild.
  4. Don't rush the concurrency chapters. 09–11 are worth a week each. They're the difference between "writes Java" and "writes Java that survives production."

A note on Java versions

This path targets Java 21+ (the current LTS as you read this is 21 or 25). Where a feature is newer - virtual threads (21), structured concurrency, scoped values - it's flagged. If you're on an older JDK at work (Java 8 and 11 are still everywhere), the design and concurrency concepts all transfer; only some syntax and a few APIs differ. We note the important gaps.

What you might wonder

"I haven't finished From Scratch. Can I start here?" If you can write a class with fields and methods, use a List and a Map, write a try/catch, and define a generic method - yes. If any of those made you pause, do the relevant From Scratch chapters first. This path moves fast through anything From Scratch covered.

"Is concurrency really that important?" Yes. Almost every Java job involves servers handling concurrent requests. The frameworks hide some of it, but the moment you have shared mutable state - a cache, a counter, a connection pool - you're responsible for thread safety. It's the single most common source of "senior" Java interview questions and production incidents.

"Will this make me job-ready?" This plus From Scratch gets you to "competent mid-level Java engineer who can be trusted with real features." Mastery is for the senior/staff tier. Most working Java engineers operate at the level this path teaches.

Done

  • You know the beginner → intermediate shift: from syntax to judgment.
  • You know the three pillars: design, concurrency, performance.
  • You know how to use the path: do the exercises, especially the concurrency ones.

Next: OOP done right →

01 - OOP done right

What this session is

About an hour. In From Scratch you learned how to write a class and how to extend one with extends. This session is about when to do which - the single most consequential design judgment in object-oriented Java. By the end you'll know why experienced engineers reach for composition far more than inheritance, and when inheritance is still the right call.

The trap beginners fall into

Inheritance is the first "real OOP" tool you learn, so it's the one you reach for. You have a Vehicle, so you make Car extends Vehicle and Truck extends Vehicle. It feels powerful. Code reuse! Polymorphism! The textbook examples all use it.

Then real requirements arrive, and inheritance starts to hurt. Let's watch it happen.

Inheritance, and where it breaks

Say you're modeling employees. You start clean:

class Employee {
    String name;
    double baseSalary;

    double monthlyPay() {
        return baseSalary / 12;
    }
}

class Manager extends Employee {
    double bonus;

    @Override
    double monthlyPay() {
        return (baseSalary + bonus) / 12;
    }
}

Fine so far. Now the requirements grow:

  • Some employees are contractors (paid hourly, no annual salary).
  • Some managers are also engineers (they still code).
  • Some contractors become managers.
  • An employee can be a part-time intern who is also a contractor.

Try to model this with inheritance and you hit a wall. Manager extends Employee - but what about a contractor who manages? ContractorManager extends Manager? But contractors aren't salaried, and Manager assumed a baseSalary. Now you're overriding methods to undo behavior you inherited. You add EngineerManager, ContractorEngineer, PartTimeContractorIntern... the class count explodes combinatorially, and each new trait doubles it.

This is the core problem with inheritance: it forces a single, rigid "is-a" hierarchy onto things that have many independent traits. A person isn't one thing on a tree. They're a bundle of capabilities - paid this way, has these roles, works this schedule - and those vary independently.

Three specific pains you'll feel:

  1. The fragile base class. When Manager extends Employee, Manager depends on Employee's internals. Change Employee.monthlyPay() and you might silently break Manager - or every subclass. The base class can't evolve safely because subclasses reach into its behavior.

  2. You inherit everything, wanted or not. extends is all-or-nothing. You get every field and method of the parent, even the ones that don't make sense for you. A Stack extends Vector (a real, regretted decision in Java's own standard library) means a Stack exposes Vector's add(index, element) - letting you insert into the middle of a stack, which is nonsense.

  3. Single inheritance is a hard limit. A class can extend exactly one class. The moment something needs traits from two places, inheritance can't express it.
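
Pain 2 is easy to see for yourself. The sketch below uses the real java.util.Stack; only the demo class and method names are made up.

```java
import java.util.Stack;

class StackLeakDemo {
    // Builds a two-element stack, then abuses a method inherited from Vector.
    static Stack<String> build() {
        Stack<String> stack = new Stack<>();
        stack.push("bottom");
        stack.push("top");

        // Inherited from Vector: insert at any index, ignoring stack discipline.
        stack.add(0, "sneaked under the bottom");
        return stack;
    }

    public static void main(String[] args) {
        Stack<String> stack = build();
        System.out.println(stack.pop());    // top
        System.out.println(stack.get(0));   // sneaked under the bottom
    }
}
```

Nothing stops a caller from doing this, because Stack promised to be a Vector the moment it extended one.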

Composition: have-a instead of is-a

Composition flips the model. Instead of a class being a kind of another class, it has the pieces it needs as fields.

Here's the employee model rebuilt with composition:

import java.util.Set;

// Each capability is its own small, focused type.
interface PayStrategy {
    double monthlyPay();
}

record Salaried(double annual, double bonus) implements PayStrategy {
    public double monthlyPay() { return (annual + bonus) / 12; }
}

record Hourly(double rate, double hoursPerMonth) implements PayStrategy {
    public double monthlyPay() { return rate * hoursPerMonth; }
}

// A person HAS a pay strategy and HAS a set of roles - they don't inherit them.
class Person {
    String name;
    PayStrategy pay;
    Set<String> roles;          // "manager", "engineer", "intern" - combine freely

    Person(String name, PayStrategy pay, Set<String> roles) {
        this.name = name;
        this.pay = pay;
        this.roles = roles;
    }

    double monthlyPay() {
        return pay.monthlyPay();
    }

    boolean isManager() {
        return roles.contains("manager");
    }
}

Now the combinatorial explosion is gone:

var alice = new Person("Alice",
        new Salaried(180_000, 30_000),
        Set.of("manager", "engineer"));

var bob = new Person("Bob",
        new Hourly(95, 160),
        Set.of("contractor", "manager"));     // a contractor who manages - no problem

A contractor-manager-engineer is just a Person with an Hourly pay strategy and three roles. No new class. Each trait varies independently because each is a separate field, not a fixed position on a tree.

This is the principle, and it's old enough to be a proverb:

Favor composition over inheritance.

It's one of the two core design principles stated in the "Gang of Four" Design Patterns book (the other is "program to an interface, not an implementation") and the most-cited piece of OOP advice for a reason. Composition gives you flexibility (mix traits freely), safety (each piece is independent - changing Hourly can't break Salaried), and testability (you can test Salaried.monthlyPay() in isolation).

When inheritance is right

This isn't "never use inheritance." It's "use it deliberately, for the cases it fits." Inheritance is the right tool when:

  1. There's a genuine, stable "is-a" relationship that won't grow new dimensions. A SavingsAccount truly is an Account. A Circle truly is a Shape. If the hierarchy is shallow and the "is-a" is real and unlikely to need cross-cutting traits, inheritance is clean.

  2. You're implementing a framework's extension point. When you write class MyServlet extends HttpServlet or class MyTest extends TestCase, you're plugging into a contract the framework defined. That's inheritance used as intended.

  3. You want to share a partial implementation via an abstract base. An AbstractList provides the plumbing so concrete lists only implement a few methods. (We'll cover abstract classes in chapter 02.)

The test: if you're overriding methods to remove or undo inherited behavior, inheritance is the wrong tool. That's the code telling you the "is-a" relationship is a lie.

The third tool: program to interfaces

Composition pairs with a second habit: depend on interfaces, not concrete classes. Notice that Person holds a PayStrategy (an interface), not a Salaried or Hourly (concrete types). That one choice means:

  • Person doesn't know or care how pay is calculated. New pay schemes (Commission, Equity) drop in without touching Person.
  • You can test Person with a fake PayStrategy that returns a fixed number.
  • The dependency points at a stable contract, not a volatile implementation.
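
Here is a sketch of the first two bullets in action. PayStrategy and Person are repeated in trimmed form so the snippet runs on its own; the Commission record and its fields are a hypothetical pay scheme invented for this example.

```java
import java.util.Set;

class PayDemo {
    public static void main(String[] args) {
        // A new pay scheme drops in; Person is not edited.
        Person carol = new Person("Carol",
                new Commission(2_000, 40_000, 0.25),
                Set.of("sales"));
        System.out.println(carol.monthlyPay());   // 12000.0

        // A fake for tests: a lambda works because PayStrategy has one abstract method.
        Person stub = new Person("Stub", () -> 1_000.0, Set.of());
        System.out.println(stub.monthlyPay());    // 1000.0
    }
}

interface PayStrategy {
    double monthlyPay();
}

// Hypothetical scheme: a fixed monthly base plus a cut of that month's sales.
record Commission(double monthlyBase, double monthlySales, double rate) implements PayStrategy {
    public double monthlyPay() { return monthlyBase + monthlySales * rate; }
}

class Person {
    final String name;
    final PayStrategy pay;
    final Set<String> roles;

    Person(String name, PayStrategy pay, Set<String> roles) {
        this.name = name;
        this.pay = pay;
        this.roles = roles;
    }

    double monthlyPay() {
        return pay.monthlyPay();
    }
}
```

Person compiled once and never learned that Commission exists. That is what "the dependency points at a stable contract" buys you.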

You met interfaces in From Scratch as "a contract any type can satisfy." The intermediate habit is to reach for them by default when one object needs to use another. We'll go deep on interface design in chapter 02.

Try it

  1. Feel the explosion. Sketch (on paper or in code) a Notification system using inheritance: EmailNotification, SmsNotification, then add "urgent" and "scheduled" as variations. Watch the class count: UrgentEmailNotification, ScheduledSmsNotification... Count how many classes you need for 3 channels × 2 urgencies × 2 timings.

  2. Refactor to composition. Rebuild it: a Notification that has a Channel (interface: Email, Sms, Push), a Priority field, and a Schedule field. How many types now? How do you add a new channel?

  3. Spot the lie. Find this in the wild or write it: a subclass that overrides a method to throw UnsupportedOperationException ("this operation doesn't apply to me"). That's the canonical sign of inheritance abuse - the subclass is rejecting part of what it inherited. Rewrite it with composition so the unsupported operation simply isn't there.

  4. Read Java's own mistake. Look up java.util.Stack (it extends Vector). Notice it inherits get(int), add(int, E), remove(int) - operations that violate stack semantics. Then look at ArrayDeque, the recommended replacement: you use it through the Deque interface (push, pop, peek), and it has no index-based methods because it never inherited a list's API. This is the standard library admitting the lesson.

What you might wonder

"So inheritance is bad?" No - overused inheritance is bad. It's a precise tool for genuine, stable is-a relationships and framework extension points. The problem is reaching for it as the default reuse mechanism. Default to composition; use inheritance when the is-a is real.

"Isn't composition more boilerplate?" Sometimes slightly more upfront - you write a field and delegate a method instead of getting it free from extends. But it pays back fast: independent pieces, no fragile base class, easy testing, free mixing of traits. Records and modern Java keep the boilerplate small. The tradeoff is almost always worth it.

"What about default methods on interfaces - isn't that inheritance?" Interfaces can provide default method implementations (since Java 8), which is a limited form of shared behavior. It's safer than class inheritance - interfaces have no instance fields, so there's no fragile-base-class state problem - and a class can implement many interfaces. We'll cover this in chapter 02.

"How does this relate to 'SOLID'?" "Favor composition over inheritance" supports several SOLID principles at once - especially the Open/Closed Principle (extend behavior by adding new strategy implementations, not by editing existing classes) and Dependency Inversion (depend on the PayStrategy abstraction, not concrete pay types). You don't need to memorize SOLID as an acronym; you need the habits, and this is the central one.

Done

  • You know why inheritance breaks down: rigid single hierarchy, fragile base class, all-or-nothing reuse.
  • You know composition: model traits as fields (often interface-typed) that vary independently.
  • You know the rule - favor composition over inheritance - and the cases where inheritance still wins.
  • You know the tell: overriding to undo inherited behavior means you picked the wrong tool.

This is the foundation for everything in the design half of this path. Next we go deep on the contracts themselves: interfaces and abstract classes.

Next: Interfaces and abstract classes in depth →

02 - Interfaces and abstract classes in depth

What this session is

About ninety minutes. Chapter 01 told you to "program to interfaces." This session is the deep version: what interfaces really are, every feature they've grown (default methods, static methods, private methods, constants), what abstract classes add, exactly when to choose one over the other, and the design habits that make interfaces powerful instead of noise. By the end this is your reference for every "should this be an interface or a class?" decision you'll make.

The mental model: an interface is a promise

A class says what something is and how it works. An interface says what something can do - nothing about how. It's a promise: "any type that implements me guarantees these methods exist."

interface Switch {
    void turnOn();
    void turnOff();
    boolean isOn();
}

That's a promise with three clauses. A LightBulb, a Server, a Valve - completely unrelated things - can all make this promise. Code that depends on Switch works with all of them and never needs to know which it's holding:

void cycle(Switch s) {
    s.turnOn();
    System.out.println("on? " + s.isOn());
    s.turnOff();
}

cycle works on anything switchable, forever, including types that don't exist yet. That decoupling - the caller depends on the promise, not the implementation - is the entire point.

Implementing an interface

A class promises to fulfill an interface with implements, then provides every method:

class LightBulb implements Switch {
    private boolean on = false;

    public void turnOn()    { on = true; }
    public void turnOff()   { on = false; }
    public boolean isOn()   { return on; }
}

Two rules that bite beginners:

  1. Interface methods are implicitly public. When you implement them, you must write public explicitly - leaving it off narrows the visibility, which the compiler rejects.
  2. You must implement every method (unless your class is abstract - see below). Miss one and the class won't compile.

A class can implement many interfaces - this is the superpower inheritance lacks:

class SmartBulb implements Switch, Dimmable, NetworkConnected {
    // must fulfill all three promises
}

A SmartBulb is switchable and dimmable and network-connected - three independent capabilities, combined freely. Try that with single inheritance and you can't.

Default methods: behavior on an interface

Since Java 8, an interface can provide a method body with the default keyword. Implementing classes inherit it for free but may override it.

interface Switch {
    void turnOn();
    void turnOff();
    boolean isOn();

    // A default method built from the abstract ones.
    default void toggle() {
        if (isOn()) turnOff();
        else turnOn();
    }
}

Every Switch now has toggle() without writing it. LightBulb, Server, all of them - they get toggle() for free, defined once.

Why default methods exist: they let an interface grow without breaking every existing implementation. Before Java 8, adding a method to an interface broke every class that implemented it (suddenly missing a method). default methods were added specifically so the JDK could add methods like List.sort() and Collection.stream() to interfaces that millions of classes already implemented. The new method has a default body, so old implementations keep compiling.

Use them for: convenience methods derived from the core (abstract) ones, like toggle() above, or Comparator's reversed() and thenComparing(). Don't use them to smuggle in lots of stateful behavior - interfaces have no instance fields, so default methods can only work through the other methods. That limit is a feature; it keeps interfaces honest.
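
You can see the JDK's own default methods at work in a few lines. In this sketch the Player record and the ranked method are invented for illustration; comparingInt, reversed, thenComparing, and List.sort are the real JDK methods.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ComparatorDemo {
    record Player(String name, int score) {}

    // Returns names ordered by score (highest first), ties broken by name.
    static List<String> ranked(List<Player> players) {
        // comparingInt is a static factory on Comparator;
        // reversed() and thenComparing() are default methods built on compare().
        Comparator<Player> order =
                Comparator.comparingInt(Player::score)
                          .reversed()
                          .thenComparing(Player::name);

        List<Player> sorted = new ArrayList<>(players);
        sorted.sort(order);   // List.sort is itself a default method on List
        return sorted.stream().map(Player::name).toList();
    }

    public static void main(String[] args) {
        List<Player> players = List.of(
                new Player("Cy", 30),
                new Player("Ann", 10),
                new Player("Bob", 30));
        System.out.println(ranked(players));   // [Bob, Cy, Ann]
    }
}
```

A Comparator implementation only has to supply compare(). Everything chained after it comes from default methods defined once on the interface.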

Static methods on interfaces

Interfaces can also hold static methods - usually factories or helpers related to the type:

interface Switch {
    void turnOn();
    void turnOff();
    boolean isOn();

    // Static factory: makes a Switch that does nothing. Useful for tests/defaults.
    static Switch noop() {
        return new Switch() {
            public void turnOn() {}
            public void turnOff() {}
            public boolean isOn() { return false; }
        };
    }
}

// Call it on the interface itself:
Switch s = Switch.noop();

You've used these without noticing: List.of(...), Map.of(...), Comparator.comparing(...), Path.of(...) are all static methods on interfaces. They're the modern idiom for "give me a ready-made instance of this type."

Private methods on interfaces

Since Java 9, interfaces can have private methods - used only to share code between default methods, hidden from implementers:

interface Logger {
    void write(String line);

    default void info(String msg)  { write(format("INFO", msg)); }
    default void warn(String msg)  { write(format("WARN", msg)); }
    default void error(String msg) { write(format("ERROR", msg)); }

    // Private helper - not part of the public promise, just shared plumbing.
    private String format(String level, String msg) {
        return "[" + level + "] " + msg;
    }
}

format isn't part of the contract - implementers never see it. It just removes duplication among the three default methods. Reach for private interface methods only when two or more default methods share logic.

Constants on interfaces (and why to be wary)

Fields in an interface are implicitly public static final - constants:

interface Physics {
    double SPEED_OF_LIGHT = 299_792_458;   // public static final, automatically
}

This works, but avoid the "constant interface" antipattern - making an interface whose only purpose is to hold constants, then implementing it to "get" them. It pollutes your type's public API with constants and abuses implements (which should mean "I fulfill this contract," not "I want these numbers"). Put constants in a final class with a private constructor, or an enum, instead. Constants directly relevant to an interface's methods are fine; a bag of unrelated constants is not.

Abstract classes: a partial implementation

An abstract class sits between an interface (pure promise) and a concrete class (full implementation). It can have everything a class has - fields, constructors, concrete methods - plus abstract methods with no body that subclasses must fill in. You can't instantiate it directly.

abstract class AbstractSwitch implements Switch {
    private boolean on = false;          // an interface can't have this field

    // Concrete: shared state management, written once.
    public boolean isOn()  { return on; }
    public void turnOn()   { on = true;  onChanged(); }
    public void turnOff()  { on = false; onChanged(); }

    // Abstract: each subclass provides the device-specific reaction.
    protected abstract void onChanged();
}

class Relay extends AbstractSwitch {
    protected void onChanged() {
        System.out.println("relay clicked, now " + (isOn() ? "closed" : "open"));
    }
}

AbstractSwitch handles the on field and the bookkeeping; subclasses only supply onChanged(). This is the template method pattern: the base class defines the skeleton of an operation and defers specific steps to subclasses. The JDK uses it everywhere - AbstractList, AbstractMap, InputStream all provide most methods and leave a few abstract.
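
To see the same pattern from the JDK's side, here is a sketch of a tiny InputStream subclass. RepeatStream and the drain helper are hypothetical names for this example; InputStream.read() and readAllBytes() (Java 9+) are real. The subclass fills in the one abstract step and inherits the rest of the skeleton.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class TemplateDemo {
    // Reads a stream to the end using a method InputStream already provides.
    static String drain(InputStream in) {
        try (in) {
            return new String(in.readAllBytes(), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(drain(new RepeatStream('a', 5)));   // aaaaa
    }
}

// Hypothetical stream that yields the same byte a fixed number of times.
class RepeatStream extends InputStream {
    private final int value;
    private int remaining;

    RepeatStream(char c, int count) {
        this.value = c;
        this.remaining = count;
    }

    // The one abstract method on InputStream: return the next byte, or -1 at the end.
    @Override
    public int read() {
        if (remaining <= 0) return -1;
        remaining--;
        return value;
    }
}
```

RepeatStream never mentions buffers or arrays, yet readAllBytes() works, because the base class builds it on top of read().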

What an abstract class can do that an interface cannot:

  • Hold instance fields (mutable state like on above).
  • Have constructors (to initialize that state).
  • Have protected or package-private members (interface members are public, or private since Java 9).

What an interface can do that an abstract class cannot:

  • Be implemented by a class that already extends something else (a class extends one class but implements many interfaces).

The decision: interface or abstract class?

Here's the rule that resolves almost every case:

Default to an interface. Reach for an abstract class only when you need to share mutable state or constructor logic across implementations - and even then, consider composition first.

The longer reasoning:

| Question | Lean interface | Lean abstract class |
|---|---|---|
| Is it purely a capability/contract? | yes → interface | |
| Do implementers need to also extend something else? | yes → interface (they can't extend two classes) | |
| Is there shared mutable state (fields)? | | yes → abstract class |
| Is there constructor logic all subtypes need? | | yes → abstract class |
| Will there be many unrelated implementers? | yes → interface | |
| Is this a framework extension point with lots of plumbing? | | often abstract class |

A powerful middle path the JDK uses: ship both. Define the interface (List), and provide an abstract skeleton (AbstractList) that implementers may extend for convenience but aren't forced to. Implementers who already extend something else implement List directly; everyone else extends AbstractList and saves work. You get the flexibility of an interface and the convenience of shared code.
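
Here is what "extends AbstractList and saves work" looks like in practice. IntRange is a hypothetical read-only list invented for this sketch; it writes two methods and inherits the rest of the List contract from the skeleton.

```java
import java.util.AbstractList;
import java.util.List;

class SkeletonDemo {
    public static void main(String[] args) {
        List<Integer> range = new IntRange(3, 7);
        System.out.println(range);              // [3, 4, 5, 6]
        System.out.println(range.contains(5));  // true
        System.out.println(range.indexOf(6));   // 3
    }
}

// Hypothetical read-only list of the integers in [from, to).
class IntRange extends AbstractList<Integer> {
    private final int from;
    private final int to;

    IntRange(int from, int to) {
        if (to < from) throw new IllegalArgumentException("to must be >= from");
        this.from = from;
        this.to = to;
    }

    // The two methods AbstractList needs for an unmodifiable list.
    @Override
    public Integer get(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return from + index;
    }

    @Override
    public int size() {
        return to - from;
    }
}
```

toString(), contains(), indexOf(), iteration, equals(), and hashCode() all come from the abstract skeleton. A class that already extends something else could still implement List directly; it would just have to write all of those itself.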

A worked example: a plugin system

Let's design a small plugin system to see every tool in play.

// Context (and the Connection used further down) are placeholder types for this example.
// The contract every plugin promises.
interface Plugin {
    String name();
    void execute(Context ctx);

    // Default: most plugins don't need setup, so give a no-op default.
    default void init(Context ctx) {}

    // Default: derived convenience.
    default String describe() {
        return name() + " plugin";
    }

    // Static factory for a trivial plugin from a lambda.
    static Plugin of(String name, java.util.function.Consumer<Context> action) {
        return new Plugin() {
            public String name() { return name; }
            public void execute(Context ctx) { action.accept(ctx); }
        };
    }
}

A simple plugin implements directly:

class GreetPlugin implements Plugin {
    public String name() { return "greet"; }
    public void execute(Context ctx) {
        System.out.println("Hello from " + ctx.user());
    }
}

A family of plugins that share setup uses an abstract base:

// Shared plumbing: every DB plugin needs a connection, opened once.
abstract class DatabasePlugin implements Plugin {
    protected Connection conn;            // shared mutable state - needs a class

    public void init(Context ctx) {
        this.conn = ctx.openConnection(); // constructor-like setup, written once
    }

    // Subclasses implement the actual query work; conn is ready for them.
    public abstract void execute(Context ctx);
}

class BackupPlugin extends DatabasePlugin {
    public String name() { return "backup"; }
    public void execute(Context ctx) {
        conn.run("BACKUP DATABASE");       // conn was opened by init()
    }
}

And a throwaway plugin from the static factory:

Plugin ping = Plugin.of("ping", ctx -> System.out.println("pong"));

Three styles, one contract. The host code only ever sees Plugin:

void runAll(List<Plugin> plugins, Context ctx) {
    for (Plugin p : plugins) {
        p.init(ctx);
        System.out.println("running " + p.describe());
        p.execute(ctx);
    }
}

This is the shape of real plugin systems, servlet containers, build-tool task systems, and test frameworks. The interface is the contract; abstract classes provide optional shared scaffolding; static factories and lambdas make trivial cases cheap.

Try it

  1. Build the Switch hierarchy. Write the Switch interface with the toggle() default. Implement LightBulb and a Fan (where turnOn prints "spinning"). Call toggle() on each twice. Confirm the default works for both without either class defining it.

  2. Add a capability. Add a Dimmable interface (void setBrightness(int pct), int brightness()). Make a SmartBulb implements Switch, Dimmable. Write a method dimAll(List<Dimmable> ds) and a method cycleAll(List<Switch> ss). Pass your SmartBulb to both. One object, two contracts.

  3. Template method. Write the AbstractSwitch with the onChanged() hook. Make two subclasses that react differently. Notice you wrote the on-field logic exactly once.

  4. Spot the constant-interface antipattern. Find or write an interface that's only constants. Refactor it into a final class with a private constructor (private Physics() {}) holding public static final fields. Access via Physics.SPEED_OF_LIGHT. Discuss why this is cleaner than implements Physics.

  5. Ship both. Take your Plugin interface and write an AbstractPlugin that provides a default describe() using a protected String category field. Make one plugin extend it and one implement Plugin directly. Both work through runAll.

What you might wonder

"If interfaces can have method bodies now, why have abstract classes at all?" State and constructors. Default methods can't touch instance fields (interfaces have none) and interfaces have no constructors. The moment your shared behavior needs to remember something between calls (the on field, the conn), you need an abstract class. Also: a class can only extend one abstract class, so abstract classes impose a hierarchy that interfaces don't.

"Can an interface extend another interface?" Yes - interface ColorSwitch extends Switch adds methods to the Switch promise. An interface can extend many interfaces: interface SmartDevice extends Switch, Dimmable, NetworkConnected. This is interface composition, and it's how you build up rich contracts from small ones.

"What's the diamond problem with default methods?" If a class implements two interfaces that both have a default method with the same signature, the compiler forces you to resolve the ambiguity by overriding it (you can call a specific one with InterfaceName.super.method()). Java makes the conflict a compile error rather than silently picking one - safer than C++'s implicit resolution.

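Here is a minimal sketch of that resolution. Walker, Swimmer, and Duck are hypothetical names invented for the illustration, not types from this chapter:

```java
// Hypothetical interfaces for illustration; both supply move() as a default.
interface Walker {
    default String move() { return "walk"; }
}

interface Swimmer {
    default String move() { return "swim"; }
}

// Without the override below, this class does not compile:
// it would inherit two unrelated defaults for move().
class Duck implements Walker, Swimmer {
    @Override
    public String move() {
        // Pick one explicitly, or combine both as done here.
        return Walker.super.move() + " and " + Swimmer.super.move();
    }
}
```

Delete the override and the class stops compiling, which is the point: you are forced to choose.
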
"Should I make an interface for everything, just in case?" No. An interface with exactly one implementation that will never have another is usually premature - it adds indirection without flexibility. Introduce the interface when you have (or clearly foresee) a second implementation, a need to mock for tests, or a public API boundary. "Accept interfaces, return structs" (chapter 01) is the guide: interfaces earn their place at boundaries.

"sealed interfaces - what are those?" A sealed interface restricts which types may implement it (sealed interface Shape permits Circle, Square). It's the opposite of an open contract - used when you want an exhaustive, known set of implementations the compiler can check (great with pattern-matching switch). You met sealed types in From Scratch chapter 08; we'll use them again in chapter 07.

Done

  • You know an interface is a promise - a contract decoupled from implementation.
  • You know every interface feature: abstract methods, default, static, private, constants (and the constant-interface antipattern to avoid).
  • You know abstract classes add state + constructors + the template-method pattern, at the cost of single inheritance.
  • You can decide between them: default to interface, reach for abstract class when you need shared mutable state or constructor logic - and consider "ship both."

Next we make these contracts airtight: the equals, hashCode, and compareTo contracts that every Java object lives by.

03 - Equality, hashing, immutability

What this session is

About ninety minutes. Three contracts that every Java object silently participates in, and that - when you get them wrong - produce some of the most baffling bugs in the language: objects that vanish from a HashSet, map lookups that fail for keys you just put in, sort orders that throw exceptions. By the end you'll understand equals, hashCode, and compareTo deeply enough to implement them correctly by hand and to know when records do it for you.

Why this matters more than it looks

Here's a bug that has cost real engineers real hours:

class Point {
    int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

var seen = new HashSet<Point>();
seen.add(new Point(1, 2));
System.out.println(seen.contains(new Point(1, 2)));   // false (!!)

You added (1, 2). You asked if (1, 2) is there. It says no. The set appears broken. It isn't - you are, because Point never told Java what "equal" means. This session is about never writing that bug.

Reference equality vs logical equality

There are two completely different questions you can ask about two objects:

  1. Are they the same object in memory? That's ==. It compares references - the addresses, essentially.
  2. Do they represent the same value? That's .equals(). It compares meaning.

String a = new String("hi");
String b = new String("hi");

System.out.println(a == b);        // false - two different objects
System.out.println(a.equals(b));   // true  - same characters

The default equals() that every object inherits from Object just does ==:

// Object's default - only true for the literal same object
public boolean equals(Object obj) {
    return this == obj;
}

So unless you override equals(), "logical equality" is "same object" - which is why our Point lookup failed. The new Point(1,2) you searched for was a different object from the one you added, and the inherited equals only matches identical objects.

The rule: for any class whose instances represent a value (a point, a money amount, a date, a name) - where two instances with the same contents should be treated as equal - you must override equals(). For classes that represent a unique entity with identity (a database connection, a running thread, a service), the default reference equality is correct; leave it alone.

Implementing equals correctly

equals has a precise contract from the Object Javadoc. It must be:

  • Reflexive: x.equals(x) is always true.
  • Symmetric: x.equals(y) is true if and only if y.equals(x) is true.
  • Transitive: if x.equals(y) and y.equals(z), then x.equals(z).
  • Consistent: repeated calls return the same result (as long as nothing changes).
  • Non-null: x.equals(null) is always false.

Break any of these and the collections that rely on equals (HashSet, HashMap, List.contains, List.indexOf) misbehave in ways that are very hard to debug.

The canonical, contract-correct implementation:

@Override
public boolean equals(Object o) {
    if (this == o) return true;            // fast path: same object
    if (o == null || getClass() != o.getClass()) return false;  // null + type check
    Point other = (Point) o;               // safe cast now
    return x == other.x && y == other.y;   // field-by-field comparison
}

Walk every line, because each guards a clause of the contract:

  • if (this == o) return true; - reflexive and a performance shortcut.
  • if (o == null || getClass() != o.getClass()) return false; - handles the non-null rule and rejects different types. (Using getClass() keeps symmetry airtight; instanceof can break symmetry across subclasses - more in the Q&A.)
  • Point other = (Point) o; - now safe because we verified the type.
  • return x == other.x && y == other.y; - compare the fields that define value. For object fields, use Objects.equals(this.field, other.field) (it null-checks for you); for double and float fields, use Double.compare / Float.compare (equal when the result is 0) to handle NaN and -0.0 consistently.
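To make the last bullet concrete, here is a sketch of equals for a class with one object field and one double field. Reading and its fields are hypothetical names chosen for the example; hashCode is included only because the two methods must travel together, as the next section explains.

```java
import java.util.Objects;

// Hypothetical value class: one object field (may be null) and one double field.
final class Reading {
    private final String sensor;   // object field, nullable in this sketch
    private final double value;    // floating-point field

    Reading(String sensor, double value) {
        this.sensor = sensor;
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Reading other = (Reading) o;
        return Objects.equals(sensor, other.sensor)        // null-safe comparison
            && Double.compare(value, other.value) == 0;    // NaN matches NaN; 0.0 and -0.0 differ
    }

    @Override
    public int hashCode() {
        return Objects.hash(sensor, value);   // same fields as equals
    }
}
```

With == on the double field instead, a Reading holding NaN would not even equal itself, which breaks the reflexive rule.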

The hashCode contract - and why it's coupled to equals

Now the part that bit our Point. Hash-based collections (HashMap, HashSet) don't compare every element with equals. That would be O(n) per lookup. Instead they:

  1. Call hashCode() to get an int, and use it to pick a bucket (a slot).
  2. Only compare with equals against the (few) items already in that bucket.

This is what makes HashMap lookups O(1) on average. But it means the system only works if equal objects land in the same bucket - which requires:

If a.equals(b) is true, then a.hashCode() == b.hashCode() must also be true.

This is the single most important coupling in Java. Our Point overrode neither, so two equal points got the default hashCode (based on object identity), landed in different buckets, and the set never even compared them with equals. The lookup failed before equality was ever checked.
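The two-step lookup can be sketched as a toy set. TinyHashSet is a hypothetical teaching class, not how java.util.HashSet is actually written (the real one resizes and treats collisions far more carefully), but the division of labor between hashCode and equals is the same idea:

```java
import java.util.ArrayList;
import java.util.List;

// Toy hash set for illustration only. Assumes non-null elements and a fixed bucket count.
final class TinyHashSet<E> {
    private final List<List<E>> buckets = new ArrayList<>();

    TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) buckets.add(new ArrayList<>());
    }

    // Step 1: hashCode() picks the bucket.
    private List<E> bucketFor(Object o) {
        return buckets.get(Math.floorMod(o.hashCode(), buckets.size()));
    }

    void add(E e) {
        if (!contains(e)) bucketFor(e).add(e);
    }

    // Step 2: equals() is tried only against the items in that one bucket.
    boolean contains(Object o) {
        for (E candidate : bucketFor(o)) {
            if (candidate.equals(o)) return true;
        }
        return false;
    }
}
```

If two equal objects returned different hash codes, bucketFor would usually send the lookup to a different bucket than the one holding the stored element, and equals would never get a chance to run.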

The full hashCode contract:

  • Consistent: same object returns the same hash across calls (while unchanged).
  • Equal ⇒ same hash: equal objects must have equal hash codes. (The load-bearing rule.)
  • Unequal may share a hash (a "collision") - allowed, just slower. Good hash codes spread values out to minimize collisions.

The correct Point.hashCode:

@Override
public int hashCode() {
    return Objects.hash(x, y);    // combines the same fields equals() uses
}

Objects.hash(...) takes the fields and combines them into a well-distributed int. Always hash exactly the fields you compare in equals - no more, no fewer. If equals uses x and y, hashCode uses x and y. Mismatch here is the bug.

Now the fix in full:

class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override public int hashCode() {
        return Objects.hash(x, y);
    }
}

var seen = new HashSet<Point>();
seen.add(new Point(1, 2));
System.out.println(seen.contains(new Point(1, 2)));   // true - fixed

Iron law: override equals and hashCode together, always. Never one without the other. Most IDEs and static-analysis tools flag a class that overrides only one. The moment you write one, write the other from the same fields.

compareTo - ordering

equals answers "are these equal?" compareTo answers "which comes first?" A class implements Comparable<T> to define a natural order - used by Collections.sort, TreeSet, TreeMap, and list.sort(null).

class Version implements Comparable<Version> {
    final int major, minor, patch;
    Version(int major, int minor, int patch) {
        this.major = major; this.minor = minor; this.patch = patch;
    }

    @Override public int compareTo(Version o) {
        // Returns negative if this < o, 0 if equal, positive if this > o.
        return Comparator.comparingInt((Version v) -> v.major)
                .thenComparingInt(v -> v.minor)
                .thenComparingInt(v -> v.patch)
                .compare(this, o);
    }
}

var versions = new ArrayList<>(List.of(
    new Version(1, 2, 0), new Version(1, 0, 5), new Version(2, 0, 0)));
Collections.sort(versions);   // uses compareTo: 1.0.5, 1.2.0, 2.0.0

The contract: compareTo returns a negative int, zero, or a positive int for less-than, equal, greater-than. It must be a consistent total order (antisymmetric, transitive). And there's a strong recommendation: compareTo should be consistent with equals - x.compareTo(y) == 0 should usually mean x.equals(y). Violate this and TreeSet/TreeMap (which use compareTo, not equals, to decide membership) behave differently from HashSet/HashMap, which surprises everyone.
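BigDecimal is a JDK class whose natural order is not consistent with equals: equals treats 1.0 and 1.00 as different (different scale), while compareTo treats them as the same number. A small sketch (ConsistencyDemo is a hypothetical name) shows the two set families disagreeing:

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class ConsistencyDemo {
    // HashSet decides membership with hashCode + equals.
    static int hashSetSize() {
        Set<BigDecimal> set = new HashSet<>();
        set.add(new BigDecimal("1.0"));
        set.add(new BigDecimal("1.00"));
        return set.size();   // 2: equals sees different scales
    }

    // TreeSet decides membership with compareTo.
    static int treeSetSize() {
        Set<BigDecimal> set = new TreeSet<>();
        set.add(new BigDecimal("1.0"));
        set.add(new BigDecimal("1.00"));
        return set.size();   // 1: compareTo returns 0, so the second add is a duplicate
    }
}
```

Same two values, different answers depending on the collection. That is the surprise the recommendation exists to prevent in your own types.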

When you don't control the class or want a one-off order, use a Comparator instead of Comparable:

versions.sort(Comparator.comparingInt((Version v) -> v.patch));   // sort by patch only
versions.sort(Comparator.comparing((Version v) -> v.major).reversed());

Comparable is the one natural order baked into the type; Comparator is any number of external orders. Prefer Comparator for alternative sorts; reserve Comparable for the single most obvious "default" order (or omit it if there's no obvious default).

Immutability - the quiet superpower

Notice the fixed Point used final int x, y. That's deliberate. An immutable object can't change after construction. This matters enormously for the contracts above and for concurrency (chapters 09-11).

Why immutability is a default worth reaching for:

  1. Hash keys must not change. If you put a mutable object in a HashSet, then mutate a field used by hashCode, the object is now in the wrong bucket - lost. Immutable keys can't have this bug.
  2. Thread safety for free. An object that never changes can be shared across threads with zero synchronization. No races possible on data that doesn't mutate. (This is why chapters 09-11 keep coming back to immutability.)
  3. Easier reasoning. You can pass an immutable object anywhere without fear someone will change it behind your back.
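The first point is easy to reproduce. The sketch below uses a deliberately mutable MutablePoint (a hypothetical class for the demonstration, not something to copy into real code):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Deliberately mutable, to show the failure. Do not model keys like this.
class MutablePoint {
    int x, y;
    MutablePoint(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        MutablePoint p = (MutablePoint) o;
        return x == p.x && y == p.y;
    }

    @Override public int hashCode() { return Objects.hash(x, y); }
}

class LostKeyDemo {
    // Returns what contains() reports after the stored object is mutated.
    static boolean foundAfterMutation() {
        Set<MutablePoint> set = new HashSet<>();
        MutablePoint p = new MutablePoint(1, 2);
        set.add(p);            // stored under the hash of (1, 2)
        p.x = 5;               // hash is now that of (5, 2)
        return set.contains(p) // looked up under the new hash: not found
            && set.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());   // false
    }
}
```

The set still holds the object (its size is 1 and iteration would reach it), yet a lookup with the very same instance fails.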

How to make a class immutable:

final class Money {                          // final: no subclass can add mutability
    private final long cents;                // final fields, set once
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    long cents()       { return cents; }     // getters only, no setters
    String currency()  { return currency; }

    // "Mutation" returns a NEW object instead of changing this one.
    Money plus(Money other) {
        if (!currency.equals(other.currency))
            throw new IllegalArgumentException("currency mismatch");
        return new Money(cents + other.cents, currency);
    }
}

The recipe: final class, all fields private final, no setters, and any "change" returns a new instance. String, Integer, and LocalDate are all immutable this way (BigDecimal is immutable too, although the class itself is not declared final) - which is why String methods like toUpperCase() return a new string instead of changing the original.

The defensive-copy trap

Immutability has a hole: if a field is itself a mutable object (a List, an array, a Date), storing or returning it directly leaks mutability.

final class Team {
    private final List<String> members;

    // BROKEN: caller keeps a reference to the same list and can mutate it.
    Team(List<String> members) {
        this.members = members;
    }
    List<String> members() {
        return members;                       // caller can do members().add(...)!
    }
}

Fix with defensive copies on the way in and out (or unmodifiable wrappers):

final class Team {
    private final List<String> members;

    Team(List<String> members) {
        this.members = List.copyOf(members);  // copy in - our list is independent
    }
    List<String> members() {
        return members;                       // List.copyOf already made it unmodifiable
    }
}

List.copyOf creates an independent, unmodifiable list. Now neither the original caller's list nor anything they get back can corrupt the Team. The same applies to arrays (array.clone()), maps (Map.copyOf), and any mutable field.
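A short check of both directions, reusing the fixed Team from above (DefensiveCopyDemo and the member names are made up for the example):

```java
import java.util.ArrayList;
import java.util.List;

final class Team {
    private final List<String> members;

    Team(List<String> members) {
        this.members = List.copyOf(members);   // copy in
    }

    List<String> members() {
        return members;                        // already unmodifiable
    }
}

class DefensiveCopyDemo {
    public static void main(String[] args) {
        List<String> original = new ArrayList<>(List.of("Ann", "Bo"));
        Team team = new Team(original);

        original.add("intruder");              // mutate the caller's list
        System.out.println(team.members());    // [Ann, Bo] - the team is unaffected

        try {
            team.members().add("intruder");    // mutate what the getter returned
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected");    // the unmodifiable list refuses
        }
    }
}
```

One thing to keep in mind: List.copyOf rejects null elements with a NullPointerException, so this approach assumes your lists never contain null.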

Records: the contracts, for free

Here's the relief: for the common case of a value type, records implement all of this correctly for you. You met records in From Scratch chapter 08; now you can appreciate what they do.

record Point(int x, int y) {}

That one line generates:

  • A constructor.
  • x() and y() accessors.
  • A correct equals comparing both fields.
  • A correct hashCode from both fields (consistent with equals).
  • A readable toString.

The Point bug we spent this chapter fixing simply cannot happen with a record - the generated equals/hashCode are correct and coupled by construction. Records are also implicitly final and their fields are final, so they're immutable (with the defensive-copy caveat for mutable components - a record holding a List field still needs care in its compact constructor):

record Team(List<String> members) {
    Team {                                    // compact constructor
        members = List.copyOf(members);       // defensive copy still needed
    }
}

The practical guidance: for value types, reach for a record first. You get the contracts right automatically. Write equals/hashCode by hand only when you can't use a record (you need mutability, or you extend a class, or you need custom equality semantics like case-insensitive comparison). Knowing how to do it by hand matters because you'll read and maintain pre-record code constantly - but for new value types, records are the answer.
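To see the generated contracts at work, here is the opening bug replayed with a record instead of a hand-written class (RecordDemo is a hypothetical wrapper for the example):

```java
import java.util.HashSet;
import java.util.Set;

record Point(int x, int y) {}

class RecordDemo {
    // Same scenario as the start of the chapter, but Point is a record.
    static boolean found() {
        Set<Point> seen = new HashSet<>();
        seen.add(new Point(1, 2));
        return seen.contains(new Point(1, 2));   // generated equals/hashCode agree
    }

    public static void main(String[] args) {
        System.out.println(found());             // true
        System.out.println(new Point(1, 2));     // Point[x=1, y=2]
    }
}
```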

Try it

  1. Reproduce the bug. Write Point with no equals/hashCode. Add one to a HashSet, search for an equal one, watch contains return false. Then add correct equals/hashCode and watch it return true. Feel the cause.

  2. Break the coupling on purpose. Override equals correctly but make hashCode return a constant (return 1;). Does the HashSet still work? (It does - but every element collides into one bucket, making lookups O(n). Correct but slow.) Now make hashCode return x only while equals uses x and y. Add (1,2) and (1,3), then search for (1,2). Reason about what happens.

  3. The mutation-in-a-set disaster. Make Point mutable (non-final fields, a setter). Put one in a HashSet. Mutate its x. Now call contains with an equal point and also iterate the set looking for it. The object is "in" the set but unfindable - it's in the wrong bucket. This is the argument for immutable keys.

  4. Order it. Implement Version implements Comparable<Version>. Sort a list. Then sort the same list by patch-number-descending using a Comparator. Notice Comparable is the one natural order; Comparator is for everything else.

  5. Records vs hand-written. Write Money as a hand-written immutable class with equals/hashCode, then as a record. Put both in a HashMap as keys and look them up. Confirm both work. Count the lines you saved.

  6. Defensive copy. Write the broken Team (stores the list directly). Construct one, then team.members().add("intruder") (or mutate the original list you passed in). Watch the "immutable" team change. Fix with List.copyOf. Confirm the mutation no longer leaks.

What you might wonder

"instanceof vs getClass() in equals - which?" getClass() requires both objects to be the exact same class, which keeps symmetry airtight even across subclasses. instanceof allows subclass instances to equal superclass instances, which can break symmetry (a ColorPoint might equal a Point but not vice versa). For most value types, getClass() is the safe default. Records use exact-class matching internally. There are advanced patterns using instanceof for class hierarchies, but they require care; default to getClass().

"Do I really need to memorize the equals contract?" You don't write equals by hand often once you have records. But you read hand-written ones constantly, and you need to recognize when one is broken. Know the five rules well enough to spot a violation in code review - that's the working level.

"Why is Objects.hash better than writing my own?" You can write int result = 31 * x + y; - the classic formula. Objects.hash does essentially that for any number of fields, correctly and readably, and handles nulls. For hot paths where the varargs array allocation of Objects.hash matters, hand-write the 31 * formula; otherwise Objects.hash is the clear, correct default.

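For reference, a hand-written version might look like the sketch below. GridCell is a hypothetical class; whether the saved allocation matters is something only a profiler on your own hot path can tell you.

```java
// Hypothetical value class with a hand-written hash for a hot path.
final class GridCell {
    final int x, y;

    GridCell(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        GridCell c = (GridCell) o;
        return x == c.x && y == c.y;
    }

    @Override public int hashCode() {
        // Same fields as equals, combined with the classic multiplier.
        // No varargs array and no boxing, unlike Objects.hash(x, y).
        return 31 * x + y;
    }
}
```
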
"Is everything supposed to be immutable now?" Default to immutable; allow mutability when you have a reason (a builder accumulating state, a large object you mutate in place for performance, an entity whose identity persists while its fields change). The guidance is "immutable unless you need otherwise," not "immutable always." Records make immutable the path of least resistance, which is the point.

"What about compareTo returning this.x - other.x?" A classic bug. Subtraction can overflow (Integer.MIN_VALUE - 1 wraps to a positive number, inverting the order). Always use Integer.compare(a, b) or Comparator builders, never raw subtraction, for compareTo.

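The overflow is easy to trigger. In this sketch (OverflowDemo and its method names are hypothetical), both methods are asked to compare Integer.MIN_VALUE with 1:

```java
class OverflowDemo {
    // BROKEN: the subtraction wraps around for operands that are far apart.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Correct: Integer.compare never overflows.
    static int byCompare(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        int a = Integer.MIN_VALUE, b = 1;
        System.out.println(bySubtraction(a, b) > 0);   // true  - claims a > b, which is wrong
        System.out.println(byCompare(a, b) < 0);       // true  - correctly reports a < b
    }
}
```
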
Done

  • You know reference equality (==) vs logical equality (.equals()), and when each is correct.
  • You can implement equals against its five-rule contract.
  • You understand why hashCode is coupled to equals - and the bucket mechanism that makes the coupling load-bearing.
  • You can implement compareTo/Comparator for ordering, and avoid the subtraction-overflow trap.
  • You know why immutability protects the contracts and enables thread safety, how to build immutable classes, and the defensive-copy trap.
  • You know records implement all of it correctly for free - reach for them first.

Next: generics in depth - bounded types, wildcards, and the truth about type erasure.

04 - Generics in depth

What this session is

About ninety minutes. In From Scratch you used generics - List<String>, Map<String, Integer> - as type-safe containers. This session is about writing generic code: generic methods and classes, bounded type parameters, wildcards (the ? you've seen and maybe feared), the PECS rule that makes wildcards make sense, and the truth about type erasure - what generics actually compile to, and the surprising limits that follow. By the end you'll write generic APIs with confidence and read the gnarliest generic signatures in the JDK.

Why generics exist: the problem they solve

Before generics (Java 1.4 and earlier), containers held Object, and you cast on the way out:

List names = new ArrayList();        // a list of... something
names.add("Alice");
names.add(42);                       // oops, nobody stopped me
String first = (String) names.get(0);   // cast required
String second = (String) names.get(1);  // ClassCastException at runtime!

Two problems: no compiler help (you could put anything in), and casts everywhere (verbose and error-prone). Generics fix both:

List<String> names = new ArrayList<>();
names.add("Alice");
names.add(42);                       // COMPILE ERROR - caught immediately
String first = names.get(0);         // no cast - the compiler knows it's a String

The core idea: generics move type errors from runtime to compile time, and remove casts. A List<String> is a promise, checked by the compiler, that only strings go in and only strings come out.

Generic methods

You can make a single method generic by declaring a type parameter before the return type:

// <T> declares a type parameter. The method works for any T.
static <T> T firstOrNull(List<T> list) {
    return list.isEmpty() ? null : list.get(0);
}

String s = firstOrNull(List.of("a", "b"));   // T inferred as String
Integer n = firstOrNull(List.of(1, 2, 3));   // T inferred as Integer

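The <T> before the return type says "this method introduces a type variable T." The caller almost never has to specify it - the compiler infers T from the arguments. (An explicit form exists for the rare case where inference needs help: ClassName.<String>firstOrNull(list).) One method, type-safe for every type.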

A two-parameter example:

// Swap returns a new pair with elements swapped.
static <A, B> Map.Entry<B, A> swap(Map.Entry<A, B> entry) {
    return Map.entry(entry.getValue(), entry.getKey());
}

Convention: type parameters are single uppercase letters - T (type), E (element), K/V (key/value), R (result), N (number). Multiple are comma-separated: <A, B>.

Generic classes

A class can be parameterized too - every instance fixes the type:

// A simple immutable box holding one value of type T.
final class Box<T> {
    private final T value;
    Box(T value)        { this.value = value; }
    T get()             { return value; }
    <R> Box<R> map(java.util.function.Function<T, R> f) {
        return new Box<>(f.apply(value));    // transform T into R
    }
}

Box<String> b = new Box<>("hello");
Box<Integer> len = b.map(String::length);    // Box<String> -> Box<Integer>
System.out.println(len.get());               // 5

Box<T> is type-safe: a Box<String> only ever holds a string. The map method even introduces its own type parameter R so it can transform the box's type. Optional<T> and Stream<T> offer map methods with this same shape.

Bounded type parameters: extends

Sometimes a generic method needs to do something with T, not just store it. To call methods on T, you must constrain it. <T extends SomeType> says "T can be any type that is a SomeType."

// To find the max, T must be Comparable - we need compareTo.
static <T extends Comparable<T>> T max(List<T> list) {
    T best = list.get(0);
    for (T item : list) {
        if (item.compareTo(best) > 0) best = item;   // compareTo available now
    }
    return best;
}

max(List.of(3, 1, 2));            // works - Integer is Comparable
max(List.of("c", "a", "b"));      // works - String is Comparable
// max(List.of(new Object()));    // COMPILE ERROR - Object isn't Comparable

Without the bound, item.compareTo(best) wouldn't compile - the compiler only knows the methods guaranteed by the bound. <T extends Comparable<T>> unlocks compareTo. Note extends here means "is a subtype of," and it works for both classes and interfaces (you always write extends, never implements, in a bound).

You can have multiple bounds with &:

// T must be both Comparable AND Serializable.
static <T extends Comparable<T> & java.io.Serializable> T pick(T a, T b) {
    return a.compareTo(b) >= 0 ? a : b;
}

Wildcards: ?

Here's where people get lost. Consider:

static double sumOfList(List<Number> list) {
    double sum = 0;
    for (Number n : list) sum += n.doubleValue();
    return sum;
}

List<Integer> ints = List.of(1, 2, 3);
sumOfList(ints);   // COMPILE ERROR (!)

Surprise: a List<Integer> is not a List<Number>, even though Integer is a Number. Generics are invariant - List<Integer> and List<Number> are unrelated types. (If they were related, you could add a Double to a List<Number> that's really a List<Integer>, breaking type safety. The invariance protects you.)
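Arrays show what invariance saves you from. Java arrays are covariant (an Integer[] is an Object[]), so the equivalent mistake compiles and only fails at runtime. A sketch, with VarianceDemo as a hypothetical name:

```java
class VarianceDemo {
    // Returns true if the bad store was caught at runtime.
    static boolean arrayStoreFails() {
        Object[] objects = new Integer[1];     // arrays are covariant: this compiles
        try {
            objects[0] = "not an integer";     // the type error surfaces only now
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // The generic equivalent is rejected before the program ever runs:
    // List<Number> numbers = new ArrayList<Integer>();   // COMPILE ERROR
}
```

With generics the compiler refuses the assignment up front, so there is no runtime check to trip over.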

To accept "a list of Number or any subtype," use a wildcard:

static double sumOfList(List<? extends Number> list) {   // ? extends Number
    double sum = 0;
    for (Number n : list) sum += n.doubleValue();
    return sum;
}

sumOfList(List.of(1, 2, 3));          // List<Integer> - works
sumOfList(List.of(1.5, 2.5));         // List<Double>  - works
sumOfList(List.of(1, 2.5));           // List<Number>  - works

? extends Number means "some specific subtype of Number, I don't know which." That's enough to read Numbers out. But there's a catch that leads to the most important rule in generics.

PECS: Producer Extends, Consumer Super

There are two wildcard forms, and choosing between them confuses everyone until they learn the mnemonic.

? extends T - an upper-bounded wildcard. "Some subtype of T." You can read T from it (everything is at least a T), but you cannot write to it (you don't know the exact subtype, so no value is safe to add).

List<? extends Number> nums = List.of(1, 2, 3);
Number n = nums.get(0);    // OK to read - it's at least a Number
// nums.add(4);            // COMPILE ERROR - could be a List<Double>, can't add an Integer

? super T - a lower-bounded wildcard. "Some supertype of T." You can write a T to it (a T fits in any supertype container), but reading gives you only Object (you don't know which supertype).

List<? super Integer> sink = new ArrayList<Number>();
sink.add(42);              // OK to write - an Integer fits in a List of any Integer-supertype
// Integer x = sink.get(0);  // COMPILE ERROR - could be List<Object>, read gives Object
Object o = sink.get(0);    // only this works

The mnemonic, from Joshua Bloch's Effective Java:

PECS: Producer Extends, Consumer Super.

  • If a parameter produces values for you (you read from it), use ? extends T.
  • If a parameter consumes values you give it (you write to it), use ? super T.

The classic example is Collections.copy, which reads from a source (producer) and writes to a destination (consumer):

static <T> void copy(List<? super T> dest, List<? extends T> src) {
    //                      consumer (super)        producer (extends)
    for (int i = 0; i < src.size(); i++) {
        dest.set(i, src.get(i));   // read from src (T), write to dest (super T)
    }
}

src is a producer - we read Ts out, so extends. dest is a consumer - we write Ts in, so super. This signature lets you copy a List<Integer> into a List<Number>, which the rigid List<T>, List<T> version couldn't.
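Here is that copy in use with the JDK's own Collections.copy, which has the signature shown above. CopyDemo and the sample values are made up for the example; note that Collections.copy requires the destination to be at least as long as the source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CopyDemo {
    static List<Number> copyIntsIntoNumbers() {
        List<Integer> src = List.of(1, 2, 3);                       // producer
        List<Number> dest =
                new ArrayList<Number>(List.of(0.5, 0.5, 0.5, 0.5)); // consumer, pre-sized
        Collections.copy(dest, src);   // overwrites the first three slots
        return dest;
    }

    public static void main(String[] args) {
        System.out.println(copyIntsIntoNumbers());   // [1, 2, 3, 0.5]
    }
}
```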

When do you use a plain ? (unbounded wildcard)? When you don't care about the type at all - you only use methods that don't depend on it:

static void printSize(List<?> list) {   // any list, of anything
    System.out.println("size: " + list.size());   // size() doesn't care about type
}

Type erasure: the truth about generics

Now the part that explains every weird limit of Java generics. Generics are a compile-time feature only. After the compiler checks your types and inserts casts, it erases the type parameters. At runtime, a List<String> and a List<Integer> are both just List. The <String> is gone.

List<String> a = new ArrayList<>();
List<Integer> b = new ArrayList<>();
System.out.println(a.getClass() == b.getClass());   // true - both are just ArrayList

Why did Java do this? Backward compatibility. Generics arrived in Java 5 (2004), and erasure let generic code interoperate with the mountain of pre-generics code. The cost is a set of limitations you have to know:

1. You can't check a generic type at runtime.

// Object obj = ...;
// if (obj instanceof List<String>)   // COMPILE ERROR - the element type is erased, so it can't be checked
if (obj instanceof List<?>)           // OK - can only check that it is some List

2. You can't create an array of a generic type.

// T[] arr = new T[10];        // COMPILE ERROR
// List<String>[] lists = new List<String>[10];   // COMPILE ERROR - generic array creation
T[] arr = (T[]) new Object[10]; // workaround: create Object[], cast (unchecked warning)

3. You can't use primitives as type arguments.

// List<int> nums;             // COMPILE ERROR
List<Integer> nums;            // must use the wrapper - and pay autoboxing (chapter 12)

4. Overloading on erased types collides.

// These two methods erase to the same signature - won't compile together:
// void process(List<String> s) {}
// void process(List<Integer> i) {}   // COMPILE ERROR - both erase to process(List)

5. A class can't have static state per type parameter - there's only one class at runtime, shared across all T, so static members can't even refer to T.

The mental model: the compiler uses the type parameters to check your code and insert casts, then throws them away. Everything generics can and can't do follows from this. When a generic limitation surprises you, ask "what's left after erasure?" and the answer usually explains it.
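One common way around limitation 2 is to ask "what's left after erasure?" and then pass the missing piece in by hand as a Class<T> token. The sketch below is one such workaround, not the only one; ArrayFactory is a hypothetical name, and it assumes the token is a reference type (a primitive token such as int.class would fail the cast).

```java
import java.lang.reflect.Array;

class ArrayFactory {
    // The Class<T> token carries the element type that erasure removed from T.
    @SuppressWarnings("unchecked")   // safe here: the array really has component type T
    static <T> T[] makeArray(Class<T> type, int n) {
        return (T[]) Array.newInstance(type, n);
    }

    public static void main(String[] args) {
        String[] names = makeArray(String.class, 3);
        System.out.println(names.length);                       // 3
        System.out.println(names.getClass() == String[].class); // true - a real String[]
    }
}
```

Unlike (T[]) new Object[n], the array created here has the right runtime type, so handing it to a caller that expects String[] does not blow up later.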

Reading scary JDK signatures

Armed with all this, you can decode signatures that look terrifying. Here's Stream.collect:

<R, A> R collect(Collector<? super T, A, R> collector)

Decoded: it introduces two type variables R (the result) and A (an internal accumulator). It takes a Collector that consumes the stream's elements (? super T - consumer, super) and produces an R. You don't need to memorize it - you can read it now.

And Comparator.comparing:

static <T, U extends Comparable<? super U>> Comparator<T> comparing(
        Function<? super T, ? extends U> keyExtractor)

Decoded: for a type T to compare and a key type U that is comparable (U extends Comparable<? super U> - U can be compared against itself or a supertype), take a function that consumes a T (? super T) and produces a comparable key (? extends U). PECS everywhere. Intimidating until you have the vocabulary; routine once you do.

Try it

  1. Write a generic method. Implement static <T> List<T> repeat(T item, int n) that returns a list with item repeated n times. Call it with a String and an Integer; note the type is inferred.

  2. Feel invariance. Write void addNumbers(List<Number> list). Try to pass a List<Integer>. Watch it fail to compile. Change the parameter to List<? super Integer> and add integers to it. Now it works - that's the consumer (super) case.

  3. PECS in practice. Write static <T> void moveAll(List<? extends T> from, List<? super T> to) that adds every element of from to to. Test it copying a List<Integer> into a List<Number>. Try swapping extends and super and watch the compile errors - they teach you which side reads and which writes.

  4. Hit erasure. Try to write static <T> T[] makeArray(int n) { return new T[n]; }. Read the compile error. Then try (T[]) new Object[n] and note the unchecked warning. Reason about why the JVM can't make a real T[].

  5. Bounded type. Write static <T extends Comparable<T>> T min(T a, T b). Use it on Strings and Integers. Remove the bound and watch compareTo stop compiling - the bound is what unlocks the method.

  6. Decode a signature. Open the JDK docs for Map.computeIfAbsent. Read its signature out loud, naming each type variable and wildcard. You should be able to explain every symbol now.

What you might wonder

"When do I write <T extends Comparable<T>> vs <T extends Comparable<? super T>>?" The ? super T version is more flexible - it allows T to be comparable against a supertype (e.g., a Timestamp that's Comparable<Date>). For your own code, <T extends Comparable<T>> is usually fine and simpler. The JDK uses the ? super T form for maximum flexibility in public APIs. Start simple; widen to ? super T if a real type forces you.

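A sketch of a type that forces the wider bound. Animal, Dog, and Bounds are hypothetical names: Dog inherits its ordering from Animal, so it is Comparable<Animal>, not Comparable<Dog>.

```java
import java.util.List;

class Animal implements Comparable<Animal> {
    final int weight;
    Animal(int weight) { this.weight = weight; }

    @Override public int compareTo(Animal o) {
        return Integer.compare(weight, o.weight);
    }
}

// Dog is Comparable<Animal> by inheritance, not Comparable<Dog>.
class Dog extends Animal {
    Dog(int weight) { super(weight); }
}

class Bounds {
    // The wider bound accepts T = Dog, because Animal is a supertype of Dog.
    static <T extends Comparable<? super T>> T maxFlexible(List<T> list) {
        T best = list.get(0);
        for (T item : list) {
            if (item.compareTo(best) > 0) best = item;
        }
        return best;
    }

    // With <T extends Comparable<T>> instead, a call passing List<Dog>
    // would not compile: Dog does not implement Comparable<Dog>.
}
```
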
"Do I need wildcards in my own code, or just to read the JDK?" Both, but reading first. You'll use wildcard-typed APIs constantly (every stream collector). You'll write wildcards when you make utility methods that work across a type family - which is less often, but PECS is the rule when you do. If a method only reads a collection, take ? extends; if it only writes, take ? super; if both, take an exact <T>.

"Is erasure why I get 'unchecked cast' warnings?" Yes. When you cast to a generic type the runtime can't verify (like (T[]) new Object[n] or (List<String>) someRawList), the compiler warns it can't guarantee safety - because the type info is erased and it can't insert a real check. Suppress with @SuppressWarnings("unchecked") only when you've reasoned that it's actually safe.

"Why can other languages (C#, Rust) do things Java generics can't?" C# reifies generics - the type info survives to runtime, so new T[n] and typeof(List<int>) work. Rust monomorphizes - it generates a specialized copy per concrete type. Java chose erasure for backward compatibility in 2004. Each approach trades off differently; erasure's cost is the limitations above, its benefit was seamless interop with a decade of existing code. (This exact comparison is a cross-topic page on the site if you want the three-language view.)

"What's the diamond operator <> actually doing?" new ArrayList<>() lets the compiler infer the type argument from the left-hand side (List<String> x = new ArrayList<>() infers String). Before Java 7 you had to write new ArrayList<String>() redundantly. It's pure inference convenience - no runtime effect.

Done

  • You know why generics exist: compile-time type safety, no casts.
  • You can write generic methods and generic classes, including ones that transform their type.
  • You can bound type parameters with extends (and &) to unlock methods on T.
  • You understand invariance, both wildcard forms, and PECS - producer extends, consumer super.
  • You understand type erasure and the five limitations that follow from it.
  • You can read intimidating JDK generic signatures.

Next: collections deep - which one to reach for, the performance characteristics, and the families that matter.


05 - Collections deep

What this session is

About ninety minutes. In From Scratch you used List, Set, and Map. This session is about choosing the right one - because Java gives you a dozen implementations, each with different performance characteristics, and reaching for the wrong one is one of the most common quiet performance bugs. By the end you'll know which collection to use for any situation and why, with the Big-O costs in your head.

The map of the collections framework

Everything descends from a few interfaces. Hold this picture:

Iterable
  └─ Collection
       ├─ List      - ordered, indexed, allows duplicates
       ├─ Set       - no duplicates
       └─ Queue/Deque - ordered for adding/removing at ends

Map (separate hierarchy - not a Collection)
  └─ key -> value pairs, unique keys

You program to the interface (List, Set, Map) and choose the implementation (ArrayList, HashSet, HashMap) based on performance needs. This is "program to interfaces" (chapter 01) in daily practice:

List<String> names = new ArrayList<>();   // declare interface, pick implementation
Map<String, Integer> counts = new HashMap<>();

If you later need a different implementation, you change one word (new ArrayList<>() to new LinkedList<>()) and nothing else, because all callers depend on List.

Lists: ArrayList vs LinkedList

Both are Lists. They perform completely differently.

ArrayList is a growable array. Elements live in a contiguous block of memory.

Operation               Cost             Why
get(i) / set(i)         O(1)             direct array index
add(item) (at end)      O(1) amortized   usually just writes to the next slot; occasionally resizes
add(i, item) (middle)   O(n)             must shift everything after i
remove(i) (middle)      O(n)             must shift everything after i
contains(item)          O(n)             linear scan

LinkedList is a doubly-linked list. Each element is a separate node with pointers to its neighbors.

Operation                              Cost   Why
get(i)                                 O(n)   must walk from an end to index i
add/remove at ends                     O(1)   just relink the end node
add/remove in middle (with iterator)   O(1)   relink neighbors
contains(item)                         O(n)   linear scan

The practical verdict: use ArrayList by default. Its O(1) random access and cache-friendly contiguous memory beat LinkedList for almost everything real code does, usually even insertion-heavy work: reaching the insertion point in a LinkedList is itself an O(n) walk, and CPU cache locality matters more than the Big-O suggests. LinkedList's only real edge is constant-time add/remove at both ends - and for that, ArrayDeque (below) is faster anyway.

In practice: reach for LinkedList almost never. The fact that beginner tutorials present them as equal alternatives is misleading. ArrayList is the answer ~95% of the time.

Sets: HashSet vs LinkedHashSet vs TreeSet

A Set holds unique elements. Three implementations, three tradeoffs:

HashSet - backed by a HashMap. O(1) add/contains/remove on average (assuming a reasonable hashCode). No order - iteration order is unspecified and can change. Use when you just need uniqueness and fast membership tests.

Set<String> seen = new HashSet<>();
seen.add("a"); seen.add("b"); seen.add("a");
System.out.println(seen.size());          // 2 - duplicate ignored
System.out.println(seen.contains("a"));   // true, O(1)

LinkedHashSet - like HashSet but remembers insertion order. Slightly more memory (it maintains a linked list through the entries). Use when you need uniqueness and predictable iteration order.

Set<String> ordered = new LinkedHashSet<>();
ordered.add("c"); ordered.add("a"); ordered.add("b");
// iterates c, a, b - insertion order, every time

TreeSet - keeps elements sorted (by natural order or a Comparator). O(log n) add/contains/remove - slower than hash sets, but you get sorted iteration and range queries (first(), last(), headSet(), subSet()).

TreeSet<Integer> sorted = new TreeSet<>(List.of(5, 1, 3, 2, 4));
System.out.println(sorted);             // [1, 2, 3, 4, 5] - always sorted
System.out.println(sorted.first());     // 1
System.out.println(sorted.headSet(3));  // [1, 2] - everything below 3

The decision: HashSet for plain uniqueness (fastest). LinkedHashSet when iteration order must match insertion. TreeSet when you need sorted order or range queries (and can pay O(log n)).

Critical reminder from chapter 03: HashSet and LinkedHashSet depend on correct equals/hashCode. TreeSet depends on correct compareTo/Comparator. A broken contract silently breaks the set.
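Here is what "silently breaks" looks like, as a minimal sketch. BrokenPoint and GoodPoint are hypothetical types for this illustration; the record gets equals/hashCode generated from its components, the plain class falls back to identity.

```java
import java.util.HashSet;
import java.util.Set;

public class SetContractDemo {
    // No equals/hashCode: identity-based, so two "equal" points look different to a HashSet.
    static class BrokenPoint {
        final int x, y;
        BrokenPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // Records generate equals/hashCode from their components.
    record GoodPoint(int x, int y) {}

    static int brokenSize() {
        Set<BrokenPoint> set = new HashSet<>();
        set.add(new BrokenPoint(1, 2));
        set.add(new BrokenPoint(1, 2));
        return set.size();
    }

    static int goodSize() {
        Set<GoodPoint> set = new HashSet<>();
        set.add(new GoodPoint(1, 2));
        set.add(new GoodPoint(1, 2));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(brokenSize());   // 2 - the duplicate slipped in
        System.out.println(goodSize());     // 1
    }
}
```

No exception, no warning: the set just stops being a set.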

Maps: the workhorse family

Map is the most-used collection after List. Same three flavors as sets, for the same reasons (sets are literally implemented on maps):

HashMap - O(1) get/put/remove, no order. The default map. Use it unless you need order.

LinkedHashMap - insertion-order (or access-order) iteration. The access-order mode makes it a ready-made LRU cache:

// Access-order LinkedHashMap that evicts the least-recently-used entry past capacity.
var lru = new LinkedHashMap<String, String>(16, 0.75f, true) {
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        return size() > 100;     // keep at most 100 entries
    }
};
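To watch the eviction happen, here is the same idea wrapped in a small runnable sketch. The lruCache helper and the capacity of 2 are choices made for this illustration, not part of the JDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruDemo {
    // Builds an access-ordered map that drops its eldest entry once it exceeds `capacity`.
    static <K, V> Map<K, V> lruCache(int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = lruCache(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");        // touch "a": "b" is now the least recently used
        cache.put("c", 3);     // over capacity: evicts "b"
        System.out.println(cache.keySet());   // [a, c]
    }
}
```

Note that get counts as an access in this mode, which is exactly what makes it LRU rather than FIFO.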

TreeMap - sorted by key, O(log n). Gives you firstKey(), lastKey(), floorKey(), ceilingKey(), subMap() - range queries on keys.

TreeMap<Integer, String> byScore = new TreeMap<>();
byScore.put(90, "A"); byScore.put(70, "C"); byScore.put(80, "B");
System.out.println(byScore.firstKey());        // 70
System.out.println(byScore.ceilingKey(75));    // 80 - smallest key >= 75

The decision mirrors sets: HashMap by default; LinkedHashMap for ordered iteration or LRU; TreeMap for sorted keys / range queries.

The modern Map methods you should be using

Pre-Java-8 map code is full of get-check-put patterns. Modern methods replace them:

Map<String, Integer> counts = new HashMap<>();

// Counting - the classic. Old way: get, null-check, put.
for (String word : words) {
    counts.merge(word, 1, Integer::sum);   // add 1, or start at 1 if absent
}

// Grouping into lists without the null dance:
Map<String, List<String>> byFirstLetter = new HashMap<>();
for (String name : names) {
    String key = name.substring(0, 1);
    byFirstLetter.computeIfAbsent(key, k -> new ArrayList<>()).add(name);
}

// Default for missing keys:
int n = counts.getOrDefault("missing", 0);   // 0 instead of null

// Compute only if present:
counts.computeIfPresent("apple", (k, v) -> v * 2);

merge, computeIfAbsent, getOrDefault, putIfAbsent, computeIfPresent - learn these five. They turn five-line patterns into one line and avoid a whole class of null bugs.
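For contrast, here is the old counting pattern next to merge, plus putIfAbsent, which the snippet above doesn't show. A minimal sketch; the method names and sample data are made up for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapMethodsDemo {
    // The pre-Java-8 pattern: get, null-check, put.
    static Map<String, Integer> countOld(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            Integer current = counts.get(word);
            if (current == null) {
                counts.put(word, 1);
            } else {
                counts.put(word, current + 1);
            }
        }
        return counts;
    }

    // The same logic with merge.
    static Map<String, Integer> countNew(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "a");
        System.out.println(countOld(words).equals(countNew(words)));   // true

        Map<String, String> config = new HashMap<>();
        config.put("mode", "fast");
        config.putIfAbsent("mode", "slow");    // key present: value stays "fast"
        config.putIfAbsent("level", "info");   // key absent: inserted
        System.out.println(config.get("mode") + " " + config.get("level"));   // fast info
    }
}
```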

Queues and Deques: ArrayDeque and PriorityQueue

ArrayDeque - a double-ended queue backed by a resizable array. Add/remove at both ends in O(1). This is your stack and your queue - faster than the legacy Stack (chapter 01's cautionary tale) and faster than LinkedList for the same job.

Deque<Integer> stack = new ArrayDeque<>();
stack.push(1); stack.push(2);
stack.pop();                 // 2 - LIFO, this is a stack

Deque<Integer> queue = new ArrayDeque<>();
queue.offer(1); queue.offer(2);
queue.poll();                // 1 - FIFO, this is a queue

Use ArrayDeque whenever you need a stack or a queue. Never use the legacy Stack class.

PriorityQueue - a heap. poll() always returns the smallest element (by natural order or Comparator), in O(log n). Use for "always process the most urgent next":

PriorityQueue<Integer> pq = new PriorityQueue<>();
pq.addAll(List.of(5, 1, 3, 2));
pq.poll();   // 1 - smallest first
pq.poll();   // 2

// Max-heap with a reversed comparator:
var maxHeap = new PriorityQueue<Integer>(Comparator.reverseOrder());
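In real code the elements are usually objects ordered by one field. A minimal sketch, assuming a hypothetical Task record where a lower priority number means more urgent:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class PriorityDemo {
    record Task(String name, int priority) {}   // hypothetical type; lower number = more urgent

    static List<String> drainInOrder(List<Task> tasks) {
        PriorityQueue<Task> pq = new PriorityQueue<>(Comparator.comparingInt(Task::priority));
        pq.addAll(tasks);
        List<String> order = new ArrayList<>();
        while (!pq.isEmpty()) {
            order.add(pq.poll().name());   // poll() removes the current minimum
        }
        return order;
    }

    public static void main(String[] args) {
        List<Task> tasks = List.of(new Task("email", 3), new Task("outage", 1), new Task("review", 2));
        System.out.println(drainInOrder(tasks));   // [outage, review, email]
    }
}
```

One thing to keep in mind: only poll() (and peek()) respect the ordering. Iterating a PriorityQueue with for-each visits the heap's internal layout, which is not guaranteed to be sorted.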

Iteration and the fail-fast trap

A bug everyone hits once: modifying a collection while iterating it.

List<String> list = new ArrayList<>(List.of("a", "b", "c", "d"));
for (String s : list) {
    if (s.equals("b")) list.remove(s);   // ConcurrentModificationException!
}

The for-each loop uses an iterator, and most collections are fail-fast: they detect structural modification during iteration and throw ConcurrentModificationException rather than silently corrupting. (The name is misleading - it happens single-threaded too.) The check is best-effort, though: with ArrayList, removing the second-to-last element happens to end the loop early with no exception and the last element silently skipped, which is worse. Three correct ways to remove during iteration:

// 1. Iterator.remove() - the explicit way
Iterator<String> it = list.iterator();
while (it.hasNext()) {
    if (it.next().equals("b")) it.remove();
}

// 2. removeIf - the modern one-liner
list.removeIf(s -> s.equals("b"));

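// 3. Collect what to remove, remove after the loop
List<String> toRemove = new ArrayList<>();
for (String s : list) {
    if (s.equals("b")) toRemove.add(s);
}
list.removeAll(toRemove);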

removeIf is almost always the right answer. Reach for it.
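If you want to poke at the trap itself, here is a small runnable sketch. The helper names are made up for illustration, and the "no exception" case reflects how OpenJDK's ArrayList iterator behaves; the Javadoc only promises fail-fast detection on a best-effort basis.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class FailFastDemo {
    // Removes `target` inside a for-each loop; reports whether the iterator noticed.
    static boolean throwsCme(List<String> source, String target) {
        List<String> list = new ArrayList<>(source);
        try {
            for (String s : list) {
                if (s.equals(target)) list.remove(s);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Collect during the loop, remove afterwards: no modification while iterating.
    static List<String> collectThenRemove(List<String> source, String target) {
        List<String> list = new ArrayList<>(source);
        List<String> toRemove = new ArrayList<>();
        for (String s : list) {
            if (s.equals(target)) toRemove.add(s);
        }
        list.removeAll(toRemove);
        return list;
    }

    public static void main(String[] args) {
        List<String> letters = List.of("a", "b", "c", "d");
        System.out.println(throwsCme(letters, "b"));           // true
        System.out.println(throwsCme(letters, "c"));           // false - second-to-last: loop ends early, "d" never visited
        System.out.println(collectThenRemove(letters, "b"));   // [a, c, d]
    }
}
```

The false case is the dangerous one: no exception does not mean the loop was correct.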

Immutable and unmodifiable collections

Three ways to get a read-only collection, with different guarantees:

// 1. List.of / Set.of / Map.of - truly immutable, can't be changed by anyone (Java 9+)
List<String> a = List.of("x", "y");        // a.add(...) throws UnsupportedOperationException

// 2. List.copyOf - immutable snapshot of an existing collection
List<String> b = List.copyOf(existing);    // independent, immutable

// 3. Collections.unmodifiableList - a read-only VIEW (the backing list can still change!)
List<String> c = Collections.unmodifiableList(backing);
// you can't change c directly, but if someone changes `backing`, c reflects it

The trap is #3: an unmodifiable view doesn't copy - it wraps. If the underlying collection changes, the view changes. For real immutability (chapter 03), use List.of or List.copyOf, which fully decouple. Return List.copyOf(...) from getters to protect your internal collections (the defensive-copy lesson from chapter 03).
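The difference is easy to see side by side. A minimal sketch (the method name is invented for this illustration):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ViewVsCopyDemo {
    // Returns {view size, copy size} after the backing list grows by one element.
    static int[] sizesAfterMutation() {
        List<String> backing = new ArrayList<>(List.of("x"));
        List<String> view = Collections.unmodifiableList(backing);   // wraps backing
        List<String> copy = List.copyOf(backing);                    // independent snapshot
        backing.add("y");
        return new int[] { view.size(), copy.size() };
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterMutation();
        System.out.println(sizes[0] + " " + sizes[1]);   // 2 1
    }
}
```

The view grew without anyone touching it; the copy did not.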

Choosing: a decision table

When you need a...                  Reach for       Because
ordered list, indexed access        ArrayList       O(1) get, cache-friendly, the default
stack or queue                      ArrayDeque      O(1) both ends, beats Stack and LinkedList
unique elements, fast lookup        HashSet         O(1), no order needed
unique elements, insertion order    LinkedHashSet   predictable iteration
unique elements, sorted             TreeSet         O(log n), sorted + range queries
key-value, fast lookup              HashMap         O(1), the default map
key-value, insertion order or LRU   LinkedHashMap   ordered iteration / eviction hook
key-value, sorted by key            TreeMap         O(log n), range queries on keys
"most urgent first"                 PriorityQueue   O(log n) heap

Print this table mentally. "Which collection?" should take you two seconds.

Try it

  1. Measure ArrayList vs LinkedList. Build both with 1,000,000 integers. Time get(500_000) a thousand times on each (a million calls on the LinkedList would walk roughly half a trillion nodes, so keep the count small). The ArrayList will be dramatically faster (O(1) vs O(n)). Then time adding a million elements at the end of each. ArrayList usually still wins (cache locality, no per-element node allocation). This kills the "LinkedList is for inserts" myth.

  2. Word frequency three ways. Count word frequencies in a paragraph using (a) the old get-null-check-put pattern, (b) merge, (c) a stream groupingBy + counting (you'll meet streams next chapter). Compare line counts and readability.

  3. Pick the right set. You need to dedupe a list of names but preserve the order they first appeared. Which set? Implement it. Then change the requirement to "deduped and alphabetical" - which set now?

  4. Hit the fail-fast. Reproduce the ConcurrentModificationException with a for-each remove. Fix it three ways: Iterator.remove, removeIf, and collect-then-remove. Note which is cleanest.

  5. Unmodifiable view trap. Create a backing ArrayList, wrap it with Collections.unmodifiableList. Confirm you can't add to the view. Then add to backing directly and print the view - watch it change. Now do the same with List.copyOf and confirm the copy is truly frozen.

  6. TreeMap range query. Put exam scores (key) to names (value) in a TreeMap. Use subMap, ceilingKey, and headMap to answer "who scored between 70 and 90?" and "what's the lowest passing score >= 60?".

What you might wonder

"Is LinkedList ever the right choice?" Rarely. If you genuinely need a List that also gives O(1) add/remove at both ends, or O(1) insertion and removal through a ListIterator while you're already walking the list, maybe. But ArrayDeque covers the both-ends case better, and ArrayList covers everything else. In years of Java you'll reach for LinkedList a handful of times, and usually then reconsider.

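"HashMap initial capacity - should I set it?" If you know roughly how many entries you'll add, yes. On JDK 19+ use HashMap.newHashMap(expectedSize), which does the load-factor arithmetic for you; the manual equivalent is new HashMap<>((int) (expectedSize / 0.75) + 1), where the cast is needed because the constructor takes an int. Either way you avoid rehashing as the map grows (same idea as pre-sizing an ArrayList). For small maps it doesn't matter. For large ones built in a hot path, it's a real win - same lesson as the Go pre-sizing chapter if you've seen it.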

"What's the load factor (0.75)?" A HashMap resizes when it's 75% full, to keep collisions low. 0.75 is the default sweet spot between memory and speed. You almost never change it. It's the second constructor arg you saw in the LRU example.

"Are these collections thread-safe?" No - ArrayList, HashMap, etc. are not safe for concurrent modification. That's a chapter 10 topic (ConcurrentHashMap, CopyOnWriteArrayList, and friends). For now: a plain HashMap shared across threads without synchronization is a bug, even if it seems to work in testing.

"Vector and Hashtable - what about those?" Legacy, from Java 1.0. They're synchronized (slow) versions of ArrayList/HashMap. Don't use them in new code. If you need thread safety, use the java.util.concurrent collections (chapter 10), not these. They survive only for backward compatibility - the same "sediment" the Java Mastery path's legacy appendix covers.

"Should I always use List.of for constants?" Yes for fixed, never-changing collections - it's immutable, compact, and clear. Just remember List.of rejects nulls and is immutable, so don't use it where you need to mutate or store nulls.

Done

  • You know the framework map: List, Set, Queue/Deque, Map.
  • You know ArrayList is the default list and why LinkedList rarely wins.
  • You can choose among HashSet/LinkedHashSet/TreeSet and the matching Map trio by their order and performance tradeoffs.
  • You know ArrayDeque for stacks/queues and PriorityQueue for "most urgent first."
  • You know the modern map methods (merge, computeIfAbsent, ...) and the fail-fast iteration trap.
  • You know unmodifiable views vs truly immutable collections.
  • You have the decision table in your head.

Next: exceptions done right - the strategy for checked vs unchecked, custom hierarchies, and clean error handling.


06 - Exceptions done right

What this session is

About ninety minutes. From Scratch taught you try/catch/finally - the mechanics. This session is about the strategy: checked vs unchecked (Java's most-debated design decision), designing exception types, try-with-resources properly, exception chaining, and the anti-patterns that turn error handling into a liability. By the end you'll handle errors like someone who's maintained a large codebase, not someone who wraps everything in try { } catch (Exception e) { }.

The exception hierarchy

Every error in Java is a Throwable. The tree matters because it drives the rules:

Throwable
  ├─ Error                  - JVM-level catastrophes. Don't catch these.
  │    └─ OutOfMemoryError, StackOverflowError, ...
  └─ Exception              - things programs can reasonably handle
       ├─ RuntimeException  - UNCHECKED. Programming errors, usually.
       │    └─ NullPointerException, IllegalArgumentException,
       │       IllegalStateException, IndexOutOfBoundsException, ...
       └─ (everything else) - CHECKED. The compiler forces you to handle these.
            └─ IOException, SQLException, ...

Two splits to internalize:

  1. Error vs Exception. Error means the JVM is in trouble (out of memory, stack overflow). You generally cannot meaningfully recover, so you don't catch them. Exception is for conditions a program can handle.

  2. Checked vs unchecked. This is the big one. RuntimeException and its subclasses are unchecked - the compiler doesn't force you to handle them. Everything else under Exception is checked - the compiler forces you to either catch it or declare throws it. This single distinction shapes how all Java error handling reads.

Checked vs unchecked: the strategy

// Checked: the compiler MAKES you deal with IOException.
void readFile(String path) throws IOException {   // declare it...
    Files.readString(Path.of(path));
}
// or catch it:
void readFileSafe(String path) {
    try {
        Files.readString(Path.of(path));
    } catch (IOException e) {
        // handle
    }
}

// Unchecked: no compiler obligation. You CAN catch it, but aren't forced to.
void parse(String s) {
    int n = Integer.parseInt(s);   // throws NumberFormatException (unchecked) - no try needed
}

The design intent:

  • Checked exceptions = "recoverable conditions the caller should consciously handle." A file might not exist; a network call might fail. The compiler forces the caller to acknowledge the possibility.
  • Unchecked exceptions = "programming errors that shouldn't happen if the code is correct." A null where one shouldn't be, an index out of range, an illegal argument. Forcing callers to catch these everywhere would drown the code.

In practice, the Java community has drifted toward using unchecked exceptions for most things. Checked exceptions sound good but have real costs: they leak through every layer (a low-level IOException forces throws clauses all the way up), they don't compose with lambdas/streams (a lambda can't throw a checked exception that the functional interface doesn't declare), and they tempt developers into the worst anti-pattern (catch-and-ignore) just to make the compiler stop complaining.

The pragmatic guidance most modern Java follows:

Use unchecked exceptions (extend RuntimeException) by default. Use checked exceptions only when the caller can realistically recover and you want to force them to think about it.

Many influential libraries (Spring, and most modern frameworks) use unchecked exceptions almost exclusively, often wrapping checked ones. You'll still handle checked exceptions constantly (the JDK is full of them - IOException, SQLException), but when designing your own exceptions, default to unchecked.

Choosing and throwing the right exception

The JDK provides standard unchecked exceptions - use them instead of inventing your own for common cases:

void setAge(int age) {
    if (age < 0)
        throw new IllegalArgumentException("age must be non-negative, got " + age);
    this.age = age;
}

void withdraw(double amount) {
    if (closed)
        throw new IllegalStateException("account is closed");   // wrong object state
    if (amount > balance)
        throw new IllegalArgumentException("insufficient funds");
}

String get(int index) {
    Objects.checkIndex(index, size);   // throws IndexOutOfBoundsException with a good message
    return data[index];
}

void process(Order order) {
    Objects.requireNonNull(order, "order must not be null");   // throws NPE with a message
    // ...
}

The standard ones and when to throw them:

  • IllegalArgumentException - a method argument is invalid (negative age, empty string where one's required).
  • IllegalStateException - the object is in the wrong state for this call (using a closed resource, calling Iterator.remove() before next()).
  • NullPointerException - a required value was null. Use Objects.requireNonNull(x, "msg") to throw it early with a message rather than letting a confusing NPE surface deep in the call.
  • IndexOutOfBoundsException - an index is out of range. Objects.checkIndex does this with a good message.
  • UnsupportedOperationException - an operation isn't supported (the chapter 01 sign of inheritance abuse, but also legitimately used by immutable collections).

Always include a useful message. throw new IllegalArgumentException() tells the next debugger nothing. throw new IllegalArgumentException("age must be non-negative, got " + age) tells them exactly what went wrong and what the bad value was. The message is the gift you give your future self at 2 AM.

Designing custom exceptions

When the standard exceptions don't fit - when callers need to distinguish your error type to handle it specifically - design your own. Keep them unchecked unless you have a recovery reason.

// A focused, unchecked domain exception.
public class PaymentException extends RuntimeException {
    private final String orderId;

    public PaymentException(String orderId, String message, Throwable cause) {
        super(message, cause);     // pass message AND cause to super (chaining - see below)
        this.orderId = orderId;
    }

    public String orderId() { return orderId; }
}

For a family of related errors, use a small hierarchy so callers can catch broadly or narrowly:

public class PaymentException extends RuntimeException { /* ... */ }
public class CardDeclinedException extends PaymentException { /* ... */ }
public class InsufficientFundsException extends PaymentException { /* ... */ }

// Caller chooses granularity:
try {
    processPayment(order);
} catch (CardDeclinedException e) {
    promptForNewCard();                  // handle this specific case
} catch (PaymentException e) {
    abortCheckout(e.orderId());          // handle the whole family
}

Catch blocks are checked top to bottom, so order subclasses before superclasses - if PaymentException came first, CardDeclinedException would be unreachable (compile error, helpfully).

Design principles for custom exceptions:

  • Carry structured data the handler needs (the orderId above), not just a string. This is the chapter 03 lesson - an exception is an object; put useful fields on it.
  • Keep the hierarchy shallow and meaningful. Two or three levels max.
  • Always provide a constructor that accepts a cause (Throwable) so chaining works.

Try-with-resources, properly

Anything that holds an external resource (a file, a socket, a database connection) must be closed, even if an exception is thrown mid-use. The old way - finally blocks - was verbose and error-prone:

// The old, painful way - don't write this anymore.
BufferedReader r = null;
try {
    r = new BufferedReader(new FileReader("data.txt"));
    return r.readLine();
} finally {
    if (r != null) r.close();   // and close() itself can throw, masking the real error...
}

Try-with-resources handles all of it. Any object implementing AutoCloseable declared in the try (...) header is automatically closed when the block exits - normally or via exception, in reverse order of declaration:

try (BufferedReader r = new BufferedReader(new FileReader("data.txt"))) {
    return r.readLine();
}   // r.close() called automatically, even if readLine() throws

Multiple resources, closed in reverse:

try (var in = new FileInputStream("src");
     var out = new FileOutputStream("dst")) {
    in.transferTo(out);
}   // out closed first, then in - reverse of declaration order

This also fixes a subtle bug the old way had: if both the body and close() throw, try-with-resources keeps the body's exception as primary and attaches the close exception as a suppressed exception (retrievable via e.getSuppressed()), instead of the close exception silently replacing the real one. Always use try-with-resources for anything closeable. Making your own resource closeable is just implements AutoCloseable with a close() method.
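Here is the suppressed-exception behavior in a runnable sketch. FlakyResource is a hypothetical class whose close() always fails, purely to make the mechanism visible.

```java
public class SuppressedDemo {
    // Hypothetical resource whose close() always fails.
    static class FlakyResource implements AutoCloseable {
        @Override
        public void close() {
            throw new IllegalStateException("close failed");
        }
    }

    // Body and close() both throw; returns whatever exception escapes the try.
    static RuntimeException run() {
        try (FlakyResource r = new FlakyResource()) {
            throw new IllegalArgumentException("body failed");
        } catch (RuntimeException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        RuntimeException e = run();
        System.out.println(e.getMessage());                      // body failed
        System.out.println(e.getSuppressed()[0].getMessage());   // close failed
    }
}
```

The body's exception stays primary; the close failure rides along instead of replacing it.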

Exception chaining: never lose the cause

When you catch a low-level exception and throw a higher-level one, always pass the original as the cause. This preserves the full stack trace - the "caused by:" chain you've seen in logs.

try {
    return database.query(sql);
} catch (SQLException e) {
    // GOOD: wrap, preserving the original as the cause
    throw new DataAccessException("failed to load user " + id, e);
}

The second argument (e) becomes the cause. The resulting stack trace shows your DataAccessException and "Caused by: SQLException ..." underneath - you keep the high-level context and the low-level detail. Drop the cause and you throw away the actual reason it failed:

} catch (SQLException e) {
    // BAD: the original cause is lost forever. Debugging nightmare.
    throw new DataAccessException("failed to load user " + id);
}

This is the Java equivalent of error wrapping in other languages. The cause chain is your single most valuable debugging tool when production breaks. Never break it.
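To see the chain as data rather than as log output, here is a minimal sketch. DataAccessException here is a hypothetical stand-in class, IOException plays the low-level failure, and describe is a made-up helper that walks getCause().

```java
import java.io.IOException;

public class ChainDemo {
    // Hypothetical domain exception that accepts a cause.
    static class DataAccessException extends RuntimeException {
        DataAccessException(String message, Throwable cause) { super(message, cause); }
    }

    // Simulated low-level failure.
    static void lowLevelRead() throws IOException {
        throw new IOException("disk unavailable");
    }

    static void loadUser(long id) {
        try {
            lowLevelRead();
        } catch (IOException e) {
            throw new DataAccessException("failed to load user " + id, e);   // cause preserved
        }
    }

    // Walks the getCause() links and joins the messages, outermost first.
    static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder();
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (sb.length() > 0) sb.append(" <- ");
            sb.append(cur.getClass().getSimpleName()).append(": ").append(cur.getMessage());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        try {
            loadUser(42);
        } catch (DataAccessException e) {
            System.out.println(describe(e));
            // DataAccessException: failed to load user 42 <- IOException: disk unavailable
        }
    }
}
```

Drop the second constructor argument in loadUser and the output stops at the first message: the disk failure is gone.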

The anti-patterns (and their fixes)

These are the error-handling crimes that show up in code review. Recognize and avoid all of them.

1. Swallowing exceptions.

try {
    riskyOperation();
} catch (Exception e) {
    // nothing here. The error vanishes. The program limps on with bad state.
}
The worst one. The error happened, you hid it, and now something downstream fails mysteriously with no clue why. Never have an empty catch block. At minimum, log it. Usually, handle it or rethrow.

2. Catching Exception (or Throwable) too broadly.

try {
    doStuff();
} catch (Exception e) {   // catches EVERYTHING, including bugs you didn't anticipate
    showError("something went wrong");
}
Catching Exception scoops up NullPointerException, IllegalStateException - programming bugs you'd rather see crash loudly in development. Catch the specific exceptions you can actually handle. Catching Throwable is even worse (it catches Error - out-of-memory, etc.).

3. Using exceptions for control flow.

// BAD: using an exception as a loop terminator
try {
    int i = 0;
    while (true) System.out.println(array[i++]);
} catch (ArrayIndexOutOfBoundsException e) { /* done */ }
Exceptions are for exceptional conditions, not normal flow. They're also expensive (building a stack trace costs real time). Use a normal loop condition.

4. Logging and rethrowing (double-logging).

} catch (IOException e) {
    log.error("read failed", e);   // logged here...
    throw new RuntimeException(e); // ...and will be logged AGAIN by whoever catches this
}
Pick one: either handle it here (log and recover), or wrap-and-rethrow (let the caller log). Doing both puts the same error in the logs twice, and once more for every layer that repeats the pattern, making incidents harder to read. The rule: log where you handle, not where you rethrow.

5. Throwing from finally.

try { ... }
finally {
    cleanup();   // if cleanup() throws, it MASKS any exception from the try block
}
An exception thrown in finally replaces any exception in flight from the try - the original error vanishes. Keep finally blocks (or prefer try-with-resources) free of code that can throw.
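The masking in anti-pattern 5 is easy to reproduce. A minimal sketch, with a cleanup() method invented for illustration that always throws:

```java
public class FinallyMaskDemo {
    // Hypothetical cleanup step that fails.
    static void cleanup() {
        throw new IllegalStateException("cleanup failed");
    }

    // Returns the message of whichever exception actually escapes.
    static String whatEscapes() {
        try {
            try {
                throw new IllegalArgumentException("real problem");
            } finally {
                cleanup();   // throws: the IllegalArgumentException is discarded
            }
        } catch (RuntimeException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(whatEscapes());   // cleanup failed
    }
}
```

"real problem" never reaches the caller, and unlike try-with-resources, nothing is recorded as suppressed.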

A clean end-to-end example

Everything together - a service method that does I/O, wraps appropriately, chains causes, and uses try-with-resources:

public class UserRepository {

    // Custom unchecked domain exception with structured data + cause support.
    public static class UserLoadException extends RuntimeException {
        private final long userId;
        public UserLoadException(long userId, Throwable cause) {
            super("failed to load user " + userId, cause);   // message + chained cause
            this.userId = userId;
        }
        public long userId() { return userId; }
    }

    public User load(long userId) {
        Objects.requireNonNull(connection, "connection not initialized");  // fail early, clear

        String sql = "SELECT * FROM users WHERE id = ?";
        try (PreparedStatement stmt = connection.prepareStatement(sql)) {  // auto-closed
            stmt.setLong(1, userId);
            try (ResultSet rs = stmt.executeQuery()) {                     // auto-closed
                if (!rs.next())
                    throw new IllegalArgumentException("no user with id " + userId);
                return mapRow(rs);
            }
        } catch (SQLException e) {
            // Wrap the low-level checked exception in our unchecked domain one,
            // preserving the cause. Callers handle UserLoadException, not SQLException.
            throw new UserLoadException(userId, e);
        }
    }
}

The caller never sees SQLException - it sees a clean UserLoadException carrying the userId and the original cause underneath. Resources close automatically. Bad input fails fast with a clear message. This is the shape of production error handling.

Try it

  1. Wrap and chain. Write a method that reads an int from a file (Files.readString + Integer.parseInt). Catch IOException and NumberFormatException, wrap each in a custom unchecked ConfigException with the cause. Trigger both paths and print e.getCause() - confirm the original is preserved. Then "forget" the cause and see how much worse the stack trace is.

  2. Resource closing order. Make two classes implementing AutoCloseable whose close() prints their name. Use both in one try-with-resources. Confirm they close in reverse declaration order. Then throw inside the body and confirm they still close.

  3. Suppressed exceptions. Make an AutoCloseable whose close() throws. Use it in a try-with-resources whose body also throws. Catch the body's exception and print e.getSuppressed() - see the close exception preserved, not lost.

  4. Build a hierarchy. Design OrderException with subclasses OutOfStockException and PaymentFailedException, each carrying relevant data. Write a caller with multiple catch blocks (subclasses first). Deliberately put the superclass catch first and read the compile error.

  5. Spot the anti-patterns. Take this and fix every crime: try { x(); } catch (Throwable t) { log.error("err", t); throw new RuntimeException(t); }. (Too broad - catches Error; double-logs; loses specificity.) Rewrite it correctly.

  6. requireNonNull early. Write a method that uses a parameter three statements in. Pass null. Note the confusing NPE deep in the method. Add Objects.requireNonNull(param, "param required") at the top. Pass null again. Compare the two stack traces - the second points at the real problem.

What you might wonder

"So are checked exceptions just bad?" They're controversial, not bad. They genuinely help when the caller can recover and should be forced to consider failure (file I/O, network). They hurt when they leak through layers and don't compose with streams/lambdas. The modern lean is "unchecked by default, checked when recovery is realistic and you want to force the conversation." Know how to handle both (the JDK forces you to), but design your own as unchecked unless you have a reason.

"How do I throw a checked exception from inside a lambda/stream?" You can't directly - functional interfaces like Function don't declare throws. Options: catch inside the lambda and wrap in an unchecked exception, or use a library helper. This composition problem is a big reason the community drifted toward unchecked exceptions.
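The catch-and-wrap option looks like this. A minimal sketch: parseStrict is a hypothetical method that declares IOException, and UncheckedIOException is the JDK's ready-made unchecked wrapper for it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

public class LambdaCheckedDemo {
    // Hypothetical operation that declares a checked exception.
    static int parseStrict(String s) throws IOException {
        if (s.isBlank()) throw new IOException("blank input");
        return Integer.parseInt(s.trim());
    }

    static List<Integer> parseAll(List<String> inputs) {
        return inputs.stream()
                .map(s -> {
                    try {
                        return parseStrict(s);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);   // wrap, keeping the cause
                    }
                })
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(parseAll(List.of("1", " 2 ")));   // [1, 2]
        try {
            parseAll(List.of("1", ""));
        } catch (UncheckedIOException e) {
            System.out.println(e.getCause().getMessage());   // blank input
        }
    }
}
```

The try/catch inside the lambda is the ceremony people complain about; if it shows up often, pull it into a named helper method and call that from the lambda.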

"Should I ever catch Exception?" At the top of an application - a request handler, a thread's run loop, a main - yes: a last-resort catch-all that logs and returns a clean error to the user, so one bad request doesn't crash the server. Deep in business logic, no - catch specific types. The rule: broad catches belong only at boundaries.

"What about Optional instead of exceptions?" For "value might be absent" (a lookup that finds nothing), Optional<T> is often cleaner than throwing (chapter 07). For "operation failed in a way the caller must handle," exceptions are right. Don't throw an exception for an empty search result; do throw for "the database connection died."
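A quick sketch of the "might be absent" case. The USERS map and findName are hypothetical stand-ins for a real lookup.

```java
import java.util.Map;
import java.util.Optional;

public class OptionalLookupDemo {
    private static final Map<Long, String> USERS = Map.of(1L, "ada", 2L, "linus");   // hypothetical data

    // Absence is a normal outcome here, so it is modeled in the return type.
    static Optional<String> findName(long id) {
        return Optional.ofNullable(USERS.get(id));
    }

    public static void main(String[] args) {
        System.out.println(findName(1L).orElse("unknown"));    // ada
        System.out.println(findName(99L).orElse("unknown"));   // unknown
    }
}
```

The caller decides what "not found" means for them; nothing is thrown for an ordinary miss.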

"Performance - are exceptions slow?" Throwing is moderately expensive, mostly from capturing the stack trace. That's another reason not to use them for control flow. For genuine errors (which are rare by definition), the cost is irrelevant. If you must throw very frequently in a hot path (you usually shouldn't), you can override fillInStackTrace to skip the trace - but that's a rare optimization, covered more in chapter 12.
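For reference, the fillInStackTrace override is this small. FastSignal is a hypothetical exception name; treat this as a sketch of the trick, not a recommendation to use it widely.

```java
public class StacklessDemo {
    // Hypothetical exception for a hot path: skips stack-trace capture.
    static class FastSignal extends RuntimeException {
        FastSignal(String message) { super(message); }

        @Override
        public synchronized Throwable fillInStackTrace() {
            return this;   // don't walk the stack
        }
    }

    // Throws and catches a FastSignal, returning how many stack frames it recorded.
    static int traceDepth() {
        try {
            throw new FastSignal("skip");
        } catch (FastSignal e) {
            return e.getStackTrace().length;
        }
    }

    public static void main(String[] args) {
        System.out.println(traceDepth());   // 0
    }
}
```

The price is obvious from the output: an exception with no stack trace is much harder to debug, so keep this for signals you fully control.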

Done

  • You know the Throwable hierarchy and the Error/Exception, checked/unchecked splits.
  • You know the strategy: unchecked by default, checked when recovery is realistic.
  • You can throw the right standard exception with a useful message, and fail fast with requireNonNull.
  • You can design custom exception hierarchies that carry structured data and support chaining.
  • You use try-with-resources for anything closeable, and understand suppressed exceptions.
  • You always chain causes, and you can spot and fix the five anti-patterns.

Next: functional Java - lambdas, method references, streams used well, and Optional.


07 - Functional Java

What this session is

About ninety minutes. Since Java 8, functions are values you can pass around, and the Streams API lets you express data transformations declaratively. This session covers lambdas, method references, the functional interfaces behind them, streams used well (and when not to use them), and Optional done right. By the end you'll write the clean, pipeline-style Java that fills modern codebases - and know where it helps versus where a plain loop is better.

Lambdas: functions as values

A lambda is an anonymous function you can store in a variable or pass to a method:

// Old way: anonymous class, several lines of ceremony for one line of logic.
Runnable oldWay = new Runnable() {
    public void run() { System.out.println("hi"); }
};

// Lambda: the same thing.
Runnable lambda = () -> System.out.println("hi");

lambda.run();   // prints "hi"

The syntax is (parameters) -> body:

() -> 42                          // no args, returns 42
x -> x * 2                        // one arg (parens optional), returns x*2
(x, y) -> x + y                   // two args
(String s) -> s.length()          // explicit type (usually inferred, so omit)
x -> {                            // block body with explicit return
    int doubled = x * 2;
    return doubled + 1;
}

A lambda is just a compact way to implement a functional interface - an interface with exactly one abstract method. That's the key insight that makes lambdas not magic.

Functional interfaces

A functional interface has one abstract method. A lambda is an instance of one - the lambda is that method's implementation.

@FunctionalInterface              // optional annotation; compiler enforces "one abstract method"
interface Transformer {
    String transform(String input);
}

Transformer upper = s -> s.toUpperCase();   // lambda implements transform()
System.out.println(upper.transform("hi"));  // HI

You rarely define your own, because java.util.function provides the ones you need. The five that cover almost everything:

// Function<T, R> - takes a T, returns an R
Function<String, Integer> length = s -> s.length();
length.apply("hello");        // 5

// Predicate<T> - takes a T, returns boolean (a test)
Predicate<Integer> isEven = n -> n % 2 == 0;
isEven.test(4);               // true

// Consumer<T> - takes a T, returns nothing (a side effect)
Consumer<String> print = s -> System.out.println(s);
print.accept("hi");           // prints hi

// Supplier<T> - takes nothing, returns a T (a factory/lazy value)
Supplier<Double> random = () -> Math.random();
random.get();                 // a random double

// BiFunction<T, U, R> - takes two args, returns a result
BiFunction<Integer, Integer, Integer> add = (a, b) -> a + b;
add.apply(2, 3);              // 5

Plus specializations to avoid autoboxing (chapter 12): IntFunction, ToIntFunction, IntPredicate, etc. - same ideas, primitive-typed. And UnaryOperator<T> (a Function<T,T>) and BinaryOperator<T> (a BiFunction<T,T,T>) for same-type cases.

Knowing these names lets you read any modern Java API. When Stream.map asks for a Function<? super T, ? extends R>, you now know exactly what it wants (and you can read the wildcards from chapter 04).

Method references: lambdas, shorter

When a lambda just calls one existing method, a method reference says the same thing more cleanly. :: is the method-reference operator.

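// Each method reference is equivalent to the lambda in its comment:
Function<String, String> upper = String::toUpperCase;    // s -> s.toUpperCase()        (instance method of the arg)
Consumer<String> print = System.out::println;            // s -> System.out.println(s)  (method on a specific object)
Function<String, Integer> parse = Integer::parseInt;     // s -> Integer.parseInt(s)    (static method)
Supplier<List<String>> fresh = ArrayList::new;           // () -> new ArrayList<>()     (constructor)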

The four kinds:

  1. Static method: Integer::parseInt for s -> Integer.parseInt(s).
  2. Instance method of a particular object: System.out::println for s -> System.out.println(s).
  3. Instance method of the parameter: String::toUpperCase for s -> s.toUpperCase() (the parameter becomes the receiver).
  4. Constructor: ArrayList::new for () -> new ArrayList<>().

Use a method reference when the lambda body is exactly one method call - it's more readable. Use a lambda when there's any additional logic. Don't contort code to force a method reference; clarity wins.

Streams: declarative data processing

A Stream is a pipeline for processing a sequence of elements. You describe what you want (filter these, transform those, collect the rest) instead of how to loop. Compare:

// Imperative: how to do it, step by step.
List<String> result = new ArrayList<>();
for (Person p : people) {
    if (p.age() >= 18) {
        result.add(p.name().toUpperCase());
    }
}
result.sort(Comparator.naturalOrder());

// Declarative: what you want.
List<String> result = people.stream()
    .filter(p -> p.age() >= 18)         // keep adults
    .map(p -> p.name().toUpperCase())   // transform to uppercase names
    .sorted()                            // sort them
    .toList();                           // collect to a list

A stream pipeline has three parts:

  1. A source - collection.stream(), Stream.of(...), Arrays.stream(arr), IntStream.range(0, n).
  2. Intermediate operations - lazy transformations that return a new stream: filter, map, sorted, distinct, limit, skip, peek, flatMap. These do nothing until a terminal operation runs.
  3. A terminal operation - triggers the pipeline and produces a result: toList, collect, forEach, count, reduce, findFirst, anyMatch, min/max.

Laziness matters: intermediate ops don't run until a terminal op pulls elements through. This lets streams short-circuit:

Optional<Integer> firstBig = numbers.stream()
    .filter(n -> n > 1000)    // lazy
    .findFirst();             // terminal - stops at the FIRST match, doesn't scan the rest
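
To see the laziness rather than take it on faith, here is a small sketch that records which elements actually get pulled through. The class and method names are made up for the demo, and the peek side effect is there only to observe the pipeline; it is not a pattern to copy into real code.

```java
import java.util.ArrayList;
import java.util.List;

class LazyDemo {
    // Returns the elements that were actually pulled through the pipeline.
    static List<Integer> examinedUntilFirstMatch(List<Integer> numbers, int threshold) {
        List<Integer> examined = new ArrayList<>();
        numbers.stream()
            .peek(examined::add)            // demo-only side effect: record each element examined
            .filter(n -> n > threshold)
            .findFirst();                   // short-circuits at the first match
        return examined;
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(5, 2000, 7, 3000);
        System.out.println(examinedUntilFirstMatch(numbers, 1000));   // [5, 2000]

        // No terminal operation: the pipeline is only described, never run.
        List<Integer> touched = new ArrayList<>();
        numbers.stream().peek(touched::add).filter(n -> n > 1000);
        System.out.println(touched);                                   // []
    }
}
```

The elements 7 and 3000 are never examined in the first case, and nothing at all is examined in the second.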

The operations you'll use constantly

// filter - keep elements matching a predicate
nums.stream().filter(n -> n % 2 == 0)

// map - transform each element
names.stream().map(String::length)

// flatMap - flatten nested structure (a stream of lists into a stream of elements)
listOfLists.stream().flatMap(List::stream)

// reduce - fold into a single value
nums.stream().reduce(0, Integer::sum)       // sum

// collect - the powerful terminal; build any result
people.stream().collect(Collectors.toList())
people.stream().collect(Collectors.groupingBy(Person::department))
people.stream().collect(Collectors.toMap(Person::id, Person::name))
names.stream().collect(Collectors.joining(", "))    // joining needs a stream of strings (CharSequence)

// counting, summing, averaging
people.stream().collect(Collectors.groupingBy(Person::department, Collectors.counting()))

Collectors.groupingBy is the workhorse - it does in one line what nested maps-of-lists did in chapter 05:

// Group people by department:
Map<String, List<Person>> byDept = people.stream()
    .collect(Collectors.groupingBy(Person::department));

// Count per department:
Map<String, Long> countByDept = people.stream()
    .collect(Collectors.groupingBy(Person::department, Collectors.counting()));

// Average age per department:
Map<String, Double> avgAge = people.stream()
    .collect(Collectors.groupingBy(Person::department,
             Collectors.averagingInt(Person::age)));
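
A runnable version of the same idea, so you can check the results yourself. The Person record and the sample data are assumptions made up for this sketch; Collectors.mapping is added to show that the downstream collector can transform elements before collecting them.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDemo {
    // Hypothetical record and sample data, just for this sketch.
    record Person(String name, String department, int age) {}

    static final List<Person> PEOPLE = List.of(
        new Person("Ana", "eng", 30),
        new Person("Ben", "eng", 40),
        new Person("Cy", "ops", 50));

    static Map<String, Long> countByDept(List<Person> people) {
        return people.stream()
            .collect(Collectors.groupingBy(Person::department, Collectors.counting()));
    }

    static Map<String, Double> avgAgeByDept(List<Person> people) {
        return people.stream()
            .collect(Collectors.groupingBy(Person::department,
                     Collectors.averagingInt(Person::age)));
    }

    // mapping(...) transforms each element before the downstream collector sees it.
    static Map<String, List<String>> namesByDept(List<Person> people) {
        return people.stream()
            .collect(Collectors.groupingBy(Person::department,
                     Collectors.mapping(Person::name, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(countByDept(PEOPLE).get("eng"));    // 2
        System.out.println(avgAgeByDept(PEOPLE).get("eng"));   // 35.0
        System.out.println(namesByDept(PEOPLE).get("eng"));    // [Ana, Ben]
    }
}
```

The sketch prints single entries rather than whole maps because groupingBy makes no promise about the order of the resulting map's keys.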

Primitive streams

IntStream, LongStream, DoubleStream avoid autoboxing (chapter 12) and add numeric methods:

int sum = IntStream.rangeClosed(1, 100).sum();          // 5050
double avg = people.stream().mapToInt(Person::age).average().orElse(0);
IntStream.range(0, 5).forEach(System.out::println);     // 0 1 2 3 4

Use mapToInt/mapToObj to move between object and primitive streams.

When NOT to use streams

Streams are not always better. Reach for a plain loop when:

  • The logic is simple iteration with side effects. for (var x : list) process(x); is clearer than list.forEach(this::process) for plain iteration - and far clearer than list.stream().forEach(...) (never add .stream() just to call forEach).
  • You need the index. Streams hide indices; a classic for (int i = ...) is cleaner when you need i.
  • You're mutating external state in the pipeline. Streams should be functional - transform inputs to outputs. Mutating a list or counter from inside map/forEach is a smell and breaks under parallelism.
  • Debugging step-by-step matters. A loop is trivial to breakpoint; a long stream chain is harder (though .peek() helps).
  • Performance is critical in a hot loop. Streams have small per-element overhead. For most code it's irrelevant; in a tight numeric loop, a plain for can be faster (measure - chapter 13).

The guidance: use streams for declarative transformations (filter/map/collect pipelines); use loops for simple iteration, index-dependent logic, and mutation. A stream that's three operations and reads like a sentence is great. A stream with a giant lambda block, side effects, and peek debugging is worse than the loop it replaced.
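
A short sketch of that boundary, with made-up method names. The first method is the smell (the pipeline mutates a list that lives outside it), the second is the same logic done declaratively, and the third is a case where the index makes a plain loop the clearer choice.

```java
import java.util.ArrayList;
import java.util.List;

class StreamOrLoop {
    // Smell: the pipeline mutates state outside itself.
    static List<String> shortNamesMutating(List<String> names) {
        List<String> out = new ArrayList<>();
        names.stream()
            .filter(n -> n.length() <= 3)
            .forEach(out::add);          // would be unsafe if this became a parallel stream
        return out;
    }

    // Better: the pipeline produces its own result.
    static List<String> shortNames(List<String> names) {
        return names.stream()
            .filter(n -> n.length() <= 3)
            .toList();
    }

    // Index needed: a classic loop says it directly.
    static List<String> numbered(List<String> names) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            out.add((i + 1) + ". " + names.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = List.of("Ana", "Benedict", "Cy");
        System.out.println(shortNames(names));   // [Ana, Cy]
        System.out.println(numbered(names));     // [1. Ana, 2. Benedict, 3. Cy]
    }
}
```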

Optional: representing "maybe a value"

Before Optional, "this might return nothing" meant returning null - and every caller forgetting a null check was a NullPointerException waiting to happen. Optional<T> makes absence explicit in the type.

// A method that might not find a result returns Optional, not null.
Optional<User> findUser(long id) {
    User u = lookup(id);
    return Optional.ofNullable(u);   // wraps null as Optional.empty()
}

The caller is now forced by the type to handle absence:

Optional<User> result = findUser(42);

// Good patterns:
String name = result.map(User::name).orElse("unknown");      // transform + default
result.ifPresent(u -> sendEmail(u));                          // do something if present
User u = result.orElseThrow(() -> new UserNotFoundException(42));  // or throw

// Chaining through possibly-absent steps:
String city = findUser(42)
    .map(User::address)        // Optional<Address>
    .map(Address::city)        // Optional<String>
    .orElse("no city");        // safe even if user or address is absent

Using Optional well - and badly

Do:

  • Return Optional from methods that may legitimately find nothing (findById, firstMatching). It tells the caller, in the type, to handle absence.
  • Chain with map/filter/flatMap to transform safely through possibly-missing values.
  • End with orElse, orElseGet, orElseThrow, or ifPresent.

Don't:

  • Don't use Optional for fields - it adds overhead and isn't serializable; use a nullable field with clear documentation, or restructure.
  • Don't use Optional for method parameters - overload or accept nullable instead; forcing callers to wrap arguments in Optional is clumsy.
  • Don't call .get() without checking - optional.get() on an empty Optional throws NoSuchElementException, recreating the very NPE problem Optional was meant to solve. Use orElse/orElseThrow instead.
  • Don't do if (opt.isPresent()) opt.get() - that's just a null check with extra steps. Use ifPresent, map, or orElse.

// BAD - Optional used like a null check
if (result.isPresent()) {
    return result.get().name();
}
return "unknown";

// GOOD - functional, no .get()
return result.map(User::name).orElse("unknown");

Optional is for return values that may be absent. That's its job. Used there, it eliminates a category of NPEs; used as a field or parameter, it just adds noise.

A worked example: a small report pipeline

Everything together - parse, filter, group, summarize:

record Sale(String region, String product, double amount) {}

List<Sale> sales = loadSales();

// Total sales per region, only counting sales over $100 (the resulting map is unordered).
Map<String, Double> totalByRegion = sales.stream()
    .filter(s -> s.amount() > 100)                          // keep significant sales
    .collect(Collectors.groupingBy(
        Sale::region,                                        // group by region
        Collectors.summingDouble(Sale::amount)));           // sum amounts per group

// Top product overall by revenue:
Optional<String> topProduct = sales.stream()
    .collect(Collectors.groupingBy(
        Sale::product,
        Collectors.summingDouble(Sale::amount)))            // Map<product, total>
    .entrySet().stream()
    .max(Map.Entry.comparingByValue())                      // highest total
    .map(Map.Entry::getKey);                                // just the product name

topProduct.ifPresent(p -> System.out.println("Top product: " + p));

Written imperatively, this takes hand-built maps, explicit loops, and a manual search for the maximum. As a stream pipeline it reads like the requirement. This is where streams shine - declarative aggregation over collections.

Try it

  1. Lambda to method reference. Write names.stream().map(s -> s.toUpperCase()).forEach(s -> System.out.println(s)). Convert both lambdas to method references. Confirm identical output and note the readability gain.

  2. The five functional interfaces. Declare a Function, Predicate, Consumer, Supplier, and BiFunction each as a variable, and call its method. This cements what every stream operation is asking for.

  3. groupingBy three ways. Given a list of words, build (a) Map<Integer, List<String>> grouping by length, (b) Map<Integer, Long> counting per length, (c) Map<Character, List<String>> grouping by first letter. One collect call each.

  4. Stream vs loop judgment. Write "sum of squares of even numbers in a list" as both a stream and a loop. Then write "print each element with its index" as both. Notice the stream wins the first (declarative transform) and the loop wins the second (needs the index). Internalize the boundary.

  5. Optional chains. Write findUser returning Optional<User>. Chain .map(User::manager).map(User::name).orElse("no manager"). Test with a user who has a manager, one who doesn't, and a missing user. One chain, three cases, no NPE.

  6. Refactor the anti-pattern. Take if (opt.isPresent()) { return opt.get().field(); } else { return "default"; } and rewrite as a one-line map().orElse(). Then find a real null-returning method in your own code and convert it to return Optional.

What you might wonder

"Are streams slower than loops?" Slightly, per element, due to the pipeline machinery and (for object streams) boxing. For the vast majority of code the difference is irrelevant and readability wins. In a measured hot path with millions of iterations, a plain for (especially over primitives) can be meaningfully faster. Rule: write the clear version first, optimize only what profiling (chapter 13) flags.

"What about parallel streams (.parallelStream())?" They split the work across threads automatically. Tempting, but a trap for beginners: they only help for large datasets with CPU-heavy, independent, stateless operations, and they can be slower (or wrong) otherwise. They also use a shared common thread pool. Don't reach for parallelStream until you've measured a real bottleneck and understand the constraints - it's a chapter 11 (concurrency) topic in disguise.

"toList() vs collect(Collectors.toList())?" stream.toList() (Java 16+) is the modern, concise form and returns an unmodifiable list. collect(Collectors.toList()) returns a list with no guarantee about mutability. Prefer .toList() unless you specifically need a mutable result (then collect(Collectors.toCollection(ArrayList::new))).

"When flatMap vs map?" map transforms each element to exactly one element. flatMap transforms each element to a stream of elements and flattens them all into one stream - use it when each input produces zero-or-many outputs (a list of orders, each with many items, into a stream of all items). It's also how you chain Optionals that themselves return Optional.

"Is Optional overhead a problem?" Each Optional is a small object allocation. For return values it's negligible and the safety is worth it. That's why the advice is "return values yes, fields/parameters/hot-loops no" - in a tight loop creating millions of Optionals, the allocation adds up (chapter 12). Right tool, right place.

"Can lambdas access local variables?" Yes, but only effectively final ones - variables you don't reassign after the lambda captures them. The lambda captures the value, not a live reference. This is the same closure-capture rule that causes bugs in concurrency (chapter 09) - worth remembering now.

Done

  • You can write lambdas and know they implement single-method functional interfaces.
  • You know the five core functional interfaces (Function, Predicate, Consumer, Supplier, BiFunction) - the vocabulary of every modern Java API.
  • You can convert lambdas to method references when it reads better.
  • You can build stream pipelines: source -> intermediate ops -> terminal, with filter/map/collect/groupingBy/reduce.
  • You know when not to use a stream (simple iteration, indices, mutation, hot loops).
  • You use Optional for absent return values, with map/orElse, and avoid the .get() and field/parameter anti-patterns.

Next: memory for application developers - heap, stack, references, GC awareness, and how Java programs leak.


08 - Memory for app developers

What this session is

About an hour. Not JVM internals - that's Java Mastery. This is the working model of Java memory that every application developer needs: where objects live, what a reference really is, how the garbage collector decides what to free, and - most importantly - how Java programs leak memory despite having a garbage collector. By the end you'll reason about object lifetimes, avoid the common leaks, and know what OutOfMemoryError is actually telling you.

"Java has garbage collection, so I don't think about memory" - the half-truth

It's true you never call free() or delete. The garbage collector (GC) reclaims memory automatically. But "automatic" is not "free of responsibility." Java programs absolutely leak memory, run out of it, and slow to a crawl under GC pressure. The difference from C is how you think about it: not "did I free this?" but "is something still holding a reference that shouldn't be?"

This chapter gives you that model.

Stack and heap: where things live

Java memory splits into two regions with different rules.

The stack holds method call frames. Each method call pushes a frame containing its local variables and parameters. When the method returns, its frame pops and that memory is reclaimed instantly - no GC involved. Each thread has its own stack.

The heap holds all objects (everything created with new, plus arrays). It's shared across all threads. Objects live here until the GC determines nothing references them.

The crucial distinction: a local variable on the stack often holds a reference to an object on the heap.

void example() {
    int count = 5;                    // primitive: the value 5 lives on the stack
    String name = "Alice";            // reference on stack -> String object on heap
    Point p = new Point(1, 2);        // reference on stack -> Point object on heap
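}   // when example() returns: stack frame popped. count, name, and p are gone.
    // The Point object on the heap is now unreferenced -> eligible for GC.
    // ("Alice" is a string literal: the JVM's string pool still references it, so it stays alive.)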

Picture it:

STACK (per thread)              HEAP (shared)
┌─────────────────┐
│ count = 5       │            ┌──────────────┐
│ name  ──────────┼──────────> │ "Alice"      │
│ p     ──────────┼──────────> │ Point{1, 2}  │
└─────────────────┘            └──────────────┘

Two consequences you must internalize:

  1. Primitives (int, double, boolean, ...) hold their value directly. A local int is the number itself, on the stack. An Integer is a reference to an object on the heap (this is why autoboxing costs - chapter 12).

  2. Object variables hold references, not objects. When you write Point p2 = p1, you copy the reference - both point at the same heap object. This is the reference-types lesson from chapter 03's defensive-copy section, now with the memory picture behind it.

Point a = new Point(1, 2);
Point b = a;              // b and a reference the SAME object
b.move(5, 5);            // mutating through b
System.out.println(a);   // a sees the change - one object, two references

What a reference is

A reference is a handle to an object on the heap - conceptually an address, though the JVM may move objects around (the GC compacts the heap), so you can't do pointer arithmetic like C. You can only: follow a reference (.field, .method()), compare references (==), assign references, and pass them.

null is a reference that points at nothing. Dereferencing it (null.field) is the NullPointerException you've met many times - the memory equivalent of following an address to nowhere.

How the garbage collector decides what to free

The GC's job: find objects nothing references anymore, and reclaim their memory. The core concept is reachability.

An object is reachable if you can get to it by following references starting from a GC root. GC roots are the "anchors":

  • Local variables on any thread's stack.
  • Static fields of loaded classes.
  • Active threads.
  • (A few JVM-internal ones.)

Starting from the roots, the GC follows every reference, marking everything it can reach as live. Anything it can't reach is garbage - nothing in the running program can possibly use it - so its memory is reclaimed.

GC ROOTS                  reachable (live)              unreachable (garbage)
┌──────────┐
│ stack var├──────> Object A ──────> Object B          Object D <─────> Object E
│ static   ├──────> Object C                           (no path from a root)
└──────────┘

Objects A, B, C are reachable from roots - kept. D and E reference each other, but no chain of references from a root leads to either of them - both are collected, even though each still has an incoming reference. (This is why Java can collect cycles that simple reference-counting can't.)

The practical takeaway, and the key to everything else in this chapter:

An object can be freed only once it becomes unreachable. A "memory leak" in Java is an object that's still reachable but will never be used again.

You don't free memory. You make objects unreachable, by dropping the references that keep them alive. Once no chain of references from a root reaches the object, it becomes collectible.

Generational GC, briefly

You don't need GC internals, but one fact shapes how you think about allocation: the JVM uses a generational collector based on a simple observation - most objects die young. A request handler creates dozens of short-lived objects that are garbage milliseconds later; a few objects (caches, config) live the whole program.

So the heap is split into a young generation (where new objects are born, collected frequently and cheaply) and an old generation (where long-lived survivors get promoted, collected rarely). This is why:

  • Allocating many short-lived objects is cheap. They're born and die in the young generation, collected in fast minor GCs. Don't contort your code to avoid all allocation - the GC is optimized for exactly this pattern.
  • The expensive collections are of the old generation (major/full GCs, longer pauses). Objects that get promoted there and then become garbage are the costly ones - which is exactly what leaks produce.

That's the entire model you need. The mechanics (G1, ZGC, regions, write barriers) are Java Mastery.

How Java programs leak memory

A garbage collector prevents use-after-free and double-free bugs. It does not prevent leaks. A leak in Java is: an object you're done with, but a reference still reaches it, so the GC can't collect it. Here are the four classic sources - learn to spot each.

1. Forgotten references in long-lived collections

The most common leak. You add to a collection that lives a long time, and never remove.

class Cache {
    // This map lives for the whole program. Anything added is reachable forever.
    private final Map<String, byte[]> entries = new HashMap<>();

    void put(String key, byte[] data) {
        entries.put(key, data);    // added... but who removes? Nobody. Leak.
    }
}

Every put keeps the byte[] reachable forever, even after no one needs it. A cache without eviction is a memory leak with extra steps. Fix: bound it (an LRU via LinkedHashMap from chapter 05), evict explicitly, or use a cache library (Caffeine) that handles eviction.
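
One way to sketch the "bound it" fix, assuming a simple least-recently-used policy. LruCache is a hypothetical class for this example; it wraps a LinkedHashMap in access order and overrides removeEldestEntry, which the map consults after every insertion. It is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: single-threaded use, no expiry, fixed entry count.
class LruCache<K, V> {
    private final Map<K, V> entries;

    LruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;   // evict the eldest once over the bound
            }
        };
    }

    void put(K key, V value) { entries.put(key, value); }
    V get(K key) { return entries.get(key); }                      // counts as a use
    boolean contains(K key) { return entries.containsKey(key); }   // does not count as a use
    int size() { return entries.size(); }

    public static void main(String[] args) {
        LruCache<String, byte[]> cache = new LruCache<>(2);
        cache.put("a", new byte[1024]);
        cache.put("b", new byte[1024]);
        cache.get("a");                    // "a" is now the most recently used
        cache.put("c", new byte[1024]);    // over the bound: "b" is evicted
        System.out.println(cache.contains("a") + " " + cache.contains("b") + " " + cache.contains("c"));
        // true false true
    }
}
```

Once "b" is evicted, the map no longer references its byte[], so the array becomes collectible as soon as nothing else holds it.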

2. Listeners and callbacks never unregistered

button.addListener(this::onClick);   // registers a reference to `this`
// ... if `this` is never unregistered, the button keeps it alive forever

The event source holds a reference to your listener. If the listener (or its enclosing object) should be collected but you never removeListener, it leaks. Fix: unregister in a cleanup method, or use weak references (below).
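
A sketch of the unregister fix. EventSource and Screen are hypothetical classes; real UI toolkits name things differently. The detail worth noticing is that the listener is stored in a field: evaluating this::onEvent a second time is not guaranteed to give an object equal to the first, so removing "the same" method reference written out again may remove nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event source that holds strong references to its listeners.
class EventSource {
    private final List<Runnable> listeners = new ArrayList<>();

    void addListener(Runnable listener) { listeners.add(listener); }
    void removeListener(Runnable listener) { listeners.remove(listener); }
    int listenerCount() { return listeners.size(); }
    void fire() { listeners.forEach(Runnable::run); }
}

class Screen implements AutoCloseable {
    private final EventSource source;
    // Keep the exact object that was registered, so it can be removed later.
    private final Runnable listener = this::onEvent;
    private int events;

    Screen(EventSource source) {
        this.source = source;
        source.addListener(listener);
    }

    private void onEvent() { events++; }
    int events() { return events; }

    @Override
    public void close() {
        source.removeListener(listener);   // the source no longer keeps this Screen alive
    }

    public static void main(String[] args) {
        EventSource source = new EventSource();
        try (Screen screen = new Screen(source)) {
            source.fire();
            System.out.println(screen.events() + " event, " + source.listenerCount() + " listener");
        }
        System.out.println(source.listenerCount() + " listeners after close");
    }
}
```

Making the owner AutoCloseable lets try-with-resources (chapter 06) handle the unregistering for you.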

3. The classic: static collections

class Registry {
    // static = lives as long as the class is loaded = essentially forever.
    private static final List<Registry> ALL = new ArrayList<>();

    Registry() { ALL.add(this); }   // every instance ever created stays reachable -> leak
}

Static fields are GC roots (see the list above): whatever they reference stays reachable for as long as the class is loaded, which usually means the program's life. Anything a static collection holds is effectively immortal. Be very deliberate about what you put in static state.

4. Unclosed resources holding buffers

A resource (stream, connection) you don't close may hold native buffers and references. This is the chapter 06 lesson with the memory angle: try-with-resources isn't just tidy, it prevents resource-and-memory leaks.

The unifying diagnosis for all four: find the reference chain from a GC root to the object that should be dead. A heap profiler (chapter 13) shows you exactly this chain - "this 2 GB of byte arrays is retained by Cache.entries, which is held by a static field." That sentence is how every Java memory leak is solved.

Reference strength: strong, weak, soft

Not all references keep objects equally alive. Most are strong (an ordinary Point p = ...). Java offers weaker ones for cache and listener scenarios.

import java.lang.ref.WeakReference;

WeakReference<BigThing> ref = new WeakReference<>(new BigThing());
BigThing t = ref.get();    // returns the object, or null if it's been collected

  • Strong reference (the default): keeps the object alive. As long as a strong reference exists, the object is never collected.
  • Weak reference: does not prevent collection. If the only references to an object are weak, the GC collects it and ref.get() returns null. Used for caches and listener registries that shouldn't keep their keys alive - WeakHashMap is built on this (entries vanish when keys become otherwise-unreachable).
  • Soft reference: like weak, but the GC keeps it until memory is tight, then collects it. Used for memory-sensitive caches.

You won't use these daily, but recognize them: WeakHashMap for listener registries that auto-clean, soft references for "cache this until we need the memory back." Reaching for them is a sign you're solving a real lifetime problem - use them deliberately.

OutOfMemoryError: what it actually means

When the heap fills with reachable objects and the GC can't free enough, you get java.lang.OutOfMemoryError: Java heap space. It almost always means one of two things:

  1. A leak - reachable objects accumulating without bound (one of the four sources above). The fix is finding and breaking the reference chain.
  2. Genuinely too much data for the configured heap - you're trying to hold more than -Xmx allows. The fix is either more heap or processing data in chunks/streams instead of loading it all.

The diagnostic move (chapter 13): enable -XX:+HeapDumpOnOutOfMemoryError, get a heap dump, open it in a tool (Eclipse MAT, VisualVM), and look at "what's retaining the most memory and what's the path to a GC root." That path is the bug.

Writing GC-friendly code (without obsessing)

The balance: don't micro-optimize allocation everywhere (the GC is good at short-lived objects), but be aware of the patterns that create pressure. The big ones, covered fully in chapter 12:

  • Don't hold references longer than needed. Null out a reference in a long-lived object when you're done with what it points to, so the GC can reclaim it. (Don't do this for ordinary locals - they're freed when the method returns. Do it for fields of long-lived objects.)
  • Bound your caches. Unbounded caches are the #1 leak.
  • Prefer streaming over loading everything. Reading a 10 GB file into a List<String> will OOM; reading it line by line won't.
  • Be aware of autoboxing in hot loops (chapter 12) - a million Integers is a million heap objects.
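
The streaming point can be sketched with Files.lines, which reads lazily instead of loading the whole file. The file contents and the "ERROR" marker below are made up for the demo; the try-with-resources matters because the stream holds an open file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

class StreamingRead {
    // Processes one line at a time; memory use does not grow with file size.
    static long countMatching(Path file, String needle) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {   // closes the underlying file
            return lines.filter(line -> line.contains(needle)).count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".log");
        try {
            Files.write(tmp, List.of("ok", "ERROR disk", "ok", "ERROR net"));
            System.out.println(countMatching(tmp, "ERROR"));   // 2
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

By contrast, Files.readAllLines would build the full List<String> in memory first, which is exactly the pattern the bullet warns about for very large files.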

Try it

  1. See reachability. Create an object, assign it to two references, null one - it stays alive (the other reference). Null both - now it's collectible. You can't observe the collection directly, but reason through each step: at which line does the object become unreachable?

  2. Build a leak. Write a Cache with an unbounded static Map. Run it with a small heap (for example -Xmx256m) so the limit is actually reachable. In a loop, put a million 1 KB byte arrays with unique keys. Watch used memory climb (Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()) until you hit OutOfMemoryError. Then bound it with an LRU LinkedHashMap (chapter 05) and watch memory stabilize. Feel the difference between "reachable forever" and "evicted."

  3. WeakHashMap demo. Put entries in a WeakHashMap<Key, Value> keyed by objects you hold strong references to. Print the size. Null your strong references to the keys, call System.gc() (a hint, not a guarantee), and print the size again - entries vanish because nothing else reaches the keys. Compare with a normal HashMap where they persist.

  4. Reference vs value semantics. Make a mutable Counter object. Assign a = new Counter(); b = a; b.increment(); and print a. Confirm both references see the change (one object). Then do the same with an int and confirm independence (value copy). This is the stack/heap distinction in action.

  5. Heap dump on OOM. Run the leak from #2 with -Xmx64m -XX:+HeapDumpOnOutOfMemoryError. Open the resulting .hprof in VisualVM or Eclipse MAT. Find the dominator (HashMap / byte[]) and the path to the GC root. This is exactly how production leaks get solved.

What you might wonder

"Should I call System.gc()?" Almost never. It's a hint the JVM can ignore, it forces a full (expensive) collection, and needing it usually signals a design problem. The GC runs when it needs to. The only legitimate uses are niche (some benchmarking, some demos like exercise 3). In application code, don't.

"Does setting a variable to null help?" For local variables, no - they're freed when the method returns; nulling them early is noise. For fields of long-lived objects, sometimes yes - if a long-lived object holds a reference to something big it no longer needs, nulling that field lets the GC reclaim the big thing sooner. The JDK does this in a few places (ArrayList.clear nulls its slots). Don't sprinkle x = null everywhere; do it deliberately when a long-lived object outlives its need for a large referent.

"What's the difference between OutOfMemoryError: heap space and : Metaspace?" Heap space = too many objects (this chapter). Metaspace = too many loaded classes (usually a classloader leak in app servers that redeploy repeatedly). Different cause, different fix; the error message tells you which.

"How big should -Xmx be?" Enough for your working set plus headroom, but not so much that full GCs become long pauses. For containers, the JVM is container-aware (since Java 10+) and sizes the heap from the container memory limit by default. Setting -Xmx to ~75% of the container limit (leaving room for stacks, metaspace, and native memory) is a common starting point - measure and adjust.

"Is the stack ever a problem?" Yes - deep or infinite recursion overflows it (StackOverflowError). Each call adds a frame; too many frames and you run out of stack. Unlike the heap, you rarely tune the stack; you fix the recursion. Default stack size (~512KB-1MB per thread) handles thousands of frames.

"Do records/immutability help memory?" Immutability (chapter 03) helps correctness and lets objects be safely shared (so you can reuse one instance instead of copying). It doesn't inherently use less memory, but sharing immutable instances instead of defensive-copying can reduce allocation. The bigger memory win from immutability is indirect: it discourages the long-lived-mutable-state patterns that leak.

Done

  • You know the stack (frames, locals, primitives) vs heap (all objects), and that variables hold references.
  • You understand reachability: objects live while reachable from a GC root, and a leak is a still-reachable-but-unused object.
  • You know the generational model enough to know short-lived allocation is cheap.
  • You can identify the four leak sources: long-lived collections, listeners that are never unregistered, static state, unclosed resources.
  • You know strong/weak/soft references and WeakHashMap.
  • You know what OutOfMemoryError means and how a heap dump finds the cause.

Next: the heart of this path. Concurrency I - threads, and the three problems that make shared state dangerous.

09 - Concurrency I: threads and the three problems

What this session is

About two hours - this is the most important chapter in the path, and the hardest. Almost every real Java program runs concurrently: a web server handles many requests at once, each on its own thread. The moment two threads touch the same data, a category of bug appears that you've never had to think about before - bugs that pass every test, work on your machine, and corrupt data randomly in production. This chapter is about seeing those bugs. Chapters 10 and 11 are about fixing them. Do every exercise here by actually running the code - concurrency cannot be learned by reading.

What a thread is

A thread is an independent path of execution within your program. Your main method runs on a thread. You can start more, and they run concurrently - the operating system (and the JVM) interleave them, giving each slices of CPU time, possibly truly in parallel on multiple cores.

public class Hello {
    public static void main(String[] args) {
        Thread t = new Thread(() -> {
            System.out.println("hello from another thread");
        });
        t.start();                          // starts the thread - runs concurrently with main
        System.out.println("hello from main");
        // The two prints can appear in either order - they run concurrently.
    }
}

new Thread(runnable) creates a thread; .start() runs its Runnable on a new thread of execution. (Calling .run() directly would not start a thread - it'd just run the code on the current thread. Always .start().)
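
To see the start()/run() difference for yourself, here is a minimal sketch. The class name StartVsRun, the helper method, and the thread name "worker-1" are made up for this illustration; it simply records which thread executed the Runnable in each case.

```java
class StartVsRun {
    // Returns the name of the thread that actually executed the Runnable.
    static String nameWhenCalledWith(boolean useStart) {
        String[] seen = new String[1];
        Thread t = new Thread(() -> seen[0] = Thread.currentThread().getName(), "worker-1");
        if (useStart) {
            t.start();                 // runs the Runnable on a new thread of execution
            try {
                t.join();              // wait for it, so reading seen[0] below is safe
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        } else {
            t.run();                   // plain method call: runs on the calling thread
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("start(): ran on " + nameWhenCalledWith(true));
        System.out.println("run():   ran on " + nameWhenCalledWith(false));
    }
}
```

With start() the Runnable reports worker-1; with run() it reports whatever thread made the call, which is exactly the mistake the paragraph above warns about.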

Why threads exist: to do things at the same time. Wait for a slow network call on one thread while computing on another. Handle thousands of simultaneous web requests. Use all your CPU cores for a parallel computation. Concurrency is how software stays responsive and uses modern hardware.

Joining and the basics

join() waits for a thread to finish:

Thread worker = new Thread(() -> {
    // do some work
});
worker.start();
worker.join();    // main blocks here until worker finishes
System.out.println("worker done");

A few essentials:

Thread.currentThread().getName();       // who am I
Thread.sleep(1000);                     // pause this thread 1 second (throws InterruptedException)
t.isAlive();                            // is it still running
t.interrupt();                          // request cancellation (cooperative - chapter 11)

That's the mechanics. Now the danger.

The first problem: race conditions

Here is the bug that defines concurrency. Two threads incrementing a shared counter:

public class RaceDemo {
    static int counter = 0;             // shared mutable state

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter++;              // looks atomic. IS NOT.
            }
        };

        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join();  t2.join();

        System.out.println(counter);    // EXPECTED 200000. ACTUAL: something less, varies each run.
    }
}

Run it. You'll almost never get 200000 - you'll get something like 137492 or 156010, a different wrong number each time. The counter lost increments. Why?

counter++ is not one operation. It's three:

  1. Read counter from memory into the CPU.
  2. Add 1.
  3. Write the result back to memory.

Now interleave two threads:

Thread 1: read counter (it's 5)
Thread 2: read counter (also 5)        <- both saw 5
Thread 1: add 1 -> 6, write 6
Thread 2: add 1 -> 6, write 6          <- overwrites! Two increments, but counter only went 5 -> 6

Two increments happened; the counter advanced by one. One was lost. This is a race condition: the result depends on the timing of how operations interleave, which is nondeterministic. Run it again and again and you'll see a different wrong answer almost every time.
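
The interleaving above can be replayed deterministically on a single thread, which makes the lost update easy to study. This is only a simulation: the class LostUpdateReplay and the two local variables standing in for each thread's private copy are illustrative, and no real threads are involved.

```java
class LostUpdateReplay {
    static int counter;

    static int replay() {
        counter = 5;
        int t1Local = counter;      // Thread 1: read counter (it's 5)
        int t2Local = counter;      // Thread 2: read counter (also 5)
        counter = t1Local + 1;      // Thread 1: add 1 -> 6, write 6
        counter = t2Local + 1;      // Thread 2: add 1 -> 6, write 6 (overwrites)
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(replay());   // prints 6, not 7
    }
}
```

Each "thread" works from its own stale copy of the value, so the second write erases the first. In RaceDemo the same thing happens, except the scheduler picks the interleaving for you.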

The general definition: a race condition is when the correctness of your program depends on the relative timing of threads. Any time two threads access the same mutable data and at least one writes, without coordination, you have one.

The cruelty: it usually works in testing. With light load, the threads happen not to collide. Then production hits it with real concurrency and the data corrupts intermittently, unreproducibly. This is why concurrency bugs are the most feared kind.

The second problem: visibility

Even simpler than a race, and more insidious. One thread sets a flag; another never sees it.

public class VisibilityDemo {
    static boolean running = true;       // shared flag

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            int count = 0;
            while (running) {             // spin until told to stop
                count++;
            }
            System.out.println("stopped after " + count);
        });
        worker.start();

        Thread.sleep(100);
        running = false;                 // tell it to stop
        System.out.println("told it to stop");
        // worker may NEVER stop - it may never see running == false.
    }
}

You set running = false. Common sense says the worker's loop ends. But it may spin forever. Why?

Each thread can cache values in CPU registers or per-core caches for speed. The worker thread may have cached running == true and never re-read it from main memory. The compiler and CPU are also free to reorder and optimize reads of a variable they don't know another thread is changing. Without explicit coordination, there is no guarantee that one thread's write to a variable ever becomes visible to another thread.

This is the visibility problem: a write by one thread may never be seen by another. It's not about timing of interleaving (like a race) - it's that the update might not propagate at all. And like races, it often "works" in testing and hangs in production on a different CPU.

The third problem: ordering / reordering

The subtlest. Compilers and CPUs reorder instructions for performance, as long as the result looks the same to a single thread. Across threads, that reordering becomes visible and breaks assumptions.

// Thread A
data = compute();      // (1)
ready = true;          // (2)

// Thread B
if (ready) {           // (3)
    use(data);         // (4) - might see ready==true but data not yet set!
}

You wrote data then ready. But the compiler/CPU may reorder A's two writes (they're independent from A's single-threaded view). Thread B can observe ready == true while data is still the old value - using uninitialized data. Single-threaded, the reorder is invisible and legal. Across threads, it's a bug.

These three - races (timing of interleaving), visibility (updates not propagating), ordering (reordering across threads) - are the entire problem space of concurrency. Every concurrency tool in chapters 10 and 11 exists to control them.

The Java Memory Model: the rules of visibility and ordering

How do you get guarantees about visibility and ordering? The Java Memory Model (JMM) defines them through a relation called happens-before. If action X happens-before action Y, then X's effects (including all its memory writes) are guaranteed visible to Y, and X is guaranteed to appear to occur before Y.

Without a happens-before relationship between two threads' actions, you have no guarantee about visibility or ordering between them - that's exactly the bugs above.

The happens-before edges you'll rely on (chapters 10-11 are built on these):

  • Monitor lock: releasing a lock (synchronized block exit) happens-before any subsequent acquisition of the same lock. Everything done before the release is visible after the next acquire.
  • volatile: a write to a volatile field happens-before every subsequent read of that field. (More below.)
  • Thread start: thread.start() happens-before everything the started thread does.
  • Thread join: everything a thread does happens-before another thread returning from thread.join() on it. (This is why the counter in the race demo was at least visible after join - the join gives a happens-before edge for visibility, even though the race already corrupted the value.)

The practical rule: to safely share mutable data between threads, you must establish a happens-before relationship - via a lock, a volatile, or a higher-level concurrent tool. Plain reads and writes of shared fields give you nothing.
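
As a small sketch of two of those edges, the example below uses only plain (non-volatile) fields and relies on start() and join() for visibility. The class and field names are invented for the illustration.

```java
class StartJoinEdges {
    static int input;     // plain fields on purpose: no volatile, no lock
    static int output;

    static int run() {
        input = 21;                                   // written before start()
        Thread worker = new Thread(() -> output = input * 2);
        worker.start();                               // start edge: worker is guaranteed to see input == 21
        try {
            worker.join();                            // join edge: main is guaranteed to see the worker's write
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints 42
    }
}
```

This is safe only because the two threads never touch the fields at the same time: main writes before start(), the worker writes before it ends, and main reads after join(). Read output without the join and the guarantee is gone.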

volatile: the visibility fix (only)

The lightest tool. Marking a field volatile guarantees visibility and ordering for that field: every write is immediately visible to every other thread's subsequent read, and reads/writes can't be reordered around it.

The visibility demo, fixed:

static volatile boolean running = true;   // <- volatile

That one keyword fixes the infinite-spin bug. Now the worker is guaranteed to see running = false. volatile is the right tool for flags and one-way state changes that one thread writes and others read.
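
Here is the whole fixed demo as a runnable sketch. The class name StoppableWorker, the daemon setting, and the 5-second join timeout are choices made for this illustration (they keep a failed run from hanging your terminal), not part of the original demo.

```java
class StoppableWorker {
    private static volatile boolean running = true;

    // Returns true if the worker saw the flag change and stopped within the timeout.
    static boolean runAndStop() {
        running = true;
        Thread worker = new Thread(() -> {
            long count = 0;
            while (running) {          // volatile read on every iteration
                count++;
            }
        });
        worker.setDaemon(true);        // don't keep the JVM alive if something goes wrong
        worker.start();
        try {
            Thread.sleep(100);
            running = false;           // volatile write: guaranteed visible to the worker
            worker.join(5_000);        // generous timeout, arbitrary for this sketch
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runAndStop());
    }
}
```

Remove the volatile keyword and the same program may report false, because nothing then forces the worker to re-read the flag.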

But - and this is critical - volatile does NOT make compound operations atomic. It does not fix the counter race:

static volatile int counter = 0;
counter++;    // STILL a race! volatile makes the read and the write each visible,
              // but the read-add-write sequence can still interleave.

volatile guarantees you see the latest value on each read - but counter++ still does read-add-write as three steps, and another thread can slip between them. volatile solves visibility/ordering, not atomicity. The counter needs a lock or an atomic (chapter 10).

The dividing line:

  • One thread writes, others read a simple flag/value? volatile is enough.
  • Multiple threads read-modify-write the same data (++, check-then-act, etc.)? volatile is NOT enough - you need synchronization (chapter 10).
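
The data/ready example from the ordering section falls on the safe side of that line: one thread writes, one thread reads. A minimal sketch, assuming a single writer and with class and field names invented for the illustration, makes ready volatile and leaves data plain:

```java
class SafeHandoff {
    static int data;                      // plain field
    static volatile boolean ready;        // volatile guard

    static int publishAndRead() {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                Thread.onSpinWait();      // hint to the runtime that this is a busy-wait
            }
            seen[0] = data;               // guaranteed to see 42, never the old value
        });
        reader.start();
        data = 42;                        // (1) ordinary write
        ready = true;                     // (2) volatile write: cannot be reordered before (1)
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead());   // prints 42
    }
}
```

The volatile write to ready happens-before the read that sees it as true, so everything the writer did before that write (including the plain write to data) is visible to the reader. This works for one writer handing off finished data; it would not make data++ from two threads safe.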

What's safe to share without any of this

Two categories of data are safe to share across threads with no synchronization at all:

  1. Immutable objects (chapter 03). If an object never changes after construction, there's nothing to race on, nothing to go stale - threads can read it freely. This is why immutability keeps coming up: it's the simplest path to thread safety. A record, a String, a LocalDate - share them fearlessly.

  2. Thread-confined data. Data that only one thread ever touches (local variables, or data handed off cleanly) has no sharing, so no concurrency problem. Local variables live on the thread's own stack (chapter 08) - they're inherently thread-confined.

The design lesson that runs through all of concurrency: the less mutable state you share, the fewer concurrency bugs you can have. Prefer immutability, prefer confining data to one thread, and synchronize only the shared mutable state that's genuinely necessary. The best concurrent code minimizes shared mutable state in the first place.
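
Both safe categories can be combined in one small sketch: each thread receives an immutable record, accumulates into a local variable, and writes once to its own array slot. The names ConfinedSum and Range are hypothetical, and the per-thread slot array is just one way to collect results.

```java
import java.util.List;

class ConfinedSum {
    record Range(int from, int to) {}     // immutable: safe to hand to any thread

    static long sum(List<Range> ranges) {
        long[] partial = new long[ranges.size()];      // one slot per thread, no slot is shared
        Thread[] threads = new Thread[ranges.size()];
        for (int i = 0; i < threads.length; i++) {
            final int slot = i;
            final Range r = ranges.get(i);
            threads[i] = new Thread(() -> {
                long local = 0;                        // thread-confined accumulator
                for (int n = r.from(); n < r.to(); n++) {
                    local += n;
                }
                partial[slot] = local;                 // single write to this thread's own slot
            });
            threads[i].start();
        }
        long total = 0;
        try {
            for (int i = 0; i < threads.length; i++) {
                threads[i].join();                     // join makes partial[i] visible to this thread
                total += partial[i];
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total;
    }
}
```

For example, sum(List.of(new Range(0, 50), new Range(50, 101))) adds 0 through 100 on two threads. There is no lock and no volatile anywhere, because no mutable value is ever touched by two threads at once.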

The closure-capture gotcha

A trap that bridges chapter 07. A lambda passed to a thread captures variables - and only effectively final ones. This interacts with concurrency:

for (int i = 0; i < 5; i++) {
    final int id = i;                          // must copy to an effectively-final var
    new Thread(() -> System.out.println(id)).start();
}

You can't capture the loop variable i directly (it changes - not effectively final). More dangerously, if multiple threads capture and mutate the same shared object, you're back to a race. Captured references still point at shared mutable state - capturing doesn't make it safe.
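
To make the last point concrete, this sketch shows that a captured reference is the same object in every lambda, not a copy. The class name CaptureShares is invented, and the threads are deliberately run one at a time (start, then join) so that this particular example has no race.

```java
import java.util.ArrayList;
import java.util.List;

class CaptureShares {
    static List<Integer> run() {
        List<Integer> shared = new ArrayList<>();   // effectively final reference, mutable object
        for (int i = 0; i < 3; i++) {
            final int id = i;
            Thread t = new Thread(() -> shared.add(id));   // captures the reference, not a copy
            t.start();
            try {
                t.join();   // one thread at a time, so no two threads touch the list together
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints [0, 1, 2]
    }
}
```

All three threads appended to the one list. Start them all before joining any and you have an unsynchronized ArrayList mutated from several threads - a race, even though every captured variable was effectively final.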

Try it

These exercises are the chapter. Run each and watch the behavior - you must see concurrency bugs to understand them.

  1. Reproduce the race. Run RaceDemo exactly as written. Run it ten times. Record the outputs - all different, all less than 200000. Now increase the loop to 1,000,000 and use 4 threads. The loss gets worse. You're watching lost updates in real time.

  2. Reproduce the visibility hang. Run VisibilityDemo. On many JVMs/CPUs the worker spins forever (you'll have to kill it) - it never sees running = false. (If it happens to stop on your setup, lengthen the sleep before the write so the JIT compiler has time to optimize the hot loop; that optimization is what stops the worker from re-reading the field. On current 64-bit JDKs the server compiler is already the default, so passing java -server changes nothing.) Then add volatile to running and watch it stop reliably. You just fixed a visibility bug.

  3. Prove volatile doesn't fix the race. Take RaceDemo, make counter volatile, and run it. Still wrong - still less than 200000. This is the most important exercise in the chapter: volatile fixes visibility, not atomicity. Convince yourself by seeing it fail.

  4. The check-then-act race. Write two threads that both do: if (!map.containsKey(k)) map.put(k, expensiveCompute()); on a shared HashMap. Run it. You'll see expensiveCompute() run twice for the same key (both threads passed the check before either put), and possibly a corrupted map. This "check-then-act" race is everywhere in real code.

  5. Immutability is safe. Share an immutable record Point(int x, int y) across ten threads that all read it. No synchronization, no bug, ever - because there's nothing to race on. Contrast with sharing a mutable Point whose fields ten threads write. Feel why immutability is the easy path.

  6. join gives visibility. In RaceDemo, note that after t1.join() and t2.join(), main reads counter and sees a consistent (if wrong) value - the join provides the happens-before edge for visibility. Remove the joins and read counter immediately; now you might not even see the threads' work at all. Two different problems: the race corrupts the value; the missing join can hide it entirely.

What you might wonder

"If concurrency is this dangerous, why use threads at all?" Because the alternative - doing one thing at a time - wastes modern multi-core hardware and makes servers unable to handle concurrent users. The danger isn't threads; it's shared mutable state between threads. Minimize that (immutability, confinement) and synchronize the rest correctly (chapters 10-11), and concurrency is a powerful, manageable tool.

"Do these bugs really happen, or is this theoretical?" They happen constantly, and they're among the most expensive bugs in the industry - precisely because they hide in testing and surface randomly in production. The "works on my machine" that becomes a 2 AM incident is very often a race or visibility bug. This is why concurrency questions dominate senior interviews.

"Why does counter++ not just... work? Other languages?" No mainstream language makes counter++ atomic across threads by default - it's read-modify-write everywhere. Some give you atomic types; some (Rust) use the type system to prevent unsynchronized sharing at compile time. Java gives you the tools (chapters 10-11) but trusts you to use them. The JMM is Java's precise specification of what's guaranteed.

"Is volatile slow?" It has a cost (it prevents certain caching/reordering optimizations and may insert memory barriers), but it's much cheaper than a lock. Use it freely for the flag/single-writer case it's designed for. Just don't reach for it expecting atomicity it doesn't provide.

"What about the synchronized keyword - isn't that the answer?" Yes, for the atomicity problem volatile can't solve - that's chapter 10. synchronized gives both mutual exclusion (fixing races) and a happens-before edge (fixing visibility). It's the next chapter precisely because you need to feel the problems first.

"How do I even find a race condition?" You'll meet the tools in chapter 10 and 13. The short version: there's no volatile keyword that finds them, but there are race detectors and stress-testing tools, careful code review for "shared mutable state without synchronization," and the discipline of asking "what if two threads ran this line at once?" of every shared field. That question is the single most valuable concurrency habit.

Done

  • You can create, start, and join threads.
  • You can see the three core problems: races (timing-dependent interleaving), visibility (writes not propagating), ordering (reordering across threads).
  • You understand counter++ is read-modify-write, and why that races.
  • You know the Java Memory Model gives guarantees only through happens-before edges (locks, volatile, start/join).
  • You know volatile fixes visibility/ordering for a single field but NOT atomicity of compound operations.
  • You know immutability and thread-confinement are the synchronization-free safe paths.

Next: the tools that fix races - synchronized, locks, atomics, and the thread-safe collections.

10 - Concurrency II: the tools that fix races

What this session is

About two hours. Chapter 09 showed you the three problems - races, visibility, ordering. This chapter is the toolbox that fixes them: synchronized and intrinsic locks, ReentrantLock, atomic variables, the thread-safe collections, and how to avoid the deadlock you can create while fixing a race. By the end you can take the broken counter and the check-then-act bug from chapter 09 and make them correct, and you'll know which tool to reach for.

synchronized: mutual exclusion + visibility in one keyword

The foundational fix. synchronized ensures that only one thread at a time can execute a guarded block, and it establishes the happens-before edge (chapter 09) that makes writes visible. It solves races and visibility together.

The broken counter from chapter 09, fixed:

public class Counter {
    private int count = 0;

    public synchronized void increment() {   // only one thread in here at a time
        count++;                              // now atomic with respect to other threads
    }

    public synchronized int get() {
        return count;
    }
}

Run two threads incrementing 100,000 times each through this and you get exactly 200000, every time. The synchronized keyword guarantees the read-modify-write of count++ completes without interruption from another thread, and that the result is visible to everyone.
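
A runnable version of that experiment might look like the sketch below. The wrapper class SyncCounterDemo and the nested SafeCounter (the same shape as the Counter above, renamed to avoid a clash) are made up for this illustration.

```java
class SyncCounterDemo {
    static class SafeCounter {
        private int count = 0;
        synchronized void increment() { count++; }
        synchronized int get() { return count; }
    }

    static int run(int threads, int perThread) {
        SafeCounter counter = new SafeCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    counter.increment();       // one thread at a time inside increment()
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(2, 100_000));   // prints 200000
    }
}
```

Delete the two synchronized keywords and you are back to chapter 09's RaceDemo.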

How it works: intrinsic locks (monitors)

Every Java object has an associated intrinsic lock (also called a monitor). synchronized uses it:

  • synchronized instance method locks on this.
  • synchronized static method locks on the Class object.
  • synchronized (someObject) { ... } block locks on someObject.

A thread entering a synchronized block must acquire the lock; if another thread holds it, the entering thread blocks (waits) until it's released. Exiting the block releases the lock. Only one thread can hold a given lock at a time - that's the mutual exclusion.

The block form lets you lock on a specific object and narrow the critical section:

public class BankAccount {
    private final Object lock = new Object();   // a dedicated lock object
    private double balance;

    public void deposit(double amount) {
        synchronized (lock) {              // critical section - as small as possible
            balance += amount;
        }
    }
}

Using a private, dedicated lock object (rather than this) is good practice: it prevents outside code from accidentally (or maliciously) locking on your object and interfering with your synchronization. The lock is an implementation detail; keep it private.

The critical rules of synchronized

  1. Guard all access to shared mutable state with the same lock. If increment() is synchronized but get() isn't, get() can see a stale value (and for a non-volatile long or double field, even a torn one). Every read and write of the shared data must go through the same lock. A lock only protects against other threads using the same lock.

  2. Keep critical sections small. While you hold a lock, every other thread wanting it waits. Do the minimum inside synchronized; never do I/O, network calls, or call unknown code while holding a lock (that's a deadlock risk and a throughput killer).

  3. Synchronized is reentrant. A thread that holds a lock can re-acquire it (e.g., a synchronized method calling another synchronized method on the same object) without deadlocking itself. The JVM tracks a hold count.
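
Rule 3 is easy to check with a small sketch. The class ReentrantDemo and its methods are hypothetical; Thread.holdsLock is a real JDK method that reports whether the current thread owns an object's monitor.

```java
class ReentrantDemo {
    private int balance = 100;

    synchronized int withdrawAll() {
        int amount = balance;
        withdraw(amount);          // re-acquires the lock this thread already holds
        return amount;
    }

    synchronized void withdraw(int amount) {
        // We are two synchronized calls deep on the same object, and still the owner.
        if (!Thread.holdsLock(this)) {
            throw new IllegalStateException("expected to hold the lock");
        }
        balance -= amount;
    }

    synchronized int balance() {
        return balance;
    }
}
```

Calling new ReentrantDemo().withdrawAll() returns normally instead of hanging. If intrinsic locks were not reentrant, the nested call to withdraw would wait forever for a lock its own thread is holding.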

ReentrantLock: explicit locking with more control

synchronized is implicit and scoped to a block. java.util.concurrent.locks.ReentrantLock is an explicit lock object with the same semantics plus extra capabilities.

import java.util.concurrent.locks.ReentrantLock;

public class Counter {
    private final ReentrantLock lock = new ReentrantLock();
    private int count = 0;

    public void increment() {
        lock.lock();                  // acquire
        try {
            count++;
        } finally {
            lock.unlock();            // ALWAYS release in finally - or you leak the lock forever
        }
    }
}

The lock()/try/finally/unlock() pattern is mandatory: if the body throws and you don't unlock in finally, the lock is never released and every other thread blocks forever. This verbosity is the price of the extra power, which is:

  • tryLock() - attempt to acquire without blocking forever; returns false (or times out) if you can't get it. Lets you avoid waiting indefinitely.
    // Needs java.util.concurrent.TimeUnit; the timed tryLock can throw InterruptedException.
    if (lock.tryLock(1, TimeUnit.SECONDS)) {
        try { /* got it */ } finally { lock.unlock(); }
    } else {
        /* couldn't get the lock in 1s - do something else instead of blocking */
    }
    
  • Interruptible locking (lockInterruptibly()) - a waiting thread can be cancelled.
  • Fairness - an optional FIFO ordering so threads acquire in request order (slower, rarely needed).
  • Multiple condition variables (newCondition()) for advanced wait/signal patterns.

The guidance: use synchronized by default - it's simpler, less error-prone (no forgotten unlock), and the JVM optimizes it well. Reach for ReentrantLock only when you need its specific features (tryLock with timeout, interruptibility, fairness, or multiple conditions).
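
The "multiple conditions" feature is the hardest one to picture, so here is a minimal sketch of a bounded buffer with two conditions on one lock. BoundedBuffer is a teaching example with invented names; in real code you would normally use the JDK's ArrayBlockingQueue, which already provides this behavior.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class BoundedBuffer<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();    // producers wait here
    private final Condition notEmpty = lock.newCondition();   // consumers wait here
    private final Deque<T> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    void put(T item) throws InterruptedException {
        lock.lock();
        try {
            while (items.size() == capacity) {
                notFull.await();          // releases the lock while waiting, re-acquires on wake-up
            }
            items.addLast(item);
            notEmpty.signal();            // wake one waiting consumer
        } finally {
            lock.unlock();
        }
    }

    T take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) {
                notEmpty.await();
            }
            T item = items.removeFirst();
            notFull.signal();             // wake one waiting producer
            return item;
        } finally {
            lock.unlock();
        }
    }
}
```

With a single intrinsic lock you get only one wait set, so producers and consumers would wake each other needlessly. Two Condition objects let each side wait for exactly the event it cares about. Note the while loops around await(): always re-check the condition after waking.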

ReadWriteLock: many readers, one writer

When data is read far more often than written, a plain lock is wasteful - it blocks readers from each other even though concurrent reads are safe. ReentrantReadWriteLock allows many simultaneous readers but exclusive writers:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class Cache {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private final Map<String, String> map = new HashMap<>();

    public String get(String key) {
        rw.readLock().lock();                 // many threads can hold the read lock at once
        try { return map.get(key); }
        finally { rw.readLock().unlock(); }
    }

    public void put(String key, String value) {
        rw.writeLock().lock();                // exclusive - blocks all readers and writers
        try { map.put(key, value); }
        finally { rw.writeLock().unlock(); }
    }
}

Use it for read-heavy shared data. (Though for the specific case of a map, ConcurrentHashMap below is usually better - it's lock-striped internally and you don't manage locks at all.)

Atomic variables: lock-free single-variable updates

For the common case of a single counter or flag that multiple threads update, locking is overkill. The java.util.concurrent.atomic package gives lock-free atomic types built on hardware compare-and-swap (CAS) instructions - typically cheaper than a lock for single-variable updates.

The counter, the lock-free way:

import java.util.concurrent.atomic.AtomicInteger;

public class Counter {
    private final AtomicInteger count = new AtomicInteger(0);

    public void increment() {
        count.incrementAndGet();    // atomic ++ in one call, no lock
    }

    public int get() {
        return count.get();
    }
}

incrementAndGet() is a single atomic operation - no race, no lock, no synchronized. The atomics:

AtomicInteger ai = new AtomicInteger(0);
ai.incrementAndGet();        // ++ai, atomic
ai.getAndAdd(5);             // add 5, return old value
ai.compareAndSet(10, 20);    // if value is 10, set to 20; atomic check-and-act

// Other types in java.util.concurrent.atomic:
//   AtomicLong          - long version
//   AtomicBoolean       - boolean version
//   AtomicReference<T>  - atomic reference to an object
//   LongAdder           - even faster counter under HIGH contention (preferred for hot counters)

compareAndSet (CAS) is the primitive under all of them: "if the current value is X, atomically set it to Y; tell me if it worked." It lets you build lock-free read-modify-write loops:

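import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

AtomicReference<List<String>> ref = new AtomicReference<>(List.of());
// Atomically append to an immutable list without a lock (copy, modify, swap):
List<String> oldList, newList;
do {
    oldList = ref.get();
    List<String> copy = new ArrayList<>(oldList);
    copy.add("item");
    newList = List.copyOf(copy);                  // publish an immutable snapshot, never the mutable copy
} while (!ref.compareAndSet(oldList, newList));   // retry if someone else changed it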

Use atomics for single-variable counters/flags/references. For coordinating multiple related variables together, you still need a lock (atomics only make one variable atomic; "increment A and B together" needs a lock around both).
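
The same retry-loop shape works for any single-variable update. As a sketch, here is a "track the maximum" helper built directly on compareAndSet; the class MaxTracker and its method names are invented for the illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class MaxTracker {
    private final AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);

    void record(int sample) {
        int current;
        do {
            current = max.get();
            if (sample <= current) {
                return;                                   // nothing to update
            }
        } while (!max.compareAndSet(current, sample));    // another thread won: re-read and retry
    }

    int max() {
        return max.get();
    }
}
```

The loop never blocks: a thread that loses the CAS re-reads the latest value and tries again. For simple cases like this one, AtomicInteger's accumulateAndGet(sample, Math::max) wraps the same loop in a single call.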

Thread-safe collections

A plain HashMap or ArrayList is not thread-safe (chapter 05) - concurrent modification can corrupt it or throw. The java.util.concurrent package provides safe versions designed for concurrency.

ConcurrentHashMap - the workhorse. A thread-safe HashMap that allows concurrent reads and writes without locking the whole map (it's internally lock-striped/lock-free). Use it whenever multiple threads share a map.

import java.util.concurrent.ConcurrentHashMap;

ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
counts.merge("key", 1, Integer::sum);    // atomic increment - the right way to count concurrently
counts.compute("key", (k, v) -> (v == null ? 0 : v) + 1);   // atomic compound update

Critically, its compound methods (merge, compute, computeIfAbsent, putIfAbsent) are atomic - they fix the check-then-act race from chapter 09. Two threads both doing counts.merge(k, 1, Integer::sum) will correctly count both, no lost updates, no lock needed.
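
A runnable sketch of concurrent counting with merge, where several threads count the same words into one shared map. The class ConcurrentWordCount and its method are hypothetical names for this illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentWordCount {
    // Every thread walks the whole word list, so each word is counted `threads` times.
    static Map<String, Integer> count(List<String> words, int threads) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (String w : words) {
                    counts.merge(w, 1, Integer::sum);   // atomic per-key read-modify-write
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }
}
```

With the words a, b, a and 4 threads, the result is a=8 and b=4 on every run. Swap the merge call for a get followed by a put and increments start disappearing, for the same reason counter++ loses them.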

The check-then-act bug from chapter 09, fixed:

// BROKEN (chapter 09): two threads both pass the check, both compute.
// if (!map.containsKey(k)) map.put(k, expensiveCompute());

// FIXED (map must be a ConcurrentHashMap): computeIfAbsent is atomic - expensiveCompute runs exactly once per key.
map.computeIfAbsent(k, key -> expensiveCompute());
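
You can verify the "once per key" claim by counting calls. In this sketch expensiveCompute is a hypothetical stand-in that just upper-cases the key, and the call counter exists only so the example can report how often it ran.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ComputeOnce {
    static final AtomicInteger computeCalls = new AtomicInteger();

    // Hypothetical stand-in for real, slow work.
    static String expensiveCompute(String key) {
        computeCalls.incrementAndGet();
        return key.toUpperCase();
    }

    // Races `threads` threads on the same key and returns how many times the function ran.
    static int run(int threads) {
        computeCalls.set(0);
        ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> cache.computeIfAbsent("k", ComputeOnce::expensiveCompute));
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return computeCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(run(8));   // prints 1
    }
}
```

One caveat worth knowing: other threads asking for the same key wait while the function runs, so keep the function short and don't touch the same map from inside it.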

Other concurrent collections:

  • CopyOnWriteArrayList - thread-safe list where every write copies the whole array. Great for read-heavy, write-rare data (listener lists, config). Terrible for write-heavy (copies on every write).
  • ConcurrentLinkedQueue - lock-free FIFO queue.
  • BlockingQueue (e.g., ArrayBlockingQueue, LinkedBlockingQueue) - a queue where take() blocks until an element is available and put() blocks when full. The backbone of producer-consumer patterns and thread pools (chapter 11).
BlockingQueue<Task> queue = new LinkedBlockingQueue<>(100);   // bounded: without a capacity it is effectively unbounded and put() never blocks
// Producer thread:
queue.put(task);                 // blocks if the queue is full
// Consumer thread:
Task t = queue.take();           // blocks until a task is available

The guidance: for shared collections, reach for ConcurrentHashMap (maps), CopyOnWriteArrayList (read-heavy lists), and BlockingQueue (handoff between threads) instead of wrapping plain collections in synchronized. They're designed for concurrency and far outperform a giant lock around a HashMap.
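
As a sketch of the handoff pattern, here is a small producer-consumer run that shuts down with a "poison pill". The class name, the capacity of 10, and the choice of -1 as the pill are all assumptions made for this illustration; the calling thread acts as the producer.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class PoisonPillDemo {
    private static final int POISON = -1;     // sentinel chosen for this sketch

    // Produces 1..tasks, lets `consumers` threads add them up, returns the total.
    static long run(int tasks, int consumers) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>(10);   // bounded
        AtomicLong total = new AtomicLong();

        Thread[] workers = new Thread[consumers];
        for (int i = 0; i < consumers; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (true) {
                        int item = queue.take();      // blocks until something arrives
                        if (item == POISON) {
                            return;                   // stop signal for this consumer
                        }
                        total.addAndGet(item);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }

        try {
            for (int n = 1; n <= tasks; n++) {
                queue.put(n);                         // blocks while the queue is full
            }
            for (int i = 0; i < consumers; i++) {
                queue.put(POISON);                    // one pill per consumer
            }
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total.get();
    }
}
```

run(100, 3) returns 5050 (the sum of 1 to 100). The only explicit synchronization tool in sight is the AtomicLong for the running total; the queue does all the waiting and waking.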

Deadlock: the bug you create while fixing races

Locks fix races but introduce a new failure mode: deadlock - two threads each holding a lock the other needs, both waiting forever.

// Thread 1                          // Thread 2
synchronized (lockA) {               synchronized (lockB) {
    synchronized (lockB) {               synchronized (lockA) {
        // ...                                // ...
    }                                    }
}                                    }

Thread 1 holds A, wants B. Thread 2 holds B, wants A. Neither can proceed. The program hangs - no exception, no crash, just frozen threads. This can happen whenever two threads acquire multiple locks in different orders.

The classic real example - transferring money between accounts:

void transfer(Account from, Account to, double amount) {
    synchronized (from) {           // lock `from`
        synchronized (to) {         // lock `to`
            from.debit(amount);
            to.credit(amount);
        }
    }
}
// transfer(a, b) on one thread locks a then b.
// transfer(b, a) on another thread locks b then a.
// DEADLOCK.

The fix: always acquire multiple locks in a consistent global order. If every thread locks accounts in the same order (say, by account ID), the cycle can't form:

void transfer(Account from, Account to, double amount) {
    // Always lock the lower-id account first - consistent order, no cycle possible.
    Account first  = from.id() < to.id() ? from : to;
    Account second = from.id() < to.id() ? to : from;
    synchronized (first) {
        synchronized (second) {
            from.debit(amount);
            to.credit(amount);
        }
    }
}

Other deadlock defenses: hold one lock at a time when possible; use tryLock with a timeout (back off and retry instead of waiting forever); use higher-level concurrency tools (chapter 11) that avoid manual locking entirely. The rule to remember: multiple locks acquired in inconsistent order = deadlock waiting to happen.
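
The tryLock defense can be sketched as follows. This is an illustration under assumptions, not the chapter's Account class: the nested Account with a public ReentrantLock, the balance held in cents, and the 50 ms timeout are all invented for the example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TryLockTransfer {
    static class Account {
        final ReentrantLock lock = new ReentrantLock();
        long balanceCents;

        Account(long balanceCents) {
            this.balanceCents = balanceCents;
        }
    }

    // Returns false if both locks could not be acquired in time; the caller can retry later.
    static boolean transfer(Account from, Account to, long cents) throws InterruptedException {
        if (from.lock.tryLock(50, TimeUnit.MILLISECONDS)) {
            try {
                if (to.lock.tryLock(50, TimeUnit.MILLISECONDS)) {
                    try {
                        from.balanceCents -= cents;
                        to.balanceCents += cents;
                        return true;
                    } finally {
                        to.lock.unlock();
                    }
                }
            } finally {
                from.lock.unlock();   // always released, including when the second lock timed out
            }
        }
        return false;                 // gave up instead of waiting forever
    }
}
```

If two opposite transfers collide, at least one of them times out, releases what it holds, and reports false rather than freezing. A real caller would retry, ideally after a short random pause so the two threads don't collide again in lockstep.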

Choosing the right tool

| Situation | Reach for |
| --- | --- |
| Single counter / flag, multiple writers | AtomicInteger / AtomicLong / LongAdder |
| Single reference swapped atomically | AtomicReference + compareAndSet |
| Guard a few related fields together | synchronized (block on a private lock) |
| Need tryLock, timeout, or fairness | ReentrantLock |
| Read-heavy shared data | ReentrantReadWriteLock (or a concurrent collection) |
| Shared map | ConcurrentHashMap |
| Read-heavy, write-rare list | CopyOnWriteArrayList |
| Handoff work between threads | BlockingQueue |
| Counting concurrently | ConcurrentHashMap.merge or LongAdder |

The meta-rule: prefer the highest-level tool that fits. A ConcurrentHashMap is better than a synchronized block around a HashMap; an AtomicInteger is better than a lock around an int. The high-level tools are harder to misuse and usually faster under contention. Drop to manual synchronized/ReentrantLock only when no higher-level tool fits your exact coordination need.
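
LongAdder appears twice in the table but has not been shown yet, so here is a minimal sketch of a hot counter. The class name HotCounter and the thread counts are arbitrary choices for the illustration.

```java
import java.util.concurrent.atomic.LongAdder;

class HotCounter {
    static long run(int threads, int perThread) {
        LongAdder hits = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    hits.increment();      // contended updates are spread over internal cells
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return hits.sum();                 // exact here, because every writer has finished
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1_000_000));   // prints 4000000
    }
}
```

The trade-off: sum() called while threads are still incrementing is only a moving snapshot, so LongAdder suits statistics and metrics better than values you need to read-and-act-on atomically. For that, stay with AtomicLong.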

Try it

  1. Fix the counter four ways. Take chapter 09's RaceDemo. Fix it with (a) synchronized method, (b) synchronized block on a private lock, (c) AtomicInteger, (d) LongAdder. Run each with 4 threads x 1,000,000 increments. Confirm all four give exactly 4,000,000. Time them and compare - LongAdder in particular tends to pull ahead under this kind of contention, but measure on your machine rather than assume.

  2. Forget the finally. Write a ReentrantLock counter but unlock() outside a finally, and make the body throw on some iterations. Watch the program hang (the lock never releases). Move unlock() into finally and watch it work. Feel why the pattern is mandatory.

  3. Fix check-then-act. Take chapter 09's double-expensiveCompute bug. Fix it with ConcurrentHashMap.computeIfAbsent. Add a print inside the compute lambda and confirm it runs exactly once per key, even with 8 threads racing.

  4. Build a deadlock. Write the two-account transfer with inconsistent lock order. Run transfer(a, b) and transfer(b, a) on two threads in a tight loop. It will hang within seconds. Take a thread dump (jstack <pid> or Ctrl-\) and read it - the JVM tells you "Found one Java-level deadlock" and names the threads and locks. Then apply the ordered-lock fix and confirm it runs forever without hanging.

  5. ReadWriteLock vs synchronized. Build a read-heavy cache (1000 reads per write) both with a plain synchronized and with a ReentrantReadWriteLock. Run many reader threads. The read-write version should show higher read throughput because readers don't block each other.

  6. BlockingQueue producer-consumer. One producer thread puts 100 tasks into a LinkedBlockingQueue; three consumer threads take and process them. Use a poison-pill or count to stop cleanly. Notice you wrote zero synchronized - the queue handles all the coordination.

What you might wonder

"synchronized vs ReentrantLock - really, which?" Default to synchronized: simpler, can't-forget-to-unlock, well-optimized, and reads clearly. Use ReentrantLock only when you specifically need tryLock/timeout, interruptible acquisition, fairness, or multiple condition variables. If you're not using one of those features, synchronized is the better choice.

"Are atomics always faster than locks?" Under low contention, similar. Under high contention, atomics (especially LongAdder, which spreads updates across cells) win because they don't block threads - they retry. But atomics only handle single-variable updates; the moment you need to update multiple things together atomically, you need a lock. Right tool for the granularity.

"Is ConcurrentHashMap just a synchronized HashMap?" No, and the difference matters. Collections.synchronizedMap(new HashMap<>()) locks the entire map for every operation - one thread at a time, total. ConcurrentHashMap allows many threads to operate concurrently (lock striping / lock-free reads) and provides atomic compound methods. It's dramatically more scalable. Never use synchronizedMap for a contended map.

"How do I detect deadlocks?" A thread dump (jstack, or jcmd <pid> Thread.print) explicitly reports "Found one Java-level deadlock" with the cycle. ThreadMXBean.findDeadlockedThreads() detects them programmatically. But detection is after-the-fact - the real defense is consistent lock ordering so they can't form. Chapter 13 covers reading thread dumps.

"What's a race condition vs a data race?" Often used interchangeably, but: a data race is the specific low-level event of unsynchronized concurrent access to a variable (what the JMM forbids). A race condition is the broader correctness bug where timing affects results (check-then-act can be a race condition even using thread-safe pieces, if the sequence isn't atomic). Fixing data races (synchronize access) doesn't automatically fix all race conditions (you may need the whole compound operation atomic).

"Can I just make everything synchronized to be safe?" No - over-synchronizing kills performance (threads serialize, defeating the point of concurrency) and increases deadlock risk (more locks, more chances for inconsistent ordering). The goal is the minimum synchronization that's correct: minimize shared mutable state (chapter 09), then guard only what's truly shared, with the highest-level tool that fits.

Done

  • You can fix races with synchronized (mutual exclusion + visibility via the intrinsic lock).
  • You know ReentrantLock for tryLock/timeout/fairness, and the mandatory lock()/finally/unlock() pattern.
  • You know ReadWriteLock for read-heavy data.
  • You can use atomics (AtomicInteger, LongAdder, compareAndSet) for lock-free single-variable updates.
  • You reach for ConcurrentHashMap, CopyOnWriteArrayList, and BlockingQueue instead of locking plain collections, and you know the concurrent map's atomic compound methods fix check-then-act.
  • You can recognize, reproduce, and prevent deadlock with consistent lock ordering.
  • You can choose the right tool by granularity, preferring the highest-level one that fits.

Next: the highest-level concurrency - executors, futures, CompletableFuture, and virtual threads. Where you stop managing threads and locks by hand entirely.

Next: Concurrency III →

11 - Concurrency III: executors, futures, and virtual threads

What this session is

About ninety minutes. Chapters 09-10 were about raw threads and locks - the foundations. This chapter is where modern Java actually does concurrency: you stop creating Thread objects and managing locks by hand, and instead submit tasks to an executor that manages a pool of threads for you, get futures representing results-to-come, compose async pipelines with CompletableFuture, and - since Java 21 - use virtual threads to write simple blocking code that scales to millions of concurrent operations. By the end you'll write concurrent Java the way production codebases do.

Why raw threads don't scale

Creating a Thread per task (chapter 09) has problems at scale:

  • Threads are expensive. Each OS thread reserves about 1 MB of stack by default (platform-dependent, tunable with -Xss) and has real creation/teardown cost. Create 10,000 and you've reserved roughly 10 GB of stack space and overwhelmed the scheduler.
  • Unbounded thread creation is a denial-of-service on yourself. A server that spawns a thread per request falls over under load.
  • No reuse. Creating and destroying a thread for each short task wastes the creation cost.

The fix is a thread pool: a fixed set of worker threads that pull tasks from a queue. You submit work; the pool runs it on an available worker. This decouples "how much work" from "how many threads," caps resource use, and reuses threads.

ExecutorService: submit tasks, not threads

ExecutorService is the standard thread-pool abstraction. You create one, submit tasks, shut it down.

import java.util.concurrent.*;

ExecutorService pool = Executors.newFixedThreadPool(4);   // 4 worker threads

// Submit a Runnable (no result):
pool.submit(() -> System.out.println("task ran on " + Thread.currentThread().getName()));

// Submit a Callable (returns a result) - get a Future back:
Future<Integer> future = pool.submit(() -> {
    Thread.sleep(100);
    return 42;
});

pool.shutdown();                          // stop accepting new tasks; finish queued ones
pool.awaitTermination(1, TimeUnit.MINUTES);  // wait for them to finish

The shift in thinking: you no longer say "run this on a new thread." You say "here is a task; run it whenever a worker is free." The pool handles thread lifecycle, reuse, and queueing.

The factory methods:

Executors.newFixedThreadPool(n)        // n threads, fixed. The common choice for CPU work.
Executors.newCachedThreadPool()        // grows/shrinks on demand. For many short-lived I/O tasks.
Executors.newSingleThreadExecutor()    // one thread - serializes tasks, useful for ordering
Executors.newScheduledThreadPool(n)    // for delayed/periodic tasks (replaces Timer)
Executors.newVirtualThreadPerTaskExecutor()  // Java 21+ - one virtual thread per task (below)

Always shut down your executor (shutdown() then awaitTermination, or use try-with-resources - ExecutorService is AutoCloseable since Java 19). A non-daemon pool keeps the JVM alive if you forget.

// Try-with-resources (Java 19+): shutdown + awaitTermination happen automatically on close.
try (var pool = Executors.newFixedThreadPool(4)) {
    pool.submit(task1);
    pool.submit(task2);
}   // close() shuts down and waits for tasks to finish
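
On JDKs before 19, or when you want a bounded wait, the manual version looks roughly like this. It follows the shutdown pattern suggested in the ExecutorService documentation; the GracefulShutdown class and the method names are assumptions for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class GracefulShutdown {
    // Orderly shutdown: stop intake, wait a bounded time, then cancel stragglers.
    static boolean shutdownAndWait(ExecutorService pool, long timeoutSeconds) {
        pool.shutdown();                                    // no new tasks; queued ones still run
        try {
            if (pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                return true;                                // everything finished
            }
            pool.shutdownNow();                             // interrupt running tasks, drop queued ones
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Runs taskCount trivial tasks on a pool of 4 and reports how many completed.
    static int runTasks(int taskCount) {
        AtomicInteger done = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < taskCount; i++) {
            pool.execute(done::incrementAndGet);
        }
        shutdownAndWait(pool, 10);
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runTasks(100));
    }
}
```

The difference from try-with-resources is the timeout: close() waits as long as the tasks take, while this version gives up and interrupts after the deadline.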

Sizing the pool

A rule of thumb that matters:

  • CPU-bound work (computation, no waiting): pool size ≈ number of CPU cores (Runtime.getRuntime().availableProcessors()). More threads than cores just adds context-switching overhead.
  • I/O-bound work (waiting on network/disk/database): more threads than cores, because threads spend most of their time blocked, not computing. The exact number depends on the wait/compute ratio - or, better, use virtual threads (below), which make this question mostly disappear.
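
To make "depends on the wait/compute ratio" concrete, here is a sketch of a commonly cited rule of thumb: threads ≈ cores x (1 + wait time / compute time). The PoolSizing class and its parameter names are made up for the example, and the result is a starting estimate to measure against, not a guarantee.

```java
class PoolSizing {
    // Rule-of-thumb estimate: threads = cores * (1 + waitTime / computeTime).
    static int suggestedThreads(int cores, double waitMillis, double computeMillis) {
        if (cores <= 0 || waitMillis < 0 || computeMillis <= 0) {
            throw new IllegalArgumentException("cores and computeMillis must be positive");
        }
        return (int) Math.ceil(cores * (1.0 + waitMillis / computeMillis));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // No waiting at all: the estimate collapses to the core count.
        System.out.println("CPU-bound: " + suggestedThreads(cores, 0, 1));
        // 90 ms waiting per 10 ms computing: ten times the core count.
        System.out.println("I/O-bound: " + suggestedThreads(cores, 90, 10));
    }
}
```

On an 8-core machine, a task that waits 90 ms for every 10 ms of computing suggests 8 x (1 + 9) = 80 threads.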

Future: a result that isn't ready yet

submit of a Callable returns a Future<T> - a handle to a result that will exist eventually. You can do other work, then collect it:

Future<Integer> f = pool.submit(() -> expensiveComputation());

// ... do other things while it runs ...

Integer result = f.get();          // BLOCKS until the result is ready (or throws)

Future methods:

f.get();                  // block until done, return result (throws ExecutionException if task threw)
f.get(2, TimeUnit.SECONDS);  // block up to 2s, then throw TimeoutException
f.isDone();               // non-blocking check
f.cancel(true);           // attempt to cancel (interrupts the running thread if true)
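
The failure path of get() deserves a concrete look. In this sketch (TaskFailureDemo and failureMessage are hypothetical names), the task throws on the worker thread, submit itself throws nothing, and the exception only resurfaces, wrapped in ExecutionException, when get() is called.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TaskFailureDemo {
    // Submits a task that throws, then shows where the exception resurfaces.
    static String failureMessage() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> failing = () -> {
                throw new IllegalStateException("boom");
            };
            Future<Integer> future = pool.submit(failing);   // nothing is thrown here
            try {
                future.get();
                return "no failure";
            } catch (ExecutionException e) {
                // The task's own exception is the cause of the ExecutionException.
                Throwable cause = e.getCause();
                return cause.getClass().getSimpleName() + ": " + cause.getMessage();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "interrupted";
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(failureMessage());
    }
}
```

If the get() call were deleted, the program would print nothing about the failure at all, which is why unread futures hide bugs.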

Running several tasks in parallel and collecting results:

List<Future<Integer>> futures = new ArrayList<>();
for (int i = 0; i < 10; i++) {
    final int n = i;
    futures.add(pool.submit(() -> process(n)));   // all 10 start, run in parallel on the pool
}
int total = 0;
for (Future<Integer> f : futures) {
    total += f.get();             // collect each (blocks per future, but they ran concurrently)
}
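
The same submit-and-collect loop can be written with invokeAll, which submits a whole batch and returns only once every task has finished. A minimal sketch, where process is a stand-in (squaring its input) for whatever the real task does:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllDemo {
    // Stand-in for the text's process(n); the real work is whatever your task does.
    static int process(int n) {
        return n * n;
    }

    static int sumOfSquares(int count) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final int n = i;
                tasks.add(() -> process(n));
            }
            int total = 0;
            // invokeAll blocks until every task has completed, then returns the futures in order.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                total += f.get();                 // already done, so this does not wait
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10));     // 0 + 1 + 4 + ... + 81
    }
}
```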

The limitation of plain Future: get() blocks, and you can't easily chain "when this finishes, do that next" without blocking a thread to wait. That's what CompletableFuture solves.

CompletableFuture: composable async pipelines

CompletableFuture<T> is a Future you can compose - attach callbacks that run when it completes, chain transformations, combine multiple futures - all without blocking a thread to wait. It's how you build async pipelines.

import java.util.concurrent.CompletableFuture;

CompletableFuture<String> pipeline =
    CompletableFuture
        .supplyAsync(() -> fetchUser(id))          // run async, produce a User
        .thenApply(user -> user.email())           // transform: User -> String (when ready)
        .thenApply(String::toLowerCase)            // chain another transform
        .exceptionally(ex -> "unknown@example.com"); // recover from any failure in the chain

String email = pipeline.join();                    // get the final result (join = get without checked exc)

Each step runs when the previous completes - no blocking between steps. The vocabulary:

supplyAsync(supplier)       // start an async task producing a value
runAsync(runnable)          // start an async task with no result
thenApply(fn)               // transform the result (sync continuation)
thenApplyAsync(fn)          // transform on the pool (async continuation)
thenCompose(fn)             // chain another CompletableFuture (flatMap for futures - avoids nesting)
thenAccept(consumer)        // consume the result, no return
thenCombine(other, fn)      // combine two futures' results when both complete
exceptionally(fn)           // recover from an exception
handle((result, ex) -> ...) // handle both success and failure

Combining independent async calls - fetch two things in parallel, then merge:

CompletableFuture<Profile> profile = CompletableFuture.supplyAsync(() -> fetchProfile(id));
CompletableFuture<List<Order>> orders = CompletableFuture.supplyAsync(() -> fetchOrders(id));

CompletableFuture<Dashboard> dashboard = profile.thenCombine(orders,
    (p, o) -> new Dashboard(p, o));     // runs when BOTH complete, on whichever finishes last

Dashboard d = dashboard.join();          // profile and orders were fetched concurrently

thenCompose vs thenApply is the same distinction as flatMap vs map from chapter 07: use thenCompose when your function itself returns a CompletableFuture (to avoid CompletableFuture<CompletableFuture<T>> nesting).
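
Side by side, the difference looks like this. The two lookup methods are hypothetical stand-ins for real async calls; what matters is the type each chaining method produces.

```java
import java.util.concurrent.CompletableFuture;

class ComposeDemo {
    // Hypothetical async lookups, standing in for real service calls.
    static CompletableFuture<Integer> loadUserId(String name) {
        return CompletableFuture.supplyAsync(() -> name.length());
    }

    static CompletableFuture<String> loadProfile(int userId) {
        return CompletableFuture.supplyAsync(() -> "profile-" + userId);
    }

    static String profileFor(String name) {
        // thenApply wraps the returned future in another future:
        CompletableFuture<CompletableFuture<String>> nested =
            loadUserId(name).thenApply(ComposeDemo::loadProfile);

        // thenCompose flattens it to a single future.
        CompletableFuture<String> flat =
            loadUserId(name).thenCompose(ComposeDemo::loadProfile);

        // Both hold the same value; the nested one just needs two joins.
        String viaNested = nested.join().join();
        String viaFlat = flat.join();
        if (!viaNested.equals(viaFlat)) {
            throw new IllegalStateException("expected identical results");
        }
        return viaFlat;
    }

    public static void main(String[] args) {
        System.out.println(profileFor("alice"));
    }
}
```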

CompletableFuture is the tool for orchestrating multiple async operations - parallel service calls, pipelines of dependent steps, fan-out/fan-in. It's everywhere in reactive and microservice code.
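
Fan-out/fan-in over a whole list, rather than just two futures, is usually done with CompletableFuture.allOf. A sketch under made-up names (fetchPrice is a hypothetical call that here just computes a number locally):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class FanOutFanIn {
    // Hypothetical remote call; here it just computes a price locally.
    static CompletableFuture<Integer> fetchPrice(String item) {
        return CompletableFuture.supplyAsync(() -> item.length() * 10);
    }

    static int totalPrice(List<String> items) {
        // Fan out: start one async call per item.
        List<CompletableFuture<Integer>> calls = items.stream()
            .map(FanOutFanIn::fetchPrice)
            .collect(Collectors.toList());

        // Fan in: allOf completes when every call has completed.
        CompletableFuture<Void> all =
            CompletableFuture.allOf(calls.toArray(new CompletableFuture[0]));

        // After allOf, each join() returns immediately.
        return all.thenApply(ignored ->
            calls.stream().mapToInt(CompletableFuture::join).sum()
        ).join();
    }

    public static void main(String[] args) {
        System.out.println(totalPrice(List.of("pen", "book", "lamp")));
    }
}
```

allOf itself yields CompletableFuture<Void>, which is why the results are read back from the original list once it completes.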

Virtual threads: the Java 21 game-changer

Arguably the biggest concurrency change in Java's history. A virtual thread is a lightweight thread managed by the JVM, not the OS. You can have millions of them. They make the "thread per task" model - simple, blocking, readable code - scale to levels that previously required complex async/reactive programming.

// Java 21+. One virtual thread per task. Create a MILLION if you want.
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    for (int i = 0; i < 1_000_000; i++) {
        executor.submit(() -> {
            Thread.sleep(1000);        // BLOCKS - but cheaply, on a virtual thread
            return fetchSomething();
        });
    }
}   // a million concurrent blocking tasks, on a handful of OS threads

How it works: when a virtual thread blocks (on I/O, sleep, a lock), the JVM unmounts it from its carrier OS thread and runs another virtual thread there. The carrier doesn't sit idle while a virtual thread waits; it moves on to one that is ready to run. A few OS threads can host millions of virtual threads, as long as most are blocked at any moment (which is typical of I/O-bound work).
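
A smaller version you can run in a moment, assuming Java 21 or newer. The VirtualSleepers class is a made-up name; the sketch starts one virtual thread per task, each of which blocks in sleep, and counts how many finish.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class VirtualSleepers {
    // Requires Java 21+. Runs `tasks` blocking sleeps, one virtual thread each.
    static int runSleepers(int tasks, long sleepMillis) {
        AtomicInteger finished = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    Thread.sleep(sleepMillis);      // parks the virtual thread, frees the carrier
                    return finished.incrementAndGet();
                });
            }
        }   // close() waits for every submitted task
        return finished.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runSleepers(10_000, 200);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks finished in " + elapsedMs + " ms");
    }
}
```

The elapsed time should land far closer to one sleep than to 10,000 sleeps back to back, because the sleeps overlap; the exact figure depends on your machine.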

Why this matters: before virtual threads, scaling to many concurrent I/O operations forced you into asynchronous code - callbacks, CompletableFuture chains, reactive streams - which is harder to write, read, and debug. Virtual threads let you write simple, sequential, blocking code and still scale:

// This blocking, sequential, easy-to-read code now scales to millions of concurrent requests:
void handleRequest(Request req) {
    var user = db.loadUser(req.userId());      // blocks - fine on a virtual thread
    var orders = api.fetchOrders(user);        // blocks - fine
    var result = process(user, orders);        // computes
    respond(result);
}
// Run one virtual thread per request. No async, no callbacks, scales enormously.

The guidance:

  • For I/O-bound concurrency (web servers, API clients, anything that waits a lot): virtual threads are the new default. newVirtualThreadPerTaskExecutor(), write blocking code, scale freely.
  • For CPU-bound work (heavy computation): use a fixed platform-thread pool sized to cores - virtual threads don't help when threads are computing, not waiting.
  • Don't pool virtual threads. They're cheap to create; create one per task. Pooling them defeats the purpose.

One caveat: virtual threads are great for blocking I/O, but in JDK 21 through 23 a virtual thread that blocks inside a synchronized block or method is "pinned" to its carrier (it can't unmount). On those versions, prefer ReentrantLock over synchronized in code that runs on virtual threads and holds locks across blocking calls. JDK 24 removes most of this pinning (JEP 491).

Structured concurrency (preview)

A newer model (preview in recent JDKs, stabilizing) that treats a group of related concurrent tasks as a single unit - if one fails, the others are cancelled; the parent waits for all. It makes concurrent code as structured as sequential code (no leaked threads, clear error propagation):

// StructuredTaskScope - subtasks are bound to a scope; the scope joins them all.
try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
    var userTask  = scope.fork(() -> fetchUser(id));     // forked subtask
    var orderTask = scope.fork(() -> fetchOrders(id));   // forked subtask
    scope.join();                  // wait for both
    scope.throwIfFailed();         // if either failed, propagate (and the other was cancelled)
    return new Dashboard(userTask.get(), orderTask.get());
}   // guaranteed: no subtask outlives this block

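This is the likely future of multi-task concurrency in Java - it eliminates the leaked-task and partial-failure problems of manual executor use. Worth knowing it exists, but it is still a preview API: it needs --enable-preview, and its shape has changed between JDK releases (the ShutdownOnFailure form above is the JDK 21 preview), so check the documentation for your JDK version.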

Choosing the right tool

Situation | Reach for
Run independent tasks on a bounded pool | ExecutorService (fixed pool for CPU, virtual-per-task for I/O)
Get a single async result | Future (or CompletableFuture)
Chain/compose async steps without blocking | CompletableFuture (thenApply/thenCompose/thenCombine)
Many concurrent blocking I/O operations | virtual threads (newVirtualThreadPerTaskExecutor, Java 21+)
Heavy parallel computation | fixed platform-thread pool sized to cores
Delayed or periodic tasks | newScheduledThreadPool
Group related subtasks with all-or-nothing semantics | structured concurrency (StructuredTaskScope)

The arc of these three chapters: raw threads + locks (09-10) are the foundation you must understand, but you rarely write them directly. In real code you submit tasks to executors, compose with CompletableFuture, and - increasingly - use virtual threads to write simple blocking code that scales. The high-level tools sit on the low-level guarantees; knowing both is what makes you trustworthy with concurrency.

Try it

  1. Pool vs raw threads. Submit 10,000 short tasks (each Thread.sleep(10) then increment an AtomicInteger) two ways: (a) new Thread() per task, (b) a fixed pool of 100. Time both and watch memory. The raw-thread version strains or fails; the pool sails through. This is why pools exist.

  2. Future parallelism. Write 8 tasks that each sleep(1000) and return a number. Submit all to a pool of 8 and collect with get(). Total wall-clock time should be ~1 second, not 8 - they ran in parallel. Then submit to a pool of 2 and watch it take ~4 seconds (only 2 run at once).

  3. CompletableFuture pipeline. Build supplyAsync(() -> fetchUser()).thenApply(User::name).thenApply(String::toUpperCase). Add an .exceptionally(...) and make fetchUser throw - confirm the recovery value comes through. Then use thenCombine to fetch two things in parallel and merge them; verify the wall-clock is the max of the two, not the sum.

  4. thenApply vs thenCompose. Write a function CompletableFuture<Profile> loadProfile(User u). Chain it after loadUser with thenApply and observe the awkward CompletableFuture<CompletableFuture<Profile>>. Fix it with thenCompose. This is the map-vs-flatMap lesson in async form.

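  5. Virtual threads at scale (Java 21+). Submit 1,000,000 tasks that each Thread.sleep(1000) to newVirtualThreadPerTaskExecutor(). It completes in a few seconds total (the one-second sleeps overlap; the rest is the cost of starting a million threads, and you may need a larger heap). Then do the arithmetic for a fixed platform pool of 100: 1,000,000 tasks / 100 threads x 1 second = ~10,000 seconds, nearly three hours - run it with a smaller task count if you want to see it finish. Witness why virtual threads changed Java concurrency.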

  6. Forget to shut down. Submit a task to a non-virtual newFixedThreadPool and don't shut it down. Notice the JVM doesn't exit (the pool's non-daemon threads keep it alive). Add shutdown() (or use try-with-resources) and watch it exit cleanly.

What you might wonder

"Do virtual threads make ExecutorService and CompletableFuture obsolete?" No. Virtual threads change how many threads you can have and let you write blocking code that scales - they're about the threading model. ExecutorService is still how you submit and manage tasks (now often with virtual threads as the backing). CompletableFuture is still the tool for composing async results and fan-out/fan-in. They complement virtual threads. What virtual threads reduce the need for is complex reactive/async frameworks adopted purely to avoid blocking - now you can block cheaply.

"When CompletableFuture vs virtual threads?" If you can write the logic as simple sequential blocking code, virtual threads let you do that and scale - often the simpler choice now. Use CompletableFuture when you genuinely need to compose independent async operations (run three services in parallel and combine), express dependency graphs between async steps, or you're on Java < 21. Many codebases use both: virtual threads for the per-request thread, CompletableFuture for fan-out within a request.

"What pool size should I actually use?" CPU-bound: availableProcessors() (maybe +1). I/O-bound on platform threads: higher, tuned to your wait/compute ratio (the formula is roughly cores * (1 + wait/compute)), but this is fiddly - which is exactly why virtual threads are compelling for I/O: you stop sizing pools and just create one virtual thread per task.

"Is parallelStream() (chapter 07) related?" Yes - it uses a shared ForkJoinPool (the common pool) under the hood to parallelize stream operations. It's good for CPU-bound, independent, large-dataset operations. It's not a general task executor and shouldn't be used for I/O (it can starve the shared pool). For arbitrary concurrent tasks, use an ExecutorService, not parallelStream.

"How do exceptions work across threads?" A task's exception doesn't propagate to the submitting thread automatically. With Future, get() throws ExecutionException wrapping the task's exception - you only see it when you call get(). With CompletableFuture, use exceptionally/handle to deal with it in the pipeline. A submit-ted task that throws and is never get()-ted swallows the exception silently - a common bug. (Use execute + an uncaught-exception handler, or always check your futures.)

"Should I ever extend Thread or implement Runnable directly now?" Rarely. implements Runnable (or a lambda) to define a task, yes - but hand it to an executor rather than new Thread(runnable).start(). Extending Thread is almost never right (it conflates the task with the worker - a chapter 01 composition-over-inheritance issue). Define tasks; submit them to executors.

Done

  • You know why raw thread-per-task doesn't scale, and how thread pools fix it.
  • You can use ExecutorService - submit Runnable/Callable, size pools for CPU vs I/O, and shut down properly (try-with-resources).
  • You can use Future for single async results and understand its blocking limitation.
  • You can compose async pipelines with CompletableFuture (thenApply/thenCompose/thenCombine/exceptionally).
  • You understand virtual threads (Java 21+): cheap, millions-scale, write simple blocking code for I/O concurrency.
  • You know structured concurrency exists and where the field is heading.
  • You can pick the right tool for CPU-bound vs I/O-bound vs compositional concurrency.

That completes the concurrency core - the heart of this path. Next: performance-aware coding - allocation, boxing, and the patterns that actually matter.

Next: Performance-aware coding →

12 - Performance-aware coding

What this session is

About an hour. Not premature optimization - awareness. The goal isn't to make every line fast; it's to recognize the patterns that quietly cost performance, avoid the obvious mistakes, and know that real optimization comes from measuring (chapter 13), not guessing. By the end you'll write code that doesn't have gratuitous performance problems, and you'll know the difference between "this matters" and "this is fine."

The first rule: measure, don't guess

Start here because it governs everything else:

"Premature optimization is the root of all evil." - Donald Knuth (the full quote: "we should forget about small efficiencies, say about 97% of the time").

Most code doesn't need to be fast. It needs to be correct and clear. The 3% that's genuinely performance-critical, you find by measuring (profiling, chapter 13) - not by guessing, because human intuition about what's slow is famously wrong. The cost of optimizing the wrong thing is doubled: you spend effort, and you make the code more complex and bug-prone for no benefit.

So this chapter is not "optimize everything." It's "know the patterns that have real cost, avoid the free mistakes, and measure before optimizing anything specific." Awareness, not obsession.

Autoboxing: the invisible allocation

The most common quiet performance cost. Java has primitives (int, long, double) and their wrapper objects (Integer, Long, Double). Autoboxing silently converts between them - and each box is a heap allocation (chapter 08).

// This loop allocates roughly a million Long objects. Silently.
Long sum = 0L;                      // wrapper type - the bug
for (long i = 0; i < 1_000_000; i++) {
    sum += i;                       // unbox sum, add, BOX the result - new Long every iteration
}

sum is a Long (wrapper). Each sum += i unboxes sum to a primitive, adds, and boxes the result back into a new Long object - a million allocations, a million pieces of garbage. Fix: use the primitive.

long sum = 0L;                      // primitive - zero allocations
for (long i = 0; i < 1_000_000; i++) {
    sum += i;                       // pure primitive arithmetic
}

This single change can be many times faster in a hot loop. The lesson: prefer primitives over wrappers in performance-sensitive code, especially in loops and large collections.

Where boxing hides:

  • Collections. List<Integer>, Map<Integer, ...> - generics can't hold primitives (chapter 04's erasure), so every element is boxed. A List<Integer> of a million numbers is a million Integer objects plus the list. For primitive-heavy data, use primitive arrays (int[]) or specialized libraries (Eclipse Collections, fastutil) instead of List<Integer>.
  • Streams. stream().map(i -> i + 1) on a Stream<Integer> boxes constantly. Use IntStream/mapToInt (chapter 07) for primitive streams.
  • Wrapper-typed accumulators, as above.

You don't eliminate all boxing - List<Integer> is fine for small or non-hot collections. You avoid it where volume × hotness makes it matter.
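
The stream case, side by side. Both methods compute the same sum; the first works on Integer objects throughout, the second stays on primitives. The PrimitiveStreams class and method names are made up for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PrimitiveStreams {
    // Boxed pipeline: every element is an Integer, and map(...) boxes each result.
    static int sumBoxed(List<Integer> numbers) {
        return numbers.stream()
            .map(i -> i + 1)
            .reduce(0, Integer::sum);
    }

    // Primitive pipeline: IntStream works on int values directly, with no boxing.
    static int sumPrimitive(int[] numbers) {
        return IntStream.of(numbers)
            .map(i -> i + 1)
            .sum();
    }

    public static void main(String[] args) {
        int[] raw = IntStream.rangeClosed(1, 1000).toArray();
        List<Integer> boxed = IntStream.of(raw).boxed().collect(Collectors.toList());
        System.out.println(sumBoxed(boxed) + " " + sumPrimitive(raw));
    }
}
```

If your data already lives in a List<Integer>, numbers.stream().mapToInt(Integer::intValue) moves the rest of the pipeline onto primitives.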

String handling: the concatenation trap

The classic. String concatenation in a loop with + is O(n²) because strings are immutable (chapter 03) - each + creates a new string copying all previous characters.

// O(n^2) - each += copies the entire string so far. Catastrophic for large n.
String result = "";
for (String word : words) {
    result += word + ", ";          // new String allocated and fully copied every iteration
}

For words of size n, this allocates and copies progressively larger strings - total work proportional to n². For 100,000 words it can take seconds. Fix with StringBuilder, which has a growable internal buffer (like ArrayList):

// O(n) - one growable buffer, appended to.
StringBuilder sb = new StringBuilder();
for (String word : words) {
    sb.append(word).append(", ");
}
String result = sb.toString();      // build the final string once

Or, more idiomatically for joining, String.join or a stream collector:

String result = String.join(", ", words);                          // cleanest for a collection
String result = words.stream().collect(Collectors.joining(", "));  // when you're already streaming

Important nuance: a single + or concatenation in straight-line code is fine - the compiler turns a + b + c into one efficient concatenation for you (a StringBuilder chain in older compilers, an invokedynamic call since Java 9). The problem is concatenation in a loop, where every iteration builds and copies a new string. Don't reflexively replace every + with StringBuilder; do replace loop-accumulated concatenation.

Pre-size your collections

From chapter 05, applied as performance: if you know roughly how many elements you'll add, tell the collection up front so it doesn't repeatedly resize (each resize copies the whole backing array).

// Resizes ~log(n) times as it grows, copying each time.
List<String> list = new ArrayList<>();

// One allocation, no resizing, if you know the size.
List<String> list = new ArrayList<>(expectedSize);
Map<String, Integer> map = new HashMap<>(expectedSize * 4 / 3 + 1);  // account for load factor

For a default-constructed ArrayList that grows to a million elements, pre-sizing avoids roughly 30 resize-and-copy cycles (the backing array grows by about 1.5x each time). In a hot path, measurable; in cold code, irrelevant (but harmless and clear).
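
You can see where a resize count comes from by simulating the growth. This sketch assumes the backing array starts at 10 slots and grows by half its size on each resize, which matches OpenJDK's ArrayList at the time of writing but is an implementation detail rather than a documented guarantee; a different growth factor gives a different count.

```java
class GrowthSteps {
    // Counts how many times a backing array must grow to hold `target` elements,
    // assuming it starts at `initialCapacity` and grows by half its size each time.
    static int growthSteps(int initialCapacity, int target) {
        if (initialCapacity < 2) {
            throw new IllegalArgumentException("initialCapacity must be at least 2");
        }
        int capacity = initialCapacity;
        int steps = 0;
        while (capacity < target) {
            capacity += capacity >> 1;    // new capacity = old + old/2
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(growthSteps(10, 1_000_000));          // default-sized list
        System.out.println(growthSteps(1_000_000, 1_000_000));   // pre-sized list
    }
}
```

Under these assumptions the default-sized list grows 29 times on the way to a million elements, and the pre-sized one never grows.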

Choose the right data structure

The biggest performance wins usually come not from micro-optimizing code but from picking the right collection (chapter 05) and algorithm. A contains check:

// O(n) per check - scanning a list. In a loop, O(n*m). Death by a thousand cuts.
List<String> seen = new ArrayList<>();
if (seen.contains(item)) { ... }          // linear scan every time

// O(1) per check - a hash set.
Set<String> seen = new HashSet<>();
if (seen.contains(item)) { ... }          // constant time

If you contains-check a collection repeatedly, a HashSet instead of an ArrayList turns O(n) lookups into O(1). This kind of structural choice dwarfs any micro-optimization. The chapter 05 Big-O table is a performance tool: reaching for the wrong collection is the most common real performance bug, far more than "should I use ++i or i++" (which doesn't matter at all).

Allocation awareness (without allocation paranoia)

From chapter 08: short-lived objects are cheap (the generational GC is built for them), so don't contort code to avoid every allocation. But in genuinely hot paths, reducing allocation reduces GC pressure. Patterns:

  • Reuse buffers in hot loops instead of allocating per iteration (a byte[] you reuse, or a small object pool - which in Java is hand-written or library-provided; there is no built-in equivalent of Go's sync.Pool).
  • Avoid creating objects you immediately discard in tight loops (a new Comparator or boxed value per iteration).
  • Use primitive streams and arrays for numeric bulk data (the boxing point above).

But measure first. The advice "reduce allocation" applies to the 3% that profiling flags, not everywhere. Allocating a few objects per request in a web handler is completely fine; allocating a million in a tight numeric kernel is the thing to fix.
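
Buffer reuse, the first pattern above, in its most common form: one byte[] allocated before the loop and reused for every read. The BufferReuse class is a made-up name, and the 8192-byte size is just a conventional choice, not a tuned value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class BufferReuse {
    // Copies a stream using ONE buffer for the whole loop, instead of a new byte[] per read.
    static long copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192];          // allocated once, reused every iteration
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = new byte[100_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println(copy(new ByteArrayInputStream(data), sink) + " bytes copied");
    }
}
```

Moving the new byte[8192] inside the while loop would still be correct; it would just hand the garbage collector a fresh 8 KB array on every pass.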

Lazy initialization and computation

Don't compute what you might not need. Two patterns:

// Compute on first use, then cache. (Be careful with threads - chapter 10.)
private List<String> cached;
public List<String> getExpensiveList() {
    if (cached == null) {
        cached = computeExpensiveList();   // only runs once, on first call
    }
    return cached;
}

And short-circuit evaluation - put the cheap check first so the expensive one is skipped when possible:

// && short-circuits: if isCached(key) is false, expensiveValidation(key) never runs.
if (isCached(key) && expensiveValidation(key)) { ... }
//   ^ cheap, first        ^ expensive, only if needed

Order conditions cheap-to-expensive and, for &&, most-likely-to-fail first (for ||, most-likely-to-succeed first); short-circuit evaluation skips the rest.
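
Back to the lazy cache and its "be careful with threads" warning: the unsynchronized version is a check-then-act race, so two threads can both see null and both compute. The simplest safe variant is a synchronized getter, sketched here. The LazyCache class and the computation counter are assumptions added only to make the behavior observable.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class LazyCache {
    private final AtomicInteger computations = new AtomicInteger();  // only to observe behavior
    private List<String> cached;                                      // guarded by 'this'

    // synchronized makes check-then-compute atomic and publishes the result safely.
    public synchronized List<String> getExpensiveList() {
        if (cached == null) {
            cached = computeExpensiveList();
        }
        return cached;
    }

    private List<String> computeExpensiveList() {
        computations.incrementAndGet();
        return List.of("a", "b", "c");          // placeholder for the real expensive work
    }

    int computationCount() {
        return computations.get();
    }

    // Calls the getter from several threads and reports how often the value was computed.
    static int computationsAfterConcurrentReads(int threads) {
        LazyCache cache = new LazyCache();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(cache::getExpensiveList);
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return cache.computationCount();
    }

    public static void main(String[] args) {
        System.out.println("computed " + computationsAfterConcurrentReads(8) + " time(s)");
    }
}
```

The lock is taken on every call, which is usually fine; only if profiling shows the getter is hot is it worth reaching for something cleverer.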

Things that DON'T matter (stop worrying about them)

Awareness includes knowing what's a non-issue, so you don't waste effort or sacrifice clarity for imaginary gains:

  • i++ vs ++i in a loop - identical performance. Use whichever reads better.
  • final on local variables - no runtime performance effect (it's a readability/safety choice).
  • One-line method extraction - the JIT inlines small methods; don't inline by hand for "speed."
  • System.out.println micro-costs - irrelevant unless you're printing in a hot loop (where the cost is the I/O, not the call).
  • Manual loop unrolling, bit-twiddling tricks - the JIT does these; hand-doing them usually just obscures the code and sometimes defeats the JIT's own optimizations.
  • Caching list.size() in a loop variable - the JIT handles it; for (int i = 0; i < list.size(); i++) is fine.

The JIT compiler (Java Mastery covers it deeply) is very good at low-level optimization. Your job is the high-level choices it can't make for you: the right data structure, the right algorithm, avoiding gratuitous allocation and O(n²) patterns. Leave the instruction-level stuff to the JIT.

The performance mindset

Put it together into a discipline:

  1. Write clear, correct code first. Don't optimize while writing - it's premature and usually wrong.
  2. Avoid the free mistakes as you go: don't concatenate strings in loops, don't box in hot loops, don't use O(n) contains repeatedly, do pre-size known collections, do pick the right data structure. These cost nothing in clarity and avoid the common real problems.
  3. If it's too slow, measure (chapter 13). Profile to find the actual hot spot - it's almost never where you'd guess.
  4. Optimize the measured hot spot, and only that. Re-measure to confirm it helped.
  5. Stop when it's fast enough. "Fast enough" is a real, definable target (a latency budget, a throughput goal). Optimizing past it is wasted effort.

This is the difference between a junior who either ignores performance entirely or optimizes everything blindly, and an engineer who writes clean code, sidesteps the known traps, and surgically fixes what measurement proves slow.

Try it

  1. Measure boxing. Sum 10,000,000 longs into a Long (wrapper) accumulator and into a long (primitive). Time both (System.nanoTime). The primitive version is dramatically faster - you're watching millions of allocations cost real time. Confirm with -verbose:gc that the wrapper version triggers far more GC.

  2. The O(n^2) string trap. Concatenate 100,000 short strings with += in a loop, then with StringBuilder, then with String.join. Time all three. The += version takes seconds; the others milliseconds. Feel the quadratic blowup.

  3. Wrong collection. Check membership 100,000 times against a 100,000-element ArrayList (O(n) each) vs a HashSet (O(1) each). Time both. The list version is thousands of times slower. This is the most common real performance bug in beginner code.

  4. Pre-sizing. Build a 10,000,000-element ArrayList with and without an initial capacity. Time both and count allocations. Pre-sizing avoids the resize-copy cycles.

  5. Prove a non-issue. Time i++ vs ++i in a billion-iteration loop. Identical. Time final int x vs int x. Identical. Convince yourself these don't matter so you stop thinking about them.

  6. Short-circuit ordering. Write if (cheapCheck() && expensiveCheck()) where cheapCheck usually returns false. Add prints to confirm expensiveCheck rarely runs. Swap the order and watch expensiveCheck run every time. Same logic, different cost.
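
For exercise 6, call counters make the effect visible without prints. This is a minimal sketch: cheapCheck, expensiveCheck, and the counters are hypothetical stand-ins, and the "expensive" work is only simulated, so what it shows is how often each check runs, not real cost.

```java
class ShortCircuitDemo {
    static int cheapCalls = 0;
    static int expensiveCalls = 0;

    // Hypothetical cheap check: true only for multiples of 10, so usually false.
    static boolean cheapCheck(int n) {
        cheapCalls++;
        return n % 10 == 0;
    }

    // Hypothetical expensive check: imagine a database lookup here.
    static boolean expensiveCheck(int n) {
        expensiveCalls++;
        return n % 3 == 0;
    }

    // Runs the same condition over 0..999 in either order and returns the match count.
    static int countMatches(boolean cheapFirst) {
        cheapCalls = 0;
        expensiveCalls = 0;
        int matches = 0;
        for (int n = 0; n < 1_000; n++) {
            boolean hit = cheapFirst
                    ? cheapCheck(n) && expensiveCheck(n)
                    : expensiveCheck(n) && cheapCheck(n);
            if (hit) matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        int a = countMatches(true);
        System.out.println("cheap first:     matches=" + a + ", expensive calls=" + expensiveCalls);
        int b = countMatches(false);
        System.out.println("expensive first: matches=" + b + ", expensive calls=" + expensiveCalls);
    }
}
```

Both orders find the same matches; only the number of expensiveCheck calls changes.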

What you might wonder

"How do I know if something is in the hot 3%?" You profile (chapter 13). You cannot reliably know by reading. Studies repeatedly show developers guess the hot spot wrong most of the time - the bottleneck is in a place you didn't suspect. That's the whole reason "measure, don't guess" is rule #1.

"Isn't avoiding boxing/string-traps itself premature optimization?" No - these are not optimizations, they're avoiding pessimizations. Using long instead of Long, StringBuilder in a loop, or HashSet for repeated lookups costs nothing in clarity (often it's clearer) and avoids a known-bad pattern. Premature optimization is sacrificing clarity for speculative gain. Avoiding O(n²) is just not writing O(n²). Different thing.

"What about the JIT - doesn't it fix everything?" The JIT (just-in-time compiler) optimizes low-level code brilliantly: inlining, dead-code elimination, loop optimizations, escape analysis (it can even stack-allocate objects that don't escape). It does not fix your algorithm or data-structure choices - it can't turn an O(n²) string concatenation into O(n), or a list contains into a hash lookup. Those high-level choices are yours. Trust the JIT for instruction-level; own the structural decisions.

"Should I use StringBuilder everywhere then?" No. For loop-accumulated concatenation, yes. For straight-line a + b + c, no - the compiler already uses StringBuilder for you, and writing it manually just adds noise. The rule is specifically about concatenation inside loops.

"Are micro-benchmarks like my nanoTime timing reliable?" Roughly, for big differences (the boxing and string examples are so dramatic that crude timing shows them). But for subtle comparisons, naive timing lies - JIT warmup, dead-code elimination, and GC noise distort results (this is exactly the JMH lesson). For anything close, use a real benchmark harness. Chapter 13 covers measuring properly.

"What's the single highest-impact performance habit?" Picking the right data structure and algorithm (chapter 05). A wrong collection or an accidental O(n²) loop dwarfs every micro-optimization combined. Get the Big-O right and most code is fast enough without any further effort.

Done

  • You know the prime directive: measure, don't guess; most code doesn't need optimizing.
  • You can spot and fix autoboxing in hot loops, collections, and streams.
  • You avoid the O(n²) string-concatenation-in-a-loop trap (and know single + is fine).
  • You pre-size known collections and pick the right data structure for the access pattern.
  • You're allocation-aware without being allocation-paranoid.
  • You know what doesn't matter (i++ vs ++i, etc.) so you don't waste effort.
  • You have the five-step performance mindset: clear first, avoid free mistakes, measure, fix the hot spot, stop at good enough.

Next: profiling basics - how to actually measure, so the "find the hot spot" step is something you can do.

13 - Profiling basics

What this session is

About an hour. Chapter 12 said "measure, don't guess" - this is how you measure. You'll learn to find where a program actually spends its time and memory using JFR (the profiler built into the JDK), how to write a benchmark that doesn't lie (JMH), and how to read a thread dump to diagnose a hang or deadlock. By the end, "profile it" is something you can actually do instead of a thing experts say.

The golden rule, restated

You cannot find a performance problem by reading code. Human intuition about hot spots is wrong far more often than right - the bottleneck is routinely in a place nobody suspected (a logging call, a regex recompiled per request, an accidental O(n²)). The only reliable method is to measure the running program and let the data point at the problem. Everything in this chapter is a way to get that data.

Quick-and-dirty timing (and why it's not enough)

The crudest measurement, useful for huge differences (chapter 12's boxing demo):

long start = System.nanoTime();
doWork();
long elapsedMs = (System.nanoTime() - start) / 1_000_000;
System.out.println("took " + elapsedMs + " ms");

This is fine for spotting order-of-magnitude differences. But for anything subtle it lies, because of how the JVM runs:

  • JIT warmup. Java code starts interpreted and gets compiled to optimized machine code only after running enough times. Your first measurement is of slow, un-compiled code - not representative of steady state.
  • Dead-code elimination. If the JIT proves doWork()'s result is unused, it may delete the call entirely - you measure nothing.
  • GC pauses land randomly in your timing window, adding noise.

So nanoTime timing is a smoke detector, not a diagnostic. For real measurement you need a profiler (to find where time goes) and JMH (to benchmark specific code correctly).
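
If you do use crude timing, you can at least blunt the worst distortions. The sketch below is one way to do that, not a substitute for JMH: it runs a few untimed warmup rounds, stores every result in a volatile field so the work stays "used", and reports the best of several runs to damp GC noise. CrudeTimer, timeMillis, and sink are hypothetical names.

```java
import java.util.function.LongSupplier;

class CrudeTimer {
    // Writing results to a volatile field keeps the JIT from treating the work as dead code.
    static volatile long sink;

    // Returns the fastest of measuredRuns timed runs, in milliseconds.
    static long timeMillis(LongSupplier work, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            sink = work.getAsLong();              // untimed: gives the JIT a chance to compile
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            sink = work.getAsLong();
            long elapsed = System.nanoTime() - start;
            best = Math.min(best, elapsed);       // the best run is the one least disturbed by GC
        }
        return best / 1_000_000;
    }

    public static void main(String[] args) {
        long ms = timeMillis(() -> {
            long total = 0;
            for (int i = 0; i < 50_000_000; i++) total += i;
            return total;
        }, 5, 5);
        System.out.println("best run: " + ms + " ms");
    }
}
```

This is still a smoke detector: it has no forks, no error bars, and no control over how much warmup is enough.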

JFR: the profiler in your JDK

Java Flight Recorder (JFR) is a low-overhead profiler built into the JDK - no install, ~1% overhead, safe to run in production. It records what the JVM is doing (which methods run, what's allocated, GC activity, locks) into a file you analyze afterward.

Start a recording when launching your app:

java -XX:StartFlightRecording=duration=60s,filename=app.jfr -jar app.jar

Or attach to a running process with jcmd:

jcmd <pid> JFR.start duration=60s filename=app.jfr
jcmd <pid> JFR.dump filename=app.jfr        # dump what's recorded so far

Then open app.jfr in JDK Mission Control (JMC) - a free GUI (or VisualVM, or IntelliJ's profiler which uses JFR underneath). What you look at:

  • The hot-methods / flame graph view - which methods consumed the most CPU. This is the "where does time go" answer. The widest bars are your hot spots.
  • The allocation view - which methods allocated the most memory (the chapter 08/12 boxing and garbage culprits). "Who's creating all this garbage?"
  • The GC view - how often GC ran, how long pauses were, whether you're under memory pressure.
  • The lock/contention view - threads waiting on locks (chapter 10 contention).
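
The same recorder is also reachable from code through the jdk.jfr API, which helps when you want to see what a recording actually contains. The sketch below is an illustration under stated assumptions: the WorkEvent class, its "demo.Work" name, and the items field are invented for the example. It starts a recording, commits a few custom events, dumps to a temporary file, and reads the events back.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

class JfrSketch {
    // Hypothetical application-level event; it appears in JMC next to the JVM's own events.
    @Name("demo.Work")
    @Label("Work batch")
    static class WorkEvent extends Event {
        @Label("Items processed")
        int items;
    }

    // Records `batches` custom events and returns how many were read back from the file.
    static long recordAndCount(int batches) {
        try {
            Path dir = Files.createTempDirectory("jfr-sketch");
            Path file = dir.resolve("sketch.jfr");
            try (Recording recording = new Recording()) {
                recording.enable(WorkEvent.class).withoutThreshold();
                recording.start();
                for (int i = 0; i < batches; i++) {
                    WorkEvent event = new WorkEvent();
                    event.begin();
                    event.items = i * 100;        // stand-in for real work
                    event.commit();               // commit() also ends the event
                }
                recording.stop();
                recording.dump(file);
            }
            long count = 0;
            for (RecordedEvent e : RecordingFile.readAllEvents(file)) {
                if (e.getEventType().getName().equals("demo.Work")) count++;
            }
            Files.delete(file);
            Files.delete(dir);
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("events read back: " + recordAndCount(3));
    }
}
```

For everyday profiling the command-line flags above are simpler; the API is mostly useful for tests and for marking application-level work in a recording.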

Reading a flame graph

A flame graph is the key skill. Each box is a method; its width is how much total time was spent in it (and everything it called). Boxes stack to show the call hierarchy - a method sits on top of its caller.

How to read it: scan for the widest boxes. A wide box near the top (a "plateau") is a method burning CPU directly - your hot spot. Click it, see who calls it, decide if it can be made faster or called less. You're not reading every box; you're finding the few wide ones that dominate. Often a single surprising method is 60% of the width - that's your 3% from chapter 12, found.
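
To make "width" concrete: a sampling profiler collects stack snapshots, and a box's width is the share of samples that contain that method. The sketch below computes both numbers you read off a flame graph from a handful of made-up samples. It is a simplification (real flame graphs keep separate boxes per call path), and every name in it is hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class FlameWidths {
    // Box width: number of samples in which the method appears anywhere on the stack.
    static Map<String, Integer> inclusive(List<List<String>> samples) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> stack : samples) {
            for (String frame : new HashSet<>(stack)) {   // count a recursive method once per sample
                counts.merge(frame, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Plateau (self time): number of samples in which the method is the top frame.
    static Map<String, Integer> self(List<List<String>> samples) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> stack : samples) {
            counts.merge(stack.get(stack.size() - 1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Each sample is one stack snapshot, listed from caller to callee.
        List<List<String>> samples = List.of(
                List.of("main", "handle", "parse"),
                List.of("main", "handle", "parse"),
                List.of("main", "handle", "log"),
                List.of("main", "idle"));
        System.out.println("inclusive: " + inclusive(samples));
        System.out.println("self:      " + self(samples));
    }
}
```

Here main is the widest box because everything runs under it, but it has no self time; parse is the plateau, so parse is where you would look first.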

Heap profiling: finding leaks and allocation

For memory problems (chapter 08), two moves:

Allocation profiling (who creates garbage) - JFR's allocation view, or async-profiler in alloc mode. Shows which methods allocate the most. A method allocating millions of Integers (boxing) or temporary strings lights up here.

Heap dumps (what's retained) - for leaks, capture a snapshot of every live object:

jcmd <pid> GC.heap_dump heap.hprof
# or automatically on OOM:
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=. -jar app.jar

Open heap.hprof in Eclipse MAT (Memory Analyzer Tool) or VisualVM. The questions it answers (chapter 08's leak diagnosis):

  • "What's using the most memory?" - the dominator tree shows the biggest retainers. "2 GB of byte[]."
  • "What's keeping it alive?" - the path to GC root. "These byte[]s are retained by Cache.entries, a HashMap, held by a static field." That sentence is the leak, solved. (Eclipse MAT's "Leak Suspects" report often points right at it.)

The path-to-GC-root is the single most valuable thing a heap dump gives you. A leak is a still-reachable object (chapter 08); the path shows you exactly which reference chain to break.
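
In code, the chain from the example sentence looks roughly like this. It is a sketch of the leak's shape and one possible way to break it, not chapter 08's exact example: the class names, the 1 KB payload, and the 100-entry limit are all assumptions.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class Cache {
    // A static field is a GC root: everything reachable from `entries` stays alive.
    static final Map<String, byte[]> entries = new HashMap<>();

    static byte[] load(String key) {
        return entries.computeIfAbsent(key, k -> new byte[1024]);   // added, never evicted
    }
}

class BoundedCache {
    private static final int MAX_ENTRIES = 100;   // hypothetical limit

    // Access-ordered LinkedHashMap that drops the least recently used entry when full.
    // Like the HashMap above, it is not thread-safe.
    static final Map<String, byte[]> entries = new LinkedHashMap<>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
            return size() > MAX_ENTRIES;
        }
    };

    static byte[] load(String key) {
        return entries.computeIfAbsent(key, k -> new byte[1024]);
    }
}
```

In a heap dump of the first version, the path to GC root for any of the byte[]s would end at the static field Cache.entries.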

JMH: benchmarking that doesn't lie

When you need to compare two implementations precisely - is this optimization actually faster? - use JMH (Java Microbenchmark Harness), the standard tool. It handles JIT warmup, prevents dead-code elimination, runs multiple forks for statistical validity, and reports results with error bars. Naive nanoTime benchmarking gives wrong answers; JMH gives trustworthy ones.

import org.openjdk.jmh.annotations.*;
import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 3, time = 1)        // run 3 warmup rounds so the JIT compiles the code
@Measurement(iterations = 5, time = 1)   // then 5 measured rounds
@Fork(2)                                  // in 2 separate JVMs (catches JVM-specific flukes)
public class StringBench {

    @Param({"100", "10000"})
    int n;

    @Benchmark
    public String concat() {              // the slow way
        String s = "";
        for (int i = 0; i < n; i++) s += i;
        return s;                         // RETURN it - prevents dead-code elimination
    }

    @Benchmark
    public String builder() {             // the fast way
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append(i);
        return sb.toString();
    }
}

Key annotations (the things naive timing misses):

  • @Warmup - runs the code untimed first, so the JIT compiles it before you measure steady-state performance.
  • @Measurement - the actual timed runs.
  • @Fork - runs in separate JVMs to catch JVM-specific anomalies.
  • Returning the result - JMH consumes returned values so the JIT can't delete "useless" computation.

Run it (mvn package then java -jar target/benchmarks.jar StringBench) and you get a table with scores and error bars:

Benchmark              (n)  Mode  Cnt     Score     Error  Units
StringBench.builder    100  avgt   10     0.412 ±   0.020  us/op
StringBench.concat     100  avgt   10     2.140 ±   0.110  us/op
StringBench.builder  10000  avgt   10    38.500 ±   1.200  us/op
StringBench.concat   10000  avgt   10  9821.000 ± 210.000  us/op   <- O(n^2) blowup, proven

The error bars matter: if two scores' ranges (score ± error) overlap, you can't claim a real difference - it may just be noise. JMH's discipline is what makes a benchmark trustworthy instead of a number you fooled yourself with.
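
The overlap check is simple interval arithmetic. A sketch using the n=100 rows from the table above, plus one made-up pair that does overlap; ScoreInterval and Result are hypothetical names, not part of JMH.

```java
class ScoreInterval {
    // One benchmark row: the score and the ± error JMH prints next to it.
    record Result(String name, double score, double error) {
        double low()  { return score - error; }
        double high() { return score + error; }
    }

    // Two ranges overlap when each one starts before the other ends.
    static boolean overlaps(Result a, Result b) {
        return a.low() <= b.high() && b.low() <= a.high();
    }

    public static void main(String[] args) {
        Result builder = new Result("builder", 0.412, 0.02);
        Result concat = new Result("concat", 2.140, 0.11);
        System.out.println("builder vs concat overlap: " + overlaps(builder, concat));

        // Hypothetical close call: 10.0 ± 1.0 against 10.5 ± 0.8.
        Result a = new Result("a", 10.0, 1.0);
        Result b = new Result("b", 10.5, 0.8);
        System.out.println("a vs b overlap: " + overlaps(a, b));
    }
}
```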

Thread dumps: diagnosing hangs and deadlocks

When a program hangs (not slow - frozen), you need a thread dump - a snapshot of what every thread is doing right now. This is the tool for chapter 10's deadlocks and for "why is my app stuck."

jstack <pid>                      # print all thread stacks
jcmd <pid> Thread.print           # same thing, modern command
# or press Ctrl-\ (SIGQUIT) in the terminal running the JVM (Ctrl-Break on Windows)

The dump lists every thread and its current stack. What you look for:

  • Deadlock - the JVM detects and explicitly reports it: Found one Java-level deadlock: followed by the threads and the locks they're each holding/waiting for. The exact cycle from chapter 10, named for you. This is the fastest way to confirm and locate a deadlock.
  • A thread stuck in BLOCKED state - waiting on a lock another thread holds (contention or deadlock).
  • Many threads in the same method - a hot spot or a bottleneck where everything queues up.
  • Threads in WAITING/TIMED_WAITING - parked (often normal for pool threads idle between tasks; suspicious if a thread you expect to be working is parked).

Take two or three dumps a few seconds apart: if a thread is on the same line in all of them, it's stuck there (a hang); if threads move between dumps, work is progressing (maybe just slow).
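
The deadlock check is also available from inside the JVM through ThreadMXBean, which is handy for a health check or a test. A minimal sketch: the class and method names are hypothetical, and the two daemon threads exist only to create a lock cycle on purpose.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockProbe {
    // Starts two daemon threads that take the same two locks in opposite order.
    static void startDeadlockedPair() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        startDaemon("probe-1", lockA, lockB, bothHoldFirstLock);
        startDaemon("probe-2", lockB, lockA, bothHoldFirstLock);
    }

    private static void startDaemon(String name, Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await();               // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread holds `second` and wants `first`
                }
            }
        }, name);
        t.setDaemon(true);                       // daemon, so the stuck threads don't block JVM exit
        t.start();
    }

    // Polls the JVM's deadlock detector; returns the number of deadlocked threads, or 0 on timeout.
    static int waitForDeadlock(long timeoutMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000;
        while (System.nanoTime() < deadline) {
            long[] ids = threads.findDeadlockedThreads();
            if (ids != null) return ids.length;
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        startDeadlockedPair();
        System.out.println("deadlocked threads: " + waitForDeadlock(5_000));
    }
}
```

While the main method is waiting, a jstack or jcmd dump of the same process should show the two probe threads in its deadlock report.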

The profiling workflow

Putting it together - the loop you run when something is slow or broken:

  1. Reproduce it under realistic load (a slow path needs traffic to show up; profiling an idle app shows nothing).
  2. Pick the tool for the symptom:
       • Slow (high CPU) -> JFR CPU profile, read the flame graph for hot methods.
       • Growing memory / OOM -> heap dump, find the dominator and path-to-GC-root.
       • High GC / churn -> JFR allocation profile, find who allocates.
       • Hung / frozen -> thread dump, look for deadlock or blocked threads.
       • "Is change X faster?" -> JMH benchmark of X vs the original.
  3. Find the one dominant cause - usually one method or one reference chain accounts for most of the problem.
  4. Fix that (using chapters 05, 08, 10, 12 - the right data structure, breaking the leak chain, reducing contention, killing the allocation).
  5. Re-measure to confirm it actually helped and the bottleneck moved (it always moves to the next thing - stop when you're at "fast enough").

Notice this is chapter 12's mindset made concrete. The hard part of performance work isn't fixing - it's finding, and these tools are how you find.

Try it

  1. Crude timing first. Take chapter 12's boxing example (Long vs long sum). Time both with nanoTime. The difference is huge enough that even crude timing shows it. This is the case where quick timing is legitimately enough.

  2. Record with JFR. Run any non-trivial program (or one with a deliberate hot loop) with -XX:StartFlightRecording=duration=30s,filename=run.jfr. Open run.jfr in JDK Mission Control. Find the hot-methods view. Identify the widest method. Did you guess right beforehand? (Usually not - that's the lesson.)

  3. Read a flame graph. In JMC (or IntelliJ's profiler), open the flame graph for a CPU-heavy run. Find the widest plateau. Click it, trace its callers. Write one sentence: "X% of CPU is in method M, called from C." That sentence is a profiling result.

  4. Capture a heap leak. Run chapter 08's unbounded-cache leak with -Xmx128m -XX:+HeapDumpOnOutOfMemoryError. When it OOMs, open the .hprof in Eclipse MAT. Run "Leak Suspects." Confirm it points at the HashMap/byte[] and names the static field retaining it. You just diagnosed a leak the way professionals do.

  5. Write a JMH benchmark. Set up the StringBench above (add the JMH Maven dependency). Run it. Confirm concat is dramatically slower at n=10000 and that the error bars don't overlap. Then benchmark ArrayList vs HashSet contains (chapter 12). Real numbers, properly measured.

  6. Catch a deadlock in a dump. Run chapter 10's deadlocking transfer. While it's hung, run jstack <pid>. Find the "Found one Java-level deadlock" section. Read which thread holds which lock and waits for which. Then apply the ordered-lock fix and confirm the dump no longer reports a deadlock.

What you might wonder

"JFR vs async-profiler vs VisualVM vs IntelliJ profiler - which?" They overlap. JFR is built into the JDK, low-overhead, production-safe - the default. JDK Mission Control is the GUI for JFR files. async-profiler is a popular open-source sampling profiler with excellent flame graphs (and it samples native code JFR can miss). VisualVM is a free all-rounder (profiling + heap dumps). IntelliJ's profiler wraps JFR/async-profiler in the IDE - the most convenient for development. Start with whatever's in front of you; they answer the same questions.

"Is it safe to profile in production?" JFR yes - it's designed for it (~1% overhead, always-on is a legitimate strategy). Heap dumps briefly pause the app (they stop the world to snapshot) and produce large files, so do them deliberately, not casually. Thread dumps are cheap and safe. Heavy instrumenting profilers (older ones) can have high overhead - prefer sampling profilers (JFR, async-profiler) in production.

"Do I need JMH for everyday performance checks?" Only when comparing implementations precisely (is A faster than B?) and the difference might be subtle. For "is this whole feature fast enough," profile the running app instead. JMH is for microbenchmarks - small, isolated pieces of code where naive timing would lie. Don't JMH a whole application; profile it.

"The flame graph shows my hot method is in a library I can't change. Now what?" Then the fix is to call it less, not make it faster. If 60% of time is in regex.compile, the fix is "compile the pattern once and reuse it" (it was being recompiled per call), not optimizing the regex engine. Profiling tells you where time goes; the fix is often "do this expensive thing fewer times," which is in your code (caching, pre-computing, batching).

"How do I profile a problem I can't reproduce locally?" This is why production-safe profiling matters. Enable always-on JFR in production; when the problem happens, you have the recording. For OOMs, -XX:+HeapDumpOnOutOfMemoryError captures the dump automatically when it crashes. The discipline is "have the recorder running before the problem happens," because intermittent production issues won't reproduce on demand.

"What's a 'sampling' vs 'instrumenting' profiler?" A sampling profiler periodically checks what each thread is doing (cheap, low-overhead, statistically accurate for hot spots - JFR, async-profiler). An instrumenting profiler injects timing code into every method (precise per-method counts but high overhead, can distort results and slow the app a lot). For finding hot spots, sampling is almost always the right choice.

Done

  • You know why crude nanoTime timing lies (JIT warmup, dead-code elimination, GC noise) and where it's still useful.
  • You can capture and read a JFR recording: hot-methods/flame graph for CPU, allocation view for garbage, GC view for pressure.
  • You can read a flame graph - scan for the widest boxes, trace callers.
  • You can capture a heap dump and find a leak via the dominator tree and path-to-GC-root.
  • You can write a trustworthy JMH benchmark (warmup, forks, return values, error bars).
  • You can take a thread dump and spot deadlocks and blocked threads.
  • You have the profiling workflow: reproduce, pick the tool for the symptom, find the one cause, fix, re-measure.

Next: testing at the next level - mocking, parameterized tests, and testing the concurrent code you now write.

14 - Testing at the next level

What this session is

About ninety minutes. From Scratch taught you JUnit 5 basics - write a test, assert a result. This session is about testing real code: isolating the unit under test with mocks, running the same test over many inputs with parameterized tests, finding edge cases you didn't think of with property-based testing, the test-double vocabulary that shows up in every code review, and how to test the concurrent code you learned to write in chapters 09-11. By the end you can test code that has dependencies, not just pure functions.

The problem: real code has dependencies

From Scratch tests looked like this - pure functions, easy to test:

@Test
void addsCorrectly() {
    assertEquals(5, Calculator.add(2, 3));
}

But real code depends on other things - a database, an HTTP client, a clock, a payment gateway:

class OrderService {
    private final PaymentGateway gateway;     // external dependency
    private final InventoryRepo inventory;    // external dependency

    // Dependencies are passed in, so the final fields are always initialized.
    OrderService(PaymentGateway gateway, InventoryRepo inventory) {
        this.gateway = gateway;
        this.inventory = inventory;
    }

    OrderResult placeOrder(Order order) {
        if (!inventory.inStock(order.itemId())) return OrderResult.outOfStock();
        var charge = gateway.charge(order.amount());   // calls a real payment system!
        return charge.success() ? OrderResult.placed() : OrderResult.declined();
    }
}

You can't unit-test this against the real PaymentGateway (you'd charge real money) or the real database (slow, stateful, requires setup). You need to substitute fake versions of the dependencies. That's what mocking - and the broader idea of test doubles - is for.

Design for testability first

The reason the example above is testable at all is a design choice from chapter 01: OrderService receives its dependencies through its constructor (dependency injection) and they're typed as interfaces. That's not an accident - it's what makes substituting fakes possible.

// Testable: dependencies are interfaces, injected. You can pass fakes.
class OrderService {
    OrderService(PaymentGateway gateway, InventoryRepo inventory) { ... }
}

// Untestable: dependency created internally, concrete. You're stuck with the real one.
class OrderService {
    private final PaymentGateway gateway = new StripeGateway();   // hardcoded - can't substitute
}

This connects the whole path: "accept interfaces" (chapter 01), "program to contracts" (chapter 02), and "depend on abstractions" all pay off here as testability. Code that's hard to test is usually code with bad dependencies - testing pressure reveals design problems. If a class is painful to test, that's a signal to fix its design, not to skip the test.

The test-double vocabulary

"Mock" is used loosely for all fakes, but the precise terms show up in reviews and matter for thinking clearly:

  • Dummy - a placeholder passed but never used (fills a parameter slot).
  • Stub - returns canned answers to calls (when asked for stock, say true). No verification.
  • Fake - a working but simplified implementation (an in-memory Map standing in for a database).
  • Mock - a stub that also records how it was called, so you can verify interactions (assert charge() was called once with $50).
  • Spy - a wrapper around a real object that records calls while delegating to the real implementation.

The two you use most: stub (control what a dependency returns) and mock (verify how a dependency was called). Most "mocking" is really one of these two.
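
Written by hand, without any library, the three most common doubles look like this. The interfaces and the Charge record are assumptions inferred from the OrderService example; the class names are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TestDoubles {
    // Hypothetical shapes, inferred from the OrderService example.
    interface InventoryRepo { boolean inStock(String itemId); }
    interface PaymentGateway { Charge charge(double amount); }
    record Charge(boolean success) {}

    // Stub: a canned answer, nothing recorded.
    static final class AlwaysInStock implements InventoryRepo {
        public boolean inStock(String itemId) { return true; }
    }

    // Fake: a working but simplified implementation, backed by a Map instead of a database.
    static final class InMemoryInventory implements InventoryRepo {
        private final Map<String, Integer> stock = new HashMap<>();
        void add(String itemId, int quantity) { stock.merge(itemId, quantity, Integer::sum); }
        public boolean inStock(String itemId) { return stock.getOrDefault(itemId, 0) > 0; }
    }

    // Hand-rolled mock: answers like a stub and records each call so a test can verify it.
    static final class RecordingGateway implements PaymentGateway {
        final List<Double> charges = new ArrayList<>();
        public Charge charge(double amount) {
            charges.add(amount);
            return new Charge(true);
        }
    }
}
```

A test using RecordingGateway would assert on charges afterward: that the list is empty corresponds to "never charged", and that it equals List.of(50.0) corresponds to "charged exactly once with $50".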

Mockito: the standard mocking library

Mockito is the de facto Java mocking library. It creates fake implementations of interfaces, lets you program their responses, and lets you verify how they were called.

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.*;
import org.junit.jupiter.api.Test;

class OrderServiceTest {

    @Test
    void placesOrderWhenInStockAndPaymentSucceeds() {
        // 1. Create mocks of the dependencies.
        PaymentGateway gateway = mock(PaymentGateway.class);
        InventoryRepo inventory = mock(InventoryRepo.class);

        // 2. Stub their behavior - "when called this way, return that".
        when(inventory.inStock("widget")).thenReturn(true);
        when(gateway.charge(50.0)).thenReturn(new Charge(true));

        // 3. Run the code under test with the fakes injected.
        var service = new OrderService(gateway, inventory);
        var result = service.placeOrder(new Order("widget", 50.0));

        // 4. Assert the result.
        assertEquals(OrderResult.placed(), result);

        // 5. Verify the interaction - the gateway WAS charged, exactly once, with $50.
        verify(gateway).charge(50.0);
        verify(gateway, times(1)).charge(anyDouble());
    }

    @Test
    void doesNotChargeWhenOutOfStock() {
        PaymentGateway gateway = mock(PaymentGateway.class);
        InventoryRepo inventory = mock(InventoryRepo.class);
        when(inventory.inStock("widget")).thenReturn(false);   // out of stock

        var service = new OrderService(gateway, inventory);
        var result = service.placeOrder(new Order("widget", 50.0));

        assertEquals(OrderResult.outOfStock(), result);
        verify(gateway, never()).charge(anyDouble());   // CRUCIAL: we never charged anyone
    }
}

The core verbs:

mock(Type.class)                          // create a fake
when(mock.method(args)).thenReturn(value) // stub a return value
when(mock.method(args)).thenThrow(ex)     // stub an exception
verify(mock).method(args)                 // assert it was called (once, by default)
verify(mock, times(n)).method(args)       // called exactly n times
verify(mock, never()).method(args)        // never called
any(), anyString(), anyDouble(), eq(x)    // argument matchers for flexible matching

The never() verification in the second test is the kind of thing mocks make possible and is genuinely valuable: proving the code doesn't do something (charge a card when out of stock) is as important as proving it does.

When not to mock: don't mock value types (just construct them - chapter 03 records are trivial to make real), and don't mock types you don't own in a way that couples your test to their internals. Prefer a real or fake implementation when it's cheap; reach for mocks for genuinely external, expensive, or hard-to-set-up dependencies.

Parameterized tests: one test, many inputs

When you'd otherwise copy-paste a test with different values, use a parameterized test - one test method run once per input set. JUnit 5 makes this clean:

import static org.junit.jupiter.api.Assertions.*;
import java.util.stream.Stream;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.*;

class ValidationTest {

    @ParameterizedTest
    @ValueSource(strings = {"", " ", "  ", "\t"})       // run once per value
    void blankStringsAreInvalid(String input) {
        assertFalse(Validator.isValidName(input));
    }

    @ParameterizedTest
    @CsvSource({                                          // each row: input, expected
        "alice@example.com, true",
        "no-at-sign,        false",
        "@nodomain,         false",
        "a@b.co,            true"
    })
    void emailValidation(String email, boolean expected) {
        assertEquals(expected, Validator.isValidEmail(email));
    }

    @ParameterizedTest
    @MethodSource("edgeCaseProvider")                    // supply args from a method
    void handlesEdgeCases(int input, int expected) {
        assertEquals(expected, Math.abs(input));
    }
    static Stream<Arguments> edgeCaseProvider() {
        return Stream.of(
            Arguments.of(-5, 5),
            Arguments.of(0, 0),
            Arguments.of(Integer.MIN_VALUE, Integer.MIN_VALUE)  // the famous abs overflow!
        );
    }
}

This turns ten near-duplicate test methods into one method and a data table - easier to read, easier to add cases, and each input shows as a separate result so you see exactly which case failed. Reach for parameterized tests whenever you're testing "the same logic across a range of inputs."
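
The Integer.MIN_VALUE row in the example deserves a closer look, because it is exactly the kind of edge case a data table makes cheap to include. Math.abs returns a negative number for that one input (its positive counterpart doesn't fit in an int); Math.absExact, available since Java 15, throws instead. A small sketch with a hypothetical helper:

```java
class AbsOverflow {
    // True if Math.absExact rejects the value with an ArithmeticException.
    static boolean absExactThrows(int value) {
        try {
            Math.absExact(value);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Math.abs(-5));                               // 5
        System.out.println(Math.abs(Integer.MIN_VALUE) < 0);            // true: the result is still negative
        System.out.println(absExactThrows(Integer.MIN_VALUE));          // true
    }
}
```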

Property-based testing: finding cases you didn't think of

Example-based tests check the cases you thought of. Property-based testing generates hundreds or thousands of random inputs and checks that a property (an invariant) always holds - finding edge cases you'd never have written by hand. A widely used Java library for this is jqwik.

import static org.junit.jupiter.api.Assertions.*;
import java.util.List;
import net.jqwik.api.*;

class SortProperties {

    @Property
    void sortedListIsSameLength(@ForAll List<Integer> input) {
        var sorted = MySort.sort(input);
        assertEquals(input.size(), sorted.size());        // property: sorting preserves length
    }

    @Property
    void sortedListIsOrdered(@ForAll List<Integer> input) {
        var sorted = MySort.sort(input);
        for (int i = 1; i < sorted.size(); i++) {
            assertTrue(sorted.get(i - 1) <= sorted.get(i)); // property: each <= the next
        }
    }

    @Property
    void encodingRoundTrips(@ForAll String original) {
        assertEquals(original, decode(encode(original)));  // property: decode(encode(x)) == x
    }
}

jqwik runs each @Property with ~1000 generated inputs - empty lists, single elements, huge lists, negative numbers, weird Unicode strings. When it finds a failure, it shrinks the input to the minimal failing case ("fails on [0, -1]") so you get a tiny reproduction, not a 500-element monster.

The mental shift: instead of "what example should I test," ask "what must always be true?" Round-trip properties (decode(encode(x)) == x), invariants (sorting preserves length and order), and equivalences (two implementations agree) are the sweet spots. Property-based testing famously finds the edge cases - empty input, overflow, Unicode boundaries - that example tests miss. Use it for pure logic with clear invariants; it complements example tests, doesn't replace them.
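
To see the idea without any library, here is a hand-rolled version of the sort property. It is only a sketch of the concept: it generates random lists from a fixed seed and checks the invariants, but has none of jqwik's smart generators or shrinking. Collections.sort stands in for the MySort.sort under test, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class HandRolledProperties {
    // Returns the first generated input that breaks "sorting preserves length
    // and yields ascending order", or null if every try passed.
    static List<Integer> findSortCounterexample(long seed, int tries) {
        Random random = new Random(seed);         // fixed seed: failures are reproducible
        for (int t = 0; t < tries; t++) {
            List<Integer> input = new ArrayList<>();
            int size = random.nextInt(20);        // includes the empty list
            for (int i = 0; i < size; i++) input.add(random.nextInt());

            List<Integer> sorted = new ArrayList<>(input);
            Collections.sort(sorted);             // stand-in for the sort under test

            if (sorted.size() != input.size() || !isAscending(sorted)) return input;
        }
        return null;
    }

    static boolean isAscending(List<Integer> list) {
        for (int i = 1; i < list.size(); i++) {
            if (list.get(i - 1) > list.get(i)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("counterexample: " + findSortCounterexample(42L, 1000));
    }
}
```

What a real library adds on top is the part that matters in practice: better input generation (boundary values, duplicates, extremes) and shrinking a failure down to a minimal case.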

Testing concurrent code

The hard one, building on chapters 09-11. Concurrency bugs are timing-dependent (chapter 09), so a test that passes once proves little. Strategies:

1. Stress with many threads and check the invariant. Hammer the code from many threads and assert the result is correct - this catches races that single-threaded tests can't.

@Test
void counterIsThreadSafe() throws InterruptedException {
    var counter = new Counter();              // the chapter 10 fixed version
    int threads = 8, perThread = 100_000;
    try (var pool = Executors.newFixedThreadPool(threads)) {
        var done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) counter.increment();
                done.countDown();
            });
        }
        done.await();                          // wait for all threads
    }
    assertEquals(threads * perThread, counter.get());  // exact - any race would lose increments
}

Run this against chapter 09's broken counter and it fails (wrong total); against chapter 10's fixed one it passes. The high iteration count and thread count maximize the chance of provoking a race.

2. Use synchronization aids for coordination. CountDownLatch (wait for N things), CyclicBarrier (release all threads at once to maximize collision), Phaser - these let you control timing to make races more likely to surface.
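
A sketch of the barrier idea: every worker blocks on a CyclicBarrier until all of them are ready, so they start hammering at the same moment instead of one after another. The class and method names are hypothetical, and an AtomicInteger stands in for whatever thread-safe counter you are testing.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class BarrierStress {
    // Runs `action` perThread times on each of `threads` threads, all released together.
    static void hammer(Runnable action, int threads, int perThread) {
        CyclicBarrier startLine = new CyclicBarrier(threads);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    startLine.await();            // block until every worker has arrived
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
                for (int i = 0; i < perThread; i++) action.run();
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();    // wait for all workers to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Example target: an AtomicInteger, which should never lose an increment.
    static int hammerAtomicCounter(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger();
        hammer(counter::incrementAndGet, threads, perThread);
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(hammerAtomicCounter(8, 100_000));
    }
}
```

Pass an unsynchronized counter's increment method to hammer instead and the total will usually come up short, though how often and by how much varies from run to run.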

3. Use a real concurrency-testing tool. For serious lock-free or low-level concurrent code, jcstress (the Java Concurrency Stress tool) systematically explores interleavings and reports which outcomes are actually possible - far beyond what hand-rolled stress tests catch. It's specialist, but it's the right tool when you're writing genuinely tricky concurrent code.

The honest caveat: concurrency tests are probabilistic. A stress test that passes makes a bug less likely but can't prove absence (the bad interleaving might just not have happened this run). That's why the real defenses are correct design (chapters 09-11: minimize shared mutable state, use the right tools) plus stress testing and tools like jcstress as backup. You test concurrent code, but you don't rely on tests alone to make it correct.

Test structure and quality

A few habits that separate good test suites from brittle ones:

  • Arrange-Act-Assert (AAA). Structure each test in three clear phases: set up (arrange), call the code (act), check the result (assert). The Mockito example above follows it. Readable tests have visible structure.
  • One logical assertion per test. A test should verify one behavior. "placesOrderWhenInStockAndPaymentSucceeds" tests one scenario; a separate test handles out-of-stock. When a test fails, the name tells you what broke.
  • Test behavior, not implementation. Assert what the code does (the return value, the observable effect), not how (don't assert internal calls unless the interaction is the behavior, like "must not charge when out of stock"). Over-mocking and over-verifying makes tests break on every refactor - a brittle suite people stop trusting.
  • Name tests as specifications. doesNotChargeWhenOutOfStock reads as a requirement. A failing test should tell you the violated requirement from its name alone.
  • Fast and isolated. Unit tests run in milliseconds and don't depend on each other or external state. Slow, order-dependent tests get skipped, and a skipped test protects nothing.
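
The habits above can be sketched in plain Java, with no test framework, so the structure is easy to see. Everything here is an assumption made for illustration: the InventoryRepo and PaymentGateway signatures, the OrderService logic, and the require helper standing in for a JUnit assertion. Your real interfaces may look different.

```java
import java.util.ArrayList;
import java.util.List;

public class OrderServiceSpec {

    // Hypothetical interfaces for illustration; real signatures may differ.
    interface InventoryRepo {
        boolean inStock(String sku);
    }

    interface PaymentGateway {
        boolean charge(String customerId, long cents);
    }

    static final class OrderService {
        private final InventoryRepo inventory;
        private final PaymentGateway payments;

        OrderService(InventoryRepo inventory, PaymentGateway payments) {
            this.inventory = inventory;
            this.payments = payments;
        }

        /** Returns true only if the item was in stock and the charge succeeded. */
        boolean placeOrder(String customerId, String sku, long cents) {
            if (!inventory.inStock(sku)) {
                return false; // never reach the gateway for unavailable items
            }
            return payments.charge(customerId, cents);
        }
    }

    /** Hand-written test double that records every charge it receives. */
    static final class RecordingGateway implements PaymentGateway {
        final List<Long> charges = new ArrayList<>();

        @Override
        public boolean charge(String customerId, long cents) {
            charges.add(cents);
            return true;
        }
    }

    // In JUnit this would be an @Test method; the name states the requirement.
    static void doesNotChargeWhenOutOfStock() {
        // Arrange: nothing is in stock
        RecordingGateway gateway = new RecordingGateway();
        OrderService service = new OrderService(sku -> false, gateway);

        // Act
        boolean placed = service.placeOrder("alice", "sku-1", 1_999L);

        // Assert: here the missing interaction is the behavior
        require(!placed, "order should be rejected");
        require(gateway.charges.isEmpty(), "gateway should not be charged");
    }

    static void placesOrderWhenInStockAndPaymentSucceeds() {
        // Arrange: everything is in stock
        RecordingGateway gateway = new RecordingGateway();
        OrderService service = new OrderService(sku -> true, gateway);

        // Act
        boolean placed = service.placeOrder("alice", "sku-1", 1_999L);

        // Assert: observable result plus the one charge that matters
        require(placed, "order should be placed");
        require(gateway.charges.equals(List.of(1_999L)), "exactly one charge of 1999 cents");
    }

    /** Minimal stand-in for a test framework's assertion. */
    static void require(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        doesNotChargeWhenOutOfStock();
        placesOrderWhenInStockAndPaymentSucceeds();
        System.out.println("2 tests passed");
    }
}
```

Each test covers one scenario, shows its three phases, and checks only the return value and the recorded charges. Neither test knows how placeOrder is written internally, so a refactor that keeps the behavior keeps both tests green.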

A note on the test pyramid

The standard guidance for what to test at which level:

  • Many unit tests (fast, isolated, mock external dependencies) - the base of the pyramid. Most of your tests.
  • Fewer integration tests (real database, real HTTP, wired-together components) - verify the pieces work together. Slower, so fewer.
  • Few end-to-end tests (the whole system) - confirm the critical paths work. Slowest and most brittle, so fewest.

The shape matters: lots of fast unit tests catch most bugs cheaply; a few integration and E2E tests catch the wiring issues units can't. Inverting it (mostly slow E2E tests) gives a suite that's slow, flaky, and hard to debug. Mock dependencies for unit tests; use real ones (often via Testcontainers - real databases in throwaway Docker containers) for the smaller integration layer.

Try it

  1. Mock a dependency. Build the OrderService with PaymentGateway and InventoryRepo interfaces. Write the two tests shown: success path (stub both, verify charge) and out-of-stock (verify never() charged). Run them. Then write a third: payment declined (stub charge to return failure, assert declined, verify charge was attempted).

  2. Fake vs mock. Implement InventoryRepo as a fake - a real class backed by an in-memory Map. Rewrite a test using the fake instead of a Mockito mock. Discuss when each is clearer: the fake when you need realistic behavior across many calls, the mock when you need to verify specific interactions.

  3. Parameterize. Take a validation method and write a @CsvSource parameterized test with 6-8 input/expected rows including edge cases (empty, whitespace, boundary values). Run it - note each row is a separate result. Add a row that should fail and confirm only that row reports red.

  4. Property test. If you have jqwik, write @Property void absIsNonNegative(@ForAll int n). Run it - it'll find Integer.MIN_VALUE, where Math.abs returns a negative number (overflow), shrinking to that exact input. A famous bug, found automatically. Then write a round-trip property for any encode/decode pair you have.

  5. Stress-test a counter. Write the 8-thread stress test against chapter 09's broken counter++ and watch it fail (wrong total). Then point it at chapter 10's AtomicInteger version and watch it pass. Run each several times - note the broken one fails by different amounts (probabilistic).

  6. Spot the brittle test. Take a test that over-verifies (verifies every internal method call). Refactor the implementation without changing behavior - watch the brittle test break despite correct behavior. Rewrite it to assert the observable result instead. Feel why "test behavior, not implementation" matters.

What you might wonder

"Mockito vs writing fakes by hand?" Mockito is faster to set up for one-off stubbing and is the standard for verifying interactions. Hand-written fakes (an in-memory repo) are better when many tests need the same realistic behavior, or when the fake is reused. Many codebases use both: Mockito for verifying calls and quick stubs, fakes for stateful dependencies used across a test suite. Don't mock value types - just build real ones.

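For a sense of what a hand-written fake looks like, here is a minimal sketch of an in-memory inventory. The stockOf and reserve methods are assumptions invented for this example; the InventoryRepo in your own code may expose different operations.

```java
import java.util.HashMap;
import java.util.Map;

public class FakeInventoryDemo {

    // Hypothetical interface: the method names are assumptions for illustration.
    interface InventoryRepo {
        int stockOf(String sku);
        boolean reserve(String sku, int quantity);
    }

    /** A fake: a real, working implementation that is just too simple for production. */
    static final class InMemoryInventoryRepo implements InventoryRepo {
        private final Map<String, Integer> stock = new HashMap<>();

        /** Test-setup helper: seeds the fake with an initial quantity. */
        InMemoryInventoryRepo with(String sku, int quantity) {
            stock.put(sku, quantity);
            return this;
        }

        @Override
        public int stockOf(String sku) {
            return stock.getOrDefault(sku, 0);
        }

        @Override
        public boolean reserve(String sku, int quantity) {
            int available = stockOf(sku);
            if (quantity <= 0 || quantity > available) {
                return false;
            }
            stock.put(sku, available - quantity); // state carries over to later calls
            return true;
        }
    }

    public static void main(String[] args) {
        InventoryRepo repo = new InMemoryInventoryRepo().with("widget", 3);
        System.out.println(repo.reserve("widget", 2)); // true
        System.out.println(repo.reserve("widget", 2)); // false: only 1 left
        System.out.println(repo.stockOf("widget"));    // 1
    }
}
```

The second reserve call fails because the first one changed the fake's state. Getting that multi-call behavior from a mock takes a chain of stubs per test; the fake gives it to every test for free.
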
"Is 100% test coverage the goal?" No. Coverage measures lines executed by tests, not lines meaningfully verified - you can have 100% coverage with assertions that prove nothing. Aim to test behavior and edge cases that matter; coverage is a rough hint about untested areas (0% on a complex method is a red flag), not a target to maximize. High coverage of trivial getters while the gnarly logic goes untested is the worst of both worlds.

"How do I test code that uses the current time or random numbers?" Inject them. Instead of calling Instant.now() or new Random() directly, take a Clock or a Random/seeded source as a dependency (chapter 01 again). In tests, pass a fixed Clock.fixed(...) or a seeded Random so behavior is deterministic. Hardcoded now()/randomness is the classic "untestable because of bad dependencies" problem.

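A small sketch of the Clock-injection idea, using only java.time. The SessionPolicy class and its expiry rule are hypothetical, chosen just to have something time-dependent to test.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;

public class SessionPolicyDemo {

    /** Hypothetical policy class; the point is the injected Clock. */
    static final class SessionPolicy {
        private final Clock clock;
        private final Duration ttl;

        SessionPolicy(Clock clock, Duration ttl) {
            this.clock = clock;
            this.ttl = ttl;
        }

        /** A session is expired once `ttl` has fully elapsed since it was issued. */
        boolean isExpired(Instant issuedAt) {
            Instant deadline = issuedAt.plus(ttl);
            return !clock.instant().isBefore(deadline);
        }
    }

    public static void main(String[] args) {
        Instant issuedAt = Instant.parse("2024-01-01T00:00:00Z");
        Duration ttl = Duration.ofMinutes(30);

        // Test-style usage: freeze time just before and exactly at the deadline.
        Clock justBefore = Clock.fixed(Instant.parse("2024-01-01T00:29:59Z"), ZoneOffset.UTC);
        Clock atDeadline = Clock.fixed(Instant.parse("2024-01-01T00:30:00Z"), ZoneOffset.UTC);

        System.out.println(new SessionPolicy(justBefore, ttl).isExpired(issuedAt)); // false
        System.out.println(new SessionPolicy(atDeadline, ttl).isExpired(issuedAt)); // true

        // Production wiring passes the real clock through the same constructor.
        SessionPolicy production = new SessionPolicy(Clock.systemUTC(), ttl);
        System.out.println(production.isExpired(Instant.now())); // false: issued just now
    }
}
```

Because the boundary case (exactly at the deadline) is pinned by a fixed clock, the test gives the same answer on every run. The same approach works for randomness: accept a Random in the constructor and pass new Random(seed) in tests.
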
"Should I test private methods?" Generally no - test them through the public methods that use them. If a private method is complex enough to want its own test, that's often a sign it should be extracted into its own class (with a public, testable interface) - the testing pressure revealing a design improvement again.

"Property-based testing sounds great - why isn't it everywhere?" It shines for pure logic with clear invariants (parsers, encoders, data structures, math) and is underused there. It's awkward for code with lots of side effects or unclear properties ("what's the invariant of this UI handler?"). Use it where invariants are clear; it complements example tests rather than replacing them. Many teams don't know it exists - now you do.

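To show the shape of a property check without any library, here is a hand-rolled sketch: a round-trip invariant for the JDK's Base64 encoder, checked over inputs from a seeded Random. This is not how jqwik works internally and it has none of jqwik's shrinking or edge-case biasing; the findCounterexample helper is a name invented for this example.

```java
import java.util.Arrays;
import java.util.Base64;
import java.util.Random;

public class RoundTripProperty {

    /** Property: decoding an encoded byte array yields the original bytes. */
    static boolean roundTrips(byte[] input) {
        String encoded = Base64.getEncoder().encodeToString(input);
        byte[] decoded = Base64.getDecoder().decode(encoded);
        return Arrays.equals(input, decoded);
    }

    /** Checks the property over `trials` generated inputs; returns the first failure or null. */
    static byte[] findCounterexample(long seed, int trials) {
        Random random = new Random(seed); // seeded so any failure is reproducible
        for (int i = 0; i < trials; i++) {
            byte[] input = new byte[random.nextInt(64)]; // lengths 0..63, including empty
            random.nextBytes(input);
            if (!roundTrips(input)) {
                return input;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] failure = findCounterexample(42L, 1_000);
        System.out.println(failure == null
                ? "property held for 1000 inputs"
                : "counterexample: " + Arrays.toString(failure));

        // Uniform random ints would almost never hit this one value, which is
        // why real property-testing libraries deliberately inject edge cases.
        System.out.println(Math.abs(Integer.MIN_VALUE) < 0); // true
    }
}
```

The invariant is one line; the generator supplies the examples. What a real library adds on top is smarter generation and shrinking a failing input down to a minimal one.
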
"How do I make flaky concurrency tests reliable?" You largely can't: a probabilistic test can't be made deterministic, which is exactly why it can't prove correctness. Increase iterations and threads to raise the odds of catching a bug, use jcstress for systematic interleaving exploration on critical code, and most importantly design concurrency correctly up front (chapters 09-11) rather than trying to test correctness in afterward. A flaky test that occasionally catches a real race is still valuable as a signal - just don't mistake "passed once" for "correct."

Done

  • You know real code needs testable design: inject dependencies as interfaces (chapters 01-02 paying off).
  • You know the test-double vocabulary (dummy, stub, fake, mock, spy) and use stubs and mocks most.
  • You can use Mockito: mock, when().thenReturn(), verify(), including verifying something didn't happen.
  • You can write parameterized tests (@ValueSource, @CsvSource, @MethodSource) for one-logic-many-inputs.
  • You know property-based testing finds edge cases by checking invariants over generated inputs.
  • You can stress-test concurrent code and know such tests are probabilistic, not proof.
  • You know the quality habits (AAA, behavior-not-implementation, names-as-specs) and the test pyramid.

Next, the final chapter: bridging to mastery - reading harder code, a more ambitious contribution, and where to go from here.


15 - Bridging to mastery

What this session is

About forty-five minutes. You've reached the end of the intermediate tier. This chapter isn't new syntax - it's the bridge. What you can do now, how to keep growing, how to read code that's harder than anything in this path, how to make a more ambitious open-source contribution, and what the jump to Java Mastery actually involves. Read it once now, and come back to the "what next" parts when you're ready for the next step.

What you can do now

Take stock. When you started this path you could write a class and a loop. Now you can:

  • Design with judgment - choose composition over inheritance, design with interfaces and abstract classes, get the equals/hashCode/compareTo contracts right, default to immutability (chapters 01-03).
  • Use the language well - write and read generics with wildcards, pick the right collection for the access pattern, handle errors with a real strategy, write functional pipelines and use Optional correctly (chapters 04-07).
  • Reason about memory - stack vs heap, reachability, the four leak sources, what OutOfMemoryError means (chapter 08).
  • Write correct concurrent code - see races, visibility, and ordering bugs; fix them with synchronized, locks, atomics, and concurrent collections; orchestrate with executors, CompletableFuture, and virtual threads (chapters 09-11).
  • Care about performance the right way - avoid the free mistakes, measure before optimizing, and use a profiler to find the real hot spot (chapters 12-13).
  • Test real code - mock dependencies, parameterize, property-test, stress-test concurrency (chapter 14).

That's the toolkit of a competent mid-level Java engineer. You can be handed a real feature in a real codebase and deliver it well. That's a genuine milestone - most people who use Java never reach this level of deliberate understanding.

The habit that matters most

If you take one thing from this entire path, take this: the question "what if two threads ran this at the same time?" Ask it of every piece of shared mutable state you write. It's the single habit that separates engineers who ship reliable server code from those whose code mysteriously corrupts data in production. Combined with its quieter siblings - "which collection is right here?", "should this be immutable?", "is this the testable design?" - you now have the judgment the beginner path couldn't give you.

Judgment is what intermediate was always about. The syntax was never the hard part. Knowing which of five correct options is the right one - that's the skill, and it only comes from understanding the tradeoffs, which you now do.
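
To make the "two threads" question concrete, here is a small sketch of asking it about a shared hit counter. The HitCounter class and its method names are hypothetical. Both methods use a ConcurrentHashMap, yet only one of them survives the question.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class TwoThreadsQuestion {

    /** Hypothetical hit counter shared by every request-handling thread. */
    static final class HitCounter {
        private final Map<String, Integer> hits = new ConcurrentHashMap<>();

        // Racy even on a concurrent map: get and put are two separate steps,
        // so two threads can both read 5 and both write 6, losing an update.
        void recordRacy(String page) {
            Integer current = hits.get(page);
            hits.put(page, current == null ? 1 : current + 1);
        }

        // Safe: merge performs the whole read-modify-write atomically.
        void record(String page) {
            hits.merge(page, 1, Integer::sum);
        }

        int hitsFor(String page) {
            return hits.getOrDefault(page, 0);
        }
    }

    /** Hammers the safe method from several threads and returns the final count. */
    static int runSafe(int threads, int perThread) {
        HitCounter counter = new HitCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.record("/home");
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.hitsFor("/home");
    }

    public static void main(String[] args) {
        System.out.println(runSafe(8, 100_000)); // 800000
    }
}
```

Picking a thread-safe collection was not enough on its own: the bug in recordRacy lives in the gap between two individually safe calls. Seeing that gap is what the question is for.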

Reading code that's harder than this path

The fastest way to keep growing is reading code better than your own. But senior codebases use patterns this path only introduced. Here's how to approach code that's over your head without drowning:

  1. Start from the entry point and trace one path. Don't read a large codebase top to bottom. Pick one feature - "what happens when a request comes in?" - and follow that single thread of execution through the layers. Ignore everything else on the first pass.

  2. Recognize the patterns you now know. You'll see dependency injection (chapter 01), interfaces as seams (chapter 02), CompletableFuture pipelines (chapter 11), ConcurrentHashMap for shared state (chapter 10). You have the vocabulary now - the code that looked like noise before has structure you can name.

  3. Let the tests be the documentation. A class's test suite shows you how it's meant to be used and what its contracts are - often clearer than the code itself (chapter 14). Read the tests first.

  4. Use the tools. Run it under a debugger and step through the path you're tracing. Run it under a profiler (chapter 13) to see what actually executes. Reading plus running beats reading alone.

  5. Keep a "things I don't understand yet" list. When you hit a pattern you don't recognize (a ThreadLocal, a Phaser, a bytecode-manipulation library, a reactive stream), note it and move on - don't let one unknown stop the whole reading session. Come back to the list later, one item at a time.

Good code to read, roughly in order of difficulty: a well-regarded small library (Guava utilities, a JSON parser), then a mid-size framework's core (the request-handling path of a web framework), then eventually the JDK's own source (java.util.concurrent is a masterclass - and it's the bridge to Mastery).

A more ambitious contribution

The From Scratch path got you to your first pull request. With intermediate skills, you can take on more:

  • Fix a real bug, not just a typo. Find an open issue labeled bug in a project you use. Reproduce it (write a failing test - chapter 14), find the cause (maybe with a debugger or profiler), fix it, prove it with the test. A bug fix with a test is a serious contribution.
  • Tackle a concurrency or performance issue. These are exactly where your new skills shine and where many projects need help. An issue like "X is slow under load" or "Y has a race condition" is now within reach - profile it (chapter 13), find the cause, fix it.
  • Contribute to a project's test suite. Many projects have under-tested areas. Adding parameterized or property-based tests (chapter 14) that catch real edge cases is welcomed and teaches you the codebase deeply.
  • Implement a small feature end to end. Pick a good first issue that's a real feature, not cosmetic. Design it (chapters 01-02), build it, test it, document it.

The workflow is the same as From Scratch (fork, branch, change, test, PR) - what's changed is the ambition of what you can take on. Pick projects whose code is at or slightly above your level; you'll learn the most from code that stretches you without overwhelming you.

What Java Mastery is, and when you're ready

The senior reference path - Java Mastery - is a different kind of depth. Where Intermediate taught you to use the language and platform well, Mastery teaches you how the platform works underneath:

  • JVM internals - class loading, the bytecode your code compiles to, how the interpreter and JIT compilers (C1, C2, Graal) turn it into machine code.
  • Garbage collection in depth - not "avoid leaks" (chapter 08) but the actual algorithms: G1's regions, ZGC's colored pointers, generational collection mechanics, how to read and tune GC logs.
  • The JIT and performance at the machine level - escape analysis, inlining, deoptimization, reading assembly output, why your benchmark behaves the way it does.
  • The concurrency primitives' implementation - how synchronized, volatile, and the atomics actually work at the hardware/memory-barrier level; the java.util.concurrent source you used in chapters 10-11.
  • Production operations - JFR/async-profiler deeply (chapter 13 was the surface), heap-dump forensics, container-aware tuning, observability.

You're ready for Mastery when the intermediate concepts feel natural - when you reach for the right collection without thinking, when you write concurrent code and instinctively ask the "two threads" question, when you've shipped real features and maybe debugged a real production issue or two. You don't need to finish every intermediate topic perfectly; you need them to be familiar enough that the next layer down is "how does this thing I already use actually work?" rather than "what is this thing?"

If chapters 09-13 still feel shaky, that's fine - spend more time using the concepts (build things, contribute, read code) before climbing to Mastery. The platform internals make far more sense once you've felt the problems they solve. Mastery is most valuable to someone who has needed it - who has hit a GC pause in production, or a JIT deopt, or a memory-model subtlety - not someone studying it cold.

A realistic next 6 months

A concrete plan to consolidate intermediate and move toward senior:

  1. Build something real and concurrent. A small web service, a job processor, a tool you'll actually use - something with shared state, multiple threads or async work, real dependencies. Apply chapters 01-14 to a project you care about. Nothing cements the material like shipping it.
  2. Contribute the more ambitious PR described above - a bug fix with a test, or a small feature, in a project you use.
  3. Read java.util.concurrent source. You used ConcurrentHashMap, ReentrantLock, CompletableFuture (chapters 10-11). Now read how they're built. It's the single best bridge to Mastery - simultaneously a concurrency masterclass and a tour of JVM-level thinking.
  4. Profile a real performance problem. Find (or create) something slow, profile it with JFR (chapter 13), fix the actual hot spot, measure the improvement. Do this once for real and the skill is yours.
  5. Then start Java Mastery when "how does the JIT compile this?" is a question you find yourself asking.

Where this path sits

The three tiers, for orientation:

  • Java From Scratch - never-coded to first OSS PR. Syntax and "does it run."
  • Java Intermediate (this path) - writing Java to writing good Java. Judgment, concurrency, performance, testing. Competent mid-level engineer.
  • Java Mastery - the platform internals. JVM, JIT, GC, the concurrency primitives' implementation. Senior/staff depth.

You've completed the middle tier - the one that turns someone who can write Java into someone who can be trusted with production Java. That's the hardest and most valuable jump in the progression, because it's where syntax becomes judgment.

A closing note

The thing that makes a senior engineer isn't knowing more syntax - it's knowing the tradeoffs, having the judgment to pick well, and having the discipline to measure instead of guess and to ask "what if two threads ran this at once?" instead of hoping. You built all three in this path. Keep building things, keep reading code better than yours, keep asking the hard questions of your own code, and the senior tier will arrive not as a leap but as a series of "oh, that's how it works underneath" moments.

You can write good Java now. Go write a lot of it.

What you might wonder

"Am I a 'senior' engineer now?" You have the technical knowledge of a strong mid-level engineer. Seniority is also experience - having shipped, maintained, debugged production, made design calls that played out over time, and mentored others. This path gives you the foundation; the title comes from applying it over a few years on real systems. The knowledge gap to senior is smaller than you think; the experience gap closes only with time and reps.

"Should I do Java Mastery next, or go build things?" Build things first, mostly. The intermediate concepts cement through use, and Mastery's internals make sense only after you've felt the problems they solve. Interleave: build for a few months, contribute, then start Mastery when you find yourself curious about how the tools you rely on actually work. Studying internals cold, before you need them, doesn't stick.

"What if I want to learn another language now?" Great instinct - a second language deepens the first by contrast. Your intermediate Java judgment transfers heavily: composition over inheritance, immutability, the concurrency problems (every language has them), measure-don't-guess, testable design. The site has From Scratch and (growing) intermediate paths for Go, Python, and others. You'll move much faster through them now that you have the concepts - you're learning syntax, not ideas.

"I finished but I don't feel like an expert." Good - that's accurate and healthy. Expertise is asymptotic; the more you know, the more you see how much there is. The goal of this path was never "expert" - it was "competent, judgment-equipped, and able to keep growing on your own." If you can read this chapter's list of what-you-can-do and recognize yourself, you succeeded. The discomfort is just the view from a higher vantage point.

"How do I keep this knowledge from fading?" Use it. Knowledge you apply weekly stays; knowledge you only read fades. Build, contribute, review others' code, and revisit the chapters when you hit the situation they cover (come back to chapter 10 when you write your next concurrent component, chapter 13 when something's slow). This path is a reference, not a one-time read - that's why it was written as a source of truth.

Done with this path

You've gone from writing Java to writing good Java:

  • Design judgment - composition, interfaces, contracts, immutability.
  • The language used well - generics, collections, exceptions, functional style.
  • Memory reasoning and leak avoidance.
  • Correct concurrency - the three problems and the full toolbox to fix them.
  • Performance awareness and real profiling.
  • Real-world testing - mocks, parameterization, properties, concurrency.

The next tier, when you're ready, is Java Mastery - the platform internals beneath everything you just learned. But first: go build something real, and apply all of it.

Congratulations. You're a competent Java engineer.