
03 - Equality, hashing, immutability

What this session is

About ninety minutes. Three contracts that every Java object silently participates in, and that - when you get them wrong - produce some of the most baffling bugs in the language: objects that vanish from a HashSet, map lookups that fail for keys you just put in, sort orders that throw exceptions. By the end you'll understand equals, hashCode, and compareTo deeply enough to implement them correctly by hand and to know when records do it for you.

Why this matters more than it looks

Here's a bug that has cost real engineers real hours:

class Point {
    int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

var seen = new HashSet<Point>();
seen.add(new Point(1, 2));
System.out.println(seen.contains(new Point(1, 2)));   // false (!!)

You added (1, 2). You asked if (1, 2) is there. It says no. The set appears broken. It isn't - you are, because Point never told Java what "equal" means. This session is about never writing that bug.

Reference equality vs logical equality

There are two completely different questions you can ask about two objects:

  1. Are they the same object in memory? That's ==. It compares references - the addresses, essentially.
  2. Do they represent the same value? That's .equals(). It compares meaning.

String a = new String("hi");
String b = new String("hi");

System.out.println(a == b);        // false - two different objects
System.out.println(a.equals(b));   // true  - same characters

The default equals() that every object inherits from Object just does ==:

// Object's default - only true for the literal same object
public boolean equals(Object obj) {
    return this == obj;
}

So unless you override equals(), "logical equality" is "same object" - which is why our Point lookup failed. The new Point(1,2) you searched for was a different object from the one you added, and the inherited equals only matches identical objects.

The rule: for any class whose instances represent a value (a point, a money amount, a date, a name) - where two instances with the same contents should be treated as equal - you must override equals(). For classes that represent a unique entity with identity (a database connection, a running thread, a service), the default reference equality is correct; leave it alone.

Implementing equals correctly

equals has a precise contract from the Object Javadoc. It must be:

  • Reflexive: x.equals(x) is always true.
  • Symmetric: x.equals(y) is true if and only if y.equals(x) is true.
  • Transitive: if x.equals(y) and y.equals(z), then x.equals(z).
  • Consistent: repeated calls return the same result (as long as nothing changes).
  • Non-null: x.equals(null) is always false.

Break any of these and the collections that rely on equals (HashSet, HashMap, List.contains, List.indexOf) misbehave in ways that are very hard to debug.
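To make the rules concrete, here is a minimal sketch of a spot-checker for some of them. EqualsContractCheck and LooseName are hypothetical names invented for this illustration, and the helper can only catch violations among the sample objects you hand it; it proves nothing about an implementation in general.

```java
// Hypothetical helper: spot-checks parts of the equals contract on sample values.
final class EqualsContractCheck {

    static boolean reflexive(Object x)           { return x.equals(x); }
    static boolean symmetric(Object x, Object y) { return x.equals(y) == y.equals(x); }
    static boolean rejectsNull(Object x)         { return !x.equals(null); }

    static boolean transitive(Object x, Object y, Object z) {
        // Only a violation when x equals y and y equals z, but x does not equal z.
        return !(x.equals(y) && y.equals(z)) || x.equals(z);
    }

    // Deliberately broken: tries to be "friendly" by also equalling a raw String.
    static final class LooseName {
        final String value;
        LooseName(String value) { this.value = value; }

        @Override public boolean equals(Object o) {
            if (o instanceof String) return value.equals(o);   // the source of the asymmetry
            if (o instanceof LooseName) return value.equals(((LooseName) o).value);
            return false;
        }

        @Override public int hashCode() { return value.hashCode(); }
    }

    public static void main(String[] args) {
        LooseName name = new LooseName("ada");
        System.out.println(reflexive(name));          // true
        System.out.println(rejectsNull(name));        // true
        System.out.println(symmetric(name, "ada"));   // false: "ada".equals(name) is false
    }
}
```

LooseName passes the reflexive and non-null checks and still breaks symmetry, which is typical: a broken equals usually looks fine until the two operands swap sides.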

The canonical, contract-correct implementation:

@Override
public boolean equals(Object o) {
    if (this == o) return true;            // fast path: same object
    if (o == null || getClass() != o.getClass()) return false;  // null + type check
    Point other = (Point) o;               // safe cast now
    return x == other.x && y == other.y;   // field-by-field comparison
}

Walk every line, because each guards a clause of the contract:

  • if (this == o) return true; - reflexive and a performance shortcut.
  • if (o == null || getClass() != o.getClass()) return false; - handles the non-null rule and rejects different types. (Using getClass() keeps symmetry airtight; instanceof can break symmetry across subclasses - more in the Q&A.)
  • Point other = (Point) o; - now safe because we verified the type.
  • return x == other.x && y == other.y; - compare the fields that define value. For object fields, use Objects.equals(this.field, other.field) (it null-checks for you); for double and float fields, use Double.compare and Float.compare to handle NaN and -0.0 consistently.
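The advice about object and floating-point fields can be sketched like this. Reading is a hypothetical class invented for the example, with one nullable String field and one double field.

```java
import java.util.Objects;

// Hypothetical value class with a nullable object field and a double field.
final class Reading {
    private final String sensor;   // may be null
    private final double value;

    Reading(String sensor, double value) {
        this.sensor = sensor;
        this.value = value;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Reading other = (Reading) o;
        return Objects.equals(sensor, other.sensor)        // null-safe comparison
            && Double.compare(value, other.value) == 0;    // NaN matches NaN; 0.0 differs from -0.0
    }

    @Override public int hashCode() {
        return Objects.hash(sensor, value);                // same two fields as equals
    }

    public static void main(String[] args) {
        Reading nan1 = new Reading(null, Double.NaN);
        Reading nan2 = new Reading(null, Double.NaN);
        System.out.println(Double.NaN == Double.NaN);                                 // false
        System.out.println(nan1.equals(nan2));                                        // true
        System.out.println(new Reading("t1", 0.0).equals(new Reading("t1", -0.0)));   // false
    }
}
```

With a raw == on the double field, a Reading holding NaN would not even equal itself, which breaks the reflexive rule.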

The hashCode contract - and why it's coupled to equals

Now the part that bit our Point. Hash-based collections (HashMap, HashSet) don't compare every element with equals. That would be O(n) per lookup. Instead they:

  1. Call hashCode() to get an int, and use it to pick a bucket (a slot).
  2. Only compare with equals against the (few) items already in that bucket.

This is what makes HashMap lookups O(1) on average. But it means the system only works if equal objects land in the same bucket - which requires:

If a.equals(b) is true, then a.hashCode() == b.hashCode() must also be true.

This is the single most important coupling in Java. Our Point overrode neither method, so the two points with the same coordinates each got the default hashCode (based on object identity), almost certainly landed in different buckets, and the set never even compared them with equals. (Had it compared them, the inherited equals would still have said no.) The lookup failed before equality was ever checked.
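The two-step lookup can be sketched with a toy set. TinyHashSet is a hypothetical teaching class, not how java.util.HashMap is written: the real thing also spreads the hash bits, resizes, and handles long buckets specially. The equalsCalls counter exists only to make the mechanism visible.

```java
import java.util.ArrayList;
import java.util.List;

// Teaching sketch of a hash set with a fixed number of buckets.
final class TinyHashSet<E> {
    private final List<List<E>> buckets = new ArrayList<>();
    int equalsCalls = 0;   // exposed only so the mechanism is observable

    TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) buckets.add(new ArrayList<>());
    }

    private List<E> bucketFor(Object o) {
        // Step 1: hashCode picks the bucket.
        return buckets.get(Math.floorMod(o.hashCode(), buckets.size()));
    }

    boolean add(E e) {
        if (contains(e)) return false;
        bucketFor(e).add(e);
        return true;
    }

    boolean contains(Object o) {
        // Step 2: equals runs only against the entries in that one bucket.
        for (E candidate : bucketFor(o)) {
            equalsCalls++;
            if (candidate.equals(o)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TinyHashSet<String> set = new TinyHashSet<>(16);
        for (String s : List.of("a", "b", "c", "d")) set.add(s);
        set.equalsCalls = 0;
        System.out.println(set.contains("c"));   // true
        System.out.println(set.equalsCalls);     // 1: only the bucket for "c" was searched
    }
}
```

If two equal objects returned different hash codes here, bucketFor would send the lookup to a bucket the stored object is not in, and equals would never get a chance to say yes.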

The full hashCode contract:

  • Consistent: same object returns the same hash across calls (while unchanged).
  • Equal ⇒ same hash: equal objects must have equal hash codes. (The load-bearing rule.)
  • Unequal may share a hash (a "collision") - allowed, just slower. Good hash codes spread values out to minimize collisions.

The correct Point.hashCode:

@Override
public int hashCode() {
    return Objects.hash(x, y);    // combines the same fields equals() uses
}

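Objects.hash(...) takes the fields and combines them into a reasonably well-distributed int. Hash the same fields you compare in equals: if equals uses x and y, hashCode uses x and y. Hashing a field that equals ignores breaks the contract, because two equal objects can then produce different hashes. Hashing fewer fields than equals compares is legal, but it causes more collisions.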
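Here is a sketch of the mismatch in its harmful direction, where hashCode reads a field that equals ignores. Label and its revision field are hypothetical, made up for this example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical class with a deliberate bug: hashCode uses a field equals ignores.
final class Label {
    final String text;
    final int revision;   // NOT part of equality

    Label(String text, int revision) {
        this.text = text;
        this.revision = revision;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        return text.equals(((Label) o).text);
    }

    @Override public int hashCode() {
        return Objects.hash(text, revision);   // BUG: revision is not compared in equals
    }

    public static void main(String[] args) {
        Label a = new Label("urgent", 1);
        Label b = new Label("urgent", 2);
        Set<Label> labels = new HashSet<>();
        labels.add(a);
        System.out.println(a.equals(b));                    // true
        System.out.println(a.hashCode() == b.hashCode());   // false: the contract is broken
        System.out.println(labels.contains(b));             // false on OpenJDK: hashes are compared first
    }
}
```

The fix is to drop revision from hashCode, or to add it to equals if it really is part of the value.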

Now the fix in full:

class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override public int hashCode() {
        return Objects.hash(x, y);
    }
}

var seen = new HashSet<Point>();
seen.add(new Point(1, 2));
System.out.println(seen.contains(new Point(1, 2)));   // true - fixed

Iron law: override equals and hashCode together, always. Never one without the other. Most IDEs and static-analysis tools flag a class that overrides only one. The moment you write one, write the other from the same fields.

compareTo - ordering

equals answers "are these equal?" compareTo answers "which comes first?" A class implements Comparable<T> to define a natural order - used by Collections.sort, TreeSet, TreeMap, and list.sort(null).

class Version implements Comparable<Version> {
    final int major, minor, patch;
    Version(int major, int minor, int patch) {
        this.major = major; this.minor = minor; this.patch = patch;
    }

    @Override public int compareTo(Version o) {
        // Returns negative if this < o, 0 if equal, positive if this > o.
        return Comparator.comparingInt((Version v) -> v.major)
                .thenComparingInt(v -> v.minor)
                .thenComparingInt(v -> v.patch)
                .compare(this, o);
    }
}

var versions = new ArrayList<>(List.of(
    new Version(1, 2, 0), new Version(1, 0, 5), new Version(2, 0, 0)));
Collections.sort(versions);   // uses compareTo: 1.0.5, 1.2.0, 2.0.0

The contract: compareTo returns a negative int, zero, or a positive int for less-than, equal, greater-than. It must be a consistent total order (antisymmetric, transitive). And there's a strong recommendation: compareTo should be consistent with equals - x.compareTo(y) == 0 should usually mean x.equals(y). Violate this and TreeSet/TreeMap (which use compareTo, not equals, to decide membership) behave differently from HashSet/HashMap, which surprises everyone.
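The standard library has a well-known case where compareTo and equals disagree: BigDecimal. Its equals treats 1.0 and 1.00 as different because their scales differ, while compareTo treats them as the same number. The sketch below uses a hypothetical ConsistencyDemo class to show how the two set families diverge.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

final class ConsistencyDemo {

    static int hashSetSize() {
        Set<BigDecimal> set = new HashSet<>();   // membership decided by equals/hashCode
        set.add(new BigDecimal("1.0"));
        set.add(new BigDecimal("1.00"));
        return set.size();
    }

    static int treeSetSize() {
        Set<BigDecimal> set = new TreeSet<>();   // membership decided by compareTo
        set.add(new BigDecimal("1.0"));
        set.add(new BigDecimal("1.00"));
        return set.size();
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");
        System.out.println(a.equals(b));      // false: different scale
        System.out.println(a.compareTo(b));   // 0: same numeric value
        System.out.println(hashSetSize());    // 2
        System.out.println(treeSetSize());    // 1
    }
}
```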

When you don't control the class or want a one-off order, use a Comparator instead of Comparable:

versions.sort(Comparator.comparingInt((Version v) -> v.patch));   // sort by patch only
versions.sort(Comparator.comparing((Version v) -> v.major).reversed());

Comparable is the one natural order baked into the type; Comparator is any number of external orders. Prefer Comparator for alternative sorts; reserve Comparable for the single most obvious "default" order (or omit it if there's no obvious default).

Immutability - the quiet superpower

Notice the fixed Point used final int x, y. That's deliberate. An immutable object can't change after construction. This matters enormously for the contracts above and for concurrency (chapters 09-11).

Why immutability is a default worth reaching for:

  1. Hash keys must not change. If you put a mutable object in a HashSet, then mutate a field used by hashCode, the object is now in the wrong bucket - lost. Immutable keys can't have this bug.
  2. Thread safety for free. An object that never changes can be shared across threads with zero synchronization. No races possible on data that doesn't mutate. (This is why chapters 09-11 keep coming back to immutability.)
  3. Easier reasoning. You can pass an immutable object anywhere without fear someone will change it behind your back.
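The first reason is easy to see in a few lines. MutableKeyDemo and MutablePoint are hypothetical names for this sketch; the class is mutable on purpose.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class MutableKeyDemo {

    // Deliberately mutable, for illustration only.
    static final class MutablePoint {
        int x, y;
        MutablePoint(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            MutablePoint p = (MutablePoint) o;
            return x == p.x && y == p.y;
        }

        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    public static void main(String[] args) {
        Set<MutablePoint> seen = new HashSet<>();
        MutablePoint p = new MutablePoint(1, 2);
        seen.add(p);

        p.x = 5;   // the hash changes after insertion

        System.out.println(seen.contains(p));              // false: lookup uses the new hash
        System.out.println(seen.size());                   // 1: the object is still stored
        System.out.println(seen.iterator().next() == p);   // true: iteration still finds it
    }
}
```

Making x and y final turns the p.x = 5 line into a compile error, which is exactly the protection immutability buys.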

How to make a class immutable:

final class Money {                          // final: no subclass can add mutability
    private final long cents;                // final fields, set once
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    long cents()       { return cents; }     // getters only, no setters
    String currency()  { return currency; }

    // "Mutation" returns a NEW object instead of changing this one.
    Money plus(Money other) {
        if (!currency.equals(other.currency))
            throw new IllegalArgumentException("currency mismatch");
        return new Money(cents + other.cents, currency);
    }
}

The recipe: final class, all fields private final, no setters, and any "change" returns a new instance. String, Integer, and LocalDate are all immutable this way (BigDecimal follows the same pattern, except that the class itself is not declared final) - which is why String methods like toUpperCase() return a new string instead of changing the original.

The defensive-copy trap

Immutability has a hole: if a field is itself a mutable object (a List, an array, a Date), storing or returning it directly leaks mutability.

final class Team {
    private final List<String> members;

    // BROKEN: caller keeps a reference to the same list and can mutate it.
    Team(List<String> members) {
        this.members = members;
    }
    List<String> members() {
        return members;                       // caller can do members().add(...)!
    }
}

Fix with defensive copies on the way in and out (or unmodifiable wrappers):

final class Team {
    private final List<String> members;

    Team(List<String> members) {
        this.members = List.copyOf(members);  // copy in - our list is independent
    }
    List<String> members() {
        return members;                       // List.copyOf already made it unmodifiable
    }
}

List.copyOf creates an independent, unmodifiable list. Now neither the original caller's list nor anything they get back can corrupt the Team. The same applies to arrays (array.clone()), maps (Map.copyOf), and any mutable field.
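Arrays need the copy in both directions, because there is no unmodifiable array to hand out. A minimal sketch, using a hypothetical Scores class:

```java
import java.util.Arrays;

// Hypothetical class holding an array, copied on the way in and on the way out.
final class Scores {
    private final int[] values;

    Scores(int[] values) {
        this.values = values.clone();   // copy in: later changes to the caller's array don't reach us
    }

    int[] values() {
        return values.clone();          // copy out: the caller gets a private copy every time
    }

    public static void main(String[] args) {
        int[] raw = {90, 80};
        Scores scores = new Scores(raw);

        raw[0] = 0;                     // caller mutates their own array
        scores.values()[0] = -1;        // caller mutates a returned copy

        System.out.println(Arrays.toString(scores.values()));   // [90, 80]
    }
}
```

The cost is one allocation per call to values(); for large arrays on hot paths, exposing an unmodifiable List view instead may be a better trade.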

Records: the contracts, for free

Here's the relief: for the common case of a value type, records implement all of this correctly for you. You met records in From Scratch chapter 08; now you can appreciate what they do.

record Point(int x, int y) {}

That one line generates:

  • A constructor.
  • x() and y() accessors.
  • A correct equals comparing both fields.
  • A correct hashCode from both fields (consistent with equals).
  • A readable toString.

The Point bug we spent this chapter fixing simply cannot happen with a record - the generated equals/hashCode are correct and coupled by construction. Records are also implicitly final and their fields are final, so they're shallowly immutable: a component can never be reassigned, but a mutable component such as a List can still be changed through its own methods. A record holding a List therefore still needs a defensive copy in its compact constructor:

record Team(List<String> members) {
    Team {                                    // compact constructor
        members = List.copyOf(members);       // defensive copy still needed
    }
}
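To check those claims, here is a small sketch that nests both records inside a hypothetical RecordDemo class so it runs as one file. It assumes Java 16 or later, where records are a standard feature.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RecordDemo {

    record Point(int x, int y) {}

    record Team(List<String> members) {
        Team {
            members = List.copyOf(members);   // defensive copy in the compact constructor
        }
    }

    public static void main(String[] args) {
        Set<Point> seen = new HashSet<>();
        seen.add(new Point(1, 2));
        System.out.println(seen.contains(new Point(1, 2)));   // true: equals and hashCode are generated
        System.out.println(new Point(1, 2));                  // Point[x=1, y=2]

        List<String> names = new ArrayList<>(List.of("ana", "bo"));
        Team team = new Team(names);
        names.add("intruder");                                // mutate the caller's list afterwards
        System.out.println(team.members());                   // [ana, bo]
    }
}
```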

The practical guidance: for value types, reach for a record first. You get the contracts right automatically. Write equals/hashCode by hand only when you can't use a record (you need mutability, or you extend a class, or you need custom equality semantics like case-insensitive comparison). Knowing how to do it by hand matters because you'll read and maintain pre-record code constantly - but for new value types, records are the answer.

Try it

  1. Reproduce the bug. Write Point with no equals/hashCode. Add one to a HashSet, search for an equal one, watch contains return false. Then add correct equals/hashCode and watch it return true. Feel the cause.

  2. Break the coupling on purpose. Override equals correctly but make hashCode return a constant (return 1;). Does the HashSet work? (It does - but every element collides into one bucket, making lookups O(n). Correct but slow.) Now make hashCode return x only while equals uses x and y. Add (1,2) and (1,3), then search for (1,2). Reason about what happens.

  3. The mutation-in-a-set disaster. Make Point mutable (non-final fields, a setter). Put one in a HashSet. Mutate its x. Now call contains with an equal point, then iterate the set looking for it. contains returns false, yet iteration still finds the object: it is "in" the set but unfindable by lookup, because it sits in the bucket chosen by its old hash. This is the argument for immutable keys.

  4. Order it. Implement Version implements Comparable<Version>. Sort a list. Then sort the same list by patch-number-descending using a Comparator. Notice Comparable is the one natural order; Comparator is for everything else.

  5. Records vs hand-written. Write Money as a hand-written immutable class with equals/hashCode, then as a record. Put both in a HashMap as keys and look them up. Confirm both work. Count the lines you saved.

  6. Defensive copy. Write the broken Team (stores the list directly). Construct one, then team.members().add("intruder") (or mutate the original list you passed in). Watch the "immutable" team change. Fix with List.copyOf. Confirm the mutation no longer leaks.

What you might wonder

"instanceof vs getClass() in equals - which?" getClass() requires both objects to be the exact same class, which keeps symmetry airtight even across subclasses. instanceof allows subclass instances to equal superclass instances, which can break symmetry (a ColorPoint might equal a Point but not vice versa). For most value types, getClass() is the safe default. Records use exact-class matching internally. There are advanced patterns using instanceof for class hierarchies, but they require care; default to getClass().

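The symmetry break with instanceof looks like this in practice. The sketch nests hypothetical Point and ColorPoint classes inside SymmetryDemo; here the asymmetry runs from the superclass side, but the failure is the same either way.

```java
final class SymmetryDemo {

    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;        // accepts any subclass of Point
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override public int hashCode() { return 31 * x + y; }
    }

    static class ColorPoint extends Point {
        final String color;
        ColorPoint(int x, int y, String color) { super(x, y); this.color = color; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof ColorPoint)) return false;   // rejects a plain Point
            return super.equals(o) && color.equals(((ColorPoint) o).color);
        }
        // hashCode is inherited; equal ColorPoints share x and y, so it stays consistent.
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        ColorPoint cp = new ColorPoint(1, 2, "red");
        System.out.println(p.equals(cp));   // true:  Point only looks at x and y
        System.out.println(cp.equals(p));   // false: ColorPoint wants a ColorPoint
    }
}
```

With getClass() in both classes, both calls return false and symmetry holds.
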
"Do I really need to memorize the equals contract?" You don't write equals by hand often once you have records. But you read hand-written ones constantly, and you need to recognize when one is broken. Know the five rules well enough to spot a violation in code review - that's the working level.

"Why is Objects.hash better than writing my own?" You can write int result = 31 * x + y; - the classic formula. Objects.hash does essentially that for any number of fields, correctly and readably, and handles nulls. For hot paths where the varargs array allocation of Objects.hash matters, hand-write the 31 * formula; otherwise Objects.hash is the clear, correct default.

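For two int fields, the relationship can be checked directly. HashFormulaDemo is a hypothetical class for this sketch. Note that Objects.hash starts its accumulator at 1, so its result differs from the bare 31 * x + y by a constant; the distribution is the same.

```java
import java.util.Objects;

final class HashFormulaDemo {

    // The hand-written 31-multiplier formula, seeded with 1 as Objects.hash does.
    static int handWritten(int x, int y) {
        int result = 1;
        result = 31 * result + x;
        result = 31 * result + y;
        return result;
    }

    // Boxes x and y into Integers and allocates a varargs array on each call.
    static int viaObjectsHash(int x, int y) {
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        System.out.println(handWritten(1, 2));      // 994
        System.out.println(viaObjectsHash(1, 2));   // 994
    }
}
```
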
"Is everything supposed to be immutable now?" Default to immutable; allow mutability when you have a reason (a builder accumulating state, a large object you mutate in place for performance, an entity whose identity persists while its fields change). The guidance is "immutable unless you need otherwise," not "immutable always." Records make immutable the path of least resistance, which is the point.

"What about compareTo returning this.x - other.x?" A classic bug. Subtraction can overflow (Integer.MIN_VALUE - 1 wraps to a positive number, inverting the order). Always use Integer.compare(a, b) or Comparator builders, never raw subtraction, for compareTo.

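The overflow is easy to reproduce. OverflowDemo and its two helper methods are hypothetical names for this sketch.

```java
final class OverflowDemo {

    // BROKEN: the difference can overflow and flip sign.
    static int bySubtraction(int a, int b) { return a - b; }

    // Correct: never overflows.
    static int byCompare(int a, int b) { return Integer.compare(a, b); }

    public static void main(String[] args) {
        int a = Integer.MIN_VALUE;
        int b = 1;
        System.out.println(bySubtraction(a, b) > 0);   // true: wrongly claims MIN_VALUE > 1
        System.out.println(byCompare(a, b) < 0);       // true: MIN_VALUE < 1, as expected
    }
}
```
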
Done

  • You know reference equality (==) vs logical equality (.equals()), and when each is correct.
  • You can implement equals against its five-rule contract.
  • You understand why hashCode is coupled to equals - and the bucket mechanism that makes the coupling load-bearing.
  • You can implement compareTo/Comparator for ordering, and avoid the subtraction-overflow trap.
  • You know why immutability protects the contracts and enables thread safety, how to build immutable classes, and the defensive-copy trap.
  • You know records implement all of it correctly for free - reach for them first.

Next: generics in depth - bounded types, wildcards, and the truth about type erasure.
