
09 - Concurrency I: threads and the three problems

What this session is

About two hours - this is the most important chapter in the path, and the hardest. Almost every real Java program runs concurrently: a web server handles many requests at once, each on its own thread. The moment two threads touch the same data, a category of bug appears that you've never had to think about before - bugs that pass every test, work on your machine, and corrupt data randomly in production. This chapter is about seeing those bugs. Chapters 10 and 11 are about fixing them. Do every exercise here by actually running the code - concurrency cannot be learned by reading.

What a thread is

A thread is an independent path of execution within your program. Your main method runs on a thread. You can start more, and they run concurrently - the operating system (and the JVM) interleave them, giving each slices of CPU time, possibly truly in parallel on multiple cores.

public class Hello {
    public static void main(String[] args) {
        Thread t = new Thread(() -> {
            System.out.println("hello from another thread");
        });
        t.start();                          // starts the thread - runs concurrently with main
        System.out.println("hello from main");
        // The two prints can appear in either order - they run concurrently.
    }
}

new Thread(runnable) creates a thread; .start() runs its Runnable on a new thread of execution. (Calling .run() directly would not start a thread - it'd just run the code on the current thread. Always .start().)
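To see the difference between .run() and .start() for yourself, here is a small sketch. The class name StartVsRun, the helper executingThread, and the thread name "demo-worker" are made up for this illustration. It uses join() (covered in the next section) only to wait for the started thread before reading its result.

```java
class StartVsRun {
    // Runs the same Runnable via run() or start() and reports which thread executed it.
    static String executingThread(boolean useStart) throws InterruptedException {
        String[] seen = new String[1];      // slot the Runnable writes its thread name into
        Thread t = new Thread(() -> seen[0] = Thread.currentThread().getName(), "demo-worker");
        if (useStart) {
            t.start();                      // new thread of execution
            t.join();                       // wait for it, so seen[0] is visible here
        } else {
            t.run();                        // plain method call on the calling thread
        }
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("run():   " + executingThread(false));  // name of the calling thread
        System.out.println("start(): " + executingThread(true));   // demo-worker
    }
}
```

With run(), the name reported is that of whichever thread made the call; only start() gets you "demo-worker".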

Why threads exist: to do things at the same time. Wait for a slow network call on one thread while computing on another. Handle thousands of simultaneous web requests. Use all your CPU cores for a parallel computation. Concurrency is how software stays responsive and uses modern hardware.

Joining and the basics

join() waits for a thread to finish:

Thread worker = new Thread(() -> {
    // do some work
});
worker.start();
worker.join();    // main blocks here until worker finishes (throws InterruptedException)
System.out.println("worker done");

A few essentials:

Thread.currentThread().getName();       // who am I
Thread.sleep(1000);                     // pause this thread 1 second (throws InterruptedException)
t.isAlive();                            // is it still running
t.interrupt();                          // request cancellation (cooperative - chapter 11)
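A minimal sketch that puts those four calls together, assuming nothing beyond the standard Thread API. The class name ThreadBasics and the helper aliveBeforeAndAfter are invented for the example, and the one-minute sleep is arbitrary.

```java
class ThreadBasics {
    // Returns { alive right after start(), alive after interrupt() + join() }.
    static boolean[] aliveBeforeAndAfter() throws InterruptedException {
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000);               // would sleep a minute if left alone
            } catch (InterruptedException e) {
                // interrupt() woke us early; returning from the lambda ends the thread
            }
        }, "sleeper");
        sleeper.start();
        boolean before = sleeper.isAlive();         // started and not finished yet
        sleeper.interrupt();                        // request cancellation
        sleeper.join();                             // wait until it has really finished
        boolean after = sleeper.isAlive();
        return new boolean[] { before, after };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] alive = aliveBeforeAndAfter();
        System.out.println(Thread.currentThread().getName() + " saw: before=" + alive[0] + ", after=" + alive[1]);
    }
}
```

Note that the sleeper only stops because its code reacts to the interrupt. That is what "cooperative" means here.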

That's the mechanics. Now the danger.

The first problem: race conditions

Here is the bug that defines concurrency. Two threads incrementing a shared counter:

public class RaceDemo {
    static int counter = 0;             // shared mutable state

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter++;              // looks atomic. IS NOT.
            }
        };

        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join();  t2.join();

        System.out.println(counter);    // EXPECTED 200000. ACTUAL: something less, varies each run.
    }
}

Run it. You'll almost never get 200000 - you'll get something like 137492 or 156010, a different wrong number each run. The counter lost increments. Why?

counter++ is not one operation. It's three:

  1. Read counter from memory into the CPU.
  2. Add 1.
  3. Write the result back to memory.

Now interleave two threads:

Thread 1: read counter (it's 5)
Thread 2: read counter (also 5)        <- both saw 5
Thread 1: add 1 -> 6, write 6
Thread 2: add 1 -> 6, write 6          <- overwrites! Two increments, but counter only went 5 -> 6

Two increments happened; the counter advanced by one. One was lost. This is a race condition: the result depends on the timing of how operations interleave, which is nondeterministic. Run it again and again and you'll keep getting different wrong answers.
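The interleaving above can be replayed by hand on a single thread, which makes the lost update deterministic. This is only a sketch of that one schedule, not a real race: the class LostUpdateReplay and the variables t1Seen and t2Seen are hypothetical stand-ins for each thread's private copy of the value.

```java
class LostUpdateReplay {
    static int counter;

    // Performs the four steps of the interleaving in exactly the order shown above.
    static int replay() {
        counter = 5;
        int t1Seen = counter;       // Thread 1: read counter (it's 5)
        int t2Seen = counter;       // Thread 2: read counter (also 5)
        counter = t1Seen + 1;       // Thread 1: add 1, write 6
        counter = t2Seen + 1;       // Thread 2: add 1, write 6 (overwrites Thread 1's write)
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(replay());   // prints 6, not 7
    }
}
```

In the real RaceDemo the scheduler picks the order for you, which is why the damage varies from run to run.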

The general definition: a race condition is when the correctness of your program depends on the relative timing of threads. Any time two threads access the same mutable data and at least one writes, without coordination, you have one (the Java Memory Model calls this specific case a data race).

The cruelty: it usually works in testing. With light load, the threads happen not to collide. Then production hits it with real concurrency and the data corrupts intermittently, unreproducibly. This is why concurrency bugs are the most feared kind.

The second problem: visibility

Even simpler than a race, and more insidious. One thread sets a flag; another never sees it.

public class VisibilityDemo {
    static boolean running = true;       // shared flag

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            int count = 0;
            while (running) {             // spin until told to stop
                count++;
            }
            System.out.println("stopped after " + count);
        });
        worker.start();

        Thread.sleep(100);
        running = false;                 // tell it to stop
        System.out.println("told it to stop");
        // worker may NEVER stop - it may never see running == false.
    }
}

You set running = false. Common sense says the worker's loop ends. But it may spin forever. Why?

Each thread can cache values in CPU registers or per-core caches for speed. The worker thread may have cached running == true and never re-read it from main memory. The compiler and CPU are also free to reorder and optimize reads of a variable they don't know another thread is changing. Without explicit coordination, there is no guarantee that one thread's write to a variable ever becomes visible to another thread.

This is the visibility problem: a write by one thread may never be seen by another. It's not about timing of interleaving (like a race) - it's that the update might not propagate at all. And like races, it often "works" in testing and hangs in production on a different CPU.

The third problem: ordering / reordering

The subtlest. Compilers and CPUs reorder instructions for performance, as long as the result looks the same to a single thread. Across threads, that reordering becomes visible and breaks assumptions.

// Thread A
data = compute();      // (1)
ready = true;          // (2)

// Thread B
if (ready) {           // (3)
    use(data);         // (4) - might see ready==true but data not yet set!
}

You wrote data then ready. But the compiler/CPU may reorder A's two writes (they're independent from A's single-threaded view). Thread B can then observe ready == true while data still holds its old value, and use stale or uninitialized data. Single-threaded, the reorder is invisible and legal. Across threads, it's a bug.

These three - races (timing of interleaving), visibility (updates not propagating), ordering (reordering across threads) - are the core hazards of sharing mutable state between threads. Every concurrency tool in chapters 10 and 11 exists to control them.

The Java Memory Model: the rules of visibility and ordering

How do you get guarantees about visibility and ordering? The Java Memory Model (JMM) defines them through a relation called happens-before. If action X happens-before action Y, then X's effects (including all its memory writes) are guaranteed visible to Y, and X is guaranteed to appear to occur before Y.

Without a happens-before relationship between two threads' actions, you have no guarantee about visibility or ordering between them - that's exactly the bugs above.

The happens-before edges you'll rely on (chapters 10-11 are built on these):

  • Monitor lock: releasing a lock (synchronized block exit) happens-before any subsequent acquisition of the same lock. Everything done before the release is visible after the next acquire.
  • volatile: a write to a volatile field happens-before every subsequent read of that field. (More below.)
  • Thread start: thread.start() happens-before everything the started thread does.
  • Thread join: everything a thread does happens-before another thread returning from thread.join() on it. (This is why the counter in the race demo was at least visible after join - the join gives a happens-before edge for visibility, even though the race already corrupted the value.)
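The start and join edges are the easiest to try out, because they need no extra keywords. A minimal sketch, assuming two plain static fields named input and output (both names, and the class StartJoinEdges, are invented here):

```java
class StartJoinEdges {
    static int input;       // plain fields: no volatile, no locks
    static int output;

    static int roundTrip(int value) throws InterruptedException {
        input = value;                                          // written before start()
        Thread worker = new Thread(() -> output = input * 2);   // start() edge: worker sees input
        worker.start();
        worker.join();                                          // join() edge: we see output
        return output;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(roundTrip(21));   // prints 42
    }
}
```

This is safe only because no two threads touch the fields at the same time: main writes before start(), the worker runs, and main reads after join().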

The practical rule: to safely share mutable data between threads, you must establish a happens-before relationship - via a lock, a volatile, or a higher-level concurrent tool. Plain reads and writes of shared fields give you nothing.

volatile: the visibility fix (only)

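The lightest tool. Marking a field volatile guarantees visibility and ordering around that field: every write to it is visible to any other thread's subsequent read of it, and the compiler and CPU may not move a thread's earlier writes past a volatile write (or its later reads ahead of a volatile read). So whatever a thread wrote before the volatile write is also visible to a thread that reads the new value.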

The visibility demo, fixed:

static volatile boolean running = true;   // <- volatile

That one keyword fixes the infinite-spin bug. Now the worker is guaranteed to see running = false. volatile is the right tool for flags and one-way state changes that one thread writes and others read.
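For reference, here is one way the whole fixed program could look. It is a sketch rather than the canonical version: the class name VisibilityFixed, the helper workerStopsWithin, the daemon flag, and the timeout are additions made so the result can be checked instead of just watched.

```java
class VisibilityFixed {
    static volatile boolean running = true;     // volatile: the worker must see our write

    // Returns true if the worker noticed running == false and finished in time.
    static boolean workerStopsWithin(long timeoutMillis) throws InterruptedException {
        running = true;
        Thread worker = new Thread(() -> {
            long count = 0;
            while (running) {                   // each iteration re-reads the volatile field
                count++;
            }
        });
        worker.setDaemon(true);                 // a stuck worker can't keep the JVM alive
        worker.start();

        Thread.sleep(100);
        running = false;                        // tell it to stop
        worker.join(timeoutMillis);             // wait, but not forever
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stopped: " + workerStopsWithin(5_000));
    }
}
```

Remove the volatile keyword and the same method may return false on your machine, which is the original bug.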

But - and this is critical - volatile does NOT make compound operations atomic. It does not fix the counter race:

static volatile int counter = 0;
counter++;    // STILL a race! volatile makes the read and the write each visible,
              // but the read-add-write sequence can still interleave.

volatile guarantees you see the latest value on each read - but counter++ still does read-add-write as three steps, and another thread can slip between them. volatile solves visibility/ordering, not atomicity. The counter needs a lock or an atomic (chapter 10).

The dividing line:

  • One thread writes, others read a simple flag/value? volatile is enough.
  • Multiple threads read-modify-write the same data (++, check-then-act, etc.)? volatile is NOT enough - you need synchronization (chapter 10).
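The single-writer case also covers the data/ready example from the ordering section. A hedged sketch, assuming ready is made volatile while data stays a plain field (the class SafePublish and the method publishAndRead are hypothetical names):

```java
class SafePublish {
    static int data;                        // plain field, published through the flag
    static volatile boolean ready;          // volatile flag: one writer, one reader

    static int publishAndRead() throws InterruptedException {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the volatile read returns true
            }
            seen[0] = data;                 // happens after the writer's data = 42
        });
        reader.start();

        data = 42;                          // (1) plain write
        ready = true;                       // (2) volatile write: (1) can't be moved after it
        reader.join();
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(publishAndRead());   // prints 42
    }
}
```

The volatile write to ready happens-before the read that sees true, so the reader can't observe ready == true with the old data. With a plain boolean there would be no such guarantee.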

What's safe to share without any of this

Two categories of data are safe to share across threads with no synchronization at all:

  1. Immutable objects (chapter 03). If an object never changes after construction, there's nothing to race on, nothing to go stale - threads can read it freely. This is why immutability keeps coming up: it's the simplest path to thread safety. A record, a String, a LocalDate - share them fearlessly.

  2. Thread-confined data. Data that only one thread ever touches (local variables, or data handed off cleanly) has no sharing, so no concurrency problem. Local variables live on the thread's own stack (chapter 08) - they're inherently thread-confined.
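Thread confinement is enough to get the counter right without volatile or locks. In this sketch each thread counts in a local variable and hands off one result through its own array slot. The class ConfinedCounter, the method countWithConfinement, and the partial array are illustrative names, not anything from earlier chapters.

```java
class ConfinedCounter {
    static int countWithConfinement(int threads, int perThread) throws InterruptedException {
        int[] partial = new int[threads];           // one slot per thread, never shared between workers
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int slot = t;
            workers[t] = new Thread(() -> {
                int local = 0;                      // local variable: confined to this thread
                for (int i = 0; i < perThread; i++) {
                    local++;
                }
                partial[slot] = local;              // single hand-off write to this thread's own slot
            });
            workers[t].start();
        }
        int total = 0;
        for (int t = 0; t < threads; t++) {
            workers[t].join();                      // join makes partial[t] visible to this thread
            total += partial[t];
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWithConfinement(2, 100_000));   // prints 200000
    }
}
```

No increment is ever lost, because no two threads write the same variable and main only reads a slot after joining its writer.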

The design lesson that runs through all of concurrency: the less mutable state you share, the fewer concurrency bugs you can have. Prefer immutability, prefer confining data to one thread, and synchronize only the shared mutable state that's genuinely necessary. The best concurrent code minimizes shared mutable state in the first place.

The closure-capture gotcha

A trap that ties back to chapter 07. A lambda passed to a thread can capture local variables, but only effectively final ones. This interacts with concurrency:

for (int i = 0; i < 5; i++) {
    final int id = i;                          // must copy to an effectively-final var
    new Thread(() -> System.out.println(id)).start();
}

You can't capture the loop variable i directly (it changes - not effectively final). More dangerously, if multiple threads capture and mutate the same shared object, you're back to a race. Captured references still point at shared mutable state - capturing doesn't make it safe.
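To make the last point concrete, this sketch shows that a lambda captures the reference, not a snapshot of the object. It is kept deterministic on purpose: the list is changed before start(), so there is no race here. The names CaptureShares, observe, shared, and seenByWorker are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class CaptureShares {
    static List<String> observe() throws InterruptedException {
        List<String> shared = new ArrayList<>();        // the reference is effectively final
        List<String> seenByWorker = new ArrayList<>();
        Thread worker = new Thread(() -> seenByWorker.addAll(shared));  // captures the reference only

        shared.add("added after the lambda was created");   // mutate the object, not the variable
        worker.start();                                 // start() makes the change visible to worker
        worker.join();
        return seenByWorker;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observe().size());           // prints 1
    }
}
```

The worker sees the element that was added after the lambda was written. If main kept adding to shared after start(), both threads would be touching an unsynchronized ArrayList at once, and that is a race.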

Try it

These exercises are the chapter. Run each and watch the behavior - you must see concurrency bugs to understand them.

  1. Reproduce the race. Run RaceDemo exactly as written. Run it ten times. Record the outputs - they should differ from run to run and almost always come in under 200000. Now increase the loop to 1,000,000 and use 4 threads. The loss gets worse. You're watching lost updates in real time.

  2. Reproduce the visibility hang. Run VisibilityDemo. On many JVMs/CPUs the worker spins forever (you'll have to kill it) - it never sees running = false. (If it happens to stop on your setup, run with java -server and a tight loop; the optimization that hides the write is more aggressive under server compilation.) Then add volatile to running and watch it stop reliably. You just fixed a visibility bug.

  3. Prove volatile doesn't fix the race. Take RaceDemo, make counter volatile, and run it. Still wrong - still less than 200000. This is the most important exercise in the chapter: volatile fixes visibility, not atomicity. Convince yourself by seeing it fail.

  4. The check-then-act race. Write two threads that both do: if (!map.containsKey(k)) map.put(k, expensiveCompute()); on a shared HashMap. Run it. You'll see expensiveCompute() run twice for the same key (both threads passed the check before either put), and possibly a corrupted map. This "check-then-act" race is everywhere in real code.

  5. Immutability is safe. Share an immutable record Point(int x, int y) across ten threads that all read it. No synchronization, no bug, ever - because there's nothing to race on. Contrast with sharing a mutable Point whose fields ten threads write. Feel why immutability is the easy path.

  6. join gives visibility. In RaceDemo, note that after t1.join() and t2.join(), main reads counter and sees a consistent (if wrong) value - the join provides the happens-before edge for visibility. Remove the joins and read counter immediately; now you might not even see the threads' work at all. Two different problems: the race corrupts the value; the missing join can hide it entirely.

What you might wonder

"If concurrency is this dangerous, why use threads at all?" Because the alternative - doing one thing at a time - wastes modern multi-core hardware and makes servers unable to handle concurrent users. The danger isn't threads; it's shared mutable state between threads. Minimize that (immutability, confinement) and synchronize the rest correctly (chapters 10-11), and concurrency is a powerful, manageable tool.

"Do these bugs really happen, or is this theoretical?" They happen constantly, and they're among the most expensive bugs in the industry - precisely because they hide in testing and surface randomly in production. The "works on my machine" that becomes a 2 AM incident is very often a race or visibility bug. This is why concurrency questions dominate senior interviews.

"Why doesn't counter++ just... work? What about other languages?" Essentially no mainstream language makes a plain counter++ atomic across threads by default - it's read-modify-write everywhere. Some give you atomic types; some (Rust) use the type system to prevent unsynchronized sharing at compile time. Java gives you the tools (chapters 10-11) but trusts you to use them. The JMM is Java's precise specification of what's guaranteed.

"Is volatile slow?" It has a cost (it prevents certain caching/reordering optimizations and may insert memory barriers), but it's much cheaper than a lock. Use it freely for the flag/single-writer case it's designed for. Just don't reach for it expecting atomicity it doesn't provide.

"What about the synchronized keyword - isn't that the answer?" Yes, for the atomicity problem volatile can't solve - that's chapter 10. synchronized gives both mutual exclusion (fixing races) and a happens-before edge (fixing visibility). It's the next chapter precisely because you need to feel the problems first.

"How do I even find a race condition?" You'll meet the tools in chapters 10 and 13. The short version: there's no keyword that finds them for you, but there are race detectors and stress-testing tools, careful code review for "shared mutable state without synchronization," and the discipline of asking "what if two threads ran this line at once?" of every shared field. That question is the single most valuable concurrency habit.

Done

  • You can create, start, and join threads.
  • You can see the three core problems: races (timing-dependent interleaving), visibility (writes not propagating), ordering (reordering across threads).
  • You understand counter++ is read-modify-write, and why that races.
  • You know the Java Memory Model gives guarantees only through happens-before edges (locks, volatile, start/join).
  • You know volatile fixes visibility/ordering for a single field but NOT atomicity of compound operations.
  • You know immutability and thread-confinement are the synchronization-free safe paths.

Next: the tools that fix races - synchronized, locks, atomics, and the thread-safe collections.

