11 - Concurrency III: executors, futures, and virtual threads

What this session is

About ninety minutes. Chapters 09-10 were about raw threads and locks - the foundations. This chapter is where modern Java actually does concurrency: you stop creating Thread objects and managing locks by hand, and instead submit tasks to an executor that manages a pool of threads for you, get futures representing results-to-come, compose async pipelines with CompletableFuture, and - since Java 21 - use virtual threads to write simple blocking code that scales to millions of concurrent operations. By the end you'll write concurrent Java the way production codebases do.

Why raw threads don't scale

Creating a Thread per task (chapter 09) has problems at scale:

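Threads are expensive. Each platform (OS) thread reserves its own stack, typically around 1 MB by default on 64-bit JVMs, and has real creation/teardown cost. Create 10,000 and you've reserved roughly 10 GB of address space for stacks and handed the OS scheduler 10,000 threads to juggle.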
Unbounded thread creation is a denial-of-service on yourself. A server that spawns a thread per request falls over under load.
No reuse. Creating and destroying a thread for each short task wastes the creation cost.

The fix is a thread pool: a fixed set of worker threads that pull tasks from a queue. You submit work; the pool runs it on an available worker. This decouples "how much work" from "how many threads," caps resource use, and reuses threads.

ExecutorService: submit tasks, not threads

ExecutorService is the standard thread-pool abstraction. You create one, submit tasks, shut it down.

import java.util.concurrent.*;

ExecutorService pool = Executors.newFixedThreadPool(4);   // 4 worker threads

// Submit a Runnable (no result):
pool.submit(() -> System.out.println("task ran on " + Thread.currentThread().getName()));

// Submit a Callable (returns a result) - get a Future back:
Future<Integer> future = pool.submit(() -> {
    Thread.sleep(100);
    return 42;
});

pool.shutdown();                          // stop accepting new tasks; finish queued ones
pool.awaitTermination(1, TimeUnit.MINUTES);  // wait for them to finish

The shift in thinking: you no longer say "run this on a new thread." You say "here is a task; run it whenever a worker is free." The pool handles thread lifecycle, reuse, and queueing.

The factory methods:

Executors.newFixedThreadPool(n)        // n threads, fixed. The common choice for CPU work.
Executors.newCachedThreadPool()        // grows/shrinks on demand, with no upper bound on threads. For many short-lived I/O tasks.
Executors.newSingleThreadExecutor()    // one thread - serializes tasks, useful for ordering
Executors.newScheduledThreadPool(n)    // for delayed/periodic tasks (replaces Timer)
Executors.newVirtualThreadPerTaskExecutor()  // Java 21+ - one virtual thread per task (below)

Always shut down your executor (shutdown() then awaitTermination, or use try-with-resources - ExecutorService is AutoCloseable since Java 19). A non-daemon pool keeps the JVM alive if you forget.

// Try-with-resources (Java 19+): shutdown + awaitTermination happen automatically on close.
try (var pool = Executors.newFixedThreadPool(4)) {
    pool.submit(task1);
    pool.submit(task2);
}   // close() shuts down and waits for tasks to finish
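
If you are on a JDK older than 19, or you want a timeout on the wait, the explicit version looks roughly like this. It is a minimal sketch: the class and method names are made up for illustration, and the 10-second timeout is an arbitrary choice.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: explicit shutdown with a bounded wait.
class GracefulShutdown {

    // Runs taskCount trivial tasks on a fixed pool, shuts the pool down,
    // and returns how many tasks actually ran.
    static int runAndShutDown(int threads, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.submit(() -> { completed.incrementAndGet(); });
            }
        } finally {
            pool.shutdown();                  // stop accepting new tasks; queued ones still run
        }
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow();           // give up waiting: interrupt whatever is still running
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();   // preserve the interrupt for our caller
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndShutDown(4, 100));   // prints 100
    }
}
```

Putting shutdown() in a finally block means the pool is released even if submitting a task throws.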

Sizing the pool

A rule of thumb that matters:

CPU-bound work (computation, no waiting): pool size ≈ number of CPU cores (Runtime.getRuntime().availableProcessors()). More threads than cores just adds context-switching overhead.
I/O-bound work (waiting on network/disk/database): more threads than cores, because threads spend most of their time blocked, not computing. The exact number depends on the wait/compute ratio - or, better, use virtual threads (below), which make this question mostly disappear.
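
A rough sketch of that rule of thumb as code. The helper names are invented for illustration, and the I/O estimate uses the common cores * (1 + wait/compute) approximation, which is a starting point to measure against rather than a guarantee.

```java
// Hypothetical helpers for picking an initial pool size.
class PoolSizing {

    // CPU-bound: one thread per core.
    static int cpuBoundPoolSize(int cores) {
        return Math.max(1, cores);
    }

    // I/O-bound estimate: cores * (1 + wait/compute).
    // waitMillis and computeMillis are the average time a task spends
    // blocked vs. computing; you would get them from measurement.
    static int ioBoundPoolSize(int cores, double waitMillis, double computeMillis) {
        if (cores < 1 || waitMillis < 0 || computeMillis <= 0) {
            throw new IllegalArgumentException("cores >= 1, wait >= 0, compute > 0 required");
        }
        return Math.max(1, (int) Math.round(cores * (1 + waitMillis / computeMillis)));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU-bound pool size: " + cpuBoundPoolSize(cores));
        // 8 cores, tasks that wait 90 ms for every 10 ms of compute: 8 * (1 + 9) = 80
        System.out.println("I/O-bound estimate:  " + ioBoundPoolSize(8, 90, 10));
    }
}
```

Treat the I/O number as a first guess and tune it under real load.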

Future: a result that isn't ready yet

submit of a Callable returns a Future<T> - a handle to a result that will exist eventually. You can do other work, then collect it:

Future<Integer> f = pool.submit(() -> expensiveComputation());

// ... do other things while it runs ...

Integer result = f.get();          // BLOCKS until the result is ready (or throws)

Future methods:

f.get();                  // block until done, return result (throws ExecutionException if task threw)
f.get(2, TimeUnit.SECONDS);  // block up to 2s, then throw TimeoutException
f.isDone();               // non-blocking check
f.cancel(true);           // attempt to cancel (interrupts the running thread if true)
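
Here is one way the timed get() and cancel(true) might fit together: wait a bounded time, and if the task is still running, cancel it and fall back to a default. The class name, method name, and the value 42 are placeholders for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class: bounded wait on a Future with a fallback value.
class FutureTimeout {

    // Runs a task that sleeps taskMillis and returns 42. Returns the result,
    // or fallback if it is not ready within timeoutMillis.
    static int getOrFallback(long taskMillis, long timeoutMillis, int fallback) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> f = pool.submit(() -> {
                Thread.sleep(taskMillis);     // stands in for slow work
                return 42;
            });
            try {
                return f.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                f.cancel(true);               // interrupts the sleeping worker
                return fallback;
            } catch (ExecutionException e) {
                // The task itself threw; its exception is the cause.
                throw new IllegalStateException("task failed", e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(getOrFallback(10, 2000, -1));   // finishes in time: prints 42
        System.out.println(getOrFallback(2000, 50, -1));   // times out: prints -1
    }
}
```

Note that cancel(true) only requests an interrupt; a task that ignores interruption keeps running.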

Running several tasks in parallel and collecting results:

List<Future<Integer>> futures = new ArrayList<>();
for (int i = 0; i < 10; i++) {
    final int n = i;
    futures.add(pool.submit(() -> process(n)));   // all 10 start, run in parallel on the pool
}
int total = 0;
for (Future<Integer> f : futures) {
    total += f.get();             // collect each (blocks per future, but they ran concurrently)
}
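
When you collect results this way, a task that threw only tells you so at get(), which wraps the task's exception in an ExecutionException. A minimal sketch of collecting with failures counted rather than fatal; the class and method names are hypothetical, and Integer.parseInt stands in for real work that may throw.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class: fan out, then collect, tolerating failed tasks.
class CollectWithFailures {

    // Parses each input on the pool; returns {sum of successes, number of failures}.
    static int[] sumAndCountFailures(List<String> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String s : inputs) {
                futures.add(pool.submit(() -> Integer.parseInt(s)));   // may throw inside the task
            }
            int sum = 0;
            int failures = 0;
            for (Future<Integer> f : futures) {
                try {
                    sum += f.get();
                } catch (ExecutionException e) {
                    // e.getCause() is the exception thrown inside the task.
                    failures++;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while collecting", e);
                }
            }
            return new int[] { sum, failures };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = sumAndCountFailures(List.of("1", "2", "x", "4"));
        System.out.println("sum=" + r[0] + " failures=" + r[1]);   // prints sum=7 failures=1
    }
}
```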

The limitation of plain Future: get() blocks, and you can't easily chain "when this finishes, do that next" without blocking a thread to wait. That's what CompletableFuture solves.

CompletableFuture: composable async pipelines

CompletableFuture<T> is a Future you can compose - attach callbacks that run when it completes, chain transformations, combine multiple futures - all without blocking a thread to wait. It's how you build async pipelines.

import java.util.concurrent.CompletableFuture;

CompletableFuture<String> pipeline =
    CompletableFuture
        .supplyAsync(() -> fetchUser(id))          // run async, produce a User
        .thenApply(user -> user.email())           // transform: User -> String (when ready)
        .thenApply(String::toLowerCase)            // chain another transform
        .exceptionally(ex -> "unknown@example.com"); // recover from any failure in the chain

String email = pipeline.join();                    // get the final result (like get(), but throws unchecked CompletionException)

Each step runs when the previous completes - no blocking between steps. The vocabulary:

supplyAsync(supplier)       // start an async task producing a value
runAsync(runnable)          // start an async task with no result
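thenApply(fn)               // transform the result (runs on the thread that completes the previous stage, or the caller's if it is already done)
thenApplyAsync(fn)          // transform on an executor (the common ForkJoinPool by default, or one you pass in)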
thenCompose(fn)             // chain another CompletableFuture (flatMap for futures - avoids nesting)
thenAccept(consumer)        // consume the result, no return
thenCombine(other, fn)      // combine two futures' results when both complete
exceptionally(fn)           // recover from an exception
handle((result, ex) -> ...) // handle both success and failure
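
To make the difference between exceptionally and handle concrete: exceptionally only runs on failure and must return a replacement value, while handle always runs and sees either the result or the exception. A small sketch follows; the class and method names are invented, and Integer.parseInt stands in for a step that may throw.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical demo class: one handle() stage covering success and failure.
class HandleDemo {

    // Parses raw asynchronously, doubles it, and reports the outcome as text.
    static CompletableFuture<String> doubled(String raw) {
        return CompletableFuture
            .supplyAsync(() -> Integer.parseInt(raw))   // may throw NumberFormatException
            .thenApply(n -> n * 2)                      // skipped if the previous stage failed
            .handle((value, ex) -> {
                if (ex == null) {
                    return "ok:" + value;
                }
                // Failures from earlier stages usually arrive wrapped in CompletionException.
                Throwable cause = (ex instanceof CompletionException && ex.getCause() != null)
                        ? ex.getCause() : ex;
                return "failed:" + cause.getClass().getSimpleName();
            });
    }

    public static void main(String[] args) {
        System.out.println(doubled("21").join());    // prints ok:42
        System.out.println(doubled("abc").join());   // prints failed:NumberFormatException
    }
}
```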

Combining independent async calls - fetch two things in parallel, then merge:

CompletableFuture<Profile> profile = CompletableFuture.supplyAsync(() -> fetchProfile(id));
CompletableFuture<List<Order>> orders = CompletableFuture.supplyAsync(() -> fetchOrders(id));

CompletableFuture<Dashboard> dashboard = profile.thenCombine(orders,
    (p, o) -> new Dashboard(p, o));     // runs once BOTH have completed, typically on the thread that finished the later one

Dashboard d = dashboard.join();          // profile and orders were fetched concurrently

thenCompose vs thenApply is the same distinction as flatMap vs map from chapter 07: use thenCompose when your function itself returns a CompletableFuture (to avoid CompletableFuture<CompletableFuture<T>> nesting).
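
Side by side, the two look like this. It is a sketch with made-up lookups: loadUserId and loadProfile are hypothetical stand-ins for real async calls.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical demo class: thenApply nests, thenCompose flattens.
class ComposeDemo {

    // Stand-in async lookup: the "user id" is just the name's length.
    static CompletableFuture<Integer> loadUserId(String name) {
        return CompletableFuture.supplyAsync(name::length);
    }

    // Stand-in async lookup that itself returns a future.
    static CompletableFuture<String> loadProfile(int userId) {
        return CompletableFuture.supplyAsync(() -> "profile-" + userId);
    }

    // thenApply wraps the returned future: a future of a future.
    static CompletableFuture<CompletableFuture<String>> nested(String name) {
        return loadUserId(name).thenApply(ComposeDemo::loadProfile);
    }

    // thenCompose flattens it: a single future of the final value.
    static CompletableFuture<String> flat(String name) {
        return loadUserId(name).thenCompose(ComposeDemo::loadProfile);
    }

    public static void main(String[] args) {
        System.out.println(nested("alice").join().join());   // two joins needed: profile-5
        System.out.println(flat("alice").join());            // one join: profile-5
    }
}
```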

CompletableFuture is the tool for orchestrating multiple async operations - parallel service calls, pipelines of dependent steps, fan-out/fan-in. It's everywhere in reactive and microservice code.

Virtual threads: the Java 21 game-changer

Arguably the biggest concurrency change in Java's history. A virtual thread is a lightweight thread scheduled by the JVM, not the OS. You can have millions of them. They make the "thread per task" model - simple, blocking, readable code - scale to levels that previously required complex async/reactive programming.

// Java 21+. One virtual thread per task. Create a MILLION if you want.
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    for (int i = 0; i < 1_000_000; i++) {
        executor.submit(() -> {
            Thread.sleep(1000);        // BLOCKS - but cheaply, on a virtual thread
            return fetchSomething();
        });
    }
}   // a million concurrent blocking tasks, on a handful of OS threads

How it works: when a virtual thread blocks (on I/O, sleep, or a java.util.concurrent lock), the JVM unmounts it from its carrier thread (an ordinary platform thread) and mounts another runnable virtual thread there. The carrier is not parked waiting on the blocked task, so it stays available for other work. A few OS threads can host millions of virtual threads, as long as most are blocked at any moment, which is the usual case for I/O-bound work.
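
A scaled-down way to see this for yourself, assuming Java 21 or later. The class name, task count, and sleep time are arbitrary choices for the sketch; the timing printed depends on your machine, so no particular figure is claimed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class (Java 21+): many blocking tasks, one virtual thread each.
class VirtualThreadsDemo {

    // Runs taskCount tasks that each block for sleepMillis.
    // Returns how many of them observed that they ran on a virtual thread.
    static int runBlockingTasks(int taskCount, long sleepMillis) {
        List<Future<Boolean>> futures = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                futures.add(executor.submit(() -> {
                    Thread.sleep(sleepMillis);                   // blocks; the carrier is freed
                    return Thread.currentThread().isVirtual();
                }));
            }
        }   // close() waits for every task to finish
        int virtualCount = 0;
        for (Future<Boolean> f : futures) {
            if (f.resultNow()) {                                 // safe: all tasks are done here
                virtualCount++;
            }
        }
        return virtualCount;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int n = runBlockingTasks(10_000, 100);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Run back to back, 10,000 sleeps of 100 ms would take about 1,000 seconds.
        System.out.println(n + " virtual-thread tasks finished in " + elapsedMs + " ms");
    }
}
```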

Why this matters: before virtual threads, scaling to many concurrent I/O operations forced you into asynchronous code - callbacks, CompletableFuture chains, reactive streams - which is harder to write, read, and debug. Virtual threads let you write simple, sequential, blocking code and still scale:

// This blocking, sequential, easy-to-read code now scales to millions of concurrent requests:
void handleRequest(Request req) {
    var user = db.loadUser(req.userId());      // blocks - fine on a virtual thread
    var orders = api.fetchOrders(user);        // blocks - fine
    var result = process(user, orders);        // computes
    respond(result);
}
// Run one virtual thread per request. No async, no callbacks, scales enormously.

The guidance:

For I/O-bound concurrency (web servers, API clients, anything that waits a lot): virtual threads are the new default. newVirtualThreadPerTaskExecutor(), write blocking code, scale freely.
For CPU-bound work (heavy computation): use a fixed platform-thread pool sized to cores - virtual threads don't help when threads are computing, not waiting.
Don't pool virtual threads. They're cheap to create; create one per task. Pooling them defeats the purpose.

One caveat: in JDK 21 through 23, a virtual thread that blocks while inside a synchronized block or method is "pinned" to its carrier (it cannot unmount), so the carrier sits blocked too. On those versions, prefer ReentrantLock over synchronized in code that runs on virtual threads and holds a lock across blocking calls. JDK 24 (JEP 491) removes this limitation for synchronized.
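
The ReentrantLock shape for a lock held across a blocking call might look like this. It is a sketch: the class is invented, Thread.sleep stands in for blocking I/O, and main uses a fixed pool only so the example compiles on older JDKs. On Java 21+ you would pass Executors.newVirtualThreadPerTaskExecutor() instead.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: guarding state with ReentrantLock instead of synchronized.
class LockAcrossBlockingCall {
    private final ReentrantLock lock = new ReentrantLock();
    private int balance = 0;

    // Holds the lock across a blocking call (the case the caveat is about).
    void slowDeposit(int amount) throws InterruptedException {
        lock.lock();
        try {
            Thread.sleep(1);          // stands in for blocking I/O while holding the lock
            balance += amount;
        } finally {
            lock.unlock();            // always release, even if the body throws
        }
    }

    int balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    // Submits `tasks` deposits of 1 to the given executor and returns the final balance.
    static int depositConcurrently(ExecutorService executor, int tasks) {
        LockAcrossBlockingCall account = new LockAcrossBlockingCall();
        try {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> { account.slowDeposit(1); return null; });
            }
        } finally {
            executor.shutdown();
        }
        try {
            if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return account.balance();
    }

    public static void main(String[] args) {
        System.out.println(depositConcurrently(Executors.newFixedThreadPool(8), 200));   // prints 200
    }
}
```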

Structured concurrency (preview)

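A newer model, still a preview API as of JDK 25 (compile and run with --enable-preview), that treats a group of related concurrent tasks as a single unit - if one fails, the others are cancelled; the parent waits for all. It makes concurrent code as structured as sequential code (no leaked threads, clear error propagation). The example uses the StructuredTaskScope.ShutdownOnFailure form from the JDK 21-24 previews; JDK 25 reworked the API around StructuredTaskScope.open(), so adapt it to your JDK: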

// StructuredTaskScope - subtasks are bound to a scope; the scope joins them all.
try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
    var userTask  = scope.fork(() -> fetchUser(id));     // forked subtask
    var orderTask = scope.fork(() -> fetchOrders(id));   // forked subtask
    scope.join();                  // wait for both
    scope.throwIfFailed();         // if either failed, propagate (and the other was cancelled)
    return new Dashboard(userTask.get(), orderTask.get());
}   // guaranteed: no subtask outlives this block

This is the future of multi-task concurrency in Java - it eliminates the leaked-task and partial-failure problems of manual executor use. Worth knowing it exists; check your JDK version for stability.

Choosing the right tool

Situation	Reach for
Run independent tasks without managing threads yourself	`ExecutorService` (fixed pool for CPU, virtual-per-task for I/O)
Get a single async result	`Future` (or `CompletableFuture`)
Chain/compose async steps without blocking	`CompletableFuture` (`thenApply`/`thenCompose`/`thenCombine`)
Many concurrent blocking I/O operations	virtual threads (`newVirtualThreadPerTaskExecutor`, Java 21+)
Heavy parallel computation	fixed platform-thread pool sized to cores
Delayed or periodic tasks	`newScheduledThreadPool`
Group related subtasks with all-or-nothing semantics	structured concurrency (`StructuredTaskScope`)

The arc of these three chapters: raw threads + locks (09-10) are the foundation you must understand, but you rarely write them directly. In real code you submit tasks to executors, compose with CompletableFuture, and - increasingly - use virtual threads to write simple blocking code that scales. The high-level tools sit on the low-level guarantees; knowing both is what makes you trustworthy with concurrency.

Try it

Pool vs raw threads. Submit 10,000 short tasks (each Thread.sleep(10) then increment an AtomicInteger) two ways: (a) new Thread() per task, (b) a fixed pool of 100. Time both and watch memory. The raw-thread version strains or fails; the pool sails through. This is why pools exist.
Future parallelism. Write 8 tasks that each sleep(1000) and return a number. Submit all to a pool of 8 and collect with get(). Total wall-clock time should be ~1 second, not 8 - they ran in parallel. Then submit to a pool of 2 and watch it take ~4 seconds (only 2 run at once).
CompletableFuture pipeline. Build supplyAsync(() -> fetchUser()).thenApply(User::name).thenApply(String::toUpperCase). Add an .exceptionally(...) and make fetchUser throw - confirm the recovery value comes through. Then use thenCombine to fetch two things in parallel and merge them; verify the wall-clock is the max of the two, not the sum.
thenApply vs thenCompose. Write a function CompletableFuture<Profile> loadProfile(User u). Chain it after loadUser with thenApply and observe the awkward CompletableFuture<CompletableFuture<Profile>>. Fix it with thenCompose. This is the map-vs-flatMap lesson in async form.
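Virtual threads at scale (Java 21+). Submit 1,000,000 tasks that each Thread.sleep(1000) to newVirtualThreadPerTaskExecutor(). The sleeps all overlap, so the run takes about one sleep plus the overhead of creating a million threads - typically seconds, not hours (you may need a larger heap). The same workload on a fixed platform pool of 100 would take ~10,000 seconds (1,000,000 / 100 rounds of 1 second, nearly three hours), so scale the task count down to, say, 10,000 before timing that version. Witness why virtual threads changed Java concurrency.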
Forget to shut down. Submit a task to a non-virtual newFixedThreadPool and don't shut it down. Notice the JVM doesn't exit (the pool's non-daemon threads keep it alive). Add shutdown() (or use try-with-resources) and watch it exit cleanly.

What you might wonder

"Do virtual threads make ExecutorService and CompletableFuture obsolete?" No. Virtual threads change how many threads you can have and let you write blocking code that scales - they're about the threading model. ExecutorService is still how you submit and manage tasks (now often with virtual threads as the backing). CompletableFuture is still the tool for composing async results and fan-out/fan-in. They complement virtual threads. What virtual threads reduce the need for is complex reactive/async frameworks adopted purely to avoid blocking - now you can block cheaply.

"When CompletableFuture vs virtual threads?" If you can write the logic as simple sequential blocking code, virtual threads let you do that and scale - often the simpler choice now. Use CompletableFuture when you genuinely need to compose independent async operations (run three services in parallel and combine), express dependency graphs between async steps, or you're on Java < 21. Many codebases use both: virtual threads for the per-request thread, CompletableFuture for fan-out within a request.

"What pool size should I actually use?" CPU-bound: availableProcessors() (maybe +1). I/O-bound on platform threads: higher, tuned to your wait/compute ratio (the formula is roughly cores * (1 + wait/compute)), but this is fiddly - which is exactly why virtual threads are compelling for I/O: you stop sizing pools and just create one virtual thread per task.

"Is parallelStream() (chapter 07) related?" Yes - it uses a shared ForkJoinPool (the common pool) under the hood to parallelize stream operations. It's good for CPU-bound, independent, large-dataset operations. It's not a general task executor and shouldn't be used for I/O (it can starve the shared pool). For arbitrary concurrent tasks, use an ExecutorService, not parallelStream.

"How do exceptions work across threads?" A task's exception doesn't propagate to the submitting thread automatically. With Future, get() throws ExecutionException wrapping the task's exception - you only see it when you call get(). With CompletableFuture, use exceptionally/handle to deal with it in the pipeline. If you submit() a task that throws and never call get() on its Future, the exception is silently lost - a common bug. (Use execute + an uncaught-exception handler, or always check your futures.)

"Should I ever extend Thread or implement Runnable directly now?" Rarely. implements Runnable (or a lambda) to define a task, yes - but hand it to an executor rather than new Thread(runnable).start(). Extending Thread is almost never right (it conflates the task with the worker - a chapter 01 composition-over-inheritance issue). Define tasks; submit them to executors.

Done

You know why raw thread-per-task doesn't scale, and how thread pools fix it.
You can use ExecutorService - submit Runnable/Callable, size pools for CPU vs I/O, and shut down properly (try-with-resources).
You can use Future for single async results and understand its blocking limitation.
You can compose async pipelines with CompletableFuture (thenApply/thenCompose/thenCombine/exceptionally).
You understand virtual threads (Java 21+): cheap, millions-scale, write simple blocking code for I/O concurrency.
You know structured concurrency exists and where the field is heading.
You can pick the right tool for CPU-bound vs I/O-bound vs compositional concurrency.

That completes the concurrency core - the heart of this path. Next: performance-aware coding - allocation, boxing, and the patterns that actually matter.
