Async models

Why it matters

"Do many things at once without using one OS thread per task" is the problem every modern runtime solves. The answers diverge dramatically: Go uses M:N goroutines on a work-stealing scheduler; Java added virtual threads in Loom; Rust uses zero-cost futures on Tokio; Python uses cooperative coroutines on a single thread (with asyncio); the kernel uses epoll/io_uring underneath all of them.

Read at least two paths' async chapters. The contrast clarifies what each language is actually optimizing for.


The fundamental decision

Every async model picks one position on each of two axes:

  1. How are tasks scheduled?
       • Preemptive - the scheduler can interrupt at any point (Go since 1.14, OS threads, virtual threads).
       • Cooperative - tasks yield only at explicit suspension points (await, yield).
  2. How is parallelism delivered?
       • M:N - many tasks, few OS threads, work-stealing across cores.
       • Single-threaded event loop - one OS thread, one event loop, many coroutines.
       • Thread-per-request - one OS thread per task (old model; renaissance with virtual threads).

Different positions, different trade-offs.


The lens, per path

Go - preemptive goroutines on M:N scheduler

Month 3 - Concurrency Mastery. The runtime's GMP scheduler (Goroutines × Machines (OS threads) × Processors) multiplexes millions of goroutines onto a small thread pool. Preemption added in 1.14 - long-running goroutines no longer starve siblings.

Default unit: goroutine. ~2KB stack, grows on demand. Spawned with go func() { ... }(). Channels for coordination.

What's unique: uniform. Every Go program - CPU-bound, I/O-bound, mixed - uses the same primitives. There's no "async vs sync" distinction; you write the same code either way and the runtime handles it. The cost: GC overhead and runtime opacity (less control than Rust's hand-written futures).

Java - virtual threads (Loom), the late convert

Month 4 - Concurrency & Loom. Java's pre-Loom story was reactive streams (CompletableFuture, Reactor, RxJava) for high concurrency on a thread pool. Loom (JDK 21+) added virtual threads - same continuation-on-carrier-thread model as Go, with Thread.startVirtualThread and Executors.newVirtualThreadPerTaskExecutor.

Default unit (post-Loom): virtual thread. Same scheduling model as goroutines. Blocking I/O is cheap again - spring.threads.virtual.enabled=true lets a Spring Boot service handle thousands of concurrent requests with thread-per-request semantics.

What's unique: the strategic pivot. For 15 years, Java's high-concurrency answer was reactive. With Loom, reactive is now reserved for explicit-backpressure cases; most apps switch back to blocking + virtual threads. Reading old vs new Java service code, you can date it by its concurrency model.

Rust - Tokio, async/await, zero-cost futures

Month 3 - Concurrency & Async. Rust's async fn returns a future (a state machine). An executor (Tokio is the most popular) polls them. The runtime uses epoll/kqueue/io_uring for I/O.

Default unit: task (a spawned future). Lightweight (no OS thread). Tokio's tokio::spawn schedules on its multi-threaded work-stealing executor.

What's unique: zero-cost abstractions. Futures compile down to optimized state machines; no allocations per call when the compiler can prove it. Memory safety + concurrency safety enforced by the type system (Send / Sync marker traits). The cost: async fn interacts with lifetimes in confusing ways; Pin/Unpin/Stream/etc. are advanced. Real Rust async ergonomics in 2026 are still rough at the edges.

Python - asyncio cooperative coroutines (single-threaded)

Month 4 - Concurrency & Parallelism. async def defines a coroutine. await suspends until a result is ready. The asyncio event loop runs in one thread (typically), juggling many coroutines.

Default unit (for I/O concurrency): coroutine. Created with async def; awaited with await. The GIL means CPU-bound concurrency needs multiprocessing or native extensions instead.
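A minimal sketch of the model (names like fetch are illustrative, not from any real codebase): two coroutines interleave on one thread, suspending at each await so the event loop can run the other.

```python
import asyncio

async def fetch(name: str, delay: float) -> str:
    # Suspends here; the event loop runs other coroutines meanwhile.
    await asyncio.sleep(delay)
    return f"{name} done"

async def main() -> list[str]:
    # Both coroutines run concurrently on a single thread;
    # gather returns results in argument order.
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1))

results = asyncio.run(main())
print(results)  # ['a done', 'b done']
```

With sequential awaits this would take ~0.2s; gather overlaps the waits so it takes ~0.1s.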

What's unique: single-threaded simplicity. No data races between coroutines (they only interleave at await). The trade: zero true parallelism unless you reach for processes / threads (with caveats from the GIL) / native code. The PEP 703 free-threaded CPython (stable in 3.14) changes this for the future but adoption is gradual.

Linux kernel - epoll, io_uring, kernel threads

Month 3 - Namespaces, Cgroups, eBPF for the scheduler context; Month 4 for the I/O primitives.

The runtimes above all sit on the kernel's epoll (Linux) or kqueue (BSD). The kernel maintains a watch list of file descriptors; threads block on epoll_wait until any descriptor is ready, then handle whichever fired.

io_uring (5.1+) is the modern submission-queue model: userspace and kernel share ring buffers for op submission and completion. Reduces syscall overhead; supports completion-based I/O (vs epoll's readiness-based). Some runtimes (Tokio with tokio-uring, some Node.js libuv configurations) opt in.
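The readiness-based loop above can be seen in miniature from Python, whose selectors module picks epoll on Linux and kqueue on BSD/macOS automatically: register descriptors, block until one is ready, handle whichever fired.

```python
import selectors
import socket

sel = selectors.DefaultSelector()  # epoll/kqueue under the hood
r, w = socket.socketpair()
r.setblocking(False)
sel.register(r, selectors.EVENT_READ)

w.send(b"ping")  # make the read end ready

# Blocks (like epoll_wait) until a registered descriptor is ready.
for key, events in sel.select(timeout=1.0):
    data = key.fileobj.recv(4)
    print(data)  # b'ping'

sel.unregister(r)
r.close()
w.close()
```

Every runtime in this article wraps some industrial-strength version of this loop.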


The contrasts that teach

| Aspect | Go | Java (Loom) | Rust (Tokio) | Python (asyncio) | Kernel |
| --- | --- | --- | --- | --- | --- |
| Scheduling | preemptive | preemptive | cooperative (await) | cooperative (await) | preemptive |
| Default unit | goroutine | virtual thread | task (future) | coroutine | thread |
| True parallelism | yes (M:N) | yes (carrier threads) | yes (Tokio worker pool) | no (GIL); processes for CPU work | yes |
| Sync I/O cost | cheap (runtime parks goroutine) | cheap (parks virtual thread) | N/A - must use async | blocks the event loop (bad) | blocks the thread |
| CPU-bound concurrency | trivial | trivial | trivial | requires multiprocessing | trivial |
| Cancellation | context.Context | structured concurrency / Future.cancel | tokio::select + drop | task.cancel() | signals |
| Backpressure | bounded channels | reactive streams / queues | Stream / Sink traits | asyncio queues | TCP windows |
| Color problem? | no (uniform) | no (post-Loom) | yes (async vs sync split) | yes (async vs sync split) | no |

The "function coloring" problem (async functions can only be called from other async functions) is a real ergonomic cost. Go and post-Loom Java sidestep it by making the scheduling implicit. Rust and Python pay it in exchange for the explicit control async/await provides.
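Coloring is easy to demonstrate in Python (function names here are illustrative): calling an async function from sync code does not run it, it only builds a coroutine object, and crossing the boundary requires standing up an event loop.

```python
import asyncio

async def fetch() -> str:
    await asyncio.sleep(0)
    return "data"

def sync_caller() -> str:
    # fetch() alone returns a coroutine object; nothing runs yet.
    # A plain function cannot `await`, so crossing the color
    # boundary means starting an event loop explicitly:
    return asyncio.run(fetch())

print(sync_caller())  # data
```

This is exactly the split Go and post-Loom Java avoid: there, any function can block, and the runtime decides what that blocking costs.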


What to read first

  • You write Go services → Go Month 3. Goroutines are the model; channels are the coordination tool; context is the cancellation propagation.
  • You write Java services → Java Month 4. The pre-Loom and post-Loom worlds; when to pick virtual threads vs reactive.
  • You write Rust async code → Rust Month 3. Take the borrow-checker page (Month 2 / page 06 of the beginner path) seriously first; async lifetimes get worse, not better, without that foundation.
  • You write Python at scale → Python Month 4. Understand the GIL before you optimize anything; understand asyncio vs threads vs processes before you choose.
  • You optimize past userspace → Linux Month 3 + 4. io_uring, eBPF for runtime tracing, kernel scheduler internals.