
Glossary

Terms that appear across multiple paths, with the one-paragraph definition that lets you read any path's chapter without first reading another's.

Where a path covers a term in depth, it's linked. If you see a term used as-if-you-know-it on any page, search this glossary first.

Just starting out?

Many entries here assume some systems familiarity. The beginner glossary covers the same kinds of terms in plain language with analogies, aimed at readers working through the From Scratch paths.


Compiler & runtime

ABI (Application Binary Interface). The contract between two compiled binaries about register usage, calling conventions, structure layout, and symbol naming. Compatible ABIs link; incompatible ones produce mysterious crashes. C ABI is the lingua franca of FFI.

AOT compilation. Ahead-of-time compilation: turning source or bytecode into native machine code before execution, as opposed to JIT (just-in-time, during execution). Examples: GraalVM native-image, Project Leyden, Rust/Go static binaries.

Backend (compiler). The half of a compiler that emits machine code for a target ISA, given an intermediate representation produced by the front end. LLVM, Cranelift, HotSpot C2.

Bytecode. A compact virtual-machine instruction set (usually stack-based) produced by a compiler front-end and consumed by a runtime or interpreter. JVM bytecode, CPython bytecode, WebAssembly.

Codegen. Code generation - turning a compiler IR into target-machine instructions.

Continuation. A reified call stack - the state needed to resume a paused computation. Underpins virtual threads (Loom), async/await desugaring, and exception handling in some languages.

Deoptimization. A JIT runtime decision to throw away compiled code and fall back to the interpreter, usually because a speculative assumption (monomorphic call site, escape analysis result) was invalidated. See the Java path's JVM-and-bytecode month and the Go runtime chapter.

Escape analysis. A compiler analysis that decides whether a heap allocation could safely stay on the stack instead. Foundational to Go and Java performance.
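
A minimal Go sketch of the distinction - go build -gcflags=-m reports which of these allocations the compiler moves to the heap (function names are illustrative):

```go
package escape

// keptOnStack returns a value; x does not outlive the call,
// so it can live on the goroutine stack.
func keptOnStack() int {
	x := 42
	return x
}

// escapesToHeap returns a pointer to a local; x must outlive
// the call, so the compiler allocates it on the heap instead.
func escapesToHeap() *int {
	x := 42
	return &x
}
```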

FFI (Foreign Function Interface). The mechanism a language uses to call code written in another language (almost always C). Examples: Rust extern "C", Java Panama / JNI, Go cgo, Python ctypes/cffi. See the FFI cross-topic page.
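
A tiny illustration of one flavour - Go's cgo calling libm's sqrt across the C ABI. A sketch, assuming a C toolchain is available (not a recommended pattern for trivial math):

```go
package main

/*
#cgo LDFLAGS: -lm
#include <math.h>
*/
import "C"

import "fmt"

func main() {
	// C.sqrt is the C library's sqrt, reached through the C ABI via cgo.
	fmt.Println(C.sqrt(C.double(2)))
}
```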

Inlining. The compiler optimization of replacing a function call site with the callee's body. The single most important enabling optimization for modern JITs.

JIT (just-in-time compilation). Compiling code at runtime, typically after profiling shows a method to be hot. HotSpot's C1/C2, V8's TurboFan, .NET's RyuJIT, PyPy.

Monomorphic / bimorphic / megamorphic. A call site that has been observed with one / two / three-or-more concrete receiver types. JITs aggressively inline monomorphic sites and deoptimize when polymorphism increases.

LTO (link-time optimization). Compiler optimization across translation-unit boundaries, applied at link time when the whole program is visible. Rust lto = "fat", Clang/GCC -flto. Slower builds, faster binaries.

PGO (profile-guided optimization). Compiling once, running with profiling, then recompiling with the profile to guide branch placement, inlining, and codegen. Go (go build -pgo), Rust (cargo pgo), HotSpot via Leyden. Typically 5-15% speedup on hot paths.


Memory & GC

Allocator. The subsystem that hands out memory: the kernel's buddy allocator (page-level), slab/slub (object-level), malloc/jemalloc (userspace), runtime allocators (Go's mcache, Java's TLAB).

Compacting GC. A garbage collector that moves live objects together during reclamation, eliminating heap fragmentation. G1, ZGC, Shenandoah; not Go's tricolor (non-compacting).

Generational GC. A collector built on the weak generational hypothesis - "most objects die young." Java's G1 and ZGC; Python's three-generation cycle collector. See the GC cross-topic page.

Heap. The pool of memory used for dynamically-sized allocations whose lifetime exceeds the current function call. Opposite of the stack.

HBM. High-Bandwidth Memory - the on-package memory used by modern GPUs. Specifically referenced when talking about bandwidth-bound vs compute-bound kernels (AI Systems path).

RCU (Read-Copy-Update). A Linux kernel synchronization primitive: readers are wait-free; writers defer reclamation until a "grace period" has elapsed during which all readers exit. Effectively GC for kernel data structures.

Refcount. Reference counting - every object carries a count of references to it; when the count drops to zero, the object is freed immediately. CPython's primary memory-management mechanism (supplemented by a cycle collector for reference cycles); also hand-rolled on top of malloc/free in plenty of C codebases.

Slab allocator. A kernel allocator that maintains caches of pre-constructed fixed-size objects of common types (task_struct, inode), eliminating per-allocation construction cost.

TLAB (Thread-Local Allocation Buffer). A per-thread slice of the heap that a thread can bump-pointer allocate from without synchronization. HotSpot's TLAB is the reason most allocations are cheap.

Tricolor GC. A mark-sweep algorithm using three colors (white/grey/black) to track liveness, designed for concurrent collection. Go's GC; the academic foundation for most modern concurrent collectors.

Write barrier. A snippet of code injected by the runtime around every pointer write so a concurrent GC can track references it might otherwise miss during marking. Go's GC and Java's G1/ZGC all rely on them; they are the per-write cost of concurrent collection.


Concurrency & memory models

Acquire / release ordering. Memory-model orderings that enforce one-way visibility - an acquire-load sees all stores that happened-before the matching release-store. C++20, Rust, Java's VarHandle, the kernel's smp_load_acquire. See memory models.

Atomic. An operation that completes as a single, indivisible step from the perspective of other threads. Hardware-supported via CAS, LL/SC, or full memory barriers.

CAS (compare-and-swap). An atomic instruction: "set memory location X to value B if its current value is A; return whether the swap happened." The foundation of every lock-free data structure.
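
A minimal Go sketch of the canonical CAS retry loop, using sync/atomic (the helper name is illustrative):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// addCAS adds delta to *addr without a lock: read the current value,
// attempt the swap, and retry if another goroutine raced us.
func addCAS(addr *int64, delta int64) {
	for {
		old := atomic.LoadInt64(addr)
		if atomic.CompareAndSwapInt64(addr, old, old+delta) {
			return
		}
	}
}

func main() {
	var counter int64
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				addCAS(&counter, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(counter) // always 8000
}
```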

Channel. A typed message-passing primitive - values flow from senders to receivers with built-in synchronization. Go's primary concurrency primitive; Rust has it via std::sync::mpsc and crossbeam.
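
A minimal Go example - the channel both carries the values and synchronizes sender and receiver:

```go
package main

import "fmt"

func main() {
	ch := make(chan int) // unbuffered: each send waits for its receive
	go func() {
		for i := 1; i <= 3; i++ {
			ch <- i * i
		}
		close(ch) // signals the receiver that no more values are coming
	}()
	for v := range ch { // exits when the channel is closed
		fmt.Println(v) // 1, 4, 9
	}
}
```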

Continuation. See Compiler & runtime.

Coroutine. A function that can suspend and resume mid-execution. Underpins Python async/await, Kotlin coroutines, JavaScript's generators. Closely related to virtual threads (which are continuations made operating-system-thread-shaped).

Critical section. A region of code that must be entered by at most one thread at a time. Protected by a lock.

Deadlock. A cycle in which each thread waits for a resource held by another, so none ever progresses.

Futex (fast userspace mutex). A Linux syscall that lets userspace mutex implementations skip the kernel when uncontended, dropping into the kernel only when a thread needs to wait. Underpins every pthread_mutex_t-equivalent on Linux.

GIL (Global Interpreter Lock). CPython's lock that allows only one thread at a time to execute Python bytecode. PEP 703 makes it optional (free-threaded CPython, stable in 3.14).

Happens-before. The partial order on memory operations defined by a memory model: if A happens-before B, then B sees A's effects. Synchronization actions (locks, atomics, thread join, channel ops) create the edges.
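
A Go sketch of one such edge: the channel send happens-before the corresponding receive completes, so the receiver is guaranteed to see the earlier, otherwise-unsynchronized write to msg:

```go
package main

import "fmt"

var msg string

func main() {
	done := make(chan struct{})
	go func() {
		msg = "ready"      // (1) plain write
		done <- struct{}{} // (2) send creates the happens-before edge
	}()
	<-done           // (3) receive completes after the send
	fmt.Println(msg) // guaranteed to print "ready"
}
```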

Loom. Project Loom - the JEP family delivering virtual threads, structured concurrency, and scoped values to the JVM. Virtual threads went final in JDK 21 and scoped values in JDK 25; structured concurrency is still in preview.

Memory model. The specification of what one thread is allowed to observe of another thread's writes. JMM (Java), Go MM, C++20, LKMM (Linux). See memory models.

Mutex. A lock that exactly one thread can hold at a time. The default shared-state primitive in most languages.

Race condition. A bug where the program's correctness depends on the order of execution of concurrent operations. Data race (a specific kind: concurrent unsynchronized access to the same location, at least one write) is undefined behavior in most memory models.

Spinlock. A lock that busy-waits (spins on a CAS) instead of sleeping. Right for very short critical sections in low-level (often kernel) code; almost always wrong in userspace.

Structured concurrency. A pattern where a "scope" owns all the concurrent tasks it spawned; the scope cannot exit until they all terminate or are cancelled. Java JEP 462/480/505 (still in preview as of JDK 25); Trio in Python; Go via errgroup, sketched below.
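
A Go sketch using golang.org/x/sync/errgroup - the group is the scope; Wait does not return until every spawned task has finished, and the first error cancels the shared context (the URL is a placeholder):

```go
package main

import (
	"context"
	"fmt"
	"net/http"

	"golang.org/x/sync/errgroup"
)

func fetchAll(ctx context.Context, urls []string) error {
	g, ctx := errgroup.WithContext(ctx)
	for _, url := range urls {
		url := url // capture the loop variable (pre-Go 1.22 idiom)
		g.Go(func() error {
			req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
			if err != nil {
				return err
			}
			resp, err := http.DefaultClient.Do(req)
			if err != nil {
				return err // the first error cancels ctx for the other tasks
			}
			return resp.Body.Close()
		})
	}
	return g.Wait() // the scope cannot exit until every task is done
}

func main() {
	fmt.Println(fetchAll(context.Background(), []string{"https://example.com"}))
}
```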

Virtual thread. A thread scheduled in user-space (by the runtime) onto a small pool of OS threads. Java's Loom; conceptually similar to Go's goroutines and Rust's Tokio tasks.

Work-stealing. A scheduling pattern where idle workers steal tasks from busy workers' queues, balancing load without central coordination. Go's GMP scheduler, Tokio's runtime, Java's ForkJoinPool, Rust's Rayon.


Error handling

Checked exception. An exception that the compiler requires you to declare or catch. Java is the only mainstream language that implements this. The perennial debate: useful enforcement or noise?

EAFP. "Easier to Ask Forgiveness than Permission" - Python's idiom of trying an operation and catching the resulting exception, rather than checking preconditions first (LBYL, "Look Before You Leap"). Usually faster, more robust, and more idiomatic Python.

Errors as values. The Go/Rust convention: functions return their errors as part of their type, not via a side channel. Caller must explicitly handle or propagate. See error handling.
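
A minimal Go sketch of the convention - the error is an ordinary return value, wrapped with context as it propagates (file name and function are illustrative):

```go
package main

import (
	"fmt"
	"os"
)

// loadConfig returns its error as a value; %w wraps it so callers
// can still match the underlying cause with errors.Is / errors.As.
func loadConfig(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("loading config %q: %w", path, err)
	}
	return data, nil
}

func main() {
	if _, err := loadConfig("app.toml"); err != nil {
		fmt.Println("caller handles it explicitly:", err)
	}
}
```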

Panic / unwind. A non-local exit triggered by an unrecoverable condition. Go's panic (recoverable via recover), Rust's panic! (unwind-or-abort, catchable at thread boundaries with catch_unwind), Java's Error family. Distinct from exceptions in that they signal programmer bugs, not recoverable failures.

Result type. A sum type representing success-or-failure: Result<T, E> (Rust), Either<E, T> (Scala/Haskell), result (OCaml). The type-system encoding of "errors as values."

Try-with-resources. Java's pattern (also Python's with, C# using, Go defer) for ensuring cleanup happens regardless of how the block exits. RAII (Rust, C++) automates the same idea via destructors.
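
The Go flavour of the same idea, sketched with defer - cleanup is registered immediately after acquisition and runs however the function exits:

```go
package main

import (
	"fmt"
	"os"
)

func fileSize(path string) (int64, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close() // runs on every return path, including early errors

	info, err := f.Stat()
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	fmt.Println(fileSize("/etc/hosts"))
}
```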


Distributed systems

At-least-once / exactly-once / at-most-once delivery. The three message-delivery contracts. Exactly-once is impossible without coordination; "effectively exactly-once" is what production systems deliver.

CAP. Brewer's theorem: a network partition forces a choice between Consistency and Availability. "Twelve Years Later" (Brewer's revisit) is the cleaner statement to read.

Consensus. A protocol by which a set of replicas agree on a single value. Paxos (the original), Raft (the understandable one), Zab (ZooKeeper). Underpins every replicated state store.

Idempotent. An operation that produces the same result if applied multiple times. The single most important property for retry-safety in distributed systems.

Linearizability. The strongest consistency model: every read returns the value of the most recent completed write, system-wide. Stronger than serializability (transactional).

Raft. A consensus algorithm designed for understandability. Diego Ongaro's 2014 paper. Powers etcd, Consul, TiKV, and CockroachDB - and every path's capstone builds an implementation of it.

Saga. A long-running multi-service transaction implemented as a sequence of local transactions with compensating actions for rollback. The microservices answer to distributed transactions, since 2PC doesn't scale.

Serializability. The strongest correctness condition for concurrent transactions: the outcome is equivalent to some serial ordering. Weaker than linearizability (which adds a real-time order constraint) but the database default.

TLA+. Leslie Lamport's formal-specification language for distributed and concurrent systems. The right tool when "I think this protocol is correct" is not good enough. Used by AWS, MongoDB, Cosmos DB, every serious distributed-systems team.


Build & deployment

BuildKit. Docker's modern build frontend. Graph-based, parallel, with cache mounts that survive layer rebuilds. The reason docker build is now fast.

Cache mount. A BuildKit feature (RUN --mount=type=cache,target=/root/.cache/pip ...) that persists a directory across builds without baking it into the image layer. Eliminates redundant downloads.

Cargo workspace. A Rust multi-crate repo layout where Cargo.toml declares [workspace] members = [...]. Shared Cargo.lock, shared target/, atomic version bumps.

Lockfile. A file recording the exact version of every transitive dependency, ensuring reproducible installs. Cargo.lock, go.sum, package-lock.json, uv.lock, Pipfile.lock. Commit it for applications; sometimes don't for libraries.

Manifest. The top-level file declaring a project's metadata and dependencies. Cargo.toml, pom.xml, go.mod, pyproject.toml, package.json.

Multi-stage build. A Dockerfile with multiple FROM lines, where later stages copy artifacts from earlier ones via COPY --from=. Lets you compile in a heavy builder image and ship a lean runtime image.

SBOM (Software Bill of Materials). A machine-readable list of every component (and version) in a built artifact. CycloneDX and SPDX are the two formats. Required by SLSA, useful for vulnerability scanning, increasingly demanded by regulators.

Sigstore / cosign. A free, no-key signing infrastructure for container images and other artifacts. The 2026 default for supply-chain integrity.

SLSA. Supply-chain Levels for Software Artifacts. A framework for ranking supply-chain security maturity; SLSA v1.0 defines build levels from 0 (no guarantees) to 3 (non-falsifiable provenance from a hardened build platform).


Testing & verification

Coverage-guided fuzzing. Fuzzing where the fuzzer instruments the program to track which code paths each input exercises, then mutates inputs to maximize coverage. libFuzzer, AFL, go test -fuzz, cargo-fuzz. The state of the art since ~2015.
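
A minimal Go sketch, run with go test -fuzz=FuzzParse - the fuzzer mutates the seed corpus, guided by coverage, looking for inputs that violate the property (Parse here is a stand-in target defined inline):

```go
package parser

import (
	"strings"
	"testing"
)

// Parse is a stand-in for the real function under test:
// split "key=value" into its two halves.
func Parse(s string) (key, value string) {
	key, value, _ = strings.Cut(s, "=")
	return key, value
}

// FuzzParse asserts one cheap property - Parse must never panic -
// while the fuzzer mutates inputs to hit as many new code paths as possible.
func FuzzParse(f *testing.F) {
	f.Add("key=value") // seed corpus entry
	f.Fuzz(func(t *testing.T, input string) {
		_, _ = Parse(input)
	})
}
```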

Doctest. A test embedded in a documentation comment, verifying that the example actually works as shown. Rust's /// examples, Python's doctest module, Go's Example* functions.
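
The Go form, sketched: an Example function in a _test.go file whose Output comment is checked by go test and rendered in the docs (Abs is a stand-in defined inline):

```go
package mathx

import "fmt"

// Abs is the documented function (a stand-in for this sketch).
func Abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

// ExampleAbs shows up in godoc as a usage example *and* runs under
// go test, which compares stdout against the Output comment.
func ExampleAbs() {
	fmt.Println(Abs(-7))
	// Output: 7
}
```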

Fixture. A pytest concept: a function decorated with @pytest.fixture that produces setup state for tests. Scope levels (function, class, module, session) control reuse. The right tool for "every test needs a fresh database connection."

Flaky test. A test that passes and fails non-deterministically. Almost always a sign of: timing assumption, shared state, ordering dependence, or network reliance. The single biggest CI pain.

Generator (property-based). A function producing random inputs of a given type, used by hypothesis, proptest, jqwik, etc. Combinator-built (lists(integers(min_value=0))) so you describe the shape of valid inputs rather than enumerating them.

Golden file test. A test that compares output to a stored "known good" file. Update with care; the cost of a wrong golden is silent rot.
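
A common Go shape for it, sketched with an -update flag so the golden file is only rewritten deliberately (Render and the testdata path are illustrative):

```go
package report

import (
	"bytes"
	"flag"
	"os"
	"path/filepath"
	"testing"
)

var update = flag.Bool("update", false, "rewrite golden files")

// Render is a stand-in for the function whose output we pin.
func Render() []byte { return []byte("total: 42\n") }

func TestRender(t *testing.T) {
	got := Render()
	golden := filepath.Join("testdata", "render.golden")

	if *update { // go test -update regenerates the golden on purpose
		if err := os.WriteFile(golden, got, 0o644); err != nil {
			t.Fatal(err)
		}
	}
	want, err := os.ReadFile(golden)
	if err != nil {
		t.Fatal(err)
	}
	if !bytes.Equal(got, want) {
		t.Errorf("output differs from %s; rerun with -update if the change is intended", golden)
	}
}
```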

Mutation testing. Mechanically corrupting your source code (e.g. x + 1 → x - 1) and re-running tests; surviving mutants mean your tests don't actually cover the changed logic. PIT (Java), mutmut (Python), cargo-mutants (Rust).

Shrinking. A property-based-testing feature: when a generated input fails, automatically simplify it (smaller list, smaller number, fewer constructors) to the minimal failing case. The reason hypothesis/proptest/jqwik produce useful failure reports.

Snapshot testing. Like golden-file but typically inline or auto-generated. Capture the output of a function on first run, assert equality on subsequent runs. insta (Rust), jest --updateSnapshot (JS), pytest-snapshot.

Subtest. A test inside a test. Go's t.Run("subname", ...), JUnit's @Nested, pytest's parametrize. Lets you express a table of cases as one logical test.
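
The Go form, sketched as a table-driven test: each table row becomes a named subtest that reports independently and can be run alone with go test -run (Clamp is a stand-in defined inline):

```go
package clamp

import "testing"

// Clamp is a stand-in function under test.
func Clamp(x, lo, hi int) int {
	if x < lo {
		return lo
	}
	if x > hi {
		return hi
	}
	return x
}

func TestClamp(t *testing.T) {
	cases := []struct {
		name      string
		x, lo, hi int
		want      int
	}{
		{"below range", -5, 0, 10, 0},
		{"inside range", 7, 0, 10, 7},
		{"above range", 99, 0, 10, 10},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) { // one subtest per table row
			if got := Clamp(tc.x, tc.lo, tc.hi); got != tc.want {
				t.Errorf("Clamp(%d, %d, %d) = %d, want %d",
					tc.x, tc.lo, tc.hi, got, tc.want)
			}
		})
	}
}
```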


Container / Kubernetes

Cgroups (control groups). A Linux kernel feature for limiting and accounting CPU/memory/IO usage of process groups. The "resource limits" half of container isolation.

CRI (Container Runtime Interface). The Kubernetes API kubelet uses to talk to a container runtime (containerd, CRI-O). The runtime in turn calls a low-level runtime (runc, crun) via OCI.

CSI (Container Storage Interface). Kubernetes' pluggable storage API. Implemented by EBS, GCE PD, Ceph, Rook, etc.

CNI (Container Network Interface). Kubernetes' pluggable networking API. Implemented by Calico, Cilium, Flannel.

Namespace (Linux). A kernel mechanism for partitioning views of system resources (PIDs, network, mounts, UIDs, ...). The "what does the container see" half of container isolation.

OCI (Open Container Initiative). The specs defining container image format, runtime contract, and distribution protocol. What docker pull is actually doing under the hood.

Reconcile loop. The Kubernetes controller pattern: observe actual state, compare to desired state, act to reconcile. Every controller and operator is a reconcile loop.

Sidecar. A second container in the same pod that augments the main one (logging, proxying, service-mesh data plane).


AI systems

Attention. The transformer mechanism softmax(QKᵀ/√d)V. Quadratic in sequence length without optimization (FlashAttention removes the quadratic memory cost).

Autograd. Automatic differentiation - the system that computes gradients by recording the forward pass and applying the chain rule backwards through it. PyTorch's autograd, JAX's grad, TensorFlow's GradientTape.

BF16 / FP16 / FP8 / FP4. Floating-point formats used in ML training/inference. BF16 has FP32's exponent range with FP16's bit width - the modern training default.

FlashAttention. A streaming-reduction algorithm that computes attention without materializing the N×N softmax matrix in HBM. The canonical example of advanced kernel fusion.

Inference. Running a trained model in production. As opposed to training. Different cost model - memory-bandwidth-bound at batch=1, compute-bound at high batch.

KV cache. In autoregressive transformer inference, the K and V tensors cached for every previously decoded token, so each new token only computes attention for the latest position instead of re-running the whole prefix.

Kernel fusion. Combining multiple operators into a single GPU kernel to eliminate HBM round-trips. The single highest-leverage optimization in deep-learning compilers. See AI Systems Deep Dive 12.

LoRA. Low-Rank Adaptation - fine-tuning by training small low-rank deltas instead of full weights. The default parameter-efficient fine-tuning technique.

MoE (Mixture of Experts). A model architecture where each token activates only a subset of "expert" sub-networks, decoupling parameter count from per-token compute.

Quantization. Reducing the bit-width of weights/activations (FP32 → INT8 / INT4 / FP8) to save memory and bandwidth. Inference-side default in production LLM serving.

RAG (Retrieval-Augmented Generation). Prepending retrieved documents to a model's context to ground its answers. The dominant production pattern for LLM applications over private data.

Tensor core. Specialized matmul units on modern NVIDIA GPUs (since Volta). The reason FP16/BF16/FP8 training reaches peak throughput.


Observability

Flame graph. A visualization invented by Brendan Gregg: width = time spent, stack of bars = call stack. The right default for CPU, allocation, lock-contention, and wall-clock profiles. See observability.

JFR (Java Flight Recorder). The JVM's built-in production profiler. ~1% overhead, always-on capable, captures GC pauses, allocations, JIT events, locks, I/O.

OpenTelemetry. The cross-language standard for traces and (increasingly) metrics and logs. The right default for new instrumentation.

Prometheus / OpenMetrics. The de facto pull-based metrics format and protocol.

pprof. Go's built-in profiler and profile format (the format is consumed by other profiling tools as well). CPU, heap, goroutine, mutex, block.
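
The usual way to expose it in a long-running Go service, assuming you are happy to serve the debug endpoints on a loopback port:

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on DefaultServeMux
)

func main() {
	go func() {
		// Profiles are then fetched with, for example:
		//   go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()
	select {} // stand-in for the real service's work
}
```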

Trace ID. A request-scoped identifier propagated through every service hop. Joins traces ↔ metrics ↔ logs in observability backends.


Networking

epoll. Linux's I/O multiplexing syscall - register interest in many file descriptors, get notified when any is ready. The foundation under Go's netpoller, Tokio, asyncio, and Node.js's libuv on Linux.

io_uring. Modern Linux I/O (since 5.1) using shared submission and completion ring buffers between userspace and kernel. Reduces syscall overhead vs epoll. Tokio and some libuv configurations opt in.

kqueue. BSD/macOS equivalent of epoll. Same role.

gRPC. Google's high-performance RPC framework. Protobuf for the schema, HTTP/2 for transport, code generation per language. Schema-first, streaming-native, binary.

TLS termination. Decrypting TLS at one point in the request path (typically an Ingress or load balancer) so upstream services see plain HTTP. Cheap upstream, single cert-rotation point.

Backpressure. A flow-control signal that says "I can't accept more right now." TCP windows are the canonical example. Reactive Streams in Java, Tokio's Sink, asyncio queues, Go channels (full = blocks sender) all express it.
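
The Go-channel form of the idea, sketched: a bounded buffer lets the producer run ahead a little, then blocks it until the consumer catches up:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	queue := make(chan int, 4) // capacity = how far the producer may run ahead

	go func() {
		for i := 0; i < 10; i++ {
			queue <- i // blocks once the buffer is full: backpressure on the producer
		}
		close(queue)
	}()

	for item := range queue {
		time.Sleep(10 * time.Millisecond) // deliberately slow consumer
		fmt.Println("processed", item)
	}
}
```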


Async runtimes

Continuation. A reified call stack - the state needed to resume a paused computation. Underpins virtual threads (Loom), async/await desugaring.

Color problem (function coloring). Some languages split functions into "async" and "non-async" worlds: only an async fn can await another async fn, so asyncness propagates up the call chain. Rust, Python, and JavaScript have this; Go and post-Loom Java don't.

Executor. The component that polls futures or runs tasks. Tokio's runtime, Java's ForkJoinPool, asyncio's event loop, Go's GMP scheduler.

Cooperative scheduling. Tasks yield control voluntarily (at await points or blocking calls). Python asyncio, Rust async/await, Java virtual threads. Contrast preemptive scheduling (OS threads; Go layers asynchronous preemption on top of its cooperative yield points).

Reactor pattern. Single event loop demultiplexes I/O events to handler callbacks. Underlies Node.js, asyncio's event loop, Netty's NioEventLoopGroup.


Distributed systems (consensus + replication)

Quorum. A majority of replicas required to make progress. For N replicas, quorum = ⌊N/2⌋ + 1. Survives ⌊(N-1)/2⌋ failures.

Linearizability. The strongest consistency model: every operation appears to happen at a single point between its invocation and completion. Implies real-time order. Distinct from serializability (which is the transactional analogue).

Log replication. Distributed systems' workhorse pattern. Leader writes an append-only log; followers replicate; commits happen when a quorum has acknowledged. Raft, Paxos, Kafka all use it.

Split brain. Two halves of a partitioned cluster each electing their own leader, accepting writes, diverging. Quorum-based systems prevent this; non-quorum systems must reconcile after the partition heals.


Containers & deployment

OCI (Open Container Initiative). Standards for container image format, runtime contract, and distribution protocol. Docker, Podman, containerd, runc all implement OCI.

Multi-stage build. A Dockerfile with multiple FROM stages. Build artifacts in one stage; copy just the result into a minimal final stage. The single most impactful image-slimming technique.

Distroless. Google-published base images containing only the runtime (or nothing at all) - no shell, no package manager. Tiny attack surface; harder to debug.

Buildpacks. A reproducible image-build system that doesn't require writing Dockerfiles. pack build infers the build from your source. Cloud-Foundry-origin, now CNCF.

Sidecar. A second container in the same pod that augments the main one (logging, proxying, service-mesh data plane).

Helm chart. A package of templated Kubernetes manifests + a values file. Helm's install/upgrade/rollback make complex apps a one-command deploy.

Operator. A controller-pattern app that watches custom resources and reconciles desired state. Examples: cert-manager (Certificates), prometheus-operator (Prometheus), postgres-operator (PostgresCluster).

GitOps. Pattern where cluster state lives in a git repo; a controller (Argo CD, Flux) reconciles cluster to match git. Push to git → cluster converges. Auditable, rollback = git revert.

Service mesh. A layer that handles service-to-service communication features (mTLS, retries, circuit breaking, observability) outside the app. Istio, Linkerd, Cilium Service Mesh. Implemented via sidecars or eBPF.

Ingress. Kubernetes' L7 routing resource. Configures an ingress controller (nginx, Traefik) to route HTTP traffic to services based on host/path. Newer alternative: Gateway API.


Performance methodology

Amdahl's law. Speedup bound: if a fraction s of the work is inherently serial, the maximum speedup is 1/s. A task that is 10% serial tops out at 10× speedup regardless of core count.
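
In full, for a serial fraction s and N processors (the 1/s bound above is the N → ∞ limit):

$$S(N) = \frac{1}{s + \dfrac{1 - s}{N}}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{s}$$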

Coordinated omission. Naive latency measurement misses the time of requests that were never sent because the previous one was in flight. Real percentile measurement (HdrHistogram-style) accounts for this. Made famous by Gil Tene.

p99 / p99.9 latency. The 99th / 99.9th percentile of latency. Captures tail behavior that means / medians hide. Always look at percentiles, not averages.

RED method. Service-level metrics: Rate (req/s), Errors (failed req/s), Duration (latency distribution). The minimum set of panels on any service dashboard.

USE method. Resource-level metrics: Utilization, Saturation, Errors. Per CPU, memory, disk, network. The complement of the RED method, used for system-level diagnosis.

Cache-line. The unit (typically 64 bytes) that the CPU fetches from memory at a time. False sharing happens when two unrelated variables share a cache line and are mutated from different cores.
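
A Go sketch of the fix, assuming 64-byte cache lines - pad the hot fields so two cores never contend on the same line (type and field names are illustrative):

```go
package perf

import "sync/atomic"

// countersShared: a and b typically land on the same 64-byte line,
// so goroutines updating them from different cores keep invalidating
// each other's cached copy of that line (false sharing).
type countersShared struct {
	a atomic.Int64
	b atomic.Int64
}

// countersPadded: 8 bytes of counter plus 56 bytes of padding keeps
// a and b at least one cache line apart, eliminating the false sharing.
type countersPadded struct {
	a atomic.Int64
	_ [56]byte
	b atomic.Int64
	_ [56]byte
}
```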

Roofline model. Performance-bound visualization: any kernel is bound either by compute (FLOPs/s) or by memory bandwidth (B/s). The "roofline" plot tells you which.
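
As a formula, with arithmetic intensity AI measured in FLOPs per byte moved:

$$P_{\text{attainable}} = \min\bigl(P_{\text{peak}},\ \text{AI} \times BW_{\text{mem}}\bigr)$$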