
Week 18 - Zero-Copy I/O and the Poll-Based Model

18.1 Conceptual Core

  • Zero-copy in the small means avoiding memcpy between buffers. In the large it means retaining the same allocation from kernel boundary through parsing into application data structures.
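  • The poll-based model (epoll on Linux, kqueue on BSD/macOS) returns readiness, not data. The application reads when ready, into a buffer it owns. This is the model mio exposes, and tokio builds on top of it. IOCP on Windows is natively completion-based; mio adapts it to the same readiness interface.
  • io_uring is the alternative completion-based model on modern Linux: the application submits a request, the kernel performs it and signals completion. It can deliver better throughput at high request rates by reducing syscalls per operation, but integrating it cleanly with Rust's borrow model is non-trivial because the kernel holds the buffer until completion (see tokio-uring, glommio, monoio).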
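
To make "readiness, not data" concrete, here is a minimal sketch using only std and a non-blocking loopback socket. The names Attempt, try_read_once, and loopback_demo are hypothetical, and the sleep loop merely stands in for the epoll/kqueue wait that mio or tokio would perform; it is not how a real reactor is written.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::Duration;

/// Outcome of one non-blocking read attempt (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub enum Attempt {
    /// No data yet; a reactor would now wait for a readiness event.
    NotReady,
    /// `n` bytes were placed into the caller-owned buffer.
    Data(usize),
    /// The peer closed the connection.
    Closed,
}

/// Try to read once without blocking, into a non-empty buffer the caller owns.
pub fn try_read_once(stream: &mut TcpStream, buf: &mut [u8]) -> io::Result<Attempt> {
    match stream.read(buf) {
        Ok(0) => Ok(Attempt::Closed),
        Ok(n) => Ok(Attempt::Data(n)),
        Err(e) if e.kind() == io::ErrorKind::WouldBlock => Ok(Attempt::NotReady),
        Err(e) => Err(e),
    }
}

/// Connects a loopback pair, observes "not ready" before anything is sent,
/// then retries until one newline-terminated message has arrived.
pub fn loopback_demo() -> io::Result<(Attempt, Vec<u8>)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let mut client = TcpStream::connect(listener.local_addr()?)?;
    let (mut server, _peer) = listener.accept()?;
    server.set_nonblocking(true)?;

    let mut buf = [0u8; 64];
    // Nothing has been written yet, so this attempt reports NotReady.
    let first = try_read_once(&mut server, &mut buf)?;

    client.write_all(b"ping\n")?;
    let mut received = Vec::new();
    while !received.ends_with(b"\n") {
        match try_read_once(&mut server, &mut buf)? {
            Attempt::Data(n) => received.extend_from_slice(&buf[..n]),
            // Assumption: sleeping stands in for waiting on the OS poller.
            Attempt::NotReady => thread::sleep(Duration::from_millis(1)),
            Attempt::Closed => return Err(io::ErrorKind::UnexpectedEof.into()),
        }
    }
    Ok((first, received))
}
```

The application always supplies the buffer and decides when to read. Under a completion-based API the kernel would instead be handed the buffer up front, which is the source of the ownership friction noted above.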

18.2 Mechanical Detail

  • bytes::Bytes and BytesMut: refcounted byte buffers that support cheap slicing and splitting. Bytes::slice produces a new Bytes that points into the same allocation. The cornerstone of zero-copy parsing pipelines.
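  • AsyncRead/AsyncWrite: the async analogues of Read/Write. The tokio variants take a ReadBuf so that reads can target uninitialized memory without zeroing it first; the futures-rs variants (used by smol) take &mut [u8].
  • Vectored I/O: readv/writev, exposed in std through IoSliceMut and IoSlice (Read::read_vectored, Write::write_vectored). Avoids copying several small buffers into one contiguous buffer just to issue a single write.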
  • sendfile(2) and splice(2): kernel-mediated copy avoidance for proxy workloads. Wrappers exist in nix and rustix.
  • Parsers that borrow from input: nom and winnow produce &'a [u8] references into the source buffer. serde with #[serde(borrow)] on &'a str fields. Combine with Bytes to keep allocations alive.
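
Two of the mechanisms above, shared-allocation slicing and vectored writes, can be sketched with std alone. SharedBytes below is a hypothetical stand-in that imitates the idea behind bytes::Bytes (a refcounted buffer plus a view range); it is not how the bytes crate is implemented, and write_frame is likewise an illustrative name.

```rust
use std::io::{self, IoSlice, Write};
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical stand-in for `bytes::Bytes`: a refcounted buffer and a view range.
#[derive(Clone, Debug)]
pub struct SharedBytes {
    buf: Arc<Vec<u8>>,
    range: Range<usize>,
}

impl SharedBytes {
    /// Takes ownership of `data`; the bytes themselves are not copied.
    pub fn new(data: Vec<u8>) -> Self {
        let range = 0..data.len();
        SharedBytes { buf: Arc::new(data), range }
    }

    /// Returns a sub-view that shares the same allocation.
    /// Only the reference count changes. Panics if `r` is out of bounds.
    pub fn slice(&self, r: Range<usize>) -> Self {
        assert!(r.start <= r.end && r.end <= self.range.len());
        let start = self.range.start + r.start;
        SharedBytes {
            buf: Arc::clone(&self.buf),
            range: start..start + (r.end - r.start),
        }
    }

    /// Borrows the bytes covered by this view.
    pub fn as_slice(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }
}

/// Writes a header and a payload with one vectored call instead of first
/// concatenating them into a temporary buffer.
/// Returns the number of bytes written; a real caller must loop on short writes.
pub fn write_frame<W: Write>(
    out: &mut W,
    header: &[u8],
    payload: &SharedBytes,
) -> io::Result<usize> {
    let parts = [IoSlice::new(header), IoSlice::new(payload.as_slice())];
    out.write_vectored(&parts)
}
```

Slicing a view of a view still points into the original allocation, which is what lets a parser hand out many messages from one read buffer. Whether write_vectored maps to a single writev depends on the writer; types that do not override it fall back to writing the first non-empty slice.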

18.3 Lab - "A Zero-Copy Line Protocol"

Build a server speaking a minimal newline-delimited protocol:

  • Read into a BytesMut with try_read_buf.
  • Parse line-by-line with winnow, yielding &[u8] slices.
  • Push each parsed message into a downstream channel as a Bytes (cloned cheaply, shared with the parser's allocation).
  • Benchmark with wrk or tcpkali. Inspect with perf and confirm __memcpy is not a hot frame.
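
As a starting point for the framing step, the sketch below splits an accumulation buffer into complete lines that borrow from it and reports how many bytes were consumed, so a partial trailing line survives until the next read. It deliberately swaps winnow for plain std slicing and a Vec<u8> for BytesMut so that it runs without dependencies; split_lines and feed_chunks are hypothetical names.

```rust
/// Splits `buf` into complete newline-terminated lines (newline stripped),
/// each borrowed from `buf`, and returns the number of bytes consumed.
/// Bytes after the last '\n' are left for the next read.
pub fn split_lines(buf: &[u8]) -> (Vec<&[u8]>, usize) {
    let mut lines = Vec::new();
    let mut rest = buf;
    while let Some(pos) = rest.iter().position(|&b| b == b'\n') {
        lines.push(&rest[..pos]);
        rest = &rest[pos + 1..];
    }
    (lines, buf.len() - rest.len())
}

/// Hypothetical driver: treats each chunk as one socket read and collects
/// every complete message, including lines split across reads.
pub fn feed_chunks(chunks: &[&[u8]]) -> Vec<Vec<u8>> {
    let mut pending: Vec<u8> = Vec::new();
    let mut messages = Vec::new();
    for chunk in chunks {
        pending.extend_from_slice(chunk);
        let (lines, consumed) = split_lines(&pending);
        // Copying into a Vec is only for this demo. In the lab, each line
        // would become a Bytes handle sharing the read buffer instead.
        messages.extend(lines.into_iter().map(|line| line.to_vec()));
        // Drop the consumed prefix and keep any partial line.
        pending.drain(..consumed);
    }
    messages
}
```

For example, feeding the chunks "PING\nPO" and "NG\n" yields the two messages PING and PONG. In the real lab, the to_vec and drain calls are exactly the copies to eliminate: BytesMut::split_to followed by freeze hands out the consumed prefix as a Bytes without moving the data.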

18.4 Idiomatic & Clippy Drill

  • clippy::read_zero_byte_vec, clippy::unbuffered_bytes, clippy::needless_collect. Each is a hint that you're allocating where a Bytes-style flow would suffice.
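
For orientation, this is roughly the pattern clippy::read_zero_byte_vec is meant to catch, next to one possible fix. Both function names are hypothetical, and the in-memory reader used in the usage note is only a convenient stand-in for a socket.

```rust
use std::io::{self, Read};

/// The flagged pattern: `with_capacity` reserves space but leaves `len() == 0`,
/// so `&mut buf` is an empty slice and the read cannot store anything.
pub fn read_into_empty_vec<R: Read>(mut src: R, cap: usize) -> io::Result<usize> {
    let mut buf: Vec<u8> = Vec::with_capacity(cap);
    // Returns Ok(0) for the in-memory reader used in the usage note.
    src.read(&mut buf)
}

/// One possible fix: give the vector a real length before reading, then
/// trim it to the number of bytes actually read.
pub fn read_into_sized_vec<R: Read>(mut src: R, cap: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; cap];
    let n = src.read(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}
```

Calling read_into_empty_vec(&b"hello"[..], 16) reads zero bytes, while read_into_sized_vec with the same arguments returns the five bytes of "hello". The zero-fill in the fix has its own cost, which is what reading into spare capacity with a BytesMut avoids.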

18.5 Production Hardening Slice

  • Add tokio-console instrumentation (the console-subscriber crate on the server side). Pass --cfg tokio_unstable through RUSTFLAGS, in dev/profiling builds only. Run a 60-second load test, capture it with perf record, and render a flamegraph. Commit the SVG.
