# Week 22 - Distributed Storage Patterns

## 22.1 Conceptual Core
- The core of a distributed storage system is consensus (week 21). The engineering of one is everything around it: durable storage, replication, partitioning, snapshotting, repair, observability.
- Three patterns to know:
  - Replicated state machine (Raft, Paxos; etcd is the canonical example): a single consensus group, and every node holds the full data set. Linearizable, but throughput is limited by the leader.
  - Sharded replicated (CockroachDB ranges, TiKV regions): many consensus groups, one per data shard. Scales horizontally.
  - Eventually consistent (Cassandra, DynamoDB): no consensus on the write path; quorum reads and writes, hinted handoff, and anti-entropy repair instead. A different consistency model entirely.
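For the eventually consistent pattern, the consistency knob is the quorum condition: with N replicas, a write quorum of W and a read quorum of R are guaranteed to overlap whenever R + W > N, so every read contacts at least one replica holding the latest acknowledged write. A minimal sketch (function name is illustrative, not from any particular database):

```go
package main

import "fmt"

// quorumOverlaps reports whether a read quorum of r and a write quorum
// of w out of n replicas must intersect — the condition under which a
// quorum read is guaranteed to observe the latest acknowledged write.
func quorumOverlaps(n, r, w int) bool {
	return r+w > n
}

func main() {
	// Classic Dynamo-style configuration: N=3, W=2, R=2.
	fmt.Println(quorumOverlaps(3, 2, 2)) // true: quorums intersect
	// R=1, W=1 trades consistency for latency: quorums can miss each other.
	fmt.Println(quorumOverlaps(3, 1, 1)) // false
}
```

Note that R + W > N buys read-your-latest-acknowledged-write behavior, not full linearizability; that is why these systems still need anti-entropy repair.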
## 22.2 Mechanical Detail
- WAL discipline: every state-changing op is durably logged before acknowledgment. `fsync` after each batch (or per-op for stricter durability). The WAL is the source of truth for recovery.
- Snapshots: periodic point-in-time captures of the state machine. Truncate the WAL behind them. The snapshot format must be efficient to ship to a recovering follower.
- Membership changes: adding/removing nodes is the hardest correctness boundary. Raft's "joint consensus" handles this. Both `hashicorp/raft` and `etcd-io/raft` provide APIs; do not roll your own.
- Linearizable reads: three options - read from the leader after a heartbeat round (`etcd`'s "linearizable read" with read-index), read from any node with a lease, or read after a no-op append. Each has tradeoffs.
- Storage engine choice: BoltDB (simple, single-writer, great for Raft logs), BadgerDB (LSM-based, higher throughput), Pebble (CockroachDB's RocksDB replacement, the modern choice for high throughput).
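The read-index option can be modeled without a real cluster: the leader captures its current commit index, confirms leadership (a heartbeat round, elided here), and serves the read only once the state machine has applied at least that far. A simplified single-process sketch with invented names:

```go
package main

import (
	"fmt"
	"sync"
)

// node models only the indices relevant to read-index reads.
type node struct {
	mu          sync.Mutex
	commitIndex uint64 // highest log index known committed by a quorum
	applyIndex  uint64 // highest log index applied to the state machine
	data        map[string]string
}

// linearizableGet follows the read-index recipe: capture the commit
// index, confirm leadership, then serve the read only after the state
// machine has applied at least that far.
func (n *node) linearizableGet(key string) string {
	n.mu.Lock()
	readIndex := n.commitIndex // step 1: capture the read index
	n.mu.Unlock()

	// step 2 (elided): heartbeat a quorum to confirm we are still leader.

	// step 3: wait until applyIndex >= readIndex. A real implementation
	// blocks on a notification channel rather than spinning.
	for {
		n.mu.Lock()
		if n.applyIndex >= readIndex {
			v := n.data[key]
			n.mu.Unlock()
			return v
		}
		n.mu.Unlock()
	}
}

func main() {
	n := &node{commitIndex: 7, applyIndex: 7, data: map[string]string{"x": "1"}}
	fmt.Println(n.linearizableGet("x")) // prints "1"
}
```

The point of the recipe: no log append is needed, but the read is still ordered after every write committed before it began.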
## 22.3 Lab: "Harden the KV Store"
Take the week 21 Raft KV and add:
1. Pebble as the storage engine for both the WAL and the state machine.
2. Snapshots every N entries, with InstallSnapshot to recovering followers.
3. Linearizable reads via read-index.
4. Membership changes: add and remove nodes online.
5. Metrics: per-node Raft state, log lag, snapshot duration, apply latency.
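Step 2's trigger logic is easy to isolate: count applied entries, and once the count crosses N, capture the state and compact the log behind the snapshot index. A hedged sketch with invented names (real code would call your Raft library's snapshot and compaction APIs):

```go
package main

import "fmt"

// snapshotter decides when to snapshot and how far to truncate the log.
type snapshotter struct {
	interval      uint64 // snapshot every N applied entries
	lastSnapIndex uint64 // applied index covered by the last snapshot
}

// maybeSnapshot returns (true, truncateUpTo) when appliedIndex has
// advanced at least interval entries past the previous snapshot.
// Entries up to and including truncateUpTo are covered by the snapshot
// and can be compacted from the WAL.
func (s *snapshotter) maybeSnapshot(appliedIndex uint64) (bool, uint64) {
	if appliedIndex-s.lastSnapIndex < s.interval {
		return false, 0
	}
	s.lastSnapIndex = appliedIndex
	return true, appliedIndex
}

func main() {
	s := &snapshotter{interval: 1000}
	for _, idx := range []uint64{500, 1000, 1999, 2000} {
		ok, upTo := s.maybeSnapshot(idx)
		fmt.Println(idx, ok, upTo) // snapshots fire at 1000 and 2000
	}
}
```

Keep a recovery margin in practice: truncating right up to the snapshot index forces any follower that is even one entry behind to take a full snapshot transfer instead of a cheap log catch-up.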
## 22.4 Idiomatic & golangci-lint Drill

`errcheck`, `errorlint`, `wrapcheck`. Distributed-systems code is almost entirely error handling; lint rigor is non-optional.
## 22.5 Production Hardening Slice
- Add a Jepsen-style "nemesis" goroutine to your test harness that randomly partitions, pauses, and restarts nodes. Verify linearizability over 1M operations.