Saltar a contenido

Week 21 - Implementing Complex Data Structures From Scratch

21.1 Conceptual Core

  • The std collections (Vec, HashMap, BTreeMap, VecDeque) are excellent but generic. Real systems regularly need: lock-free hash tables, B-trees specialized to a key shape, intrusive linked lists, slab allocators, MPMC queues with bounded backpressure, log-structured merge trees, skip lists.
  • Building these from scratch once teaches the patterns: cache-friendly layout, intrusive vs extrusive, the cost of Box, the role of NonNull in linked structures.

21.2 Mechanical Detail-three target structures

(a) A B-Tree map. - Branching factor ~6 by default in std; tune for your key/value sizes. - Inner nodes store keys + child pointers; leaf nodes store keys + values. - Fixed-capacity arrays ([MaybeUninit<K>; B]) avoid per-key allocations. - The split/merge invariants are the entire algorithmic content; the Rust difficulty is expressing the invariants in safe code.

(b) A lock-free concurrent hash map. - Study dashmap (sharded, simpler) and flurry (Java-ConcurrentHashMap port using crossbeam-epoch). Build a sharded version yourself first; only attempt epoch-based after week 22's allocator work. - Sharding: N shards, each a RwLock<HashMap<K, V>>. Hash key, mod-shard, lock the shard. Trivially correct, scales linearly until shard contention. - Lock-free: open addressing with atomic CAS on slots, epoch-based reclamation for resizes. This is hard.

(c) An intrusive doubly-linked list. - The Linux kernel pattern: nodes embed prev/next pointers, list operations are O(1) and zero-allocation. - In Rust, this requires Pin (nodes can't move once linked) and careful unsafe code. Read intrusive-collections and tokio::sync::notify's waiter list as references.

21.3 Lab-"Pick One and Ship It"

Implement one of the three to publishable quality: - Property-tested against std equivalent. - Miri-clean. - Loom-verified (for the lock-free). - Criterion-benchmarked against std/dashmap/intrusive-collections. - README explains the algorithmic choice and tradeoffs.

21.4 Idiomatic & Clippy Drill

  • clippy::missing_safety_doc, clippy::undocumented_unsafe_blocks, clippy::pedantic group. By now these should fire rarely.

21.5 Production Hardening Slice

  • Publish to a private registry (or a personal GitHub Packages registry). Tag a v0.1.0. Run cargo semver-checks against the tag from this point on.

Comments