
Capstone Projects: Three Tracks, One Choice

The Month 6 capstone is the deliverable that converts this curriculum from study into evidence. Pick one track. The work performed here is the work you describe in interviews and link from a portfolio.


Track 1: Compiler / Tooling

Outcome: a merged PR (or one in advanced review) against rust-lang/rust, rust-clippy, rust-analyzer, or cargo.

Suggested scopes (ranked by tractability)

  1. Diagnostic improvement in rustc. Pick an A-diagnostics issue with a clear reproduction. Improve the error: more accurate span, structured suggestion, better wording. Realistic effort: 20–40 hours including bootstrap, review iterations, UI test churn.
  2. New clippy lint. The clippy issue tracker maintains a queue of "lint requests." Pick one tagged good-first-issue. Implement, test (UI tests + dogfood the lint against the rustc tree), document. Realistic effort: 30–60 hours.
  3. A new MIR optimization pass (advanced). Choose a narrow, well-bounded transform, e.g. a peephole simplification of a specific MIR pattern. Profile its impact with rustc-perf. Realistic effort: 60–120 hours plus substantial reviewer hand-holding; treat this as a stretch goal.
  4. rust-analyzer feature. Implement a code action or completion improvement. RA's architecture is extremely well-documented; the PR loop is fast.

Acceptance criteria

  • A PR exists, is linked from your portfolio, and has at least one round of review feedback addressed.
  • A short write-up (CAPSTONE_NOTES.md) documenting: what you changed, why, what you learned about the compiler internals, and what reviewers pushed back on.
  • Your local fork has a working stage-1 build of rustc with your patch applied.

Skills exercised

  • Months 4 (macros / unsafe), 6.23 (compiler internals).
  • The hardening discipline matters less here; the deliverable is upstream code, not a service.

Track 2: High-Performance Fintech (Limit-Order-Book Matching Engine)

Outcome: a benchmarked, fuzz-tested matching engine for limit and market orders across multiple symbols, single-process, with sub-microsecond p99 hot-path latency on commodity x86_64.

Functional spec

  • Order types: limit (GTC, IOC, FOK), market, cancel, modify.
  • Matching policy: price-time priority. Partial fills allowed. Self-trade prevention configurable.
  • Multi-symbol: an Engine owns N independent symbol books; symbols may be sharded across worker threads.
  • Wire format: a binary protocol (your design or a subset of FIX/SBE).
  • Output: an event stream (Filled, PartiallyFilled, Cancelled, Rejected, BookUpdate) consumed by downstream feed handlers.

Non-functional spec

  • Latency: p50 < 200 ns, p99 < 1 µs for the hot path (order in → match → event out), measured under sustained 1 M orders/sec.
  • Throughput: ≥ 1 M orders/sec sustained on a single symbol on a single core.
  • Determinism: identical inputs produce identical event sequences. No reliance on HashMap iteration order in the hot path; use deterministic structures.
  • Fault tolerance: panic-safe (panic = "abort" is acceptable; document operational implications). Persistent log for replay.
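One concrete way to satisfy the no-HashMap-iteration rule is an index-addressed slab: order ids map to dense slot indices, removal pushes the slot onto a LIFO free list, and any iteration walks slots in a fixed order. A minimal sketch; the type and its API are illustrative, not prescribed:

```rust
// Index-addressed slab: O(1) insert/remove, deterministic iteration order,
// no hashing on the hot path. Illustrative sketch.
struct Slab<T> {
    slots: Vec<Option<T>>,
    free: Vec<u32>, // LIFO free list keeps slot reuse deterministic
}

impl<T> Slab<T> {
    fn new() -> Self { Slab { slots: Vec::new(), free: Vec::new() } }

    fn insert(&mut self, value: T) -> u32 {
        match self.free.pop() {
            Some(idx) => { self.slots[idx as usize] = Some(value); idx }
            None => { self.slots.push(Some(value)); (self.slots.len() - 1) as u32 }
        }
    }

    fn remove(&mut self, idx: u32) -> Option<T> {
        let v = self.slots.get_mut(idx as usize)?.take();
        if v.is_some() { self.free.push(idx); }
        v
    }

    /// Iteration is always in slot order: identical inputs, identical order.
    fn iter(&self) -> impl Iterator<Item = (u32, &T)> {
        self.slots.iter().enumerate()
            .filter_map(|(i, s)| s.as_ref().map(|v| (i as u32, v)))
    }
}
```

In production you would pre-size `slots` to the slab capacity so the hot path never reallocates.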

Architecture sketch

  • Hot path is single-threaded per symbol. SPSC ring buffer on input, SPSC on output. Cross-thread coordination only at session boundaries.
  • Order book: paired sorted structures (often BTreeMap<Price, OrderQueue> for asks, mirrored for bids). For ultimate latency: an array of price levels with a sparse bitmap; this is the rabbit hole Aeron-style designs go down.
  • Allocator: mimalloc global, plus per-symbol bump arenas for short-lived order metadata.
  • Memory layout: #[repr(C)] orders, padded to cache-line boundaries; pre-allocated slabs.
  • No async on the hot path. Async is for session/admin paths only. Mixing the two is the most common architectural mistake in this space.
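The BTreeMap shape above can be sketched as a naive price-time-priority reference book in a few dozen lines. This is deliberately the slow, obvious version (no arenas, no fixed layouts); its real job is to serve as the oracle the optimized book is fuzz-compared against. All names are illustrative:

```rust
use std::collections::{BTreeMap, VecDeque};

// Naive price-time-priority reference book: price -> FIFO queue of (order_id, qty).
// Illustrative oracle only; the optimized engine replaces this with arenas
// and cache-line-padded slabs.
#[derive(Default)]
struct Book {
    bids: BTreeMap<u64, VecDeque<(u64, u64)>>, // best bid = highest key
    asks: BTreeMap<u64, VecDeque<(u64, u64)>>, // best ask = lowest key
}

#[derive(Debug, PartialEq)]
struct Fill { maker: u64, taker: u64, price: u64, qty: u64 }

impl Book {
    /// Insert a GTC limit buy; returns fills in deterministic price-time order.
    fn limit_buy(&mut self, id: u64, price: u64, mut qty: u64) -> Vec<Fill> {
        let mut fills = Vec::new();
        // Cross against asks at or below our limit, best (lowest) price first.
        while qty > 0 {
            let Some((&ask_px, _)) = self.asks.iter().next() else { break };
            if ask_px > price { break; }
            let queue = self.asks.get_mut(&ask_px).unwrap();
            while qty > 0 {
                let Some(front) = queue.front_mut() else { break };
                let traded = qty.min(front.1);
                fills.push(Fill { maker: front.0, taker: id, price: ask_px, qty: traded });
                qty -= traded;
                front.1 -= traded;
                if front.1 == 0 { queue.pop_front(); } // time priority: oldest first
            }
            if queue.is_empty() { self.asks.remove(&ask_px); }
        }
        // Any remainder rests on the bid side at its limit price.
        if qty > 0 {
            self.bids.entry(price).or_default().push_back((id, qty));
        }
        fills
    }

    fn limit_sell(&mut self, id: u64, price: u64, qty: u64) {
        // Resting sell only (no crossing) to keep the sketch short.
        self.asks.entry(price).or_default().push_back((id, qty));
    }
}
```

Differential fuzzing then feeds the same command stream to this oracle and to the fast book and asserts identical event sequences.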

Test rigor

  • Property tests: proptest invariants on the book. Total quantity is preserved across fills; the best bid never exceeds the best ask except transiently during a match; time priority is preserved within a price level.
  • Fuzz: cargo-fuzz with arbitrary-derived order generators. The corpus must include high-volume sequences with cancels and modifies.
  • Loom: any cross-thread sync (admin → engine, engine → publisher) must be Loom-verified.
  • Bench: criterion with regression detection in CI; flamegraphs committed; perf stat outputs (cycles, instructions, IPC, cache-misses) tracked over time.
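The first two invariants can be hand-checked before wiring up proptest: every unit of quantity entering a price level must be accounted for by fills plus resting quantity, and arrival order must never be reordered. A dependency-free sketch of that check, driving a single toy price level with a hand-rolled LCG (the real version would drive the actual book with proptest-generated command sequences):

```rust
use std::collections::VecDeque;

// Hand-rolled property check (no proptest dependency): drive one price level
// with pseudo-random inserts and partial fills, asserting after every step
// that time priority holds and quantity is conserved. Illustrative sketch.
fn check_level_invariants(seed: u64, steps: u32) {
    let mut rng = seed;
    let mut next = || {
        // LCG constants from Knuth's MMIX; any decent generator works here.
        rng = rng.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
        rng >> 33
    };

    let mut level: VecDeque<(u64, u64)> = VecDeque::new(); // (arrival_seq, qty)
    let (mut seq, mut submitted, mut filled) = (0u64, 0u64, 0u64);

    for _ in 0..steps {
        if next() % 2 == 0 {
            let qty = next() % 9 + 1;
            level.push_back((seq, qty));
            seq += 1;
            submitted += qty;
        } else if let Some(front) = level.front_mut() {
            let take = next() % front.1 + 1; // fill part or all of the front order
            front.1 -= take;
            filled += take;
            if front.1 == 0 { level.pop_front(); }
        }
        // Invariant 1: time priority — arrival sequence strictly increasing.
        assert!(level.iter().zip(level.iter().skip(1)).all(|(a, b)| a.0 < b.0));
        // Invariant 2: conservation — quantity is neither created nor destroyed.
        let resting: u64 = level.iter().map(|o| o.1).sum();
        assert_eq!(submitted, filled + resting);
    }
}
```

Swapping the LCG for proptest's generators and the toy level for the real book turns this directly into the required property suite.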

Hardening pass

  • LTO fat, codegen-units 1, panic abort, target-cpu=native (for the deployed-on-known-hardware case).
  • PGO with a representative replay workload.
  • BOLT post-link.
  • Deterministic build via a Docker image pinned to a content digest.
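Most of this pass lives in the release profile. A sketch of the Cargo.toml fragment, with the caveat that PGO and BOLT run outside Cargo (rustc's `-Cprofile-generate`/`-Cprofile-use` flags, then llvm-bolt on the linked binary):

```toml
# Cargo.toml — release profile for the hardening pass
[profile.release]
lto = "fat"
codegen-units = 1
panic = "abort"

# target-cpu is a per-build rustc flag, not a Cargo.toml key, e.g.:
#   RUSTFLAGS="-C target-cpu=native" cargo build --release
```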

Acceptance criteria

  • Public repo with the above.
  • A README that includes a flamegraph, a perf stat table, and a latency CDF.
  • A THREAT_MODEL.md covering the inputs you do and do not validate.
  • An interview-defensible answer to: "What does your worst-case allocation pattern look like under a 100× burst?"

Skills exercised

  • Months 3 (concurrency), 4 (unsafe / FFI for the wire codec), 5 (production architecture, though the hot path skips most of the hexagonal layering), 6.21 (custom data structures), 6.22 (allocators).

Track 3: Kernel (a rust-for-linux Character Device)

Outcome: a working out-of-tree Rust kernel module implementing a non-trivial character device, with KUnit tests, building cleanly against a recent mainline kernel.

Functional spec

  • A character device (/dev/<yourname>) that exposes an in-kernel ring buffer.
  • Operations: read (drains the ring), write (appends), ioctl for resize/clear/stats, mmap for zero-copy access (stretch).
  • Multi-reader / multi-writer with appropriate kernel synchronization (SpinLock, Mutex from the kernel crate, not std).
  • Sysfs entries for runtime tuning.
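The ring's semantics (read drains, write appends, resize may drop data) are worth modelling in ordinary userspace Rust first, so the kernel port changes only the FFI surface and locking, not the logic. A sketch under that assumption; the `Ring` type and its truncate-on-overflow policy are illustrative choices:

```rust
// Userspace model of the in-kernel ring: read drains, write appends, writes
// that would overflow are truncated to a short write. Illustrative sketch —
// the kernel version swaps Vec for fallible allocation and guards every
// method with the kernel crate's locks.
struct Ring {
    buf: Vec<u8>,
    cap: usize,
}

impl Ring {
    fn new(cap: usize) -> Self { Ring { buf: Vec::new(), cap } }

    /// Append as much of `data` as fits; returns bytes accepted
    /// (mirrors a short write from the char device's write handler).
    fn write(&mut self, data: &[u8]) -> usize {
        let n = data.len().min(self.cap - self.buf.len());
        self.buf.extend_from_slice(&data[..n]);
        n
    }

    /// Drain up to `max` bytes (mirrors copy_to_user in the read handler).
    fn read(&mut self, max: usize) -> Vec<u8> {
        let n = max.min(self.buf.len());
        self.buf.drain(..n).collect()
    }

    /// ioctl(resize): shrinking drops the oldest bytes first.
    fn resize(&mut self, new_cap: usize) {
        if self.buf.len() > new_cap {
            self.buf.drain(..self.buf.len() - new_cap);
        }
        self.cap = new_cap;
    }
}
```

Having this model also gives you the oracle for the userspace selftest scripts: drive /dev/&lt;yourname&gt; and the model with the same operations and compare.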

Why this scope

  • Touches every cross-FFI surface: char device registration, file operations, copy_from_user/copy_to_user, sysfs, locking.
  • Forces you to read kernel-side Rust idioms (Box::try_new, fallible alloc, Pin<&mut Self> everywhere, Arc-equivalents).
  • The rust-for-linux toolchain itself is a learning surface: pinned rustc, custom libcore subset, no std.

Build environment

  • Linux ≥ 6.8 (Rust support is stable enough for out-of-tree work).
  • rustup toolchain link kernel <path> to point at the kernel-supported rustc.
  • A local kernel build with CONFIG_RUST=y, CONFIG_SAMPLES_RUST=y.
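With that config in place, a typical out-of-tree loop looks roughly like the following; the paths are placeholders, and `make rustavailable` is the kernel's own check that the pinned Rust toolchain is usable:

```shell
# Inside the kernel tree (assumes a standard LLVM build with CONFIG_RUST=y):
make LLVM=1 rustavailable        # verify the pinned rustc/bindgen are found
make LLVM=1 -j"$(nproc)"         # build the kernel itself

# Inside the module directory, building against that tree
# (KDIR is the conventional out-of-tree variable name):
make LLVM=1 KDIR=/path/to/kernel-tree
```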

Test rigor

  • KUnit-based unit tests inside the module.
  • selftest-style scripts running the device through real read/write/ioctl calls from userspace.
  • Stress test: N concurrent readers and writers with taskset pinning, watch for KASAN/KCSAN reports.

Hardening pass

  • Kernel-side: KASAN (kernel address sanitizer), KCSAN (concurrency sanitizer), lockdep enabled in your test kernel.
  • Module-side: every unsafe block carries a // SAFETY: comment justifying the kernel invariants.
  • A dmesg-clean run on insertion, exercise, and removal.

Acceptance criteria

  • The module builds, loads, is exercised end-to-end, and unloads with no KASAN/KCSAN/lockdep warnings.
  • A PR-ready patch series formatted with git format-patch (even if not submitted upstream).
  • A KERNEL_NOTES.md describing the locking model, the failure modes you considered, and the explicit reason you chose SpinLock vs Mutex at each site.

Skills exercised

  • Months 4 (unsafe + FFI to the kernel C API), 6.22 (no_std), 6.23 (compiler internals indirectly via the pinned toolchain).

Cross-Track Requirements

Regardless of track:

  • Hardening workspace integrated. The hardening/ template from Appendix A applies.
  • Architectural Decision Records (ADRs). At least three for the capstone, each ~1 page.
  • Threat model. One page minimum, no matter the track.
  • Defense readiness. You should be able to walk a reviewer through the code in 45 minutes and answer "what fails first under load / fuzzing / a malicious input / a pathological kernel state?"

The track choice signals career direction: compiler track for tooling/PL roles, fintech for HFT/exchange/crypto roles, kernel for OS/embedded/security roles. Do not pick based on what looks easiest; pick based on where you want the next interview loop.
