Month 1 - Foundations: Compute Hierarchy, Tensors, Autograd, Training Loops
Goal: by the end of Week 4 you can (a) sketch the memory and compute hierarchy from CPU register to multi-node cluster and attach approximate bandwidth and latency numbers to each level, (b) implement matrix multiplication three ways and measure the performance differences, (c) implement reverse-mode automatic differentiation from scratch, and (d) write an honest training loop that handles checkpointing, mixed precision, and metrics.
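To make goal (b) concrete, here is a minimal sketch of "matrix multiplication three ways": a pure-Python triple loop, a row-at-a-time vectorized version, and a full BLAS call via NumPy's `@`. The function names and the timing harness are illustrative, not prescribed by the course; the point is that all three compute the same product while differing in speed by orders of magnitude.

```python
import time
import numpy as np

n = 64  # small enough that the pure-Python loop finishes quickly
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

def matmul_loops(A, B):
    """Textbook triple loop: O(n^3) scalar ops, no vectorization."""
    n, m, p = A.shape[0], A.shape[1], B.shape[1]
    C = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += A[i, k] * B[k, j]
            C[i, j] = s
    return C

def matmul_rows(A, B):
    """One level up: vectorize the inner products one row at a time."""
    return np.stack([Ai @ B for Ai in A])

def matmul_blas(A, B):
    """Hand the whole product to the BLAS library behind NumPy's @."""
    return A @ B

for f in (matmul_loops, matmul_rows, matmul_blas):
    t0 = time.perf_counter()
    C = f(A, B)
    dt = time.perf_counter() - t0
    print(f"{f.__name__:14s} {dt * 1e3:8.2f} ms")
    assert np.allclose(C, A @ B)  # all three agree numerically
```

Running this and then doubling `n` a few times is a useful exercise: the loop version scales cubically in wall-clock time, while the BLAS version stays fast until the matrices stop fitting in cache.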
This is the beginner ramp. If you already do all four, skim and proceed to Month 2. If you don't, this is the hardest month; the concepts here are referenced everywhere else.
Weeks
- Week 1 - The Compute Hierarchy and the Cost Model
- Week 2 - Linear Algebra Refresh, BLAS, NumPy
- Week 3 - Tensors, Autograd, the Gradient Tape
- Week 4 - The Honest Training Loop
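To preview the Week 3 material, a "gradient tape" can be sketched in a few dozen lines: each operation records its parents and a local backward rule, and `backward()` replays the graph in reverse topological order. The `Value` class below is an illustrative toy, not the course's reference implementation, and handles only scalar `+` and `*`.

```python
class Value:
    """Scalar node in a computation graph; records parents and a local backward rule."""

    def __init__(self, data, parents=(), backward=lambda g: ()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = backward  # maps upstream grad -> one grad per parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a+b)/da = d(a+b)/db = 1, so the upstream grad passes through unchanged.
        return Value(self.data + other.data, (self, other), lambda g: (g, g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a*b)/da = b and d(a*b)/db = a.
        return Value(self.data * other.data, (self, other),
                     lambda g: (g * other.data, g * self.data))

    def backward(self):
        # Topologically order the graph, then push gradients from output to leaves.
        order, seen = [], set()

        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for p, g in zip(v._parents, v._backward(v.grad)):
                p.grad += g  # accumulate: a node may feed several consumers

x, y = Value(3.0), Value(4.0)
z = x * y + x          # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # 5.0 3.0
```

The same tape structure, generalized from scalars to tensors, is what frameworks like PyTorch build for you; implementing it by hand once is the fastest way to stop treating `loss.backward()` as magic.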