Month 7-Week 2: Frontier paper + substantive build¶
Week summary¶
- Goal: Read the DeepSeek-V3 technical report (arguably the best public training writeup of 2024–2025). Make substantive progress on the track project milestone: meaningful progress, not just commits.
- Time: ~10 h over 3 sessions.
- Output: Frontier paper notes; track repo in a meaningfully advanced state.
- Sequences relied on: track-specific (12 / 11 / 14).
Why this week matters¶
Reading a frontier paper monthly keeps your mental model current. DeepSeek-V3 is unusually well written: the architectural and training details are all there. Knowing what 2024–2025 frontier teams actually do separates you from engineers stuck in 2022 transformer mental models.
Prerequisites¶
- M07-W01 complete with v0.0.1 tagged.
Recommended cadence¶
- Session A-Tue/Wed evening (~3 h): DeepSeek-V3 deep read
- Session B-Sat morning (~4 h): track build
- Session C-Sun afternoon (~3 h): track build + short progress post
Session A-DeepSeek-V3 technical report¶
Goal: Read the architecture and training sections deeply. Take notes on what surprised you.
Part 1-First pass (75 min)¶
Find the report on arXiv (search "DeepSeek-V3 technical report"). Read these sections:
- Abstract + Introduction.
- Model Architecture (note MLA, Multi-head Latent Attention, and the MoE design).
- Training (FP8 mixed precision; load balancing; pipelining).
- Pre-training data + tokenizer.
- Post-training (SFT + RL with GRPO).
Don't aim for full understanding. Orient.
Part 2-Second pass with notes (75 min)¶
Write paper_notes/deepseek_v3.md covering:
- MLA: how is it different from standard attention? What does it save?
- MoE design: how many experts? How is routing done?
- FP8 training: what's hard about it? What did they do?
- GRPO: brief sketch (you'll see it again in M08).
- 5 things you didn't know before reading.
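For the GRPO bullet, the central idea fits in a few lines: sample a group of responses to the same prompt, score each one, and normalize every reward against the group's mean and standard deviation, so no learned value model is needed. The sketch below shows only that advantage computation, assuming scalar rewards. The function name, the `eps` guard, and the choice of population standard deviation are illustrative assumptions. The clipped policy objective and the KL penalty are omitted.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples drawn for a single prompt.

    Illustrative sketch only: `eps` and the use of the population
    standard deviation are assumptions, not details taken from the report.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # spread of rewards inside the group
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Answers that beat the group average get a positive advantage and the rest get a negative one. If every answer in the group scores the same, all advantages are zero and that prompt contributes no learning signal.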
Part 3-Translate to your track (30 min)¶
How does anything in DeepSeek-V3 inform your track project?
- (A) Their eval methodology: what's different from western labs?
- (B) Did they use agentic methods anywhere in training?
- (C) FP8 + MLA: direct relevance to inference infra.
Even if the answer is "no direct application," seeing how a frontier team thinks about engineering at scale is itself the lesson.
Output of Session A¶
- paper_notes/deepseek_v3.md (~700 words).
- 5 surprises captured.
Session B-Track build, day 1¶
Goal: Substantive feature progress. Cut scope where needed.
Part 1-Pick the day's milestone (15 min)¶
From DESIGN.md, the next milestone. Make it small enough to finish in ~3 hours.
Part 2-Build (180 min)¶
Heads down. Tests where applicable.
Track A example milestones:
- Add a multi-scorer composer.
- Add a BaseModelGradedScorer with a configurable rubric.
- Add caching of LLM-as-judge calls.
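For the caching milestone, one possible starting point is an in-memory cache keyed on a hash of everything that determines the verdict: judge model, rubric, and the response being graded. This is a minimal sketch under assumptions. `JudgeCache`, `judge_fn`, and the stub `fake_judge` are hypothetical names, and a real version would probably persist to disk and include sampling parameters in the key.

```python
import hashlib
import json


class JudgeCache:
    """In-memory cache for LLM-as-judge verdicts, keyed by request content."""

    def __init__(self, judge_fn):
        # judge_fn is a hypothetical callable: (model, rubric, response) -> verdict
        self._judge_fn = judge_fn
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model, rubric, response):
        # Serialize the inputs deterministically, then hash them.
        payload = json.dumps([model, rubric, response])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def grade(self, model, rubric, response):
        key = self._key(model, rubric, response)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        verdict = self._judge_fn(model, rubric, response)
        self._store[key] = verdict
        return verdict


# Stub judge so the sketch runs without any API access.
calls = []


def fake_judge(model, rubric, response):
    calls.append(response)
    return {"score": 1 if response.strip() == "42" else 0}


cache = JudgeCache(fake_judge)
first = cache.grade("judge-model", "Is the answer correct?", "42")
second = cache.grade("judge-model", "Is the answer correct?", "42")
```

The second call is served from the cache, so the stub judge runs only once. Changing the rubric or the model name produces a different key, which keeps stale verdicts from leaking across rubric revisions.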
Track B example milestones:
- Tool registry with a tool-loading API.
- Standardize trajectory logging format.
- Run on 30 SWE-bench-Lite issues; capture success rate.
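For the tool-registry milestone, a first version can be a name-to-callable map with a decorator for registration and a lookup that fails loudly. The sketch below is one possible shape, not a prescribed design. `ToolRegistry`, `register`, `load`, and the `word_count` tool are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Maps tool names to callables so an agent loop can load tools by name."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable] = {}

    def register(self, name: str) -> Callable:
        """Decorator that registers a function under `name`."""
        def decorator(fn: Callable) -> Callable:
            if name in self._tools:
                raise ValueError(f"tool already registered: {name}")
            self._tools[name] = fn
            return fn
        return decorator

    def load(self, name: str) -> Callable:
        """Return the tool registered under `name`, or raise KeyError."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name]

    def names(self) -> List[str]:
        return sorted(self._tools)


registry = ToolRegistry()


@registry.register("word_count")
def word_count(text: str) -> int:
    # Toy tool used only to exercise the registry.
    return len(text.split())


result = registry.load("word_count")("a b c")
```

Rejecting duplicate names at registration time is a deliberate choice here: silently overwriting a tool makes trajectory logs hard to interpret later.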
Track C example milestones:
- Sweep batch sizes: 1, 4, 16, 32. Capture throughput curve.
- Compare AWQ-int4 vs fp16 on accuracy + latency.
- Implement a benchmark harness with concurrent users.
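For the batch-size sweep, the harness can be separated from the model: time a callable at each batch size and report items per second. The sketch below assumes a hypothetical `run_batch(batch_size)` function that performs one forward pass or generation call. `fake_run_batch` is a stand-in that simulates a fixed overhead plus a per-item cost, so the numbers it produces are not real measurements.

```python
import time


def sweep_batch_sizes(run_batch, batch_sizes=(1, 4, 16, 32), repeats=3):
    """Return {batch_size: items_per_second}, using the best of `repeats` timings."""
    results = {}
    for batch_size in batch_sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run_batch(batch_size)
            best = min(best, time.perf_counter() - start)
        # Guard against a zero-duration measurement on very fast callables.
        results[batch_size] = batch_size / max(best, 1e-9)
    return results


def fake_run_batch(batch_size):
    # Stand-in for a real inference call: fixed overhead plus per-item cost.
    time.sleep(0.002 + 0.0005 * batch_size)


curve = sweep_batch_sizes(fake_run_batch)
for batch_size, throughput in curve.items():
    print(f"batch={batch_size:>2}  throughput={throughput:8.1f} items/s")
```

For an LLM server you would likely count generated tokens rather than requests, and add warm-up runs before timing. Swapping `fake_run_batch` for a real call leaves the rest of the harness unchanged.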
Part 3-Commit + retro (15 min)¶
Commit. Update LEARNING_LOG: "what I shipped, what I learned."
Output of Session B¶
- 1 substantive milestone shipped.
Session C-Track build, day 2 + progress post¶
Goal: Continue building. Write a short public progress update.
Part 1-Build (120 min)¶
Same rhythm as Session B.
Part 2-Progress post (60 min)¶
A short (800–1000 word) post: "Q3 week 2 update: what I'm building and what I've learned so far."
This is not the big specialty post (that's M07-W04). It's a shorter check-in. Why publish:
- Forces clarity weekly.
- Builds an audience for the bigger post.
- Future hiring managers can read how your thinking progressed.
Outline:
1. The track and the project (200 words).
2. What I built this week (300 words).
3. What surprised me (300 words).
4. What's next (100 words).
Publish to your blog. Cross-post to one other channel (X, dev.to, LinkedIn).
Part 3-Commit (15 min)¶
Tag and push v0.1.0 if the scope justifies it. Update LEARNING_LOG.
Output of Session C¶
- 2nd milestone shipped.
- Short progress post published.
End-of-week artifact¶
- DeepSeek-V3 paper notes
- Two substantive milestones committed
- Short progress post published
End-of-week self-assessment¶
- I can summarize DeepSeek-V3's architecture in 5 minutes.
- My track repo has measurable progress.
- I'm publishing weekly, even small.
Common failure modes for this week¶
- Skipping the frontier paper. It's how you stay current.
- Big-bang building. Two small milestones beat one too-ambitious one.
- Not publishing. "I haven't done enough yet." Publish anyway.
What's next (preview of M07-W03)¶
Honest head-to-head comparison of your project vs the incumbent. The result is the lever for M07-W04.