Month 7, Week 2: Frontier paper + substantive build

Week summary

Goal: Read the DeepSeek-V3 technical report (arguably the best public training write-up of 2024–2025). Build substantively on the track project: progress that moves a milestone, not just commits.
Time: ~10 h over 3 sessions.
Output: Frontier paper notes; track repo at meaningful state-of-progress.
Sequences relied on: track-specific (12 / 11 / 14).

Why this week matters

Reading a frontier paper monthly keeps your mental model current. DeepSeek-V3 is unusually well written: the architectural and training details are spelled out. Knowing what 2024–2025 frontier teams actually do separates you from engineers stuck in 2022 transformer mental models.

Prerequisites

M07-W01 complete with v0.0.1 tagged.

Recommended cadence

Session A, Tue/Wed evening (~3 h): DeepSeek-V3 deep read
Session B, Sat morning (~4 h): track build
Session C, Sun afternoon (~3 h): track build + short progress post

Session A: DeepSeek-V3 technical report

Goal: Read the architecture and training sections deeply. Take notes on what surprised you.

Part 1: First pass (75 min)

Find the report on arXiv (search "DeepSeek-V3 technical report"). Read these sections:
- Abstract + Introduction.
- Model Architecture (note: MLA, Multi-head Latent Attention; MoE design).
- Training (FP8 mixed precision; load balancing; pipelining).
- Pre-training data + tokenizer.
- Post-training (SFT + RL with GRPO).

Don't aim for full understanding. Orient.
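To orient yourself on the MoE item before reading, here is a minimal sketch of generic top-k expert routing for a single token. It is not the report's gating function (read the architecture section for that); the softmax gate, the four-expert setup, and the name `top_k_route` are illustrative assumptions.

```python
import math

def top_k_route(logits, k):
    """Generic top-k gating: keep the k highest-scoring experts for one token
    and renormalise their softmax weights so they sum to 1."""
    # Softmax over all expert logits (subtract the max for numerical stability).
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Indices of the k largest probabilities.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

# Hypothetical router scores for one token over 4 experts.
weights = top_k_route([2.0, 0.5, 1.0, -1.0], k=2)
```

As you read, note where the report's routing departs from this baseline, in particular how it keeps expert load balanced.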

Part 2: Second pass with notes (75 min)

Write paper_notes/deepseek_v3.md covering:
- MLA: how is it different from standard attention? What does it save?
- MoE design: how many experts? How is routing done?
- FP8 training: what's hard about it? What did they do?
- GRPO: brief sketch (you'll see it again in M08).
- 5 things you didn't know before reading.
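For the GRPO sketch, it may help to pin down the one idea that distinguishes it from critic-based RL: each sampled response is scored relative to the other responses for the same prompt. The snippet below is a minimal sketch of that group-relative advantage only. The clipped policy ratio and KL term of the full objective are not shown, and the function name, the `eps` constant, and the choice of population standard deviation are assumptions.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalise each sampled response's reward against its own group:
    (reward - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for 4 responses sampled from the same prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Responses that beat their group's average get a positive advantage and the rest get a negative one, so no separate value model is needed to provide a baseline.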

Part 3: Translate to your track (30 min)

How does anything in DeepSeek-V3 inform your track project?
- (A) Their eval methodology: what's different from Western labs?
- (B) Did they use agentic methods anywhere in training?
- (C) FP8 + MLA: direct relevance to inference infra.
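For Track C, the MLA relevance is easiest to see as KV-cache arithmetic. Standard multi-head attention caches a key and a value for every head, while MLA caches one compressed latent plus a small positional key per token. The sketch below counts cached elements per token per layer. The dimensions are illustrative assumptions, so replace them with the values you find in the report.

```python
def kv_cache_elems_mha(n_heads, head_dim):
    """Standard multi-head attention caches a full key and value per head."""
    return 2 * n_heads * head_dim

def kv_cache_elems_mla(latent_dim, rope_dim):
    """MLA caches one compressed KV latent plus a small shared RoPE key."""
    return latent_dim + rope_dim

# Illustrative dimensions (assumptions; replace with the report's values).
mha = kv_cache_elems_mha(n_heads=128, head_dim=128)
mla = kv_cache_elems_mla(latent_dim=512, rope_dim=64)
ratio = mha / mla  # how many times smaller the MLA cache is per token per layer
```

Multiply by layer count, sequence length, and bytes per element to get a memory budget you can compare against your own serving setup.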

Even if the answer is "no direct application," seeing how a frontier team thinks about engineering at scale is itself the lesson.

Output of Session A

paper_notes/deepseek_v3.md (~700 words).
5 surprises captured.

Session B: Track build, day 1

Goal: Substantive feature progress. Cut scope where needed.

Part 1: Pick the day's milestone (15 min)

Pick the next milestone from DESIGN.md. Make it small enough to finish in ~3 hours.

Part 2: Build (180 min)

Heads down. Tests where applicable.

Track A example milestones:
- Add a multi-scorer composer.
- Add a BaseModelGradedScorer with a configurable rubric.
- Add caching of LLM-as-judge calls.
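For the judge-caching milestone, one possible shape is an in-memory cache keyed on a hash of everything that affects the verdict. This is a sketch under assumptions: `JudgeCache`, the `(rubric, sample) -> score` judge signature, and `fake_judge` are hypothetical names, not part of any existing library or of your track repo.

```python
import hashlib
import json

class JudgeCache:
    """In-memory cache keyed on everything that affects the judge's verdict."""

    def __init__(self, judge_fn):
        self._judge_fn = judge_fn  # hypothetical callable: (rubric, sample) -> score
        self._store = {}
        self.misses = 0

    def _key(self, rubric, sample):
        # Stable serialisation so the same inputs always hash to the same key.
        payload = json.dumps({"rubric": rubric, "sample": sample}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def score(self, rubric, sample):
        key = self._key(rubric, sample)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._judge_fn(rubric, sample)
        return self._store[key]

# Stand-in judge so the sketch runs without an API key.
def fake_judge(rubric, sample):
    return float(len(sample) % 5)

cache = JudgeCache(fake_judge)
first = cache.score("Is the answer concise?", "Paris")
second = cache.score("Is the answer concise?", "Paris")  # served from cache
```

In a real scorer the key would likely also need the judge model name and prompt version, otherwise a rubric or model change could silently return stale scores.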

Track B example milestones:
- Tool registry with a tool-loading API.
- Standardize trajectory logging format.
- Run on 30 SWE-bench-Lite issues; capture success rate.
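For the tool-registry milestone, a decorator-based registry is one small design that fits a ~3-hour slot. The sketch below is illustrative only: `ToolRegistry`, its `register`/`load`/`names` methods, and the `echo` tool are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List

class ToolRegistry:
    """Maps tool names to callables and refuses duplicate registrations."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}

    def register(self, name: str):
        """Decorator that adds a function to the registry under `name`."""
        def decorator(fn: Callable[..., str]) -> Callable[..., str]:
            if name in self._tools:
                raise ValueError(f"tool already registered: {name}")
            self._tools[name] = fn
            return fn
        return decorator

    def load(self, name: str) -> Callable[..., str]:
        """Look up a tool by name, with a clear error for unknown tools."""
        try:
            return self._tools[name]
        except KeyError:
            raise KeyError(f"unknown tool: {name}") from None

    def names(self) -> List[str]:
        return sorted(self._tools)

registry = ToolRegistry()

@registry.register("echo")
def echo(text: str) -> str:
    return text

result = registry.load("echo")("hello")
```

Failing loudly on duplicate or unknown names keeps agent trajectories debuggable, since a misnamed tool call surfaces immediately instead of as a silent no-op.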

Track C example milestones:
- Sweep batch sizes: 1, 4, 16, 32. Capture throughput curve.
- Compare AWQ-int4 vs fp16 on accuracy + latency.
- Implement a benchmark harness with concurrent users.
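For the batch-size sweep, the harness can be very small. The sketch below times a stand-in `fake_generate` function (a hypothetical placeholder that just sleeps) and reports items per second for each batch size. Swap in your real inference call, and treat the numbers from the placeholder as meaningless.

```python
import time

def fake_generate(batch):
    """Stand-in for a model call; replace with your real inference function."""
    time.sleep(0.001)  # pretend there is a fixed per-call overhead
    return [len(x) for x in batch]

def sweep(generate, prompts, batch_sizes):
    """Return {batch_size: items_per_second} for each batch size."""
    results = {}
    for bs in batch_sizes:
        start = time.perf_counter()
        done = 0
        # Walk the prompt list in chunks of `bs`.
        for i in range(0, len(prompts), bs):
            done += len(generate(prompts[i:i + bs]))
        elapsed = time.perf_counter() - start
        results[bs] = done / elapsed
    return results

prompts = ["hello"] * 32
curve = sweep(fake_generate, prompts, batch_sizes=[1, 4, 16, 32])
```

For a real run, add a warm-up pass and repeat each batch size several times, since a single timing is noisy.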

Part 3: Commit + retro (15 min)

Commit. Update LEARNING_LOG: "what I shipped, what I learned."

Output of Session B

1 substantive milestone shipped.

Session C: Track build, day 2 + progress post

Goal: Continue building. Write a short public progress update.

Part 1: Build (120 min)

Same rhythm as Session B.

Part 2: Progress post (60 min)

A short (800–1000 word) post: "Q3 week 2 update: what I'm building and what I've learned so far."

This is not the big specialty post (that's M07-W04). It's a shorter check-in. Why publish:
- Forces clarity weekly.
- Builds an audience for the bigger post.
- Lets future hiring managers read how your thinking progressed.

Outline:
1. The track and the project (200 words).
2. What I built this week (300 words).
3. What surprised me (300 words).
4. What's next (100 words).

Publish to your blog. Cross-post to one other channel (X, dev.to, LinkedIn).

Part 3: Commit (15 min)

Tag and push v0.1.0 if the scope justifies it. Update LEARNING_LOG.

Output of Session C

2nd milestone shipped.
Short progress post published.

End-of-week artifact

DeepSeek-V3 paper notes
Two substantive milestones committed
Short progress post published

End-of-week self-assessment

I can summarize DeepSeek-V3's architecture in 5 minutes.
My track repo has measurable progress.
I'm publishing weekly, even small.

Common failure modes for this week

Skipping the frontier paper. It's how you stay current.
Big-bang building. Two small milestones beat one too-ambitious one.
Not publishing. "I haven't done enough yet." Publish anyway.

What's next (preview of M07-W03)

Honest head-to-head comparison of your project vs the incumbent. The result is the lever for M07-W04.