Month 7, Week 2: Frontier paper + substantive build

Week summary

  • Goal: Read the DeepSeek-V3 technical report (one of the most detailed public training writeups of 2024–2025). Make substantive progress on the track project: a meaningful milestone, not just commits.
  • Time: ~10 h over 3 sessions.
  • Output: Frontier paper notes; track repo at a meaningful state of progress.
  • Sequences relied on: track-specific (12 / 11 / 14).

Why this week matters

Reading a frontier paper monthly keeps your mental model current. DeepSeek-V3 is unusually well written: the architectural and training details are actually in the report. Knowing what 2024–2025 frontier teams actually do separates you from engineers stuck in 2022 transformer mental models.

Prerequisites

  • M07-W01 complete with v0.0.1 tagged.

Session schedule

  • Session A, Tue/Wed evening (~3 h): DeepSeek-V3 deep read
  • Session B, Sat morning (~4 h): track build
  • Session C, Sun afternoon (~3 h): track build + short progress post

Session A: DeepSeek-V3 technical report

Goal: Read the architecture and training sections deeply. Take notes on what surprised you.

Part 1: First pass (75 min)

Find the report on arXiv (search "DeepSeek-V3 technical report"). Read these sections:

  • Abstract + Introduction.
  • Model Architecture (note MLA, i.e. Multi-head Latent Attention, and the MoE design).
  • Training (FP8 mixed precision; load balancing; pipelining).
  • Pre-training data + tokenizer.
  • Post-training (SFT + RL with GRPO).

Don't aim for full understanding. Orient.

Part 2: Second pass with notes (75 min)

Write paper_notes/deepseek_v3.md covering:

  • MLA: how is it different from standard attention? What does it save?
  • MoE design: how many experts? How is routing done?
  • FP8 training: what's hard about it? What did they do?
  • GRPO: brief sketch (you'll see it again in M08).
  • 5 things you didn't know before reading.
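For the GRPO bullet, it can help to pin down the one idea that gives the method its name: each sampled response is scored relative to the other responses drawn for the same prompt, rather than against a learned value function. The sketch below shows only that group-relative normalization. The function name, the `eps` guard, and the use of the population standard deviation are assumptions for illustration; the clipped policy objective and KL term are left out, so check the exact formulation against the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar reward per response sampled for a single prompt.
    `eps` (an assumption) avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt, scored 1.0 (correct) or 0.0 (wrong).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that beat their group's average get a positive advantage and the rest get a negative one. If every response in a group earns the same reward, all advantages are zero and that prompt contributes no learning signal.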

Part 3: Translate to your track (30 min)

How does anything in DeepSeek-V3 inform your track project?

  • (A) Their eval methodology: what's different from Western labs?
  • (B) Did they use agentic methods anywhere in training?
  • (C) FP8 + MLA: direct relevance to inference infra.

Even if the answer is "no direct application," seeing how a frontier team thinks about engineering at scale is itself the lesson.
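For point (C), a back-of-envelope calculation makes the inference relevance concrete. Standard multi-head attention caches a key and a value vector per head for every token; MLA, as the report describes it, caches a compressed KV latent plus a small positional key instead. The sketch below compares per-token cache size under that reading. All dimensions are toy placeholders, not the report's hyperparameters, and the helper names are hypothetical; substitute the real values from the architecture section.

```python
def mha_cache_elems(n_heads: int, head_dim: int) -> int:
    # Standard attention: one key and one value vector per head, per layer.
    return 2 * n_heads * head_dim


def mla_cache_elems(latent_dim: int, rope_dim: int) -> int:
    # MLA (as understood here): one compressed KV latent plus a small
    # positional key shared across heads, per layer.
    return latent_dim + rope_dim


def cache_bytes_per_token(elems_per_layer: int, n_layers: int,
                          bytes_per_elem: int = 2) -> int:
    # bytes_per_elem=2 assumes a 16-bit cache dtype.
    return elems_per_layer * n_layers * bytes_per_elem


# Toy placeholder dimensions, NOT taken from the report.
TOY = dict(n_layers=4, n_heads=8, head_dim=64, latent_dim=128, rope_dim=32)

mha = cache_bytes_per_token(
    mha_cache_elems(TOY["n_heads"], TOY["head_dim"]), TOY["n_layers"])
mla = cache_bytes_per_token(
    mla_cache_elems(TOY["latent_dim"], TOY["rope_dim"]), TOY["n_layers"])
print(f"MHA: {mha} B/token, MLA: {mla} B/token, ratio: {mha / mla:.1f}x")
```

Rerun it with the report's actual dimensions and put the resulting ratio in your notes; it is a direct answer to "what does MLA save?"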

Output of Session A

  • paper_notes/deepseek_v3.md (~700 words).
  • 5 surprises captured.

Session B: Track build, day 1

Goal: Substantive feature progress. Cut scope where needed.

Part 1: Pick the day's milestone (15 min)

Take the next milestone from DESIGN.md. Make it small enough to finish in ~3 hours.

Part 2: Build (180 min)

Heads down. Write tests where applicable.

Track A example milestones:

  • Add a multi-scorer composer.
  • Add a BaseModelGradedScorer with a configurable rubric.
  • Add caching of LLM-as-judge calls.
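To give a sense of how small a Track A milestone can be, here is a minimal sketch of caching LLM-as-judge calls. It keys the cache on a hash of everything that should change the verdict (judge model, rubric, response) and only calls the judge on a miss. `JudgeCache`, `fake_judge`, and the call signature are hypothetical names for illustration; your project's scorer interface will differ, and a real version would likely persist the store to disk.

```python
import hashlib
import json


class JudgeCache:
    """In-memory cache around a judge function (hypothetical interface)."""

    def __init__(self, judge_fn):
        self._judge_fn = judge_fn
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model, rubric, response):
        # Hash every input that should change the verdict.
        payload = json.dumps(
            {"model": model, "rubric": rubric, "response": response},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def score(self, model, rubric, response):
        key = self._key(model, rubric, response)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._judge_fn(model, rubric, response)
        self._store[key] = result
        return result


def fake_judge(model, rubric, response):
    # Stand-in for a real LLM call so the sketch runs offline.
    return float(len(response) % 5)


cache = JudgeCache(fake_judge)
first = cache.score("judge-model", "Is the answer concise?", "Yes.")
second = cache.score("judge-model", "Is the answer concise?", "Yes.")
print(first, second, cache.hits, cache.misses)
```

Including the rubric in the key matters: if you edit the rubric, old verdicts should not be reused.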

Track B example milestones:

  • Tool registry with a tool-loading API.
  • Standardize trajectory logging format.
  • Run on 30 SWE-bench-Lite issues; capture success rate.
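For Track B, the tool-registry milestone can start as a dictionary with a decorator in front of it. The sketch below is one possible shape, not a prescribed design: `ToolRegistry`, `register`, `load`, and the `word_count` tool are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Maps tool names to callables; a hypothetical minimal design."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable] = {}

    def register(self, name: str) -> Callable[[Callable], Callable]:
        def decorator(fn: Callable) -> Callable:
            if name in self._tools:
                raise ValueError(f"tool already registered: {name}")
            self._tools[name] = fn
            return fn
        return decorator

    def load(self, name: str) -> Callable:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name]

    def names(self) -> List[str]:
        return sorted(self._tools)


registry = ToolRegistry()


@registry.register("word_count")
def word_count(text: str) -> int:
    # Trivial example tool so the sketch is runnable.
    return len(text.split())


result = registry.load("word_count")("run the failing test")
print(registry.names(), result)
```

Failing loudly on duplicate or unknown names is a deliberate choice here: silent overwrites are hard to spot in agent trajectories.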

Track C example milestones:

  • Sweep batch sizes: 1, 4, 16, 32. Capture throughput curve.
  • Compare AWQ-int4 vs fp16 on accuracy + latency.
  • Implement a benchmark harness with concurrent users.
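For Track C, the batch-size sweep is mostly a timing loop. The sketch below shows the skeleton with a fake model call so it runs anywhere; `sweep_batch_sizes` and `fake_run_batch` are hypothetical names, and the sleep-based cost model is an assumption standing in for your real inference call. A serious harness would also add warmup runs and repeat each measurement.

```python
import time


def sweep_batch_sizes(run_batch, batch_sizes, n_requests=64):
    """Return {batch_size: requests per second} for each batch size."""
    results = {}
    for bs in batch_sizes:
        n_batches = -(-n_requests // bs)  # ceiling division
        start = time.perf_counter()
        for _ in range(n_batches):
            run_batch(bs)
        elapsed = time.perf_counter() - start
        results[bs] = (n_batches * bs) / elapsed
    return results


def fake_run_batch(batch_size):
    # Stand-in for a model call: fixed overhead plus a small per-item cost.
    time.sleep(0.001 + 0.0001 * batch_size)


curve = sweep_batch_sizes(fake_run_batch, [1, 4, 16, 32], n_requests=32)
for bs, throughput in curve.items():
    print(f"batch={bs:>2}  throughput={throughput:8.1f} req/s")
```

Swap `fake_run_batch` for a function that sends one real batch to your server, and the returned dictionary is the throughput curve the milestone asks for.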

Part 3: Commit + retro (15 min)

Commit. Update LEARNING_LOG: "what I shipped, what I learned."

Output of Session B

  • 1 substantive milestone shipped.

Session C: Track build, day 2 + progress post

Goal: Continue building. Write a short public progress update.

Part 1: Build (120 min)

Same rhythm as Session B.

Part 2: Progress post (60 min)

Write a short (800–1000 word) post: "Q3 week 2 update: what I'm building and what I've learned so far."

This is not the big specialty post (that's M07-W04). It's a shorter check-in. Why publish:

  • Forces clarity weekly.
  • Builds an audience for the bigger post.
  • Future hiring managers can read how your thinking progressed.

Outline:

  1. The track and the project (200 words).
  2. What I built this week (300 words).
  3. What surprised me (300 words).
  4. What's next (100 words).

Publish to your blog. Cross-post to one other channel (X, dev.to, LinkedIn).

Part 3: Commit (15 min)

Tag and push v0.1.0 if the scope justifies it. Update LEARNING_LOG.

Output of Session C

  • 2nd milestone shipped.
  • Short progress post published.

End-of-week artifact

  • DeepSeek-V3 paper notes
  • Two substantive milestones committed
  • Short progress post published

End-of-week self-assessment

  • I can summarize DeepSeek-V3's architecture in 5 minutes.
  • My track repo has measurable progress.
  • I'm publishing weekly, even when the update is small.

Common failure modes for this week

  • Skipping the frontier paper. It's how you stay current.
  • Big-bang building. Two small milestones beat one too-ambitious one.
  • Not publishing. "I haven't done enough yet." Publish anyway.

What's next (preview of M07-W03)

Honest head-to-head comparison of your project vs the incumbent. The result is the lever for M07-W04.
