
Month 2-Week 4: Course wrap, ablation post, transformer preview

Week summary

  • Goal: Finish the foundational portion of your course. Publish your second blog post analyzing the ablation study with bootstrap CIs. Begin watching Karpathy's Zero to Hero as the bridge to the transformer month.
  • Time: ~9 h over 3 sessions.
  • Output: Second public blog post; new repo transformer-from-scratch initialized; month-2 retrospective.
  • Sequences relied on: 06-classical-ml rungs 08, 09; 03-probability-statistics rung 09; 08-transformers rung 01.

Why this week matters

Two arcs close: the classical-ML foundation arc, and the "writing publicly about your work" arc. Both produce visible compounding artifacts. Then a third arc opens: transformers. Month 3 is the most important month of your year. Beginning early sets the stage.

The blog post on the ablation study is non-trivial: it's the kind of analysis-driven writing AI hiring managers screen for. Done well, it signals "this person reasons about uncertainty," which is a rare and valuable signal.

Prerequisites

  • M02-W01–W03 complete.
  • Bootstrap CIs computed in W02.

Schedule

  • Session A (Tue/Wed evening, ~3 h): course wrap + bootstrap re-analysis
  • Session B (Sat morning, ~3.5 h): blog post draft + Karpathy preview
  • Session C (Sun afternoon, ~2.5 h): publish + month retrospective

Session A: Course wrap + bootstrap analysis revisit

Goal: Finish foundational lectures of your course. Re-analyze your ablation results with proper bootstrap CIs.

Part 1: Final foundational course lecture (90 min)

fast.ai Lesson 5, "Collaborative filtering and tabular":

  • Watch the lecture.
  • Run the embedding-based collaborative filtering notebook.
  • Embeddings appear here for the first time. Note this; we'll meet them again everywhere. A minimal sketch of the idea follows.
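
To see what an embedding is before fast.ai wraps it for you, here is a minimal dot-product collaborative-filtering sketch in PyTorch. It is illustrative only, not the fastai notebook's API; the class name and sizes are made up.

# Minimal dot-product collaborative-filtering sketch (illustrative, not the fastai notebook).
# Each user and each item gets a learned embedding vector; the predicted rating is
# their dot product plus per-user and per-item biases.
import torch
import torch.nn as nn

class DotProductCF(nn.Module):
    def __init__(self, n_users, n_items, dim=50):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)   # one vector per user
        self.item_emb = nn.Embedding(n_items, dim)   # one vector per item
        self.user_bias = nn.Embedding(n_users, 1)
        self.item_bias = nn.Embedding(n_items, 1)

    def forward(self, user_ids, item_ids):
        dot = (self.user_emb(user_ids) * self.item_emb(item_ids)).sum(dim=1)
        return dot + self.user_bias(user_ids).squeeze(1) + self.item_bias(item_ids).squeeze(1)

model = DotProductCF(n_users=1000, n_items=2000)
scores = model(torch.tensor([0, 1]), torch.tensor([10, 20]))  # ratings for two (user, item) pairs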

Ng path: Course 2 weeks 3–4 (decision trees, ensemble methods).
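
If you're on the Ng path, here is a quick scikit-learn sketch of the week's contrast between a single decision tree and a boosted ensemble. It is not course material; the synthetic dataset and default model settings are arbitrary.

# Single decision tree vs gradient-boosted ensemble on a synthetic classification problem.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
for name, clf in [("single tree", DecisionTreeClassifier(random_state=0)),
                  ("boosted ensemble", GradientBoostingClassifier(random_state=0))]:
    scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validated accuracy
    print(f"{name}: {scores.mean():.3f} ± {scores.std():.3f}")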

You don't need to finish the whole course this month. Finish what makes the foundations cohesive, then continue at your own pace alongside Q3 work.

Part 2: Re-analyze the ablations (60 min)

Pull up your W02 ablation data.

Read Allen Downey's Think Stats, Chapter 9 (free PDF online; search "thinkstats2 pdf"), focusing on the section on bootstrap and hypothesis testing.

Apply it rigorously: for each pair of variants, compute the bootstrap distribution of the difference in means:

import numpy as np

def bootstrap_diff(a, b, n=10000):
    """Bootstrap distribution of the difference in means between samples a and b."""
    a, b = np.asarray(a), np.asarray(b)
    diffs = []
    for _ in range(n):
        # Resample each group with replacement and compare the resampled means
        sa = np.random.choice(a, len(a), replace=True).mean()
        sb = np.random.choice(b, len(b), replace=True).mean()
        diffs.append(sa - sb)
    return np.array(diffs)

# augment_accs / baseline_accs: per-seed accuracy lists from the W02 ablation
diff = bootstrap_diff(augment_accs, baseline_accs)
ci = np.percentile(diff, [2.5, 97.5])
print(f"augment - baseline: mean={diff.mean():.4f}, 95% CI {ci}")
print(f"P(augment > baseline) = {(diff > 0).mean():.4f}")

The probability P(augment > baseline) is more honest than a binary "significant or not."

Part 3: Hone the message (30 min)

Decide your post's thesis. Possibilities:

  • "Three seeds is the minimum; here's what one seed obscured."
  • "My ablation looked significant, but bootstrap showed it wasn't."
  • "Bootstrap confidence intervals applied to a real ML experiment."

Pick the one your data actually supports. Write a 50-word abstract.

Output of Session A

  • Final foundational course notebook done.
  • Bootstrap difference analysis with P(A > B) numbers.
  • Blog post abstract decided.

Session B: Blog post draft + transformer preview

Goal: Draft the blog post. Begin watching Karpathy.

Part 1: Draft the post (90 min)

Outline (~1500 words):

  1. Hook: "I ran my first ML ablation. With one seed, augment beat baseline by 2 points. With three seeds and a bootstrap, the picture changed."
  2. Why seed variance matters. A page on what seeds do and why one-seed comparisons are unreliable.
  3. The experiment. 3 variants × 3 seeds × CIFAR-10 (or your dataset).
  4. The naive analysis. Mean accuracies; the apparent winner.
  5. The honest analysis. Bootstrap distributions; CI overlap; P(A > B).
  6. Lessons. What you'll do differently next time (always 3 seeds minimum, always bootstrap, always report CIs).
  7. Why this scales to LLM evals. A forward look: the same discipline applies in Q2 when comparing prompts.

Write the full draft. Don't perfect it; just complete it.

Part 2: Watch Karpathy Lecture 2 (90 min)

Karpathy Zero to Hero, Lecture 2 ("makemore, part 1": the bigram model):

  • ~80 min.
  • This is your first character-level language model.
  • Type along in a new repo, transformer-from-scratch.

By the end you have a model that produces "name-like strings." It's not a transformer yet, but the data pipeline (tokenize → train → sample) is the same.
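
If you want that pipeline in miniature before (or after) typing along, here is a count-based character bigram sketch. It is not Karpathy's code; the tiny names list is a stand-in for whatever corpus the lecture uses.

# Minimal count-based character bigram sampler (a sketch of the idea, not Karpathy's code).
import numpy as np

names = ["emma", "olivia", "ava", "isabella", "sophia"]  # toy stand-in corpus

# Tokenize: map every character to an integer; index 0 is the start/end token "."
chars = sorted(set("".join(names)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0
itos = {i: c for c, i in stoi.items()}

# Train: count how often each character follows each other character.
counts = np.zeros((len(stoi), len(stoi)))
for name in names:
    seq = ["."] + list(name) + ["."]
    for c1, c2 in zip(seq, seq[1:]):
        counts[stoi[c1], stoi[c2]] += 1

# Normalize rows into next-character probabilities (add-one smoothing), then sample.
probs = (counts + 1) / (counts + 1).sum(axis=1, keepdims=True)
rng = np.random.default_rng(0)
for _ in range(5):
    out, ix = [], 0
    while True:
        ix = rng.choice(len(stoi), p=probs[ix])  # draw the next character given the current one
        if ix == 0:
            break
        out.append(itos[ix])
    print("".join(out))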

Output of Session B

  • Blog post draft.
  • transformer-from-scratch/01-bigram.ipynb working.

Session C: Publish + month-2 retro

Goal: Polish and publish the post. Run the month retrospective.

Part 1: Polish + publish (60 min)

  • Edit. Cut filler. Read aloud.
  • Add charts: bootstrap distribution histograms (a plotting sketch follows this list).
  • Embed the comparison table.
  • Publish to your blog.
  • Cross-post: dev.to, Reddit r/MachineLearning (Project flair), LinkedIn.
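
For the histogram charts, a minimal matplotlib sketch; it assumes the diff array and ci percentiles from Session A's bootstrap code, and the styling is arbitrary.

# Plot the bootstrap distribution of the difference, with the 95% CI and the zero line.
import matplotlib.pyplot as plt

plt.hist(diff, bins=50, color="steelblue", alpha=0.8)
plt.axvline(0, color="black", linestyle="--", label="no difference")
plt.axvline(ci[0], color="red", linestyle=":", label="95% CI")
plt.axvline(ci[1], color="red", linestyle=":")
plt.xlabel("augment - baseline (accuracy)")
plt.ylabel("bootstrap samples")
plt.legend()
plt.savefig("bootstrap_diff.png", dpi=150, bbox_inches="tight")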

Part 2: Engage (30 min)

Spend 30 minutes reading reactions. Respond to substantive comments. Note any questions you didn't expect; those are seeds for future posts.

Part 3: Month-2 retrospective (60 min)

Write MONTH_2_RETRO.md:

# Month 2 retro

## Artifacts shipped
- 4 course-week notebooks
- Trained classifier + (optionally) deployed Space
- Ablation study (3 × 3) with bootstrap CIs
- XGBoost vs MLP on tabular data with 5-fold CV
- Blog post: <link>
- transformer-from-scratch/01-bigram.ipynb

## KPIs vs Q1 targets
- Public repos: 3 (target end-of-Q1: 3) ✓
- Blog posts: 2 (target end-of-Q1: 1) ✓ ahead

## Biggest insights
1. ...
2. ...
3. ...

## What slipped

## Pace check (sustainable / accelerated / behind)

## M03 plan
- Most important month of the year.
- Karpathy lectures 2, 3, 4 in M03-W01.
- Lecture 6 (transformer build) in M03-W02 (block the weekend).
- nanoGPT in M03-W03.
- Modification + Q1 retrospective post in M03-W04.

## Reading queue for M03
- Attention Is All You Need (arXiv 1706.03762)
- Jay Alammar Illustrated Transformer
- nanoGPT repo (read code in advance)

Output of Session C

  • Second public blog post live.
  • Month-2 retrospective committed.

End-of-week artifact

  • Second public blog post live, shared in ≥2 channels
  • MONTH_2_RETRO.md written
  • transformer-from-scratch/01-bigram.ipynb started
  • Course foundational lectures done

End-of-week self-assessment

  • I can defend a model claim with seed variance and CI evidence.
  • I can write a 1500-word post in a single weekend.
  • I am ready for the densest month of the year.

Common failure modes for this week

  • Polishing the post for two weeks. Ship at 80%. The next post raises the bar.
  • Watching Karpathy passively. Type along; build the repo.
  • Skipping the retro. It's the cheapest leverage you have.

What's next (preview of M03-W01)

Karpathy lectures 2–4 in depth: bigram, MLP language model, batch norm, initialization. The runway to attention.
