Month 10-Week 4: Capstone v0.1 ship + month-10 retro

Week summary

  • Goal: Polish capstone for public ship. Tests passing. Excellent README. Eval results documented. Soft launch in 1-2 channels.
  • Time: ~9 h over 3 sessions.
  • Output: Capstone v0.1.0 publicly shipped. Month-10 retro.

Why this week matters

A v0.1 release is a commitment to the world: "this exists; it works for these tasks; here are the numbers." It's also what month 11's blog post will be about.

Prerequisites

  • M10-W01–W03 complete.

Schedule

  • Session A-Tue/Wed evening (~3 h): tests + CI.
  • Session B-Sat morning (~3.5 h): README polish + RESULTS.md.
  • Session C-Sun afternoon (~2.5 h): soft launch + retro.

Session A-Tests + CI

Goal: Tests pass on every push. CI green.

Part 1-Audit test coverage (45 min)

Run a coverage tool such as pytest-cov (`uv run pytest --cov --cov-report=term-missing`). Identify the major paths with no tests.

Part 2-Add tests for core paths (90 min)

You don't need 100% coverage. Focus on the hot paths:

  • Public API (entry-point function).
  • Eval pipeline.
  • Critical scoring or runtime logic.
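Hot-path tests can stay short. A sketch of the shape: `normalize_score` below is a hypothetical stand-in for your capstone's scoring logic; swap in your real public API.

```python
# Hypothetical hot-path under test: clamp a raw score into [0, 1].
def normalize_score(raw: float, max_raw: float) -> float:
    if max_raw <= 0:
        raise ValueError("max_raw must be positive")
    return min(max(raw / max_raw, 0.0), 1.0)


# pytest discovers these by the test_ prefix; plain asserts are enough.
def test_normalize_score_basic():
    assert normalize_score(7.0, 10.0) == 0.7


def test_normalize_score_clamps_out_of_range():
    assert normalize_score(15.0, 10.0) == 1.0
    assert normalize_score(-2.0, 10.0) == 0.0
```

One happy-path test plus one edge-case test per hot path catches most regressions for very little effort.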

Part 3-CI green (45 min)

# .github/workflows/ci.yml
name: ci
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v2
      - run: uv sync
      - run: uv run pytest
      - run: uv run ruff check

Verify badges in README.
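A CI status badge makes the green build visible from the README. A sketch, assuming the repo lives at `OWNER/REPO` (replace with yours) and the workflow file is named `ci.yml` as above:

```markdown
![ci](https://github.com/OWNER/REPO/actions/workflows/ci.yml/badge.svg)
```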

Output of Session A

  • Test suite passing.
  • CI green.

Session B-README + RESULTS.md

Goal: A README a stranger can follow end to end. A RESULTS.md with all eval numbers.

Part 1-README (90 min)

Structure:

1. Title + 1-line tagline.
2. Why (motivation, 1 paragraph).
3. Quickstart (3-5 commands that work on a fresh clone).
4. Examples (1-2 runnable examples).
5. Results (table summary; full in RESULTS.md).
6. Compared to (1 paragraph).
7. License + citation if applicable.

Part 2-RESULTS.md (60 min)

Full eval breakdown:

  • Setup + dataset.
  • Numbers per metric with bootstrap CIs.
  • Comparison vs incumbents.
  • Failure-mode analysis.
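Bootstrap CIs need no special tooling. A minimal percentile-bootstrap sketch using only the standard library, assuming you have a list of per-example scores (names here are illustrative, not from any particular library):

```python
import random
import statistics


def bootstrap_ci(scores, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-example scores."""
    rng = random.Random(seed)  # fixed seed so RESULTS.md is reproducible
    n = len(scores)
    # Resample with replacement, record each resample's mean, sort.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(scores), (lo, hi)


# Example: per-example correctness (1 = correct, 0 = wrong).
scores = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1]
mean, (lo, hi) = bootstrap_ci(scores)
print(f"accuracy {mean:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

Report the point estimate with the interval ("0.75 [0.54, 0.92]" style) so readers can judge whether gaps vs incumbents are real or noise.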

Part 3-Inspect for friction (30 min)

Re-clone the repo into a fresh directory. Run the quickstart exactly as written. Verify it works, and note every point of friction a stranger would hit.

Output of Session B

  • Polished README + RESULTS.md.

Session C-Soft launch + retro

Goal: Tag v0.1.0. Soft launch in 1-2 channels. Run month retro.

Part 1-Tag v0.1.0 (15 min)

git tag v0.1.0
git push --tags
gh release create v0.1.0 --notes "Initial public release."

Part 2-Soft launch (45 min)

Soft (not the big launch-that's M11):

  • Tweet announcing it.
  • Pin the repo on your GitHub profile.
  • Post in 1-2 relevant Discords / Slacks.
  • Link from your existing blog posts that referenced the "M11 launch."

This is intentional-M11 is when you make noise. M10's v0.1 is the underlying artifact.

Part 3-Month-10 retro (60 min)

MONTH_10_RETRO.md:

# Month 10 retro

## Artifacts shipped
- Capstone v0.1.0 publicly tagged
- 5+ features
- Tests + CI green
- One observed user session
- Top 3 confusions fixed

## KPIs vs Q4 targets
| Metric | Q4 Target | End of M10 |
|---|---|---|
| Capstone v0.1 | Y | ✓ |
| User observed | Y | ✓ |
| Tests in place | Y | ✓ |

## Lessons
1. ...
2. ...

## M11 plan
- 3000-word capstone post.
- A talk (internal first; external as stretch).
- Outreach to 5 specific people.

Output of Session C

  • v0.1.0 release on GitHub.
  • Month-10 retro committed.

End-of-week artifact

  • v0.1.0 release tagged
  • CI green
  • README + RESULTS.md polished
  • Soft launch in 1-2 channels
  • Month-10 retro

End-of-week self-assessment

  • My capstone is legitimately public-anyone can clone and run.
  • My results are documented honestly.
  • I'm ready to make noise about it next week.

Common failure modes for this week

  • README that assumes context. Strangers don't have it.
  • No CI. Untested code rots, and the rot is invisible to outsiders.
  • Soft launch into nowhere. Pick channels with at least 100 likely viewers.

What's next (preview of M11-W01)

The capstone long-form blog post. ~3500 words. The single most career-leveraged piece of the year.
