
Prelude - The Philosophy Behind the Syllabus

Sit with this document for an evening before week 1. The rest of the curriculum is mechanically dense; this is the only chapter where we step back and define the shape of the discipline.


1. Python Is a Glue Language Riding on a Reference-Counted VM

The most damaging misconception a Python engineer can hold is that "Python is a slow scripting language with libraries." A working senior practitioner thinks the inverse:

Python is a glue language - a small, dynamically typed surface - bolted to a reference-counted bytecode VM (CPython) whose superpower is calling into native code (C, C++, Rust, Fortran, CUDA) without paying for a heavyweight FFI. That is why Python won data, ML, and AI: not because Python is fast, but because it makes fast things addressable from a REPL.

Almost every interesting performance question in production Python reduces to "does this loop stay in C, or does it cross back into Python bytecode?" Almost every elegant high-throughput Python architecture is a thin layer over numpy, torch, polars, asyncio, uvloop, or a C extension - with Python orchestrating, not computing.
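
As a rough illustration of that question, the sketch below times the same summation twice: once as an explicit Python loop, and once through the built-in sum(), which iterates in C. The function names and the value of N are assumptions chosen for the example, and the timings depend on interpreter version and hardware, so none are quoted here.

```python
import timeit

N = 1_000_000  # arbitrary size, chosen only for illustration


def python_loop(n: int) -> int:
    # Each iteration is dispatched through the bytecode eval loop.
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n: int) -> int:
    # Iteration and addition happen inside sum(), which is implemented in C.
    return sum(range(n))


if __name__ == "__main__":
    assert python_loop(N) == builtin_sum(N)
    for fn in (python_loop, builtin_sum):
        seconds = min(timeit.repeat(lambda: fn(N), number=1, repeat=5))
        print(f"{fn.__name__:12s} {seconds * 1e3:8.2f} ms")
```

Both functions compute the same value; only the place where the loop runs differs.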

Internalize this and the rest of the curriculum makes sense.


2. The Five-Axis Cost Model

A working senior Python engineer reasons about every line of code along five axes simultaneously:

| Axis | Question to ask |
| --- | --- |
| Allocation & object overhead | Does this create Python objects in a hot loop? Could it stay as a NumPy/torch array, a bytes object, or a memoryview? |
| Bytecode boundaries | How many trips through the eval loop does this take? Can it be vectorized, pushed into C, compiled ahead of time (Cython), or JITed (PyPy / Numba)? |
| Concurrency model | Is this CPU-bound (→ processes / free-threaded builds / native code that releases the GIL) or I/O-bound (→ asyncio / threads)? |
| Type integrity | Will pyright --strict accept this? Are runtime contracts (Pydantic, attrs validators) enforced at the right boundary? |
| Failure | What happens on KeyboardInterrupt? On asyncio.CancelledError? On a partially consumed generator that holds a file handle? On an OOM in a forked worker? |
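
One way to make the failure axis concrete is the partially consumed generator. The sketch below is a minimal illustration, not a prescribed pattern: read_lines, first_line, demo, and closed_log are hypothetical names introduced here. It uses contextlib.closing so that the generator is closed, and its file handle released, as soon as the caller stops reading.

```python
import contextlib
import os
import tempfile
from collections.abc import Iterator

closed_log: list[str] = []  # hypothetical: records when generator cleanup runs


def read_lines(path: str) -> Iterator[str]:
    # The file stays open for as long as the generator is suspended at yield.
    with open(path, encoding="utf-8") as fh:
        try:
            for line in fh:
                yield line.rstrip("\n")
        finally:
            closed_log.append(os.path.basename(path))


def first_line(path: str) -> str:
    # closing() calls the generator's close() on exit, which raises
    # GeneratorExit at the suspended yield so the cleanup runs right away.
    with contextlib.closing(read_lines(path)) as lines:
        return next(lines)


def demo() -> tuple[str, list[str]]:
    closed_log.clear()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("alpha\nbeta\ngamma\n")
        return first_line(path), list(closed_log)
```

Without the closing() wrapper, the handle is released only when the generator object is finalized, which is prompt under CPython's reference counting but is not something the code controls.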

Beginner courses teach axis 1 only (and incompletely). This curriculum forces all five into your hands by week 12.
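
For the allocation axis, a small sketch of what "stay as a memoryview" means in practice. The payload contents are made up for the example. Slicing a bytes object produces a new object with its own copy of the data, while slicing a memoryview produces a window onto the original buffer.

```python
payload = bytearray(b"HEADERbody-bytes")  # hypothetical message buffer

copied = bytes(payload)[6:]      # new bytes objects: the data is copied
view = memoryview(payload)[6:]   # a window onto payload: no data is copied

payload[6] = ord("B")            # mutate the underlying buffer in place

copy_after = copied              # the copy still holds the old byte
view_after = bytes(view)         # the view reflects the mutation
shares_buffer = view.obj is payload
```

After the mutation, copied still starts with a lowercase "b" and the view starts with an uppercase "B", because only the view shares storage with payload.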


3. The "Pythonic Way" - Aesthetic as Engineering Constraint

Python's design ethic, captured in import this, is "explicit, simple, readable." That phrase is doing more work than newcomers think. Specifically:

  • Duck typing, then static typing. Protocols and structural typing (typing.Protocol) win over nominal hierarchies. Inheritance is fine, deep inheritance is not.
  • EAFP, not LBYL. "Easier to ask forgiveness than permission" - try/except is idiomatic, if hasattr(...) is usually a smell.
  • Comprehensions, generators, iterators. A for loop that builds a list with .append is almost always a comprehension or a generator expression in disguise.
  • The stdlib is enormous and underused. itertools, functools, collections, dataclasses, contextlib, pathlib, concurrent.futures, asyncio, logging, argparse, sqlite3, unittest.mock, typing - these cover ~70% of any service. Reach for third-party only when stdlib runs out, and know when it does.
  • Tooling is opinionated. ruff (lint+format), pyright/mypy (types), pytest (test), uv or hatch (build/dep), py-spy/scalene/memray (profile). A Python engineer who does not know these is half-trained.
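
The first three points above can be sketched in a few lines. Every name in this example (SupportsClose, Connection, shutdown, port_for, squares_of_evens) is hypothetical and exists only to illustrate the idiom.

```python
from collections.abc import Iterable
from typing import Protocol


class SupportsClose(Protocol):
    # Structural type: anything with a matching close() method qualifies.
    def close(self) -> None: ...


class Connection:
    # Does not inherit from SupportsClose, yet satisfies it structurally.
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def shutdown(resources: Iterable[SupportsClose]) -> None:
    for resource in resources:
        resource.close()


def port_for(config: dict[str, str], default: int = 8080) -> int:
    # EAFP: attempt the lookup and conversion, then handle the failure.
    try:
        return int(config["port"])
    except (KeyError, ValueError):
        return default


def squares_of_evens(numbers: Iterable[int]) -> list[int]:
    # Replaces the "result = []; for ...: result.append(...)" pattern.
    return [n * n for n in numbers if n % 2 == 0]
```

A static checker accepts shutdown([Connection()]) because Connection matches the protocol's shape; no base class is involved.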

If you fight these defaults, you will write Java in Python. If you internalize them, your code will look like the stdlib - which is the actual deliverable Python optimizes for.


4. The Reading List

These are referenced throughout the curriculum. You are not expected to read them cover-to-cover before starting; they are pinned tabs.

Primary

  • Fluent Python, 2nd ed. (Luciano Ramalho). The canonical text. Read chapters 1–6 in Month 1, 14–21 in Month 2, the rest as referenced.
  • Effective Python, 3rd ed. (Brett Slatkin). The single best companion to Fluent Python.
  • High Performance Python, 2nd ed. (Gorelick & Ozsvald). Read in Month 3 alongside the runtime chapter.
  • Architecture Patterns with Python (Percival & Gregory). Read in Month 5 alongside the patterns chapter.

Runtime & internals

  • The CPython source itself - treat as primary literature, not reference:
      • Python/ceval.c (the eval loop)
      • Objects/object.c, Objects/typeobject.c, Objects/dictobject.c, Objects/listobject.c, Objects/longobject.c
      • Python/gc.c (the cyclic GC)
      • Modules/_asynciomodule.c (the C accelerator for asyncio)
      • Include/internal/pycore_*.h (interpreter state, frame layout)
  • Brandt Bucher's "Python 3.11 specializing adaptive interpreter" talk and the PEP 659 text.
  • Anthony Shaw, CPython Internals (Real Python). The most accessible treatment.
  • PEPs that are mandatory reading (curriculum points to each at the right moment): 8, 20, 257, 318, 343, 380, 484, 492, 525, 530, 544, 557, 585, 593, 612, 634, 646, 654, 657, 659, 669, 684, 692, 695, 703, 709.

AI systems canon (not Python-specific, but mandatory by Month 6)

  • Lewis et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP. The original RAG paper.
  • Sumers et al., Cognitive Architectures for Language Agents (CoALA).
  • Designing Machine Learning Systems (Chip Huyen). Especially chapters 7–10.
  • AI Engineering (Chip Huyen, 2024). The most current treatment of LLM-app design.
  • The vLLM paper (Efficient Memory Management for LLM Serving with PagedAttention).
  • Anthropic's Building effective agents and OpenAI's A practical guide to building agents.

Adjacent canon

  • Drepper, What Every Programmer Should Know About Memory. Re-read in week 9.
  • Kleppmann, Designing Data-Intensive Applications. Read chapters 5–9 in Month 5.


5. Curriculum Philosophy: "Read the Source, Ship the Lab"

Three rules govern every module:

  1. Source first, blog second. When the curriculum says "study how dict resolves a key," it means open Objects/dictobject.c and read the lookup routine: _Py_dict_lookup in CPython 3.11 and later, lookdict_unicode_nodummy in 3.10 and earlier. Blogs go stale; CPython commits are dated.
  2. One lab per concept, one artifact per phase. By the end of each month, the reader has produced one open-source-quality artifact (library, gist, or blog post) - not a notebook of toy snippets.
  3. py-spy, pytest -x, and pyright --strict are the teachers. When you do not understand why a program misbehaves, the first response is py-spy dump --pid <pid>, the second is a failing pytest with hypothesis, and only the third is to ask another human.

6. What Python Is Not For

A graduate of this curriculum should be able to argue these points in a design review without sounding ideological:

  • Tight CPU-bound loops without a vectorized library. The interpreter overhead is real. Either vectorize, drop to Cython/Rust/C, or use Numba/PyPy.
  • Hard-real-time systems. GC pauses are short but non-zero, refcount drops can cascade, and the GIL adds tail-latency variance. Wrong tool.
  • Mobile, sandboxed, or aggressively cold-started serverless. A Python interpreter + numpy + torch is a 1+ GB image and a 1+ second cold start. Choose Go, Rust, or a pre-warmed runtime.
  • Code where the team will not adopt typing. Untyped Python over ~5k lines becomes archaeology. A team that resists pyright --strict will fight Python at scale forever.
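
The remark that refcount drops can cascade is easy to observe. The sketch below assumes standard CPython with reference counting (other implementations, such as PyPy, finalize objects on a different schedule); Node, build_chain, and free_chain are hypothetical names. Dropping the single reference to the head of a linked chain finalizes every node before the del statement returns.

```python
from __future__ import annotations


class Node:
    freed = 0  # class-level counter of finalized instances

    def __init__(self, child: Node | None = None) -> None:
        self.child = child

    def __del__(self) -> None:
        Node.freed += 1


def build_chain(length: int) -> Node:
    # Each new head holds the only reference to the previous head.
    head = Node()
    for _ in range(length - 1):
        head = Node(head)
    return head


def free_chain(length: int) -> int:
    Node.freed = 0
    head = build_chain(length)
    # Dropping the only reference frees the head, which drops its child's
    # last reference, and so on down the chain, all inside this statement.
    del head
    return Node.freed
```

The cost of that one del grows with the length of the chain, which is the kind of unbounded pause a hard-real-time budget cannot absorb.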

The signal that Python is the right tool: you have a glue, AI/data, or developer-velocity constraint that ranks above raw single-thread CPU efficiency.


7. A Note on AI-Assisted Workflows

Modern Python authors use LLM tooling. Three rules:

  1. Never accept generated async code without reading it. The most common failure mode of generated Python is "looks async, blocks the event loop" - time.sleep instead of asyncio.sleep, sync requests inside async def, blocking file I/O without run_in_executor.
  2. Verify generated type annotations. Models hallucinate names in "from typing import ..." lines and mix the built-in generic list[int] (3.9+) with the older typing.List[int]. Always run pyright.
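  3. Treat suggested context-handling skeptically. Generators that hold file handles, async with mismatches, and unclosed httpx.AsyncClient instances are endemic in generated code. Run pytest --tb=short with ResourceWarning surfaced (for example -W error::ResourceWarning) and with tracemalloc enabled, so that warnings about unclosed files and sockets report where the leaked object was allocated.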
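
The first rule can be checked mechanically. The sketch below is one possible harness, with hypothetical names throughout (heartbeat, blocking_version, offloaded_version, count_ticks): a background task records a tick whenever the event loop lets it run, and the harness counts how many ticks land while a piece of work is in flight. It uses asyncio.to_thread (Python 3.9+), a convenience wrapper over the run_in_executor approach mentioned above.

```python
import asyncio
import contextlib
import time
from collections.abc import Awaitable, Callable


async def heartbeat(ticks: list[float], interval: float = 0.01) -> None:
    # Records a timestamp every time the event loop schedules this task.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def blocking_version(delay: float) -> None:
    time.sleep(delay)  # looks async from outside, but stalls the whole loop


async def offloaded_version(delay: float) -> None:
    # The blocking call runs in a worker thread; the loop keeps scheduling.
    await asyncio.to_thread(time.sleep, delay)


async def count_ticks(
    work: Callable[[float], Awaitable[None]], delay: float = 0.2
) -> int:
    ticks: list[float] = []
    beat = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    before = len(ticks)
    await work(delay)
    during = len(ticks) - before
    beat.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await beat
    return during


if __name__ == "__main__":
    print("blocking :", asyncio.run(count_ticks(blocking_version)))
    print("offloaded:", asyncio.run(count_ticks(offloaded_version)))
```

The blocking version records no ticks while it runs, because the coroutine never yields control; the offloaded version records several. The exact count for the offloaded case depends on scheduling and is not fixed.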

You are now ready for Week 1. Open 01_MONTH_FOUNDATIONS.md.
