
Week 9 - The CPython VM: Objects, Bytecode, the Eval Loop

9.1 Conceptual Core

  • CPython is a stack-based bytecode interpreter that manages memory with reference counting plus a generational cyclic GC. On a standard 64-bit build, every PyObject starts with a 16-byte header (ob_refcnt, ob_type), followed by a type-specific tail.
  • The eval loop (Python/ceval.c::_PyEval_EvalFrameDefault) is one large dispatch loop over opcodes, compiled with computed gotos where the compiler supports them. Since 3.11, the loop is specializing and adaptive (PEP 659): hot opcodes are rewritten in place to type-specialized variants (for example LOAD_ATTR_INSTANCE_VALUE, BINARY_OP_ADD_INT) and fall back to the generic form when their assumptions stop holding.
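
The object header and reference counting described above can be observed from Python itself. The following is a minimal sketch, assuming a standard CPython build; the helper name refcount_delta is hypothetical, and the printed size is an implementation detail that varies by build (it is typically 16 on a 64-bit non-free-threaded build).

```python
import sys


def refcount_delta():
    """Return how much an object's refcount grows when one more reference is stored."""
    obj = []                 # a fresh list, so no cache or immortal object is involved
    holders = []
    before = sys.getrefcount(obj)
    holders.append(obj)      # the container now owns one additional reference
    after = sys.getrefcount(obj)
    return after - before


print("refcount delta after one extra reference:", refcount_delta())

# A bare object() has no type-specific tail, so its size is just the header.
print("sys.getsizeof(object()):", sys.getsizeof(object()))
```

Measuring a delta rather than an absolute count avoids depending on how many temporary references the interpreter itself holds during the sys.getrefcount call.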

9.2 Mechanical Detail

  • dis.dis(fn): disassemble a function. Memorize the common opcodes: LOAD_FAST, STORE_FAST, LOAD_GLOBAL, LOAD_CONST, CALL, RETURN_VALUE, BINARY_OP, COMPARE_OP, FOR_ITER, POP_JUMP_IF_FALSE, LOAD_ATTR, STORE_SUBSCR.
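  • Why local lookups are fast and global lookups are slower: locals live in a fixed-size array indexed by integer (fast locals), while LOAD_GLOBAL performs a dict lookup in the module globals and, on a miss, in builtins. Hot functions often hoist globals to locals (def f(_len=len): ...). Since 3.11 the specializing interpreter caches global lookups and narrows this gap, so measure before hoisting.
  • Frame objects versus code objects: a code object is the immutable compiled artifact (bytecode, constants, names), shared by every call; a frame is the per-call execution state (locals, value stack, current instruction). Inspect func.__code__.co_consts, co_names, co_varnames, co_flags.
  • The specializing interpreter: read PEP 659 once. To see specialized opcodes, call fn enough times to warm it up, then run dis.dis(fn, adaptive=True) in the same process (Python 3.11+). No special interpreter flag is needed.
  • Free lists and small-int / interned-string caches: CPython recycles memory for some frequently allocated types through free lists, preallocates the integers -5 through 256, and interns identifier-like strings (sys.intern does this explicitly). All of these are implementation details, so never rely on `is` for value comparison.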
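
The points above about dis, fast locals, code objects, and the adaptive view can be sketched in one short script. This is an illustration under stated assumptions, not a reference: the functions count_long, count_long_hoisted, and opnames are hypothetical names, the adaptive=True argument requires Python 3.11 or later, and the exact opcode names printed differ between CPython versions.

```python
import dis
import sys


def count_long(words):
    n = 0
    for w in words:
        if len(w) > 3:       # `len` is looked up in globals, then builtins
            n += 1
    return n


def count_long_hoisted(words, _len=len):
    n = 0
    for w in words:
        if _len(w) > 3:      # `_len` is a fast local, bound once at def time
            n += 1
    return n


def opnames(fn, **kwargs):
    """List the opcode names of fn, in order."""
    return [ins.opname for ins in dis.get_instructions(fn, **kwargs)]


# The code object records which names are locals and which are globals/attributes.
code = count_long.__code__
print("co_varnames:", code.co_varnames)
print("co_names:   ", code.co_names)

print("global version uses LOAD_GLOBAL: ", "LOAD_GLOBAL" in opnames(count_long))
print("hoisted version uses LOAD_GLOBAL:", "LOAD_GLOBAL" in opnames(count_long_hoisted))

# Warm the function up, then compare the generic and adaptive views.
for _ in range(1000):
    count_long(["a", "bbbb", "ccccc"])

if sys.version_info >= (3, 11):
    generic = set(opnames(count_long))
    adaptive = set(opnames(count_long, adaptive=True))
    print("opcodes seen only in the adaptive view:", sorted(adaptive - generic))
```

The set difference at the end is a quick way to spot which instructions were specialized without reading the full disassembly; it may be empty on builds where specialization is disabled.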

9.3 Lab - "Bytecode Forensics"

  1. Write three implementations of "sum of squares": a for loop, a sum() + genexp, and numpy.dot(a, a). dis.dis each. Benchmark with timeit. Explain the gap.
  2. Take a function with a global lookup in its hot loop. Refactor to a default-argument cache. Re-bench. Quantify the win.
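  3. Use sys.settrace with frame.f_trace_opcodes = True to count opcode-level events on a small program (sys.setprofile reports only call and return events, not individual opcodes). The trace events carry instruction offsets, not specialized opcode names, so compare dis.dis(fn, adaptive=True) output before and after warm-up to observe specialization.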
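
A possible starting harness for step 1 is sketched below. It assumes numpy may or may not be installed and skips that variant when it is missing; the function names, the input size of 10,000, and the repeat counts are arbitrary choices for illustration, not values prescribed by the lab.

```python
import timeit

try:
    import numpy as np
except ImportError:          # numpy is optional for this sketch
    np = None


def sum_squares_loop(xs):
    total = 0
    for x in xs:
        total += x * x
    return total


def sum_squares_genexp(xs):
    return sum(x * x for x in xs)


def sum_squares_numpy(arr):
    # The whole loop runs inside numpy's compiled code, not the eval loop.
    return int(np.dot(arr, arr))


def bench(label, fn, arg, number=50, repeat=5):
    """Print and return the best observed time per call, in seconds."""
    best = min(timeit.repeat(lambda: fn(arg), number=number, repeat=repeat))
    per_call = best / number
    print(f"{label:<8}{per_call * 1e6:12.2f} us per call")
    return per_call


data = list(range(10_000))
bench("loop", sum_squares_loop, data)
bench("genexp", sum_squares_genexp, data)
if np is not None:
    # An explicit 64-bit dtype avoids overflow on platforms with a 32-bit default int.
    bench("numpy", sum_squares_numpy, np.arange(10_000, dtype=np.int64))
```

Taking the minimum over several repeats reduces noise from other processes. For step 2, the same bench helper can be reused on a function before and after hoisting its global into a default argument.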

9.4 Idiomatic & Linter Drill

  • Enable ruff PERF. Read every rule. Identify cases in your codebase where the rule applies but readability suffers.

9.5 Production Hardening Slice

  • Add pytest-benchmark to CI as a non-failing job that publishes JSON results. Build a script that flags >10% regressions on PRs.
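
The regression-flagging script could look roughly like the sketch below. It assumes the JSON layout written by pytest-benchmark's --benchmark-json option (a top-level "benchmarks" list whose entries carry "fullname" and "stats"/"mean"); verify that against your own output files before relying on it. The function names and the script name used in the usage note are hypothetical.

```python
import json
import sys

THRESHOLD = 0.10  # flag slowdowns greater than 10%


def load_means(path):
    """Map benchmark name -> mean seconds from one results file (assumed layout)."""
    with open(path, encoding="utf-8") as fh:
        doc = json.load(fh)
    return {b["fullname"]: b["stats"]["mean"] for b in doc["benchmarks"]}


def find_regressions(baseline, candidate, threshold=THRESHOLD):
    """Return {name: relative slowdown} for benchmarks slower by more than threshold."""
    regressions = {}
    for name, base in baseline.items():
        new = candidate.get(name)
        if new is None or base <= 0:
            continue  # benchmark removed or unusable baseline: nothing to compare
        change = (new - base) / base
        if change > threshold:
            regressions[name] = change
    return regressions


def main(argv):
    baseline = load_means(argv[1])
    candidate = load_means(argv[2])
    for name, change in sorted(find_regressions(baseline, candidate).items()):
        print(f"REGRESSION {name}: {change:+.1%}")
    return 0  # non-failing job: report the regression, never break the build


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv))
```

A run would look like `python compare_bench.py baseline.json pr.json`, with the first file produced on the main branch and the second on the PR. Comparing means is the simplest choice; medians or minimums are less sensitive to noisy CI runners.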
