
Week 10 - Memory: Refcounts, Cyclic GC, the pymalloc Allocator

10.1 Conceptual Core

  • Reference counting is eager - in CPython most objects are freed the moment their refcount reaches 0, deterministically, often without invoking the GC at all. This is why del f closes a file handle when f held the last reference, and why context managers are the right answer for resources that cannot tolerate non-determinism.
  • The cyclic GC tracks only objects that might form cycles (containers). It runs in three generations, each with its own threshold. It does not free memory directly; it finds unreachable cycles and breaks them so that refcounting can free the memory.
  • The CPython allocator (pymalloc) is an arena/pool/block allocator tuned for small objects (requests of up to 512 bytes). Larger allocations go to the system malloc.
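
The split between refcounting and the cyclic GC can be observed directly. The following is a minimal sketch, assuming CPython; the Node class and demo function are hypothetical names used only for illustration. A weak reference acts as a probe that reports whether the object is still alive without keeping it alive.

```python
import gc
import weakref


class Node:
    """Plain container object; instances can participate in cycles."""

    def __init__(self, name):
        self.name = name
        self.peer = None


def demo():
    gc.disable()  # keep the cyclic collector from running on its own
    try:
        # Case 1: no cycle. Dropping the last reference frees immediately.
        a = Node("a")
        a_alive = weakref.ref(a)
        del a
        freed_by_refcount = a_alive() is None

        # Case 2: a two-node cycle. The refcounts never reach zero unaided.
        x, y = Node("x"), Node("y")
        x.peer, y.peer = y, x
        x_alive = weakref.ref(x)
        del x, y
        survives_without_gc = x_alive() is not None

        gc.collect()  # the collector breaks the cycle; refcounting then frees
        freed_after_collect = x_alive() is None
    finally:
        gc.enable()
    return freed_by_refcount, survives_without_gc, freed_after_collect


print(demo())
```

On CPython all three flags should come back True: the acyclic object dies at del, the cycle survives until gc.collect() runs, and it is gone afterwards. Other interpreters (PyPy, for example) make no such promise about the first flag.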

10.2 Mechanical Detail

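  • sys.getrefcount(obj): returns the refcount plus 1, because the argument passed to the call is itself a temporary reference.
  • weakref.ref: refers to an object without raising its refcount. Use it for back-pointers and caches that would otherwise create cycles.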
  • gc.set_threshold, gc.disable, gc.collect. Disabling GC during a known short-lived high-allocation phase (e.g., model loading) and re-enabling after is a real production technique.
  • Memory leaks in pure Python are almost always (a) caches without bounds, (b) closures capturing large objects, (c) __del__ methods on objects in cycles (a legacy issue: such cycles were uncollectable before 3.4; mostly fixed by PEP 442). Find them with tracemalloc or memray.
  • __slots__ revisited: per-instance memory savings, attribute-access speed-ups, and the inheritance gotcha (a subclass that omits __slots__ silently regains a per-instance __dict__).
  • array.array, bytes, bytearray, memoryview, numpy.ndarray: when not to make Python objects in the first place.
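
Several of the mechanics above fit in one short sketch. The helper gc_paused and the three point classes are hypothetical names introduced here for illustration, not part of any library; the list comprehension is only a stand-in for an allocation-heavy phase such as model loading.

```python
import contextlib
import gc
import sys


@contextlib.contextmanager
def gc_paused():
    """Hypothetical helper: disable the cyclic GC for a block, then restore it."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()


class Point:
    __slots__ = ("x", "y")  # no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y


class LabeledPoint(Point):
    # No __slots__ declared, so instances get a per-instance __dict__ back.
    pass


class SlottedLabeledPoint(Point):
    __slots__ = ("label",)  # declare only the new attribute; x and y are inherited


obj = object()
# The reported count includes the temporary reference held by the call itself.
print(sys.getrefcount(obj))

with gc_paused():
    # Stand-in for a short-lived, allocation-heavy phase.
    points = [Point(i, i) for i in range(100_000)]

print(hasattr(Point(0, 0), "__dict__"))
print(hasattr(LabeledPoint(0, 0), "__dict__"))
print(hasattr(SlottedLabeledPoint(0, 0), "__dict__"))
```

Restoring the previous GC state, rather than calling gc.enable() unconditionally, keeps the helper safe to nest and safe to use in code that runs with the GC already disabled. The exact value printed by sys.getrefcount varies with the interpreter version, so treat it as a debugging aid rather than something to assert on.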

10.3 Lab - "Find the Leak"

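  1. Write a service with three deliberate leaks: an unbounded dict cache, a leaking closure, and a circular reference with a __del__. Run it under memray and tracemalloc. Identify each leak from the output. (On 3.4+ the cycle is eventually collected, so expect it to show up as delayed release rather than unbounded growth.)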
  2. Bound the cache with functools.lru_cache(maxsize=...). Confirm with memray that growth flatlines.
  3. Profile a NumPy-heavy workload. Observe that pymalloc and per-object refcounting play only a small role - most memory is in NumPy buffers, which are allocated outside pymalloc. Internalize: "NumPy is a different memory world."

10.4 Idiomatic & Linter Drill

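  • Enable ruff B008 and B023. B023 catches closures that capture a loop variable by reference; B008 catches function calls in argument defaults, which are evaluated once at definition time. Both bugs are then caught at lint time.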
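
The two patterns these rules target look like this. A minimal sketch; all function names are hypothetical, and the "fixed" variants show one common remedy rather than the only one.

```python
import time


def make_multipliers_buggy():
    multipliers = []
    for factor in range(3):
        # B023: the lambda looks up `factor` when called, not when defined,
        # so every lambda sees the final loop value.
        multipliers.append(lambda x: x * factor)
    return multipliers


def make_multipliers_fixed():
    multipliers = []
    for factor in range(3):
        # Binding `factor` as a default argument freezes its value per iteration.
        multipliers.append(lambda x, factor=factor: x * factor)
    return multipliers


def stamp_buggy(ts=time.time()):
    # B008: time.time() ran once, when the function was defined.
    return ts


def stamp_fixed(ts=None):
    # Use a None sentinel and compute the default at call time.
    return time.time() if ts is None else ts


print([f(10) for f in make_multipliers_buggy()])  # [20, 20, 20]
print([f(10) for f in make_multipliers_fixed()])  # [0, 10, 20]
print(stamp_buggy() == stamp_buggy())             # True: the default is frozen
```

The buggy closure also matters for memory: it keeps the enclosing scope's variable alive, which is the same mechanism behind the "closures capturing large objects" leak class.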

10.5 Production Hardening Slice

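  • Add a memray smoke job to CI: run the service against a fixture, fail if the peak memory it reports exceeds a threshold. (memray's peak figure counts tracked allocations, which follows RSS closely but is not identical to it.)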
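
The shape of such a gate is sketched below. To stay runnable with no extra dependencies, this sketch swaps in the standard-library tracemalloc for memray, so it measures Python-level traced allocations rather than RSS. The 50 MiB budget, run_fixture, and the other names are hypothetical placeholders.

```python
import sys
import tracemalloc

PEAK_LIMIT_BYTES = 50 * 1024 * 1024  # hypothetical budget: 50 MiB


def run_fixture():
    """Hypothetical stand-in for driving the service against a fixture."""
    payload = [bytes(1024) for _ in range(1_000)]  # roughly 1 MiB of data
    return len(payload)


def peak_traced_bytes(workload):
    """Run workload under tracemalloc and return the peak traced size."""
    tracemalloc.start()
    try:
        workload()
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def main():
    peak = peak_traced_bytes(run_fixture)
    print(f"peak traced memory: {peak / 2**20:.1f} MiB")
    if peak > PEAK_LIMIT_BYTES:
        print("memory budget exceeded", file=sys.stderr)
        return 1  # a non-zero exit status fails the CI job
    return 0


exit_code = main()
print(f"exit code: {exit_code}")
```

In a real job the script would end with sys.exit(main()) so that the CI runner sees the status. Set the budget with some headroom above the observed peak so the job flags regressions without flaking on normal variance.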
