Week 10 - Memory: Refcounts, Cyclic GC, the pymalloc Allocator
10.1 Conceptual Core
- Reference counting is eager: most objects die the moment their refcount reaches 0, deterministically, often without invoking the GC at all. This is why, in CPython, a file handle is closed by `del f` when `f` holds the last reference, and why context managers are the right answer for resources that cannot tolerate non-determinism.
- The cyclic GC handles only objects that might form cycles (containers). It traditionally runs in three generations with thresholds. It does not free memory itself; it breaks cycles so that refcounting can free the memory.
- The CPython allocator (`pymalloc`) is an arena/pool/block allocator tuned for small objects (512 bytes or less). Larger allocations go to the system `malloc`.
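The split between eager refcounting and the cyclic collector can be observed directly. The following is a minimal sketch, assuming CPython; `Node` and `demo` are hypothetical names introduced only for this illustration, and weak references are used purely as a probe for "is the object still alive?".

```python
import gc
import weakref


class Node:
    """Hypothetical container type used only for this illustration."""

    def __init__(self, name):
        self.name = name
        self.other = None


def demo():
    was_enabled = gc.isenabled()
    gc.disable()  # stop the cyclic collector from running on its own during the demo
    try:
        # Case 1: no cycle. Dropping the last reference frees the object at once.
        a = Node("a")
        a_alive = weakref.ref(a)
        del a
        freed_by_refcount = a_alive() is None

        # Case 2: a two-node cycle. The refcounts never reach zero on their own.
        x, y = Node("x"), Node("y")
        x.other, y.other = y, x
        x_alive = weakref.ref(x)
        del x, y
        survives_without_gc = x_alive() is not None

        # The cyclic collector finds the unreachable cycle and breaks it.
        gc.collect()
        freed_after_collect = x_alive() is None
    finally:
        if was_enabled:
            gc.enable()
    return freed_by_refcount, survives_without_gc, freed_after_collect


print(demo())  # (True, True, True) on CPython
```

Other Python implementations (PyPy, for example) do not use reference counting, so the first case is not guaranteed there. That is the practical argument for `with` blocks over relying on `del`.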
10.2 Mechanical Detail
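- `sys.getrefcount(obj)`: the result is typically one higher than you expect, because the argument itself is a temporary reference on the call stack. Exact values can vary between CPython versions, so compare counts rather than relying on absolute numbers.
- `weakref.ref` to break cycles: hold back-references (child to parent, observer to subject) weakly so that no strong cycle forms.
- `gc.set_threshold`, `gc.disable`, `gc.collect`. Disabling GC during a known short-lived high-allocation phase (e.g., model loading) and re-enabling it afterwards is a real production technique.
- Memory leaks in pure Python are almost always (a) caches without bounds, (b) closures capturing large objects, or (c) `__del__` methods on cyclic objects (a legacy issue; mostly fixed since 3.4 / PEP 442). Find them with `tracemalloc` or `memray`.
- `__slots__` revisited: per-instance memory savings, attribute-access speed-ups, and the inheritance gotcha (a subclass that does not declare `__slots__` gets a `__dict__` again).
- `array.array`, `bytes`, `bytearray`, `memoryview`, `numpy.ndarray`: when not to make Python objects in the first place.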
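The first three tools in this list fit into one short sketch. It assumes CPython; `Payload`, `TreeNode`, `refcount_delta`, `parent_freed_without_gc`, and `gc_paused` are hypothetical names made up for illustration. The refcount check compares two readings instead of asserting an absolute value, since the absolute number depends on interpreter internals.

```python
import gc
import sys
import weakref
from contextlib import contextmanager


class Payload:
    """Hypothetical placeholder object."""


def refcount_delta():
    obj = Payload()
    before = sys.getrefcount(obj)
    alias = obj  # one additional strong reference
    after = sys.getrefcount(obj)
    del alias
    return after - before  # the alias accounts for exactly one reference


class TreeNode:
    """Hypothetical tree node: children are held strongly, the parent weakly."""

    def __init__(self, parent=None):
        self.children = []
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Returns None once the parent has been freed.
        return self._parent() if self._parent is not None else None


def parent_freed_without_gc():
    root = TreeNode()
    child = TreeNode(root)
    root_alive = weakref.ref(root)
    del root  # child only holds a weak reference, so no cycle keeps root alive
    return root_alive() is None and child.parent is None


@contextmanager
def gc_paused():
    """Disable the cyclic GC for a block, restoring the previous state afterwards."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()


with gc_paused():
    # Stand-in for a short, allocation-heavy phase such as loading a model.
    records = [{"id": i} for i in range(100_000)]

print(refcount_delta(), parent_freed_without_gc(), len(records))
```

Restoring the previous GC state, instead of calling `gc.enable()` unconditionally, keeps the helper safe to nest and safe to use in code that already runs with the collector off.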
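The `__slots__` points, including the inheritance gotcha, can be checked in a few lines. This is an illustrative sketch with hypothetical class names; `footprint` is a rough shallow measure built on `sys.getsizeof`, not a precise accounting, and the exact byte counts differ between CPython versions.

```python
import sys


class PointDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class LooseSubclass(PointSlots):
    """Inheritance gotcha: no __slots__ declared, so instances get a __dict__ again."""


class TightSubclass(PointSlots):
    __slots__ = ("z",)  # declare only the attributes this subclass adds


def footprint(obj):
    """Approximate shallow size: the instance plus its __dict__, if it has one."""
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size


def rejects_new_attribute(obj):
    """True if assigning an undeclared attribute raises AttributeError."""
    try:
        obj.undeclared = 1
    except AttributeError:
        return True
    return False


print("dict-based :", footprint(PointDict(1, 2)), "bytes")
print("slots-based:", footprint(PointSlots(1, 2)), "bytes")
print("loose subclass has __dict__:", hasattr(LooseSubclass(1, 2), "__dict__"))
print("tight subclass has __dict__:", hasattr(TightSubclass(1, 2), "__dict__"))
```

The saving per instance is small in absolute terms; it matters when a program holds very large numbers of such objects at once.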
10.3 Lab - "Find the Leak"
- Write a service that has three deliberate leaks: an unbounded `dict` cache, a leaking closure, and a circular reference with a `__del__`. Run it under `memray` and `tracemalloc`. Identify each leak from the output.
- Bound the cache with `functools.lru_cache(maxsize=...)`. Confirm with `memray` that growth flatlines.
- Profile a NumPy-heavy workload. Observe that pymalloc and Python refcounts are largely unused: most memory is in NumPy buffers. Internalize: "NumPy is a different memory world."
10.4 Idiomatic & Linter Drill
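- Enable the `ruff` rules `B008` (function call in an argument default) and `B023` (function defined in a loop that uses the loop variable). `B023` catches closure-capture bugs at lint time; `B008` catches defaults that are evaluated once, at definition time.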
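The two patterns these rules target look like this in practice. The functions below are hypothetical examples written for this drill; the comments about which rule fires reflect the rule descriptions as understood here and are worth confirming by running `ruff` on the file.

```python
import itertools

_ids = itertools.count(1)


def next_id():
    """Hypothetical ID source; each call returns a new integer."""
    return next(_ids)


def make_handlers_buggy(names):
    handlers = []
    for name in names:
        # B023 pattern: the lambda looks up `name` when called, not when defined,
        # so every handler sees the value from the last iteration.
        handlers.append(lambda: f"hello {name}")
    return handlers


def make_handlers_fixed(names):
    handlers = []
    for name in names:
        # Binding through a default argument captures the current value.
        handlers.append(lambda name=name: f"hello {name}")
    return handlers


def tag_buggy(label, ident=next_id()):
    # B008 pattern: next_id() ran once, when the function was defined,
    # so every call without an explicit ident shares the same value.
    return ident, label


def tag_fixed(label, ident=None):
    if ident is None:
        ident = next_id()  # evaluated on each call
    return ident, label


print([h() for h in make_handlers_buggy(["a", "b", "c"])])  # ['hello c', 'hello c', 'hello c']
print([h() for h in make_handlers_fixed(["a", "b", "c"])])  # ['hello a', 'hello b', 'hello c']
print(tag_buggy("first")[0] == tag_buggy("second")[0])      # True: the default is shared
print(tag_fixed("first")[0] == tag_fixed("second")[0])      # False: a fresh ID per call
```

The loop-variable case is also a memory concern: a closure created in a loop keeps whatever it captured alive for as long as the closure itself is reachable.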
10.5 Production Hardening Slice
- Add a `memray` smoke job to CI: run the service against a fixture, fail if peak RSS exceeds a threshold.
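The shape of such a gate can be sketched without `memray` at all. The version below swaps in the standard-library `tracemalloc`, which measures peak Python-level allocations, not RSS, so treat it as a stand-in that shows the pass/fail logic only. `run_fixture`, `PEAK_LIMIT_BYTES`, and the 20 MiB budget are hypothetical placeholders.

```python
import tracemalloc

PEAK_LIMIT_BYTES = 20 * 1024 * 1024  # hypothetical budget; tune per service


def run_fixture():
    """Hypothetical stand-in for 'run the service against a fixture'."""
    rows = [{"id": i, "payload": "x" * 100} for i in range(10_000)]
    return len(rows)


def peak_traced_bytes(workload):
    """Run `workload` and return the peak number of bytes traced by tracemalloc."""
    tracemalloc.start()
    try:
        workload()
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def check_memory_budget(workload, limit=PEAK_LIMIT_BYTES):
    """Return (within_budget, peak_bytes) for the given workload."""
    peak = peak_traced_bytes(workload)
    return peak <= limit, peak


def main():
    """Return a process exit code: 0 if the workload stays within budget, 1 otherwise."""
    ok, peak = check_memory_budget(run_fixture)
    print(f"peak traced memory: {peak / (1024 * 1024):.1f} MiB "
          f"(limit {PEAK_LIMIT_BYTES / (1024 * 1024):.0f} MiB)")
    return 0 if ok else 1
```

A CI step would call `sys.exit(main())` so that a breach fails the job. Because `tracemalloc` does not see memory allocated outside Python's allocators, a real job that needs native allocations or resident size should take its number from `memray` or from the operating system instead.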