Week 6 - Swapping, OOM, Memory Pressure (PSI)
6.1 Conceptual Core
- Swap is the kernel's overflow valve when anonymous-memory pressure exceeds available RAM. Modern systems often run swapless or with a small, often compressed swap (e.g., `zswap` or `zram`).
- The OOM killer is the kernel's last-resort mechanism: when all reclaim has failed and an allocation cannot be satisfied, it kills the process with the highest `oom_score` (a heuristic based on memory usage, adjusted by `oom_score_adj`).
- PSI (Pressure Stall Information): `/proc/pressure/{cpu,memory,io}` and the per-cgroup `{cpu,memory,io}.pressure` files report the time the system or a cgroup spent stalled on each resource. The modern signal for "this system is sad."
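To make the "highest `oom_score`" rule concrete, the following is a minimal sketch that ranks processes by the score the kernel exposes in procfs. The helper names `read_int` and `rank_oom_candidates` are hypothetical, and the sketch only reads `/proc/<pid>/oom_score`, `/proc/<pid>/oom_score_adj` and `/proc/<pid>/comm`; it does not reproduce the kernel's selection logic (for example, a cgroup-level OOM only considers tasks inside that cgroup).

```python
from pathlib import Path


def read_int(path):
    """Return the integer stored in a procfs file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def rank_oom_candidates(proc_root="/proc", limit=5):
    """List (oom_score, oom_score_adj, pid, comm), highest oom_score first."""
    rows = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        score = read_int(entry / "oom_score")
        adj = read_int(entry / "oom_score_adj")
        if score is None or adj is None:
            continue  # the process exited or is not readable
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            comm = "?"
        rows.append((score, adj, int(entry.name), comm))
    # Sort on the score only; a higher score means a more likely victim.
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]
```

On a Linux host, `rank_oom_candidates()` returns the five processes a system-wide OOM event would most likely target, which is a quick way to check whether an `oom_score_adj` change had the intended effect.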
6.2 Mechanical Detail
- `swapon`, `swapoff`, `/proc/swaps`, `vm.swappiness`. `zram` (compressed RAM-backed swap) configuration via `systemd-zram-generator` or manually with `zramctl`.
- OOM tuning: `oom_score_adj` per-process (`/proc/<pid>/oom_score_adj`, range -1000 to 1000). `systemd` `OOMScoreAdjust=` directive. `vm.overcommit_memory` (0/1/2): heuristic overcommit (the default) / always allow / strict accounting.
- PSI semantics:
  - `some` - at least one task stalled.
  - `full` - all non-idle tasks stalled at the same time (system-wide can't reach this for CPU).
  - Numbers are 10s/60s/300s averages of stall percentage, plus a cumulative `total` stall time in microseconds.
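The PSI file layout described above can be sketched as a small parser. This is an illustrative helper, not part of any library: `parse_psi`, `read_psi` and the `SAMPLE` text are hypothetical names, and the numbers in `SAMPLE` are made up to show the format (`avg10`, `avg60`, `avg300` as percentages, `total` in microseconds).

```python
SAMPLE = """some avg10=1.50 avg60=0.40 avg300=0.08 total=123456
full avg10=0.75 avg60=0.20 avg300=0.04 total=65432
"""


def parse_psi(text):
    """Parse PSI file contents into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        kind, pairs = fields[0], fields[1:]
        stats = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            # avg* values are percentages; total is cumulative microseconds.
            stats[key] = int(value) if key == "total" else float(value)
        result[kind] = stats
    return result


def read_psi(path="/proc/pressure/memory"):
    """Read and parse a PSI file (system-wide by default)."""
    with open(path) as fh:
        return parse_psi(fh.read())
```

Passing a cgroup's `memory.pressure` path to `read_psi` gives the same structure for that cgroup; `parse_psi(SAMPLE)["some"]["avg10"]` is the 10-second `some` average from the sample text.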
6.3 Lab - "Pressure and the OOM Killer"
- Write a memory-eater program. Run it inside a cgroup with `memory.high=512M`. Observe `memory.pressure` rise.
- Push past `memory.max`; watch the OOM killer. Check `dmesg` and `journalctl -k | grep -i 'killed process'`.
- Set `oom_score_adj=-500` on a critical process; verify it survives an OOM event triggered by another, lower-priority hog.
- Measure PSI under realistic load: capture `memory.pressure` every second for 5 minutes during a workload spike. Plot the result.
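One possible starting point for the lab is sketched below: a memory eater that touches every page it allocates, and a sampler that records the raw contents of a PSI file at a fixed interval. The function names, default chunk size and the 4096-byte page size are assumptions made for illustration; the cgroup v2 pressure file is named `memory.pressure`, and its path for your cgroup must be supplied by you.

```python
import time


def eat_memory(total_mib, chunk_mib=16, pause_s=0.1, page_size=4096):
    """Allocate total_mib MiB in chunks, touching each page so it is resident."""
    chunks = []
    remaining = total_mib
    while remaining > 0:
        size = min(chunk_mib, remaining)
        buf = bytearray(size * 1024 * 1024)
        # Write one byte per page; untouched pages may never be backed by RAM.
        for offset in range(0, len(buf), page_size):
            buf[offset] = 1
        chunks.append(buf)  # keep a reference so the memory is not freed
        remaining -= size
        time.sleep(pause_s)  # grow gradually so pressure is observable
    return chunks


def sample_pressure(path, count, interval_s=1.0):
    """Return a list of (unix_timestamp, raw_text) samples from a PSI file."""
    samples = []
    for _ in range(count):
        with open(path) as fh:
            samples.append((time.time(), fh.read().strip()))
        time.sleep(interval_s)
    return samples
```

For the five-minute capture, `sample_pressure(path, count=300)` run in a separate process from the eater would give one sample per second; the raw text can then be parsed and plotted with whatever tooling you prefer. Calling `eat_memory(1024)` inside a cgroup limited to 512M should first show throttling and rising pressure, and then an OOM kill once `memory.max` is exceeded.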
6.4 Hardening Drill
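- Add `MemoryHigh=` and `MemoryMax=` to every long-running service. Use `MemoryHigh` as a soft target (above it, allocations are throttled and memory is reclaimed aggressively) and `MemoryMax` as the hard cliff (exceeding it triggers the OOM killer inside the unit's cgroup).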
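If you roll these limits out to many services, generating the drop-in text programmatically may help keep them consistent. The sketch below only renders the text of a `[Service]` drop-in; `render_memory_dropin` is a hypothetical helper, and the `512M`/`768M` values in the usage note are illustrative, not recommendations.

```python
def render_memory_dropin(memory_high, memory_max):
    """Return the text of a systemd drop-in setting MemoryHigh= and MemoryMax=."""
    return (
        "[Service]\n"
        f"MemoryHigh={memory_high}\n"
        f"MemoryMax={memory_max}\n"
    )
```

For example, the output of `render_memory_dropin("512M", "768M")` could be written to a `.conf` file in a unit's drop-in directory (`/etc/systemd/system/<unit>.service.d/`), followed by `systemctl daemon-reload` and a restart of the unit. Keeping `MemoryHigh` below `MemoryMax` leaves room for throttling to act before the hard limit is hit.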
6.5 Performance Tuning Slice
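- Hook `bpftrace -e 'kprobe:oom_kill_process { $oc = (struct oom_control *)arg0; printf("%s triggered OOM kill of %s\n", comm, $oc->chosen->comm); }'` to observe OOM events live. The first argument of `oom_kill_process` is a `struct oom_control *`, so the victim's name is reached through `chosen->comm`; resolving the struct requires kernel type information (BTF or kernel headers).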