Week 5 - Virtual Memory, Paging, and the Page Cache
5.1 Conceptual Core
- Each process has a private virtual address space (described in the kernel by `mm_struct`). The MMU translates virtual to physical addresses via page tables (4-level on x86_64; 5-level on newer CPUs).
- Pages are 4 KiB by default. HugePages (2 MiB or 1 GiB) reduce TLB pressure for memory-intensive workloads, because each TLB entry then covers far more memory.
- Memory is divided into anonymous (heap, stack) and file-backed (mmap'd files, page cache for read/write).
- The page cache is Linux's most aggressive optimization: nearly every buffered read of a regular file is served from or populates the cache; writes are buffered as dirty pages until background writeback (or an explicit `fsync()`) flushes them to disk.
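
To make the page-table walk concrete, here is a minimal sketch of how a virtual address splits into table indices, assuming the usual x86_64 layout of 9 index bits per level and a 12-bit offset for 4 KiB pages. The helper names `split_vaddr` and `tlb_reach` are hypothetical, and the 64-entry TLB used in the usage note is an illustrative figure, not a measurement of any particular CPU.

```python
PAGE_SHIFT = 12       # 4 KiB base pages -> 12 offset bits
BITS_PER_LEVEL = 9    # 512 entries per page-table page


def split_vaddr(vaddr, levels=4):
    """Split a virtual address into per-level table indices plus page offset."""
    if levels == 4:
        names = ["pgd", "pud", "pmd", "pte"]
    elif levels == 5:
        names = ["pgd", "p4d", "pud", "pmd", "pte"]
    else:
        raise ValueError("levels must be 4 or 5")
    mask = (1 << BITS_PER_LEVEL) - 1
    parts = {}
    for i, name in enumerate(names):
        # The top level uses the highest bits; each lower level sits 9 bits below.
        shift = PAGE_SHIFT + BITS_PER_LEVEL * (levels - 1 - i)
        parts[name] = (vaddr >> shift) & mask
    parts["offset"] = vaddr & ((1 << PAGE_SHIFT) - 1)
    return parts


def tlb_reach(entries, page_size):
    """Bytes of address space covered by `entries` TLB entries."""
    return entries * page_size
```

For example, `tlb_reach(64, 4096)` is 256 KiB, while `tlb_reach(64, 2 * 1024 * 1024)` is 128 MiB, which is the arithmetic behind the claim that huge pages reduce TLB pressure.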
5.2 Mechanical Detail
- `/proc/meminfo` line decoding:
  - `MemTotal` / `MemFree` / `MemAvailable` - the last is the one to monitor: it estimates how much memory new workloads can use without swapping.
  - `Buffers` - block-device caches.
  - `Cached` - page cache.
  - `Active` / `Inactive` (`anon`, `file`) - LRU lists.
  - `Dirty` / `Writeback` - outstanding writeback work.
  - `Slab` (`SReclaimable` / `SUnreclaim`) - kernel object allocators.
  - `AnonHugePages` / `HugePages_*` - transparent and explicit hugepages.
- `vm.dirty_background_ratio` / `vm.dirty_ratio` - the dirty-memory percentages at which the kernel starts background writeback and, at the higher threshold, forces writers to block on writeback.
- `vm.swappiness` - bias between swapping anon vs evicting file pages. Default 60; for DB servers often lowered to 10 or even 1.
- `mm/memory.c::handle_mm_fault` - the architecture-independent page-fault entry point. Three classes: minor (already in page cache, just map), major (must read from disk), and COW (write to a write-protected private page, e.g. after `fork()` or in a `MAP_PRIVATE` mapping).
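
The `/proc/meminfo` fields above can be decoded with a few lines of code. The following is a minimal sketch, assuming the usual `Name:   value kB` line format. `parse_meminfo`, `summarize`, and the `SAMPLE` text are hypothetical, and the sample numbers are invented for illustration rather than captured from a real host.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}; kB values become bytes."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue  # skip blank or malformed lines
        value = int(fields[0])
        if len(fields) > 1 and fields[1] == "kB":
            value *= 1024
        info[key.strip()] = value
    return info


def summarize(info):
    """Derive a few of the quantities discussed above from parsed fields."""
    return {
        "available_fraction": info["MemAvailable"] / info["MemTotal"],
        "pending_writeback_bytes": info["Dirty"] + info["Writeback"],
        # HugePages_Total is a page count, Hugepagesize is a size in bytes here.
        "hugetlb_pool_bytes": info.get("HugePages_Total", 0)
        * info.get("Hugepagesize", 0),
    }


# Invented sample, shaped like real /proc/meminfo output.
SAMPLE = """\
MemTotal:       16384000 kB
MemFree:         1024000 kB
MemAvailable:    8192000 kB
Buffers:          204800 kB
Cached:          6144000 kB
Dirty:              2048 kB
Writeback:             0 kB
HugePages_Total:     512
Hugepagesize:       2048 kB
"""
```

On a Linux host the same functions should work on live data via `summarize(parse_meminfo(open("/proc/meminfo").read()))`.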
5.3 Lab - "Memory Forensics"
- Run `vmstat 1` and `free -h` while loading a 4 GB file with `cat file > /dev/null`. Watch the cache columns (and `Cached` in `/proc/meminfo`) grow.
- As root, run `sync` and then `echo 3 > /proc/sys/vm/drop_caches`, and observe the eviction. Only clean pages are dropped, hence the `sync` first.
- `mmap` a large file with `MAP_PRIVATE`, write to it, and observe `AnonHugePages` and the COW behavior in `/proc/<pid>/smaps`.
- Configure `vm.nr_hugepages=512` (1 GiB of 2 MiB pages). Allocate via `MAP_HUGETLB`. Measure the latency-distribution change vs default pages.
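
As a starting point for the `MAP_PRIVATE` step, here is a minimal sketch that writes to a private file mapping and checks that the file itself is untouched. It assumes a Unix system where Python's `mmap` and `resource` modules are available. `private_mapping_demo` is a hypothetical helper, and the 1 MiB default size is chosen only to keep the run fast (the lab calls for a large file).

```python
import mmap
import resource
import tempfile


def private_mapping_demo(size=1 << 20):
    """Dirty every page of a MAP_PRIVATE mapping and report what changed."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"A" * size)
        tmp.flush()
        before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        m = mmap.mmap(tmp.fileno(), size, flags=mmap.MAP_PRIVATE,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        try:
            # Touch one byte per page; each first write lands on a private copy.
            for off in range(0, size, mmap.PAGESIZE):
                m[off:off + 1] = b"B"
            after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
            mapped_first = m[0:1]
        finally:
            m.close()
        # Re-read the file: private writes must not reach it.
        tmp.seek(0)
        file_first = tmp.read(1)
    return {
        "mapped_first": mapped_first,
        "file_first": file_first,
        "minor_faults": after - before,  # process-wide delta, so only approximate
    }
```

The fault delta is process-wide and includes interpreter noise, so treat it as a rough indicator. For the lab itself, keep the mapping open (for example by pausing before `m.close()`) and inspect the mapping's entry in `/proc/<pid>/smaps` from another terminal.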
5.4 Hardening Drill
- Set `vm.unprivileged_userfaultfd=0` (userfaultfd is a frequently exploited surface) and `vm.mmap_min_addr=65536` (defense against kernel NULL-pointer-dereference exploits). Document the reasoning.
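
One way to verify that the settings took effect is to read them back from `/proc/sys`. The sketch below assumes the standard mapping of sysctl names to files (dots become slashes). `EXPECTED` and `audit_sysctls` are hypothetical names, and the strict equality check is a simplification: a larger `vm.mmap_min_addr` would also be acceptable in practice.

```python
from pathlib import Path

# Target values from the drill above, keyed by path relative to /proc/sys.
EXPECTED = {
    "vm/unprivileged_userfaultfd": "0",
    "vm/mmap_min_addr": "65536",
}


def audit_sysctls(expected, root="/proc/sys"):
    """Return {key: (current_value_or_None, matches_expected)}."""
    findings = {}
    for key, want in expected.items():
        try:
            have = (Path(root) / key).read_text().strip()
        except OSError:
            have = None  # missing on this kernel, or not readable
        findings[key] = (have, have == want)
    return findings
```

Calling `audit_sysctls(EXPECTED)` on a Linux host returns the current value and a pass/fail flag per setting; the `root` parameter exists so the function can be pointed at a test directory.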
5.5 Performance Tuning Slice
- Use `perf stat -e dTLB-load-misses,dTLB-loads ./prog`. If the dTLB miss ratio (misses divided by loads) exceeds 1%, evaluate hugepages or `madvise(MADV_HUGEPAGE)`.
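
The decision rule is simple arithmetic on the two counters. Here is a minimal sketch, assuming the counts are copied by hand from the `perf stat` output. `tlb_miss_ratio` and `should_evaluate_hugepages` are hypothetical helpers, and the 1% default mirrors the rule of thumb in this section.

```python
def tlb_miss_ratio(misses, loads):
    """Fraction of dTLB loads that missed."""
    if loads <= 0:
        raise ValueError("loads must be positive")
    return misses / loads


def should_evaluate_hugepages(misses, loads, threshold=0.01):
    """True when the miss ratio exceeds the threshold (1% by default)."""
    return tlb_miss_ratio(misses, loads) > threshold
```

For instance, 1,500 misses out of 100,000 loads is a 1.5% ratio, so `should_evaluate_hugepages(1500, 100000)` returns `True`, whereas 500 misses out of 100,000 loads (0.5%) does not cross the threshold.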