Week 5 - Virtual Memory, Paging, and the Page Cache

5.1 Conceptual Core

  • Each process has a private virtual address space (mm_struct). The MMU translates virtual to physical addresses via page tables (4-level on x86_64; 5-level on newer CPUs).
  • Pages are 4 KiB by default. HugePages (2 MiB or 1 GiB) reduce TLB pressure for memory-intensive workloads.
  • Memory is divided into anonymous (heap, stack) and file-backed (mmap'd files, page cache for read/write).
  • The page cache is Linux's most aggressive optimization: nearly every read of a regular file is cached; writes are buffered until writeback (or fsync()).
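To make the page-table walk concrete, the sketch below splits a virtual address into the four 9-bit table indices and the 12-bit page offset used with 4 KiB pages and 4-level paging on x86_64. The level names follow the kernel's PGD/PUD/PMD/PTE terminology; the function names and the TLB entry count in the note afterwards are illustrative assumptions, not part of the course material.

```python
PAGE_SHIFT = 12          # 4 KiB pages -> 12 offset bits
BITS_PER_LEVEL = 9       # 512 entries per page table
LEVELS = ("pgd", "pud", "pmd", "pte")  # top level first


def split_vaddr(vaddr: int) -> dict:
    """Split a 48-bit virtual address into table indices and page offset."""
    va = vaddr & ((1 << 48) - 1)  # 4-level paging translates the low 48 bits
    parts = {"offset": va & ((1 << PAGE_SHIFT) - 1)}
    # Walk from the lowest level (pte) upward, 9 bits at a time.
    for depth, name in enumerate(reversed(LEVELS)):
        shift = PAGE_SHIFT + depth * BITS_PER_LEVEL
        parts[name] = (va >> shift) & ((1 << BITS_PER_LEVEL) - 1)
    return parts


def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of address space a TLB with `entries` entries can cover."""
    return entries * page_size
```

With a hypothetical 1,536-entry TLB, tlb_reach(1536, 4096) is 6 MiB while tlb_reach(1536, 2 * 1024**2) is 3 GiB, which is the arithmetic behind the claim that HugePages reduce TLB pressure.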

5.2 Mechanical Detail

  • /proc/meminfo line decoding:
      • MemTotal / MemFree / MemAvailable - MemAvailable is the one to monitor: it estimates how much memory new workloads can use without swapping.
      • Buffers - cache for raw block-device I/O.
      • Cached - page cache.
      • Active / Inactive (anon, file) - LRU lists.
      • Dirty / Writeback - outstanding writeback work.
      • Slab (SReclaimable / SUnreclaim) - kernel object allocators.
      • AnonHugePages / HugePages_* - transparent and explicit hugepages.
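  • vm.dirty_background_ratio / vm.dirty_ratio - dirty-page thresholds, as a percentage of free plus reclaimable memory. At the first, the kernel starts background writeback; at the second, processes that generate writes are forced to do writeback themselves.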
  • vm.swappiness - bias between swapping anon vs evicting file pages. Default 60; for DB servers often lowered to 10 or even 1.
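  • mm/memory.c::handle_mm_fault - the architecture-independent page-fault entry point. Three classes: minor (the page is already in memory, e.g. in the page cache, and only needs mapping), major (the page must be read from disk), and COW (a write to a private copy-on-write page, such as a MAP_PRIVATE mapping or memory shared after fork()).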
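The /proc/meminfo fields listed above can be read programmatically. The following is a minimal sketch, assuming the usual "Name:   value kB" layout; the helper names parse_meminfo and summarize, and the three derived figures, are choices made for this example rather than a standard tool.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Return {field: value}; values are in kB except unitless
    counters such as HugePages_Total."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        tokens = rest.split()
        if tokens and tokens[0].isdigit():
            fields[name.strip()] = int(tokens[0])
    return fields


def summarize(fields: dict) -> dict:
    """Derive a few figures worth watching from the parsed fields."""
    total = fields["MemTotal"]
    return {
        "available_pct": 100.0 * fields.get("MemAvailable", 0) / total,
        "cache_kb": fields.get("Cached", 0) + fields.get("Buffers", 0),
        "pending_writeback_kb": fields.get("Dirty", 0) + fields.get("Writeback", 0),
    }


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # only present on Linux
        print(summarize(parse_meminfo(meminfo.read_text())))
```

Running it in a loop while a large file is read shows cache_kb growing; pending_writeback_kb rises during heavy writes and falls as writeback completes.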

5.3 Lab: "Memory Forensics"

  1. Run vmstat 1 and free -h while loading a 4 GB file with cat file > /dev/null. Watch the cached memory grow (the cache column in vmstat, buff/cache in free, Cached in /proc/meminfo).
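  2. Run sync, then echo 3 > /proc/sys/vm/drop_caches as root, and observe the eviction. Dirty pages cannot be dropped until they have been written back, hence the sync.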
  3. mmap a large file with MAP_PRIVATE, write to it, and observe AnonHugePages and the COW behavior (the Private_Dirty and Anonymous counters) in /proc/<pid>/smaps.
  4. Configure vm.nr_hugepages=512 (1 GiB of 2 MiB pages). Allocate via MAP_HUGETLB. Measure the latency-distribution change vs default pages.
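Step 3 can be tried from Python before moving to C. This is a minimal sketch, assuming Linux and Python 3.8 or newer; mmap.ACCESS_COPY requests a private copy-on-write mapping (MAP_PRIVATE on Unix), and the helper names cow_demo and smaps_entry are made up for this example. It writes one byte through the mapping, confirms the file on disk is unchanged, and collects the smaps counters for that mapping.

```python
import mmap
import tempfile

WANTED = ("Rss", "Private_Dirty", "Anonymous", "AnonHugePages")


def smaps_entry(path: str, wanted=WANTED) -> dict:
    """Counters (kB) for the first mapping of `path` in /proc/self/smaps."""
    counters, inside = {}, False
    try:
        with open("/proc/self/smaps") as smaps:
            for line in smaps:
                tokens = line.split()
                if not tokens:
                    continue
                if not tokens[0].endswith(":"):   # mapping header line
                    if inside:
                        break                      # next mapping starts
                    inside = tokens[-1] == path
                elif inside and tokens[0][:-1] in wanted:
                    counters[tokens[0][:-1]] = int(tokens[1])
    except OSError:
        pass  # not Linux, or /proc is unavailable
    return counters


def cow_demo(size: int = 1 << 20) -> tuple:
    """Return (byte seen via mapping, byte on disk, smaps counters)."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"A" * size)
        f.flush()
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_COPY) as m:
            m[0:1] = b"B"                 # triggers a COW fault on one page
            seen = m[0:1]
            counters = smaps_entry(f.name)
        with open(f.name, "rb") as on_disk:
            return seen, on_disk.read(1), counters


if __name__ == "__main__":
    print(cow_demo())
```

The mapping shows the modified byte while the file keeps the original, and on Linux the touched page should appear under Private_Dirty and Anonymous for that mapping.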

5.4 Hardening Drill

  • Set vm.unprivileged_userfaultfd=0 (userfaultfd is a frequently exploited attack surface) and vm.mmap_min_addr=65536 (a defense against kernel NULL-pointer dereference exploits). Document the reasoning.
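One way to check that the drill's settings are in effect is to read them back from /proc/sys. The sketch below is only an illustration: the POLICY table and the audit helper are hypothetical, and treating mmap_min_addr as a minimum (rather than an exact value) is an assumption of this example.

```python
from pathlib import Path

# Rules derived from the drill; the comparison logic is this sketch's assumption.
POLICY = {
    "vm.unprivileged_userfaultfd": lambda v: v == 0,
    "vm.mmap_min_addr": lambda v: v >= 65536,
}


def read_sysctl(name: str) -> int:
    """Read an integer sysctl through its /proc/sys path."""
    path = Path("/proc/sys") / name.replace(".", "/")
    return int(path.read_text().split()[0])


def audit(policy=POLICY, read=read_sysctl) -> dict:
    """Return {name: (value or None, compliant)} for each rule."""
    report = {}
    for name, is_ok in policy.items():
        try:
            value = read(name)
        except (OSError, ValueError, IndexError):
            report[name] = (None, False)   # missing or unreadable knob
            continue
        report[name] = (value, is_ok(value))
    return report


if __name__ == "__main__":
    for name, (value, ok) in audit().items():
        print(f"{name} = {value} [{'ok' if ok else 'CHECK'}]")
```

To make the settings persistent, the same key=value pairs go into a file under /etc/sysctl.d/ and are loaded with sysctl --system.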

5.5 Performance Tuning Slice

  • Use perf stat -e dTLB-load-misses,dTLB-loads ./prog. If the dTLB miss ratio (dTLB-load-misses / dTLB-loads) exceeds 1%, evaluate hugepages or madvise(MADV_HUGEPAGE).
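The ratio can be computed from perf's machine-readable output. This sketch assumes the run was made with perf stat -x, (CSV mode, no -I or -A), where each row starts with the counter value, the unit, and the event name; event naming varies across CPUs and perf versions, so treat the parsing as an assumption to verify against your own output. The function names are hypothetical.

```python
def parse_perf_csv(text: str) -> dict:
    """Map event name -> count from `perf stat -x,` output."""
    counts = {}
    for line in text.splitlines():
        cols = line.split(",")
        # Skip blanks, comments, and "<not supported>" / "<not counted>" rows.
        if len(cols) < 3 or not cols[0].strip().isdigit():
            continue
        event = cols[2].split(":")[0]   # drop modifiers such as ":u"
        counts[event] = int(cols[0])
    return counts


def dtlb_miss_ratio(counts: dict):
    """Return misses / loads, or None if no loads were counted."""
    loads = counts.get("dTLB-loads", 0)
    if loads == 0:
        return None
    return counts.get("dTLB-load-misses", 0) / loads


def needs_hugepage_review(counts: dict, threshold: float = 0.01) -> bool:
    """Apply the 1% rule of thumb from the text."""
    ratio = dtlb_miss_ratio(counts)
    return ratio is not None and ratio > threshold
```

perf stat writes its report to stderr by default, so a run such as perf stat -x, -e dTLB-load-misses,dTLB-loads ./prog 2> perf.csv would capture the rows that parse_perf_csv expects.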
