
Week 3 - The Virtual File System (VFS)

3.1 Conceptual Core

  • The VFS is the kernel's abstraction over filesystem implementations. Userspace sees one consistent API (open, read, stat, mmap); the kernel dispatches to ext4, btrfs, xfs, tmpfs, procfs, sysfs, fuse via per-FS operation tables.
  • Four core VFS objects:
      • inode: a file's metadata (owner, permissions, size, pointers to data blocks).
      • dentry: a directory entry; the cached mapping from a name to an inode.
      • file: an open file description (one per open() call); holds the offset, flags, and a reference count.
      • superblock: a mounted filesystem instance.
  • The dentry cache (dcache) and inode cache (icache) are why repeated stat()s are fast.
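The distinction between these objects can be observed from userspace. The following is a minimal sketch, assuming a filesystem that supports hard links in the temporary directory; the function name demo_vfs_objects and the file names are made up for illustration. It shows two names (dentries) resolving to one inode, and two descriptors sharing one open file description.

```python
import os
import tempfile


def demo_vfs_objects():
    """Return (same_inode, link_count, shared_offset, independent_offset)."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")  # hypothetical file names
        alias = os.path.join(tmp, "alias.txt")

        with open(original, "w") as f:
            f.write("hello vfs\n")

        # A hard link adds a second name (dentry) for the same inode.
        os.link(original, alias)
        st_a, st_b = os.stat(original), os.stat(alias)
        same_inode = (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino)
        link_count = st_a.st_nlink

        # dup() creates a new descriptor for the SAME open file description,
        # so the file offset is shared between fd1 and fd2.
        fd1 = os.open(original, os.O_RDONLY)
        fd2 = os.dup(fd1)
        os.read(fd1, 5)
        shared_offset = os.lseek(fd2, 0, os.SEEK_CUR)

        # A separate open() creates a new open file description
        # with its own offset, starting at 0.
        fd3 = os.open(original, os.O_RDONLY)
        independent_offset = os.lseek(fd3, 0, os.SEEK_CUR)

        for fd in (fd1, fd2, fd3):
            os.close(fd)

    return same_inode, link_count, shared_offset, independent_offset


if __name__ == "__main__":
    print(demo_vfs_objects())
```

Reading 5 bytes through fd1 moves the offset seen through fd2, while fd3 is unaffected because it belongs to a different open file description.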

3.2 Mechanical Detail

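  • Read fs/open.c::do_sys_openat2, the common implementation behind openat(2); open(2) and openat2(2) reach it as well.
  • path_openat (fs/namei.c) resolves the path component by component through the dcache; on a miss it allocates a new dentry and asks the filesystem's lookup operation to fill it in.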
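  • Each FS implements a struct file_operations and a struct inode_operations. For ext4, the tables for regular files are in fs/ext4/file.c and those for directories are in fs/ext4/namei.c; many of the underlying inode routines live in fs/ext4/inode.c.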
  • Mount namespaces (preview): each mount namespace has its own mount tree. Containers exploit this.
  • Pseudo-filesystems:
      • procfs (/proc): kernel introspection, e.g. /proc/<pid>/, /proc/cpuinfo, /proc/meminfo, /proc/sys/.
      • sysfs (/sys): device and driver introspection, with many kernel tunables under /sys/kernel/, /sys/class/, and /sys/block/; the cgroup filesystem is conventionally mounted beneath it at /sys/fs/cgroup/.
      • cgroupfs, devtmpfs, tmpfs, bpf, tracefs, debugfs, securityfs.
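To see which of these pseudo-filesystems are mounted on a given machine, the mount table can be read from /proc/self/mounts. The following is a sketch only: the helper names parse_mounts and pseudo_fs_counts are hypothetical, the PSEUDO_FS set is an assumption based on the names listed above, and the parser assumes the usual fstab-like layout of whitespace-separated fields.

```python
from collections import Counter

# Filesystem type names assumed for the pseudo-filesystems listed above.
PSEUDO_FS = {
    "proc", "sysfs", "cgroup", "cgroup2", "devtmpfs", "tmpfs",
    "bpf", "tracefs", "debugfs", "securityfs",
}


def parse_mounts(text):
    """Parse /proc/mounts-style text into (device, mountpoint, fstype, options)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        entries.append((device, mountpoint, fstype, options.split(",")))
    return entries


def pseudo_fs_counts(text):
    """Count mounted instances per pseudo-filesystem type."""
    return Counter(
        fstype for _, _, fstype, _ in parse_mounts(text) if fstype in PSEUDO_FS
    )


if __name__ == "__main__":
    try:
        with open("/proc/self/mounts") as f:
            for fstype, count in sorted(pseudo_fs_counts(f.read()).items()):
                print(f"{fstype:12s} {count}")
    except FileNotFoundError:
        print("/proc/self/mounts not available on this system")
```

Each line in the mount table corresponds to one mounted filesystem instance, so several tmpfs lines mean several independent tmpfs superblocks.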

3.3 Lab: "VFS Forensics"

  1. Catalogue every entry in /proc/<pid>/ for one of your processes. Document what each one exposes.
  2. Read /proc/<pid>/maps and explain every region (text, heap, stack, vdso, vvar, shared libs).
  3. Attach a kprobe to vfs_open with bpftrace (eBPF) to log every open system-wide for 5 seconds. Triage the noise.
  4. Mount tmpfs at a custom path, fill it, and observe the allocator behavior in /proc/meminfo (Shmem).
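For step 2, a small parser can make the regions easier to reason about. This is a rough sketch, not a complete classifier: the function names parse_maps_line and classify are hypothetical, the labels are informal, and the ".so" test for shared libraries is a heuristic assumption.

```python
def parse_maps_line(line):
    """Parse one /proc/<pid>/maps line: address perms offset dev inode [pathname]."""
    fields = line.split(maxsplit=5)
    addr, perms, offset, dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(part, 16) for part in addr.split("-"))
    return {
        "start": start,
        "end": end,
        "size": end - start,
        "perms": perms,
        "offset": int(offset, 16),
        "dev": dev,
        "inode": int(inode),
        "path": path,
    }


def classify(region):
    """Give an informal label to a parsed region (heuristic, for triage only)."""
    path = region["path"]
    if path.startswith("["):
        return path.strip("[]")          # e.g. heap, stack, vdso, vvar
    if not path:
        return "anonymous"
    if ".so" in path:
        return "shared library"          # heuristic: name contains ".so"
    if "x" in region["perms"]:
        return "text"                    # executable, file-backed mapping
    return "file-backed data"


if __name__ == "__main__":
    try:
        with open("/proc/self/maps") as f:
            for line in f:
                region = parse_maps_line(line)
                print(f'{region["size"]:>12d}  {region["perms"]}  '
                      f'{classify(region):18s} {region["path"]}')
    except FileNotFoundError:
        print("/proc/self/maps not available on this system")
```

Running it against /proc/self/maps describes the Python interpreter itself; substitute another PID's maps file to inspect a different process.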

3.4 Hardening Drill

  • Lock down /proc with hidepid=2 (mount option). Verify a non-root user can no longer see other users' processes.
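One way to carry out the verification step is to list the numeric entries in /proc and check who owns them. The sketch below assumes it is run as a non-root user and that the owner of /proc/<pid> reflects the process's user; visible_foreign_pids is a hypothetical helper name, and the proc_root parameter exists only so the function can be pointed at a test directory.

```python
import os


def visible_foreign_pids(proc_root="/proc", my_uid=None):
    """Return sorted PIDs under proc_root whose directory is owned by another UID."""
    if my_uid is None:
        my_uid = os.getuid()
    foreign = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as cpuinfo or self
        try:
            owner = os.stat(os.path.join(proc_root, name)).st_uid
        except OSError:
            continue  # process exited, or the entry is not accessible
        if owner != my_uid:
            foreign.append(int(name))
    return sorted(foreign)


if __name__ == "__main__":
    try:
        pids = visible_foreign_pids()
        print(f"{len(pids)} processes owned by other users are visible")
    except FileNotFoundError:
        print("/proc not available on this system")
```

Before the remount the count is normally well above zero; with hidepid=2 in effect, a non-root user should see it drop to zero.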

3.5 Performance Tuning Slice

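  • Use a sampling profiler such as perf record -a -g followed by perf report (or perf top) to find the hottest VFS function on your workload; perf trace -F traces page faults rather than ranking kernel functions. If the hottest function is __d_lookup, your dcache is probably being thrashed; if it's __find_get_block, your buffer cache probably is. Document the inference.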
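A profile pointing at __d_lookup can be cross-checked against the dcache counters the kernel exposes in /proc/sys/fs/dentry-state. The sketch below assumes the six-field layout described in the kernel's sysctl documentation (nr_dentry, nr_unused, age_limit, want_pages, nr_negative, dummy); older kernels may report the fifth field differently, and parse_dentry_state is a hypothetical helper name.

```python
# Field order assumed from the kernel's sysctl documentation for dentry-state.
DENTRY_STATE_FIELDS = (
    "nr_dentry", "nr_unused", "age_limit", "want_pages", "nr_negative", "dummy",
)


def parse_dentry_state(text):
    """Map the whitespace-separated integers in dentry-state to field names."""
    values = [int(v) for v in text.split()]
    return dict(zip(DENTRY_STATE_FIELDS, values))


def unused_fraction(state):
    """Fraction of allocated dentries that are currently unused (cached only)."""
    if state["nr_dentry"] == 0:
        return 0.0
    return state["nr_unused"] / state["nr_dentry"]


if __name__ == "__main__":
    try:
        with open("/proc/sys/fs/dentry-state") as f:
            state = parse_dentry_state(f.read())
        print(state)
        print(f"unused fraction: {unused_fraction(state):.2f}")
    except FileNotFoundError:
        print("/proc/sys/fs/dentry-state not available on this system")
```

Sampling these counters before and after the workload gives a second data point to record alongside the perf inference.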
