Week 2 - Syscalls and the Kernel/Userspace Boundary
2.1 Conceptual Core
- A system call is a transfer of control from userspace to the kernel via a defined ABI: userspace triggers an interrupt or executes a `syscall` instruction, the kernel reads the register-passed arguments, then dispatches via a table indexed by syscall number.
- On x86_64 Linux: arguments in `rdi`, `rsi`, `rdx`, `r10`, `r8`, `r9`; syscall number in `rax`; return value in `rax`. Errors are returned as a negative `rax` (`-errno`).
- libc wraps each syscall in a function (`open(2)` is a thin wrapper; some wrappers, like glibc's `fork()`, glue to `clone(2)`).
2.2 Mechanical Detail
- Read `arch/x86/entry/syscalls/syscall_64.tbl` - the syscall table.
- The path: userspace `syscall` instruction → `entry_SYSCALL_64` (`arch/x86/entry/entry_64.S`) → `do_syscall_64` → `sys_<name>` in C.
- `strace -f -e trace=%file ./prog` traces file-related syscalls only.
- `ltrace` for library-level tracing (less useful since most actions hit the kernel anyway).
- `perf trace` is the modern equivalent of `strace` with much lower overhead.
- `audit` (`auditd`) for production-grade syscall logging, gated by rules, written via netlink.
2.3 Lab - "Syscall Forensics"
- `strace -c ls /etc` - produce a count summary of syscalls. Predict the top 5; verify.
- Implement `cat` in pure C using only `open`, `read`, `write`, `close`. No libc helpers; invoke each one directly, e.g. `syscall(SYS_open, ...)`.
- Run under `strace -f` to verify zero unexpected calls.
- Build a minimal `seccomp` allowlist for your `cat`, allowing only the syscalls actually used. Verify it kills attempts to invoke other syscalls.
2.4 Hardening Drill
- Configure `auditctl -a always,exit -F arch=b64 -S execve -k exec` to log every `execve`. Read the resulting `aureport -x` output. Document the operational cost (log volume).
2.5 Performance Tuning Slice
- Run a workload under `perf stat -e 'syscalls:sys_enter_*'` (quoted so the shell does not expand the glob). Identify the highest-frequency syscall. Hypothesize a reduction (batching, larger buffers, `splice`).