
Week 12 - eBPF in Production: Observability Tools

12.1 Conceptual Core

  • eBPF makes "perfect tracing" possible: every important system event can be intercepted with low overhead, aggregated in-kernel, and shipped to userspace.
  • A common eBPF-based observability stack as of 2026: Cilium (networking), Pixie / Parca / Pyroscope (profiling), Tetragon (security observability), Falco (runtime security).

12.2 Mechanical Detail

  • Continuous profiling with Parca / Pyroscope: low-frequency stack sampling across all processes, attributing on-CPU time to each function and rendering the result as flame graphs in a UI.
  • bpftrace-style tools you'll write yourself:
      • tcpconnect - log new TCP connections with PID and process name.
      • execsnoop - log every execve system-wide.
      • opensnoop - log every file open.
      • biosnoop - log every block I/O completion with latency.
  • The ring buffer map (BPF_MAP_TYPE_RINGBUF) is the modern way to ship events to userspace; it replaces the older perf-buffer pattern with simpler, faster semantics.
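
The reserve/submit flow behind the ring buffer can be modeled in plain userspace C. In a BPF program the corresponding helpers are bpf_ringbuf_reserve() and bpf_ringbuf_submit(); everything below (rb_reserve, rb_submit, rb_consume, emit, the four-slot capacity) is a hypothetical single-threaded stand-in, not the kernel implementation, and it leaves out locking, memory barriers, and variable-sized records.

```c
#include <stdint.h>
#include <string.h>

#define RB_SLOTS 4  /* toy capacity; a real ring buffer map is sized in bytes */

struct event {
    uint32_t pid;
    char     comm[16];
};

struct ringbuf {
    struct event  slots[RB_SLOTS];
    unsigned long prod;  /* records submitted so far */
    unsigned long cons;  /* records consumed so far */
};

/* Reserve: hand out the next slot in place, or NULL when the buffer is full. */
static struct event *rb_reserve(struct ringbuf *rb)
{
    if (rb->prod - rb->cons == RB_SLOTS)
        return NULL;
    return &rb->slots[rb->prod % RB_SLOTS];
}

/* Submit: publish the most recently reserved slot to the consumer.
 * (The real helper takes the reserved pointer instead of the buffer.) */
static void rb_submit(struct ringbuf *rb)
{
    rb->prod++;
}

/* Consume: copy out the oldest published record; returns 1, or 0 if empty. */
static int rb_consume(struct ringbuf *rb, struct event *out)
{
    if (rb->cons == rb->prod)
        return 0;
    *out = rb->slots[rb->cons % RB_SLOTS];
    rb->cons++;
    return 1;
}

/* Producer side: fill the record directly in the buffer, then submit.
 * Returns -1 when the buffer is full and the event is dropped. */
static int emit(struct ringbuf *rb, uint32_t pid, const char *comm)
{
    struct event *e = rb_reserve(rb);
    if (!e)
        return -1;
    e->pid = pid;
    strncpy(e->comm, comm, sizeof(e->comm) - 1);
    e->comm[sizeof(e->comm) - 1] = '\0';
    rb_submit(rb);
    return 0;
}
```

The point of the model is that the producer writes the record in place and that a full buffer shows up as a failed reserve, so drops are visible at the call site. Because one buffer is shared by all CPUs, the consumer also sees events in a single ordered stream.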

12.3 Lab - "Build a Production-Grade eBPF Tool"

Write connsnoop:

  • Hooks tcp_v4_connect and tcp_v6_connect (kprobe), inet_csk_accept (kretprobe), tcp_close.
  • Records per-connection: 5-tuple, PID, process name, duration, bytes-tx/rx.
  • Aggregates in-kernel via per-CPU hash maps, ships completion events through a ring buffer.
  • Userspace consumer in C (with libbpf) or Go (with cilium/ebpf). Outputs JSON.
  • Verifier-clean, CO-RE-portable across kernels 5.10+.
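
One possible shape for the completion event and its JSON output on the userspace side is sketched below. The struct layout, field names, and conn_event_to_json are assumptions made for illustration; the lab does not prescribe them. The sketch also assumes the process name contains no characters that need JSON escaping, which a real tool would have to handle.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>

#define TASK_COMM_LEN 16

/* Hypothetical completion record shared between the BPF side and the consumer. */
struct conn_event {
    uint16_t family;       /* AF_INET or AF_INET6 */
    uint16_t sport;        /* host byte order */
    uint16_t dport;        /* host byte order */
    uint8_t  saddr[16];    /* network byte order; IPv4 uses the first 4 bytes */
    uint8_t  daddr[16];
    uint32_t pid;
    char     comm[TASK_COMM_LEN];
    uint64_t duration_ns;
    uint64_t bytes_tx;
    uint64_t bytes_rx;
};

/* Format one event as a single-line JSON object.
 * Returns the number of bytes written, or -1 on a bad address family
 * or when the output does not fit in buf. */
static int conn_event_to_json(const struct conn_event *e, char *buf, size_t len)
{
    char src[INET6_ADDRSTRLEN], dst[INET6_ADDRSTRLEN];

    if (!inet_ntop(e->family, e->saddr, src, sizeof(src)) ||
        !inet_ntop(e->family, e->daddr, dst, sizeof(dst)))
        return -1;

    /* comm is printed with a bounded precision because it may lack a NUL. */
    int n = snprintf(buf, len,
        "{\"proto\":\"tcp\",\"saddr\":\"%s\",\"sport\":%u,"
        "\"daddr\":\"%s\",\"dport\":%u,\"pid\":%u,\"comm\":\"%.*s\","
        "\"duration_ns\":%llu,\"bytes_tx\":%llu,\"bytes_rx\":%llu}",
        src, (unsigned)e->sport, dst, (unsigned)e->dport, (unsigned)e->pid,
        TASK_COMM_LEN, e->comm,
        (unsigned long long)e->duration_ns,
        (unsigned long long)e->bytes_tx,
        (unsigned long long)e->bytes_rx);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

In a libbpf consumer, a function like this would typically be called from the ring buffer callback with the received sample cast to the event struct, writing one JSON object per line.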

12.4 Hardening Drill

  • Run connsnoop as a systemd service with full hardening. The loader process needs CAP_BPF and CAP_PERFMON (both available since Linux 5.8); do not grant CAP_SYS_ADMIN, the legacy catch-all alternative.
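
One way to verify the capability set from inside the tool is a startup self-check against the CapEff line of /proc/self/status. The sketch below is an assumption about how such a check could look, not part of the lab specification; parse_cap_eff and caps_are_minimal are hypothetical names. The bit numbers are the values defined in linux/capability.h.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Capability bit numbers from linux/capability.h. */
#define CAP_SYS_ADMIN_BIT 21
#define CAP_PERFMON_BIT   38
#define CAP_BPF_BIT       39

/* Extract the effective capability mask from the text of /proc/<pid>/status.
 * Returns 0 on success, -1 if no parsable CapEff line is present. */
static int parse_cap_eff(const char *status_text, uint64_t *eff)
{
    const char *p = strstr(status_text, "CapEff:");
    unsigned long long v;

    if (!p || sscanf(p + strlen("CapEff:"), "%llx", &v) != 1)
        return -1;
    *eff = v;
    return 0;
}

/* Returns 1 when CAP_BPF and CAP_PERFMON are both present and
 * CAP_SYS_ADMIN is absent; 0 otherwise. */
static int caps_are_minimal(uint64_t eff)
{
    uint64_t need = (1ULL << CAP_BPF_BIT) | (1ULL << CAP_PERFMON_BIT);
    uint64_t deny = 1ULL << CAP_SYS_ADMIN_BIT;

    return (eff & need) == need && (eff & deny) == 0;
}
```

A service could read /proc/self/status at startup, run both functions, and refuse to start when the check fails, so that an over-privileged unit file is caught early rather than silently accepted.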

12.5 Performance Tuning Slice

  • Run connsnoop on a host doing real work; measure its CPU overhead with perf stat. Target <0.5% in steady state. If higher, narrow the hookpoints or aggregate more in-kernel.
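
The overhead target can be made concrete with a little arithmetic. perf stat on the connsnoop process covers only the userspace consumer; the BPF programs run in the context of whichever task hits the hook, so their cost has to be added separately, for example from the run_time_ns counters that bpftool prog show reports when the kernel.bpf_stats_enabled sysctl is on. The sketch below assumes the target means a share of total host CPU; overhead_pct and its parameters are hypothetical.

```c
/* Combined CPU overhead as a percentage of total host capacity.
 *   user_task_clock_ms: task-clock reported by perf stat for the consumer
 *   bpf_run_time_ns:    summed run_time_ns of the tool's BPF programs
 *   elapsed_s:          wall-clock length of the measurement window
 *   ncpus:              number of online CPUs
 * Returns -1.0 for a non-positive window or CPU count. */
static double overhead_pct(double user_task_clock_ms, double bpf_run_time_ns,
                           double elapsed_s, int ncpus)
{
    if (elapsed_s <= 0.0 || ncpus <= 0)
        return -1.0;

    double busy_s = user_task_clock_ms / 1e3 + bpf_run_time_ns / 1e9;
    return 100.0 * busy_s / (elapsed_s * ncpus);
}
```

As an illustration with made-up numbers: 600 ms of consumer task-clock plus 1.2 s of BPF run time over a 60 s window on an 8-CPU host is 1.8 s out of 480 CPU-seconds, or about 0.375%, which would meet the target.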

Month 3 Capstone Deliverable

A namespaces-cgroups-ebpf/ directory:

  1. mini-container/ (week 9): the C program that builds a container by hand.
  2. multi-tenant-cgroups/ (week 10): the cgroup-v2 policy + verification script.
  3. bpf-tour/ (week 11): five bpftrace recipes with annotated output.
  4. connsnoop/ (week 12): the libbpf + userspace consumer tool.

CI runs the recipes in a VM and validates their output schemas. Make one upstream contribution: a doc-fix PR to bpftrace, or a tested bpftrace recipe submitted as an example.
