Week 13 - The Network Stack: Sockets, NAPI, conntrack
13.1 Conceptual Core
- The Linux networking stack is layered: NIC driver → NAPI (interrupt + polling hybrid) → `netif_receive_skb` → protocol handlers (IP, ARP) → transport (TCP/UDP) → socket buffer → userspace via `recv()`.
- An `sk_buff` is the kernel's packet representation: a struct holding metadata plus pointers into the data buffer. It travels from driver to socket and carries roughly 250 bytes of metadata.
- Netfilter is the packet-mangling/filtering framework, with hooks at PRE_ROUTING, INPUT, FORWARD, OUTPUT, and POST_ROUTING. `iptables` and `nftables` are userspace tools that program these hooks.
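
Which of the five hooks a packet actually visits depends on where it comes from and where it is going. The sketch below encodes the three common paths as a lookup table. The names `PacketKind` and `hooks_traversed` are illustrative only and do not correspond to any kernel or library API.

```python
from enum import Enum


class PacketKind(Enum):
    """Illustrative classification of a packet relative to this host."""
    LOCAL_IN = "destined for a local socket"
    FORWARDED = "routed through this host"
    LOCAL_OUT = "generated by a local process"


# Hook names as written in the notes above. The routing decision after
# PRE_ROUTING chooses between INPUT (local delivery) and FORWARD.
HOOK_PATHS = {
    PacketKind.LOCAL_IN: ["PRE_ROUTING", "INPUT"],
    PacketKind.FORWARDED: ["PRE_ROUTING", "FORWARD", "POST_ROUTING"],
    PacketKind.LOCAL_OUT: ["OUTPUT", "POST_ROUTING"],
}


def hooks_traversed(kind):
    """Return the ordered list of netfilter hooks for one packet kind."""
    return list(HOOK_PATHS[kind])


if __name__ == "__main__":
    for kind in PacketKind:
        print(f"{kind.name:<10} {' -> '.join(hooks_traversed(kind))}")
```

A rule placed in INPUT therefore never sees forwarded traffic, which is worth keeping in mind when a filter appears to have no effect on a router or container host.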
13.2 Mechanical Detail
- `ss -tnp` - show TCP sockets with PIDs. Replaces the deprecated `netstat`.
- `ip` suite - `ip addr`, `ip route`, `ip rule`, `ip neigh`, `ip link`, `ip tuntap`. The single tool you need; `ifconfig` and `route` are deprecated.
- conntrack: stateful tracking of connections in netfilter. `conntrack -L` shows the table; `nf_conntrack_max` tunes capacity. Each entry is ~300 bytes; a busy host may track millions.
- `tc` - traffic control: queue disciplines (qdiscs), classes, filters. The shaping and policing tool. Modern qdiscs: `fq_codel` (the default on most systemd-based distributions), `fq` (for high-bandwidth), `cake`.
- TSO / GSO / GRO / LRO - segmentation and receive-coalescing offloads. Disable with `ethtool -K` for debugging; they distort the packet sizes and timing that capture tools see.
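
To make the conntrack sizing concrete, here is a rough estimate of table memory at a given entry count. It assumes the ~300 bytes per entry quoted above, which varies by kernel version and configuration, and ignores hash-table overhead. The function names are illustrative; the `/proc/sys/net/netfilter/nf_conntrack_max` file is only present when the conntrack module is loaded.

```python
from pathlib import Path

# Rough per-entry figure from the notes; treat it as an assumption.
BYTES_PER_ENTRY = 300
CONNTRACK_MAX = Path("/proc/sys/net/netfilter/nf_conntrack_max")


def conntrack_table_bytes(entries, bytes_per_entry=BYTES_PER_ENTRY):
    """Estimate memory used by `entries` tracked connections."""
    if entries < 0:
        raise ValueError("entries must be non-negative")
    return entries * bytes_per_entry


def read_conntrack_max(path=CONNTRACK_MAX):
    """Return the configured limit, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    limit = read_conntrack_max()
    if limit is None:
        print("nf_conntrack_max not readable (module not loaded?)")
    else:
        mib = conntrack_table_bytes(limit) / (1024 * 1024)
        print(f"nf_conntrack_max={limit}, about {mib:.1f} MiB if full")
```

Under this assumption, one million tracked connections cost on the order of 300 MB, so raising `nf_conntrack_max` is a memory decision as much as a capacity one.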
13.3 Lab - "Packet Forensics"
- `tcpdump -i any -nn -X 'tcp port 443' -c 10` - capture and dissect TLS handshake bytes.
- Trace a TCP connection's lifecycle with `bpftrace`'s `tcplife.bt`.
- Set up a gratuitous DROP rule with `iptables -I INPUT -p icmp -j DROP` and verify with `ping`. Remove it. Repeat with `nft`.
- Inspect conntrack: `cat /proc/net/nf_conntrack` while a long-lived connection is open.
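
When reading `/proc/net/nf_conntrack`, it helps to know that each entry lists the address tuple twice: first for the original direction, then for the expected reply. The parser below is a minimal sketch of that layout. `SAMPLE_LINE` is a made-up entry using documentation address ranges, and the exact set of trailing fields differs between kernels, so unknown `key=value` pairs are simply ignored.

```python
# Hypothetical entry in the usual /proc/net/nf_conntrack layout.
SAMPLE_LINE = (
    "ipv4     2 tcp      6 431999 ESTABLISHED "
    "src=192.0.2.10 dst=198.51.100.7 sport=51234 dport=443 "
    "src=198.51.100.7 dst=192.0.2.10 sport=443 dport=51234 "
    "[ASSURED] mark=0 zone=0 use=2"
)

TUPLE_KEYS = ("src", "dst", "sport", "dport")


def parse_conntrack_line(line):
    """Split one conntrack entry into protocol, state, and both tuples."""
    tokens = line.split()
    entry = {
        "l3proto": tokens[0],
        "l4proto": tokens[2],
        "timeout": int(tokens[4]),  # seconds until the entry expires
        "state": None,              # only present for stateful protocols
        "original": {},
        "reply": {},
        "flags": [],
    }
    for tok in tokens[5:]:
        if tok.startswith("[") and tok.endswith("]"):
            entry["flags"].append(tok[1:-1])
        elif "=" in tok:
            key, value = tok.split("=", 1)
            if key in TUPLE_KEYS:
                # First occurrence is the original direction, second the reply.
                side = "reply" if key in entry["original"] else "original"
                entry[side][key] = value
        elif entry["state"] is None and tok.isupper():
            entry["state"] = tok
    return entry


if __name__ == "__main__":
    parsed = parse_conntrack_line(SAMPLE_LINE)
    print(parsed["l4proto"], parsed["state"], parsed["original"], parsed["reply"])
```

For a connection that passes through NAT, the reply tuple is where the translated address shows up, so comparing the two tuples is a quick way to see what the translation did.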
13.4 Hardening Drill
- `sysctl` settings: `net.ipv4.tcp_syncookies=1`, `net.ipv4.conf.all.rp_filter=1`, `net.ipv4.conf.all.accept_source_route=0`, `net.ipv6.conf.all.accept_redirects=0`. Document each.
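
One way to check the drill afterwards is to read the values back from `/proc/sys` and compare them with the targets listed above. The script below is a sketch of that idea. The helper names (`sysctl_path`, `read_sysctl`, `audit`) are illustrative, and the simple dot-to-slash mapping assumes keys without dots inside a component, which holds for these four.

```python
from pathlib import Path

# Target values taken from the drill above.
EXPECTED = {
    "net.ipv4.tcp_syncookies": "1",
    "net.ipv4.conf.all.rp_filter": "1",
    "net.ipv4.conf.all.accept_source_route": "0",
    "net.ipv6.conf.all.accept_redirects": "0",
}


def sysctl_path(key, root="/proc/sys"):
    """Map a dotted sysctl key to its file under /proc/sys."""
    return Path(root, *key.split("."))


def read_sysctl(key):
    """Return the current value as a string, or None if unreadable."""
    try:
        return sysctl_path(key).read_text().strip()
    except OSError:
        return None


def audit(expected=EXPECTED, reader=read_sysctl):
    """Return {key: (wanted, actual)} for every setting that differs."""
    mismatches = {}
    for key, want in expected.items():
        got = reader(key)
        if got != want:
            mismatches[key] = (want, got)
    return mismatches


if __name__ == "__main__":
    problems = audit()
    if not problems:
        print("all settings match")
    for key, (want, got) in problems.items():
        print(f"{key}: want {want}, got {got}")
```

Reading the values needs no privileges; changing them does, and values set at runtime are lost on reboot unless they are also written to a sysctl configuration file.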
13.5 Performance Tuning Slice
- `ethtool -S <iface>` - driver stats (drops, errors, csum issues).
- `ip -s link show` - interface stats.
- Identify any non-zero error counter.
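
Driver statistics can run to hundreds of lines, so filtering helps. The sketch below parses `name: value` lines in the style of `ethtool -S` output and keeps counters that are non-zero and look error-related. Counter names are driver-specific, so the substring list in `ERROR_HINTS` is a heuristic assumption, and `SAMPLE` is invented output for illustration.

```python
# Substrings that commonly mark error-type counters; a heuristic only,
# since every driver names its statistics differently.
ERROR_HINTS = ("err", "drop", "csum", "miss", "fifo")

# Invented sample in the general shape of `ethtool -S <iface>` output.
SAMPLE = """\
NIC statistics:
     rx_packets: 104233
     tx_packets: 98211
     rx_errors: 0
     rx_dropped: 12
     rx_csum_errors: 3
     tx_errors: 0
"""


def parse_counters(text):
    """Parse 'name: value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def nonzero_error_counters(counters, hints=ERROR_HINTS):
    """Keep counters that are non-zero and whose name matches a hint."""
    return {
        name: value
        for name, value in counters.items()
        if value != 0 and any(h in name.lower() for h in hints)
    }


if __name__ == "__main__":
    for name, value in nonzero_error_counters(parse_counters(SAMPLE)).items():
        print(f"{name}: {value}")
```

A single non-zero reading says little on its own; sampling twice a few seconds apart and comparing shows whether a counter is still climbing.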