Week 14 - Cilium and eBPF Networking¶
14.1 Conceptual Core¶
Cilium is the dominant eBPF-based CNI. It replaces iptables-based packet processing with eBPF programs attached at three layers:
- Socket layer (`bpf_sock_ops`) - connection-level decisions before packets exist.
- Cgroup egress - per-pod outbound policy enforcement.
- NIC-level XDP - ingress filtering at line rate, before the kernel network stack.
The shift from iptables matters at scale: an iptables-based kube-proxy walks a linear chain of rules per packet - O(services). eBPF programs do hash-table lookups: O(1) per packet, regardless of service count.
Beyond replacing the CNI, Cilium provides:
- Kube-proxy replacement (eBPF-based service load balancing - no iptables churn on every endpoint change).
- L7 NetworkPolicy (HTTP, gRPC, Kafka filtering in the dataplane, not in a sidecar).
- ClusterMesh (multi-cluster service discovery and cross-cluster policy).
- Hubble (eBPF-based flow observability - every pod-to-pod connection visible without sampling).
- Service Mesh (sidecar-less mTLS via eBPF + SPIFFE).
This is the bridge to Linux Month 3 - eBPF in production. See also: eBPF in the observability cross-topic page.
14.2 Mechanical Detail¶
- Dataplane as eBPF graph: Cilium's eBPF programs live under `bpf/` in cilium/cilium. The agent compiles them at startup with the cluster's specific configuration baked in (BTF-driven CO-RE for portability across kernels).
- Identity-based policy: pods are assigned a numeric identity derived from their labels (`app=foo,env=prod` → identity 1234). eBPF programs match on these identities, not on IPs. This is what allows policy to scale to thousands of pods without per-pod iptables rules - identities are stable across pod restarts and IP changes.
- Service load balancing: instead of iptables DNAT chains, Cilium uses an eBPF map indexed by `(service IP, port)` that returns a backend. Connection state lives in a separate eBPF map; updates are atomic - no kernel reload, no race during endpoint churn.
- Encryption: WireGuard (recommended; in-kernel since 5.6) or IPsec tunnels between nodes. Per-NetworkPolicy opt-in or cluster-wide.
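Identity-based matching is easiest to see in a policy manifest. A minimal sketch (the `app=backend`/`app=frontend` labels and the `demo` namespace are hypothetical): Cilium resolves the label selectors below to numeric identities when the policy is compiled, so the eBPF dataplane never has to match on pod IPs.

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: demo
spec:
  endpointSelector:
    matchLabels:
      app: backend        # resolved to a numeric identity, not an IP list
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend # likewise matched by identity in the eBPF program
```

Because the policy references identities rather than IPs, pod restarts and IP reassignments require no dataplane updates at all.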
- Hubble captures every packet's metadata via eBPF - source/dest identity, verdict (allowed/denied), L7 protocol info - and exposes it via gRPC + a CLI + a UI. Per-packet overhead is single-digit-percent CPU.
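With Hubble Relay enabled, the same flow metadata is queryable from the CLI; a sketch (the `app=backend` label is a hypothetical example):

```shell
# Show recent dropped flows cluster-wide, with the verdict that caused them
hubble observe --verdict DROPPED --last 20

# Follow L7 HTTP flows destined for a specific pod label
hubble observe --protocol http --to-label app=backend --follow
```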
The trap
Switching `kubeProxyReplacement` from `false` → `true` on a live cluster without draining nodes. The iptables rules from the old kube-proxy are not cleaned up automatically, and they interact badly with Cilium's eBPF NAT. Always: drain node → reconfigure → uncordon. The Cilium installer's `kubeProxyReplacement: strict` mode aborts if it finds residual rules.
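The safe sequence might look like the following per-node sketch (the node name and Helm release details are placeholders; an ops outline, not a blind copy-paste):

```shell
# 1. Drain the node so workloads move off before the dataplane changes
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data

# 2. Flip the setting (a cluster-wide Helm upgrade; Cilium agents restart)
helm upgrade cilium cilium/cilium -n kube-system \
  --reuse-values --set kubeProxyReplacement=true

# 3. Verify the agent is healthy, then readmit workloads
kubectl -n kube-system exec ds/cilium -- cilium status --brief
kubectl uncordon node-1
```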
14.3 Lab - "Install and Drive Cilium"¶
- Install via Helm with: `kubeProxyReplacement=true`, `hubble.enabled=true`, `hubble.relay.enabled=true`, `hubble.ui.enabled=true`, `encryption.enabled=true`, `encryption.type=wireguard`.
- Use the Hubble UI (`cilium hubble ui`) to visualize pod-to-pod traffic in real time.
- Author an L4 `NetworkPolicy` (standard k8s API); test enforcement with a denied + an allowed flow.
- Author an L7 `CiliumNetworkPolicy` (e.g., allow only `HTTP GET /api/*` from frontend → backend); test enforcement.
- Enable Cilium Service Mesh; observe sidecar-free mTLS between two test services.
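The Helm flags above can equivalently live in a values file; a sketch (chart defaults vary by Cilium version):

```yaml
# values.yaml for: helm install cilium cilium/cilium -n kube-system -f values.yaml
kubeProxyReplacement: true
hubble:
  enabled: true
  relay:
    enabled: true
  ui:
    enabled: true
encryption:
  enabled: true
  type: wireguard
```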
14.4 Hardening Drill¶
Enable transparent encryption (WireGuard) between nodes. Combined with default-deny NetworkPolicy (start: deny everything, allow explicitly), this gives defense-in-depth: even if a node is compromised, the attacker sees only encrypted traffic for flows they haven't been explicitly authorized to observe.
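The default-deny starting point is plain Kubernetes API; a sketch (the `prod` namespace is a placeholder):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: prod
spec:
  podSelector: {}          # selects every pod in the namespace
  policyTypes:
    - Ingress
    - Egress               # with no allow rules listed, all traffic is denied
```

From here, each permitted flow gets its own explicit allow policy.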
14.5 Operations Slice¶
Monitor `cilium_*` Prometheus metrics. Alert on:
- policy-drop rate spikes - legitimate workloads being denied (usually a NetworkPolicy author mistake, or a new service that didn't get its allow rule).
- identity-table pressure - Cilium has a max identity count per cluster; approaching it means too many distinct label combinations, often from a bad operator emitting unique labels per request.
- endpoint regeneration time - if it climbs past 5-10s, your label churn is overwhelming the agent.
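A starting alert on policy-drop spikes, as a Prometheus rule sketch (the metric name comes from Cilium's exporter but label values may vary by version; the threshold is an assumption to tune for your cluster):

```yaml
groups:
  - name: cilium
    rules:
      - alert: CiliumPolicyDropSpike
        # cilium_drop_count_total counts dropped packets by reason;
        # "Policy denied" drops indicate NetworkPolicy rejections
        expr: sum(rate(cilium_drop_count_total{reason="Policy denied"}[5m])) > 10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Sustained policy-drop rate - check recent NetworkPolicy changes"
```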