
Worked example - Week 14: a NetworkPolicy → what eBPF actually does

Companion to Kubernetes → Month 04 → Week 14: Cilium and eBPF. The week explains the Cilium model: CNI plugin, identities, the L3/L4/L7 policy layers, and the eBPF datapath. This page takes one Kubernetes NetworkPolicy and traces it through Cilium all the way to the eBPF program enforcing it on a packet.

You need a kind/k3s/minikube cluster with Cilium installed (cilium install from the Cilium CLI; or Helm with --set kubeProxyReplacement=true).

The policy

# api-deny-from-frontend.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-deny-from-frontend
  namespace: shop
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes: [Ingress]
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: orders
    ports:
    - port: 8080
      protocol: TCP

What this says, in English: "Pods in namespace shop with label app=api will accept TCP/8080 traffic only from pods labeled app=orders in the same namespace. Everything else gets dropped."

Without a NetworkPolicy controller, Kubernetes ignores this object entirely. With Cilium installed, the policy becomes a real packet-level rule. Let's walk through how.

Step 1 - Pod IPs and identities

Apply the policy and deploy three sample pods:

$ kubectl create namespace shop
$ kubectl apply -f api-deny-from-frontend.yaml
$ kubectl run -n shop api      --image=nginxinc/nginx-unprivileged --labels=app=api --port=8080
$ kubectl run -n shop orders   --image=alpine --labels=app=orders -- sh -c "while true; do sleep 60; done"
$ kubectl run -n shop frontend --image=alpine --labels=app=frontend -- sh -c "while true; do sleep 60; done"

(nginxinc/nginx-unprivileged serves on port 8080 by default, matching the policy; stock nginx listens on 80.)

Now look at what Cilium did:

$ kubectl exec -n kube-system ds/cilium -- cilium endpoint list -o json | jq -c '.[] | {id, identity: .status.identity.id, labels: .status.identity.labels, ip: .status.networking.addressing[0].ipv4}'
{"id":412,"identity":10044,"labels":["k8s:app=api","k8s:io.kubernetes.pod.namespace=shop"],"ip":"10.244.0.42"}
{"id":413,"identity":1207,"labels":["k8s:app=orders","k8s:io.kubernetes.pod.namespace=shop"],"ip":"10.244.0.43"}
{"id":414,"identity":1208,"labels":["k8s:app=frontend","k8s:io.kubernetes.pod.namespace=shop"],"ip":"10.244.0.44"}

Cilium assigned each pod an endpoint ID (id, local to this node) and a security identity (identity) derived from the pod's labels. The identity is a number, not the label set itself. All pods with the same label set share an identity, and the identity is the unit Cilium reasons about.

The key trick: traditional iptables-based CNIs do rule matching by IP, so in the worst case rules scale O(pods²) and every packet walks a linear chain. Cilium matches by identity, which scales O(unique_label_sets²) - vastly smaller in practice. The sketch below puts numbers on it.
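
A minimal C illustration of that scaling claim, with hypothetical numbers (this is back-of-envelope arithmetic, not a benchmark):

/* scale.c - hypothetical numbers illustrating the scaling claim.
 * Compile and run: cc -o scale scale.c && ./scale */
#include <stdio.h>

int main(void)
{
    int pods = 10000;      /* pods in the cluster (hypothetical)      */
    int label_sets = 40;   /* distinct label combinations             */

    /* IP-keyed matching: in the worst case, every allowed
     * (src pod, dst pod) pair needs its own rule.                    */
    printf("ip-keyed rules, worst case:         %d\n", pods * pods);

    /* Identity-keyed matching: one entry per (src identity,
     * dst identity) pair the policy names, regardless of replicas.   */
    printf("identity-keyed entries, worst case: %d\n",
           label_sets * label_sets);
    return 0;
}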

Step 2 - The policy in Cilium's view

$ kubectl exec -n kube-system ds/cilium -- cilium policy get
[
  {
    "endpointSelector": {"matchLabels": {"k8s:app": "api", "k8s:io.kubernetes.pod.namespace": "shop"}},
    "ingress": [
      {
        "fromEndpoints": [
          {"matchLabels": {"k8s:app": "orders", "k8s:io.kubernetes.pod.namespace": "shop"}}
        ],
        "toPorts": [{"ports": [{"port": "8080", "protocol": "TCP"}]}]
      }
    ]
  }
]

Same content, in Cilium's internal representation. The selectors will resolve to specific identity numbers when the policy is materialized into eBPF maps - roughly as in the sketch below.
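
A hedged sketch of what a materialized entry could look like. The layouts are simplified for illustration - Cilium's real BPF map schema differs - but the shape of the key is the point: who (identity), where (endpoint and port), what protocol.

/* policy_entry.c - simplified sketch of a materialized policy entry.
 * Illustrative layouts, not Cilium's actual BPF map schema. */
#include <linux/types.h>   /* __u32, __u16, __u8 */
#include <netinet/in.h>    /* IPPROTO_TCP */

struct policy_key {
    __u32 dst_endpoint;    /* 412  == the api pod's endpoint ID       */
    __u32 src_identity;    /* 1207 == app=orders in namespace shop    */
    __u16 dport;           /* 8080 (network byte order on the wire)   */
    __u8  protocol;        /* IPPROTO_TCP                             */
    __u8  pad;             /* keep the key layout explicit            */
};

struct policy_value {
    __u8 allow;            /* 1 = allow; a missing entry means deny   */
};

/* For endpoint 412 (api), the whole NetworkPolicy boils down to one
 * entry: {412, 1207, 8080, IPPROTO_TCP} -> {1}.  Traffic from
 * identity 1208 (frontend) finds no entry: default deny. */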

Step 3 - Test the policy works

$ kubectl exec -n shop orders -- wget -qO- --timeout=2 http://10.244.0.42:8080
<!DOCTYPE html>
<html>
<head><title>Welcome to nginx!</title>
...
$ kubectl exec -n shop frontend -- wget -qO- --timeout=2 http://10.244.0.42:8080
wget: download timed out

orders succeeds (allowed). frontend times out (silently dropped). Good.

But where is the drop happening?

Step 4 - Find the eBPF program

Cilium attaches eBPF programs at several kernel hook points: tc (traffic control) ingress/egress on every pod's veth, and on the host's external interface. List them:

$ kubectl exec -n kube-system ds/cilium -- bpftool prog show | grep cil_
1342: sched_cls  name cil_from_container  tag 4f...
1343: sched_cls  name cil_to_container    tag 8a...
1344: sched_cls  name cil_from_host       tag c2...
1345: sched_cls  name cil_to_host         tag d7...
1346: sched_cls  name cil_from_netdev     tag e3...

These are the BPF programs implementing the datapath. cil_from_container runs on every packet leaving a pod's veth; cil_to_container runs on every packet entering one. For our ingress policy, enforcement happens in cil_to_container.
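
To make "sched_cls program" concrete, here is a minimal, self-contained skeleton in the same shape - not Cilium's code (the real logic lives in bpf_lxc.c), just the scaffolding every tc-attached BPF program shares:

/* skeleton.c - minimal tc BPF program skeleton, NOT Cilium's code.
 * Build: clang -O2 -g -target bpf -c skeleton.c -o skeleton.o */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("tc")
int to_container(struct __sk_buff *skb)
{
    void *data     = (void *)(long)skb->data;
    void *data_end = (void *)(long)skb->data_end;

    /* Bounds-check and parse the Ethernet header. */
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return TC_ACT_SHOT;                 /* malformed: drop */
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return TC_ACT_OK;                   /* not IPv4: let it pass */

    /* Bounds-check and parse the IPv4 header. */
    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return TC_ACT_SHOT;

    /* Cilium would resolve identities and consult the policy map
     * here; this skeleton just allows everything. */
    return TC_ACT_OK;
}

char LICENSE[] SEC("license") = "GPL";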

Step 5 - The maps Cilium uses

eBPF programs keep no state between invocations; persistent state lives in kernel-managed maps (key-value stores shared between BPF programs and userspace). Cilium maintains several:

$ kubectl exec -n kube-system ds/cilium -- bpftool map show | grep -E "cilium_"
221: hash  name cilium_policy   key 16B  value 48B  max_entries 16384
222: lru_hash name cilium_ct4   key 40B  value 64B  max_entries 524288
223: hash  name cilium_lxc      key 4B   value 64B  max_entries 65536
224: hash  name cilium_metrics  key 8B   value 16B  max_entries 65536
...
  • cilium_lxc - local endpoint lookup: pod IP → endpoint info (endpoint ID, MAC, security identity).
  • cilium_ipcache - IP/prefix → security identity, cluster-wide. This is how the datapath resolves a packet's source identity.
  • cilium_policy - (endpoint_id, src_identity, port, protocol) → allow/deny. This is the lookup table the BPF program consults to decide whether a packet is allowed (see the map sketch below).
  • cilium_ct4 - IPv4 connection tracking. Stores active flows so established connections skip the policy lookup.
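
In libbpf C, a map of that shape could be declared as follows, reusing the policy_key/policy_value sketch from Step 2 (again illustrative, not Cilium's real definition):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
/* struct policy_key / struct policy_value as in the Step 2 sketch */

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 16384);        /* cf. the bpftool output above */
    __type(key, struct policy_key);
    __type(value, struct policy_value);
} policy_map SEC(".maps");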

Step 6 - The actual lookup

When a packet from frontend (identity 1208) reaches the host with destination api (10.244.0.42:8080, endpoint 412):

  1. The cil_to_container BPF program runs at the tc hook on the api pod's veth.
  2. It reads the packet headers - src IP 10.244.0.44, dst IP 10.244.0.42, dst port 8080.
  3. It looks up the destination endpoint via cilium_lxc[10.244.0.42] → endpoint 412.
  4. It looks up the source identity via cilium_ipcache[10.244.0.44] → identity 1208.
  5. It builds the policy key (endpoint=412, identity=1208, port=8080, proto=TCP) and queries cilium_policy.
  6. No matching entry → the verdict is DROP.
  7. It updates cilium_metrics (drop counter ++).
  8. The program returns TC_ACT_SHOT and the tc layer drops the packet.

When orders (identity 1207) sends the same kind of packet, step 5 builds key (412, 1207, 8080, TCP), the policy map has this entry (from the NetworkPolicy → identity match), and the program returns PASS. The packet proceeds; the connection is tracked in cilium_ct4 so return packets are allowed via the fast path.
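
Steps 5 and 6 condense to a few lines of BPF-side C. A sketch against the simplified policy_map declared above (real Cilium spreads this logic across per-endpoint maps and tail calls in bpf_lxc.c):

/* Sketch of the steps 5-6 verdict, using the simplified policy_map. */
static __always_inline int policy_verdict(__u32 dst_endpoint,
                                          __u32 src_identity,
                                          __u16 dport, __u8 proto)
{
    /* Designated initializers zero the padding too, which matters
     * for hash map keys. */
    struct policy_key key = {
        .dst_endpoint = dst_endpoint,  /* 412 (api)                    */
        .src_identity = src_identity,  /* 1208 frontend / 1207 orders  */
        .dport        = dport,         /* 8080                         */
        .protocol     = proto,         /* IPPROTO_TCP                  */
    };

    struct policy_value *val = bpf_map_lookup_elem(&policy_map, &key);
    if (!val)
        return TC_ACT_SHOT;   /* step 6: no entry, default deny */
    return TC_ACT_OK;         /* entry found: allow             */
}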

Step 7 - See the drop in real time

$ kubectl exec -n kube-system ds/cilium -- cilium monitor -t drop
xx drop (Policy denied) flow 0xab12 to endpoint 412, identity 1208->10044, file bpf_lxc.c line 1142, 86 bytes

This is the BPF program emitting a perf event when it drops a packet. The format includes the bpf_lxc.c source line that made the decision, the source and destination identities, and the byte count. Cilium's Hubble (a separate component) consumes these events to provide a real-time UI.
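
The mechanism underneath is the BPF perf event channel. A generic sketch of how a program can report drops to userspace - the event layout here is invented for illustration, not Cilium's actual format:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Per-CPU perf ring the userspace agent reads drop events from. */
struct {
    __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
    __uint(key_size, sizeof(__u32));
    __uint(value_size, sizeof(__u32));
} events SEC(".maps");

struct drop_event {
    __u32 src_identity;   /* e.g. 1208 (frontend) */
    __u32 dst_identity;   /* e.g. 10044 (api)     */
    __u32 bytes;          /* packet length        */
};

static __always_inline void notify_drop(struct __sk_buff *skb,
                                        __u32 src_id, __u32 dst_id)
{
    struct drop_event ev = {
        .src_identity = src_id,
        .dst_identity = dst_id,
        .bytes        = skb->len,
    };
    /* Emit on the current CPU's ring; the agent consumes it via a
     * perf buffer, the way cilium monitor surfaces drop events.   */
    bpf_perf_event_output(skb, &events, BPF_F_CURRENT_CPU,
                          &ev, sizeof(ev));
}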

Why this matters

The traditional iptables path for this same policy (implemented by an iptables-based CNI; kube-proxy itself only handles Services, not NetworkPolicy) would:

  • Maintain up to O(pods²) iptables rules per port.
  • Linearly walk those rules on every packet.
  • Rewrite rules on every pod create/delete, which under churn can take seconds and lose packets.

Cilium's eBPF path:

  • Maintains a hash map keyed by (endpoint, identity, port, proto).
  • Does one O(1) lookup per packet.
  • Is identity-based: adding a new orders pod doesn't change the policy map at all (same identity).

In a cluster with 10,000 pods, the difference is stable ~50µs latency versus an unbounded tail as rule chains grow and churn. That's the whole pitch for Cilium.

The trap

A NetworkPolicy without a controller that supports it does nothing. Many K8s users apply policies on clusters where the CNI doesn't enforce them, and the cluster silently allows everything. Verify with kubectl exec between pods that shouldn't be able to reach each other, or use cilium connectivity test if you're on Cilium.

The other trap: Cilium identity granularity. Two pods with identical label sets share an identity. If your pods carry no distinguishing labels, every pod in the namespace resolves to the same identity for policy purposes. Add labels (role, tier, app-version) to get finer-grained control.

Exercise

  1. Run the demo above. Confirm the drop is visible via cilium monitor.
  2. Add a third allowed source: pods labeled app=admin. Reapply the policy. Watch cilium policy get change and confirm admin pods now succeed.
  3. (Advanced) Use bpftool prog dump xlated id <prog-id> on one of the cil_* programs. Read the BPF assembly. Find the map lookup instructions.
  4. (Advanced) Read Documentation/bpf/ in the kernel tree for the BPF instruction set reference. Find BPF_LDX, BPF_JEQ. You'll see them in the disassembly.
