
Week 15 - Service Meshes: Istio, Linkerd, Cilium Service Mesh

15.1 Conceptual Core

  • A service mesh adds: mTLS between Services, retries/timeouts/circuit-breaking, traffic shifting (canary, blue/green), observability (RED metrics + traces), policy enforcement.
  • Two architectural patterns:
      • Sidecar (Istio classic, Linkerd): Envoy or linkerd-proxy runs in every Pod. Roughly 50 MB of memory per Pod, ~1 ms of added latency.
      • Sidecar-less (Istio ambient, Cilium Service Mesh): eBPF plus a per-node proxy. Much lower per-Pod overhead.
  • Decision matrix:
      • Mature, full-featured, complex → Istio.
      • Minimalist, Rust-based, fast to install → Linkerd.
      • Already running Cilium and want sidecar-less → Cilium Service Mesh.
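The two patterns differ mainly in how workloads are enrolled. As a sketch (the namespace name `demo` is a placeholder), the common opt-in mechanisms look like this:

```shell
# Sidecar pattern: opt a namespace in to automatic proxy injection.
# Istio classic injects an Envoy sidecar into new Pods in the namespace:
kubectl label namespace demo istio-injection=enabled
# Linkerd injects its Rust-based linkerd-proxy via an annotation instead:
kubectl annotate namespace demo linkerd.io/inject=enabled

# Sidecar-less: Istio ambient enrolls the namespace into the per-node
# ztunnel dataplane; no sidecars and no Pod restarts required:
kubectl label namespace demo istio.io/dataplane-mode=ambient
```

Note that with sidecars, enrollment takes effect only when Pods are (re)created; ambient enrollment applies to running Pods immediately.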

15.2 Mechanical Detail

  • Envoy (the dataplane proxy under Istio and several other meshes) receives its configuration from the control plane over the xDS APIs: LDS (listeners), RDS (routes), CDS (clusters), EDS (endpoints).
  • mTLS rotation: the mesh control plane issues short-lived certificates (typically 24 h) signed by an internal CA, or by a SPIFFE-compatible one.
  • Traffic management: Istio uses VirtualService plus DestinationRule for routing rules. The Kubernetes Gateway API is the standards-track replacement and is supported by all major meshes.
  • Observability: every mesh emits RED metrics (Rate, Errors, Duration) per service. With OpenTelemetry, traces propagate through the mesh.
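As a concrete sketch of the traffic-management objects: a 90/10 canary in Istio combines a DestinationRule (defining subsets by Pod label) with a VirtualService (weighting routes between them). The names `reviews`, `v1`, and `v2` are illustrative:

```yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews              # the Kubernetes Service
  subsets:
  - name: v1
    labels: {version: v1}    # selects Pods labeled version=v1
  - name: v2
    labels: {version: v2}
---
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts: [reviews]
  http:
  - route:
    - destination: {host: reviews, subset: v1}
      weight: 90             # 90% of traffic stays on v1
    - destination: {host: reviews, subset: v2}
      weight: 10             # 10% canary to v2
```

The Gateway API equivalent expresses the same split as weighted `backendRefs` on an HTTPRoute, which is portable across meshes.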

15.3 Lab: "Three Meshes"

  1. Install Istio in ambient mode on a test cluster. Apply a VirtualService that does 90/10 canary routing (in ambient mode, L7 routing requires a waypoint proxy). Verify the split with Kiali or the mesh's metrics.
  2. Repeat with Linkerd. Compare install footprint, configuration ergonomics, and observability quality.
  3. (If running Cilium) enable Cilium Service Mesh. Compare again.
  4. Document tradeoffs: install effort, per-Pod overhead, feature gaps.
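The installs for steps 1–2 can be sketched as follows (a working cluster context is assumed; flags match recent istioctl and linkerd CLI releases):

```shell
# Istio in ambient mode (the ambient profile ships with recent istioctl)
istioctl install --set profile=ambient --skip-confirmation

# Linkerd: CRDs first, then the control plane, then verify
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
linkerd check
linkerd viz install | kubectl apply -f -   # on-cluster metrics and dashboard
```

Comparing the output of `kubectl get pods -A` and per-Pod resource usage after each install is a quick way to quantify the footprint difference for step 4.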

15.4 Hardening Drill

  • Enable mTLS in STRICT mode. Define AuthorizationPolicy resources that deny cross-namespace traffic by default; allow only the intended service pairs.
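In Istio terms, the drill can be sketched as three resources: a mesh-wide PeerAuthentication forcing STRICT mTLS, an empty AuthorizationPolicy per application namespace (an empty spec matches nothing, so all requests are denied), and an explicit ALLOW rule per intended pair. Namespace and service-account names here are placeholders:

```yaml
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system    # root namespace => applies mesh-wide
spec:
  mtls:
    mode: STRICT
---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: payments        # empty spec denies all requests in this namespace
spec: {}
---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-checkout
  namespace: payments
spec:
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/shop/sa/checkout"]  # intended caller only
```

Identity-based principals like this only work once STRICT mTLS is in place, since the caller's identity comes from its client certificate.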

15.5 Operations Slice

  • Wire the mesh's RED metrics into your service-level dashboards. Define SLOs per service: p99 latency, error rate, and mTLS handshake success rate.
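With Istio's standard telemetry, the first two SLIs can be expressed in PromQL along these lines (metric and label names follow Istio defaults; Linkerd's equivalents are `response_total` and `response_latency_ms_bucket`):

```promql
# Error rate: share of 5xx responses over the last 5 minutes
sum(rate(istio_requests_total{reporter="destination", response_code=~"5.."}[5m]))
  / sum(rate(istio_requests_total{reporter="destination"}[5m]))

# p99 latency per destination service
histogram_quantile(0.99,
  sum(rate(istio_request_duration_milliseconds_bucket{reporter="destination"}[5m]))
  by (le, destination_service_name))
```

Using `reporter="destination"` counts requests as seen by the server-side proxy, which avoids double-counting when both ends of a call are in the mesh.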
