Week 4 - Built-in Controllers and client-go Foundations

4.1 Conceptual Core

  • The kube-controller-manager is a single binary running ~30 built-in controllers. Each runs the reconciliation loop pattern in its own worker goroutines.
  • The Deployment controller is the best worked example: watches Deployment objects, creates/updates ReplicaSet objects, drives rolling-update progression.
  • The patterns the built-in controllers establish (informers, work queues, structured logging, leader election) are the templates you'll use when building custom controllers.
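The reconciliation loop pattern named above can be sketched with nothing but the standard library. This is a toy model, not controller-manager code: the `state` type, `reconcile`, and `runUntilConverged` are hypothetical names, and a real controller reacts to watch events rather than spinning in a loop.

```go
package main

import "fmt"

// state is a hypothetical stand-in for one object: the replica count the
// user asked for (spec) and the count that currently exists (status).
type state struct {
	desired int
	actual  int
}

// reconcile performs one level-triggered pass: it compares desired with
// actual and takes a single step toward convergence. It reports whether
// another pass is needed.
func reconcile(s *state) (requeue bool) {
	switch {
	case s.actual < s.desired:
		s.actual++ // create one missing replica
	case s.actual > s.desired:
		s.actual-- // delete one surplus replica
	}
	return s.actual != s.desired
}

// runUntilConverged calls reconcile until it stops asking to be requeued
// and returns the number of passes taken.
func runUntilConverged(s *state) int {
	passes := 1
	for reconcile(s) {
		passes++
	}
	return passes
}

func main() {
	s := &state{desired: 3, actual: 0}
	fmt.Println("passes:", runUntilConverged(s), "actual:", s.actual)
}
```

The point of the sketch is that reconcile looks only at current state, never at the event that triggered it, so a missed or duplicated event does not change the outcome.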

4.2 Mechanical Detail

  • Informers (staging/src/k8s.io/client-go/tools/cache/):
      • A shared in-memory cache populated by a single watch stream per resource type.
      • Event handlers: OnAdd, OnUpdate, OnDelete.
      • The cache is indexed by namespace/name key, so lookups are O(1) and never hit the API server.
      • Multiple controllers in one process share informers via SharedInformerFactory.
  • Work queues (client-go/util/workqueue): rate-limited, deduplicated, item-keyed queues. Reconcile functions pull a key, read the object from the cache, act, and requeue on error.
  • The Deployment controller flow (pkg/controller/deployment/):
      1. An informer detects a Deployment change and enqueues its key.
      2. The reconciler computes the desired replica counts for the new and old ReplicaSets based on strategy (RollingUpdate vs Recreate).
      3. It creates the new RS, scales old RSs down, and scales the new RS up.
      4. It updates the Deployment status with progress.
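To make the informer and work queue mechanics above concrete, here is a minimal standard-library sketch of the handoff between them. It deliberately avoids client-go: `keyQueue`, `store`, and `onUpdate` are hypothetical stand-ins, and the real workqueue additionally tracks in-flight items and applies rate limiting.

```go
package main

import "fmt"

// keyQueue is a toy FIFO that holds each key at most once while it is
// pending, mimicking the deduplication a work queue provides.
type keyQueue struct {
	order   []string
	pending map[string]bool
}

func newKeyQueue() *keyQueue {
	return &keyQueue{pending: map[string]bool{}}
}

// Add enqueues key unless it is already waiting to be processed.
func (q *keyQueue) Add(key string) {
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get pops the oldest key; ok is false when the queue is empty.
func (q *keyQueue) Get() (key string, ok bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key, q.order = q.order[0], q.order[1:]
	delete(q.pending, key)
	return key, true
}

// Len returns the number of keys waiting in the queue.
func (q *keyQueue) Len() int { return len(q.order) }

// store is a toy informer cache: desired replicas keyed by "namespace/name".
type store map[string]int

// onUpdate plays the role of an event handler: refresh the cache, then
// enqueue only the key, never the object itself.
func onUpdate(s store, q *keyQueue, namespace, name string, replicas int) {
	key := namespace + "/" + name
	s[key] = replicas
	q.Add(key)
}

func main() {
	s, q := store{}, newKeyQueue()
	onUpdate(s, q, "prod", "web", 3)
	onUpdate(s, q, "prod", "web", 5) // second event for the same key
	onUpdate(s, q, "prod", "api", 2)

	for key, ok := q.Get(); ok; key, ok = q.Get() {
		// The reconciler reads the latest state from the cache.
		fmt.Println(key, s[key])
	}
}
```

Two rapid updates to prod/web collapse into one queue entry, and the reconciler sees only the newest cached value (5). Queuing keys instead of objects is what makes that collapse safe.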

4.3 Lab - "Read the Deployment Controller"

  1. Read pkg/controller/deployment/deployment_controller.go end-to-end (~1500 lines), following calls into sibling files such as sync.go, rolling.go, and progress.go.
  2. Trace a kubectl rollout through the source: which conditions are checked, which fields updated, what triggers the next loop iteration.
  3. Reproduce a stuck-rollout scenario (deploy a bad image); observe Progressing=False with reason ProgressDeadlineExceeded once progressDeadlineSeconds elapses; inspect status conditions.
  4. Manually scale a Deployment to 0 with kubectl scale; trace what the controller does in response.

4.4 Hardening Drill

  • Set sensible Deployment defaults in your platform: progressDeadlineSeconds, revisionHistoryLimit, rolling-update maxSurge/maxUnavailable for production workloads.
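When choosing maxSurge and maxUnavailable, it helps to see what a percentage resolves to for a given replica count. The sketch below assumes both values are given as whole percentages and follows the documented rounding (surge rounds up, unavailable rounds down, and unavailable is bumped to 1 if both resolve to zero). `resolveBounds` is a hypothetical helper, not a Kubernetes function.

```go
package main

import "fmt"

// resolveBounds converts percentage settings into absolute pod counts for
// a Deployment with the given replica count. It returns the most pods
// that may exist during a rollout and the fewest that must stay available.
func resolveBounds(replicas, surgePct, unavailablePct int) (maxTotal, minAvailable int) {
	surge := (replicas*surgePct + 99) / 100        // round up
	unavailable := replicas * unavailablePct / 100 // round down
	if surge == 0 && unavailable == 0 {
		// With no headroom in either direction the rollout could never
		// progress, so allow one pod to be unavailable.
		unavailable = 1
	}
	return replicas + surge, replicas - unavailable
}

func main() {
	// 10 replicas with the default 25% / 25% settings.
	maxTotal, minAvailable := resolveBounds(10, 25, 25)
	fmt.Println(maxTotal, minAvailable) // 13 8
}
```

For small Deployments the rounding matters: with 3 replicas at 25% / 25%, surge resolves to 1 and unavailable to 0, so the rollout never drops below full capacity.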

4.5 Operations Slice

  • Wire workqueue metrics: workqueue_adds_total, workqueue_depth, workqueue_queue_duration_seconds (time waiting in the queue), and workqueue_work_duration_seconds (time spent processing). Alert on persistent depth or rising processing latency.
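The word "persistent" in the alerting advice matters: a brief spike in depth is normal, while a backlog that never drains is not. As a rough illustration of that distinction, assuming depth is sampled at a fixed interval, a check might look like the following. `persistentlyAbove` and the sample data are hypothetical; in practice this logic usually lives in the alerting system, not in application code.

```go
package main

import "fmt"

// persistentlyAbove reports whether the most recent `window` samples all
// exceed threshold. A single spike inside an otherwise quiet series does
// not trigger it.
func persistentlyAbove(samples []int, threshold, window int) bool {
	if window <= 0 || len(samples) < window {
		return false
	}
	for _, v := range samples[len(samples)-window:] {
		if v <= threshold {
			return false
		}
	}
	return true
}

func main() {
	spike := []int{0, 2, 90, 1, 0, 3}      // one burst, then drained
	backlog := []int{0, 40, 55, 60, 72, 80} // depth keeps growing

	fmt.Println(persistentlyAbove(spike, 20, 3))   // false
	fmt.Println(persistentlyAbove(backlog, 20, 3)) // true
}
```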

Month 1 Capstone Deliverable

A control-plane/ workspace:

  1. etcd-cluster/ - week 1's 3-node cluster + backup/restore script.
  2. audit-pipeline/ - week 2's audit-log shipping + sample queries.
  3. custom-scheduler-plugin/ - week 3's scheduler plugin + deployment.
  4. controller-walkthrough.md - week 4's annotated tour of the Deployment controller.
