Prelude: What Kubernetes Actually Is
Sit with this document for an evening before week 1.
1. Kubernetes Is a Distributed Reconciliation Loop
The most clarifying way to understand Kubernetes:
Kubernetes is a distributed key-value store (etcd) wrapped in an HTTP API server, surrounded by a swarm of independent controllers, each of which watches some types of objects in the store and writes other types of objects in response, until the cluster's actual state matches the desired state.
That's it. No central brain. No orchestration engine in the traditional sense. Every component is a client of the API server. The "control plane" is an emergent property of independent controllers cooperating through a shared, transactional store.
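To make "cooperating through a shared, transactional store" concrete, here is a minimal sketch, assuming a toy in-memory store in place of etcd and the API server. The `Store`, `Object`, and `ErrConflict` names are hypothetical. The one real idea it borrows is optimistic concurrency: every write carries the version the writer last read, and a stale write is rejected instead of silently overwriting someone else's change.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Object is a hypothetical stand-in for an API object; real objects carry
// far more metadata.
type Object struct {
	Key             string
	Value           string
	ResourceVersion int
}

// ErrConflict plays the role of an HTTP 409: the writer's copy was stale.
var ErrConflict = errors.New("conflict: object was modified")

// Store is a toy in-memory replacement for "etcd behind an API server".
type Store struct {
	mu       sync.Mutex
	objects  map[string]Object
	revision int
}

func NewStore() *Store {
	return &Store{objects: make(map[string]Object)}
}

// Get returns a copy of the stored object and whether it exists.
func (s *Store) Get(key string) (Object, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	obj, ok := s.objects[key]
	return obj, ok
}

// Update writes obj only if its ResourceVersion matches the stored one.
// A ResourceVersion of 0 means "create", which fails if the key exists.
func (s *Store) Update(obj Object) (Object, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	current, exists := s.objects[obj.Key]
	if exists && current.ResourceVersion != obj.ResourceVersion {
		return Object{}, ErrConflict
	}
	if !exists && obj.ResourceVersion != 0 {
		return Object{}, ErrConflict
	}
	s.revision++
	obj.ResourceVersion = s.revision
	s.objects[obj.Key] = obj
	return obj, nil
}

func main() {
	store := NewStore()
	created, _ := store.Update(Object{Key: "default/web", Value: "replicas=3"})

	// Two independent clients start from the same version of the object.
	a, b := created, created
	a.Value = "replicas=5"
	b.Value = "replicas=1"

	_, errA := store.Update(a) // first writer wins
	_, errB := store.Update(b) // second writer holds a stale version
	fmt.Println(errA, errB)
}
```

The loser of the race is expected to re-read the object and try again, which is why no component needs to hold a lock on anything outside the store.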
If you internalize this, the rest of the curriculum is bookkeeping.
2. The Control Loop Is the Atom
Every interesting behavior in Kubernetes is some controller running this loop:
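```go
for {
	desired := apiServer.List(myWatchedKind) // what the API objects say should exist
	actual := observe(realWorld)             // what actually exists right now
	diff := compute(desired, actual)
	for _, action := range diff {
		actOn(action) // create / update / delete real resources
		apiServer.UpdateStatus(...)
	}
	watchOrSleep() // wait for a change notification or the next resync
}
```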
The Deployment controller watches Deployment objects, creates/updates ReplicaSet objects. The ReplicaSet controller watches ReplicaSet objects, creates/updates Pod objects. The kubelet watches Pod objects bound to its node, talks to the CRI to actually start containers. Each controller's only knowledge of the others is the objects they share.
This is also why Kubernetes is eventually consistent: there is no central coordinator enforcing global state, just many controllers converging.
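The chain described above can be sketched in a few lines, assuming a toy `Cluster` struct in place of the API server and counters in place of real objects. All names here are hypothetical. Each function reads one field and writes another, none of them calls another directly, and the desired state is reached only after several passes.

```go
package main

import "fmt"

// Cluster is a hypothetical in-memory stand-in for the shared store.
// It tracks only counts; real objects carry specs, status, and owners.
type Cluster struct {
	DeploymentReplicas int // desired state, written by the user
	ReplicaSetReplicas int // written by the Deployment controller
	Pods               int // written by the ReplicaSet controller
	RunningContainers  int // the "real world", changed by the kubelet
}

// Controller reads some fields, writes others, and reports whether it
// changed anything.
type Controller func(c *Cluster) bool

func deploymentController(c *Cluster) bool {
	if c.ReplicaSetReplicas == c.DeploymentReplicas {
		return false
	}
	c.ReplicaSetReplicas = c.DeploymentReplicas
	return true
}

func replicaSetController(c *Cluster) bool {
	if c.Pods == c.ReplicaSetReplicas {
		return false
	}
	// Move one step toward the desired count per pass.
	if c.Pods < c.ReplicaSetReplicas {
		c.Pods++
	} else {
		c.Pods--
	}
	return true
}

func kubelet(c *Cluster) bool {
	if c.RunningContainers == c.Pods {
		return false
	}
	c.RunningContainers = c.Pods
	return true
}

// Converge runs every controller repeatedly until a full pass changes
// nothing, and returns the number of passes it took.
func Converge(c *Cluster, controllers []Controller) int {
	passes := 0
	for {
		passes++
		changed := false
		for _, reconcile := range controllers {
			if reconcile(c) {
				changed = true
			}
		}
		if !changed {
			return passes
		}
	}
}

func main() {
	// Deliberately run the controllers in "wrong" order: convergence does
	// not depend on who goes first, only on everyone eventually running.
	controllers := []Controller{kubelet, replicaSetController, deploymentController}

	cluster := &Cluster{DeploymentReplicas: 3}
	passes := Converge(cluster, controllers)
	fmt.Printf("scaled up in %d passes: %+v\n", passes, *cluster)

	cluster.DeploymentReplicas = 1
	passes = Converge(cluster, controllers)
	fmt.Printf("scaled down in %d passes: %+v\n", passes, *cluster)
}
```

Between passes the cluster is in an intermediate state where the fields disagree with each other. That window is what "eventually consistent" means in practice.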
3. The Five-Axis Cost Model
A working platform engineer reasons along five axes:
| Axis | Question to ask |
|---|---|
| Control plane | Does this load etcd? How many writes? How many list/watch consumers? |
| Scheduling | What's the resource request? Affinity? Taint/toleration? Topology spread? |
| Networking | Cluster-internal service / NodePort / LoadBalancer / Ingress / Gateway? CNI overhead? |
| Identity & isolation | What ServiceAccount? What namespace? What RBAC? What NetworkPolicy? What PodSecurity profile? |
| Day-2 ops | What does upgrade look like? Backup? Disaster recovery? Cost? |
Beginner courses teach only axis 2 (Scheduling).
4. The Reading List
Primary

- Kubernetes in Action (Lukša, 2e). The single best book.
- Programming Kubernetes (Hausenblas & Schimanski). Required for Months 3–4.
- Production Kubernetes (Rosso et al.). The Day-2 bible.
- Cloud Native Patterns (Davis). Architectural patterns.
Source
- `kubernetes/kubernetes` - the monorepo. Particularly:
  - `cmd/kube-apiserver`, `pkg/apiserver/`, `staging/src/k8s.io/apiserver/` - API server.
  - `cmd/kube-scheduler`, `pkg/scheduler/` - scheduler.
  - `cmd/kube-controller-manager`, `pkg/controller/` - built-in controllers.
  - `pkg/kubelet/` - node agent.
  - `pkg/proxy/` - service implementations.
- `kubernetes/community/sig-*` - design docs.
- KEPs (Kubernetes Enhancement Proposals) at `kubernetes/enhancements`. The canonical record of why features exist.
Adjacent canon

- Designing Data-Intensive Applications (Kleppmann). Especially the chapters on consensus and replication.
- The Raft paper. Read in week 1.
- Site Reliability Engineering (Google). The "what does Day-2 mean?" book.
5. Curriculum Philosophy
- Source first, blog second. When the curriculum says "study informer mechanics," open `staging/src/k8s.io/client-go/tools/cache/`. Blogs go stale; commits are dated.
- Run a real cluster. Many labs assume a multi-node setup. `kind` is fine for development; weeks 17+ assume something closer to production.
- Defaults are wrong. Kubernetes ships with permissive defaults to ease onboarding (no NetworkPolicy, no PodSecurity, broad RBAC). Production requires inverting them.
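As one example of inverting a default, the sketch below emits a default-deny NetworkPolicy as JSON, which the API accepts as readily as YAML. The Go struct names, the policy name `default-deny-all`, and the `payments` namespace are illustrative assumptions; the manifest fields themselves (`networking.k8s.io/v1`, an empty `podSelector`, both `policyTypes`) follow the standard NetworkPolicy shape.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// These types are a hand-written subset of the NetworkPolicy schema,
// just enough to emit a default-deny manifest without extra modules.
type metadata struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

type networkPolicySpec struct {
	PodSelector map[string]string `json:"podSelector"`
	PolicyTypes []string          `json:"policyTypes"`
}

type networkPolicy struct {
	APIVersion string            `json:"apiVersion"`
	Kind       string            `json:"kind"`
	Metadata   metadata          `json:"metadata"`
	Spec       networkPolicySpec `json:"spec"`
}

// DefaultDeny builds a policy that selects every pod in the namespace and
// lists both policy types with no allow rules, so nothing is permitted.
func DefaultDeny(namespace string) networkPolicy {
	return networkPolicy{
		APIVersion: "networking.k8s.io/v1",
		Kind:       "NetworkPolicy",
		Metadata:   metadata{Name: "default-deny-all", Namespace: namespace},
		Spec: networkPolicySpec{
			PodSelector: map[string]string{}, // empty selector matches all pods
			PolicyTypes: []string{"Ingress", "Egress"},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(DefaultDeny("payments"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Two caveats apply to this sketch. The policy only has an effect when the cluster's CNI plugin enforces NetworkPolicy, and denying all egress also blocks DNS lookups until a separate policy allows them.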
6. What Kubernetes Is Not For
A graduate of this curriculum should be able to argue these points:
- Single-server simple deploys. A Postgres + an app on one VM with systemd is operationally simpler than a one-node Kubernetes cluster. Don't add a control plane to host one app.
- Hard real-time / latency-critical hot paths. kube-proxy adds latency. CNI plugins add latency. The scheduler is not designed for sub-millisecond placement decisions. Use bare-metal or VM-based deployments for ultra-low-latency workloads.
- Stateful databases at scale, naively. Kubernetes can run stateful workloads with operators (Postgres operator, MongoDB operator, etc.), but doing it correctly requires a mature operator ecosystem and skilled operators. "Just put your DB in K8s" is not free.
- Teams without ops capacity. Kubernetes is not a Heroku replacement. The complexity is real. If you don't have a platform team, use Cloud Run, Fly, or a managed container service before reaching for K8s.
7. AI-Assisted Workflows
- Always read generated YAML. Models hallucinate field names, and depending on the client and its validation settings an unknown field may be rejected, produce only a warning, or be dropped silently. In the last two cases your "successful apply" may be doing nothing.
- Verify CRD generation. Tools like `controller-gen` are deterministic; let them generate, never hand-edit their output.
- Treat generated RBAC with extreme suspicion. Models tend toward over-broad permissions ("just give it cluster-admin"). Tighten by hand.
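The unknown-field problem is easy to reproduce outside a cluster. The sketch below is an illustration only: `deploymentSubset` is a hypothetical, deliberately tiny slice of a Deployment, and the decoding is Go's standard `encoding/json`, not the API server's own validation. It shows how a lenient decoder accepts a misspelled field and leaves the real one at its zero value, while a strict decoder refuses the input.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// deploymentSubset is a hypothetical, minimal subset of a Deployment,
// used only to demonstrate lenient versus strict decoding.
type deploymentSubset struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Metadata   struct {
		Name string `json:"name"`
	} `json:"metadata"`
	Spec struct {
		Replicas int `json:"replicas"`
	} `json:"spec"`
}

// DecodeLenient drops unknown fields, which is encoding/json's default.
func DecodeLenient(manifest []byte) (deploymentSubset, error) {
	var d deploymentSubset
	err := json.Unmarshal(manifest, &d)
	return d, err
}

// DecodeStrict rejects any field the struct does not declare.
func DecodeStrict(manifest []byte) (deploymentSubset, error) {
	var d deploymentSubset
	dec := json.NewDecoder(bytes.NewReader(manifest))
	dec.DisallowUnknownFields()
	err := dec.Decode(&d)
	return d, err
}

func main() {
	// "replicaCount" looks plausible but is not the real field name.
	manifest := []byte(`{
		"apiVersion": "apps/v1",
		"kind": "Deployment",
		"metadata": {"name": "web"},
		"spec": {"replicaCount": 3}
	}`)

	d, err := DecodeLenient(manifest)
	fmt.Println("lenient:", d.Spec.Replicas, err)

	_, err = DecodeStrict(manifest)
	fmt.Println("strict:", err)
}
```

Against a real cluster, a server-side dry run (`kubectl apply --dry-run=server`) is a reasonable way to get the same kind of signal before anything is persisted.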
You are now ready for Week 1. Open 01_MONTH_CONTROL_PLANE.md.