Kubernetes From Scratch (Beginner)¶
Beginner path: heard-of-Kubernetes → deploying pods/services/Helm charts, debugging clusters, contributing to K8s-adjacent OSS.
Kubernetes From Scratch - Beginner to OSS Contributor¶
From "I have heard of Kubernetes" to "I can deploy a small app to a local cluster, read a Helm chart, debug a failing pod, and submit a fix to a Kubernetes-adjacent OSS project."
Who this is for¶
- You've finished Containers From Scratch (or you've used Docker enough to know what an image and a container are).
- You've never used Kubernetes, OR you've copy-pasted some YAML without really understanding what it does.
Soft prerequisite¶
This path assumes container fluency. If you can't write a Dockerfile and run a multi-service compose stack, do Containers From Scratch first.
What you'll need¶
- Docker Desktop (with the Kubernetes feature enabled) OR minikube OR kind OR k3d - any will work for the local cluster.
- kubectl - the Kubernetes CLI.
- A text editor.
- About 5 hours/week. Path is sized for 3-4 months.
Why Kubernetes¶
- The de facto standard for production container orchestration. Every major cloud, every modern infra team.
- Skills transfer across clouds and on-prem. Kubernetes is Kubernetes is Kubernetes.
- OSS surface is huge. Operators, controllers, Helm charts, kubectl plugins, dozens of CNCF projects - endless contribution opportunities.
How this path works¶
Each page does one thing: explains, shows, gives an exercise, ends with Q&A.
The pages¶
| # | Title | What you'll know after |
|---|---|---|
| 00 | Introduction | What Kubernetes is and isn't |
| 01 | Setup | Local cluster + kubectl |
| 02 | Pods | The smallest unit |
| 03 | Deployments | Managing pods declaratively |
| 04 | Services | Networking pods together |
| 05 | ConfigMaps and Secrets | Configuration |
| 06 | Namespaces | Organization |
| 07 | Labels and selectors | How K8s finds things |
| 08 | Volumes and storage | Persistent data |
| 09 | Ingress | Routing external traffic |
| 10 | Helm | Package manager |
| 11 | kubectl power tools | Debugging in real clusters |
| 12 | Reading other people's manifests | The bridge |
| 13 | Picking a project | K8s-adjacent OSS candidates |
| 14 | Anatomy of a K8s-related OSS project | Case study |
| 15 | Your first contribution | Workflow + PR |
Start with Introduction.
00 - Introduction¶
What this session is¶
A 10-minute read. Sets expectations.
What you're going to be able to do, eventually¶
By the end:
- Run a local Kubernetes cluster on your laptop.
- Deploy an application: Pod → Deployment → Service → Ingress.
- Configure apps with ConfigMaps and Secrets.
- Persist data with PersistentVolumes.
- Install third-party software with Helm.
- Debug failing pods: read logs, exec in, port-forward.
- Read a real-world Kubernetes manifest or Helm chart and understand what it does.
- Submit a fix to a Kubernetes-adjacent OSS project (a chart, a controller's docs, a kubectl plugin).
That last bullet is the goal.
What Kubernetes actually is¶
Kubernetes is a system for running containers across a cluster of machines. You give it:
- A description of what you want running (which images, how many copies, what resources).
- Some machines (nodes) to run them on.
Kubernetes does the rest: places containers on nodes, restarts them when they die, scales them up or down, networks them together, exposes them to the outside world, rolls out new versions, rolls back on failure.
The promise: you describe desired state in YAML; the system converges the actual state to match. If a node dies, pods running on it are rescheduled elsewhere. If you change the image version, pods are replaced one by one.
The deal¶
- It's slow on purpose. One concept per page.
- Container fluency assumed. If "Docker container" is unfamiliar, do containers first.
- You'll run a local cluster. Most pages have hands-on exercises.
- Kubernetes is a lot of vocabulary. Pod, Deployment, Service, Namespace, ConfigMap, Secret, Ingress, PersistentVolume, PersistentVolumeClaim, Helm chart, Custom Resource, Operator, Controller. We introduce them one at a time. Don't panic at the list.
What you need¶
- A way to run a local Kubernetes cluster. Pick one:
  - Docker Desktop's built-in Kubernetes - easiest on macOS / Windows. Toggle in settings.
  - minikube - works on Mac, Linux, Windows. Mature.
  - kind (Kubernetes IN Docker) - fast, popular for development.
  - k3d - wraps k3s (a lightweight K8s) in Docker.
- kubectl - the CLI. Usually bundled with the above; otherwise brew install kubectl / sudo apt install kubectl.
- A text editor.
- ~5 hours/week. Path is sized for 3-4 months.
What you do NOT need¶
- A cloud account. We work entirely locally.
- A programming language (some advanced topics use Go, but you'll mostly be writing YAML).
- A multi-machine cluster. Local single-node is enough for everything here.
What Kubernetes is not¶
Useful clarifications:
- Not a PaaS. It doesn't include a code-deployment pipeline, a database, or a logging system. You bring those (or run them on K8s).
- Not a virtualization layer. It schedules containers on Linux nodes; the nodes themselves are Linux machines.
- Not "just better Docker Compose." It solves a different (bigger) problem: orchestration across many machines. For single-machine deployment, Compose is often the right tool.
When to use Kubernetes (and when not)¶
Use Kubernetes when:
- You have several services that need to talk to each other.
- You want declarative deploys (commit YAML → cluster converges).
- You need auto-restart, rolling updates, scaling.
- You're operating in a multi-machine environment (cloud, on-prem fleet).
Don't use Kubernetes when:
- You have one app on one server. Use Docker Compose (or just docker run).
- You're a single developer making your first deploy. Use a PaaS (Railway, Fly.io, Render). PaaS hides K8s; you ship faster.
- You don't yet have multiple services. Premature complexity.
This path teaches you Kubernetes regardless - even if you don't need to deploy with it daily, the vocabulary appears in job interviews, blog posts, and the wider infra community.
How long this realistically takes¶
3 to 4 months at 5 hours/week. Shorter than the language paths - Kubernetes is concepts + YAML, not a new language.
What success looks like¶
You'll be able to:
- Read a kubectl get pods output and tell what's wrong.
- Read a Helm chart and predict what it'll deploy.
- Write a Deployment + Service for a small app.
- Submit a PR to a K8s-adjacent project.
You will not be able to:
- Build and operate a production cluster end-to-end. (Months of additional work; the "Kubernetes" senior reference path on this site covers it.)
- Pass a CKA (Certified Kubernetes Administrator) exam. (More targeted prep needed.)
One last thing before we start¶
Kubernetes has more jargon than any path on this site. Don't try to memorize all of it on day one. Each page introduces what's needed at that moment. By page 15 it all clicks together; you don't need to understand Helm to learn what a Pod is.
If a page feels too dense - stop, re-read. Still dense? Skip, come back.
Ready? Next: Setup →
01 - Setup¶
What this session is¶
About 30 minutes. Get a local Kubernetes cluster running. Install kubectl. Run your first commands.
Step 1: Pick a local-cluster tool¶
Pick one. They all work; you can switch later.
Option A: Docker Desktop (easiest on macOS / Windows)
Open Docker Desktop → Settings → Kubernetes → check "Enable Kubernetes" → Apply. Wait a few minutes; Docker Desktop downloads everything and starts a single-node cluster.
Option B: minikube
brew install minikube # macOS
sudo apt install minikube # if available; else download from minikube.sigs.k8s.io
minikube start
Option C: kind (Kubernetes IN Docker)
brew install kind # macOS
# or: curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.24.0/kind-linux-amd64 && chmod +x ./kind && sudo mv ./kind /usr/local/bin/
kind create cluster
Option D: k3d (k3s in Docker)
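One way to set it up (assuming Homebrew; other installers are listed on k3d.io):
brew install k3d    # macOS
k3d cluster create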
Pick one. The rest of this path uses commands that work the same way regardless of which.
Step 2: Install kubectl¶
kubectl is the CLI. Docker Desktop's Kubernetes installs it for you; minikube and kind don't always.
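If you need to install it separately, for example:
brew install kubectl        # macOS
sudo apt install kubectl    # Debian/Ubuntu (may need the Kubernetes apt repo configured)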
Verify:
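For example:
kubectl version --client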
Should print a version like Client Version: v1.31.x.
Step 3: Verify the cluster¶
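Run:
kubectl cluster-info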
You should see something like:
Kubernetes control plane is running at https://127.0.0.1:6443
CoreDNS is running at https://127.0.0.1:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
If you get connection refused or similar, the cluster isn't running. Restart the cluster (minikube start, kind delete cluster && kind create cluster, restart Docker Desktop).
Check the nodes:
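For example:
kubectl get nodes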
You should see one node (your local "machine" in the cluster):
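Something like this (name and version depend on which tool you picked):
NAME             STATUS   ROLES           AGE   VERSION
docker-desktop   Ready    control-plane   5m    v1.31.1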
Cluster is up.
Step 4: Your first kubectl commands¶
kubectl get pods # no pods yet - empty
kubectl get pods -A # all namespaces - see system pods
kubectl get namespaces # list namespaces
kubectl version # client + server version
kubectl config current-context # which cluster you're talking to
-A is short for --all-namespaces. You'll see system pods running in kube-system (CoreDNS, kube-proxy, etc.) - these are Kubernetes' own internals.
Step 5: Run your first pod¶
The fastest way (we'll do this properly with YAML in page 02):
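For example:
kubectl run nginx --image=nginx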
After a few seconds:
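Check it (ages and restart counts will differ):
kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
nginx   1/1     Running   0          20s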
You just deployed nginx to your cluster.
Step 6: Look at the pod's logs¶
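For example:
kubectl logs nginx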
Should show nginx's startup messages.
Step 7: Port-forward to access it¶
The pod is running in the cluster but not reachable from your laptop yet. Use port-forward to tunnel:
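For example, forwarding local port 8080 to the pod's port 80:
kubectl port-forward pod/nginx 8080:80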
In another terminal:
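For example:
curl localhost:8080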
Should return the nginx welcome HTML.
Ctrl-C the port-forward when done. We'll do permanent exposure with Services in page 04.
Step 8: Clean up¶
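Remove the pod:
kubectl delete pod nginx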
Step 9: Set up shell completion (optional but valuable)¶
# bash
source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc
# zsh
source <(kubectl completion zsh)
echo 'source <(kubectl completion zsh)' >> ~/.zshrc
Tab completion for resource names. Saves a lot of typing.
Also useful: alias k=kubectl:
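For example:
echo 'alias k=kubectl' >> ~/.bashrc   # or ~/.zshrc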
Many Kubernetes users have done this; you'll see k get pods in tutorials and at work.
What just happened, conceptually¶
You ran kubectl run - that told the Kubernetes API server "I want a pod running nginx." The cluster's scheduler picked a node (your only one), pulled the image, started the container, marked the pod as Running.
You then asked the API server "show me the pod's logs" - it routed to the kubelet on the node, which fetched the logs from the container runtime.
You used port-forward to tunnel a local port through kubectl to the pod inside the cluster. Everything went through the API server.
Three components you've now interacted with:
- API server - the cluster's brain. Everything goes through it.
- Scheduler - decides which node a pod runs on.
- kubelet - agent on each node; runs the actual pods.
There's more (etcd, controllers, etc.); we'll meet them as needed.
What you might wonder¶
"Why is everything inside the cluster opaque to my laptop?" Pods get IPs only within the cluster's network. To reach them from outside you either port-forward (debugging), use a Service of type NodePort or LoadBalancer (page 04), or use an Ingress (page 09). Architectural separation.
"Can I run multiple clusters?"
Yes - kind create cluster --name another, kubectl config get-contexts to list, kubectl config use-context <name> to switch. Useful for testing different K8s versions or simulating multi-cluster setups.
"What's kubeconfig?"
~/.kube/config - the file kubectl reads to know which cluster to talk to and how to authenticate. Multiple clusters can coexist in one config; current-context is which one is active.
"What if I break the cluster?"
Local clusters are throwaway. kind delete cluster && kind create cluster recreates from scratch in 30 seconds. Don't be afraid to break things.
Done¶
- Local Kubernetes cluster running.
- kubectl installed and talking to the cluster.
- Ran a pod, viewed logs, port-forwarded.
- Recognized the API server / scheduler / kubelet model.
02 - Pods¶
What this session is¶
About 45 minutes. Pods - Kubernetes' smallest deployable unit. You'll write your first manifest YAML, apply it, inspect it, debug a broken one.
What a pod is¶
A Pod is one or more containers that share:
- A network namespace (same IP, same ports - they can reach each other on localhost).
- Volumes.
- A lifecycle (created together, destroyed together).
99% of pods have one container. The multi-container case is for tightly-coupled helpers ("sidecar pattern" - e.g. a log shipper running alongside the main app).
Think of a pod as "a wrapper for one container + its co-located helpers." When we say "deploying nginx to Kubernetes," we mean "a pod with one nginx container."
You almost never run pods directly. You'll use Deployments (page 03) which manage pods for you. But understanding pods first is essential.
Your first pod via YAML¶
# pod.yaml
apiVersion: v1
kind: Pod
metadata:
name: nginx
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.27
ports:
- containerPort: 80
Apply:
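For example:
kubectl apply -f pod.yaml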
Output: pod/nginx created.
Inspect:
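For example:
kubectl get pods
kubectl describe pod nginx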
describe shows everything - image, IP, node, events. Read it.
Logs, port-forward to reach it, then delete:
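For example, using the pod name from the manifest above:
kubectl logs nginx
kubectl port-forward pod/nginx 8080:80   # then curl localhost:8080 in another terminal
kubectl delete -f pod.yaml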
Anatomy of the YAML¶
Every Kubernetes manifest has the same four top-level keys:
| Key | What it is |
|---|---|
| apiVersion | Which API version this resource uses (v1 for core resources) |
| kind | Type of resource (Pod, Deployment, Service, ...) |
| metadata | Name, labels, annotations |
| spec | The actual configuration (resource-specific) |
The names are stable across resources. Once you've memorized them, every K8s YAML reads with the same structure.
More detailed pod spec¶
A more realistic pod:
apiVersion: v1
kind: Pod
metadata:
name: web
labels:
app: web
tier: frontend
spec:
containers:
- name: nginx
image: nginx:1.27
ports:
- containerPort: 80
name: http
env:
- name: HELLO
value: "world"
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "256Mi"
livenessProbe:
httpGet:
path: /
port: 80
initialDelaySeconds: 5
periodSeconds: 10
restartPolicy: Always
New fields:
- env - environment variables for the container.
- resources - CPU and memory budget.
  - requests - what the scheduler reserves (used to decide which node has room).
  - limits - hard ceiling (container is throttled or killed if it exceeds them).
  - CPU is in millicores: 100m = 0.1 CPU. Memory: 128Mi = 128 mebibytes (use Mi, Gi, not the SI M, G).
- livenessProbe - Kubernetes periodically checks; if it fails enough times, the container is restarted.
- restartPolicy - Always (default), OnFailure, Never.
Always set resource requests and limits. Pods without them can starve other pods or get killed unpredictably.
Multi-container pod (sidecar pattern)¶
apiVersion: v1
kind: Pod
metadata:
name: app-with-sidecar
spec:
containers:
- name: app
image: my-app:1.0
ports:
- containerPort: 8080
- name: log-shipper
image: my-log-shipper:1.0
# this container scrapes /shared/logs and sends to elsewhere
volumeMounts:
- name: logs
mountPath: /shared/logs
volumes:
- name: logs
emptyDir: {}
Both containers share the volume logs (an emptyDir - wiped when the pod dies). The main app writes logs there; the sidecar reads and ships them somewhere.
You'll rarely write this yourself - most needs are met by a single container. Recognize the pattern when you see it.
Pod lifecycle / status¶
kubectl get pods shows STATUS. Common values:
- Pending - pod is accepted but containers haven't started yet (image pulling, scheduling).
- Running - at least one container is alive.
- Succeeded - all containers exited with code 0. (Like a batch job.)
- Failed - at least one container exited with a nonzero code, and restartPolicy won't retry.
- CrashLoopBackOff - container keeps crashing; Kubernetes is backing off restarts.
- ImagePullBackOff - image can't be pulled (wrong name, no auth).
describe shows recent events - the timeline of what happened. Almost always tells you what's wrong.
Debugging a broken pod¶
Real workflow when a pod won't start:
1. kubectl get pods - what state is it in?
2. kubectl describe pod <name> - read the Events at the bottom. Usually says exactly what's wrong.
3. kubectl logs <name> - what did the container print before crashing?
4. kubectl logs <name> --previous - logs from the previous (crashed) container, if it restarted.
5. kubectl exec -it <name> -- sh - shell in (if the container is at least briefly running).
Most issues are: wrong image name, missing env var, wrong port, can't reach a dependency, OOMKilled, no resource room on the node.
Pods are mortal¶
A pod dies if:
- Its node dies.
- It's evicted (resource pressure).
- You delete it.
- A controller (Deployment) replaces it with a new version.
When a pod dies, it's gone - a new pod gets a new name, a new IP. Anything you stored inside the pod is lost. That's why you use Deployments (which create replacement pods automatically) and volumes (for persistence).
Don't get attached to specific pods.
Exercise¶
- Write and apply the basic pod YAML above (pod.yaml). Apply, inspect, port-forward, curl it, delete.
- Write a pod that uses an env var:

  apiVersion: v1
  kind: Pod
  metadata:
    name: envtest
  spec:
    containers:
    - name: app
      image: alpine
      command: ["sh", "-c", "echo Hello, $WHO! && sleep 60"]
      env:
      - name: WHO
        value: "Kubernetes"

  Apply, kubectl logs envtest. Should print "Hello, Kubernetes!".
- Debug a broken pod intentionally:

  apiVersion: v1
  kind: Pod
  metadata:
    name: broken
  spec:
    containers:
    - name: app
      image: nonexistent/image:404

  Apply. Run kubectl get pods (should show ImagePullBackOff or ErrImagePull). Run kubectl describe pod broken. Read the Events. Delete.
- Resource limits:

  apiVersion: v1
  kind: Pod
  metadata:
    name: greedy
  spec:
    containers:
    - name: app
      image: alpine
      command: ["sh", "-c", "while true; do :; done"]
      resources:
        requests: { cpu: "100m", memory: "64Mi" }
        limits: { cpu: "200m", memory: "128Mi" }

  Apply. kubectl top pod greedy (may need metrics-server) - should show CPU usage capped near 200m. Delete.
What you might wonder¶
"Why YAML?" Kubernetes' API is declarative - you describe desired state in JSON/YAML. YAML is more human-friendly for editing. JSON also works.
"How do I see the YAML of a running resource?"
Useful for "what did Kubernetes actually create?" - includes auto-generated fields (UID, status, etc.)."What's apiVersion: v1 vs apiVersion: apps/v1?"
Some resources live in different API groups. Core resources (Pod, Service, ConfigMap) are in v1. Apps (Deployment, StatefulSet) are in apps/v1. Networking (Ingress) is in networking.k8s.io/v1. The right apiVersion for each resource is in the docs (or shown in kubectl explain Pod).
"What's kubectl explain?"
A built-in docs lookup. kubectl explain pod.spec.containers shows the schema. Useful when you forget a field name.
Done¶
- Write a basic Pod manifest.
- Apply, inspect, log, exec, delete.
- Read kubectl describe for events.
- Distinguish Pod status values.
- Debug a broken pod.
03 - Deployments¶
What this session is¶
About 45 minutes. Deployments - the resource you'll actually use most. Manages a set of identical pods (replicas), handles rolling updates, restarts dead pods.
Why deployments instead of pods¶
Raw pods are mortal. If a pod dies, it stays dead. If you want N copies, you create N pod manifests. If you update the image, you delete each pod and create new ones.
A Deployment does this for you. It says: "I want 3 replicas of this pod template; keep it that way." If a pod dies, the Deployment creates a new one. If you change the image version, the Deployment rolls out new pods and removes old ones one at a time.
You will write Deployments far more often than raw pods.
Your first Deployment¶
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.27
ports:
- containerPort: 80
Apply:
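For example:
kubectl apply -f deployment.yaml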
Check:
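For example:
kubectl get deployments
kubectl get pods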
You should see one deployment and three pods with names like nginx-7b85f9c-xxxxx. The hash is from the pod template; the suffix is unique per pod.
Anatomy¶
- replicas: 3 - keep 3 pods running.
- selector.matchLabels - which pods this Deployment manages. Must match the template.metadata.labels.
- template - the pod spec for new pods. Same structure as a Pod manifest, without apiVersion/kind.
The selector + labels is how Kubernetes knows which pods "belong" to this Deployment. They must match. Page 07 covers labels in depth.
Scale up or down¶
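For example, scale up:
kubectl scale deployment nginx --replicas=5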
Now 5 pods. Scale down:
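kubectl scale deployment nginx --replicas=2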
Or edit the YAML's replicas: field, re-apply.
Rolling updates¶
Change the image:
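For example:
kubectl set image deployment/nginx nginx=nginx:1.26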
Or edit the YAML (image: nginx:1.26), re-apply.
Watch:
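For example:
kubectl rollout status deployment/nginx
kubectl get pods -w     # in another terminal, watch pods being replaced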
Kubernetes:
1. Creates a new pod with the new image.
2. Waits for it to be Ready.
3. Removes one old pod.
4. Repeats until all pods are the new version.
Zero downtime. The killer feature of Deployments.
Rollback¶
If the new version is broken:
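For example:
kubectl rollout undo deployment/nginx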
Goes back to the previous version. Done. (kubectl rollout history deployment/nginx shows past revisions.)
Rolling-update strategy¶
The default behavior is configurable:
spec:
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1 # at most 1 extra pod above replicas during update
maxUnavailable: 1 # at most 1 fewer pod than replicas during update
The other strategy:
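For example:
spec:
  strategy:
    type: Recreate   # kill all old pods before starting new ones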
For most apps, the rolling default is right. Use Recreate only when the old and new versions can't coexist (database schema migrations, for example).
Readiness probes matter for rolling updates¶
A pod is "Ready" when its container started, and if a readiness probe is configured, the probe is passing. During a rolling update, Kubernetes waits for the new pod to be Ready before considering it healthy.
Without a readiness probe, Kubernetes assumes "Running == Ready" - which is often wrong (the container is up but the app isn't accepting requests yet).
spec:
template:
spec:
containers:
- name: nginx
image: nginx:1.27
readinessProbe:
httpGet:
path: /
port: 80
initialDelaySeconds: 2
periodSeconds: 5
Always add readiness probes to production apps.
Inspect what's happening¶
kubectl get deploy nginx -o yaml # the current YAML (with status)
kubectl describe deploy nginx # status + events
kubectl rollout status deploy/nginx # live status of an ongoing update
kubectl rollout history deploy/nginx # past revisions
The Deployment's status section shows: how many pods are available, ready, and updating.
ReplicaSets (a side note)¶
Each Deployment creates a ReplicaSet under the hood (one per image version). The ReplicaSet manages the actual pods. You can kubectl get replicasets to see them.
You almost never edit ReplicaSets directly - you edit the Deployment, which manages the ReplicaSets. Recognize the layer; don't worry about it.
Exercise¶
- Apply the Deployment above. Check kubectl get deploy and kubectl get pods.
- Scale to 5.
- Update the image, watch the rolling update.
- Rollback.
- Delete a pod manually.
- Cleanup.
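One possible run-through (assuming the Deployment is still named nginx):
kubectl scale deployment nginx --replicas=5
kubectl set image deployment/nginx nginx=nginx:1.26
kubectl rollout status deployment/nginx
kubectl rollout undo deployment/nginx
kubectl delete pod <one-of-the-nginx-pods>   # a replacement appears almost immediately
kubectl delete deployment nginx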
What you might wonder¶
"What if I delete the Deployment?"
The Deployment AND its pods are deleted. To keep the pods (rare), kubectl delete deploy nginx --cascade=orphan.
"What's the difference between Deployment and StatefulSet?" Deployments are for stateless pods (any pod is interchangeable). StatefulSets are for stateful pods that need stable identities (think Postgres replicas - each one has its own data). Use Deployments unless you have a clear stateful need.
"What's DaemonSet?" A controller that ensures one pod per node. Used for per-node agents (log collectors, network plugins). Different shape; recognize.
"How do horizontal autoscalers work?"
HorizontalPodAutoscaler (HPA) - adjusts replicas based on CPU/memory/custom metrics. Beyond beginner; mentioned for awareness.
Done¶
- Write a Deployment manifest.
- Scale replicas up/down.
- Trigger rolling updates with kubectl set image.
- Rollback with kubectl rollout undo.
- Add readiness probes.
- Understand selector/label matching.
04 - Services¶
What this session is¶
About 45 minutes. Services - how pods reach each other and how the outside world reaches your pods. The piece that turns "3 pods running nginx" into "a stable endpoint other things can talk to."
The problem¶
Each pod gets its own IP, but pod IPs are ephemeral - a pod dies and is replaced; the new one has a different IP. You can't hardcode a pod IP anywhere.
A Service is a stable virtual IP + DNS name that fronts a set of pods. Clients talk to the service; the service load-balances across whichever pods currently match.
Your first Service¶
Assuming the nginx Deployment from page 03 is running:
# service.yaml
apiVersion: v1
kind: Service
metadata:
name: nginx
spec:
selector:
app: nginx # match pods with label app=nginx
ports:
- port: 80 # the service port
targetPort: 80 # the container port
type: ClusterIP # default - internal only
Apply:
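For example:
kubectl apply -f service.yaml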
Output:
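service/nginx created. Then check it with kubectl get service nginx - something like (the ClusterIP will differ):
NAME    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
nginx   ClusterIP   10.96.45.123   <none>        80/TCP    5s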
The service has a stable IP. It also gets a DNS name: nginx.default.svc.cluster.local (or just nginx within the same namespace).
The four Service types¶
| Type | What it does |
|---|---|
| ClusterIP (default) | Internal-only. Reachable from within the cluster. |
| NodePort | Exposes the service on a port on every node (30000-32767). |
| LoadBalancer | In a cloud cluster, provisions an external cloud load balancer. |
| ExternalName | DNS alias to an external hostname. Rare. |
For pod-to-pod traffic: ClusterIP. For "I want a stable external port for testing": NodePort. For "I want a real public endpoint in the cloud": LoadBalancer. Local clusters often don't give you LoadBalancers automatically - use port-forward or NodePort.
Test it: pod-to-pod¶
Run a debug container:
kubectl run debug --rm -it --image=alpine -- sh
# inside the container:
wget -qO- http://nginx
# should return nginx's welcome HTML
exit
The wget reached the nginx pods via the service's ClusterIP. DNS resolved nginx to the service's IP. Successive requests are balanced across the three nginx replicas.
NodePort: external access (sort of)¶
spec:
type: NodePort
selector:
app: nginx
ports:
- port: 80
targetPort: 80
nodePort: 30080 # optional - leave out for auto-assigned
Now reachable at <any-node-IP>:30080. On a local single-node cluster, that's localhost:30080.
NodePorts are for tests and ad-hoc access. For production, use a LoadBalancer or Ingress (page 09).
LoadBalancer (cloud)¶
In AWS/GCP/Azure, this provisions a real load balancer with a public IP. On local clusters, this stays pending forever - you'd use minikube tunnel or similar to fake it. Stick with port-forward for local dev.
How selectors find pods¶
The Service's selector matches pods by labels. Recall the Deployment's pod template:
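From the Deployment manifest on page 03:
template:
  metadata:
    labels:
      app: nginx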
Service's selector:
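selector:
  app: nginx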
The service tracks all pods with app: nginx. As pods come and go, the service automatically updates its list of endpoints.
You can see the actual endpoints:
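For example:
kubectl get endpoints nginx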
Port terminology¶
Sources of confusion:
- port - the Service's port (what clients call).
- targetPort - the container's port (where traffic is forwarded).
- nodePort - for NodePort services, the port exposed on every node.
Often they're all the same (port: 80, targetPort: 80) but they don't have to be. Useful: port: 80, targetPort: 8080 exposes a service on port 80 that talks to the container's 8080.
Service discovery via DNS¶
Inside the cluster, every Service has a DNS name:
<service-name>.<namespace>.svc.cluster.local
# or just <service-name> if you're in the same namespace
So a pod in the default namespace can reach the nginx service simply as nginx. A pod in another namespace would use nginx.default or nginx.default.svc.cluster.local.
This is how multi-service apps wire together: each app uses the service name of its dependency.
Headless service (briefly)¶
A service with clusterIP: None doesn't get a virtual IP - instead, DNS returns the IPs of all matching pods. Used for things like databases where clients need to talk to specific replicas. Mentioned for recognition.
Exercise¶
- Apply the nginx Deployment + Service.
- Test pod-to-pod.
- Change to NodePort and access externally: edit service.yaml to type: NodePort, add nodePort: 30080. Apply.
- Watch endpoints update as pods change.
- Cleanup.
"Is the Service load-balancing or round-robin?" By default: load-balancing across endpoints. Behavior depends on the kube-proxy mode (iptables, IPVS, eBPF). For most cases, it's "good enough" round-robin-ish.
"Can a Service select pods from multiple Deployments?"
Yes - anything matching the labels. Useful for blue-green: two Deployments both labeled app: myapp; the Service serves both during the switchover.
"What about session stickiness?"
spec.sessionAffinity: ClientIP makes the service stick a client to the same backend pod. Rarely needed; mentioned.
"What's the difference between Service and Ingress?" Services are L4 (TCP/UDP). Ingresses are L7 (HTTP/HTTPS) and add host/path routing. Page 09.
Done¶
- Write a Service.
- Use it for pod-to-pod communication via DNS.
- Recognize the four Service types.
- Distinguish port, targetPort, nodePort.
- Inspect Endpoints.
Next: ConfigMaps and Secrets →
05 - ConfigMaps and Secrets¶
What this session is¶
About 45 minutes. How to inject configuration into pods without baking it into the image - ConfigMaps (non-sensitive) and Secrets (sensitive).
The problem¶
You build an image once and want to deploy it in dev, staging, prod - each with different config (different database URLs, API keys, log levels). Hardcoding in the Dockerfile breaks that. Hardcoding in the Deployment YAML scatters config across many places.
ConfigMap stores key-value config. Secret stores sensitive key-value config (passwords, API keys, certs). You reference them from Deployments.
ConfigMap¶
Create from a YAML file:
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: app-config
data:
LOG_LEVEL: info
FEATURE_FLAG_A: "true"
app.properties: |
server.port=8080
timeout=30s
Apply:
Alternatively, create from a file or literals:
kubectl create configmap app-config --from-file=app.properties
kubectl create configmap app-config --from-literal=LOG_LEVEL=info --from-literal=PORT=8080
The YAML form is preferred for version-controlled config.
Use a ConfigMap in a Deployment¶
Two ways: as env vars, or mounted as files.
As env vars:
spec:
containers:
- name: app
image: my-app:1.0
env:
- name: LOG_LEVEL
valueFrom:
configMapKeyRef:
name: app-config
key: LOG_LEVEL
Or import all keys at once:
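For example:
envFrom:
- configMapRef:
    name: app-config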
As mounted files:
spec:
containers:
- name: app
image: my-app:1.0
volumeMounts:
- name: config-vol
mountPath: /etc/app
volumes:
- name: config-vol
configMap:
name: app-config
Each key in the ConfigMap becomes a file. So app.properties is at /etc/app/app.properties and LOG_LEVEL is at /etc/app/LOG_LEVEL.
For an app that reads config from a file (like nginx), mount form is the way.
Secret¶
Same shape, but for sensitive data:
apiVersion: v1
kind: Secret
metadata:
name: app-secret
type: Opaque
stringData:
DATABASE_PASSWORD: hunter2
API_KEY: xyzzy
stringData accepts plain strings (Kubernetes base64-encodes them internally). The alternative data: field requires you to base64-encode yourself - annoying; prefer stringData.
Apply:
Note: when you view a Secret with -o yaml, values appear base64-encoded. Decode:
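For example:
kubectl get secret app-secret -o jsonpath='{.data.DATABASE_PASSWORD}' | base64 -d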
Use Secrets in pods the same way as ConfigMaps:
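For example, secretKeyRef instead of configMapKeyRef:
env:
- name: DATABASE_PASSWORD
  valueFrom:
    secretKeyRef:
      name: app-secret
      key: DATABASE_PASSWORD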
Or mount as files. For TLS certs, mounting is the common pattern.
Secrets are not actually that secret¶
Important caveat: Kubernetes Secrets are not encrypted at rest by default. They're base64-encoded (which is encoding, not encryption - trivially decoded). Anyone with API access to your cluster can read them.
What this means:
- For local development: Secrets are fine.
- For production: enable encryption at rest (kube-apiserver flag) AND use RBAC to limit who can read Secrets, AND/OR use external secret managers (HashiCorp Vault, AWS Secrets Manager, sealed-secrets, External Secrets Operator).
Treat Kubernetes Secrets as "values I prefer not to log" - not "values I can keep secret from attackers who own the cluster."
A full example: app with config and secret¶
apiVersion: v1
kind: ConfigMap
metadata:
name: app-config
data:
LOG_LEVEL: debug
PORT: "8080"
---
apiVersion: v1
kind: Secret
metadata:
name: app-secret
type: Opaque
stringData:
DATABASE_PASSWORD: secret-pass
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: app
spec:
replicas: 2
selector:
matchLabels: {app: app}
template:
metadata:
labels: {app: app}
spec:
containers:
- name: app
image: my-app:1.0
envFrom:
- configMapRef:
name: app-config
env:
- name: DATABASE_PASSWORD
valueFrom:
secretKeyRef:
name: app-secret
key: DATABASE_PASSWORD
ports:
- containerPort: 8080
The --- separates multiple YAML documents in one file. kubectl apply -f handles all three at once.
Updating ConfigMaps and Secrets¶
If you edit a ConfigMap and re-apply:
- For env-injected values: the pods do NOT pick up the new values until they're restarted. (Env vars are set at container start.)
- For mounted files: the kubelet updates the files within ~minutes. The app needs to re-read them (or use a config watcher).
So always restart Deployments after config changes:
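For example:
kubectl rollout restart deployment/<name>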
That triggers a rolling restart that picks up new env values. Pair every ConfigMap/Secret update with a rollout restart for the apps that use them.
Exercise¶
- Apply the full example above (three resources in one file). Check kubectl get cm,secret,deploy.
- Inspect the running pod's environment. Should show your config + secret values.
- Update the ConfigMap: edit LOG_LEVEL: info. Apply. Still shows the old value. Restart: now shows the new value.
- Mount as file: edit the Deployment to mount the ConfigMap as /etc/config/ instead of env. Apply, restart, exec, ls /etc/config/. Each key is a file.
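One way to run the inspect/update steps (kubectl exec accepts deploy/<name> and picks a pod for you; app.yaml is whatever you saved the example as):
kubectl exec deploy/app -- env | grep -E 'LOG_LEVEL|DATABASE_PASSWORD'
kubectl apply -f app.yaml                         # after editing LOG_LEVEL
kubectl exec deploy/app -- env | grep LOG_LEVEL   # still the old value
kubectl rollout restart deployment/app
kubectl exec deploy/app -- env | grep LOG_LEVEL   # new value once the new pods are up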
What you might wonder¶
"Can I have multiple ConfigMaps for one app?"
Yes - list multiple configMapRef or secretRef under envFrom. Useful for splitting "app config" from "feature flags."
"What about typed secrets (TLS, dockerconfigjson)?"
Kubernetes has specific types: kubernetes.io/tls for TLS certs, kubernetes.io/dockerconfigjson for image-pull credentials. The type: field distinguishes. Functionally still key-value; the types signal intent and let tools handle them specially.
"Sealed Secrets? External Secrets?"
Sealed Secrets (bitnami-labs/sealed-secrets) encrypts a Secret to a public key so it's safe to commit to git. External Secrets Operator pulls from external secret managers and creates K8s Secrets dynamically. Both for production use.
Done¶
- Create ConfigMaps from YAML, files, or literals.
- Create Secrets (using stringData).
- Inject as env vars (env, envFrom).
- Mount as files (volumeMounts + volumes).
- Understand the "restart on config change" pattern.
- Know Secrets aren't really secret without extra work.
06 - Namespaces¶
What this session is¶
About 20 minutes. Namespaces - Kubernetes' way to group resources. Used for environment isolation (dev/staging/prod in one cluster), team boundaries, and avoiding name collisions.
What a namespace is¶
A virtual partition within a cluster. Resources (pods, services, etc.) live in a namespace; they only see other resources in their own namespace by default.
By default everything lives in the default namespace. Production clusters use namespaces for organization.
List namespaces¶
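kubectl get namespaces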
A fresh cluster has:
- default - your default working namespace.
- kube-system - Kubernetes' own internals (DNS, kube-proxy, etc.). Don't touch.
- kube-public - publicly readable data. Rarely used.
- kube-node-lease - node heartbeats. Don't touch.
Create a namespace¶
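For example:
kubectl create namespace dev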
Or in YAML:
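apiVersion: v1
kind: Namespace
metadata:
  name: dev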
Run things in a namespace¶
Two ways: per-command, or set context.
Per-command:
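For example (app.yaml standing in for any manifest):
kubectl get pods -n dev
kubectl apply -f app.yaml -n dev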
Set context (so subsequent commands use it without -n):
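kubectl config set-context --current --namespace=dev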
To see your current namespace:
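One way:
kubectl config view --minify -o jsonpath='{..namespace}'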
The kubens tool (part of kubectx) makes this fast: kubens dev switches; kubens lists. Install: brew install kubectx.
Cross-namespace DNS¶
When pods talk to services in another namespace, use the full DNS name:
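For example, reaching the nginx service in default from a pod in another namespace:
wget -qO- http://nginx.default.svc.cluster.local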
Within the same namespace, the short name works: http://nginx.
Resource quotas (briefly)¶
A ResourceQuota limits how much a namespace can consume:
apiVersion: v1
kind: ResourceQuota
metadata:
name: dev-quota
namespace: dev
spec:
hard:
requests.cpu: "4"
requests.memory: 8Gi
limits.cpu: "8"
limits.memory: 16Gi
pods: "20"
Apply to enforce. Useful for multi-team clusters where one team shouldn't starve others.
When to use namespaces¶
Real patterns:
- Per-environment: dev, staging, prod namespaces in one cluster (small orgs). Bigger orgs use separate clusters per env.
- Per-team: team-platform, team-ml, team-frontend.
- Per-tenant: SaaS apps that need data isolation per customer.
- Per-application bundle: monitoring, logging, ingress-nginx.
A typical pattern: install a Helm chart into its own namespace (helm install grafana grafana/grafana --namespace monitoring --create-namespace).
Some things are NOT namespaced¶
A few resources are cluster-wide:
- Namespace itself (obviously).
- Node.
- PersistentVolume.
- ClusterRole, ClusterRoleBinding.
- CustomResourceDefinition.
You can check whether a resource type is namespaced:
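For example:
kubectl api-resources --namespaced=false   # cluster-scoped types
kubectl api-resources --namespaced=true    # namespaced types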
Exercise¶
- Create a namespace and run a Deployment in it.
- List across all namespaces.
- Switch context to dev. Switch back: kubectl config set-context --current --namespace=default.
- Cross-namespace DNS test:
  - In default, apply the nginx Deployment + Service (page 03/04).
  - In dev, run a debug pod.
- Cleanup.
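One possible run-through (assumes the nginx Service from page 04 exists in default):
kubectl create namespace dev
kubectl create deployment nginx --image=nginx -n dev
kubectl get pods -A
kubectl config set-context --current --namespace=dev
kubectl run debug --rm -it --image=alpine -- wget -qO- http://nginx.default.svc.cluster.local
kubectl config set-context --current --namespace=default
kubectl delete namespace dev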
Deleting a namespace cascades to everything inside it. Powerful and dangerous.
What you might wonder¶
"Are namespaces a security boundary?" Soft yes, hard no. By default, pods in different namespaces can still reach each other over the network. RBAC limits who can edit what across namespaces. For real isolation (untrusted workloads), separate clusters.
"What about kube-system?"
Hands-off. Kubernetes' own components live there. Modifying them can break the cluster. Read-only.
"Should I namespace everything?"
For small projects, default is fine. For anything beyond ~10 services or multi-environment in one cluster, use namespaces.
Done¶
- Create and switch namespaces.
- Run resources in specific namespaces.
- Use cross-namespace DNS.
- Know which resources are namespaced.
07 - Labels and Selectors¶
What this session is¶
About 30 minutes. Labels are how Kubernetes finds things. Almost everything in K8s uses them - Services find pods by label, Deployments manage their pods by label, monitoring scrapes pods by label.
Labels¶
Labels are key=value pairs on resources, set in metadata:
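For example:
metadata:
  labels:
    app: nginx
    tier: frontend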
Add or change labels on a running resource:
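For example:
kubectl label pod nginx env=dev
kubectl label pod nginx env=prod --overwrite   # change an existing label
kubectl label pod nginx env-                   # remove the label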
Selectors¶
Selectors query by label. Two forms:
Equality:
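kubectl get pods -l app=nginx
kubectl get pods -l app=nginx,tier=frontend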
Set-based:
kubectl get pods -l 'app in (nginx,redis)'
kubectl get pods -l 'env notin (prod)'
kubectl get pods -l 'tier' # has the label, any value
kubectl get pods -l '!debug' # doesn't have the label
Use whichever fits. Equality is shorter; set-based is more flexible.
How K8s components use labels¶
Most controllers use selectors:
Service finds pods to route to:
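From page 04:
selector:
  app: nginx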
Deployment manages pods matching its template:
kind: Deployment
spec:
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx # MUST match the selector
NetworkPolicy allows/denies traffic to pods matching:
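The relevant fragment of a NetworkPolicy spec looks like:
podSelector:
  matchLabels:
    app: nginx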
HorizontalPodAutoscaler finds pods to scale.
PodDisruptionBudget protects pods from voluntary disruption.
It's the universal "how do I find these things" mechanism. Master it.
Common label conventions¶
These conventions are widely followed. Use them in your own YAML:
| Label | Meaning |
|---|---|
| app.kubernetes.io/name | The application's name (e.g. nginx, redis). |
| app.kubernetes.io/instance | A specific install (e.g. nginx-prod). |
| app.kubernetes.io/version | App version. |
| app.kubernetes.io/component | Role (e.g. database, frontend). |
| app.kubernetes.io/part-of | Higher-level app (e.g. wordpress). |
| app.kubernetes.io/managed-by | Tool managing this (e.g. helm). |
A pod from a typical Helm chart will have all of these - useful for filtering ("show me all pods that are part of the wordpress app").
Inside a Deployment template, app: <name> (short form) is widely used and that's also fine for most cases.
Annotations vs labels¶
Labels are for selection. Short, indexed, queryable, limited in size.
Annotations are arbitrary metadata. Free-form, not queryable. Used for: build info, prometheus configs, ingress rules ("rewrite this path"), etc.
metadata:
labels:
app: nginx # for selection
annotations:
description: "the main web frontend"
deployed-by: "ci-build-#1234"
nginx.ingress.kubernetes.io/rewrite-target: "/"
You set annotations the same way (kubectl annotate ...). They don't drive controller behavior the way labels do.
Real-world: filter by labels¶
Find everything tagged with app=nginx, repeat across all namespaces, show the labels column, or show a specific label as its own column:
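For example:
kubectl get all -l app=nginx
kubectl get pods -l app=nginx -A
kubectl get pods --show-labels
kubectl get pods -L app,tier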
These are daily-use commands. The --show-labels flag in particular is useful when debugging "why isn't my Service routing to my pod?" - answer is almost always "the labels don't match."
A common debugging pattern¶
"My Service has no endpoints." Almost always: the Service's selector doesn't match any pods.
kubectl describe service nginx
# Endpoints: <none> ← the smoking gun
kubectl get pods --show-labels
# you see pods with app=nginx-app, not app=nginx
Fix one or the other. Pods or the Service selector - make them match.
Exercise¶
- Apply the nginx Deployment + Service from page 04. Confirm endpoints exist.
- Filter by label.
- Break selection on purpose. Fix it.
- Add an annotation.
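One possible run-through (the patch values are just examples):
kubectl get endpoints nginx
kubectl get pods -l app=nginx --show-labels
kubectl patch service nginx -p '{"spec":{"selector":{"app":"wrong"}}}'   # endpoints go empty
kubectl patch service nginx -p '{"spec":{"selector":{"app":"nginx"}}}'   # endpoints come back
kubectl annotate pod <pod-name> description="testing annotations"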
What you might wonder¶
"What characters can be in label keys/values?"
Limited: alphanumeric, -, _, .. Length: ≤63 chars for value, key has a similar limit. If you need to store arbitrary text, use annotations instead.
"Why both labels and tags? (in some clouds)" Different worlds. Cloud tags (AWS, GCP) are on cloud resources. K8s labels are on K8s resources. Some tools sync between them; don't confuse the two.
"What's a 'selector' in a NetworkPolicy?" Same idea - pick pods by labels. NetworkPolicies allow/deny traffic between pods matching the selector. Beyond beginner; you'll meet them eventually.
Done¶
- Add labels to resources.
- Query with selectors (equality and set-based).
- Recognize the app.kubernetes.io/* convention.
- Distinguish labels (selectable) from annotations (free-form metadata).
- Debug "no endpoints" issues by checking labels.
08 - Volumes and Storage¶
What this session is¶
About 45 minutes. How pods persist data. PersistentVolumes (PV), PersistentVolumeClaims (PVC), StorageClasses - Kubernetes' way to abstract over the underlying storage (cloud disk, NFS, local disk).
Pod-local storage: ephemeral¶
Containers in a pod can share temporary storage via emptyDir:
spec:
containers:
- name: writer
image: alpine
command: ["sh", "-c", "echo hi > /shared/note && sleep 60"]
volumeMounts:
- name: shared
mountPath: /shared
- name: reader
image: alpine
command: ["sh", "-c", "sleep 10 && cat /shared/note && sleep 60"]
volumeMounts:
- name: shared
mountPath: /shared
volumes:
- name: shared
emptyDir: {}
emptyDir lives as long as the pod. Pod dies → directory and contents are gone.
Useful for: in-memory scratch, cache between containers, log file written by one and shipped by another (sidecar).
Not useful for persistence across pod restarts.
Persistent storage: PV + PVC¶
For data that outlives pods, you need two resources:
- PersistentVolume (PV) - represents a real piece of storage (a cloud disk, an NFS export, a local-disk path). Cluster-scoped.
- PersistentVolumeClaim (PVC) - a request for storage of some size/access mode. Namespaced.
Kubernetes binds a PVC to an appropriate PV. The pod mounts the PVC.
Easier: dynamic provisioning¶
In most modern clusters, you don't create PVs manually. A StorageClass describes how to dynamically provision PVs on demand. Cloud clusters (EKS, GKE, AKS) come with a default StorageClass that creates cloud disks. Local clusters (minikube, kind) have a hostPath-based StorageClass.
You just write a PVC; the cluster creates a PV to satisfy it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
# storageClassName: standard # optional - uses default if omitted
Apply:
kubectl get pvc shows status. If Bound, you got storage. If Pending, no StorageClass exists or no PVs satisfy.
Use the PVC in a Pod¶
apiVersion: v1
kind: Pod
metadata:
name: db
spec:
containers:
- name: postgres
image: postgres:16
env:
- name: POSTGRES_PASSWORD
value: secret
volumeMounts:
- name: data
mountPath: /var/lib/postgresql/data
volumes:
- name: data
persistentVolumeClaim:
claimName: data
Apply. The pod mounts the PVC at /var/lib/postgresql/data. Data persists across pod deletes/restarts.
Delete the pod, recreate - Postgres still has its data.
Access modes¶
The PVC's accessModes constrains what kind of storage works:
- ReadWriteOnce (RWO) - one node can read+write. Most cloud disks support this. Default.
- ReadOnlyMany (ROX) - many nodes can read.
- ReadWriteMany (RWX) - many nodes can read+write. NFS supports this; cloud block storage usually doesn't.
Most apps want RWO. RWX is needed only for "many pods write to the same shared filesystem" cases.
StorageClass¶
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp3
iops: "3000"
reclaimPolicy: Delete # delete the volume when PVC is deleted
volumeBindingMode: WaitForFirstConsumer
The provisioner is cloud-specific. You usually don't write StorageClasses yourself - the cloud or admin sets them up.
List what's available:
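kubectl get storageclass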
The (default) annotation marks the one used when a PVC doesn't specify storageClassName.
Reclaim policy¶
When a PVC is deleted, what happens to the underlying PV?
- Delete - PV (and the cloud disk) deleted. Data gone.
- Retain - PV and disk kept. Admin reclaims manually. Safer for production.
Delete is the default for dynamically provisioned PVs. For data you can't lose, configure Retain and back up.
Resizing¶
Some StorageClasses support resizing (allowVolumeExpansion: true). Increase a PVC's resources.requests.storage and re-apply - the disk and filesystem grow. Beyond beginner; recognize.
StatefulSets (briefly)¶
A StatefulSet is like a Deployment but each pod has a stable name and its own PVC. For databases, message queues, anything that needs identity + storage per replica.
We're not going to cover StatefulSets in depth here. Recognize: when reading manifests, kind: StatefulSet means "pods with stable identities + per-pod storage." When you need to run Postgres or Cassandra on K8s, this is the pattern (usually via a Helm chart that wraps a StatefulSet).
A common pattern: stateful + stateless¶
A typical app:
- Stateless front-ends (web servers, API gateways) → Deployment, no PVCs.
- Stateful databases (Postgres) → StatefulSet, one PVC per replica.
The stateless half scales freely. The stateful half is restricted by storage.
Exercise¶
- Create a PVC.
- Use it in a Postgres pod.
- Delete the pod, recreate, verify data.
- Cleanup.
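One possible run-through (pvc.yaml / pod.yaml being the manifests above; the pod is named db):
kubectl apply -f pvc.yaml -f pod.yaml
kubectl exec -it db -- psql -U postgres -c 'CREATE TABLE t (x int); INSERT INTO t VALUES (1);'
kubectl delete pod db
kubectl apply -f pod.yaml
kubectl exec -it db -- psql -U postgres -c 'SELECT * FROM t;'   # the row survived
kubectl delete pod db && kubectl delete pvc data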
What you might wonder¶
"Where does my data actually live?"
On a cloud cluster: an EBS / persistent disk / Azure disk attached to the node running the pod. On local clusters: usually a hostPath under /var/lib/... on the node (your laptop).
"Can two pods share a PVC?" Only if the PVC's accessMode is ROX or RWX. For RWO (the common case), only one pod can mount at a time.
"How do I back up Kubernetes-managed data?" Two layers: the actual underlying storage (cloud snapshots), and the K8s metadata (PVC, PV definitions). Velero is the popular tool. Beyond beginner; mentioned for awareness.
"What's an emptyDir with medium: Memory?"
A tmpfs-backed emptyDir - lives in RAM, doesn't touch disk. Useful for secrets that shouldn't be persisted.
Done¶
- Use emptyDir for ephemeral pod-shared storage.
- Create a PVC for persistent storage.
- Mount a PVC in a pod.
- Recognize StorageClass as the dynamic-provisioning machinery.
- Distinguish RWO / ROX / RWX access modes.
- Know that StatefulSet is the pattern for per-replica storage.
09 - Ingress¶
What this session is¶
About 30 minutes. Ingress - Kubernetes' HTTP/HTTPS routing layer. Lets you route external traffic to multiple services based on hostname or URL path, terminate TLS, all without provisioning a LoadBalancer per service.
The problem¶
A LoadBalancer Service gets one external IP per service. With 10 services, that's 10 cloud LBs - expensive and unwieldy. You also have no way to do "host-based routing" (route api.example.com to service A, dashboard.example.com to service B) or "path-based routing" (example.com/api → A, example.com/web → B).
Ingress solves this: one entrypoint, many routes.
How it works¶
Ingress requires an Ingress Controller - an actual reverse proxy (nginx, Traefik, HAProxy, etc.) running in the cluster. The Ingress resource you write is configuration the controller reads.
Common controllers:
- ingress-nginx - official NGINX-based controller. Most common.
- traefik - modern, auto-discovery, also a popular default.
- istio / linkerd gateways - if you're running a service mesh.
You install one controller (cluster-wide), then write Ingress resources.
Install ingress-nginx (local cluster)¶
For minikube:
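minikube addons enable ingress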
For kind:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml
Wait until the controller pods are ready:
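For example:
kubectl wait --namespace ingress-nginx \
  --for=condition=ready pod \
  --selector=app.kubernetes.io/component=controller \
  --timeout=120s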
Your first Ingress¶
Assume you have two Deployments + Services: web and api.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: app-ingress
spec:
ingressClassName: nginx
rules:
- host: example.local
http:
paths:
- path: /api
pathType: Prefix
backend:
service:
name: api
port:
number: 80
- path: /
pathType: Prefix
backend:
service:
name: web
port:
number: 80
Apply. Add to /etc/hosts:
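Assuming the controller is reachable on localhost:
127.0.0.1 example.local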
(On minikube/kind you may need to use the controller's IP or port; check the controller's docs.)
Test:
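For example:
curl http://example.local/       # → web
curl http://example.local/api    # → api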
Host-based routing¶
Multiple hosts on the same Ingress:
rules:
- host: web.example.local
http:
paths:
- path: /
pathType: Prefix
backend: {service: {name: web, port: {number: 80}}}
- host: api.example.local
http:
paths:
- path: /
pathType: Prefix
backend: {service: {name: api, port: {number: 80}}}
Both hosts hit the same controller; the controller routes by Host: header.
TLS¶
Terminate HTTPS at the Ingress:
spec:
tls:
- hosts:
- example.local
secretName: example-tls # a TLS-type Secret with cert + key
rules:
- host: example.local
http:
paths: [...]
You provide the cert as a Secret:
apiVersion: v1
kind: Secret
metadata:
name: example-tls
type: kubernetes.io/tls
data:
tls.crt: <base64-encoded PEM>
tls.key: <base64-encoded PEM>
For real certs, use cert-manager (popular K8s add-on; auto-provisions Let's Encrypt certs via DNS or HTTP-01 challenges). Cert-manager + ingress-nginx + DNS = "automatic HTTPS for any new Ingress." Way beyond beginner; mentioned because every production setup uses it.
Path types¶
| Type | What it means |
|---|---|
| Prefix | URL starts with the path |
| Exact | URL matches exactly |
| ImplementationSpecific | Controller-defined (often "regex" for nginx) |
Prefix is the common choice.
Annotations: controller-specific behavior¶
Most controllers extend Ingress via annotations. For ingress-nginx:
metadata:
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /$2
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: 50m
Each annotation tweaks one behavior. The /api/foo → /foo rewrite, the redirect HTTP → HTTPS, raising body-size limits. Refer to the controller's docs.
The newer Gateway API (a separate, more structured resource family) is the long-term replacement for these annotations. It's stable but adoption is incremental. Recognize the name.
A typical app's Ingress¶
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: app
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "true"
cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
ingressClassName: nginx
tls:
- hosts: [api.example.com, www.example.com]
secretName: example-tls
rules:
- host: api.example.com
http:
paths:
- path: /
pathType: Prefix
backend: {service: {name: api, port: {number: 80}}}
- host: www.example.com
http:
paths:
- path: /
pathType: Prefix
backend: {service: {name: web, port: {number: 80}}}
That's a real-world shape: host-based routing, automatic TLS via cert-manager, HTTPS redirect.
Debugging¶
kubectl get ingress
kubectl describe ingress app
kubectl logs -n ingress-nginx <controller-pod> # see what the controller is doing
Usually issues are: wrong ingressClassName, no controller installed, DNS not pointing at the controller, backend service has no endpoints (page 04/07).
Exercise¶
- Install an ingress controller (your local cluster's chosen method).
- Apply two Deployments + Services (web and api, both nginx).
- Apply an Ingress that routes / to web and /api to api.
- Add example.local to /etc/hosts pointing at your cluster's ingress (often 127.0.0.1).
- Curl both paths, see different responses.
If you have time:
- Set up cert-manager and get a self-signed cert into a Secret.
- Add a TLS section to the Ingress; curl https://example.local.
What you might wonder¶
"Ingress vs LoadBalancer vs NodePort?" - NodePort: dev-only. - LoadBalancer: one external IP per service; cloud-only. - Ingress: one entrypoint, many routes. The right answer for HTTP/HTTPS in production.
"What about gRPC?" ingress-nginx supports HTTP/2 (which includes gRPC) but you need annotations to enable it. Or use a gateway controller that handles gRPC natively.
"What's a 'Gateway API'?" A successor design for ingress, with cleaner abstractions (Gateway, HTTPRoute, etc.). Stable as of recent K8s versions. Migration is gradual. Recognize.
Done¶
- Install an ingress controller.
- Write an Ingress resource.
- Use host- and path-based routing.
- Recognize how TLS termination works.
- Use annotations for controller-specific behavior.
10 - Helm¶
What this session is¶
About 45 minutes. Helm - Kubernetes' package manager. Lets you install complex applications (Postgres, Prometheus, cert-manager, ingress controllers, dozens of services) with one command, parameterized via values.
The problem¶
Some OSS applications are 20+ Kubernetes resources: a Deployment, a StatefulSet, several Services, ConfigMaps, Secrets, an Ingress, NetworkPolicies, ServiceAccount, ClusterRole, ClusterRoleBinding. Maintaining that as a stack of YAML files is painful.
A Helm chart packages them with templates and a single values file. Install with helm install. Upgrade with helm upgrade. Rollback with helm rollback. One command.
Install Helm¶
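For example:
brew install helm   # macOS
# or: curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash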
Verify:
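helm version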
Add a chart repository¶
Helm charts live in repositories (HTTP servers + index file). Add one:
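For example:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update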
Bitnami publishes high-quality charts for most popular OSS (PostgreSQL, Redis, RabbitMQ, MongoDB, etc.).
Other popular repos:
- https://prometheus-community.github.io/helm-charts
- https://grafana.github.io/helm-charts
- https://kubernetes.github.io/ingress-nginx
- https://charts.jetstack.io (cert-manager)
Search and install¶
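For example:
helm search repo postgresql
helm install mydb bitnami/postgresql --namespace db --create-namespace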
helm install arguments:
- mydb - your release name (the install's identity).
- bitnami/postgresql - <repo>/<chart>.
- --namespace db --create-namespace - install in the db namespace; create it if it doesn't exist.
Helm renders the chart's templates into Kubernetes manifests and applies them to the cluster (the equivalent of kubectl apply). Many resources are created - a Service, StatefulSet, Secret with the auto-generated password, ConfigMap, ServiceAccount, etc.
Inspect:
helm list -A # list releases across all namespaces
kubectl get all -n db # all the resources Helm created
helm get manifest mydb -n db # the actual rendered YAML
Configure via values¶
Every chart has a values.yaml with configurable defaults. Override:
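To see everything that's configurable:
helm show values bitnami/postgresql   # dumps the chart's full values.yaml (it's long)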
Create your own myvalues.yaml:
auth:
postgresPassword: my-secret-password
database: myapp
primary:
persistence:
size: 5Gi
resources:
requests:
memory: "256Mi"
cpu: "100m"
Install with your values:
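For example:
helm install mydb bitnami/postgresql -f myvalues.yaml -n db --create-namespace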
Or override individual values:
helm install mydb bitnami/postgresql \
--set auth.postgresPassword=secret \
--set auth.database=myapp \
-n db --create-namespace
-f file is cleaner for non-trivial config; --set for one-off tweaks.
Upgrade and rollback¶
After installation, update config with helm upgrade - Helm computes the diff, applies it, and tracks revisions. Roll back to a previous revision with helm rollback; uninstall (which removes everything Helm created) with helm uninstall.
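Assuming the mydb release and values file from above:
helm upgrade mydb bitnami/postgresql -f myvalues.yaml -n db   # apply changed values
helm history mydb -n db                                       # revision history
helm rollback mydb 1 -n db                                    # back to revision 1
helm uninstall mydb -n db                                     # remove everything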
A chart's structure¶
A Helm chart on disk:
mychart/
├── Chart.yaml (metadata: name, version, dependencies)
├── values.yaml (default values)
├── README.md
├── templates/
│ ├── deployment.yaml (with Go template placeholders)
│ ├── service.yaml
│ ├── ingress.yaml
│ ├── configmap.yaml
│ ├── _helpers.tpl (reusable template snippets)
│ └── NOTES.txt (printed after install)
└── charts/ (vendored dependency charts)
Templates use Go's templating language:
```yaml
# templates/deployment.yaml (excerpt)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
      - name: app
        image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
        resources:
          {{- toYaml .Values.resources | nindent 10 }}
```
{{ .Values.X }} reads from values.yaml. {{ .Release.Name }} is the install's name. The chart author writes templates; users pass values.
Reading a chart¶
When you encounter a chart in an OSS project:
- `Chart.yaml` - name, version, dependencies.
- `values.yaml` - what's configurable (with defaults).
- `templates/` - what resources get created.
- `README.md` - usage docs; often has tables of all values.
helm template renders the chart without installing - useful to see what it WOULD create:
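For example, rendering the Postgres release from earlier (names as used above):

```bash
helm template mydb bitnami/postgresql -f myvalues.yaml | less
```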
Write your own chart (brief)¶
`helm create mychart` generates a starter chart with sensible defaults: Deployment, Service, Ingress, optional ServiceAccount, HPA. Edit the templates and values.yaml for your app, then test-render it and install it from the local directory (sketch below).
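A minimal loop; the chart directory and release names are illustrative:

```bash
helm create mychart                  # scaffold a starter chart
helm template test ./mychart | less  # render without installing
helm install myapp ./mychart         # install from the local directory
helm uninstall myapp                 # clean up
```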
For a real first chart, start by adapting helm create's output. Many OSS projects ship a chart in charts/ or helm/ for their own deployment.
Helmfile / values per environment¶
Real apps have dev / staging / prod with different values. Patterns:
- Multiple values files: `values-dev.yaml`, `values-prod.yaml`; `helm install -f values-base.yaml -f values-prod.yaml` (later files take precedence).
- Helmfile - a tool that wraps Helm with a higher-level declarative spec.
- Argo CD / Flux - GitOps tools that watch your git repo and apply Helm releases. Beyond beginner; widely used.
Exercise¶
- Install Helm.
- Add the Bitnami repo: `helm repo add bitnami https://charts.bitnami.com/bitnami`.
- Install Postgres with custom values: `helm install mydb bitnami/postgresql -f myvals.yaml -n db --create-namespace`.
- Inspect rendered manifests: `helm get manifest mydb -n db`.
- Upgrade: change `myvals.yaml` (e.g. raise memory), then `helm upgrade mydb ... -n db`.
- Cleanup: `helm uninstall mydb -n db`.
What you might wonder¶
"Helm vs Kustomize?"
Two ways to manage K8s YAML.
- Helm: templates + values. Better for distributing reusable apps.
- Kustomize: patches over base YAML. Built into kubectl (kubectl apply -k .). Better for "I want to tweak existing YAML."
Many teams use both: Helm for third-party charts; Kustomize for their own apps.
"Is Helm 'production-grade'?" Yes. Used everywhere. Helm 3 (current) is solid. Helm 2 (deprecated) had a Tiller server that was a known security issue; H3 is client-side only.
"What's helm install actually doing?"
1. Fetches the chart (from a repo or local).
2. Resolves values (defaults + your overrides).
3. Renders templates.
4. Applies the rendered YAML to the cluster (the effect of kubectl apply; Helm talks to the API server directly).
5. Records the release in a Secret (or ConfigMap) in the target namespace.
That last step is how helm list works - it queries those Secrets to find releases.
Done¶
- Install Helm.
- Add chart repos.
- Install / upgrade / rollback / uninstall charts.
- Override values via files or `--set`.
- Read a chart's structure.
- Render templates without installing.
11 - kubectl Power Tools¶
What this session is¶
About 45 minutes. The kubectl commands you'll use to debug real clusters. Logs, exec, port-forward, top, events, and a few other essentials.
Logs¶
kubectl logs <pod> # all logs
kubectl logs -f <pod> # follow (like tail -f)
kubectl logs --tail 100 <pod> # last 100 lines
kubectl logs --since 10m <pod> # last 10 minutes
kubectl logs <pod> -c <container> # specific container (multi-container pod)
kubectl logs <pod> --previous # previous (crashed) container
Logs deployment-wide (across all matching pods):
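A sketch - the label selector and names are illustrative:

```bash
kubectl logs deploy/<name>                    # logs from one pod of the Deployment
kubectl logs -l app=myapp --prefix --tail=50  # all pods matching a label, prefixed with pod name
```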
For multiple pods at once, install stern (brew install stern):
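Basic usage (the query is a regex over pod names; names here are illustrative):

```bash
stern myapp -n default --tail 20   # follow logs from every pod whose name matches "myapp"
```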
stern is much nicer than kubectl's native multi-pod log handling. Indispensable.
Exec into a pod¶
kubectl exec -it <pod> -- sh # shell
kubectl exec <pod> -- ls /app # one-off
kubectl exec -it <pod> -c <container> -- bash
For Deployments (any matching pod):
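For example:

```bash
kubectl exec -it deploy/<name> -- sh   # kubectl picks one of the Deployment's pods
```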
Use exec to: inspect environment vars (env), check filesystem state, run database client (psql, redis-cli), test connectivity (nc -zv host port).
Port-forward¶
kubectl port-forward pod/<name> 8080:80
kubectl port-forward svc/<name> 8080:80
kubectl port-forward deploy/<name> 8080:80
Tunnel from your laptop into the cluster. Use for debugging - don't expose production this way.
Forward to a Service: kubectl picks a healthy backing pod.
Top¶
Requires metrics-server installed:
kubectl top nodes
kubectl top pods
kubectl top pods -A
kubectl top pods --sort-by=cpu
kubectl top pods --sort-by=memory
If you see `error: Metrics API not available`, install metrics-server - most local clusters need it added explicitly (e.g. `minikube addons enable metrics-server`).
Events¶
The cluster's event log:
kubectl get events -A
kubectl get events --sort-by='.lastTimestamp'
kubectl get events --field-selector type=Warning
For a specific resource:
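One way, filtering the event stream by the object it refers to (pod name illustrative):

```bash
kubectl get events --field-selector involvedObject.name=<pod-name>
```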
When debugging "why is this pod stuck," describe's Events section is usually the answer.
Watch¶
-w updates as things change. Useful for "watch a rolling update happen."
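For example:

```bash
kubectl get pods -w          # print a new line whenever a pod changes
kubectl get deployments -w
```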
Describe¶
kubectl describe pod <name>
kubectl describe deployment <name>
kubectl describe service <name>
kubectl describe ingress <name>
Shows full configuration AND the relevant events. The most useful single command for debugging.
Edit¶
Powerful but discouraged for production - your changes aren't in version control. Use for quick experiments; for real changes, edit the YAML file and apply.
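For example:

```bash
kubectl edit deployment <name>   # opens the live object in $EDITOR; saving applies it
```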
Patch¶
For surgical updates without full edit:
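A sketch using a strategic merge patch:

```bash
kubectl patch deployment <name> -p '{"spec":{"replicas":3}}'
```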
JSON or YAML format. Useful in scripts. Rarely needed interactively.
Diff¶
Shows what would change if you applied. Great for "did I update this correctly?"
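For example:

```bash
kubectl diff -f deployment.yaml   # shows a unified diff; exits non-zero if anything would change
```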
Get with output formatting¶
kubectl get pods -o wide # extra columns (node, IP)
kubectl get pods -o yaml # full YAML
kubectl get pods -o json # full JSON (pipe to jq)
kubectl get pods -o name # just names (great in shell loops)
kubectl get pods -o jsonpath='{.items[*].metadata.name}' # custom field
kubectl get pods -o custom-columns='NAME:.metadata.name,IP:.status.podIP'
jsonpath is finicky but powerful. The custom-columns form is more readable.
Useful third-party kubectl helpers¶
Worth installing:
- `kubectx` / `kubens` - switch contexts / namespaces fast.
- `stern` - tail logs across many pods.
- `k9s` - terminal UI for K8s. Like `top` but interactive and beautiful.
- `kubectl-tree` - show resource hierarchy (Deployment → ReplicaSet → Pods).
- `kubectl-neat` - strip generated fields from YAML for readability.
- `stretchr/k` - lots of aliases and helpers. Or just `alias k=kubectl` yourself.
k9s in particular is worth installing on day one. Run `k9s`, then arrow keys + Enter to navigate. Logs are one keypress away; exec is one keypress away. Many people who operate clusters day to day effectively live in k9s.
Real debugging workflow¶
A pod is failing. What I'd actually do:
1. `kubectl get pods` - status?
2. `kubectl describe pod <name>` - Events at the bottom usually say why.
3. `kubectl logs <name>` / `kubectl logs <name> --previous` - what did it print before failing?
4. `kubectl exec -it <name> -- sh` - shell in (if it's running long enough). Check env vars, file paths, network reachability.
5. `kubectl get events --sort-by='.lastTimestamp' --all-namespaces` - broader picture.
6. `stern <name>` - if it restarts repeatedly, follow all incarnations.
7. Check related resources: ConfigMap, Secret, Service, PVC. Often the pod's fine; a dependency is wrong.
Exercise¶
- Install `stern` and `k9s`: `brew install stern k9s`, or your distro's equivalent.
- Exec into a running pod: `kubectl exec -it <pod> -- sh`.
- Watch a rolling update: `kubectl get pods -w` while a Deployment rolls out a new version. See pods come and go.
- Use stern: `stern <deployment-name>` to follow logs across all of its pods.
- Launch k9s: `k9s`. Arrow keys, Enter to drill in, `l` for logs, `s` for shell, `:` for command, `q` to quit. Spend 10 minutes wandering.
- Resource sort: `kubectl top pods --sort-by=memory`. Which pods use the most memory?
What you might wonder¶
"How do I know which container in a multi-container pod my logs are from?"
If the pod has more than one container, kubectl either uses the pod's default-container annotation or asks you to pick one. Specify with -c <name>, or use --all-containers for all of them.
"What's kubectl explain?"
Built-in docs: kubectl explain pod.spec.containers shows what fields exist. Useful when you forget a YAML field name.
"What's a kubeconfig?"
~/.kube/config - file kubectl reads to find clusters and credentials. Multiple clusters live here; kubectl config use-context <name> switches. kubectx makes this easier.
Done¶
- Read logs (with follow, previous, by selector).
- Exec into running containers.
- Port-forward for local access.
- Inspect with `get`, `describe`, `top`, `events`.
- Use third-party helpers (`stern`, `k9s`).
- Apply a real debugging workflow.
Next: Reading other people's manifests →
12 - Reading Other People's Manifests¶
What this session is¶
About 30 minutes. The strategy for reading a real-world Helm chart or set of K8s manifests - without trying to memorize everything at once.
The five-minute orientation¶
For any Kubernetes-deployed project:
- Read the project's README. What does it do? How is it deployed?
- Find the deployment artifacts. Common locations: `deploy/`, `kubernetes/`, `manifests/`, `helm/`, `charts/`, `k8s/`.
- Identify the deployment style:
  - Plain YAML files in `manifests/` - apply with `kubectl apply -f manifests/`.
  - Kustomize: look for `kustomization.yaml` files.
  - Helm: look for `Chart.yaml` + `values.yaml`.
  - Operator: look for CRDs and a controller pod.
- Find a values file or example. The project's own dev environment is often the most realistic example to study.
- Render to YAML if it's not already: `helm template chart/ -f values.yaml` or `kubectl kustomize manifests/overlays/dev/`.
- Read the rendered output top to bottom, in this order: Deployments → Services → ConfigMaps/Secrets → Ingress → others.
After this orientation, you should be able to write a paragraph: "This project deploys A, B, and C, connected by a Service named D, exposed externally via Ingress E."
Reading a Helm chart¶
Open Chart.yaml:
```yaml
apiVersion: v2
name: myapp
version: 1.2.0
appVersion: "2.5.0"
dependencies:
  - name: postgresql
    version: "11.6.0"
    repository: https://charts.bitnami.com/bitnami
```
You learn: it's an app called myapp, with Postgres as a sub-chart.
Open values.yaml. The defaults plus the structure of what users can override. Skim it; come back when you need to tune something.
Open templates/. Each file is a (Go-templated) Kubernetes manifest. Common ones:
- deployment.yaml
- service.yaml
- ingress.yaml
- configmap.yaml
- secret.yaml
- serviceaccount.yaml
- _helpers.tpl (reusable snippets)
Read deployment.yaml first - that's where the main app spec is. Templated bits like {{ .Values.image.repository }} come from values.yaml.
Render to see what's actually produced:
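For example (release name illustrative):

```bash
helm template myrelease . -f values.yaml > rendered.yaml
```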
Reading rendered YAML is often easier than reading templates.
Common patterns you'll see¶
A Deployment + Service per service. The fundamental pattern.
An Ingress at the front. Often one Ingress with multiple host- or path-based routes.
A ConfigMap for application config. Often mounted as files or injected as env.
A Secret for credentials. Sometimes generated by the chart (random passwords); sometimes you provide.
A StatefulSet + Headless Service for stateful workloads (databases, message brokers).
A Job or CronJob for migrations / scheduled tasks. Run-once or run-periodically pods.
HorizontalPodAutoscaler for stateless services. Scales based on CPU/memory/custom metrics.
PodDisruptionBudget for important services. Limits voluntary disruption during node drains.
NetworkPolicy for zero-trust networking. Allow/deny rules for pod-to-pod traffic.
A CustomResource (CRD instance) for operator-managed apps. E.g., Prometheus is a CR for the Prometheus Operator.
You don't need to know every detail. Recognize the type; understand what role it plays; look up specifics when needed.
CRDs and Operators¶
A CustomResourceDefinition (CRD) lets you add new resource types to Kubernetes. An Operator is code (usually a pod running in the cluster) that watches a CRD and reconciles real-world state to match.
Examples:
- cert-manager defines Certificate. The cert-manager operator watches them, provisions actual TLS certs via Let's Encrypt, stores them as Secrets.
- Prometheus Operator defines Prometheus, ServiceMonitor. The operator watches them and configures Prometheus instances.
- Postgres Operator (Zalando) defines postgresql. The operator manages a Postgres HA cluster.
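What a CR instance looks like on disk - a minimal cert-manager Certificate; the names and issuer are illustrative:

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-tls
spec:
  secretName: example-tls        # the operator writes the issued cert into this Secret
  dnsNames:
    - example.com
  issuerRef:
    name: letsencrypt-prod       # must match an Issuer/ClusterIssuer in the cluster
    kind: ClusterIssuer
```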
When you see kubectl get <something-unusual>, you're likely looking at a CRD installed by an operator.
kubectl get crd # list all CRDs in the cluster
kubectl api-resources # all resource types, including CRDs
Things that look scary¶
- A wall of `{{ .Values.xxx }}` in a Helm template - those are placeholders; render to see actual values.
- Long `nodeSelector` / `tolerations` / `affinity` blocks - scheduling rules. You can usually ignore; they affect placement, not behavior.
- `initContainers` - run BEFORE the main containers in a pod. Used for setup (waiting for the DB, fetching configs).
- `hostPath`, `hostNetwork: true`, `privileged: true` - bad signs in untrusted contexts; required for some system pods (network plugins, log collectors). Recognize.
- Long `securityContext` blocks - security hardening. Read what's restricted; it's usually self-explanatory.
- `spec.template.spec.containers[0].resources.limits.ephemeral-storage` - controls pod-local disk usage. Rare but real.
Exercise¶
Pick a real Kubernetes-deployed OSS project. Suggestions:
- `grafana/grafana` - Grafana itself, plus its Helm chart. Well-documented.
- `prometheus-community/helm-charts` - Prometheus and friends.
- `cert-manager/cert-manager` - TLS automation.
- `bitnami/charts` - wide variety of well-organized charts.
Pick one's Helm chart. Apply orientation:
- README - what does it do?
- `Chart.yaml` - what version + dependencies?
- `values.yaml` - what's configurable?
- `templates/` - what resources?
- Render with your own simple values: `helm template test . -f myvalues.yaml`.
- Read the rendered output. Write a paragraph summarizing what gets deployed.
What you might wonder¶
"What if the project uses Kustomize instead of Helm?"
Same idea, different mechanism. kustomization.yaml lists base resources + patches. kubectl kustomize <dir> renders. Read the base first, then the overlays.
"What if the project uses Jsonnet, CUE, or Pulumi?" Less common but exists. Same approach: render to YAML, read the YAML.
"What if there are CRDs I don't recognize?"
kubectl explain <crd> shows the schema. The CRD's project has docs. Read.
Done¶
- Apply five-minute orientation to a K8s OSS project.
- Read Helm chart structure (`Chart.yaml`, `values.yaml`, `templates/`).
- Render charts to readable YAML.
- Recognize common resource patterns.
- Know CRDs / Operators exist.
13 - Picking a Project¶
What this session is¶
About 30 minutes plus browsing. What "Kubernetes-adjacent" OSS looks like and how to evaluate one.
What kinds of projects accept first contributions¶
With Kubernetes skills (without needing deep Go programming), you can contribute to:
- Helm charts - fixes, defaults, docs, examples.
- Kustomize bases / overlays - same idea.
- kubectl plugins - small CLIs that extend kubectl.
- Operators (read-only) - docs, sample manifests, bug reports.
- Documentation - k8s.io itself, controller docs, chart docs.
- GitHub Actions for K8s - useful tooling.
- CRD examples / starter manifests.
For deep code contributions (operators, controllers, kubectl itself), you'll need Go fluency - see Go from scratch.
10-minute evaluation¶
Same as other beginner paths:
| Signal | Target |
|---|---|
| Stars | 100-50000 |
| Last commit | Within a month |
| Open PRs | Some, not 200+ |
| Recent PR merge time | Under 14 days |
| `good first issue` count | ≥5 |
| CONTRIBUTING.md exists | yes |
| Local install works (helm install or kubectl apply) | yes |
Candidates¶
Tier 1 - Helm charts (small surface)¶
- `bitnami/charts` - Bitnami's chart collection. Each chart is a separate component. Excellent labels, very responsive.
- `prometheus-community/helm-charts` - Prometheus ecosystem.
- `grafana/helm-charts` - Grafana, Loki, Tempo, Mimir.
- `cert-manager/cert-manager` - TLS automation. Has its own chart + docs.
These accept Helm-only PRs - chart value defaults, missing values examples, README clarifications.
Tier 2 - kubectl plugins and small tools¶
- `derailed/k9s` - terminal UI for K8s. Active, Go.
- `stern/stern` - multi-pod log tail. Go.
- `ahmetb/kubectx` - kubectx/kubens. Small Go.
- kubectl-plugin-list - search for kubectl plugins; many small ones exist.
- `vmware-tanzu/velero` - K8s backup/restore tool.
These often have Go code; some have shell or YAML-only changes available too.
Tier 3 - bigger projects, with K8s focus¶
- `argoproj/argo-cd` - GitOps for K8s. Big project, well-organized.
- `fluxcd/flux2` - alternative GitOps tool.
- `kubernetes-sigs/...` - many smaller projects under the kubernetes-sigs org. Each is more focused than the main `kubernetes/kubernetes` repo.
Tier 4 - don't start here¶
- `kubernetes/kubernetes` itself - huge, slow review, CLA required.
- `prometheus/prometheus` - popular but big.
Finding issues¶
Project's Issues → Labels. Filter by:
- good first issue
- help wanted
- documentation
- helm (if Helm-specific)
Comment to claim. Wait for maintainer.
What counts¶
Real contributions for someone with K8s skills:
- Fix a chart value default that's broken on a specific platform (ARM, EKS, etc.).
- Add a missing example to a chart's README.
- Improve the documentation for a CRD field.
- Add a missing `app.kubernetes.io/version` label.
- Fix a broken example in the docs.
- Add a kubectl plugin to a curated list.
- Add a missing test for a Helm template helper.
- Improve an Ingress example to include TLS and HTTPS redirect.
All real. All count.
Exercise¶
- Browse three Tier 1 / Tier 2 projects.
- 10-minute evaluation.
- Pick the most responsive one with `good first issue`s.
- Read CONTRIBUTING.md.
- Clone it. If it's a Helm chart, try installing it on your local cluster (e.g. `helm install test <chart-dir>`).
- Browse `good first issue` tickets; pick two candidates. Don't claim yet.
What you might wonder¶
"What if I'm intimidated by the Operator/controller projects?" Skip them. There's plenty of work that's pure YAML / Helm / docs. Operators are Go code with deep K8s API knowledge - a separate skill set.
"What's a CNCF project?" The Cloud Native Computing Foundation hosts many of the most-used K8s ecosystem projects (Prometheus, Envoy, Helm, Argo, Linkerd, etc.). CNCF projects tend to have well-defined governance and contribution processes. Browse cncf.io/projects.
Done¶
- Articulate K8s-OSS contribution shapes.
- Run a 10-minute evaluation.
- Have specific candidate projects.
Next: Anatomy of a K8s-related OSS project →
14 - Anatomy of a K8s-Related OSS Project¶
What this session is¶
About 30 minutes. Walk through the typical layout of a K8s-related OSS project - Helm charts, operators, kubectl plugins.
Typical Helm chart repo¶
my-app-chart/
├── README.md
├── LICENSE
├── CONTRIBUTING.md
├── Chart.yaml (chart metadata)
├── values.yaml (default values)
├── values.schema.json (JSON schema for values, optional)
├── templates/
│ ├── _helpers.tpl
│ ├── deployment.yaml
│ ├── service.yaml
│ ├── ingress.yaml
│ ├── configmap.yaml
│ ├── secret.yaml
│ ├── serviceaccount.yaml
│ ├── hpa.yaml
│ ├── tests/
│ │ └── test-connection.yaml
│ └── NOTES.txt
├── ci/ (test values files)
│ ├── ci-values.yaml
│ └── lint-values.yaml
├── .helmignore (like .gitignore but for chart packaging)
└── .github/workflows/
├── lint.yml (helm lint, chart-testing)
└── release.yml (publish to a chart repo)
Roles:
- Chart.yaml - metadata. Version. Dependencies.
- values.yaml - what users can configure. The README usually has a table of all values.
- values.schema.json - JSON schema; helm install validates against it.
- templates/ - Go-templated manifests. _helpers.tpl is shared snippets.
- templates/tests/ - Helm tests. Run with helm test <release>. Each test is a pod that runs an assertion (e.g., curl the service to ensure it works).
- ci/ - different value combos for CI testing.
- .helmignore - files NOT included when packaging the chart.
Typical operator repo (Go-based)¶
my-operator/
├── README.md
├── LICENSE
├── CONTRIBUTING.md
├── Makefile (build, test, deploy targets)
├── go.mod / go.sum
├── PROJECT (Kubebuilder/operator-sdk project file)
├── api/
│ └── v1alpha1/
│ ├── mything_types.go (Go struct for the CRD)
│ ├── zz_generated.deepcopy.go (generated code)
├── controllers/
│ └── mything_controller.go (the reconcile loop)
├── config/
│ ├── crd/bases/ (generated CRD YAMLs)
│ ├── default/ (default deployment manifests)
│ ├── manager/ (operator pod deployment)
│ ├── rbac/ (ServiceAccount + Role + Binding)
│ └── samples/ (sample CR instances)
├── hack/ (helper scripts)
└── .github/workflows/
Operators are usually built with Kubebuilder or Operator SDK. The above is the Kubebuilder layout.
You wouldn't dive into operator Go code as a first contribution. Likely contributions: improve config/samples/ examples, fix typos in README.md, update RBAC in config/rbac/, improve test fixtures.
Typical kubectl plugin¶
kubectl-myplugin/
├── README.md
├── LICENSE
├── main.go (entry point, single binary)
├── pkg/
│ └── cmd/
│ └── ... (subcommands)
├── go.mod
├── Makefile
├── .github/workflows/
│ └── release.yml (build binaries for multiple platforms)
└── docs/
kubectl plugins are standalone binaries named kubectl-<plugin>. Put one in your PATH and kubectl <plugin> works. Many plugins are small (a few hundred lines of Go).
Contributions: subcommand improvements, docs, bug fixes.
CI: what your PR will be measured against¶
For Helm charts, common CI:
```yaml
- name: Run chart-testing (lint)
  run: ct lint --config ct.yaml
- name: Run chart-testing (install)
  run: ct install --config ct.yaml
```
chart-testing is the standard Helm-CI tool. Replicate locally:
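The same commands work on your machine (the install step needs a running local cluster such as kind):

```bash
ct lint --config ct.yaml
ct install --config ct.yaml
```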
For operators, CI usually runs:
- make test (unit tests)
- make manifests (regenerate CRDs from Go types - should be no-op if you didn't change types)
- make docker-build (build the operator image)
For kubectl plugins, CI runs go test ./... and cross-compiles.
Conventions in CONTRIBUTING.md¶
- Setup. "How to run the chart locally," "how to build the operator binary."
- Tests. What `make test` or `ct install` checks.
- Commit format. Some projects use Conventional Commits. Many don't.
- PR template. Address every checkbox.
- DCO. CNCF projects often require `git commit -s` for the DCO sign-off.
- CLA. Some projects require a CLA signature via a bot.
Reading a chart's README¶
Real-world chart READMEs typically have:
- Overview.
- Prerequisites (K8s version, Helm version).
- Installation: the copy-paste `helm repo add` / `helm install` commands.
- Configuration values table - every settable value, default, description.
- Upgrade notes (breaking changes per version).
- Troubleshooting.
Read this before opening a PR. Many issues are answered there.
A worked walkthrough¶
Pick a Bitnami chart, say bitnami/postgresql. Apply orientation:
- Open `Chart.yaml` - `name: postgresql`, dependencies (none direct), version (likely double-digits - Bitnami iterates fast).
- Open `values.yaml` - heavily commented. The structure shows: `image:`, `auth:`, `primary:`, `readReplicas:`, `metrics:`, etc.
- Look in `templates/`:
  - `primary/statefulset.yaml` (Postgres primary)
  - `primary/svc.yaml`
  - `read/statefulset.yaml` (read replicas if configured)
  - `secrets.yaml`
  - `serviceaccount.yaml`
  - `metrics/`
- Read the primary StatefulSet template. Note how it pulls everything from `.Values.primary.*`.
- Open `templates/NOTES.txt`. That's what users see after install - "your password is X, connect with Y."
Five minutes; you have a mental map of one of the most-deployed Postgres charts.
Exercise¶
Use the project you picked in page 13:
- Clone locally.
- Walk the layout. Map files to categories.
- Read CONTRIBUTING.md.
- Find the CI workflow. List the commands.
- Run them locally:
  - For Helm: `helm lint .` and maybe `ct lint`.
  - For operator: `make test`.
- Identify the file your issue would touch.
What you might wonder¶
"What's app.kubernetes.io/managed-by: Helm?"
A label every Helm-installed resource gets, indicating who manages it. Useful for "show me all Helm-managed things."
"What's release.toolkit.fluxcd.io/...?"
Annotations the Flux GitOps operator uses to track resources. You'll see them in Flux-managed clusters.
Done¶
- Recognize Helm chart layout.
- Recognize operator (Kubebuilder) layout.
- Recognize kubectl plugin layout.
- Read CI workflows, run them locally.
- Read a chart README for configuration.
Next: Your first contribution →
15 - Your First Contribution¶
What this session is¶
The whole thing. Walk through making a real contribution to a real K8s-related OSS project, end-to-end.
The workflow¶
Identical pattern:
- Fork on GitHub.
- Clone your fork.
- Add upstream as remote.
- Branch off main.
- Set up: test the chart / operator installs cleanly on a fresh cluster.
- Change the file(s).
- Run lint + test locally (same commands CI runs).
- Push to your fork; open PR.
Step 1: Fork & clone¶
GitHub → Fork. Then:
git clone git@github.com:<you>/<project>.git
cd <project>
git remote add upstream git@github.com:<owner>/<project>.git
git fetch upstream
Step 2: Branch¶
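For example (the branch name is yours to choose):

```bash
git checkout -b docs/fix-readme-example
```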
Step 3: Set up¶
For a Helm chart:
helm lint .
helm template . | head -50
helm install test . -n test --create-namespace
kubectl get all -n test
helm uninstall test -n test
For an operator:
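Typically (exact targets vary by project - check the Makefile):

```bash
make test          # unit tests
make docker-build  # build the operator image
```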
For a kubectl plugin:
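Typically:

```bash
go build ./...
go test ./...
```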
If anything fails on a fresh clone, fix that first or ask in the issue.
Step 4: Make the change¶
The change should be:
- Small. 1-10 lines for a first PR.
- Focused. One issue per PR.
- Tested. Re-run the lint and test commands locally.
For Helm chart docs: edit README.md or comments in values.yaml. For chart template fixes: edit templates/<file>.yaml. For operator docs: usually README.md or docs/.
Step 5: Re-run CI's commands locally¶
Whatever the workflow runs:
helm lint .
ct lint --config ct.yaml
ct install --config ct.yaml # actually installs against a fresh kind cluster
For chart-testing's install command, you need kind (or any local cluster) running.
For operators:
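Usually (check the project's workflow for the exact targets):

```bash
make test
make manifests   # should produce no diff if you didn't change the Go types
```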
All green? Push. Red? Fix locally first.
Step 6: Commit and push¶
Commit, adding the sign-off if the project requires DCO (`git commit -s -m "..."`), then push the branch to your fork (`git push origin <branch>`). GitHub prints a URL to open the PR.
Step 7: Open the PR¶
On the upstream repo, "Compare & pull request."
- Title. Short, descriptive.
- Description. What changed, why, how tested. Reference issue:
Closes #123. - Checklist. Address every item.
Submit. CI runs. Fix anything red by pushing more commits.
Worked example: improving a Bitnami chart's README¶
Suppose you noticed bitnami/postgresql's chart README has an outdated example using apiVersion: v1beta1 for a long-deprecated resource. You'd:
git clone git@github.com:<you>/charts.git
cd charts
git remote add upstream git@github.com:bitnami/charts.git
git fetch upstream
git checkout -b docs/postgresql-update-apiversion
# Edit bitnami/postgresql/README.md, fix the apiVersion example.
# Lint:
cd bitnami/postgresql
helm lint .
cd -
# Commit (Bitnami uses DCO):
git add bitnami/postgresql/README.md
git commit -s -m "[bitnami/postgresql] Update outdated apiVersion in README"
git push origin docs/postgresql-update-apiversion
Open PR. Wait for review.
What review looks like¶
Standard:
1. "LGTM, merging." Done.
2. "Could you change these?" Address each. Push commits.
3. "Not what we want." Rare for first PRs.
4. Silence. Polite check-in after 1 week.
For CNCF projects, reviews are often more rigorous (multiple reviewers required, formal LGTM/Approve labels). Read the project's review guidelines.
After the merge¶
- Update fork's `main`.
- Delete branch.
- Take a screenshot.
- Sit with it.
After your first PR¶
- Pick another issue. Familiarity compounds.
- After 3-5 PRs, become a regular. Review others.
- Build your own kubectl plugin or Helm chart. Publish.
- Move toward operators (requires Go).
What you might wonder¶
"PR sits for weeks?" Polite check-in. CNCF projects can have multi-week review cycles by design (formal LGTM/Approve flow). Patience.
"What about Kubernetes itself?" A category of its own. CLA required, multiple reviewers, conformance tests. SIG-Docs is the on-ramp - documentation contributions are well-shepherded and a respected path into the project. Don't start there for a first OSS PR; build experience with smaller projects first.
"Maintainer rude?" Disengage. Try another project.
Done with this path¶
You've:
- Installed kubectl and a local cluster.
- Deployed pods, Deployments, Services.
- Managed config and secrets.
- Used PVCs for persistent storage.
- Set up Ingress for routing.
- Installed Helm charts.
- Debugged with kubectl power tools.
- Read a real K8s OSS project.
- Submitted a PR.
What you should do next: keep deploying things to your local cluster. Apply Helm charts of tools you actually use. Read their YAML. Familiarity compounds.
Recommended next paths on this site:
- Kubernetes Mastery (senior reference) - 24-week deep dive into control plane, kubelet/CRI, controllers, networking, day-2 ops.
- Container Internals (senior reference) - how the underlying containers actually work.
- Linux Kernel - the substrate.
- Go from Scratch - if you want to write operators or kubectl plugins.
Congratulations. You are no longer a beginner.