
Prelude: What Containers Actually Are

Sit with this document for an evening before week 1.


1. There Is No Such Thing As a "Container"

The kernel has no concept of a "container." There is no struct container in the kernel source and no container object under /proc. What people call a container is a bundle of kernel features applied together to a process:

  • One or more namespaces (PID, NET, MNT, UTS, IPC, USER, CGROUP, TIME) for isolation.
  • A cgroup (v2) for resource limits and accounting.
  • A rootfs mounted as the process's /, usually via pivot_root and an OverlayFS stack.
  • A seccomp filter restricting syscalls.
  • An LSM label (SELinux/AppArmor) restricting object access.
  • A capabilities mask restricting privilege.

A "container runtime" is a program that arranges these things from a specification (OCI runtime config), then execves the user's command. That's it. Docker is not the OS. Containers are not VMs. There is no hypervisor-equivalent.

If you internalize this, the rest of the curriculum is bookkeeping.
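
The bundle described above is visible from user space for any process. The following is a minimal sketch, assuming a Linux host with /proc mounted; the function names (namespaces, status_fields, cgroup_path) are hypothetical helpers written for this illustration, not part of any runtime. It reads the namespace identifiers, the capability masks and seccomp mode, and the cgroup v2 path of a process.

```python
import os
from pathlib import Path


def namespaces(pid="self", proc=Path("/proc")):
    """Map namespace name -> identifier string such as 'pid:[4026531836]'."""
    result = {}
    for entry in sorted((proc / str(pid) / "ns").iterdir()):
        try:
            result[entry.name] = os.readlink(entry)
        except OSError:
            result[entry.name] = "<unreadable>"
    return result


def status_fields(pid="self", proc=Path("/proc"),
                  wanted=("CapEff", "CapBnd", "Seccomp", "NoNewPrivs")):
    """Pick capability masks and the seccomp mode out of /proc/<pid>/status."""
    fields = {}
    for line in (proc / str(pid) / "status").read_text().splitlines():
        key, _, value = line.partition(":")
        if key in wanted:
            fields[key] = value.strip()
    return fields


def cgroup_path(pid="self", proc=Path("/proc")):
    """Return the cgroup v2 path (the '0::<path>' line), or None if absent."""
    for line in (proc / str(pid) / "cgroup").read_text().splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[0] == "0" and parts[1] == "":
            return parts[2]
    return None


if __name__ == "__main__":
    if not Path("/proc/self/ns").exists():
        print("no Linux /proc here; nothing to inspect")
    else:
        try:
            for name, ident in namespaces().items():
                print(f"ns      {name:20} {ident}")
            for key, value in status_fields().items():
                print(f"status  {key:20} {value}")
            print(f"cgroup  {cgroup_path()}")
        except OSError as exc:
            print(f"could not read /proc: {exc}")
```

Running it once on the host and once inside a container (for example, by passing the container's PID as seen from the host) should show different namespace identifiers, a narrower CapEff mask, and a Seccomp value of 2 when a filter is installed. Two processes share a namespace exactly when the identifier strings match.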


2. The OCI Layer Cake

The Open Container Initiative defines three specifications that everything in the ecosystem implements:

  1. Image Spec: what a container image is: a manifest, a config, and a stack of layer tarballs, all addressable by content (SHA-256).
  2. Runtime Spec: what a runtime configuration is: a config.json plus a rootfs directory (together, a bundle), and the lifecycle operations a runtime must support.
  3. Distribution Spec: what a registry is: an HTTP API for pushing and pulling content-addressed blobs and manifests.

docker pull (image + distribution), docker run (runtime), docker build (image), docker push (distribution): all four operations map onto an OCI spec. The build procedure itself is not specified, but its output is an OCI image. Once you can do them with runc, skopeo, and buildah, you understand the ecosystem.
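
"Addressable by content" can be made concrete with a short sketch. The helper names below (digest, descriptor, manifest, blob_path, manifest_path) are hypothetical and the blobs are placeholder bytes, not a real image; the media types and the /v2/ URL layout are the ones used by the Image and Distribution specs.

```python
import hashlib
import json

MANIFEST_TYPE = "application/vnd.oci.image.manifest.v1+json"
CONFIG_TYPE = "application/vnd.oci.image.config.v1+json"
LAYER_TYPE = "application/vnd.oci.image.layer.v1.tar+gzip"


def digest(blob: bytes) -> str:
    """Content address of a blob: algorithm prefix plus hex SHA-256."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def descriptor(media_type: str, blob: bytes) -> dict:
    """An OCI descriptor: a typed, sized pointer to a blob by digest."""
    return {"mediaType": media_type, "digest": digest(blob), "size": len(blob)}


def manifest(config_blob: bytes, layer_blobs: list) -> dict:
    """A minimal image manifest that references a config and ordered layers."""
    return {
        "schemaVersion": 2,
        "mediaType": MANIFEST_TYPE,
        "config": descriptor(CONFIG_TYPE, config_blob),
        "layers": [descriptor(LAYER_TYPE, blob) for blob in layer_blobs],
    }


def blob_path(repo: str, blob_digest: str) -> str:
    """Registry path for a blob, relative to the registry host."""
    return f"/v2/{repo}/blobs/{blob_digest}"


def manifest_path(repo: str, reference: str) -> str:
    """Registry path for a manifest; reference is a tag or a digest."""
    return f"/v2/{repo}/manifests/{reference}"


if __name__ == "__main__":
    # Placeholder bytes standing in for a real config and a real layer tarball.
    config_blob = json.dumps({"architecture": "amd64", "os": "linux"}).encode()
    layer_blob = b"placeholder bytes, not a real gzip-compressed tar"

    m = manifest(config_blob, [layer_blob])
    manifest_bytes = json.dumps(m, indent=2).encode()
    print(manifest_bytes.decode())
    # The manifest is itself a blob, so it has a digest too.
    print(manifest_path("example/app", digest(manifest_bytes)))
    print(blob_path("example/app", m["layers"][0]["digest"]))
```

Because a manifest names its config and layers by digest, and the manifest can itself be fetched by digest, changing a single byte anywhere in the image changes the top-level reference. That property is what pull-by-digest and image signing build on.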


3. The Tooling Map

| Concern | OCI-spec tool | Daemon-based equivalent |
|---|---|---|
| Build images | buildah | docker build |
| Pull / push / inspect images | skopeo | docker pull/push/inspect |
| Run containers (low-level) | runc / crun / youki | hidden under dockerd |
| Run containers (high-level) | podman, nerdctl | docker CLI |
| Container daemon | containerd, CRI-O | dockerd |

By month 4 you should never type docker again for any task in this curriculum, except to demonstrate equivalence with the daemon-based world.


4. Cost Model

A working container engineer reasons along five axes:

| Axis | Question |
|---|---|
| Image | What's in this image? What's its provenance, vulnerability surface, layer count, total size? |
| Runtime | What namespaces, cgroups, seccomp, capabilities, LSM are applied? |
| Filesystem | What's the storage driver (overlay2, fuse-overlayfs, btrfs)? Is the layered FS the bottleneck? |
| Network | What CNI? Bridge, macvlan, host-mode? What's the per-packet overhead? |
| Supply chain | Is it signed? Is the SBOM accurate? What's the SLSA provenance level? |

Beginner courses teach axis 1 only.


5. The Reading List

Primary

  • The OCI specs themselves (opencontainers/image-spec, opencontainers/runtime-spec, opencontainers/distribution-spec). Each is short; read all three before week 1 ends.
  • runc source (opencontainers/runc), particularly libcontainer/.
  • containerd architecture docs (containerd/containerd/docs/).
  • buildah and podman documentation.
  • Container Security (Liz Rice). Best concise text on the security model.

Secondary

  • Linux in Action (David Clinton): chapters 8–10 if you want a softer on-ramp.
  • The CNCF Cloud Native Glossary: terminology calibration.
  • Aleksa Sarai's blog (cyphar.com): runc maintainer; deep posts on rootless, user namespaces.

Adjacent (you must know)

  • The Linux curriculum's namespaces and cgroups chapters. If you skip them, this curriculum will not stick.


6. Curriculum Philosophy

  1. Spec first, tool second. Whenever a behavior surprises you, the OCI spec is the source of truth. docker run with no flags hides ~50 default values; runc exposes them.
  2. Daemonless by default. All weekly labs target the daemonless toolchain (buildah, skopeo, podman, runc). Learn the canonical primitives; the daemon-based version is an ergonomic skin on top.
  3. Rootless by default once feasible. Modern Linux supports rootless containers via user namespaces. Practice it from week 9 onward.
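
To make "runc exposes them" concrete, here is a sketch of the kind of config.json the Runtime Spec describes. It is illustrative only: build_config is a hypothetical helper, the chosen capabilities and limits are assumptions picked for the example rather than Docker's or runc's actual defaults, and a real config (for instance, the one generated by runc spec) carries many more fields.

```python
import json


def build_config(args, rootfs="rootfs",
                 memory_limit_bytes=256 * 1024 * 1024, pids_limit=64):
    """Return a small, incomplete OCI runtime config as a Python dict."""
    # Illustrative capability set; real defaults differ per engine.
    caps = ["CAP_KILL", "CAP_NET_BIND_SERVICE"]
    return {
        "ociVersion": "1.0.2",
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": list(args),
            "env": ["PATH=/usr/sbin:/usr/bin:/sbin:/bin"],
            "cwd": "/",
            "capabilities": {
                "bounding": list(caps),
                "effective": list(caps),
                "permitted": list(caps),
            },
            "noNewPrivileges": True,
        },
        # Path is relative to the bundle directory that holds config.json.
        "root": {"path": rootfs, "readonly": True},
        "hostname": "sketch",
        "mounts": [{"destination": "/proc", "type": "proc", "source": "proc"}],
        "linux": {
            # One entry per namespace the runtime should create.
            "namespaces": [
                {"type": t} for t in ("pid", "network", "ipc", "uts", "mount")
            ],
            # Translated by the runtime into cgroup settings.
            "resources": {
                "memory": {"limit": memory_limit_bytes},
                "pids": {"limit": pids_limit},
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_config(["sh", "-c", "echo hello"]), indent=2))
```

Every value a high-level CLI chooses silently appears here as an explicit key: which namespaces to create, which capabilities survive, whether the rootfs is read-only, and what the cgroup limits are. Reading a generated config.json next to the Runtime Spec is a quick way to settle what a given flag actually changes.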

7. What Containers Are Not For

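  • Hard isolation. A container is a process with namespaces, and every container on a host shares that host's kernel, so a kernel exploit in one container can reach the others. For untrusted multi-tenant code, use a stronger isolation layer such as a user-space kernel sandbox or a lightweight VM (gVisor, Kata Containers, Firecracker), covered in APPENDIX_A.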
  • Stateful systems with strict durability. Volume management adds complexity; production databases benefit from running outside containers (or with mature operators in K8s, see the Kubernetes curriculum).
  • GUI applications. Possible (X11/Wayland forwarding) but rarely the right tool.

You are now ready for Week 1. Open 01_MONTH_OCI_FOUNDATIONS.md.
