12 - Reading Other People's Dockerfiles

What this session is

About 30 minutes. Strategy for reading a real-world Dockerfile and compose.yaml so you understand what an OSS project is doing.

The five-minute orientation

For any containerized project:

Read the project's README - what does it do, how to run it.
Find the Dockerfile (or Dockerfile.* variants) - usually at repo root or docker/.
Read top to bottom. Each instruction has an obvious purpose; you've seen them in pages 05-09.
Find any compose.yaml - tells you the multi-container topology.
Find the CI workflow (.github/workflows/) - shows how the image is built and pushed.

After five minutes you should be able to summarize: "This project produces an image based on X, running Y as Z user, exposing port N."
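The file hunt in the orientation can be sketched as a small shell helper. This is only a sketch, assuming a POSIX shell with find; find_container_files is a hypothetical name, and the file-name patterns are common conventions rather than an exhaustive list.

```shell
#!/bin/sh
# find_container_files is a hypothetical helper, not part of any tool.
# It lists Dockerfiles, compose files, and GitHub Actions workflows in a clone.
find_container_files() {
  repo="${1:-.}"
  # Dockerfiles (including variants such as Dockerfile.dev) and compose files,
  # skipping the .git directory.
  find "$repo" -path "$repo/.git" -prune -o -type f \
    \( -name 'Dockerfile' -o -name 'Dockerfile.*' -o -name '*.Dockerfile' \
       -o -name 'compose.yaml' -o -name 'compose.yml' \
       -o -name 'docker-compose.yaml' -o -name 'docker-compose.yml' \) -print
  # CI workflows, if the project uses GitHub Actions.
  if [ -d "$repo/.github/workflows" ]; then
    find "$repo/.github/workflows" -type f \
      \( -name '*.yml' -o -name '*.yaml' \) -print
  fi
}
```

Run it as find_container_files path/to/clone and read the files it prints in the order the orientation suggests.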

Reading top to bottom

# build stage - full Go toolchain
FROM golang:1.23 AS builder
WORKDIR /src
# dep manifest first (cache)
COPY go.mod go.sum ./
RUN go mod download
# source
COPY . .
RUN CGO_ENABLED=0 go build -o /app/myapp ./cmd/myapp

# final stage - minimal
FROM gcr.io/distroless/static:nonroot
# copy just the binary
COPY --from=builder /app/myapp /myapp
USER nonroot:nonroot
EXPOSE 8080
ENTRYPOINT ["/myapp"]

Read each line: "build with Go 1.23, copy the dependency manifests, download dependencies, copy the source, compile a static binary, switch to a distroless base, copy in just the binary, run as a non-root user, document port 8080."

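You can predict from this Dockerfile:
- The image will be small (roughly 10MB, depending on the binary): distroless base plus a static binary.
- It runs as a non-root user, which limits the damage if the process is compromised.
- It ships a single binary, so there is little to inspect. Note that distroless has no shell, so debugging means logs rather than docker exec.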
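The one-sentence summary ("based on X, running as Z, exposing port N") can be roughed out with a text scan. The sketch below assumes a POSIX shell and awk; summarize_dockerfile is a hypothetical name, and this is a line-based scan, not a real Dockerfile parser, so multi-line instructions and ARG substitution are not handled.

```shell
#!/bin/sh
# summarize_dockerfile is a hypothetical helper: a rough text scan, not a parser.
# It reports the stage count plus the base, user, and ports of the final stage.
summarize_dockerfile() {
  file="${1:-Dockerfile}"
  awk '
    # Each FROM starts a new stage, so reset what we know about user and ports.
    toupper($1) == "FROM" {
      stages++; user = ""; ports = ""
      # Skip flags such as --platform=... to find the image reference.
      for (i = 2; i <= NF; i++) if ($i !~ /^--/) { base = $i; break }
    }
    toupper($1) == "USER" { user = $2 }
    toupper($1) == "EXPOSE" {
      for (i = 2; i <= NF; i++) ports = ports (ports == "" ? "" : " ") $i
    }
    END {
      print "stages: " (stages + 0)
      print "final base: " base
      print "user: " (user != "" ? user : "not set (inherits the base image default, often root)")
      print "ports: " (ports != "" ? ports : "none declared")
    }
  ' "$file"
}
```

Run against the Dockerfile above, it should report two stages, the distroless base, the nonroot user, and port 8080. Treat the output as a starting point and confirm it by reading the file.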

Read a compose.yaml

services:
  web:
    build: .
    ports: ["8080:8080"]
    environment:
      DATABASE_URL: postgres://postgres:secret@db:5432/myapp
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: myapp
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s

volumes:
  pgdata:

Read: "two services: web builds from the Dockerfile in this directory, publishes port 8080, and starts only after db reports healthy. It talks to a Postgres database named db over the auto-created network, and the database keeps its data in a named volume."

You can predict: "to run this locally, docker compose up -d will probably Just Work after I set the right env vars."
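One way to test that prediction is sketched below. It assumes Docker with the Compose v2 plugin is installed and that you run it from the directory containing compose.yaml; try_compose_stack is a hypothetical helper name, while the docker compose subcommands are the standard ones.

```shell
#!/bin/sh
# try_compose_stack is a hypothetical helper; it needs Docker with Compose v2.
try_compose_stack() {
  # Print the service names after Compose resolves the file (here: web and db).
  docker compose config --services || return
  # Start in the background and block until health checks pass.
  docker compose up -d --wait || return
  # Show container state and published ports.
  docker compose ps
}
```

If the second step fails, docker compose logs for the failing service is usually the next thing to read. docker compose down stops the stack; adding -v also removes the named volume.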

Patterns you'll see in real projects

Multi-stage with --from=builder - almost universal for compiled languages.

HEALTHCHECK instructions inside Dockerfiles (alternative to compose health checks). The image documents how to determine if it's healthy.

ARG for version pinning at build time:

ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine

ONBUILD - instructions that run later, when this image is used as the base of another build. Rare, but worth recognizing.

Init systems (tini, dumb-init):

RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["myapp"]

tini is a minimal init that runs as PID 1, forwards signals such as SIGTERM to the app, and reaps zombie processes. Useful when the app doesn't handle PID 1 duties itself (typical for Node and Python apps).

SHELL instruction - replaces the default /bin/sh -c used by RUN, usually to get bash features such as pipefail:

SHELL ["/bin/bash", "-c"]
RUN set -eo pipefail; do_thing | other_thing

docker-entrypoint.sh - a wrapper script as the entrypoint that does setup before running the main command. The Postgres official image's entrypoint, for example, sets up the database directory on first run.
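The wrapper pattern usually has the same shape: do setup that must happen at container start, then exec the main command. The sketch below is a generic illustration, not the Postgres image's script; DATA_DIR, the .initialized marker file, and run_entrypoint are hypothetical names chosen for the example.

```shell
#!/bin/sh
# Sketch of a docker-entrypoint.sh-style wrapper.
# DATA_DIR, the marker file, and run_entrypoint are hypothetical names.
run_entrypoint() {
  data_dir="${DATA_DIR:-/var/lib/myapp}"
  # One-time setup: runs only when the marker file is missing.
  if [ ! -f "$data_dir/.initialized" ]; then
    echo "first run: initializing $data_dir" >&2
    mkdir -p "$data_dir" || return 1
    touch "$data_dir/.initialized" || return 1
  fi
  # exec replaces the shell, so the main command receives signals directly.
  exec "$@"
}

# A real script would end by calling: run_entrypoint "$@"
```

The exec at the end is the part to look for when reading real entrypoint scripts. Without it, the shell stays in front of the app and SIGTERM from docker stop may never reach the process that needs it.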

What to look for when evaluating a project

When considering contributing:

Does the image build cleanly? Try docker build . from a fresh clone. If it errors, that's a "good first issue" target right there.
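Is the image a reasonable size? docker images <name> - anything over 500MB for a typical web service deserves attention.
Does it run as non-root? Check with docker image inspect --format '{{.Config.User}}' <image> (empty output means root). docker run --rm <image> id also works, but only if the image ships an id binary and has no ENTRYPOINT that swallows the argument.
Are secrets baked in? Run docker history --no-trunc <image> and grep for suspicious things.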
Is it pinned? FROM ubuntu instead of FROM ubuntu:24.04 is fragile. PRs that pin base images are usually welcome.
Multi-arch builds? If they ship only amd64 in 2026, ARM users (Apple Silicon, Raspberry Pi) can't use it without slow QEMU emulation. PRs adding ARM builds are valuable.

These are all PR opportunities for someone with container skills.
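Several of these checks can be gathered into one helper once the image is built. This is a sketch, assuming Docker is installed and the image exists locally; evaluate_image is a hypothetical name, and the grep pattern is a crude guess at what a leaked secret looks like, so a clean result proves little.

```shell
#!/bin/sh
# evaluate_image is a hypothetical helper; it needs Docker and a local image.
evaluate_image() {
  image="$1"
  echo "== size =="
  docker image ls "$image"
  echo "== configured user (empty means root) =="
  docker image inspect --format '{{.Config.User}}' "$image"
  echo "== platform of the local image =="
  docker image inspect --format '{{.Os}}/{{.Architecture}}' "$image"
  echo "== history lines that mention likely secrets =="
  # Crude keyword scan; review the full history by hand as well.
  docker history --no-trunc "$image" |
    grep -iE 'passw|secret|token|api[_-]?key' || echo "(none found)"
}
```

For example, after docker build -t myapp . you could run evaluate_image myapp. The platform line only describes the copy on your machine; whether the published image is multi-arch has to be checked against the registry.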

A worked example: read a real project's container setup

Pick a public project. Suggestion: Plausible Community Edition (plausible/community-edition).

Clone it: git clone https://github.com/plausible/community-edition.
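Look for a Dockerfile. If the repo only references a prebuilt image, look up the upstream Dockerfile in the main application repository.
Look at the compose file (docker-compose.yml, or compose.yml in newer layouts).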
Read the README about deployment.

After 10 minutes you should know:
- What base image they use.
- Whether they build multi-stage.
- What services compose into the stack (web app, postgres, clickhouse, etc.).
- How they configure secrets.

This is exactly the work you'd do before opening a PR.
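A quick way to see which services make up a stack is to list the image references in the compose file. The sketch below assumes a POSIX shell with sed; list_compose_images is a hypothetical name, and it is a line-based scan rather than a YAML parser, so services that use build: instead of image: will not appear.

```shell
#!/bin/sh
# list_compose_images is a hypothetical helper: a text scan, not a YAML parser.
# It prints the image reference from every "image:" line in a compose file.
list_compose_images() {
  sed -n '/^[[:space:]]*image:/{s/^[[:space:]]*image:[[:space:]]*//;s/[[:space:]]*#.*$//;p;}' "$1" |
    tr -d "\"'"
}
```

Run it on whichever compose file the clone contains, then look up any image you do not recognize before reading the rest of the file.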

Exercise

Pick a small OSS project with a Dockerfile. Suggestions:
- peterbourgon/ff (Go) - small CLI library; may or may not have a Dockerfile; you can suggest one if not.
- fatih/color (Go) - terminal colors library; as a library it may not ship a Dockerfile either.
- caddyserver/caddy (Go, formerly mholt/caddy) - web server. The official image's Dockerfiles live in caddyserver/caddy-docker.
- grafana/grafana (Go + TypeScript) - observability. Excellent Dockerfile + CI.

Clone one. Find its Dockerfile. Apply the five-minute orientation. Write a paragraph:
- What base image?
- Multi-stage?
- Final size (build it; check)?
- Non-root?
- Anything you'd improve?

That paragraph IS your potential PR plan.

What you might wonder

"What if the project's Dockerfile uses things I haven't seen?" Look them up. Most instructions are covered in pages 05-10. Less common ones (ONBUILD, STOPSIGNAL, HEALTHCHECK) are in the Dockerfile reference docs.

"What's the right time to suggest a Dockerfile improvement?" After understanding why it's structured the way it is. Some quirks are intentional (work around an upstream bug, need a specific tool). Investigate before "improving."

"What about non-Dockerfile container projects? (Podman, Buildah, Nixpkgs OCI builders, etc.)" The Dockerfile format is the lingua franca; almost every project uses it. Podman/Buildah read the same format. Nix is a different world (declarative builds, reproducibility); rare but powerful.

Done

Read a Dockerfile top to bottom.
Read a compose.yaml's services topology.
Recognize common patterns (multi-stage, tini, entrypoint scripts).
Spot common improvement opportunities for PRs.

Next: Picking a project →