03 - Images and Tags
What this session is
About 30 minutes. You'll learn what an image actually is, how tags work, how to find and inspect images, and the Docker Hub model.
What an image is
An image is a stack of read-only layers, plus some metadata (entrypoint, default command, exposed ports, environment variables).
When you create a container, Docker adds a thin read-write layer on top. Changes the container makes are in that layer; the underlying layers stay shared with other containers.
Two consequences:
1. Containers start fast (no copying - just stack a new writable layer).
2. Containers using the same image share disk space.
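To see the sharing for yourself, one option is to compare the layer digests of two images you already have locally. The sketch below is only an illustration: `image_layers` and `shared_layers` are hypothetical helper names, and it assumes both images have been pulled and that `mktemp`, `sort`, and `comm` are available.

```shell
# Print the layer digests of one local image, one per line.
image_layers() {
  docker image inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' "$1"
}

# Print the layer digests that two local images have in common.
shared_layers() {
  tmp_a=$(mktemp)
  tmp_b=$(mktemp)
  # Drop blank lines and sort, because comm expects sorted input.
  image_layers "$1" | sed '/^$/d' | sort > "$tmp_a"
  image_layers "$2" | sed '/^$/d' | sort > "$tmp_b"
  comm -12 "$tmp_a" "$tmp_b"
  rm -f "$tmp_a" "$tmp_b"
}

# Usage (both images must already be present locally):
#   shared_layers IMAGE_A IMAGE_B
```

Every digest printed is a layer stored once on disk and used by both images. An empty result means the two images share nothing.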
Tags: image versions
An image reference has the form `[REGISTRY/][NAMESPACE/]IMAGE[:TAG]`:
- `REGISTRY` - where the image lives (defaults to `docker.io`).
- `NAMESPACE` - the user/org publishing it (defaults to `library` for official images).
- `IMAGE` - the image name.
- `TAG` - a label, typically a version (defaults to `latest`).
Examples:
| Short form | Full form |
|---|---|
| `nginx` | `docker.io/library/nginx:latest` |
| `nginx:1.27` | `docker.io/library/nginx:1.27` |
| `myorg/myapp:v1.2.0` | `docker.io/myorg/myapp:v1.2.0` |
| `ghcr.io/foo/bar:main` | (literal - GHCR registry) |
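The defaulting rules can be written down as a small shell function. This is a simplified sketch rather than Docker's own parser: `expand_image_ref` is a hypothetical name, and it assumes the first path component is a registry only when it contains a `.` or `:` or is exactly `localhost`.

```shell
# Expand a short image reference into REGISTRY/NAMESPACE/IMAGE:TAG.
expand_image_ref() {
  ref=$1
  registry=docker.io
  path=$ref
  case $ref in
    */*)
      first=${ref%%/*}
      case $first in
        # Looks like a hostname (dot), host:port (colon), or localhost.
        *.*|*:*|localhost)
          registry=$first
          path=${ref#*/}
          ;;
      esac
      ;;
  esac
  # Official images on Docker Hub live under the implicit "library" namespace.
  if [ "$registry" = docker.io ]; then
    case $path in
      */*) ;;
      *) path=library/$path ;;
    esac
  fi
  # Only the last path component can carry a tag (or digest).
  last=${path##*/}
  case $last in
    *:*|*@*) ;;
    *) path=$path:latest ;;
  esac
  printf '%s/%s\n' "$registry" "$path"
}

expand_image_ref nginx                 # docker.io/library/nginx:latest
expand_image_ref nginx:1.27            # docker.io/library/nginx:1.27
expand_image_ref myorg/myapp:v1.2.0    # docker.io/myorg/myapp:v1.2.0
expand_image_ref ghcr.io/foo/bar:main  # ghcr.io/foo/bar:main
```

The four calls reproduce the rows of the table: only references without an explicit registry, namespace, or tag get the defaults filled in.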
The trap
:latest is a label, not a guarantee. It points to whichever build the maintainer last tagged as latest - which can change. Pin to a specific version tag in production: nginx:1.27, not nginx:latest. For local experimentation, latest is fine.
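Pull and list
`docker pull nginx:1.27` downloads an image from its registry. `docker images` lists the images stored locally and shows: repository, tag, image ID, size, age.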
Inspect
`docker image inspect nginx:1.27` outputs a long JSON document with: layers, env vars, exposed ports, entrypoint, default command, and other build metadata. Useful when figuring out why an image behaves a certain way.
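When only a few of those fields matter, a Go-template `--format` string can pick them out of the JSON. The helper below is a sketch: `image_summary` is a hypothetical name, and it assumes the image is already present locally.

```shell
# Print the entrypoint, default command, exposed ports, and env vars of an image.
image_summary() {
  docker image inspect --format \
'entrypoint: {{json .Config.Entrypoint}}
cmd:        {{json .Config.Cmd}}
ports:      {{json .Config.ExposedPorts}}
env:        {{json .Config.Env}}' "$1"
}

# Usage (after pulling the image):
#   image_summary nginx:1.27
```

Fields the image does not set are printed as `null`.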
`docker history nginx:1.27` is a friendlier view of just the build steps.
Each line is a build step. Steps that change the filesystem produce a layer; metadata-only steps show as 0B. Sizes tell you what dominates the image. A 1GB image is mostly something; `docker history` shows what.
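To focus on sizes, `docker history` also accepts a `--format` template. A minimal sketch, with `layer_sizes` as a hypothetical helper name and the image assumed to be present locally:

```shell
# Print one line per build step: its size, a tab, then the command that created it.
layer_sizes() {
  docker history --format '{{.Size}}\t{{.CreatedBy}}' "$1"
}

# Usage (after pulling the image):
#   layer_sizes nginx:1.27
```

The largest entries in the first column are the steps worth looking at when an image is bigger than expected.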
Search Docker Hub
`docker search nginx` returns matching repositories with star counts. For more, browse hub.docker.com, which has better filtering and shows each image's README.
Reading a Docker Hub page for an image tells you:
- Supported tags (versions).
- Configuration env vars.
- Usage examples.
- Source repo (often on GitHub) - the Dockerfile is public.
Official vs unofficial
library/nginx is an official image: curated, and maintained by the upstream project or by Docker. Official images live under the library namespace, which is usually hidden (nginx alone is shorthand for library/nginx).
Third-party images live under user/org namespaces: bitnami/postgresql, linuxserver/jellyfin, etc. Anyone can publish to Docker Hub; verify maintainers before running untrusted code.
Signals of trust:
- "Official Image" badge or Verified Publisher badge on Docker Hub.
- Maintained by the project itself (e.g. nginx, postgres, python).
- Pulls in the millions.
- Active CI, recent updates, signed images.
Image size matters
Smaller images = faster pulls, faster deploys, smaller attack surface. Compare:
docker pull ubuntu # ~80MB
docker pull debian # ~120MB
docker pull alpine # ~5MB
docker pull busybox # ~5MB
docker pull gcr.io/distroless/static # ~2MB
For your own images (page 05+): start from a small base unless you genuinely need a full distro.
Remove unused images
Local images pile up. Clean up:
docker image rm IMAGE # remove one
docker image prune # remove dangling (no tag)
docker image prune -a # remove ALL not used by any container
docker system prune # broader cleanup (containers, networks, etc.)
docker system df shows how much space Docker is using.
A worked example: which Python image to pick
Suppose you want a Python container. The Docker Hub python page lists tags:
- `python:3.12` - full Debian-based, ~1GB. Most flexible; has gcc, locales, etc.
- `python:3.12-slim` - Debian-based, ~150MB. Stripped down.
- `python:3.12-alpine` - Alpine-based, ~50MB. Smallest, but it uses musl instead of glibc, so some prebuilt Python wheels won't install.
Rule of thumb: start with python:3.12-slim. If a pip install fails on a wheel, fall back to python:3.12. Try alpine last (often more pain than savings).
Multi-architecture images
Modern images are usually built for multiple architectures (linux/amd64, linux/arm64). Docker pulls the one matching your host. The same nginx:1.27 works on an Intel Mac, an Apple Silicon Mac, an x86 server, a Raspberry Pi.
You can force one with the `--platform` flag, for example `docker pull --platform linux/amd64 nginx:1.27`.
Useful on Apple Silicon when an image hasn't been built for ARM.
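To check which variant you actually have locally, the image metadata records its OS and architecture. A small sketch, with `image_platform` as a hypothetical helper name:

```shell
# Print the platform of a local image as OS/ARCH, for example linux/arm64.
image_platform() {
  docker image inspect --format '{{.Os}}/{{.Architecture}}' "$1"
}

# Usage (after pulling the image):
#   image_platform nginx:1.27
```

If the result does not match your host's architecture, the container runs under emulation where that is available, which is typically slower.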
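Exercise
1. Pull two versions of nginx, for example `nginx:1.27` and one other version tag. Note they share lots of disk space - common layers are shared.
2. Inspect one of them with `docker image inspect`.
3. Compare their sizes with `docker images`.
4. Pin discipline: find one place where you saw `nginx:latest` in this path's earlier examples. Mentally substitute `nginx:1.27` (or any specific tag). That's what you should write in production.
5. Cleanup: remove the images you pulled with `docker image rm`.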
What you might wonder
"Why are images so big?"
A base distro is hundreds of MB. Adding a language runtime adds more. Application code is usually small; system bloat dominates. Page 09 covers slimming.
"What's a 'digest' vs a 'tag'?"
A digest is the cryptographic hash of the exact image content (sha256:abc...). Immutable. A tag is a movable label. For maximum reproducibility, pin by digest: nginx@sha256:abc.... Verbose but unambiguous.
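One way to find the digest behind a tag you have already pulled is to read it from the image metadata. This is a sketch under assumptions: `image_digest` is a hypothetical helper name, and it only works for images that came from a registry, since a locally built image that was never pushed has no repo digest and the template fails.

```shell
# Print the first repo digest of a local image, in the form NAME@sha256:...
image_digest() {
  docker image inspect --format '{{index .RepoDigests 0}}' "$1"
}

# Usage (after pulling the image):
#   image_digest nginx:1.27
#   docker pull "$(image_digest nginx:1.27)"
```

The second usage line pulls by digest, so it keeps resolving to the same content even if the maintainer later moves the tag.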
"Where is the data stored?"
On Linux: /var/lib/docker/. On macOS/Windows: inside a VM that Docker Desktop manages. docker system df shows usage; docker volume ls shows your named volumes.
Done
- Understand images as stacked layers.
- Read image references (`repo:tag`).
- Pull, list, inspect images.
- Pick a base image by size.
- Recognize official vs third-party.
- Clean up unused images.