
03 - Images and Tags

What this session is

About 30 minutes. You'll learn what an image actually is, how tags work, how to find and inspect images, and the Docker Hub model.

What an image is

An image is a stack of read-only layers, plus some metadata (entrypoint, default command, exposed ports, environment variables).

When you create a container, Docker adds a thin read-write layer on top. Changes the container makes are in that layer; the underlying layers stay shared with other containers.

Two consequences:

  1. Containers start fast (no copying - just stack a new writable layer).
  2. Containers using the same image share disk space.
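The sharing described above can be checked directly by comparing layer digests. This is a minimal sketch, assuming both images are already pulled; layer_digests and shared_layers are hypothetical helper names, not Docker commands.

```shell
# layer_digests IMAGE: print the image's layer digests, one per line, sorted.
layer_digests() {
    docker image inspect \
        --format '{{range .RootFS.Layers}}{{println .}}{{end}}' "$1" |
        grep . | sort          # grep . drops the trailing blank line
}

# shared_layers IMAGE_A IMAGE_B: print the digests both images have in common.
shared_layers() {
    a=$(mktemp) b=$(mktemp)
    layer_digests "$1" > "$a"
    layer_digests "$2" > "$b"
    comm -12 "$a" "$b"         # lines present in both sorted files
    rm -f "$a" "$b"
}
```

Usage: shared_layers IMAGE_A IMAGE_B. An empty result means the two images share no layers on disk.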

Tags: image versions

An image reference has the form:

[REGISTRY/][NAMESPACE/]IMAGE[:TAG]
  • REGISTRY - where the image lives (defaults to docker.io).
  • NAMESPACE - the user/org publishing it (defaults to library for official images).
  • IMAGE - the image name.
  • TAG - a label, typically a version (defaults to latest).

Examples:

Short form              Full form
nginx                   docker.io/library/nginx:latest
nginx:1.27              docker.io/library/nginx:1.27
myorg/myapp:v1.2.0      docker.io/myorg/myapp:v1.2.0
ghcr.io/foo/bar:main    ghcr.io/foo/bar:main (already fully qualified - GHCR registry)
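The defaulting rules can be sketched as a small shell function. This is an illustration, not how Docker itself is implemented: expand_ref is a hypothetical helper name, it ignores digests (@sha256:...), and it assumes the first path component is a registry when it contains a "." or ":" or is "localhost".

```shell
# expand_ref REF: print REF in REGISTRY/NAMESPACE/IMAGE:TAG form.
# Runs in a subshell (parentheses) so its variables do not leak.
expand_ref() (
    ref=$1
    registry=docker.io
    first=${ref%%/*}
    # A leading component that looks like a host name is the registry.
    case $ref in
        */*)
            case $first in
                *.*|*:*|localhost)
                    registry=$first
                    ref=${ref#*/}
                    ;;
            esac
            ;;
    esac
    # The tag, if present, follows ":" in the last path component.
    last=${ref##*/}
    case $last in
        *:*) tag=${last##*:}; ref=${ref%:*} ;;
        *)   tag=latest ;;
    esac
    # On Docker Hub, a bare image name lives under the library namespace.
    case $ref in
        */*) ;;
        *)   if [ "$registry" = docker.io ]; then ref=library/$ref; fi ;;
    esac
    printf '%s/%s:%s\n' "$registry" "$ref" "$tag"
)
```

For example, expand_ref nginx:1.27 prints docker.io/library/nginx:1.27, and expand_ref ghcr.io/foo/bar:main prints the reference unchanged.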

The trap

:latest is a label, not a guarantee. It points to whichever build the maintainer last tagged as latest - which can change. Pin to a specific version tag in production: nginx:1.27, not nginx:latest. For local experimentation, latest is fine.
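A quick check for the trap above, as a sketch: is_pinned is a hypothetical helper that flags references with no tag or with :latest. It cannot tell whether some other tag (main, stable) also moves, so treat it as a first filter only.

```shell
# is_pinned REF: succeed if REF names an explicit tag other than latest,
# or a digest; fail if the tag is missing or is latest.
is_pinned() {
    case ${1##*/} in                 # look only at the last path component
        *@sha256:*) return 0 ;;      # digest: immutable
        *:latest)   return 1 ;;
        *:*)        return 0 ;;      # explicit tag
        *)          return 1 ;;      # no tag means :latest
    esac
}
```

Usage: is_pinned nginx:1.27 succeeds; is_pinned nginx and is_pinned nginx:latest both fail.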

Pull and list

docker pull nginx:1.27                  # download
docker images                           # list local images
docker images | grep nginx

docker images shows, for each image: repository, tag, image ID, creation time, and size.

Inspect

docker inspect nginx:1.27

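Outputs a long JSON document with the image's configuration: env vars, exposed ports, entrypoint, default command, architecture, and the digests of its layers. Useful when figuring out why an image behaves a certain way. The build steps are not part of this output; docker history shows those.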
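Rather than scrolling through the whole JSON, the --format flag of docker inspect can print a single field. A minimal sketch, assuming the image is already pulled; image_field is a hypothetical wrapper name.

```shell
# image_field IMAGE PATH: print one field of the image's metadata as JSON.
# PATH is a Go template path such as .Config.Env or .Config.Entrypoint.
image_field() {
    docker inspect --format "{{json $2}}" "$1"
}
```

Usage: image_field nginx:1.27 .Config.Env prints the environment variables; .Config.ExposedPorts, .Config.Entrypoint, and .Config.Cmd work the same way.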

docker history nginx:1.27 is a friendlier view of just the layers:

IMAGE          CREATED        CREATED BY                    SIZE      COMMENT
abc123         2 weeks ago    RUN apt-get install nginx     20MB
...

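Each line is a build step. Steps that change the filesystem create a layer; metadata-only steps (ENV, CMD, EXPOSE) show 0B. The sizes tell you what dominates the image: when an image is 1GB, docker history shows which steps account for it.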
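To find the steps that dominate an image, the history can be sorted by size. This is a sketch under the assumption that the image is already pulled; largest_layers is a hypothetical helper name. It asks docker history for raw byte counts (--human=false) so that sort can order them numerically.

```shell
# largest_layers IMAGE [N]: print the N largest build steps (default 5)
# as "<bytes> <command>", biggest first.
largest_layers() {
    docker history --human=false --no-trunc \
        --format '{{.Size}} {{.CreatedBy}}' "$1" |
        sort -rn | head -n "${2:-5}"
}
```

Usage: largest_layers nginx:1.27 3 prints the three largest steps.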

Search Docker Hub

docker search nginx

Returns matching repositories with star counts. For more, browse hub.docker.com - better filtering and READMEs.

Reading a Docker Hub page for an image tells you:

  • Supported tags (versions).
  • Configuration env vars.
  • Usage examples.
  • Source repo (often on GitHub) - the Dockerfile is public.

Official vs unofficial

library/nginx is an official image - curated, and maintained by the upstream project or by Docker. Official images live under the library namespace, which is usually hidden: nginx alone is shorthand for library/nginx.

Third-party images live under user/org namespaces: bitnami/postgresql, linuxserver/jellyfin, etc. Anyone can publish to Docker Hub; verify maintainers before running untrusted code.

Signals of trust:

  • "Official Image" badge or Verified Publisher badge on Docker Hub.
  • Maintained by the project itself (e.g. nginx, postgres, python).
  • Pulls in the millions.
  • Active CI, recent updates, signed images.

Image size matters

Smaller images = faster pulls, faster deploys, smaller attack surface. Compare:

docker pull ubuntu       # ~80MB
docker pull debian       # ~120MB
docker pull alpine       # ~5MB
docker pull busybox      # ~5MB
docker pull gcr.io/distroless/static  # ~2MB

For your own images (page 05+): start from a small base unless you genuinely need a full distro.
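To compare sizes with exact numbers rather than the rounded values in docker images, the size in bytes can be read from the image metadata. A minimal sketch, assuming the image is already pulled; image_size_mb is a hypothetical helper name, and it uses decimal megabytes (1 MB = 1,000,000 bytes) with integer division.

```shell
# image_size_mb IMAGE: print the local image's size in whole megabytes.
image_size_mb() {
    bytes=$(docker image inspect --format '{{.Size}}' "$1") || return 1
    echo $(( bytes / 1000000 ))
}
```

Usage: for i in alpine debian ubuntu; do echo "$i $(image_size_mb "$i")"; done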

Remove unused images

Local images pile up. Clean up:

docker image rm IMAGE                   # remove one
docker image prune                      # remove dangling (no tag)
docker image prune -a                   # remove ALL not used by any container
docker system prune                     # broader cleanup (stopped containers, unused networks, dangling images, build cache)

docker system df shows how much space Docker is using.
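Before pruning, it can help to see what would go. As a sketch, the dangling images that docker image prune targets can be listed with a filter; list_dangling is a hypothetical helper name.

```shell
# list_dangling: print the ID and size of every untagged (dangling) image.
list_dangling() {
    docker images --filter dangling=true --format '{{.ID}} {{.Size}}'
}
```

An empty result means docker image prune (without -a) has nothing to remove.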

A worked example: which Python image to pick

Suppose you want a Python container. The Docker Hub page for python lists many tags, including:

  • python:3.12 - full Debian-based, ~1GB. Most flexible; has gcc, locales, etc.
  • python:3.12-slim - Debian-based, ~150MB. Stripped down.
  • python:3.12-alpine - Alpine-based, ~50MB. Smallest, but it uses musl instead of glibc, so packages without a musl-compatible wheel must compile from source or won't install at all.

Rule of thumb: start with python:3.12-slim. If a pip install fails on a wheel, fall back to python:3.12. Try alpine last (often more pain than savings).

Multi-architecture images

Modern images are usually built for multiple architectures (linux/amd64, linux/arm64). Docker pulls the one matching your host. The same nginx:1.27 works on an Intel Mac, an Apple Silicon Mac, an x86 server, a Raspberry Pi.

You can force one:

docker pull --platform=linux/amd64 nginx:1.27

Useful on Apple Silicon when an image hasn't been built for ARM; the amd64 image then runs under emulation, which is slower.
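The platform string Docker expects differs from what uname -m reports. The mapping can be sketched as follows; docker_platform is a hypothetical helper name, and it covers only three common machine types.

```shell
# docker_platform [MACHINE]: map a machine type (default: uname -m)
# to the platform string used by --platform.
docker_platform() {
    machine=${1:-$(uname -m)}
    case $machine in
        x86_64|amd64)  echo linux/amd64 ;;
        aarch64|arm64) echo linux/arm64 ;;
        armv7l)        echo linux/arm/v7 ;;
        *) echo "unknown machine type: $machine" >&2; return 1 ;;
    esac
}
```

Usage: docker pull --platform="$(docker_platform)" nginx:1.27. The OS part is linux even on macOS, because Docker Desktop runs containers inside a Linux VM.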

Exercise

  1. Pull two versions of nginx:

    docker pull nginx:1.27
    docker pull nginx:1.25
    docker images | grep nginx
    
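    Check how much the two images share: docker system df -v lists shared and unique size per image. Layers are shared only when they are byte-identical, so two versions built on different base snapshots may share little or nothing.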

  2. Inspect:

    docker inspect nginx:1.27 | head -50
    docker history nginx:1.27
    

  3. Compare sizes:

    docker pull alpine
    docker pull debian:bookworm-slim
    docker pull ubuntu
    docker images | grep -E "alpine|debian|ubuntu" | head
    

  4. Pin discipline: find one place where you saw nginx:latest in this path's earlier examples. Mentally substitute nginx:1.27. (Or any specific tag.) That's what you should write in production.

  5. Cleanup:

    docker images
    docker image prune -a            # confirms before deleting
    

What you might wonder

"Why are images so big?" A base distro is hundreds of MB. Adding a language runtime adds more. Application code is usually small; system bloat dominates. Page 09 covers slimming.

"What's a 'digest' vs a 'tag'?" A digest is the cryptographic hash of the exact image content (sha256:abc...). Immutable. A tag is a movable label. For maximum reproducibility, pin by digest: nginx@sha256:abc.... Verbose but unambiguous.
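To find the digest behind a tag you have pulled, the image metadata records it. A minimal sketch; repo_digest is a hypothetical helper name, and it assumes the image came from a registry (a locally built image that was never pushed has no repo digest, and the command fails).

```shell
# repo_digest IMAGE: print the first NAME@sha256:... reference recorded
# for a pulled image.
repo_digest() {
    docker image inspect --format '{{index .RepoDigests 0}}' "$1"
}
```

Usage: repo_digest nginx:1.27 prints a reference of the form nginx@sha256:..., which can be used in place of the tag.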

"Where is the data stored?" On Linux: /var/lib/docker/ by default. On macOS/Windows: inside a VM that Docker Desktop manages. docker system df shows usage; docker volume ls lists your volumes.

Done

  • Understand images as stacked layers.
  • Read image references (repo:tag).
  • Pull, list, inspect images.
  • Pick a base image by size.
  • Recognize official vs third-party.
  • Clean up unused images.

Next: Container lifecycle →
