0. Setup & Environment

Install Docker Desktop for Mac

# Via Homebrew (recommended)
brew install --cask docker

# Or download directly:
# https://www.docker.com/products/docker-desktop/

# After install, launch Docker.app from Applications
# (required on first install to complete daemon setup)
What's included: Docker Desktop bundles the Docker Engine, CLI (docker), Docker Compose v2 (docker compose), and BuildKit — everything you need to build and run containers locally.

Verify Installation

docker --version          # Docker CLI version
docker compose version    # Compose v2 (built-in plugin)
docker info               # Engine details — confirms daemon is running

If docker info fails with "Cannot connect to the Docker daemon", Docker Desktop is not running. Open it from Applications and wait for the whale icon in the menu bar to become steady.

Resource Configuration

Docker Desktop → Settings → Resources controls how much of your Mac's hardware containers can use. Defaults are 2 CPUs and 2 GB RAM.

For this refresher, defaults are fine. If you plan to run heavier workloads locally (Spark, Kafka, multi-service stacks), increase CPUs to 4+ and RAM to 6–8 GB. Watch for OOM kills (docker inspect <container> | grep OOMKilled) as a sign you need more memory.
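The inline OOM check above can also be done with a Go template instead of grep, which avoids false matches (the container name here is a placeholder):

```shell
# Did the kernel OOM-kill this container? ("mycontainer" is a placeholder)
docker inspect --format 'OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}}' mycontainer
# OOMKilled=true with exit code 137 means the container hit its memory limit
```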

Quick Smoke Test

# Pull and run the official hello-world image — confirms pull + run pipeline works
docker run --rm hello-world

# Run an interactive Alpine container
docker run -it --rm alpine sh
# Inside the container:
cat /etc/os-release   # confirm you're inside Alpine Linux
exit                  # tears down the container (--rm cleans it up)

Useful Aliases

# Add to ~/.zshrc
alias d='docker'
alias dc='docker compose'
alias dps='docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"'
dps prints a clean, aligned table of running containers — names, status, and port bindings — without the noise of the default docker ps output. Reload your shell with source ~/.zshrc after adding these.

1. Core Concepts

Key Abstractions

Abstraction | What it is | Analogy
Image | Read-only, layered filesystem snapshot + metadata (entrypoint, env, ports) | Class definition / executable binary
Container | Running (or stopped) instance of an image with a writable layer on top | Process / running instance
Registry | Remote store for images (Docker Hub, ECR, GCR, ACR, Harbor) | npm registry / artifact repository
Volume | Persistent storage managed by Docker, lives outside container lifecycle | External hard drive
Network | Virtual network connecting containers; controls routing and DNS | VLAN / VPC subnet

Docker Architecture

Docker is a client-server system. The components you interact with:

docker CLI: the client; translates commands into REST API calls over a Unix socket
dockerd: the daemon; builds images and manages containers, networks, and volumes
containerd + runc: lower-level runtimes the daemon delegates container execution to
registry: the remote store the daemon pulls images from and pushes to

OCI Standards
The Open Container Initiative defines two specs: Image Spec (how images are structured and stored) and Runtime Spec (how containers are executed). Any OCI-compliant runtime can run OCI images — podman, containerd, CRI-O all speak the same format.
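One way to see the OCI image spec in practice is to inspect a public image's manifest; the media types show the standardized format that any compliant runtime consumes (exact media type varies by image):

```shell
# Inspect the raw manifest of a public image
docker buildx imagetools inspect nginx:alpine
# Output lists a MediaType (e.g. application/vnd.oci.image.index.v1+json)
# plus one manifest per platform, the same structure podman or containerd would read
```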

Docker Desktop vs Docker Engine

Feature | Docker Desktop | Docker Engine
Platform | macOS, Windows, Linux | Linux only (native)
GUI | Yes | No
Linux VM | Yes (required on Mac/Win) | No — runs natively
License | Paid for large orgs | Free (Apache 2.0)
Includes | Engine, Compose, BuildKit, Scout | Engine only
File sharing | Via VM (can have overhead) | Direct bind mounts
Rootless Docker
Run Docker daemon as non-root for improved security. Trade-offs: no host port <1024 binding, some network features unavailable, overlay2 storage driver may need configuration. Enable with dockerd-rootless-setuptool.sh install.

2. Images

Essential Image Commands

# Pull from registry (defaults to Docker Hub)
docker pull nginx:1.25-alpine
docker pull ghcr.io/org/repo:sha256-abc123

# List local images
docker image ls
docker image ls --filter dangling=true   # untagged images
docker images -a                         # include intermediate layers

# Tag an image (creates alias, same layer data)
docker tag myapp:latest myapp:1.2.3
docker tag myapp:latest registry.example.com/org/myapp:1.2.3

# Push to registry
docker push registry.example.com/org/myapp:1.2.3

# Inspect image metadata (entrypoint, env, layers, architecture)
docker inspect nginx:alpine
docker image inspect --format '{{.Config.Cmd}}' nginx:alpine

# Remove images
docker rmi myapp:old
docker image prune          # remove dangling (untagged)
docker image prune -a       # remove ALL unused images

# Show image history / layers
docker image history nginx:alpine
docker image history --no-trunc nginx:alpine

Image Naming: Registry/Repo:Tag

# Full form: [registry/][org/]repository[:tag][@digest]
docker.io/library/nginx:1.25-alpine      # Docker Hub official image
docker.io/username/myapp:latest          # Docker Hub personal image
ghcr.io/org/repo:v1.2.3                  # GitHub Container Registry
123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:prod  # AWS ECR

# Digest pins to exact image content (immutable — unlike tags)
docker pull nginx@sha256:abc123def456...
Tags are mutable
nginx:latest today may differ from nginx:latest tomorrow. In production, always pin to a specific version tag or digest. Use digests for fully reproducible builds.
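To move from a mutable tag to a pinned digest, resolve the tag you currently have pulled:

```shell
# Resolve a tag to the digest it currently points at
docker pull nginx:1.25-alpine
docker inspect --format '{{index .RepoDigests 0}}' nginx:1.25-alpine
# → nginx@sha256:…   use this in FROM or docker run to pin exactly this content
```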

Image Layers and Caching

Every RUN, COPY, and ADD instruction creates a new read-only layer. Layers are content-addressed and shared across images — pulling python:3.12-alpine reuses layers already present for python:3.11-alpine.
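You can observe layer sharing directly by listing the layer digests of two related images; any digest that appears in both lists is stored once on disk (how many layers match depends on how each tag was built):

```shell
# List layer digests; identical digests across images are shared
docker pull python:3.12-alpine
docker pull python:3.11-alpine
docker inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' python:3.12-alpine
docker inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' python:3.11-alpine
# Matching sha256 lines (typically the alpine base layer) are deduplicated on disk
```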

Multi-Architecture Images

# Create and use a multi-platform builder
docker buildx create --name multiarch --use
docker buildx inspect --bootstrap

# Build for multiple platforms and push
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag registry.example.com/myapp:1.0.0 \
  --push \
  .

# Inspect platform support of an image
docker buildx imagetools inspect nginx:alpine

# Build for current platform only (no push needed)
docker buildx build --load -t myapp:dev .

3. Dockerfile

Core Instructions

# syntax=docker/dockerfile:1
# The magic comment enables BuildKit features (cache mounts, secrets, heredocs)

FROM ubuntu:22.04 AS base

# Labels are key-value metadata attached to the image
LABEL org.opencontainers.image.version="1.0.0"
LABEL maintainer="[email protected]"

# ARG: build-time variable (not in final image, unless used in ENV)
ARG APP_VERSION=dev
ARG TARGETARCH   # automatically set by BuildKit for multi-arch builds

# ENV: runtime environment variable (persists in image)
ENV APP_ENV=production \
    PORT=8080 \
    APP_VERSION=${APP_VERSION}

# WORKDIR: sets working directory, creates it if missing
WORKDIR /app

# COPY: preferred over ADD; copies files from build context
COPY go.mod go.sum ./            # copy dependency files first (cache benefit)
COPY . .                          # copy rest of source

# ADD: like COPY but also handles URLs and auto-extracts tarballs
# Use only when you need its extra features — COPY is clearer
ADD https://example.com/file.tar.gz /tmp/
ADD archive.tar.gz /extracted/

# RUN: executes a command and creates a new layer
# Always chain related commands to minimize layers
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
      curl \
      ca-certificates && \
    rm -rf /var/lib/apt/lists/*

# USER: drop privileges before the final CMD/ENTRYPOINT
RUN groupadd -r appuser && useradd -r -g appuser appuser
USER appuser

# EXPOSE: documents which port the app listens on (does NOT publish)
EXPOSE 8080

# VOLUME: creates a mount point; signals that this path should be externally mounted
VOLUME ["/data", "/logs"]

# HEALTHCHECK: tells Docker how to test if the container is healthy
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1

# STOPSIGNAL: signal sent when docker stop is called (default: SIGTERM)
STOPSIGNAL SIGTERM

# SHELL: change the default shell used for RUN/CMD/ENTRYPOINT shell form
SHELL ["/bin/bash", "-o", "pipefail", "-c"]

CMD vs ENTRYPOINT

Instruction | Purpose | Overridable?
CMD | Default command, or default args to ENTRYPOINT | Yes — by docker run image <args>
ENTRYPOINT | Fixed executable; container always runs this | Only with --entrypoint flag
# --- Exec form (PREFERRED) ---
# No shell, PID 1, receives signals directly
ENTRYPOINT ["./app"]
CMD ["--config", "/etc/app/config.yaml"]
# docker run myimage --config /other.yaml   → replaces CMD only

# --- Shell form ---
# Runs as /bin/sh -c "...", spawns a shell (NOT PID 1)
# Signals (SIGTERM) go to the shell, not your process
CMD ./app --config /etc/app/config.yaml   # AVOID in production

# --- Common patterns ---
# Wrapper script for init tasks (migrations, envsubst, etc.)
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["nginx", "-g", "daemon off;"]

# Pure CMD when no fixed executable needed
CMD ["python", "-m", "uvicorn", "app.main:app", "--host", "0.0.0.0"]
Shell form + signals = dropped SIGTERM
Using shell form for ENTRYPOINT or CMD means your process is a child of /bin/sh. docker stop sends SIGTERM to PID 1 (the shell), not your app. The shell may not forward it, causing a 10-second timeout before SIGKILL. Always use exec form in production.

.dockerignore

Controls what gets sent to the Docker daemon as the build context. Smaller context = faster builds and no accidental secret leakage.

# .dockerignore
.git
.github
**/.DS_Store
**/node_modules
**/__pycache__
**/*.pyc
.env
.env.*
*.local
coverage/
.nyc_output
# built artifacts, unless you intentionally COPY them (.dockerignore has no inline comments)
dist/
docs/
tests/
*.md
Dockerfile*
docker-compose*
.dockerignore
Missing .dockerignore leaks secrets
Without a .dockerignore, your entire project directory (including .env, SSH keys, Git history) is sent to the daemon and may end up in image layers. Always create this file.
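To sanity-check what actually reaches the daemon, one trick is a throwaway image whose only job is to copy the whole context and list it (image and file names here are arbitrary):

```shell
# Build a disposable image that snapshots the build context, then list it
cat > /tmp/context-check.Dockerfile <<'EOF'
FROM busybox
COPY . /context
CMD ["find", "/context", "-maxdepth", "2"]
EOF
docker build -f /tmp/context-check.Dockerfile -t context-check .
docker run --rm context-check    # anything listed here was sent to the daemon
docker rmi context-check         # clean up
```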

4. Multi-Stage Builds

Multi-stage builds use multiple FROM instructions in a single Dockerfile. Only the final stage ends up in the shipped image — earlier stages (compilers, test runners, build tools) are discarded.

Go: Builder Pattern

# syntax=docker/dockerfile:1
FROM golang:1.22-alpine AS builder
WORKDIR /build

# Download deps first (cached separately from source)
COPY go.mod go.sum ./
RUN go mod download

# Build static binary
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-w -s" -o /app ./cmd/server

# ---- Final stage: distroless or scratch ----
FROM scratch
# Copy CA certificates for HTTPS calls
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app /app
EXPOSE 8080
ENTRYPOINT ["/app"]
# Final image: ~5–10 MB (no OS, no shell)

Node.js: Build + Runtime Separation

# syntax=docker/dockerfile:1
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev    # production deps only (--only=production is deprecated)

FROM node:20-alpine AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build    # produces /app/dist

FROM node:20-alpine AS runtime
WORKDIR /app
ENV NODE_ENV=production
# Only copy what's needed at runtime
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
COPY package.json .

RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

EXPOSE 3000
CMD ["node", "dist/index.js"]

Python: Wheel Builder Pattern

# syntax=docker/dockerfile:1
FROM python:3.12-slim AS builder
WORKDIR /build

# Install build dependencies (gcc, etc.) only in builder
RUN apt-get update && apt-get install -y --no-install-recommends gcc
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

FROM python:3.12-slim AS runtime
WORKDIR /app

# Install pre-built wheels — no compiler needed
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir --no-index --find-links /wheels /wheels/* && \
    rm -rf /wheels

COPY . .
RUN adduser --disabled-password --no-create-home appuser
USER appuser

CMD ["python", "-m", "gunicorn", "app.main:app", "-b", "0.0.0.0:8000"]

Named Stages and Targeting

# Build a specific named stage (useful for running tests in CI)
docker build --target builder -t myapp:builder .
docker build --target test -t myapp:test .
docker build --target runtime -t myapp:prod .

# Reference a stage from another stage
# COPY --from=<stage-name> <src> <dest>
# COPY --from=<image:tag> <src> <dest>    # can also use an external image!

5. Image Optimization

Layer Ordering for Cache Efficiency

# BAD: source code copied before dependencies
# Any source change invalidates the pip install layer
FROM python:3.12-slim
COPY . /app
RUN pip install -r /app/requirements.txt    # re-runs on every source change

# GOOD: dependencies first, source code last
FROM python:3.12-slim
COPY requirements.txt /app/
RUN pip install -r /app/requirements.txt    # cached until requirements.txt changes
COPY . /app                                  # invalidates only subsequent layers

Base Image Selection

Base Image | Size | Shell | Use case
ubuntu:22.04 | ~77 MB | bash | General purpose, broad compatibility
debian:bookworm-slim | ~75 MB | bash | Good default for most apps
python:3.12-slim | ~130 MB | bash | Python apps without Alpine quirks
alpine:3.19 | ~8 MB | sh | Smallest general-purpose base; uses musl libc (watch for compat)
gcr.io/distroless/static | ~2 MB | None | Go static binaries; no shell attack surface
gcr.io/distroless/python3 | ~52 MB | None | Python apps; no shell, no package manager
scratch | 0 MB | None | Fully static binaries (Go, Rust); minimal attack surface
Alpine and musl libc gotchas
Alpine uses musl libc instead of glibc. Python wheels compiled for glibc won't work, and some C extensions behave differently. Prefer python:3.12-slim (Debian-based) for Python to avoid obscure runtime issues.

apt-get Best Practices

# CORRECT: one RUN, update + install + clean in same layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
      curl \
      ca-certificates \
      libpq-dev && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# --no-install-recommends: skips recommended (not required) packages
# rm -rf /var/lib/apt/lists/*: removes package index (no longer needed)
# Both must be in the SAME RUN to actually reduce layer size

Security Scanning

# Docker Scout (built into Docker Desktop / Docker CLI)
docker scout cves myapp:latest          # list CVEs
docker scout recommendations myapp:latest   # suggest base image upgrades
docker scout quickview myapp:latest

# Trivy (open source, excellent for CI)
trivy image myapp:latest
trivy image --severity HIGH,CRITICAL myapp:latest
trivy image --exit-code 1 --severity CRITICAL myapp:latest  # fail CI on critical

# Snyk
snyk container test myapp:latest
snyk container monitor myapp:latest  # ongoing monitoring

6. Containers

docker run — Key Flags

# Basic run
docker run nginx:alpine

# Detached (background), named, with port mapping
docker run -d --name webserver -p 8080:80 nginx:alpine
#                               ^host ^container

# Interactive terminal (for debugging, running shells)
docker run -it ubuntu:22.04 bash
docker run -it --rm alpine sh   # --rm removes container on exit

# Environment variables
docker run -e DATABASE_URL=postgres://... -e DEBUG=true myapp:latest
docker run --env-file .env myapp:latest    # load from file

# Volume mounts
docker run -v /host/path:/container/path myapp:latest  # bind mount
docker run -v mydata:/data myapp:latest                # named volume
docker run --mount type=bind,source=$(pwd),target=/app myapp:latest

# Resource limits
docker run --memory="512m" --cpus="1.5" myapp:latest

# Network
docker run --network mynetwork myapp:latest
docker run --network host myapp:latest    # share host network stack

# Run as specific user
docker run --user 1001:1001 myapp:latest

# Override entrypoint
docker run --entrypoint /bin/sh myapp:latest -c "env"

# Remove automatically when stopped
docker run --rm myapp:latest

Container Lifecycle

State | Description
created | Container created but not started
running | PID 1 process is executing
paused | All processes frozen (SIGSTOP)
stopped/exited | PID 1 exited; filesystem preserved
dead | Failed to stop or be removed cleanly
# Lifecycle management
docker start mycontainer          # start a stopped container
docker stop mycontainer           # send SIGTERM, wait 10s, then SIGKILL
docker stop -t 30 mycontainer     # custom timeout
docker restart mycontainer
docker kill mycontainer           # send SIGKILL immediately
docker kill --signal SIGHUP mycontainer  # send custom signal
docker pause / docker unpause mycontainer

# Remove containers
docker rm mycontainer             # remove stopped container
docker rm -f mycontainer          # force remove running container
docker container prune            # remove all stopped containers

Exec, Logs, and Copy

# Execute a command in a running container
docker exec mycontainer ls /app
docker exec -it mycontainer bash          # interactive shell
docker exec -it -e DEBUG=1 mycontainer sh # with env var
docker exec -u root mycontainer whoami    # as specific user

# Stream logs
docker logs mycontainer
docker logs --follow mycontainer          # tail -f equivalent
docker logs --tail 100 mycontainer
docker logs --since 30m mycontainer       # last 30 minutes
docker logs --since 2024-01-01T00:00:00 mycontainer

# Copy files between host and container
docker cp mycontainer:/app/config.yaml ./config.yaml   # container → host
docker cp ./config.yaml mycontainer:/app/config.yaml   # host → container

# Resource usage
docker stats                  # live stats for all running containers
docker stats mycontainer      # single container
docker top mycontainer        # running processes inside container

7. Networking

Network Drivers

Driver | Description | Use case
bridge | Default. Creates a virtual network; containers talk via internal IP, or DNS name (user-defined networks only) | Single-host container communication
host | Shares the host's network stack — no isolation, no port mapping needed | Performance-sensitive apps; Linux only
none | Completely disables networking | Batch jobs, security-sensitive workloads
overlay | Multi-host networking for Docker Swarm services | Distributed services across multiple Docker hosts
macvlan | Assigns a real MAC address; container appears as a physical device on the LAN | Legacy apps expecting direct LAN access
ipvlan | Like macvlan but shares the host MAC, routes by IP | Environments that restrict MAC changes

Network Commands

# List networks
docker network ls

# Create a user-defined bridge network
docker network create mynetwork
docker network create --driver bridge --subnet 172.20.0.0/16 mynetwork

# Connect / disconnect running containers
docker network connect mynetwork mycontainer
docker network disconnect mynetwork mycontainer

# Inspect network (shows connected containers, IP assignments)
docker network inspect mynetwork

# Remove unused networks
docker network rm mynetwork
docker network prune

Container DNS Resolution

# On user-defined bridge networks, containers resolve each other by name
# Container "db" is reachable as "db" from any container on the same network

docker network create backend
docker run -d --name db --network backend postgres:16
docker run -d --name api --network backend myapp:latest
# Inside "api", postgres is reachable at: postgres://db:5432/mydb
#                                          hostname = container name ^

# Default bridge network does NOT support name-based DNS
# (only user-defined networks have automatic DNS)

# Alias: a container can have multiple DNS names on a network
docker network connect --alias cache --alias redis mynetwork mycontainer
Always use user-defined bridge networks
The default bridge network does not provide DNS resolution by container name — you'd need to use IPs or --link (deprecated). User-defined bridge networks give you automatic name-based DNS, better isolation, and the ability to connect/disconnect at runtime.
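A quick way to confirm the DNS behavior described above (network and container names are arbitrary):

```shell
# Name resolution works on a user-defined network...
docker network create dnstest
docker run -d --name web --network dnstest nginx:alpine
docker run --rm --network dnstest alpine nslookup web   # resolves via embedded DNS (127.0.0.11)

# ...but the same lookup on the default bridge fails
docker run --rm alpine nslookup web || echo "no name-based DNS on default bridge"

# Clean up
docker rm -f web && docker network rm dnstest
```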

Port Mapping

# -p host_port:container_port
docker run -p 8080:80 nginx         # bind host:8080 → container:80
docker run -p 127.0.0.1:8080:80 nginx  # bind only loopback (safer for dev)
docker run -p 80 nginx              # random host port → container:80
docker run -P nginx                 # publish all EXPOSE'd ports to random host ports

# Find which host port was assigned
docker port mycontainer
docker port mycontainer 80

8. Volumes & Storage

Volume Types

Type | Syntax | Managed by | Use case
Named volume | -v mydata:/data | Docker | Persistent data (databases, uploads). Portable, backed up with docker commands.
Bind mount | -v /host/path:/container/path | Host OS | Development (live code reload), config injection. Host path must exist.
tmpfs mount | --tmpfs /tmp:size=100m | Memory | Sensitive data (secrets, tokens) that must not persist to disk.
Anonymous volume | -v /data | Docker | Avoid writing to container layer; discarded on docker rm -v

Volume Commands

# Create and manage volumes
docker volume create mydata
docker volume ls
docker volume inspect mydata      # shows mountpoint on host
docker volume rm mydata
docker volume prune               # remove all unused volumes (DESTRUCTIVE)

# Preferred --mount syntax (explicit, readable)
docker run --mount type=volume,source=mydata,target=/data myapp
docker run --mount type=bind,source=$(pwd)/config,target=/etc/app/config,readonly myapp
docker run --mount type=tmpfs,target=/tmp,tmpfs-size=100m myapp

# Legacy -v syntax (still common)
docker run -v mydata:/data myapp
docker run -v $(pwd):/app myapp           # bind mount
docker run -v $(pwd)/config:/app/config:ro myapp   # read-only bind mount (:ro suffix)

Backup and Restore Volumes

# Backup: run a temporary container, tar the volume contents to host
docker run --rm \
  -v mydata:/data \
  -v $(pwd):/backup \
  alpine \
  tar czf /backup/mydata-backup.tar.gz -C /data .

# Restore: unpack into a fresh volume
docker volume create mydata-restored
docker run --rm \
  -v mydata-restored:/data \
  -v $(pwd):/backup \
  alpine \
  sh -c "tar xzf /backup/mydata-backup.tar.gz -C /data"

# Copy data between volumes using a temp container
docker run --rm \
  -v source_vol:/source:ro \
  -v dest_vol:/dest \
  alpine \
  cp -a /source/. /dest/
Bind mount permission issues
When bind-mounting into a container running as a non-root user, the host directory UID/GID must match the container user's UID/GID. A common fix: chown -R 1001:1001 ./data on the host, matching the UID used in USER 1001 in the Dockerfile.
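A minimal reproduction of the mismatch on a Linux host, using UID 1001 as in the callout above (on macOS, Docker Desktop's file sharing often masks the problem):

```shell
# Host dir owned by your user; container writes as UID 1001 → denied
mkdir -p ./data
docker run --rm --user 1001:1001 -v "$(pwd)/data:/data" alpine \
  sh -c 'touch /data/probe' || echo "permission denied, as expected"

# Fix: align host ownership with the container's UID/GID
sudo chown -R 1001:1001 ./data
docker run --rm --user 1001:1001 -v "$(pwd)/data:/data" alpine \
  sh -c 'touch /data/probe && echo writable'
```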

9. Docker Compose

compose.yaml Structure

# compose.yaml (preferred filename; docker-compose.yml also accepted)
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        APP_VERSION: "1.2.3"
    image: myapp:dev        # tag the built image
    ports:
      - "8080:8080"
    environment:
      - DATABASE_URL=postgres://user:pass@db:5432/mydb
      - REDIS_URL=redis://cache:6379
    env_file:
      - .env               # loaded before 'environment' (environment wins)
    volumes:
      - ./src:/app/src     # bind mount for dev hot-reload
      - uploads:/app/uploads
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_started
    networks:
      - backend
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: mydb
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql:ro
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d mydb"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend

  cache:
    image: redis:7-alpine
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru
    volumes:
      - redisdata:/data
    networks:
      - backend

networks:
  backend:
    driver: bridge   # default; explicit for clarity

volumes:
  pgdata:
  redisdata:
  uploads:

Essential Compose Commands

# Start services (build if needed, detached)
docker compose up -d
docker compose up -d --build          # force rebuild
docker compose up --no-deps web       # start only 'web' (no dependencies)

# Stop and remove containers, networks (volumes preserved by default)
docker compose down
docker compose down --volumes         # also remove named volumes (DESTRUCTIVE)
docker compose down --rmi local       # also remove locally-built images

# Build/rebuild images
docker compose build
docker compose build --no-cache web   # force rebuild 'web' service
docker compose build --parallel       # build all in parallel

# Status and logs
docker compose ps
docker compose ps --format json
docker compose logs -f                # follow all services
docker compose logs -f --tail=50 web  # follow 'web', last 50 lines

# Execute commands
docker compose exec web bash
docker compose exec -T web python manage.py migrate  # -T disables TTY (for CI)
docker compose run --rm web python manage.py createsuperuser

# Scaling
docker compose up -d --scale web=3   # run 3 web replicas

# Pull latest images for all services
docker compose pull

# Restart a specific service
docker compose restart web

Environment Variables in Compose

# Precedence for a container's environment (highest to lowest):
# 1. docker compose run -e VAR=...
# 2. environment: in compose.yaml
# 3. env_file: in compose.yaml
# 4. ENV baked into the image's Dockerfile
#
# The .env file in the project directory is auto-loaded, but only for
# variable substitution inside compose.yaml — it is not injected into
# containers unless referenced via env_file: or ${...}.

services:
  app:
    environment:
      NODE_ENV: production             # literal value
      SECRET_KEY: ${SECRET_KEY}        # passed through from the shell
      LOG_LEVEL: ${LOG_LEVEL:-info}    # default if not set in the shell
    env_file:
      - .env.shared
      - .env.${APP_ENV:-development}   # conditional env file
# .env file (for variable substitution in compose.yaml, NOT secrets)
POSTGRES_VERSION=16
APP_PORT=8080

# .env is NOT for secrets in production — use Docker secrets or a vault

Profiles

# Define services that only start when a profile is active
services:
  web:
    image: myapp       # always starts

  debug-tools:
    image: nicolaka/netshoot
    profiles: [debug]   # only starts with: docker compose --profile debug up

  migrate:
    image: myapp
    command: python manage.py migrate
    profiles: [tools]
docker compose --profile debug up
docker compose --profile tools run migrate

10. Docker Compose Advanced

Override Files

# Compose automatically merges these files (in order):
# 1. compose.yaml
# 2. compose.override.yaml (auto-loaded if present)

# Explicit multiple files:
docker compose -f compose.yaml -f compose.prod.yaml up -d

# Pattern: base config in compose.yaml, dev overrides in compose.override.yaml
# compose.override.yaml is gitignored in some teams to allow per-developer configs
# compose.override.yaml — developer convenience (NOT committed to prod)
services:
  web:
    build:
      target: development    # use dev build stage with hot-reload
    volumes:
      - .:/app               # bind mount full source
      - /app/node_modules    # anonymous volume to prevent host modules overwrite
    environment:
      DEBUG: "true"
    ports:
      - "9229:9229"          # Node.js debugger port

Dependency Ordering with Health Checks

services:
  web:
    depends_on:
      db:
        condition: service_healthy    # wait for healthy state
      migrate:
        condition: service_completed_successfully  # wait for migration to finish

  migrate:
    image: myapp
    command: python manage.py migrate
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:16-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 5s
depends_on does not wait for app readiness
depends_on: condition: service_healthy waits for the healthcheck to pass, but only if a healthcheck is defined. Without a healthcheck, service_started only waits for the container to start (not the service inside it to be ready). Always define healthchecks for databases and critical services.

Watch Mode (Compose Watch)

# compose.yaml — define watch rules per service
services:
  web:
    build: .
    develop:
      watch:
        - action: sync          # copy changed files into container (no rebuild)
          path: ./src
          target: /app/src
        - action: rebuild       # trigger docker compose build on these changes
          path: package.json
        - action: sync+restart  # sync files then restart container
          path: ./config
          target: /app/config
docker compose watch    # start watch mode (requires BuildKit)

Build Secrets and Args

services:
  app:
    build:
      context: .
      args:
        BUILD_DATE: ${BUILD_DATE}   # available during build only
      secrets:
        - npmrc                     # secret mounted during build

secrets:
  npmrc:
    file: ./.npmrc     # content of .npmrc passed as build secret
# In Dockerfile — use build secret without it leaking into layers
# syntax=docker/dockerfile:1
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm ci --registry https://npm.pkg.github.com

11. Registry & Distribution

Registry Commands

# Login
docker login                              # Docker Hub (prompts)
docker login ghcr.io -u USERNAME          # GitHub Container Registry
docker login 123456.dkr.ecr.us-east-1.amazonaws.com  # AWS ECR

# AWS ECR login (pipe the token directly)
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin \
  123456.dkr.ecr.us-east-1.amazonaws.com

# Push and pull
docker push registry.example.com/org/myapp:1.2.3
docker pull registry.example.com/org/myapp@sha256:abc123...

# Logout (clears stored credentials)
docker logout registry.example.com

Private Registry Options

Registry | Host | Notes
Docker Hub | docker.io | Rate limited: 100 pulls/6hr (anon), 200/6hr (free); paid for private
GHCR | ghcr.io | Free for public repos; included in GitHub Actions
AWS ECR | *.dkr.ecr.*.amazonaws.com | IAM-based auth; 500 MB free/month; cross-account sharing
GCR / Artifact Registry | *.pkg.dev (gcr.io is legacy) | Google Cloud; regional; Workload Identity Federation
ACR | *.azurecr.io | Azure; geo-replication; integrated with AKS
Harbor | Self-hosted | Open source; scanning, replication, RBAC, proxying upstream registries

Image Signing and Verification

# Cosign (Sigstore) — most modern approach
# Sign with a key
cosign sign --key cosign.key registry.example.com/myapp:1.0.0

# Sign in keyless mode (OIDC — works in GitHub Actions)
cosign sign registry.example.com/myapp:1.0.0  # signs with GitHub OIDC identity

# Verify
cosign verify --key cosign.pub registry.example.com/myapp:1.0.0

# Docker Content Trust (legacy, uses Notary)
export DOCKER_CONTENT_TRUST=1
docker push myapp:latest    # automatically signs on push
docker pull myapp:latest    # automatically verifies on pull

12. Security

Run as Non-Root

# Create a dedicated user with no home dir, no password, no login shell
RUN groupadd -r --gid 1001 appgroup && \
    useradd -r --uid 1001 --gid appgroup --no-create-home appuser

# For Alpine (uses addgroup/adduser)
RUN addgroup -S -g 1001 appgroup && \
    adduser -S -u 1001 -G appgroup appuser

# Switch to non-root before final CMD/ENTRYPOINT
USER appuser

# If files need to be owned by the app user
COPY --chown=appuser:appgroup . /app

Capabilities and Read-only Filesystem

# Drop all capabilities, add back only what's needed
docker run --cap-drop ALL --cap-add NET_BIND_SERVICE myapp
# NET_BIND_SERVICE: bind to ports < 1024
# CHOWN: change file ownership
# SETUID/SETGID: change process UID/GID

# Read-only root filesystem (excellent for immutability)
# App must write only to explicitly declared tmpfs mounts
docker run --read-only \
  --tmpfs /tmp:size=50m \
  --tmpfs /var/run:size=10m \
  myapp

# In Compose:
services:
  app:
    read_only: true
    tmpfs:
      - /tmp:size=50m

Build-time Secrets (BuildKit)

# Pass a secret at build time — NOT stored in image layers
docker build \
  --secret id=npmrc,src=$HOME/.npmrc \
  --secret id=github_token,env=GITHUB_TOKEN \
  -t myapp .

# Equivalent with environment variable
GITHUB_TOKEN=ghp_abc123 docker build \
  --secret id=github_token,env=GITHUB_TOKEN \
  -t myapp .
# In Dockerfile: secret available only during this RUN step
RUN --mount=type=secret,id=github_token \
    GITHUB_TOKEN=$(cat /run/secrets/github_token) \
    go build ./...
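The key property of that RUN --mount pattern is that the secret is read for one step and never exported or written to a layer. A sketch of the same read-without-exposing pattern in plain shell, using a temp file as an illustrative stand-in for BuildKit's /run/secrets/&lt;id&gt; mount (the path and token value are made up):

```shell
# Simulate BuildKit's per-step secret mount with a temp file (illustrative path)
mkdir -p /tmp/run-secrets
printf 'ghp_example' > /tmp/run-secrets/github_token

# Read the secret into a shell variable for the duration of the step only;
# nothing is exported to the environment or baked into a layer
GITHUB_TOKEN=$(cat /tmp/run-secrets/github_token)
echo "token length: ${#GITHUB_TOKEN}"   # use the token without printing its value
```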

Runtime Secrets

# Docker Swarm secrets (encrypted at rest, in-memory in container)
echo "s3cr3tpassword" | docker secret create db_password -
docker service create \
  --secret db_password \
  myapp
# Available at /run/secrets/db_password inside container

# In production: prefer a secrets manager
# - AWS Secrets Manager / Parameter Store
# - HashiCorp Vault
# - GCP Secret Manager
# - Kubernetes Secrets (with external-secrets-operator for real security)
Never bake secrets into images
Any ENV or ARG value set during build is visible in docker inspect and in image layer history. Credentials passed as ENV MY_SECRET=... are permanently stored in the image. Use --secret for build-time secrets and environment injection (via runtime env or secrets manager) for runtime secrets.

Security Checklist

- Run as a non-root USER
- Drop all capabilities; add back only what the app needs
- Use a read-only root filesystem with explicit tmpfs mounts
- Pass build-time secrets with --secret, never via ENV or ARG
- Inject runtime secrets from a secrets manager, not the image
- Pin base images by version (and ideally by digest)
- Scan and sign images before pushing

13. Docker in CI/CD

GitHub Actions: Build & Push

# .github/workflows/docker.yml
name: Build and Push

on:
  push:
    branches: [main]
    tags: ['v*']

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write    # needed for GHCR

    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository }}
          tags: |
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=sha,prefix=sha-,format=short
            type=raw,value=latest,enable=${{ github.ref == 'refs/heads/main' }}

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha                # GitHub Actions cache
          cache-to: type=gha,mode=max

BuildKit Cache in CI

# Registry-based cache (works across runners, most reliable)
docker buildx build \
  --cache-from type=registry,ref=registry.example.com/myapp:buildcache \
  --cache-to type=registry,ref=registry.example.com/myapp:buildcache,mode=max \
  -t myapp:latest \
  --push .

# Local cache (same runner only)
docker buildx build \
  --cache-from type=local,src=/tmp/.buildx-cache \
  --cache-to type=local,dest=/tmp/.buildx-cache-new,mode=max \
  -t myapp:latest .

# mode=max: cache all layers (not just final stage) — much better hit rate

Tagging Strategy

Tag pattern    Example            Use case
Semver         v1.2.3             Release artifact — immutable and meaningful
Git SHA        sha-a1b2c3d        Exact traceability — know exactly what code is running
Branch name    main, feature-xyz  Staging deploys — mutable, overwritten on each push
latest         latest             Only for local dev. Never deploy latest to production.
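A sketch of deriving that tag set in a CI shell step; the registry name and git values below are illustrative stand-ins for what a job would compute from the repo:

```shell
# Illustrative inputs a CI job would derive from git (e.g. git describe, rev-parse)
VERSION="v1.2.3"
GIT_SHA="a1b2c3d"
IMAGE="registry.example.com/myapp"

SEMVER_TAG="$IMAGE:${VERSION#v}"    # strip the leading "v"
SHA_TAG="$IMAGE:sha-$GIT_SHA"

echo "$SEMVER_TAG"   # prints registry.example.com/myapp:1.2.3
echo "$SHA_TAG"      # prints registry.example.com/myapp:sha-a1b2c3d
```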

Image Promotion Workflow

# Build once, promote by re-tagging (never rebuild the same code)
# Build and push with Git SHA tag (--push requires buildx)
docker buildx build -t registry/myapp:sha-${GIT_SHA} --push .

# Promote to staging (re-tag, don't rebuild)
docker buildx imagetools create \
  --tag registry/myapp:staging \
  registry/myapp:sha-${GIT_SHA}

# Promote to production after staging validation
docker buildx imagetools create \
  --tag registry/myapp:v1.2.3 \
  --tag registry/myapp:latest \
  registry/myapp:sha-${GIT_SHA}

# Verify: all three tags resolve to the same digest
# (imagetools inspect handles multi-arch manifest lists; plain
#  `docker manifest inspect | jq '.config.digest'` does not)
docker buildx imagetools inspect registry/myapp:v1.2.3
docker buildx imagetools inspect registry/myapp:sha-${GIT_SHA}

14. Debugging & Troubleshooting

Container Won't Start

# Check exit code and last logs
docker ps -a                         # see all containers including stopped
docker logs mycontainer
docker logs --tail 50 mycontainer

# Inspect container state
docker inspect mycontainer | jq '.[0].State'
# {  "Status": "exited", "ExitCode": 1, "Error": "..." }

# Run interactively to debug entrypoint issues
docker run -it --entrypoint sh myapp:latest
docker run -it --entrypoint bash myapp:latest

# Common exit codes:
# 1   — Application error
# 126 — Entrypoint not executable (check permissions, chmod +x)
# 127 — Entrypoint not found (wrong path, missing binary in image)
# 137 — SIGKILL (OOM killer, docker kill, or docker stop after the grace period)
# 139 — Segfault (SIGSEGV)
# 143 — Terminated by SIGTERM (normal docker stop, handled gracefully)
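The codes above 128 follow the shell's 128+N convention: subtract 128 to get the signal number. A tiny helper makes that concrete (decode_exit is a hypothetical name, not a Docker command):

```shell
# Exit codes above 128 mean the process was killed by signal (code - 128)
decode_exit() {
  code=$1
  if [ "$code" -gt 128 ]; then
    echo "signal $((code - 128))"
  else
    echo "application exit $code"
  fi
}

decode_exit 137   # prints "signal 9"  (SIGKILL: OOM killer or docker kill)
decode_exit 143   # prints "signal 15" (SIGTERM: docker stop)
decode_exit 1     # prints "application exit 1"
```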

Disk Space Management

# Check disk usage
docker system df
docker system df -v   # verbose: per-image/container/volume breakdown

# Clean up (in order of aggressiveness)
docker container prune          # stopped containers
docker image prune              # dangling images
docker image prune -a           # all unused images
docker volume prune             # unused volumes
docker network prune            # unused networks
docker builder prune            # BuildKit cache
docker builder prune --keep-storage 10GB  # keep last 10 GB of cache

# Nuclear option (DESTRUCTIVE — removes everything not in use)
docker system prune -a --volumes

Networking Issues

# Test DNS resolution inside a container
docker exec mycontainer nslookup db
docker exec mycontainer cat /etc/resolv.conf

# Test connectivity
docker exec mycontainer curl -v http://other-service:8080/health
docker exec mycontainer ping db

# Inspect which network a container is on
docker inspect mycontainer | jq '.[0].NetworkSettings.Networks'

# Use netshoot for advanced network debugging
docker run --rm -it --network container:mycontainer \
  nicolaka/netshoot \
  tcpdump -i eth0 port 5432

# Check port conflicts on host
lsof -i :8080
ss -tlnp | grep 8080

Events and Inspect

# Real-time event stream
docker events
docker events --filter type=container
docker events --filter event=die
docker events --since 1h --filter container=mycontainer

# Deep inspect (JSON) — useful for debugging config
docker inspect mycontainer
docker inspect mycontainer | jq '.[0].HostConfig.PortBindings'
docker inspect mycontainer | jq '.[0].Mounts'
docker inspect mycontainer | jq '.[0].NetworkSettings.IPAddress'

# Inspect image layers
docker image inspect myapp:latest | jq '.[0].RootFS.Layers'

# View history of how image was built
docker history --no-trunc myapp:latest

Performance Debugging

# Live container resource usage
docker stats                          # all containers
docker stats mycontainer --no-stream  # single snapshot

# Output: CONTAINER, CPU %, MEM USAGE/LIMIT, MEM %, NET I/O, BLOCK I/O, PIDS

# If container is hitting memory limit (OOM kill):
docker inspect mycontainer | jq '.[0].State.OOMKilled'
# → true means it was killed by the OOM killer

# Slow build? Profile layer timing
BUILDKIT_PROGRESS=plain docker build . 2>&1 | grep -E '#[0-9]+ DONE'

# Large image? Dive (interactive image layer explorer)
dive myapp:latest   # brew install dive / snap install dive

15. BuildKit

BuildKit is the default build backend since Docker 23.0. It enables parallel builds, better caching, secrets, SSH forwarding, and more.

Enabling BuildKit
Docker 23.0+ enables BuildKit by default. For older versions: DOCKER_BUILDKIT=1 docker build . or set "features": {"buildkit": true} in /etc/docker/daemon.json.

Cache Mounts

# syntax=docker/dockerfile:1

# Cache apt package index across builds (huge speedup)
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends curl

# Cache pip download directory
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Cache Go module download cache
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    go build -o /app ./...

# Cache npm
RUN --mount=type=cache,target=/root/.npm \
    npm ci

SSH Forwarding

# syntax=docker/dockerfile:1
FROM alpine AS builder
RUN apk add --no-cache openssh-client git

# Mount SSH agent socket — private keys never enter the image
RUN --mount=type=ssh \
    git clone [email protected]:private-org/private-repo.git /src
# Build with SSH agent forwarding
eval $(ssh-agent)
ssh-add ~/.ssh/id_ed25519
docker build --ssh default -t myapp .

Heredocs in Dockerfile

# syntax=docker/dockerfile:1
# Write files in a single RUN layer without echo chains
RUN <<EOF
mkdir -p /etc/app
cat > /etc/app/config.yaml <<'YAML'
server:
  port: 8080
  timeout: 30s
YAML
chown appuser:appgroup /etc/app/config.yaml
EOF

# Create a file directly with COPY heredoc syntax (illustrative content)
COPY <<EOF /etc/app/motd
Welcome to myapp
EOF

Parallel Build Stages

# syntax=docker/dockerfile:1
# BuildKit builds independent stages in parallel automatically

FROM golang:1.22 AS go-builder
COPY go/ /src/go
RUN cd /src/go && go build -o /bin/server ./cmd/server

FROM node:20 AS node-builder
COPY frontend/ /src/frontend
RUN cd /src/frontend && npm ci && npm run build

FROM python:3.12-slim AS py-builder
COPY python/ /src/python
RUN pip wheel -w /wheels -r /src/python/requirements.txt

# Final stage: collect all artifacts
FROM debian:bookworm-slim
COPY --from=go-builder /bin/server /usr/local/bin/server
COPY --from=node-builder /src/frontend/dist /var/www/html
COPY --from=py-builder /wheels /wheels
RUN pip install --no-index --find-links /wheels /wheels/*

16. Production Patterns

Health Checks

# HTTP health check
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1

# TCP check (for databases, caches)
HEALTHCHECK --interval=10s --timeout=3s --retries=5 \
  CMD nc -z localhost 5432 || exit 1

# Custom script
COPY healthcheck.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/healthcheck.sh
HEALTHCHECK --interval=30s CMD healthcheck.sh
# Container health states: starting, healthy, unhealthy
docker ps     # shows health status in STATUS column
docker inspect --format='{{.State.Health.Status}}' mycontainer
docker inspect --format='{{range .State.Health.Log}}{{.Output}}{{end}}' mycontainer
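A minimal sketch of what a custom healthcheck.sh can look like; Docker only cares about the exit code (0 = healthy, non-zero = unhealthy). The ready-file check below is illustrative: a real script would more likely curl a /health endpoint as shown above.

```shell
# Write an illustrative healthcheck script to a temp path
cat > /tmp/healthcheck.sh <<'EOF'
#!/bin/sh
# Healthy only once the app has dropped its ready-file (illustrative check)
[ -f /tmp/app.ready ] || exit 1
exit 0
EOF
chmod +x /tmp/healthcheck.sh

if /tmp/healthcheck.sh; then echo "healthy"; else echo "unhealthy"; fi   # prints "unhealthy"
touch /tmp/app.ready
if /tmp/healthcheck.sh; then echo "healthy"; else echo "unhealthy"; fi   # prints "healthy"
```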

Graceful Shutdown

# Ensure SIGTERM reaches your process:
# 1. Use exec form (not shell form) for CMD/ENTRYPOINT
ENTRYPOINT ["./app"]          # receives SIGTERM directly

# 2. If you need a wrapper script, exec into the process at the end
# docker-entrypoint.sh:
# #!/bin/sh
# set -e
# run_migrations
# exec "$@"        # ← "exec" replaces the shell process — signals reach the app
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["./app"]

# 3. Increase stop timeout if app needs longer to drain connections
# docker stop -t 60 mycontainer
# In Compose:
#   stop_grace_period: 60s
# compose.yaml
services:
  app:
    stop_grace_period: 60s    # docker compose down waits up to 60s for clean stop
    stop_signal: SIGTERM      # signal to send (default)
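The exec-at-the-end pattern from step 2 can be exercised outside Docker; this sketch writes a throwaway entrypoint (the path and the "migrations" step are illustrative):

```shell
# Illustrative entrypoint: do setup work, then exec the real command
cat > /tmp/docker-entrypoint.sh <<'EOF'
#!/bin/sh
set -e
echo "running migrations"   # stand-in for real pre-start work
exec "$@"                   # replaces the shell, so signals reach the app directly
EOF
chmod +x /tmp/docker-entrypoint.sh

/tmp/docker-entrypoint.sh echo "app started"
# prints:
#   running migrations
#   app started
```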

Restart Policies

Policy           Behavior
no               Never restart (default)
always           Restart on any exit; also comes back on daemon restart, even if manually stopped
unless-stopped   Restart on any exit, unless manually stopped; stays stopped across daemon restarts
on-failure[:N]   Restart only on non-zero exit, optionally limited to N retries
docker run --restart unless-stopped myapp
docker run --restart on-failure:5 myapp

Logging Drivers

# Configure logging driver at run time
docker run \
  --log-driver json-file \
  --log-opt max-size=10m \
  --log-opt max-file=3 \
  myapp

# Common drivers:
# json-file (default)  — JSON files on host, docker logs works
# syslog               — send to syslog
# journald             — systemd journal
# fluentd              — Fluentd collector
# awslogs              — CloudWatch Logs
# gelf                 — Graylog Extended Log Format
# splunk               — Splunk HTTP Event Collector
# none                 — discard all logs
// /etc/docker/daemon.json — set default for all containers
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "5",
    "compress": "true"
  }
}
Non-blocking log mode
By default, if a log buffer fills up, the container blocks. In production, add --log-opt mode=non-blocking --log-opt max-buffer-size=4m to prevent log backpressure from affecting your application.

Resource Limits in Production

# Memory: hard limit (OOM kill at limit) + soft reservation (reclaim target under contention)
docker run --memory="512m" --memory-reservation="256m" myapp

# CPU: fractional vCPU (--cpus), period+quota, or relative weight
docker run --cpus="1.5" myapp          # use at most 1.5 CPU cores
docker run --cpu-shares=512 myapp      # relative weight (default 1024)

# Combined resource constraints
# --memory-swap equal to --memory means no swap;
# --pids-limit caps the number of processes (prevents fork bombs)
docker run \
  --memory="1g" \
  --memory-swap="1g" \
  --cpus="2" \
  --pids-limit=100 \
  myapp
# compose.yaml — resource limits
services:
  app:
    deploy:
      resources:
        limits:
          cpus: "1.5"
          memory: 512M
        reservations:
          cpus: "0.5"
          memory: 256M

12-Factor App Principles with Docker

Factor             Docker implementation
Codebase           One image per service, built from one repo
Dependencies       All deps bundled in the image; no reliance on host packages
Config             Passed via environment variables or mounted config files — never baked in
Backing services   Other containers (db, cache) reached by DNS name over user-defined networks
Build/Release/Run  Build = docker build, Release = tag + push, Run = docker run
Processes          Stateless containers; state in volumes or external stores
Port binding       EXPOSE + -p flag; container is a self-contained web server
Concurrency        Scale by running more containers (--scale N)
Disposability      Fast startup, graceful SIGTERM shutdown
Dev/prod parity    Same image runs in all environments; env vars change behavior
Logs               Write to stdout/stderr; Docker captures via log driver
Admin processes    docker run --rm or docker compose run --rm for one-off tasks

17. Common Pitfalls & Gotchas

PID 1 Problem (Zombie Processes)

PID 1 in Linux is special: it must reap zombie processes (children that have exited but whose parent hasn't called wait()). Shells do this automatically, but your Go/Python/Node app probably doesn't.

# Solution 1: tini installed in the image (apk add tini / apt-get install -y tini)
ENTRYPOINT ["/sbin/tini", "--", "./app"]
# (Docker also bundles tini itself; see Solution 3)

# Solution 2: dumb-init (Yelp's init, good for multi-process containers)
RUN apt-get update && apt-get install -y --no-install-recommends dumb-init
ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["./app"]

# Solution 3: --init flag (uses Docker's bundled tini)
docker run --init myapp
When you need tini
You need an init process when: (1) your process spawns sub-processes that can become zombies, (2) you use a shell script as an entrypoint that spawns children, or (3) you run multiple processes in one container. Single-process Go binaries that don't spawn children are often fine without it.

Never Use :latest in Production

# BAD — non-reproducible; silently changes when upstream publishes a new image
FROM python:latest
FROM node:latest

# GOOD — pinned to a specific version
FROM python:3.12.3-slim-bookworm
FROM node:20.12.2-alpine3.19

# BETTER — also pin by digest for guaranteed reproducibility
FROM python:3.12.3-slim-bookworm@sha256:abc123def456...

Build Cache Invalidation Pitfalls

# PROBLEM: RUN apt-get update in isolation caches the stale index forever
FROM ubuntu:22.04
RUN apt-get update                          # layer cached ← stale after weeks
RUN apt-get install -y curl                 # curl version from stale index

# SOLUTION: Always combine update + install in one RUN
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y --no-install-recommends curl

# PROBLEM: COPY . invalidates all subsequent layers, even if unrelated files changed
COPY . /app                  # any file change → rebuilds everything below
RUN pip install -r /app/requirements.txt
RUN go build ./...

# SOLUTION: Copy specific files that trigger the expensive step
COPY requirements.txt /app/
RUN pip install -r /app/requirements.txt   # cached until requirements.txt changes
COPY . /app                                # only re-copies source, pip cached

Volume Permission (UID/GID Mismatch)

# Symptom: "Permission denied" writing to a bind-mounted directory
# Cause: container user UID (e.g. 1001) != host directory owner UID (e.g. 1000)

# Fix option 1: Match UIDs — create the user with the same UID as your host user
# In Dockerfile:
# ARG UID=1000
# RUN useradd -u ${UID} appuser
# docker build --build-arg UID=$(id -u) .

# Fix option 2: Set ownership on host
chown -R 1001:1001 ./data   # match container UID

# Fix option 3: Use named volumes (Docker manages ownership)
# Named volumes don't have this problem — Docker sets correct permissions on creation

# Fix option 4: In Dockerfile, use COPY --chown or chown in entrypoint
COPY --chown=appuser:appgroup . /app
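Fix option 1 boils down to a UID comparison; a sketch with illustrative hard-coded values (on a real host you'd use $(id -u) for HOST_UID):

```shell
HOST_UID=1000        # illustrative; really $(id -u) on the host
CONTAINER_UID=1001   # from the image's USER / useradd -u value

if [ "$HOST_UID" -eq "$CONTAINER_UID" ]; then
  MSG="UIDs match: bind mounts writable by the container user"
else
  MSG="UID mismatch ($HOST_UID vs $CONTAINER_UID): expect Permission denied on bind mounts"
fi
echo "$MSG"
```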

Docker Socket Mounting Security Risk

# Mounting /var/run/docker.sock gives container full Docker daemon access
# A compromised container can: spin up privileged containers, escape to host, etc.
docker run -v /var/run/docker.sock:/var/run/docker.sock myapp  # DANGER

# Alternatives:
# 1. Docker-in-Docker (dind) with explicit privilege scope
# 2. Kaniko (builds OCI images without Docker daemon — safe in Kubernetes)
# 3. Buildah (rootless container image builder)
# 4. Use a dedicated CI service (GitHub Actions hosted runners, etc.)

Large Build Context

# Slow build? Check build context size first:
docker build --no-cache . 2>&1 | head -5
# "transferring context: 2.1GB" (BuildKit) or
# "Sending build context to Docker daemon  2.1GB" (legacy builder)   ← problem

# Causes: node_modules, .git, large test fixtures, build artifacts

# Fix: comprehensive .dockerignore (see Section 3)
# Quick diagnose:
du -sh * | sort -rh | head -20   # find largest directories to exclude

Layer Squashing Trade-offs

# --squash collapses all layers into one (legacy builder only, experimental;
# not supported by BuildKit — shown for completeness)
docker build --squash -t myapp .

# Pros: smaller final image, no intermediate data visible in history
# Cons: loses layer sharing with other images, loses caching for intermediate steps
# Better alternative: multi-stage builds + clean up in the same RUN command

# Anti-pattern: try to delete files added in an earlier layer — doesn't work
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y curl   # adds layer with curl
RUN apt-get purge -y curl     # new layer without curl, but previous layer STILL EXISTS
# ↑ The image still contains curl — it's just hidden by the upper layer

Other Gotchas

COPY vs ADD
Prefer COPY over ADD. ADD auto-extracts tarballs and fetches URLs, which can introduce surprising behavior. Use ADD only when you explicitly need those features.
ENV persists into final image
ARG values are only available during build; they don't appear in docker inspect unless echoed into ENV. But if you write ENV SECRET=$BUILD_ARG_SECRET, the secret IS visible in the final image. Use --secret instead.
Use COPY --link for better cache behavior
COPY --link . /app (BuildKit) creates an independent layer that doesn't depend on the previous layer's cache key. This means changing earlier layers doesn't invalidate the COPY layer, improving cache hit rates in some multi-stage builds.
Container time zone
Containers inherit UTC by default. To set timezone: ENV TZ=America/New_York + RUN apt-get install -y tzdata (Debian), or RUN apk add tzdata (Alpine). Or bind-mount: -v /etc/localtime:/etc/localtime:ro.
Quick reference: clean up everything
docker system prune -af --volumes removes all stopped containers, unused images, unused volumes, and unused networks. Use with caution — this deletes all data in named volumes.
Skilark Tech Refreshers · Last updated February 2026 · skilark.com