Docker CLI Cheatsheet

Audience: DevOps engineers, beginner to senior. Every section moves from fundamentals to production-grade usage. Callouts highlight gotchas, senior tips, and production notes.


1. Container Lifecycle

docker run — the swiss-army command

# Syntax
docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

# Named, detached, restart-always, published port
docker run -d --name myapp --restart=always -p 8080:3000 node:20-alpine

# Interactive shell (removes container on exit)
docker run --rm -it ubuntu:24.04 bash

# Resource constraints (CPU + memory)
docker run -d --name api \
  --cpus="1.5" \
  --memory="512m" \
  --memory-swap="512m" \
  node:20-alpine

# Environment variables from file + override
docker run --env-file .env -e NODE_ENV=production myimage

# Named volume + bind mount
docker run -d \
  -v pgdata:/var/lib/postgresql/data \
  -v $(pwd)/init.sql:/docker-entrypoint-initdb.d/init.sql:ro \
  postgres:16

# Network + hostname + DNS alias
docker run -d \
  --network backend-net \
  --network-alias db \
  --hostname postgres-primary \
  postgres:16

# PID namespace sharing (debug another container's processes)
docker run --rm --pid=container:myapp busybox ps aux

# IPC namespace sharing (shared memory, message queues)
docker run -d --ipc=host myapp-with-shared-mem

# Run as specific user
docker run --user 1000:1000 myimage

# Read-only root filesystem (security hardening)
docker run --read-only --tmpfs /tmp --tmpfs /run myimage

# Init process (reaps zombie processes)
docker run --init myimage

| Flag | Purpose | Default |
|---|---|---|
| -d | Detached mode | foreground |
| -it | Interactive + TTY | no |
| --rm | Remove on exit | keep |
| --name | Container name | random |
| --restart | no, on-failure[:n], always, unless-stopped | no |
| --cpus | CPU quota (float) | unlimited |
| --memory | Hard memory limit | unlimited |
| --memory-swap | memory + swap; equal = no swap | 2x memory |
| --env-file | Load env from file | |
| -v | Volume/bind mount | |
| -p | Publish port host:container | |
| --network | Attach to network | bridge |
| --pid | PID namespace | |
| --ipc | IPC namespace | |
| --read-only | Read-only rootfs | writable |
| --init | Run tini as PID 1 | no |

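Gotcha: With --restart=always, a container you stopped with docker stop starts again when the Docker daemon restarts (for example after a host reboot or an Engine upgrade). Use unless-stopped in production so that manually stopped containers stay stopped through maintenance.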

Senior tip: --memory-swap equal to --memory disables swap entirely. This prevents swap thrash under OOM pressure — preferred in Kubernetes-like environments where the scheduler needs predictable memory behavior.

Production note: Always set --cpus and --memory in production. Without limits, a noisy neighbor can starve the host.
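
The production note above can be checked mechanically. The following is a minimal sketch, assuming a POSIX shell; the function name list_unlimited is hypothetical. It relies on the inspect fields .HostConfig.Memory and .HostConfig.NanoCpus, which Docker reports as 0 when no limit is set.

```shell
# list_unlimited: print running containers that lack a memory or CPU limit.
# Hypothetical helper name; a value of 0 in either field means "unlimited".
list_unlimited() {
  ids=$(docker ps -q)
  # Nothing running: nothing to report.
  [ -n "$ids" ] || return 0
  # Word splitting of $ids is intended: one argument per container ID.
  docker inspect --format '{{.Name}} {{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}' $ids |
    while read -r name mem cpus; do
      if [ "$mem" = "0" ] || [ "$cpus" = "0" ]; then
        # Container names are reported with a leading slash; strip it.
        echo "${name#/} memory=$mem nanocpus=$cpus"
      fi
    done
}
```

Calling list_unlimited on a host prints one line per container that still needs --cpus or --memory; an empty result means every running container has both limits.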


Lifecycle Control

# Start / stop / restart
docker start myapp
docker stop myapp                     # SIGTERM → wait 10s → SIGKILL
docker stop --time=30 myapp           # give 30s for graceful shutdown
docker restart --time=5 myapp

# Pause / unpause (uses the cgroup freezer; the processes receive no signal)
docker pause myapp                    # freeze all processes, no CPU usage
docker unpause myapp

# Wait for container to exit, get exit code (docker wait prints it on stdout)
echo "exit: $(docker wait myapp)"

# Rename (works on running containers)
docker rename myapp myapp-v2

# Remove (use -f to force-remove running containers)
docker rm myapp
docker rm -f myapp
docker rm -v myapp                    # also remove anonymous volumes

# Remove all stopped containers
docker container prune
docker container prune --filter until=24h

Gotcha: docker stop sends SIGTERM to PID 1. If PID 1 is a shell script (not exec form), it won’t forward the signal to children. Use exec in scripts or use --init.
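
The exec advice can be sketched as a small entrypoint script. This is an illustration under assumptions, not a drop-in file: the function name run_entrypoint and the APP_ENV default are hypothetical.

```shell
#!/bin/sh
# Hypothetical entrypoint sketch: do setup, then hand the process over to the app.
run_entrypoint() {
  # Setup that must finish before the app starts (illustrative default only).
  : "${APP_ENV:=production}"
  export APP_ENV
  # exec replaces the shell with the command, so the app (not the shell)
  # is the process that receives SIGTERM from docker stop.
  exec "$@"
}

# In a real entrypoint.sh the last line would be:
#   run_entrypoint "$@"
```

Because exec never returns, anything written after it in the script does not run. That is the point: no shell is left sitting between Docker and the application.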

docker update — live resource changes

# Change CPU/memory limits without restart
docker update --cpus="2" --memory="1g" --memory-swap="1g" myapp

# Change restart policy on live container
docker update --restart=unless-stopped myapp

# Apply to multiple containers
docker update --memory="256m" container1 container2

Senior tip: docker update modifies cgroup settings live. Useful in incidents where a container is consuming too much memory and you need to throttle it immediately without a restart.


2. Image Management

docker build

# Basic build
docker build -t myapp:1.0 .

# Build from specific Dockerfile, different context
docker build -f docker/Dockerfile.prod -t myapp:prod ./

# Build args (available at build time only, not runtime)
docker build --build-arg NODE_VERSION=20 --build-arg APP_ENV=prod -t myapp .

# Multi-stage: build only a specific target
docker build --target builder -t myapp:builder .

# Use BuildKit (faster, better cache, secrets support)
DOCKER_BUILDKIT=1 docker build -t myapp .
# or set in daemon.json: "features": {"buildkit": true}

# Cache from a registry image (useful in CI)
docker build --cache-from myregistry/myapp:cache -t myapp:latest .

# Cross-platform build (requires buildx)
docker buildx build --platform linux/amd64,linux/arm64 -t myapp:latest --push .

# Build with secrets (never embedded in layers)
docker build --secret id=mysecret,src=./secret.txt -t myapp .
# In Dockerfile: RUN --mount=type=secret,id=mysecret cat /run/secrets/mysecret

# No-cache build (forces fresh pull of base image too)
docker build --no-cache --pull -t myapp .

# Progress output: auto (default), plain (CI-friendly), tty
docker build --progress=plain -t myapp .

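Senior tip: --cache-from can cut CI build time substantially for large images; the savings depend on how much of the image changes between builds. Pull the last build tag, pass it as cache, and push the result as the new cache. With BuildKit, the cache image must have been built with inline cache metadata (--build-arg BUILDKIT_INLINE_CACHE=1), otherwise it is not used.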
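
The pull, build, push cycle described in the tip might look like the sketch below. The helper name ci_build and the :cache tag are assumptions for illustration. BUILDKIT_INLINE_CACHE=1 is the build arg BuildKit uses to embed cache metadata in the image so that a later --cache-from can use it.

```shell
# ci_build IMAGE TAG: reuse the previous build's layers as cache, then publish both tags.
# Hypothetical helper; the ":cache" tag name is only a convention chosen here.
ci_build() {
  image=$1
  tag=$2
  # A missing cache image (first build) must not fail the pipeline.
  docker pull "$image:cache" || echo "no cache image yet, building cold" >&2
  # Inline cache metadata makes the pushed image usable by --cache-from next time.
  docker build \
    --cache-from "$image:cache" \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    -t "$image:$tag" -t "$image:cache" . || return 1
  docker push "$image:$tag" && docker push "$image:cache"
}
```

A CI job would call it as ci_build myregistry/myapp 1.2.3 after logging in to the registry.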

Gotcha: Build args set via --build-arg are visible in docker history. Never pass secrets as build args — use --secret (BuildKit) instead.

Image Operations

# Pull by tag, or by digest (immutable, reproducible)
docker pull nginx:1.25.3
docker pull nginx@sha256:a484819eb60211f5299034ac80f6a681b06f89e65866ce91f356ed7c72af059c

# Push to registry
docker push myregistry.example.com/myapp:1.0

# Tag (source:tag → target:tag)
docker tag myapp:latest myregistry.example.com/myapp:1.0
docker tag myapp:latest myregistry.example.com/myapp:latest

# List images (with sizes, digests)
docker images
docker images --digests
docker images --filter dangling=false
docker images --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}"

# Remove image
docker rmi myapp:old
docker rmi -f myapp:used-by-stopped-container

# Remove dangling images (untagged, not referenced)
docker image prune
# Remove ALL unused images (not referenced by any container)
docker image prune -a
docker image prune -a --filter "until=72h"

# Save to tarball (preserves layers + tags)
docker save myapp:latest | gzip > myapp.tar.gz
docker save -o myapp.tar myapp:latest myapp:v1

# Load from tarball
docker load < myapp.tar.gz
docker load -i myapp.tar

# Export container filesystem (no layers, no history)
docker export mycontainer | gzip > mycontainer-fs.tar.gz

# Import filesystem as new image
docker import mycontainer-fs.tar.gz myapp:imported

# View image layer history
docker history myapp:latest
docker history --no-trunc myapp:latest   # full commands

# Inspect image (full metadata, env, entrypoint, layers)
docker inspect myapp:latest
docker inspect --format '{{.Config.Env}}' myapp:latest
docker inspect --format '{{json .RootFS.Layers}}' myapp:latest | jq .

Gotcha: docker export/import strips all metadata (ENV, ENTRYPOINT, EXPOSE). Use docker save/load for full image transfer.

Senior tip: Pin images to digest in production. Tags are mutable — nginx:latest today ≠ nginx:latest tomorrow. Use digest to guarantee bit-for-bit reproducibility.
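
To pin by digest you first need the digest. A minimal sketch, assuming the image has been pulled from or pushed to a registry; the helper name image_digest is hypothetical. It reads the RepoDigests field from docker inspect, which is empty for images that exist only locally.

```shell
# image_digest IMAGE: print the registry digest reference (repo@sha256:...) of an image.
# Hypothetical helper name; fails if no registry digest has been recorded.
image_digest() {
  ref=$(docker inspect --format '{{if .RepoDigests}}{{index .RepoDigests 0}}{{end}}' "$1") || return 1
  if [ -z "$ref" ]; then
    echo "no registry digest recorded for $1 (push or pull it first)" >&2
    return 1
  fi
  echo "$ref"
}
```

The printed reference can be used directly in docker run, docker pull, or a FROM line in place of the mutable tag.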

docker manifest — multi-arch awareness

# Inspect multi-arch manifest list
docker manifest inspect node:20-alpine

# Create and push a manifest list (requires experimental CLI)
docker manifest create myregistry/myapp:latest \
  myregistry/myapp:amd64 \
  myregistry/myapp:arm64

docker manifest annotate myregistry/myapp:latest \
  myregistry/myapp:arm64 --os linux --arch arm64

docker manifest push myregistry/myapp:latest

3. Container Inspection & Debugging

docker ps

docker ps                               # running containers
docker ps -a                            # all containers
docker ps -q                            # only IDs (scriptable)
docker ps --filter status=exited
docker ps --filter name=myapp
docker ps --filter ancestor=nginx:1.25
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"

# Kill all running containers
docker kill $(docker ps -q)

# Remove all stopped containers
docker rm $(docker ps -aq -f status=exited)

docker logs

docker logs myapp                        # all logs since start
docker logs -f myapp                     # follow (like tail -f)
docker logs --tail=100 myapp             # last 100 lines
docker logs --since 1h myapp             # last 1 hour
docker logs --since "2024-01-15T10:00:00" myapp
docker logs --until "2024-01-15T11:00:00" myapp
docker logs -t myapp                     # include timestamps
docker logs -f --tail=50 -t myapp        # combined: follow + timestamps + last 50

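Gotcha: docker logs reads directly from the json-file, local, and journald drivers. With remote drivers such as fluentd or syslog, Docker Engine 20.10+ keeps a local cache (dual logging), so docker logs still works; on older Engines, or with the cache disabled, the command fails with an error instead of showing logs.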

docker inspect

# Full JSON metadata
docker inspect myapp

# Specific fields using Go template
docker inspect --format '{{.State.Status}}' myapp
docker inspect --format '{{.NetworkSettings.IPAddress}}' myapp
docker inspect --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' myapp
docker inspect --format '{{json .HostConfig.Binds}}' myapp | jq .
docker inspect --format '{{.Config.Env}}' myapp

# Inspect multiple objects
docker inspect myapp db redis

docker exec

# Interactive shell
docker exec -it myapp bash
docker exec -it myapp sh                 # for alpine-based containers

# Run as specific user
docker exec -it -u root myapp bash
docker exec -it -u www-data myapp bash

# Set environment variable for the exec session
docker exec -it -e DEBUG=1 myapp bash

# Non-interactive command
docker exec myapp cat /etc/hosts
docker exec myapp env | grep DB_

# Working directory
docker exec -w /app myapp ls -la

Senior tip: docker exec -u root is your escape hatch when a container runs as non-root and you need to debug as root. Kubernetes has no direct equivalent: kubectl exec has no user flag and always runs as the container’s configured user.

docker stats, top, diff

# Live resource usage (CPU, MEM, NET I/O, BLOCK I/O)
docker stats
docker stats myapp                       # single container
docker stats --no-stream                 # snapshot (good for scripts)
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"

# Processes inside container
docker top myapp
docker top myapp aux                     # ps flags passthrough

# Filesystem changes since the container was created (relative to its image)
docker diff myapp
# A = Added, C = Changed, D = Deleted

docker cp

# Copy from container to host
docker cp myapp:/app/logs/error.log ./error.log
docker cp myapp:/app/config/ ./config-backup/

# Copy from host to container
docker cp ./new-config.yaml myapp:/app/config/config.yaml

# Works on stopped containers too
docker cp stopped-container:/var/log/app.log .

Production note: docker cp is invaluable for forensics after a crash. Pull logs, configs, or core dumps from stopped containers before they’re removed.

docker events

# Stream real-time events
docker events

# Filter by type
docker events --filter type=container
docker events --filter event=die
docker events --filter event=oom

# Historical events (last 10 minutes)
docker events --since 10m
docker events --since "2024-01-15T10:00:00" --until "2024-01-15T11:00:00"

# Format output
docker events --format '{{.Time}} {{.Actor.Attributes.name}} {{.Action}}'

Senior tip: docker events --filter event=oom is your OOM kill detector. Pipe it to an alerting script in production to get notified when containers are killed by the OOM killer.
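
Piping OOM events to an alerting script could be sketched as follows. The helper name watch_oom and its handler argument are assumptions; the handler is any command or shell function that accepts a container name.

```shell
# watch_oom HANDLER: call HANDLER with the container name for every OOM event.
# Hypothetical helper; blocks for as long as docker events keeps streaming.
watch_oom() {
  handler=$1
  docker events --filter type=container --filter event=oom \
    --format '{{.Actor.Attributes.name}}' |
    while read -r name; do
      "$handler" "$name"
    done
}
```

A handler can be as simple as a function that writes to the system log, for example one that calls logger with the container name, or one that forwards the name to whatever alerting tool is already in place.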

docker port & docker attach

# Show port mappings
docker port myapp
docker port myapp 80

# Attach to container's stdout/stderr (not a shell)
docker attach myapp
# Detach without stopping: Ctrl+P, Ctrl+Q

Gotcha: docker attach connects to PID 1’s stdio. If the container’s entrypoint doesn’t write to stdout/stderr, you’ll see nothing. Use docker exec for interactive shells.


4. Volumes & Storage

Volume Commands

# Create named volume
docker volume create pgdata
docker volume create --driver local \
  --opt type=tmpfs \
  --opt device=tmpfs \
  --opt o=size=100m,uid=1000 \
  myapp-tmpfs

# List volumes
docker volume ls
docker volume ls --filter dangling=true

# Inspect volume (see mountpoint, driver, options)
docker volume inspect pgdata

# Remove specific volume
docker volume rm pgdata

# Remove all unused volumes
docker volume prune
docker volume prune --filter label=environment=dev

Mount Syntax Comparison

| Scenario | -v (short) | --mount (explicit) |
|---|---|---|
| Named volume | -v pgdata:/var/lib/postgresql/data | --mount type=volume,src=pgdata,dst=/var/lib/postgresql/data |
| Bind mount | -v $(pwd)/app:/app | --mount type=bind,src=$(pwd)/app,dst=/app |
| Read-only bind | -v $(pwd)/config:/config:ro | --mount type=bind,src=$(pwd)/config,dst=/config,readonly |
| tmpfs | (not supported) | --mount type=tmpfs,dst=/tmp,tmpfs-size=100m |

Senior tip: Prefer --mount over -v in scripts (and the long volume syntax in Compose files). It’s explicit, self-documenting, and fails loudly when a bind-mount source path doesn’t exist. -v silently creates a missing host directory, and treats a source that doesn’t look like a path as a named volume.

Backup & Restore Patterns

# Backup a named volume to tarball
docker run --rm \
  -v pgdata:/data:ro \
  -v $(pwd):/backup \
  busybox tar czf /backup/pgdata-$(date +%Y%m%d).tar.gz -C /data .

# Restore a named volume from tarball
docker run --rm \
  -v pgdata:/data \
  -v $(pwd):/backup \
  busybox tar xzf /backup/pgdata-20240115.tar.gz -C /data

# Clone a volume
docker run --rm \
  -v pgdata:/source:ro \
  -v pgdata-clone:/dest \
  busybox cp -av /source/. /dest/

Volume Drivers

# NFS volume (production multi-host storage)
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=nfs-server.example.com,rw \
  --opt device=:/exports/mydata \
  nfs-volume

# Amazon EFS via NFS driver
docker volume create \
  --driver local \
  --opt type=nfs4 \
  --opt o=addr=fs-xxx.efs.us-east-1.amazonaws.com,rw \
  --opt device=:/ \
  efs-volume

Production note: Named volumes survive container removal. Bind mounts tie you to the host filesystem path. In production, prefer named volumes for data persistence and bind mounts only for config files and development code.


5. Networking

Network Commands

# Create networks
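docker network create mynet                              # bridge (default)
# Or create it with explicit addressing (an alternative to the line above; network names must be unique)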
docker network create --driver bridge \
  --subnet 172.28.0.0/16 \
  --ip-range 172.28.5.0/24 \
  --gateway 172.28.5.254 \
  mynet

# Overlay network (for Swarm / multi-host)
docker network create --driver overlay --attachable mynet-overlay

# macvlan (container gets real MAC/IP on host network)
docker network create -d macvlan \
  --subnet=192.168.1.0/24 \
  --gateway=192.168.1.1 \
  -o parent=eth0 \
  macvlan-net

# List networks
docker network ls
docker network ls --filter driver=bridge

# Inspect network (see connected containers, IP config)
docker network inspect mynet
docker network inspect --format '{{range .Containers}}{{.Name}}: {{.IPv4Address}}{{"\n"}}{{end}}' mynet

# Connect / disconnect live containers
docker network connect mynet myapp
docker network connect --alias db --ip 172.28.5.100 mynet myapp
docker network disconnect mynet myapp

# Remove network
docker network rm mynet
docker network prune

Network Driver Comparison

| Driver | Use Case | Isolation | Performance |
|---|---|---|---|
| bridge | Default single-host | Container-level | Slight NAT overhead |
| host | Max performance, same IP as host | None | Wire speed |
| overlay | Swarm / multi-host | Container-level | VXLAN overhead |
| macvlan | Containers need L2 on host network | Physical NIC-level | Near wire speed |
| ipvlan | Like macvlan, L3 mode | Subnet-level | Near wire speed |
| none | No networking | Full | N/A |

Senior tip: --network host is the nuclear option for performance. The container shares the host’s network namespace — no NAT, no port mapping overhead. Never use it unless you understand the security implications (no network isolation).

DNS Resolution in Docker

  • Containers on user-defined bridge networks can resolve each other by container name and network alias
  • Default bridge (docker0) does NOT have built-in DNS — use --link (deprecated) or move to user-defined network
  • Docker’s embedded DNS server runs at 127.0.0.11 inside containers
  • DNS search domains are inherited from the host’s /etc/resolv.conf unless overridden with --dns-search

# Container-to-container by name (user-defined bridge)
docker network create appnet
docker run -d --name redis --network appnet redis:7
docker run -d --name api --network appnet myapi
# Inside api container: redis:6379 resolves correctly

# Multiple aliases for service discovery
docker run -d --network appnet --network-alias cache --network-alias session redis:7

Gotcha: Underscores are not valid in DNS hostnames (RFC 1123). Docker’s embedded DNS may still resolve a container named my_app, but some clients and libraries reject such names. Prefer hyphens in container names and network aliases.
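
One way to catch risky names before they reach a deployment is a small validation helper. The sketch below assumes the RFC 1123 rules for a single hostname label (letters, digits, hyphens, no hyphen at either end, at most 63 characters); the function name is_dns_safe is hypothetical.

```shell
# is_dns_safe NAME: succeed if NAME is a valid RFC 1123 hostname label.
# Hypothetical helper; checks a single label, not a dotted FQDN.
is_dns_safe() {
  name=$1
  [ -n "$name" ] && [ "${#name}" -le 63 ] || return 1
  case $name in
    *[!A-Za-z0-9-]*) return 1 ;;  # underscores, dots, and other characters
    -*|*-) return 1 ;;            # hyphen at either end
  esac
  return 0
}
```

In a deploy script this could guard the --name and --network-alias values, for example: is_dns_safe "$NAME" || exit 1.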


6. Docker Compose

Core Commands

# Start all services (build if needed)
docker compose up -d

# Start with build (always rebuild)
docker compose up -d --build

# Start specific services
docker compose up -d api worker

# Scale a service (stateless services only)
docker compose up -d --scale worker=5

# Stop and remove containers (keep volumes)
docker compose down

# Stop and remove containers + volumes
docker compose down -v

# Stop and remove containers + volumes + images
docker compose down -v --rmi all

# Rebuild images without starting
docker compose build
docker compose build --no-cache api

# Pull all images
docker compose pull

# View logs
docker compose logs -f
docker compose logs -f --tail=100 api
docker compose logs --since 30m worker

# List services
docker compose ps

# Execute command in running service
docker compose exec api bash
docker compose exec -u root api bash
docker compose exec api env | grep DB

# Run a one-off command (new container, removed after)
docker compose run --rm api python manage.py migrate
docker compose run --rm --no-deps api bash

# Restart services
docker compose restart
docker compose restart api

# View current config (merged, with env expansion)
docker compose config

# Validate config
docker compose config --quiet && echo "Config OK"

Profiles — Environment Segmentation

# docker-compose.yml
services:
  api:
    image: myapi
  
  worker:
    image: myapi
    command: celery worker
    profiles: [worker]
  
  flower:
    image: mher/flower
    profiles: [monitoring]
  
  jaeger:
    image: jaegertracing/all-in-one
    profiles: [tracing, monitoring]

# Activate profiles on the command line
docker compose --profile worker up -d
docker compose --profile monitoring --profile tracing up -d
# Or via env var:
COMPOSE_PROFILES=worker,monitoring docker compose up -d

depends_on with Healthchecks

services:
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5
      start_period: 10s

  api:
    image: myapi
    depends_on:
      db:
        condition: service_healthy   # wait for healthcheck to pass
      redis:
        condition: service_started   # just wait for start (default)

Gotcha: Without condition: service_healthy, depends_on only waits for the container to start — not for the service inside to be ready. Use healthchecks for databases and message brokers.
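
Outside Compose, for example in a deploy script that uses plain docker run, the same wait can be approximated by polling the health status. A minimal sketch, assuming the container defines a healthcheck; the helper name wait_healthy and the 60-second default are illustrative choices.

```shell
# wait_healthy CONTAINER [TIMEOUT_SECONDS]: poll until the health status is "healthy".
# Hypothetical helper; returns non-zero on "unhealthy" or on timeout.
wait_healthy() {
  container=$1
  timeout=${2:-60}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # A failed inspect (container not created yet) is treated as "not ready".
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null) || status=unknown
    case $status in
      healthy) return 0 ;;
      unhealthy) echo "$container is unhealthy" >&2; return 1 ;;
    esac
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "timed out waiting for $container" >&2
  return 1
}
```

Typical use is wait_healthy db 30 && docker run ... so that the dependent container only starts once the database reports healthy.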

Override Files & Environment Files

# docker-compose.yml            (base config)
# docker-compose.override.yml   (auto-loaded in dev)
# docker-compose.prod.yml       (explicit for production)

docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d

# Environment file
docker compose --env-file .env.prod up -d

# docker-compose.prod.yml (override example)
services:
  api:
    image: myregistry/myapi:${TAG}
    restart: unless-stopped
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
    logging:
      driver: fluentd
      options:
        fluentd-address: "localhost:24224"
        tag: "app.api"

Secrets & Configs (Compose v3.1+)

services:
  api:
    image: myapi
    secrets:
      - db_password
    configs:
      - source: nginx_config
        target: /etc/nginx/nginx.conf

secrets:
  db_password:
    file: ./secrets/db_password.txt

configs:
  nginx_config:
    file: ./nginx.conf

Production note: In Docker Swarm mode, secrets are stored in the Raft log encrypted at rest. In standalone Compose, they’re mounted as files — still better than env vars because they don’t appear in docker inspect env output.


7. Registry & Distribution

Login & Authentication

# Login to Docker Hub
docker login

# Login to private registry
docker login myregistry.example.com
docker login myregistry.example.com -u myuser -p mypassword  # not recommended

# Login with stdin (CI-friendly, avoids shell history)
echo "$REGISTRY_PASSWORD" | docker login -u "$REGISTRY_USER" --password-stdin myregistry.example.com

# Logout
docker logout myregistry.example.com

# View stored credentials
cat ~/.docker/config.json

Production note: Never use -p flag in shell scripts — the password appears in ps output and shell history. Use --password-stdin with environment variables.

docker buildx — Multi-Platform Builds

# Create a builder with multi-platform support
docker buildx create --name multiarch --driver docker-container --bootstrap
docker buildx use multiarch

# List builders
docker buildx ls

# Build and push multi-arch image in one step
docker buildx build \
  --platform linux/amd64,linux/arm64,linux/arm/v7 \
  --tag myregistry/myapp:latest \
  --push \
  .

# Build locally for testing (single platform, loads into docker)
docker buildx build --platform linux/amd64 --load -t myapp:test .

# Inspect builder (check supported platforms)
docker buildx inspect multiarch --bootstrap

# Bake — multi-service buildx with a config file
docker buildx bake --file docker-bake.hcl --push

Senior tip: docker buildx build --push bypasses the local Docker image store entirely. Images go directly to the registry. Use --load to pull the result back locally for testing, but only for a single platform.

Image Signing with cosign

# Install cosign (https://github.com/sigstore/cosign)
# Sign an image after push
cosign sign myregistry/myapp:latest

# Verify signature
cosign verify myregistry/myapp:latest

# Sign with a key file
cosign sign --key cosign.key myregistry/myapp:latest
cosign verify --key cosign.pub myregistry/myapp:latest

8. System & Cleanup

Disk Space Management

# Show Docker disk usage breakdown
docker system df
docker system df -v   # verbose: per-image, per-volume, per-build-cache

# Prune levels (least to most aggressive)

# 1. Dangling images only (untagged, not referenced by any container)
docker image prune

# 2. All unused images (not used by any running/stopped container)
docker image prune -a

# 3. Stopped containers
docker container prune

# 4. Unused networks
docker network prune

# 5. Unused volumes (DESTRUCTIVE — check before running)
docker volume prune

# 6. Build cache
docker builder prune
docker builder prune --keep-storage 5GB

# 7. Nuclear: everything at once
docker system prune
docker system prune -a          # includes unused images
docker system prune -a -f       # no confirmation prompt
docker system prune -a --volumes  # includes volumes (very destructive)

# Filtered pruning
docker container prune --filter until=24h
docker image prune -a --filter until=72h
docker image prune --filter label=stage=builder

| Command | What it removes | Danger Level |
|---|---|---|
| docker image prune | Dangling images | Low |
| docker image prune -a | All unused images | Medium |
| docker container prune | Stopped containers | Low |
| docker volume prune | Unused volumes | High (data loss) |
| docker network prune | Unused networks | Low |
| docker builder prune | Build cache | Low |
| docker system prune -a --volumes | Everything | Nuclear |

Production note: Run docker system df before prune to understand what will be freed. Schedule docker image prune -a --filter until=72h as a cron job to prevent disk exhaustion in CI environments.

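Gotcha: docker volume prune removes volumes that no container references, running or stopped. A volume used by a stopped container is safe, but as soon as that container is removed (docker rm, docker compose down, docker container prune) the volume becomes eligible. Since Docker 23.0, prune only removes anonymous volumes unless you pass --all. Use docker volume prune --filter label=... to be surgical.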
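
Before pruning, it helps to see exactly which volumes are candidates. The following is a minimal sketch, assuming a POSIX shell; the helper name prune_preview is hypothetical. It only lists and inspects, and deletes nothing.

```shell
# prune_preview: list volumes that no container references, with their labels.
# Hypothetical helper name; review the output before running docker volume prune.
prune_preview() {
  docker volume ls -q --filter dangling=true |
    while read -r vol; do
      # Labels help decide whether a volume is safe to drop.
      labels=$(docker volume inspect --format '{{json .Labels}}' "$vol" </dev/null)
      echo "$vol labels=$labels"
    done
}
```

If a volume that holds real data shows up in the list, recreate or start the container that owns it, or back it up, before running any prune command.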


9. Docker Context & Remote Management

# List contexts
docker context ls

# Create SSH context (connect to remote Docker daemon)
docker context create remote-prod \
  --docker "host=ssh://deploy@prod.example.com"

# Use a context
docker context use remote-prod
docker context use default

# Run a command in a specific context without switching
docker --context remote-prod ps

# Create TLS context (direct TCP + certificates)
docker context create remote-tls \
  --docker "host=tcp://prod.example.com:2376,ca=/path/to/ca.pem,cert=/path/to/cert.pem,key=/path/to/key.pem"

# DOCKER_HOST env override (ad-hoc, no context needed)
export DOCKER_HOST="ssh://deploy@prod.example.com"
docker ps  # runs against remote host
unset DOCKER_HOST

# Remove context
docker context rm remote-prod

Senior tip: SSH contexts are the safest way to manage remote Docker daemons. No need to expose Docker’s TCP socket — SSH handles auth. The deploy user just needs to be in the docker group on the remote host.

Gotcha: Exposing Docker’s TCP socket without TLS (-H tcp://0.0.0.0:2375) gives anyone who can reach the port root-equivalent control of the host. Never do this on internet-facing hosts; if you must use TCP, use port 2376 with TLS client certificates.


10. Dockerfile Best Practices

Layer Caching Strategy

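# BAD: invalidates cache on any source change
FROM node:20-alpine
COPY . .
RUN npm install

# GOOD: copy package files first, install deps, then copy source
# (Dockerfile comments must be on their own line; a trailing # is parsed as an argument)
FROM node:20-alpine
WORKDIR /app
# Only invalidated when dependencies change
COPY package*.json ./
# --omit=dev replaces the deprecated --only=production on recent npm versions
RUN npm ci --omit=dev
# A source change only rebuilds from this layer on
COPY . .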

Multi-Stage Builds

# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: Production runtime (tiny image)
FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package.json .
# USER, EXPOSE, and CMD belong to the stage that ships
USER node
EXPOSE 3000
CMD ["node", "dist/server.js"]

# Stage 3: Tests (never shipped to production)
FROM builder AS test
RUN npm test

# Build only the runtime stage (the test stage is not built)
docker build --target runtime -t myapp:prod .

Signal Handling — ENTRYPOINT vs CMD

# Shell form: PID 1 is /bin/sh, signals not forwarded to app
# DON'T: signals are lost
CMD node server.js

# Exec form: PID 1 is the app itself, signals are delivered directly
# DO: SIGTERM reaches node
CMD ["node", "server.js"]

# ENTRYPOINT + CMD pattern (allows argument override)
# docker run myapp worker.js overrides CMD
ENTRYPOINT ["node"]
CMD ["server.js"]

Gotcha: Shell form (CMD node server.js) wraps the command in /bin/sh -c. The shell becomes PID 1 and typically does not forward signals, so your app never sees SIGTERM and Docker waits the full stop timeout before sending SIGKILL. Prefer exec form.

ARG vs ENV

# Build-time only, not in the final image env
ARG NODE_ENV=production
# Runtime env, set from the ARG
ENV NODE_ENV=$NODE_ENV

ARG APP_VERSION
# Bake the version into image metadata
LABEL version=$APP_VERSION

Gotcha: ARG values appear in docker history. ENV values appear in docker inspect. Neither is secure for secrets. Use --secret (BuildKit) for credentials needed at build time.

Security Best Practices

# Non-root user
FROM node:20-alpine
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app
COPY --chown=appuser:appgroup . .
USER appuser

# Minimal image with HEALTHCHECK
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD wget -qO- http://localhost:3000/health || exit 1

# Signal configuration
STOPSIGNAL SIGTERM

.dockerignore essentials (always include one):

.git
.gitignore
node_modules
npm-debug.log
Dockerfile*
docker-compose*
.env
.env.*
*.md
coverage/
.nyc_output
.DS_Store

Senior tip: HEALTHCHECK in Dockerfile gives you health status in docker ps and Compose depends_on: condition: service_healthy. It also integrates with Swarm and some orchestrators for auto-replacement of unhealthy containers.


11. Production Patterns

Logging Drivers

# json-file (default) with rotation
docker run -d \
  --log-driver json-file \
  --log-opt max-size=100m \
  --log-opt max-file=3 \
  myapp

# Forward to syslog
docker run -d \
  --log-driver syslog \
  --log-opt syslog-address=udp://logserver:514 \
  --log-opt tag="{{.Name}}" \
  myapp

# Forward to fluentd
docker run -d \
  --log-driver fluentd \
  --log-opt fluentd-address=localhost:24224 \
  --log-opt tag="app.{{.Name}}" \
  myapp

# AWS CloudWatch Logs
docker run -d \
  --log-driver awslogs \
  --log-opt awslogs-region=us-east-1 \
  --log-opt awslogs-group=/app/prod \
  --log-opt awslogs-stream=api-$(hostname) \
  myapp

Production note: Set log rotation (max-size, max-file) on every container. Without it, the json-file driver fills your disk silently over time — especially on high-traffic services.

Healthchecks in docker run

docker run -d \
  --health-cmd="wget -qO- http://localhost:3000/health || exit 1" \
  --health-interval=30s \
  --health-timeout=10s \
  --health-start-period=30s \
  --health-retries=3 \
  myapp

Init Process — Zombie Reaping

# Use Docker's built-in init (tini)
docker run --init myapp

# Or embed tini in Dockerfile
FROM alpine:3.19
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["myapp"]

Senior tip: --init adds tini as PID 1. Tini properly reaps zombie processes (wait4 syscall) and forwards signals. Essential for containers that spawn child processes (e.g., Celery workers, shell scripts that fork).

Read-Only Root Filesystem

docker run -d \
  --read-only \
  --tmpfs /tmp:rw,size=64m,noexec \
  --tmpfs /var/run:rw,size=10m \
  -v app-logs:/var/log/app \
  myapp

Production note: --read-only prevents attackers from writing persistent backdoors to the container filesystem. Combine with --tmpfs for directories that legitimately need writes (temp files, PIDs, locks).

Secrets Management in Production

# docker secret (Swarm mode only)
# printf avoids the trailing newline that echo would store as part of the secret
printf '%s' "supersecret" | docker secret create db_password -
docker service create \
  --secret db_password \
  --env DB_PASSWORD_FILE=/run/secrets/db_password \
  myapp
# In app: read from /run/secrets/db_password file path

# External secrets: HashiCorp Vault, AWS SSM, GCP Secret Manager
# Inject at container start via entrypoint script, not as env vars

Complete Production docker-compose.yml Pattern

# The top-level "version" key is obsolete in Compose v2 (docker compose) and is omitted here

services:
  api:
    image: myregistry/myapi:${TAG:-latest}
    restart: unless-stopped
    init: true
    read_only: true
    tmpfs:
      - /tmp:size=64m,noexec
    user: "1000:1000"
    ports:
      - "127.0.0.1:3000:3000"   # bind to loopback only, use reverse proxy
    environment:
      NODE_ENV: production
      LOG_LEVEL: info
    env_file:
      - .env.prod
    secrets:
      - db_password
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 128M
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
    logging:
      driver: json-file
      options:
        max-size: "50m"
        max-file: "3"
    depends_on:
      db:
        condition: service_healthy
    networks:
      - frontend   # non-internal; ports published by a service on internal-only networks are not reachable
      - backend

  db:
    image: postgres:16-alpine
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5
      start_period: 15s
    deploy:
      resources:
        limits:
          memory: 1G
    logging:
      driver: json-file
      options:
        max-size: "20m"
        max-file: "3"
    networks:
      - backend

volumes:
  pgdata:
    external: true   # must be created before: docker volume create pgdata

secrets:
  db_password:
    file: ./secrets/db_password.txt

networks:
  frontend:
    driver: bridge   # carries the port published on 127.0.0.1:3000
  backend:
    driver: bridge
    internal: true   # no external internet access from this network

Senior tip: Binding ports to 127.0.0.1 (e.g., 127.0.0.1:3000:3000) prevents direct external access. Traffic routes through your reverse proxy (Nginx, Caddy, Traefik). This is a fundamental defense-in-depth measure.

Production note: Use external: true for volumes in production Compose files. This prevents docker compose down -v from accidentally destroying your data. You have to be intentional about volume deletion.


Quick Reference: Most-Used Commands

# Get a shell inside a running container
docker exec -it <name> bash

# Follow logs with timestamps
docker logs -ft <name>

# One-liner: remove all stopped containers
docker container prune -f

# One-liner: remove all unused images
docker image prune -af

# Inspect IP of a container
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' <name>

# Watch resource usage
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}"

# Rebuild and restart a single Compose service
docker compose up -d --build --no-deps api

# Force-recreate without rebuild
docker compose up -d --force-recreate api

# Copy DB backup from container
docker exec db pg_dump -U postgres mydb | gzip > db-$(date +%Y%m%d).sql.gz

# Run a disposable alpine debug container on same network
docker run --rm -it --network <net> nicolaka/netshoot