Container Runtime Ecosystem
Docker is not the only container runtime. Understanding the ecosystem — containerd, runc, CRI-O, Podman — is essential for principal-level roles because you need to make infrastructure decisions about which runtime to use and why.
The Runtime Stack
┌──────────────────────────────────────────────────┐
│ User Interface                                   │
│ docker CLI / nerdctl / podman / crictl / kubectl │
└─────────────┬────────────────────────────────────┘
              │
┌─────────────┴────────────────────────────────────┐
│ High-Level Runtime (manages images, networking)  │
│ dockerd / containerd / CRI-O / Podman            │
└─────────────┬────────────────────────────────────┘
              │ OCI Runtime Spec
┌─────────────┴────────────────────────────────────┐
│ Low-Level Runtime (creates the actual container) │
│ runc / crun / gVisor / Kata Containers           │
└─────────────┬────────────────────────────────────┘
              │ Linux syscalls
┌─────────────┴────────────────────────────────────┐
│ Linux Kernel (namespaces, cgroups, seccomp)      │
└──────────────────────────────────────────────────┘
ELI5: Think of it like a restaurant. The CLI is the waiter (takes your order). The high-level runtime is the kitchen manager (coordinates everything). The low-level runtime is the actual chef (does the cooking). The kernel is the kitchen equipment (oven, stove). You can swap out the waiter, manager, or chef independently — as long as they all follow the same recipes (OCI spec).
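The "OCI Runtime Spec" arrow in the diagram is concrete: the high-level runtime prepares a bundle (a root filesystem plus a `config.json`), and the low-level runtime reads it. Below is a minimal sketch of such a config, written in Python for illustration. The field names follow the OCI runtime spec; the helper `minimal_oci_config` and every value in it are hypothetical, and the result is far from a complete bundle.

```python
import json


def minimal_oci_config(args, hostname="demo"):
    """Return a heavily trimmed OCI runtime config (illustrative, not complete)."""
    return {
        "ociVersion": "1.0.2",
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": list(args),  # the command the container will run
            "env": ["PATH=/usr/sbin:/usr/bin:/sbin:/bin"],
            "cwd": "/",
        },
        # Path is relative to the bundle directory that holds config.json.
        "root": {"path": "rootfs", "readonly": True},
        "hostname": hostname,
        "linux": {
            # Each entry asks the low-level runtime for a fresh namespace.
            "namespaces": [
                {"type": "pid"},
                {"type": "network"},
                {"type": "ipc"},
                {"type": "uts"},
                {"type": "mount"},
            ],
        },
    }


config = minimal_oci_config(["/bin/sh", "-c", "echo hello"])
print(json.dumps(config, indent=2))
```

On a real host, `runc spec` writes a fuller template of this file. Because every OCI-compliant low-level runtime consumes the same format, the "chef" can be swapped without changing the "kitchen manager".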
Docker Engine Architecture
docker CLI → dockerd (Docker daemon) → containerd → containerd-shim → runc
| Component | Role | Process |
|---|---|---|
| docker CLI | User interface, sends API calls | docker |
| dockerd | Docker daemon, API server, builds, networking, volumes | dockerd |
| containerd | Container lifecycle management, image management | containerd |
| containerd-shim | Keeps container alive if containerd restarts | containerd-shim-runc-v2 |
| runc | Creates the container (namespaces, cgroups, exec) | runc (exits after start) |
Key insight: runc starts the container and then exits. The containerd-shim becomes the container's parent process and keeps it running. This means containerd can be restarted without killing running containers. The same holds for dockerd only when live restore is enabled ("live-restore": true in daemon.json); by default, stopping dockerd also stops its containers.
Think of it this way: runc is like the person who lights a candle (creates the container process) and leaves once it is lit. The containerd-shim is the candleholder (keeps it running). If the person who asked for the candle leaves the room (containerd restarts), the candle stays lit.
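The "start it, then exit" pattern can be tried with ordinary processes. This is a minimal sketch by analogy, assuming a POSIX system. It is not how runc is implemented: there are no namespaces or cgroups, and the shim's role as the new parent is not modeled. The names `LAUNCHER` and `is_alive` are made up for this example.

```python
import os
import signal
import subprocess
import sys
import textwrap

# Plays runc's part: start the workload, report its PID, then exit immediately.
LAUNCHER = textwrap.dedent('''
    import subprocess, sys
    workload = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # detach from the launcher's session
    )
    print(workload.pid, flush=True)
''')


def is_alive(pid):
    """Signal 0 tests whether a process exists without sending a real signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    return True


# subprocess.run returns only after the launcher process has exited.
result = subprocess.run(
    [sys.executable, "-c", LAUNCHER], capture_output=True, text=True, check=True
)
workload_pid = int(result.stdout.strip())
launcher_exited = result.returncode == 0
workload_alive = is_alive(workload_pid)
print(f"launcher exited: {launcher_exited}, workload still running: {workload_alive}")

os.kill(workload_pid, signal.SIGTERM)  # clean up the demo workload
```

The workload outlives the process that created it, which is the property that lets the layers above the shim restart independently.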
containerd
The industry-standard container runtime. Used by Docker, Kubernetes, and most cloud providers.
What containerd does:
- Image pull/push/management
- Container lifecycle (create, start, stop, delete)
- Content-addressable storage for images
- Snapshot management (filesystem layers)
- Task management (running processes)
- Pod networking through CNI plugins (invoked by its CRI plugin; containerd's core does not configure networks itself)
What containerd does NOT do:
- Build images (use BuildKit, kaniko, etc.)
- Docker Compose (use nerdctl for Compose-like experience)
- Swarm orchestration
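"Content-addressable storage" in the list above means each blob (layer, config, manifest) is stored under the hash of its own bytes. A toy sketch of the idea follows. `ContentStore` is a hypothetical in-memory stand-in; containerd's real store lives on disk and has a different API.

```python
import hashlib


def digest(blob: bytes) -> str:
    """Return an OCI-style digest string for a blob."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


class ContentStore:
    """Toy in-memory store keyed by digest (illustration only)."""

    def __init__(self):
        self._blobs = {}

    def put(self, blob: bytes) -> str:
        key = digest(blob)
        self._blobs.setdefault(key, blob)  # identical content is stored once
        return key

    def get(self, key: str) -> bytes:
        blob = self._blobs[key]
        if digest(blob) != key:  # integrity check on read
            raise ValueError("blob does not match its digest")
        return blob

    def __len__(self):
        return len(self._blobs)


store = ContentStore()
first = store.put(b"shared base layer")
second = store.put(b"shared base layer")  # e.g. a second image on the same base
print(first == second, len(store))
```

Two images built on the same base layer produce the same digest, so the layer is stored and pulled only once.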
# Direct containerd interaction via ctr
ctr images pull docker.io/library/nginx:latest
ctr run docker.io/library/nginx:latest my-nginx
# nerdctl: Docker-compatible CLI for containerd
nerdctl run -d -p 80:80 nginx
nerdctl compose up # Docker Compose compatible!
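Note the difference in the commands above: `ctr` is given a fully qualified reference, while `nerdctl` (like `docker`) accepts the short name `nginx` and expands it. A simplified sketch of that expansion, following Docker's naming conventions, is shown below. `normalize_image_ref` is a hypothetical helper, not the actual implementation, and it ignores edge cases such as uppercase hostnames.

```python
def normalize_image_ref(ref: str) -> str:
    """Expand a Docker-style short name to a fully qualified reference."""
    name, _, digest = ref.partition("@")
    first, slash, rest = name.partition("/")
    # The first path component is a registry host only if it looks like one.
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = "docker.io", name
    if registry == "docker.io" and "/" not in path:
        path = "library/" + path  # official images live under library/
    # A tag is a colon after the last slash; default to "latest" when
    # neither a tag nor a digest was given.
    if not digest and ":" not in path.rsplit("/", 1)[-1]:
        path += ":latest"
    return f"{registry}/{path}" + (f"@{digest}" if digest else "")


for short in ["nginx", "nginx:1.25", "bitnami/redis", "localhost:5000/app"]:
    print(short, "->", normalize_image_ref(short))
```

This is why `ctr images pull nginx` fails where `docker pull nginx` succeeds: lower-level tools expect the caller to have resolved the name already.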
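Why this matters: Kubernetes removed dockershim in v1.24, so the kubelet now talks only to CRI runtimes such as containerd or CRI-O. If your K8s cluster used Docker Engine as its runtime, it has most likely moved to containerd (the cri-dockerd adapter exists for clusters that must keep Docker Engine). Understanding containerd is understanding what runs your K8s pods.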
CRI-O
A lightweight container runtime purpose-built for Kubernetes. Nothing else.
| Feature | containerd | CRI-O |
|---|---|---|
| Primary use | General purpose + K8s | K8s only |
| Image building | No (BuildKit can run on top of it) | No |
| Docker compatibility | Partial (via nerdctl) | None — K8s only |
| Scope | Broad (embeddable, extensible) | Narrow (CRI implementation) |
| Used by | Docker, K8s (EKS, GKE, AKS) | OpenShift (Red Hat), K8s |
| OCI compliant | Yes | Yes |
Decision framework: Use containerd for most Kubernetes clusters — it’s the default on EKS, GKE, and AKS. Use CRI-O if you’re on OpenShift or want the most minimal runtime possible (fewer features = smaller attack surface).
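Whichever runtime a node uses, the kubelet and `crictl` reach it over a CRI socket. The sketch below reports which runtime a node appears to be running. It assumes the commonly used default socket paths, which distributions may relocate, and `detect_cri_runtimes` is a hypothetical helper.

```python
import os

# Commonly used default endpoints (assumption: your distribution may differ).
DEFAULT_CRI_SOCKETS = {
    "containerd": "/run/containerd/containerd.sock",
    "cri-o": "/var/run/crio/crio.sock",
}


def detect_cri_runtimes(sockets=DEFAULT_CRI_SOCKETS, exists=os.path.exists):
    """Return {runtime: endpoint} for every socket path that is present."""
    return {
        name: f"unix://{path}"
        for name, path in sockets.items()
        if exists(path)
    }


print(detect_cri_runtimes() or "no known CRI socket found")
```

The returned endpoint is what you would pass to `crictl --runtime-endpoint` when debugging a node.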
Podman, Buildah, Skopeo
Red Hat’s container toolkit. Three separate tools replacing Docker’s monolith.
| Tool | Replaces | Key Feature |
|---|---|---|
| Podman | docker run/stop/exec | Daemonless, rootless by default |
| Buildah | docker build | Fine-grained image building, no daemon |
| Skopeo | docker pull/push/inspect | Inspects and copies images without a daemon or local image store |
Podman vs Docker
| Feature | Docker | Podman |
|---|---|---|
| Architecture | Client-server (dockerd daemon) | Daemonless (fork-exec) |
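| Root required | Yes by default: the daemon runs as root (rootless mode available) | No: rootless by default |
| Systemd integration | None for containers; dockerd itself runs as a systemd service | Native: Quadlet unit files, or the older podman generate systemd (deprecated) |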
| Pod concept | No native pods | Yes — groups containers like K8s pods |
| CLI compatibility | Original | Drop-in compatible (alias docker=podman) |
| Compose | Docker Compose | podman compose (a wrapper that runs an external provider such as docker-compose or podman-compose) |
| Socket | /var/run/docker.sock (root) | Per-user socket (rootless) |
ELI5: Docker is like a hotel with a 24/7 concierge (daemon) — everything goes through them. Podman is like Airbnb — no middleman, you just walk in and start using the place. No daemon means no single point of failure, but also means no persistent background service managing your containers.
When to use Podman over Docker:
- Security-conscious environments (rootless by default, no daemon socket to protect)
- RHEL/CentOS/Fedora environments (native support)
- Systemd integration (generate and manage services directly)
- When you need pod-level grouping without Kubernetes
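The socket row in the comparison is where the security difference becomes tangible: Docker exposes one root-owned socket, while rootless Podman offers an optional per-user one. The sketch below computes the `DOCKER_HOST` value a Docker-compatible client would use in each case. The paths are the usual defaults, and `docker_host_for` is a hypothetical helper.

```python
def docker_host_for(tool: str, uid: int, xdg_runtime_dir=None) -> str:
    """Return the API socket URL a Docker-compatible client would target."""
    if tool == "docker":
        # One system-wide socket; access to it is effectively root access.
        return "unix:///var/run/docker.sock"
    if tool == "podman":
        if uid == 0:
            return "unix:///run/podman/podman.sock"
        # Rootless: the socket lives in the user's own runtime directory.
        runtime_dir = xdg_runtime_dir or f"/run/user/{uid}"
        return f"unix://{runtime_dir}/podman/podman.sock"
    raise ValueError(f"unknown tool: {tool}")


print(docker_host_for("docker", uid=0))
print(docker_host_for("podman", uid=1000))
```

Podman's socket is opt-in (for example via `systemctl --user enable --now podman.socket`), so a host that never enables it has no API socket to protect at all.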
Skopeo (Image Operations)
# Inspect image without pulling
skopeo inspect docker://docker.io/library/nginx:latest
# Copy between registries without local pull
skopeo copy docker://source-registry/myapp:v1 docker://dest-registry/myapp:v1
# Sync entire repository
skopeo sync --src docker --dest docker source-registry/myapp dest-registry/
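Why Skopeo matters: In CI/CD, you often need to copy images between registries (dev → staging → prod). With Docker, you'd pull into the daemon's local image store and then push, which needs a running daemon and disk space for potentially GBs of layers on the CI runner. Skopeo streams blobs from one registry to the other without a daemon and without writing the image to local storage; the data still passes through the runner's network.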
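A promotion pipeline is just one `skopeo copy` per hop. The sketch below builds those commands without running them. The registry hostnames are placeholders, and `promotion_commands` is a hypothetical helper.

```python
def promotion_commands(repo, tag, registries):
    """Build one `skopeo copy` argv per hop, e.g. dev -> staging -> prod."""
    commands = []
    # Pair each registry with the next one in the promotion chain.
    for src, dst in zip(registries, registries[1:]):
        commands.append([
            "skopeo", "copy",
            f"docker://{src}/{repo}:{tag}",
            f"docker://{dst}/{repo}:{tag}",
        ])
    return commands


chain = ["dev.registry.example", "staging.registry.example", "prod.registry.example"]
for argv in promotion_commands("myapp", "v1", chain):
    print(" ".join(argv))
```

Each argv list can be handed to `subprocess.run(argv, check=True)` in a CI job. Building the commands as lists rather than strings avoids shell-quoting problems with unusual tags.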
Sandboxed Runtimes
Standard containers share the host kernel. For stronger isolation:
| Runtime | Isolation | Overhead | Use Case |
|---|---|---|---|
| runc | Namespaces + cgroups | Minimal | Default, most workloads |
| gVisor (runsc) | User-space kernel | Moderate (syscall interception) | Untrusted workloads, multi-tenant |
| Kata Containers | Lightweight VM per container | Higher (VM boot) | Strong isolation, compliance |
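| Firecracker | MicroVM (a VMM, typically used under Kata or firecracker-containerd rather than as an OCI runtime itself) | Very low for a VM (boots in as little as 125 ms) | AWS Lambda, serverless |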
Decision framework: Trust the code? → runc (default). Running untrusted user code (like a code execution platform)? → gVisor. Need VM-level isolation for compliance? → Kata. Building a serverless platform? → Firecracker.
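The decision framework above can be written down as a small function, which is handy for encoding it in platform tooling or policy checks. This is one possible reading: the precedence between the questions is an assumption, and `choose_runtime` is a hypothetical helper.

```python
def choose_runtime(trusted_code, needs_vm_isolation=False, serverless_platform=False):
    """Map the decision framework's questions to a runtime name."""
    # Assumed precedence: platform type first, then compliance, then trust.
    if serverless_platform:
        return "firecracker"
    if needs_vm_isolation:
        return "kata"
    if not trusted_code:
        return "gvisor"
    return "runc"


print(choose_runtime(trusted_code=True))
print(choose_runtime(trusted_code=False))
print(choose_runtime(trusted_code=True, needs_vm_isolation=True))
```

In an interview, being able to state the order in which you would ask these questions matters more than the function itself.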
# Kubernetes: use gVisor RuntimeClass
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
---
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-pod
spec:
  runtimeClassName: gvisor
  containers:
    - name: untrusted-app
      image: user-submitted-code:latest
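A Pod only gets the sandbox if a RuntimeClass with the referenced name exists (and the node's runtime has the matching handler configured). The sketch below is a pre-deploy check over manifests that have already been parsed into dictionaries. `missing_runtime_classes` is a hypothetical helper, and the data mirrors the two objects above.

```python
def missing_runtime_classes(manifests):
    """Return {pod name: runtimeClassName} for Pods naming an undefined RuntimeClass."""
    defined = {
        m["metadata"]["name"] for m in manifests if m.get("kind") == "RuntimeClass"
    }
    missing = {}
    for m in manifests:
        if m.get("kind") != "Pod":
            continue
        wanted = m.get("spec", {}).get("runtimeClassName")
        if wanted is not None and wanted not in defined:
            missing[m["metadata"]["name"]] = wanted
    return missing


manifests = [
    {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": "gvisor"},
        "handler": "runsc",
    },
    {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "sandboxed-pod"},
        "spec": {
            "runtimeClassName": "gvisor",
            "containers": [
                {"name": "untrusted-app", "image": "user-submitted-code:latest"}
            ],
        },
    },
]
print(missing_runtime_classes(manifests))
```

The check only covers the manifests it is given; a RuntimeClass that already exists in the cluster but not in the input would be reported as missing.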
Key Takeaways for Interviews
- “Docker vs containerd?” → Docker uses containerd under the hood. K8s dropped dockershim and talks to containerd directly via CRI. containerd is the actual runtime; Docker adds build, compose, swarm on top.
- “Why did K8s remove Docker?” → K8s only needs CRI. Docker doesn’t implement CRI — it needed dockershim as a translator. Removing the shim simplifies the stack. Your containers still work — they’re OCI images running on containerd.
- “Podman vs Docker?” → Podman is daemonless + rootless by default. Drop-in Docker CLI replacement. Better security posture. Use on RHEL/Fedora or when daemon socket is a security concern.
- “When would you use gVisor?” → Running untrusted code (SaaS platforms, code execution services, multi-tenant environments). Adds a user-space kernel that intercepts syscalls — stronger isolation than namespaces but with performance cost.
- “Explain the OCI spec” → Image spec (how images are packaged), Runtime spec (how containers are run), Distribution spec (how registries work). Ensures any OCI image runs on any OCI runtime.
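For the image spec in particular, it helps to have seen what a manifest looks like: a small JSON document pointing at a config blob and layer blobs by digest. The sketch below builds one. The media types are the ones defined by the OCI image spec; the blobs are placeholder bytes rather than real gzipped tar layers, and `descriptor` and `image_manifest` are hypothetical helpers.

```python
import hashlib
import json


def descriptor(media_type, blob):
    """Describe a blob the way OCI manifests reference content."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


def image_manifest(config_blob, layer_blobs):
    """Assemble an OCI image manifest for one config and an ordered list of layers."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": descriptor("application/vnd.oci.image.config.v1+json", config_blob),
        "layers": [
            descriptor("application/vnd.oci.image.layer.v1.tar+gzip", blob)
            for blob in layer_blobs
        ],
    }


manifest = image_manifest(b"{}", [b"placeholder layer 1", b"placeholder layer 2"])
print(json.dumps(manifest, indent=2))
```

Because the manifest identifies everything by digest, any registry that follows the distribution spec can serve it and any compliant runtime can verify what it pulled.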