Container Networking

Network issues are among the most common production pain points for containers. If you don’t understand how container networking works at the iptables/veth level, you’ll spend hours debugging problems that take minutes once you see the architecture.

Docker Network Drivers

Driver	Scope	Use Case	Container-to-Container	External Access
bridge	Single host	Default. Dev/test.	Via bridge IP	Port mapping (-p)
host	Single host	Max network performance	Via localhost	Direct (no NAT)
none	Single host	Fully isolated	No networking	None
overlay	Multi-host	Swarm clusters	Across hosts via VXLAN	Via ingress/load balancer
macvlan	Single host	Legacy integration	Each container gets real MAC	Direct on physical network
ipvlan	Single host	Like macvlan, shared MAC	Shares host MAC address	Direct on physical network

ELI5: Bridge is like a home WiFi router — all your devices connect to it and share one public IP. Host is like plugging directly into the modem — fastest, but no isolation. Overlay is like a VPN — containers on different machines talk as if they’re on the same network. Macvlan gives each container its own “phone number” on the physical network.

Bridge Network (Default)

Every Docker installation creates a docker0 bridge. When you run a container without specifying a network, it connects here.

┌─────────────┐  ┌─────────────┐
│ Container A  │  │ Container B  │
│  eth0        │  │  eth0        │
│  172.17.0.2  │  │  172.17.0.3  │
└──────┬───────┘  └──────┬───────┘
       │ veth            │ veth
       │                 │
┌──────┴─────────────────┴───────┐
│         docker0 bridge          │
│         172.17.0.1              │
└────────────┬────────────────────┘
             │ NAT (iptables)
┌────────────┴────────────────────┐
│         Host eth0               │
│         192.168.1.100           │
└─────────────────────────────────┘
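
The addressing in the diagram can be reproduced with Python's ipaddress module. This is a minimal sketch: the subnet and addresses come from the diagram, while handing out addresses in strictly sequential order is a simplifying assumption, not a description of Docker's actual IPAM.

```python
import ipaddress

# Values from the diagram above. Sequential allocation is an assumption for illustration.
bridge_subnet = ipaddress.ip_network("172.17.0.0/16")
host_ip = ipaddress.ip_address("192.168.1.100")

usable = iter(bridge_subnet.hosts())   # usable addresses, lowest first
gateway = next(usable)                 # docker0 takes the first address
container_a = next(usable)
container_b = next(usable)
print(gateway, container_a, container_b)   # 172.17.0.1 172.17.0.2 172.17.0.3

# Both containers are in the bridge subnet, so they reach each other directly over docker0.
same_segment = container_a in bridge_subnet and container_b in bridge_subnet

# The host's LAN address is outside that subnet, which is why outbound traffic needs NAT.
needs_nat = host_ip not in bridge_subnet
print(same_segment, needs_nat)             # True True
```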

Default Bridge vs User-Defined Bridge

Feature	Default bridge (`docker0`)	User-defined bridge
DNS resolution	NO — only IP addresses	YES — containers resolve by name
Automatic connection	All containers by default	Only explicitly connected containers
Isolation	All containers can see each other	Only containers on same network
Live connect/disconnect	No	Yes (`docker network connect/disconnect`)
Link legacy support	`--link` needed for name resolution	Built-in DNS

Why this matters: On the default bridge, containers can only reach each other by IP address. IPs change on restart. This is why beginners hardcode IPs and then everything breaks. Always use user-defined bridge networks — they provide DNS-based service discovery automatically.

Common mistake: Using the default bridge network in production. Always create a named network: docker network create myapp. The DNS resolution alone is worth it.

How Port Mapping Works

docker run -p 8080:80 does this under the hood:

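Docker adds an iptables DNAT rule: traffic to host:8080 → container:80
Docker adds iptables MASQUERADE rules: source NAT for the container’s outbound traffic, plus a hairpin rule for a container reaching its own published port. Replies to DNAT’d connections are translated back by connection tracking, not by a separate rule
A docker-proxy process also listens on host:8080 in userspace and covers cases the DNAT rule misses, such as connections from the host’s own localhost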

# See the actual iptables rules Docker creates
iptables -t nat -L -n | grep DNAT
# DNAT  tcp  --  0.0.0.0/0  0.0.0.0/0  tcp dpt:8080 to:172.17.0.2:80

Think of it this way: Port mapping is like a receptionist at a hotel. You call the hotel’s main number (host:8080), and the receptionist forwards your call to Room 204 (container:80). The iptables DNAT rule IS the receptionist.
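
To make the receptionist concrete, here is a toy model of destination NAT with connection tracking. It is a sketch of the idea only: the class and method names are invented for illustration, the client address is a placeholder, and real netfilter tracks far more state than this.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Endpoint:
    ip: str
    port: int

@dataclass
class NatTable:
    """Toy DNAT plus connection tracking. Not how netfilter is implemented."""
    dnat_rules: dict = field(default_factory=dict)   # host port -> container Endpoint
    conntrack: dict = field(default_factory=dict)    # (client, container) -> original destination

    def publish(self, host_port, container):
        # Same intent as `-p host_port:container_port`.
        self.dnat_rules[host_port] = container

    def inbound(self, client, dst):
        # Rewrite the destination if a rule matches and remember the original.
        target = self.dnat_rules.get(dst.port)
        if target is None:
            return dst                               # no rule: packet left untouched
        self.conntrack[(client, target)] = dst
        return target

    def reply_source(self, client, container):
        # Replies are translated back from the tracked connection, so the client
        # sees the answer coming from the address it originally contacted.
        return self.conntrack.get((client, container), container)

nat = NatTable()
nat.publish(8080, Endpoint("172.17.0.2", 80))
client = Endpoint("203.0.113.7", 51000)              # placeholder external client
host = Endpoint("192.168.1.100", 8080)

delivered_to = nat.inbound(client, host)
seen_by_client = nat.reply_source(client, delivered_to)
print(delivered_to)      # Endpoint(ip='172.17.0.2', port=80)
print(seen_by_client)    # Endpoint(ip='192.168.1.100', port=8080)
```

The client never learns the container's address: it sends to host:8080 and gets replies from host:8080.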

Host Network

docker run --network host: the container uses the host’s network stack directly. It does not get its own network namespace.

When to use:

Maximum network performance (no NAT overhead, no bridge hop)
Applications that need to bind to many ports dynamically
Network monitoring tools that need to see all host traffic

When NOT to use:

Multiple containers that need the same port (port conflicts)
Any security-sensitive deployment (no network isolation at all)
Production services (usually) — you lose container network isolation

ELI5: Host network is like giving your container a master key to the entire building’s phone system. It can use any phone line (port) directly, but it can also accidentally pick up calls meant for other residents.
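
The port-conflict problem is easy to see without Docker. In the sketch below, two sockets in one process stand in for two host-network containers sharing the host's network stack. The helper name try_bind is made up for this example.

```python
import socket

def try_bind(port):
    """Return a listening TCP socket on 127.0.0.1:port, or None if the port is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen()
        return sock
    except OSError:
        sock.close()
        return None

first = try_bind(0)                 # port 0 asks the OS for any free port
port = first.getsockname()[1]
second = try_bind(port)             # same network stack, same port: rejected
print(port, second is None)
first.close()
```

With bridge networking each container has its own namespace, so both could bind the same port internally and be published on different host ports.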

Overlay Network (Multi-Host)

Overlay networks connect containers across multiple Docker hosts. Docker’s overlay driver is used by Swarm; Kubernetes gets the same effect through CNI plugins, several of which also use VXLAN.

Host A (192.168.1.10)              Host B (192.168.1.20)
┌──────────────────┐              ┌──────────────────┐
│ Container A      │              │ Container B      │
│ 10.0.0.2         │              │ 10.0.0.3         │
└───────┬──────────┘              └───────┬──────────┘
        │                                 │
┌───────┴──────────┐              ┌───────┴──────────┐
│ Overlay network  │              │ Overlay network  │
│ VXLAN tunnel     ├──────────────┤ VXLAN tunnel     │
│ (encapsulates)   │  UDP 4789    │ (decapsulates)   │
└───────┬──────────┘              └───────┬──────────┘
        │                                 │
┌───────┴──────────────────────────────────┴──────────┐
│                Physical Network                       │
└──────────────────────────────────────────────────────┘

How it works:

Container A sends a packet to 10.0.0.3 (Container B’s overlay IP)
Docker encapsulates the packet inside a VXLAN UDP packet (port 4789)
Outer header: Host A → Host B
Host B’s VTEP decapsulates and delivers to Container B
Container B sees a packet from 10.0.0.2 — no idea it crossed physical networks
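
The encapsulation step can be sketched with the 8-byte VXLAN header layout from RFC 7348. The outer Ethernet, IP, and UDP headers are left out for brevity, and the VNI value and the stand-in frame bytes are arbitrary examples.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08   # the "I" flag from RFC 7348

def vxlan_encap(vni, inner_frame):
    """Prepend the 8-byte VXLAN header to an inner Ethernet frame."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame

def vxlan_decap(payload):
    """Split a VXLAN UDP payload into (vni, inner_frame)."""
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8, payload[8:]

inner = b"frame from 10.0.0.2 to 10.0.0.3"    # stand-in for a real Ethernet frame
packet = vxlan_encap(4097, inner)               # 4097 is an arbitrary example VNI
vni, recovered = vxlan_decap(packet)
print(len(packet) - len(inner), vni, recovered == inner)   # 8 4097 True
```

The inner frame comes out byte-for-byte identical, which is why Container B cannot tell the packet crossed hosts.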

ELI5: Overlay networking is like putting a letter inside another envelope. The inner envelope says “To: Container B.” The outer envelope says “To: Host B.” The post office (physical network) only reads the outer envelope. Host B opens the outer envelope and delivers the inner letter to the right container.

Performance impact: VXLAN adds ~50 bytes overhead per packet and CPU cost for encap/decap. For most workloads, negligible. For high-throughput, latency-sensitive workloads (10Gbps+), consider macvlan or host networking.
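
Where the ~50 bytes come from, as a small worked calculation. The header sizes assume an IPv4 underlay with no VLAN tags; an IPv6 underlay would add 20 more bytes.

```python
# Header sizes in bytes, assuming an IPv4 underlay and no VLAN tags.
OUTER_ETHERNET = 14
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14

# Extra bytes on the wire for every encapsulated frame.
wire_overhead = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER
print(wire_overhead)                        # 50

def overlay_mtu(underlay_mtu):
    """Largest inner IP packet that fits in one outer packet without fragmentation."""
    # The underlay MTU bounds the outer IP packet, which carries the outer
    # IP/UDP/VXLAN headers plus the inner Ethernet header and inner IP packet.
    return underlay_mtu - OUTER_IPV4 - OUTER_UDP - VXLAN_HEADER - INNER_ETHERNET

def wire_efficiency(inner_frame_bytes):
    """Fraction of transmitted bytes that belong to the inner frame."""
    return inner_frame_bytes / (inner_frame_bytes + wire_overhead)

print(overlay_mtu(1500))                    # 1450
print(round(wire_efficiency(100), 3))       # 0.667
print(round(wire_efficiency(1464), 3))      # 0.967
```

Small packets pay proportionally more, which is one reason chatty, small-message workloads feel the overhead first. The 1450 figure also explains why overlay interfaces usually need a lower MTU than the physical NIC.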

Macvlan and IPvlan

Both give containers direct access to the physical network — no NAT, no bridge.

Feature	Macvlan	IPvlan L2	IPvlan L3
MAC address	Unique per container	Shared with host	Shared with host
L2 adjacency	Yes	Yes	No (routed)
Promiscuous mode	Required on parent	Not required	Not required
Cloud compatibility	Often blocked (AWS, etc.)	Better cloud support	Best cloud support
Use case	Legacy VLANs, bare metal	Cloud, single MAC required	Pure routing, no broadcast

When to use macvlan/ipvlan: You have legacy applications that need to be on a specific VLAN, need direct L2 connectivity, or need performance without NAT overhead. Common in network appliance containers and telecom workloads.

Common mistake: Using macvlan in AWS/GCP/Azure. Cloud providers typically don’t allow multiple MAC addresses per NIC (or charge for it). Use ipvlan L2 instead.

DNS and Service Discovery

Docker’s Built-in DNS (User-Defined Networks)

Docker runs an embedded DNS server at 127.0.0.11 for containers on user-defined networks.

Container A → DNS query "web" → 127.0.0.11 → resolves to 172.18.0.3

Resolution order:

Container’s /etc/hosts entries
Docker’s embedded DNS (container name → IP)
Host’s DNS configuration (for external domains)
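
The resolution order above can be sketched as a chain of lookups. The three dictionaries are stand-ins for the real sources, and the external name and address are placeholders.

```python
def resolve(name, etc_hosts, embedded_dns, upstream_dns):
    """Return (ip, source) using the lookup order listed above, or (None, None)."""
    for source, table in (
        ("/etc/hosts", etc_hosts),
        ("embedded DNS", embedded_dns),
        ("upstream DNS", upstream_dns),
    ):
        if name in table:
            return table[name], source      # first source that knows the name wins
    return None, None

etc_hosts = {"localhost": "127.0.0.1"}
embedded_dns = {"web": "172.18.0.3"}                    # from the example above
upstream_dns = {"registry.example": "203.0.113.10"}     # placeholder external name

print(resolve("web", etc_hosts, embedded_dns, upstream_dns))
print(resolve("registry.example", etc_hosts, embedded_dns, upstream_dns))
print(resolve("missing", etc_hosts, embedded_dns, upstream_dns))
```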

Docker Compose Service Discovery

services:
  web:
    image: nginx
  api:
    image: myapi
    # can reach nginx at http://web:80 — service name = DNS name

In Compose, the service name IS the DNS name. All services on the same network can resolve each other by name.

ELI5: Docker DNS works like a hotel directory. You don’t need to know what room number (IP) the “restaurant” (container) is in — you just ask the front desk (DNS server) for “restaurant” and they tell you the room number. If the restaurant moves to a different room (container restarts with new IP), the directory updates automatically.

Kubernetes DNS (CoreDNS)

<service-name>.<namespace>.svc.cluster.local

Every Service in K8s gets a DNS entry. Pods resolve services by name within the same namespace or by FQDN across namespaces.

# Same namespace
curl http://api-service:8080

# Cross namespace
curl http://api-service.production.svc.cluster.local:8080
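
A small helper shows how the name forms relate. It assumes the default cluster domain cluster.local, which a cluster operator can change. The function name is made up for this sketch.

```python
def service_dns_names(service, namespace, cluster_domain="cluster.local"):
    """Return the names that can resolve to a Service, shortest first."""
    fqdn = f"{service}.{namespace}.svc.{cluster_domain}"
    # The bare service name only resolves from pods in the same namespace.
    return [service, f"{service}.{namespace}", fqdn]

names = service_dns_names("api-service", "production")
print(names[-1])   # api-service.production.svc.cluster.local
```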

Kubernetes Networking Model

K8s has three fundamental networking requirements:

Every pod gets its own IP — no NAT between pods
Pods can communicate with any other pod — across nodes, without NAT
The IP a pod sees itself as = the IP others see it as

CNI Plugins

The Container Network Interface (CNI) is how K8s implements networking. The cluster operator chooses a CNI plugin.

CNI Plugin	Networking	Network Policy	Extra Features	Best For
Calico	L3 (BGP), VXLAN, IPIP	Yes (full)	eBPF dataplane option	General purpose, performance
Cilium	eBPF-based	Yes (L3-L7)	Service mesh, observability, encryption	Advanced security, L7 policies
Flannel	VXLAN overlay	No (needs Calico for policy)	Simple, minimal config	Small clusters, simplicity
Weave	VXLAN, sleeve	Yes	Encryption, multicast	Small/medium clusters
AWS VPC CNI	AWS ENI-based	Yes (native since v1.14, or with Calico)	Native VPC IP per pod	AWS EKS (default)
Azure CNI	Azure VNET	Yes	Native Azure IP per pod	AKS (default)

Decision framework: Starting a new cluster? Cilium if you want eBPF + L7 policies + service mesh. Calico if you want proven stability + good performance. Flannel if you want simplicity and don’t need network policies. Cloud-managed cluster? Use the cloud’s default CNI (VPC CNI on EKS, Azure CNI on AKS) unless you need features they don’t provide.

Service Types

Type	How It Works	Use Case
ClusterIP	Internal IP reachable only within cluster	Service-to-service communication
NodePort	Opens a port (30000-32767) on every node	Dev/test, direct node access
LoadBalancer	Provisions cloud LB pointing to NodePorts	Production external traffic
ExternalName	CNAME record to external DNS	DNS alias for external services (no proxying)

Client → Cloud LB → NodePort on any node → kube-proxy → Pod

Common mistake: Using NodePort in production. NodePort exposes a port on EVERY node, limits you to ports 30000-32767, and doesn’t load-balance well. Use LoadBalancer (cloud) or Ingress (for HTTP).

Ingress and Gateway API

Feature	Ingress (legacy)	Gateway API (modern)
Maturity	Stable, widely supported	GA since Gateway API v1.0 (October 2023)
L7 routing	Host/path-based	Host/path/header/method-based
TCP/UDP support	Depends on controller	TCPRoute, UDPRoute (experimental channel)
Multi-tenancy	No built-in model	GatewayClass for shared infra
TLS termination	Yes	Yes, more flexible
Traffic splitting	Controller-specific annotations	Native (weight-based)

Think of it this way: Ingress is like a simple reception desk — it looks at your name and directs you to the right room. Gateway API is like a smart building management system — it can route you based on your name, your department, your badge type, and even split visitors across multiple rooms for A/B testing.
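
Weight-based splitting boils down to a weighted random choice per request. The sketch below shows that idea only: the backend names and the 90/10 weights are hypothetical, and real controllers may use a different selection algorithm.

```python
import random
from collections import Counter

def pick_backend(backends, rng):
    """Choose one backend name with probability proportional to its weight."""
    names = [name for name, _ in backends]
    weights = [weight for _, weight in backends]
    return rng.choices(names, weights=weights, k=1)[0]

backends = [("web-v1", 90), ("web-v2", 10)]    # hypothetical backends and weights
rng = random.Random(42)                          # seeded so runs are repeatable
counts = Counter(pick_backend(backends, rng) for _ in range(10_000))
share_v2 = counts["web-v2"] / 10_000
print(counts)
print(f"web-v2 share: {share_v2:.1%}")
```

Over many requests the canary (web-v2) gets roughly a tenth of the traffic, though any short window can deviate from that.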

Network Policies (Kubernetes)

By default, all pods can talk to all pods. Network Policies restrict this.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
spec:
  podSelector: {}      # applies to all pods in namespace
  policyTypes:
  - Ingress             # block all incoming traffic
  # no ingress rules = deny all

Key principles:

Network Policies are additive — if any policy allows traffic, it’s allowed
An empty podSelector {} means “all pods in this namespace”
If no NetworkPolicy selects a pod, all traffic is allowed (default allow)
Once ANY NetworkPolicy selects a pod, all non-matching traffic is denied in the directions that policy covers (default deny for that pod). A policy that lists only Ingress leaves egress unrestricted
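
These rules can be modelled in a few lines. This is a deliberately simplified sketch: it covers ingress within one namespace and pod label selectors only, and the dictionary shape (podSelector, allowFrom) is an invented stand-in, not the real NetworkPolicy schema.

```python
def selects(selector, labels):
    """An empty selector matches every pod; otherwise all key/value pairs must match."""
    return all(labels.get(key) == value for key, value in selector.items())

def ingress_allowed(dst_labels, src_labels, policies):
    """Simplified ingress decision for one namespace (pod selectors only)."""
    applicable = [p for p in policies if selects(p["podSelector"], dst_labels)]
    if not applicable:
        return True                      # no policy selects the pod: default allow
    # Additive: one matching allow rule in any applicable policy is enough.
    return any(
        selects(src_selector, src_labels)
        for policy in applicable
        for src_selector in policy["allowFrom"]
    )

deny_all = {"podSelector": {}, "allowFrom": []}
allow_web_to_api = {"podSelector": {"app": "api"}, "allowFrom": [{"app": "web"}]}

api, web, db = {"app": "api"}, {"app": "web"}, {"app": "db"}
print(ingress_allowed(api, web, []))                              # True
print(ingress_allowed(api, web, [deny_all]))                      # False
print(ingress_allowed(api, web, [deny_all, allow_web_to_api]))    # True
print(ingress_allowed(api, db, [deny_all, allow_web_to_api]))     # False
```

Note that adding allow_web_to_api on top of deny_all opens exactly one path. There is no way to write a policy that subtracts an allowance granted by another policy.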

Common mistake: Assuming NetworkPolicies are enforced by default. They’re NOT — your CNI plugin must support them. Flannel doesn’t. Calico and Cilium do. Deploy Flannel + NetworkPolicies = false sense of security.

Interview pattern — “Defense in depth”:

Default deny all ingress in every namespace
Explicitly allow only needed communication paths
Use labels for pod selection (not IPs — IPs change)
Separate namespaces for different trust boundaries

Debugging Network Issues

Essential Commands

# What network is a container on?
docker inspect <container> | jq '.[0].NetworkSettings.Networks'

# DNS resolution inside container
docker exec <container> nslookup <service-name>

# See all iptables NAT rules Docker created
iptables -t nat -L -n -v

# Trace packet path (on host)
tcpdump -i docker0 -n port 80

# K8s: debug DNS resolution
kubectl run dnstest --image=busybox:1.36 --rm -it -- nslookup kubernetes.default

# K8s: check if NetworkPolicy is blocking
kubectl describe networkpolicy -n <namespace>

# K8s: check service endpoints
kubectl get endpoints <service-name>

Common Problems & Root Causes

Symptom	Likely Cause	Fix
Container can’t reach internet	Missing NAT rule or DNS config	Check `iptables -t nat -L`, check `/etc/resolv.conf`
Container can’t reach other container by name	Using default bridge (no DNS)	Use user-defined network
Service unreachable in K8s	No endpoints (pods not ready)	`kubectl get endpoints`, check readiness probes
Intermittent timeouts in K8s	DNS resolution issues (ndots:5)	Add `dnsConfig.options: [{name: ndots, value: "2"}]`
Cross-node pod communication fails	CNI plugin misconfigured, firewall	Check CNI pods, check node firewall rules for VXLAN (4789)

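Why this matters: With the Kubernetes default of ndots:5, any name with fewer than five dots is tried against every search domain before it is tried as an absolute name. A pod’s resolv.conf typically lists three cluster search domains (plus any inherited from the node), so a lookup for api.example.com generates at least 4 queries instead of 1, and often twice that because A and AAAA records are queried separately. This causes latency and DNS server load. Setting ndots: 2 fixes most cases; appending a trailing dot (api.example.com.) skips the search list entirely.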
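
The ndots rule is mechanical enough to sketch. The function below lists the names a stub resolver would try, in order, assuming a typical three-entry search list for a pod in the default namespace. Real pods may inherit extra search domains from the node, and the resolver stops at the first name that resolves.

```python
def query_order(name, search_domains, ndots):
    """Return the fully qualified names a stub resolver would try, in order."""
    if name.endswith("."):
        return [name]                         # trailing dot: already absolute
    absolute = name + "."
    via_search = [f"{name}.{domain}." for domain in search_domains]
    if name.count(".") >= ndots:
        return [absolute] + via_search        # enough dots: try the name as-is first
    return via_search + [absolute]            # too few dots: search list first

# Assumed search list for a pod in the "default" namespace.
search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

slow = query_order("api.example.com", search, ndots=5)
fast = query_order("api.example.com", search, ndots=2)
print(len(slow), slow[-1])   # 4 api.example.com.
print(fast[0])               # api.example.com.
```

With ndots=5 the external name only succeeds on the fourth attempt. With ndots=2 it succeeds on the first, while short names like api-service still go through the search list.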

Key Takeaways for Interviews

“Explain Docker networking” → veth pairs connect containers to bridges. iptables DNAT handles port mapping. User-defined bridges provide DNS. Overlay uses VXLAN for multi-host.
“How does K8s networking work?” → CNI plugin assigns IPs. Every pod gets a routable IP. kube-proxy (iptables/IPVS) implements Services. CoreDNS handles discovery.
“Bridge vs overlay vs macvlan?” → Bridge = single host + NAT. Overlay = multi-host + VXLAN encapsulation. Macvlan = direct physical network access.
“How do you secure pod communication?” → Network Policies (deny all + explicit allow), mTLS via service mesh (Istio/Cilium), namespace isolation.
“CNI plugin selection?” → Cilium for eBPF + L7. Calico for stability + performance. Cloud default for managed clusters.
