Network Fundamentals

This is the bedrock. Everything else in networking — HTTP, TLS, gRPC, load balancing — is just specialization built on these concepts. Get this right and debugging gets much, much easier.


OSI Model vs TCP/IP Model

The Models

The OSI model has 7 layers. The TCP/IP model (also called the Internet model) has 4. They both describe the same thing — how data travels from one machine to another — but OSI is a conceptual teaching framework while TCP/IP is what actually ships.

| Layer | OSI Name | TCP/IP Layer | Real Protocols |
| --- | --- | --- | --- |
| 7 | Application | Application | HTTP, DNS, SMTP, FTP, SSH |
| 6 | Presentation | Application | TLS/SSL, MIME, compression |
| 5 | Session | Application | TLS sessions, RPC, NetBIOS |
| 4 | Transport | Transport | TCP, UDP, SCTP |
| 3 | Network | Internet | IP (v4/v6), ICMP, OSPF, BGP |
| 2 | Data Link | Network Access | Ethernet, Wi-Fi (802.11), ARP |
| 1 | Physical | Network Access | Cables, fiber, radio signals |

In practice: when engineers say “L3” or “L4,” they mean OSI layers. When they say “transport layer,” they mean TCP/UDP. The two vocabularies coexist.

ELI5: OSI is like a university textbook definition of how a restaurant works — front-of-house, kitchen, suppliers, etc. TCP/IP is the actual restaurant: host takes order, kitchen makes food, done. The textbook model has more categories; the real restaurant has what actually works. Use the textbook model to understand concepts, work in the real restaurant every day.

Why This Matters for Debugging

Knowing which layer your problem lives at cuts debugging time dramatically:

  • Can’t ping the host at all? → Start at L1/L2: check the physical cable, switch port, and ARP table. If those look fine, suspect L3 routing or a firewall that filters ICMP.
  • Ping works but TCP connection refused? → L3/L4. Check firewall rules, netstat, port binding.
  • Connection established but HTTP 502? → L7. Backend process crashed, check app logs.
  • Intermittent packet loss? → L1/L2. Check for duplex mismatch, NIC errors, switch port.

Why this matters: Engineers who don’t know their layers waste hours looking at HTTP logs when the problem is a misconfigured MTU. Layers give you a systematic place to start.


Packets, Frames, and Segments

Encapsulation: Headers All the Way Down

When your app sends "Hello" over HTTP, that string gets wrapped in headers at each layer before hitting the wire. This is encapsulation.

App data:    [ "Hello" ]
L7 (HTTP):   [ HTTP headers | "Hello" ]
L4 (TCP):    [ TCP header | HTTP headers | "Hello" ]
L3 (IP):     [ IP header  | TCP header  | HTTP headers | "Hello" ]
L2 (Eth):    [ Eth header | IP header   | TCP header   | HTTP headers | "Hello" | Eth trailer ]

Each wrapper has a specific name:

| Layer | PDU Name | Key Header Fields |
| --- | --- | --- |
| L2 | Frame | Source MAC, Dest MAC, EtherType |
| L3 | Packet | Source IP, Dest IP, TTL, Protocol |
| L4 (TCP) | Segment | Source Port, Dest Port, Seq#, Ack# |
| L4 (UDP) | Datagram | Source Port, Dest Port, Length |

ELI5: Think of sending a letter internationally. You write your letter (app data), put it in an envelope with a recipient name (TCP/HTTP), put that inside a mailer with a full address (IP), and then a courier wraps it in a shipping package with a barcode for the sorting machine (Ethernet frame). Each wrapper is only read by the right handler — the sorting machine only reads the barcode, not your letter.
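To make the nesting concrete, here is a minimal Python sketch that wraps "Hello" in TCP, IPv4, and Ethernet headers and then peels them off again. The addresses, ports, and the function names `build_frame` and `parse_frame` are made up for illustration. Checksums are left at zero, the HTTP layer and the Ethernet trailer are omitted, and the parser assumes headers without options, so this is a teaching aid rather than a packet you could put on the wire.

```python
import socket
import struct

def build_frame(payload: bytes) -> bytes:
    """Wrap payload in TCP, IPv4, and Ethernet headers (illustrative values only)."""
    # TCP header, 20 bytes: src port, dst port, seq, ack, offset+flags, window, checksum, urgent
    tcp = struct.pack("!HHIIHHHH", 54321, 80, 1, 0, (5 << 12) | 0x018, 65535, 0, 0)
    # IPv4 header, 20 bytes: version/IHL, TOS, total length, ID, flags/fragment offset,
    # TTL, protocol (6 = TCP), checksum, source IP, destination IP
    total_length = 20 + len(tcp) + len(payload)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_length, 0, 0x4000, 64, 6, 0,
                     socket.inet_aton("192.168.1.5"), socket.inet_aton("203.0.113.10"))
    # Ethernet header, 14 bytes: dest MAC, source MAC, EtherType (0x0800 = IPv4)
    eth = struct.pack("!6s6sH", bytes.fromhex("aabbccddeeff"),
                      bytes.fromhex("112233445566"), 0x0800)
    return eth + ip + tcp + payload

def parse_frame(frame: bytes) -> dict:
    """Peel the layers back off, assuming 14 + 20 + 20 bytes of headers."""
    _dst_mac, _src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    ip = frame[14:34]
    src_port, dst_port = struct.unpack("!HH", frame[34:38])
    return {
        "ethertype": ethertype,
        "ttl": ip[8],
        "protocol": ip[9],
        "src_ip": socket.inet_ntoa(ip[12:16]),
        "dst_ip": socket.inet_ntoa(ip[16:20]),
        "src_port": src_port,
        "dst_port": dst_port,
        "payload": frame[54:],
    }

frame = build_frame(b"Hello")
fields = parse_frame(frame)
```

Each handler reads only its own slice: a switch needs the first 14 bytes, a router the next 20, and the receiving host's TCP stack the 20 after that.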

MTU and Fragmentation

MTU (Maximum Transmission Unit) is the largest single packet a link can carry. Ethernet default: 1500 bytes. Your IP header (20 bytes) + TCP header (20 bytes) = 40 bytes of overhead, leaving 1460 bytes for data per segment (the TCP MSS, Maximum Segment Size).
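The arithmetic above is easy to check in a few lines. This sketch assumes IPv4 and TCP headers without options (20 bytes each); the function names are illustrative.

```python
def tcp_mss(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """Largest TCP payload that fits in one packet on a link with this MTU."""
    return mtu - ip_header - tcp_header

def segments_needed(payload_bytes: int, mss: int) -> int:
    """Number of TCP segments required to carry payload_bytes (ceiling division)."""
    return -(-payload_bytes // mss)

standard_mss = tcp_mss(1500)   # 1460
jumbo_mss = tcp_mss(9000)      # 8960
```

With these numbers, a 1,000,000-byte transfer takes 685 segments at MTU 1500 and 112 at MTU 9000, which is where the per-packet CPU savings of larger frames come from.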

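What happens when a packet is too big? In IPv4, a router can fragment it, and the receiving host reassembles the pieces. If the Don’t Fragment bit is set, the router drops the packet and returns an ICMP error instead. IPv6 routers never fragment; only the sender can. Problems: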

  • Fragmentation is expensive — CPU overhead on routers
  • If any fragment is lost, the entire packet is retransmitted
  • Some firewalls drop fragments
  • Path MTU Discovery (PMTUD) uses ICMP “Fragmentation Needed” messages to find the lowest MTU on a path — if your firewall blocks ICMP, PMTUD breaks silently
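As a worked example of IPv4 fragmentation, the sketch below splits one oversized packet into fragments. It assumes a 20-byte IP header with no options; the function name is illustrative. The one detail worth noticing is that the fragment offset is stored in 8-byte units, so every fragment except the last must carry a multiple of 8 payload bytes.

```python
def fragment(total_length: int, mtu: int, ip_header: int = 20):
    """Return a list of (offset_in_8_byte_units, payload_bytes, more_fragments_flag)."""
    payload = total_length - ip_header
    # Largest payload per fragment, rounded down to a multiple of 8 bytes
    max_chunk = (mtu - ip_header) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload:
        chunk = min(max_chunk, payload - offset)
        more_fragments = offset + chunk < payload
        fragments.append((offset // 8, chunk, more_fragments))
        offset += chunk
    return fragments

# A 4000-byte packet crossing a 1500-byte MTU link
pieces = fragment(4000, 1500)
```

Here `pieces` holds three fragments carrying 1480, 1480, and 1020 payload bytes at offsets 0, 185, and 370. Losing any one of them means the receiver cannot reassemble the original packet.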

Jumbo frames: MTU up to 9000 bytes, used inside data centers. Never cross the public internet. Saves CPU overhead on storage/backup traffic between servers.

Common mistake: Enabling jumbo frames on servers but not on the switch between them. The switch silently drops the oversized frames (an L2 switch cannot fragment; only a router can, and only when the packet allows it). The result is mysterious stalls that only affect large transfers, while small packets such as pings and SSH keystrokes work fine.


IP Addressing

IPv4: 32 Bits, Running Out Since 2011

IPv4 gives $2^{32}$ ≈ 4.3 billion addresses. That sounds like a lot until you realize the internet has more than 5 billion users, most with several devices. IANA handed out its last unallocated IPv4 blocks in 2011, and the regional registries have been running dry since. We survive through NAT (more on that below).

Format: four octets in decimal, separated by dots. 192.168.1.100 = 11000000.10101000.00000001.01100100 in binary.
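A short Python sketch of the same conversion, using only the standard library. The helper name `to_binary` is illustrative.

```python
import ipaddress

def to_binary(dotted: str) -> str:
    """Render each octet of a dotted-decimal IPv4 address as 8 binary digits."""
    return ".".join(f"{int(octet):08b}" for octet in dotted.split("."))

binary = to_binary("192.168.1.100")
# The same address as the single 32-bit integer routers actually compare
as_int = int(ipaddress.IPv4Address("192.168.1.100"))
```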

IPv6: 128 Bits, Plenty of Room

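$2^{128}$ ≈ $3.4 \times 10^{38}$ addresses. Every grain of sand on Earth could have a trillion addresses. Format: eight groups of 4 hex digits, such as 2001:0db8:85a3:0000:0000:8a2e:0370:7334. Leading zeros in a group can be omitted, and one run of consecutive all-zero groups can collapse to :: (only once per address, otherwise the expansion would be ambiguous), giving 2001:db8:85a3::8a2e:370:7334.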
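Python's standard `ipaddress` module applies these shortening rules for you, which is a handy way to check your reading of an address:

```python
import ipaddress

addr = ipaddress.IPv6Address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")

short_form = addr.compressed   # leading zeros dropped, one zero run collapsed to ::
long_form = addr.exploded      # all eight groups, four hex digits each
loopback = ipaddress.IPv6Address("::1").exploded
```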

IPv6 adoption is slow because: existing IPv4 infrastructure works, NAT reduced the pressure, dual-stack deployments are complex, and ISPs/enterprises move slowly. But it’s growing — ~40% of Google traffic is IPv6 now.

Private vs Public IP Ranges

| Range | CIDR | Addresses | Use |
| --- | --- | --- | --- |
| 10.0.0.0 – 10.255.255.255 | 10.0.0.0/8 | 16.7 million | Corporate networks, cloud VPCs |
| 172.16.0.0 – 172.31.255.255 | 172.16.0.0/12 | 1 million | Docker default, some corp networks |
| 192.168.0.0 – 192.168.255.255 | 192.168.0.0/16 | 65,536 | Home routers, small offices |

Private IPs are not routable on the public internet — your router at home translates them.
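A minimal sketch for testing whether an address falls in one of the three private ranges from the table (the RFC 1918 blocks). It deliberately checks only those three; Python's built-in `is_private` attribute also covers loopback, link-local, and other reserved ranges, so the two can disagree. The helper name is illustrative.

```python
import ipaddress

RFC1918 = [ipaddress.ip_network(cidr)
           for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def is_rfc1918(address: str) -> bool:
    """True if address lies in one of the three private IPv4 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918)

block_sizes = [network.num_addresses for network in RFC1918]
```

Note that 172.32.0.1 is public: the /12 ends at 172.31.255.255, a boundary that is easy to get wrong by eye.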

NAT: Network Address Translation

NAT is how your home router lets 50 devices share one public IP. The router maintains a translation table: when device 192.168.1.5:54321 sends a packet to 8.8.8.8:53, the router rewrites the source to 203.0.113.1:54321 (the public IP; many NATs also pick a new source port so two devices never collide), records the mapping, and when the response comes back, rewrites the destination back to 192.168.1.5:54321.

ELI5: NAT is like a corporate mail room. Employees use internal extension numbers (private IPs). The mail room has one street address (public IP). When you send a letter out, the mail room puts their address on it. When a reply comes back to the mail room, they look up who originally sent it and deliver it internally. The outside world only ever sees the mail room’s address.
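The translation table can be sketched as two dictionaries. This is a simplified model, not how any particular router implements it: the class name `Nat` and the starting port 40000 are assumptions, it always assigns a fresh public port, and it ignores the destination, the protocol, and entry timeouts that real NATs track.

```python
class Nat:
    """Toy port-translating NAT: many private (ip, port) pairs share one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound_map = {}   # (private_ip, private_port) -> public_port
        self.inbound_map = {}    # public_port -> (private_ip, private_port)

    def outbound(self, src_ip: str, src_port: int):
        """Rewrite the source of an outgoing packet and remember the mapping."""
        key = (src_ip, src_port)
        if key not in self.outbound_map:
            self.outbound_map[key] = self.next_port
            self.inbound_map[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound_map[key]

    def inbound(self, dst_port: int):
        """Map a reply back to the private host, or None if nobody asked for it."""
        return self.inbound_map.get(dst_port)

nat = Nat("203.0.113.1")
first = nat.outbound("192.168.1.5", 54321)
second = nat.outbound("192.168.1.6", 54321)   # same private port, different device
```

The `None` returned for an unknown port is the reason servers behind NAT cannot receive unsolicited connections: there is no table entry telling the router where to deliver them.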

NAT implications for engineers:

  • Servers behind NAT can’t receive unsolicited incoming connections without port forwarding
  • P2P protocols (WebRTC, game servers) need NAT traversal techniques (STUN, TURN)
  • NAT breaks IP-level security assumptions — the source IP in a packet isn’t the true sender

CIDR Notation

192.168.1.0/24 means: the first 24 bits are the network prefix, the remaining 8 bits are for hosts.

$$\text{Hosts} = 2^{(32 - \text{prefix})} - 2$$

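The -2 subtracts the network address (all host bits 0) and broadcast address (all host bits 1). The formula holds for prefixes up to /30; a /31 (two addresses, used on point-to-point links per RFC 3021) and a /32 (a single host) are special cases.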

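| CIDR | Subnet Mask | Total IPs | Usable Hosts |
| --- | --- | --- | --- |
| /8 | 255.0.0.0 | 16,777,216 | 16,777,214 |
| /16 | 255.255.0.0 | 65,536 | 65,534 |
| /24 | 255.255.255.0 | 256 | 254 |
| /28 | 255.255.255.240 | 16 | 14 |
| /30 | 255.255.255.252 | 4 | 2 |
| /32 | 255.255.255.255 | 1 | 1 host (single host route) |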

Quick mental math: /24 = 256, /25 = 128, /26 = 64. Each bit you add to the prefix halves the address space.
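The table can be reproduced with the standard `ipaddress` module. The helper name `usable_hosts` is illustrative, and treating /31 and /32 as having no reserved addresses is a convention this sketch assumes.

```python
import ipaddress

def usable_hosts(cidr: str) -> int:
    """Usable host count: total addresses minus network and broadcast (up to /30)."""
    network = ipaddress.ip_network(cidr)
    if network.prefixlen <= 30:
        return network.num_addresses - 2
    return network.num_addresses   # /31 and /32: nothing is reserved

mask_28 = str(ipaddress.ip_network("10.0.0.0/28").netmask)
total_24 = ipaddress.ip_network("192.168.1.0/24").num_addresses
```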


Subnetting

Why Subnets Exist

Three reasons engineers create subnets instead of one flat network:

  1. Broadcast control: Every L2 broadcast goes to every device in a subnet. A flat /8 with 16 million devices would drown in ARP broadcasts. Routers don’t forward broadcasts, so subnets contain the noise.
  2. Security: Put your database in a different subnet with restrictive routing rules. Traffic between subnets goes through a router/firewall you control.
  3. Organization: Dev, staging, prod in separate subnets. Each team’s services isolated. Cloud VPCs use this to separate public-facing (web servers) from private (databases).

ELI5: Subnets are like neighborhoods in a city. Mail within your neighborhood (subnet) gets delivered by the local postal worker (switch). Mail to another neighborhood goes through the central post office (router). Without neighborhoods, one postal worker would have to know everyone in the entire city — chaos.

Calculating Subnet Ranges

For 10.0.1.0/24:

  • Network address: 10.0.1.0 (all host bits = 0)
  • Broadcast address: 10.0.1.255 (all host bits = 1)
  • Usable range: 10.0.1.1 – 10.0.1.254

For 10.0.0.0/26 (borrowing 2 bits from a /24 gives 4 subnets of 64 addresses each):

  • Subnet 0: 10.0.0.0 – 10.0.0.63
  • Subnet 1: 10.0.0.64 – 10.0.0.127
  • Subnet 2: 10.0.0.128 – 10.0.0.191
  • Subnet 3: 10.0.0.192 – 10.0.0.255
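The same split can be computed rather than worked out by hand. This sketch uses the standard `ipaddress` module to carve the /24 into four /26 subnets and list each one's boundaries.

```python
import ipaddress

parent = ipaddress.ip_network("10.0.0.0/24")
# prefixlen_diff=2 borrows 2 host bits: /24 becomes four /26 subnets
subnets = list(parent.subnets(prefixlen_diff=2))

for subnet in subnets:
    hosts = list(subnet.hosts())   # usable addresses, excluding network and broadcast
    print(subnet, "network:", subnet.network_address,
          "broadcast:", subnet.broadcast_address,
          "usable:", hosts[0], "to", hosts[-1])
```

Each /26 loses its first and last address to network and broadcast, so the four subnets together offer 4 × 62 = 248 usable hosts versus 254 for the undivided /24.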

VLSM: Right-Sizing Subnets

VLSM (Variable Length Subnet Masking) lets you allocate different-sized subnets from the same block. A /30 for a point-to-point link (only 2 hosts needed), a /24 for a large office, a /28 for a DMZ.

Cloud patterns you’ll see constantly:

  • VPC: /16 (65,536 IPs) — room to grow, not too wasteful
  • Subnets: /24 per availability zone per tier (web, app, db)
  • Management/bastion: /28 — tiny, just a few IPs

Common mistake: Making every subnet a /24 by habit. A /16 VPC holds only 256 of them, and per-AZ, per-tier, per-environment subnets use those up faster than you expect. Plan your CIDR hierarchy before you start; you can’t resize a subnet in AWS without recreating it.
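One way to plan a hierarchy up front is a small VLSM allocator. The sketch below is an illustration under stated assumptions, not a tool any cloud provider ships: the function name `allocate` and the subnet names are hypothetical, and it simply hands out blocks largest-first, which keeps every subnet aligned on its own size boundary.

```python
import ipaddress

def allocate(block: str, requests: dict) -> dict:
    """Carve named subnets (name -> prefix length) out of block, largest first."""
    base = ipaddress.ip_network(block)
    cursor = int(base.network_address)
    end = cursor + base.num_addresses
    plan = {}
    # Smaller prefix number = bigger subnet; allocating big ones first avoids misalignment
    for name, prefix in sorted(requests.items(), key=lambda item: item[1]):
        size = 2 ** (32 - prefix)
        if cursor + size > end:
            raise ValueError(f"{block} has no room left for {name} (/{prefix})")
        plan[name] = ipaddress.IPv4Network((cursor, prefix))
        cursor += size
    return plan

plan = allocate("10.0.0.0/16", {"web": 24, "app": 24, "db": 24, "bastion": 28, "p2p": 30})
```

Running the plan before creating anything shows how much of the /16 each tier consumes and what remains for future growth.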


Routing

Longest Prefix Match: The Core Algorithm

When a router receives a packet, it looks up the destination IP in its routing table and picks the most specific matching route — the one with the longest prefix (highest CIDR number).

Routing table:
  0.0.0.0/0    → gateway 1.2.3.4  (default route)
  10.0.0.0/8   → gateway 10.0.0.1
  10.0.1.0/24  → gateway 10.0.1.1
  10.0.1.5/32  → gateway 10.0.1.5

Packet dest: 10.0.1.5
Match candidates: /0, /8, /24, /32  → picks /32 (most specific)

ELI5: It’s like GPS routing. If you’re looking for “123 Main St, Springfield, Illinois, USA,” the router picks the most specific match it knows. It knows about Illinois before it knows about the specific street. The more specific the address it has, the more precisely it can route you.
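Longest prefix match can be sketched in a few lines using the routing table from the example above. This version scans every route linearly for clarity; real routers use tries or dedicated hardware to do the same lookup at line rate. The function name `lookup` is illustrative.

```python
import ipaddress

ROUTES = [
    ("0.0.0.0/0", "1.2.3.4"),       # default route
    ("10.0.0.0/8", "10.0.0.1"),
    ("10.0.1.0/24", "10.0.1.1"),
    ("10.0.1.5/32", "10.0.1.5"),
]

def lookup(destination: str, routes=ROUTES):
    """Return (matching_prefix, gateway) for the most specific route."""
    ip = ipaddress.ip_address(destination)
    candidates = []
    for prefix, gateway in routes:
        network = ipaddress.ip_network(prefix)
        if ip in network:
            candidates.append((network, gateway))
    # The longest prefix (largest prefixlen) wins
    best_network, best_gateway = max(candidates, key=lambda item: item[0].prefixlen)
    return str(best_network), best_gateway
```

Because 0.0.0.0/0 matches every address, `candidates` is never empty as long as a default route exists.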

Default Gateway

The default route 0.0.0.0/0 matches everything. It’s the “I have no idea, send it here” route. Your laptop has a default gateway (your home router). Your home router has a default gateway (your ISP). Small ISPs may in turn default to an upstream provider, but the large backbone networks carry full BGP routing tables and have no default route at all (the “default-free zone”). A packet follows default routes upward until it reaches a router that knows a more specific path.

Static vs Dynamic Routing

| Type | How | When to use |
| --- | --- | --- |
| Static | You manually configure routes | Small networks, specific security requirements |
| OSPF | Routers auto-discover neighbors, share topology | Enterprise internal routing |
| BGP | Routers exchange reachability info between autonomous systems | Internet routing, cloud multi-homing |

BGP is the protocol that runs the internet. Every ISP, cloud provider, and large company is an Autonomous System (AS) — a network under a single administrative control with an AS number (e.g., AWS is AS16509). BGP is how AS16509 tells the rest of the internet “I have routes to these IP prefixes.” When you see news about “internet routing incidents,” it’s almost always BGP: a misconfiguration causes wrong routes to propagate, and large swaths of the internet become unreachable.

Traceroute: TTL Tricks

Every IP packet has a TTL (Time to Live) field — an integer that each router decrements by 1. When TTL hits 0, the router drops the packet and sends back an ICMP “Time Exceeded” message (with its own IP in the source).

traceroute exploits this: send packets with TTL=1, then TTL=2, then TTL=3… Each router along the path drops the packet when TTL expires and reveals itself via the ICMP response.

traceroute google.com
# Output: hop-by-hop path, RTT for each hop

# Linux traceroute sends UDP probes by default. Other modes:
traceroute -T google.com   # TCP mode (bypasses some firewalls)
traceroute -I google.com   # ICMP mode (like Windows tracert)

What traceroute tells you: where latency is added (big RTT jump = slow link or far-away hop), where packets stop (firewall blocking), asymmetric routing (forward and return paths differ).

Common mistake: Trusting traceroute completely. ICMP responses from routers are low priority — a router can forward packets fast but respond to ICMP slowly, making a hop look slower than it is. Use it for direction, not precise measurement.
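The TTL trick is easy to see in a simulation. The sketch below does not send real packets (that needs raw sockets and usually root); it models a path as a list of hypothetical router IPs ending at the destination and shows which hop would answer each probe. In real traceroute the final hop answers with a different message (for example ICMP Port Unreachable for UDP probes), which is simplified here to a plain string.

```python
def send_probe(path, ttl):
    """Walk one probe along path; return (responder_ip, message)."""
    for router in path[:-1]:        # every hop before the destination is a router
        ttl -= 1                    # each router decrements TTL by 1
        if ttl == 0:
            return router, "Time Exceeded"
    return path[-1], "Reached destination"

def traceroute(path, max_hops=30):
    """Send probes with TTL=1, 2, 3... and collect whoever answers each one."""
    hops = []
    for ttl in range(1, max_hops + 1):
        responder, message = send_probe(path, ttl)
        hops.append(responder)
        if message == "Reached destination":
            break
    return hops

# Hypothetical path: home router, ISP router, transit router, destination
PATH = ["192.168.1.1", "203.0.113.1", "198.51.100.7", "93.184.216.34"]
discovered = traceroute(PATH)
```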


ARP, DHCP, and ICMP

ARP: Bridging L2 and L3

IP works at L3, but Ethernet frames at L2 use MAC addresses. Before sending a frame, a device needs to know the MAC address for a given IP. That’s ARP’s job.

Device A wants to reach 192.168.1.5:
1. A broadcasts: "Who has 192.168.1.5? Tell 192.168.1.1"
2. Device at .5 replies: "192.168.1.5 is at MAC aa:bb:cc:dd:ee:ff"
3. A caches this mapping in its ARP table
4. A sends the Ethernet frame to that MAC address

arp -n              # view ARP table on Linux
ip neigh show       # modern equivalent

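ARP cache entries expire after an OS-dependent timeout (typically tens of seconds to a few minutes on hosts; some routers keep them for hours). If the MAC address behind an IP changes (a replaced NIC, a failover to another host), stale entries cause brief connectivity loss until they expire or are flushed.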

ARP spoofing: A malicious device answers ARP requests (or sends unsolicited replies) claiming that the gateway’s IP maps to its own MAC address. Traffic intended for the gateway then goes to the attacker instead, a classic man-in-the-middle attack. It only works on the local network segment. Managed switches with Dynamic ARP Inspection (DAI) can detect and block it.

ELI5: ARP is like asking “hey, who in this room goes by the name Bob?” out loud (broadcast). Bob raises his hand (unicast reply). You note “Bob is the tall guy in the corner” (cache it). Next time you need Bob, you go directly to him. ARP spoofing is someone else raising their hand and claiming to be Bob.
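A toy ARP cache shows both the expiry behavior and why spoofing works. This is a simplified model: the class name `ArpCache`, the 60-second timeout, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are all assumptions for illustration.

```python
class ArpCache:
    """Toy ARP table: IP -> MAC with a fixed expiry."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self.entries = {}   # ip -> (mac, time_learned)

    def learn(self, ip: str, mac: str, now: float) -> None:
        # Plain ARP has no authentication: the latest reply simply overwrites the entry
        self.entries[ip] = (mac, now)

    def lookup(self, ip: str, now: float):
        entry = self.entries.get(ip)
        if entry is None or now - entry[1] > self.ttl:
            self.entries.pop(ip, None)
            return None     # caller would broadcast a fresh "who has ...?" request
        return entry[0]

cache = ArpCache()
cache.learn("192.168.1.1", "aa:bb:cc:dd:ee:ff", now=0)
fresh = cache.lookup("192.168.1.1", now=30)

# A spoofed reply is accepted exactly like a genuine one
cache.learn("192.168.1.1", "de:ad:be:ef:00:01", now=31)
poisoned = cache.lookup("192.168.1.1", now=32)
expired = cache.lookup("192.168.1.1", now=200)
```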

DHCP: Automatic IP Assignment

DHCP lets devices get an IP address, subnet mask, default gateway, and DNS server automatically. Without DHCP, every device needs manual configuration. The exchange is DORA:

  1. Discover: Client broadcasts “anyone have a DHCP server?” (UDP, src 0.0.0.0:68, dst 255.255.255.255:67)
  2. Offer: Server replies “I can give you 192.168.1.100, valid for 24 hours”
  3. Request: Client broadcasts “I’ll take the offer from server X” (still broadcast — there may be multiple DHCP servers)
  4. Acknowledge: Server confirms the lease

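Leases have a duration. At 50% of the lease time (T1), the client tries to renew with the server that issued the lease. If that fails, at 87.5% (T2) it broadcasts to any DHCP server. If the lease expires without a renewal, the client must stop using the address and start over with Discover.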
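For the 24-hour lease in the Offer example, the timers work out as follows. The sketch uses the default percentages described above; a server can override them, so treat the result as the default case. The function name is illustrative.

```python
def lease_timers(lease_seconds: int) -> dict:
    """Default DHCP timers: renew at 50% of the lease, rebind at 87.5%."""
    return {
        "renew_T1": lease_seconds * 0.5,
        "rebind_T2": lease_seconds * 0.875,
        "expire": lease_seconds,
    }

timers = lease_timers(24 * 3600)
hours = {name: seconds / 3600 for name, seconds in timers.items()}
```

So a client with a 24-hour lease starts renewing after 12 hours, falls back to broadcasting at 21 hours, and loses the address at 24 hours if nothing answered.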

ELI5: DHCP is like checking into a hotel. You show up (Discover), the front desk offers you a room (Offer), you confirm you want it (Request), they hand you the key (Acknowledge). The key works for your stay (lease duration). If you want to stay longer, you ask to renew before checkout.

ICMP: Network Diagnostics Protocol

ICMP is not a data transport — it’s for control messages and diagnostics. Key message types:

| Type | Name | Used by |
| --- | --- | --- |
| 0 | Echo Reply | ping response |
| 3 | Destination Unreachable | Port closed, host unreachable, fragmentation needed |
| 8 | Echo Request | ping |
| 11 | Time Exceeded | traceroute |

ping -c 4 google.com          # 4 packets, RTT stats
ping -M do -s 1472 192.168.1.1   # MTU check: 1472 + 28 header bytes = 1500; -M do forbids fragmentation (Linux iputils)

Common mistake: Blocking all ICMP on firewalls for “security.” This breaks Path MTU Discovery (PMTUD) and makes debugging hell. You should block echo requests from the public internet, but always allow ICMP Type 3 (Destination Unreachable) and Type 11 (Time Exceeded) — these are critical for correct operation.
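To show what a ping actually puts on the wire, here is a sketch that builds an ICMP Echo Request (Type 8 from the table) with a correct checksum. It only constructs the bytes; sending them requires a raw socket and usually root privileges, which is left out. The function names and the identifier, sequence, and payload values are illustrative.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by IP, ICMP, TCP, and UDP."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"ping") -> bytes:
    """ICMP header: type 8, code 0, checksum, identifier, sequence, then payload."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)   # checksum = 0 for now
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(identifier=0x1234, sequence=1)
```

A receiver validates the packet by running the same checksum over the whole message, checksum field included; a result of zero means the packet arrived intact.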


Ports and Sockets

Ports: Demultiplexing Traffic on a Host

An IP address gets traffic to a machine. A port gets traffic to the right process on that machine. 16-bit integer: 0–65535.

| Range | Name | Use |
| --- | --- | --- |
| 0–1023 | Well-known | HTTP (80), HTTPS (443), SSH (22), DNS (53), SMTP (25) |
| 1024–49151 | Registered | MySQL (3306), PostgreSQL (5432), Redis (6379), MongoDB (27017) |
| 49152–65535 | Ephemeral | OS-assigned source ports for outgoing connections |

A TCP or UDP connection is identified by the combination of (source IP, source port, destination IP, destination port, protocol). This 5-tuple is unique for every connection in the network stack; a socket is the OS handle for one end of it. Two clients can connect to the same server IP:port because they have different source IPs or source ports.

ss -tuln          # show listening sockets (no DNS resolution, numeric ports)
ss -tupn          # show established connections with process IDs
netstat -antp     # older equivalent, still widely available

ELI5: An IP address is like a building address, a port is like an apartment number. The postal service (OS network stack) delivers mail (packets) to the right apartment (process). The socket is the full address including who’s sending — “apartment 443 in building 1.2.3.4, sent by apartment 54321 in building 5.6.7.8.”
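The 5-tuple is easy to observe on loopback. This sketch opens a listener on an OS-chosen port, connects two clients to it, and collects the tuple for each accepted connection. Everything runs on 127.0.0.1, so the only field that differs between the two connections is the ephemeral source port.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))        # port 0 asks the OS to pick any free port
server.listen()
server_ip, server_port = server.getsockname()

# Two clients connect to the same server IP:port
clients = [socket.create_connection((server_ip, server_port)) for _ in range(2)]
accepted = [server.accept() for _ in clients]     # each item is (connection, peer_address)

# (source IP, source port, destination IP, destination port, protocol)
tuples = {(peer[0], peer[1], server_ip, server_port, "tcp") for _, peer in accepted}
client_ports = {client.getsockname()[1] for client in clients}

for connection, _ in accepted:
    connection.close()
for client in clients:
    client.close()
server.close()
```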

Ephemeral Port Exhaustion

Each outgoing TCP connection consumes one ephemeral port. The default range on Linux is 32768–60999 — about 28,000 ports. A high-traffic service making thousands of connections per second to the same destination can exhaust them.

Symptoms: connect: cannot assign requested address errors, even though CPU/memory are fine.

cat /proc/sys/net/ipv4/ip_local_port_range   # current ephemeral range
sysctl -w net.ipv4.ip_local_port_range="1024 65535"  # expand range

# Also check TIME_WAIT connections eating ports:
ss -s       # socket statistics summary

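Solutions: use connection pooling or keep-alive to reduce connection churn (the real fix), expand the ephemeral port range, enable net.ipv4.tcp_tw_reuse so new outbound connections can reuse ports stuck in TIME_WAIT, or spread connections across more source or destination IPs. SO_REUSEADDR and SO_REUSEPORT mainly affect binding listening sockets and do little for outbound port exhaustion, and Linux’s TIME_WAIT duration is fixed at 60 seconds rather than tunable.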

Why this matters: Port exhaustion is a real production issue for API gateways, proxies, and services that make many outbound requests. It’s non-obvious — the machine appears healthy but all new connections fail.
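A back-of-the-envelope estimate shows how quickly this limit arrives. The sketch assumes the worst case for a single destination IP:port: every connection is opened and closed by this host, and each closed connection holds its port in TIME_WAIT for 60 seconds (the Linux value). The function name is illustrative.

```python
def max_connection_rate(port_low: int = 32768, port_high: int = 60999,
                        time_wait_seconds: int = 60):
    """Return (available_ports, sustainable new connections per second)."""
    ports = port_high - port_low + 1
    return ports, ports / time_wait_seconds

default_ports, default_rate = max_connection_rate()
expanded_ports, expanded_rate = max_connection_rate(1024, 65535)
```

With the default range that is 28,232 ports and roughly 470 new connections per second to one destination; expanding the range to 1024–65535 raises it to about 1,075. Either way the ceiling is low enough that connection reuse matters far more than tuning.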


Network Types and Topologies

LAN, WAN, VPN

| Type | Scope | Speed | Latency | Your Control |
| --- | --- | --- | --- | --- |
| LAN | Building/campus | 1–100 Gbps | <1ms | Full |
| WAN | Cities/countries | 10–100 Gbps (backbone) | 10–200ms | ISP-dependent |
| VPN | Any (tunneled over WAN) | Limited by underlying WAN | +5–20ms overhead | Encryption layer only |

VPNs create an encrypted tunnel. Traffic exits your VPN client, gets encrypted, travels to the VPN endpoint, gets decrypted, then goes to the destination. You trade latency and bandwidth for privacy and simulated co-location.

Cloud Networking: VPCs

A VPC (Virtual Private Cloud) is your private network inside a cloud provider. Key components:

  • Subnets: Subdivisions of the VPC CIDR, tied to availability zones
  • Route tables: Attached to subnets, control where traffic goes
  • Security Groups: Stateful firewall at the instance/ENI level (allow rules only)
  • NACLs: Stateless firewall at the subnet level (allow and deny rules)
  • Internet Gateway: The door between your VPC and the public internet
  • NAT Gateway: Lets private subnet instances reach the internet without being reachable from it

ELI5: A VPC is like a private office building. The VPC is the building. Subnets are individual floors. Security groups are the locks on each office door. NACLs are the badge readers at each floor’s elevator. The Internet Gateway is the lobby entrance. The NAT Gateway is the mail slot — packages can go out, but nothing gets in unsolicited.

Overlay Networks: VXLAN and Container Networking

Physical networks use VLANs (max 4096) to segment traffic. At cloud scale, you need millions of virtual networks. VXLAN (Virtual Extensible LAN) solves this by encapsulating Ethernet frames inside UDP packets — an overlay over the existing IP network. VXLAN supports 16 million virtual network IDs.

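Docker (in overlay mode) and Kubernetes commonly use overlay networks. Kubernetes network plugins such as Flannel, Calico, and Cilium can build them with VXLAN or a similar encapsulation (for example IP-in-IP or Geneve), or skip the overlay and route pod IPs natively: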

  • Each pod gets an IP in a virtual network
  • Traffic between pods on different hosts gets VXLAN-encapsulated and routed via the host network
  • From the pod’s perspective, it’s just talking to another IP address

# Inspect Docker network
docker network ls
docker network inspect bridge

# View routing in a Kubernetes pod
kubectl exec -it <pod> -- ip route
kubectl exec -it <pod> -- ip addr

ELI5: VXLAN is like putting a letter inside another letter. The outer envelope (UDP/IP) is addressed to the physical machine. The inner envelope (Ethernet frame) is addressed to the virtual container. The outer envelope gets the data to the right building; the inner envelope gets it to the right tenant.
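The VXLAN header itself is only 8 bytes, and packing one shows where the 16 million figure comes from. This sketch follows the layout in RFC 7348 (a flags byte with the "VNI present" bit, 24 reserved bits, a 24-bit VNI, 8 reserved bits); the function names are illustrative, and the outer Ethernet, IP, and UDP headers are not built here.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port for VXLAN
VLAN_IDS = 2 ** 12           # 12-bit VLAN ID field
VXLAN_IDS = 2 ** 24          # 24-bit VXLAN Network Identifier (VNI)

def vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VXLAN header for a given virtual network ID."""
    if not 0 <= vni < VXLAN_IDS:
        raise ValueError("VNI must fit in 24 bits")
    flags_word = 0x08 << 24              # 0x08 = 'VNI is valid' flag, rest reserved
    return struct.pack("!II", flags_word, vni << 8)   # low 8 bits reserved

def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN payload."""
    _flags_word, vni_word = struct.unpack("!II", header[:8])
    return vni_word >> 8

header = vxlan_header(5000)
```

The encapsulation is not free: outer Ethernet (14) + IPv4 (20) + UDP (8) + VXLAN (8) adds 50 bytes per packet, which is why overlay networks often run with a reduced inner MTU or jumbo frames on the underlay.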


Summary: Quick Reference

Layer Reference

| When you see… | Layer | Likely cause |
| --- | --- | --- |
| Can’t ping, ARP failing | L1–L2 | Cable, switch, VLAN |
| Ping works, TCP RST | L3–L4 | Firewall, routing, closed port |
| TCP connects, HTTP 5xx | L7 | Application, load balancer config |
| High latency, packet loss | L1–L3 | Congestion, bad link, routing loop |

Address Types at a Glance

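| Address | Length | Scope | Changes? |
| --- | --- | --- | --- |
| MAC | 48 bits (6 bytes) | Local segment only | Rarely (hardware-assigned, but can be overridden or randomized) |
| IPv4 | 32 bits | Global (public) or local (private) | Yes (DHCP, NAT) |
| IPv6 | 128 bits | Global | Rarely |
| Port | 16 bits | Per-host process identifier | Per connection (ephemeral) |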

Subnet Quick Reference

| CIDR | Mask | Hosts | Common Use |
| --- | --- | --- | --- |
| /8 | 255.0.0.0 | ~16M | Large private networks |
| /16 | 255.255.0.0 | ~65K | Cloud VPCs |
| /24 | 255.255.255.0 | 254 | Subnets, typical floor |
| /28 | 255.255.255.240 | 14 | Small segments, DMZ |
| /30 | 255.255.255.252 | 2 | Point-to-point links |
| /32 | 255.255.255.255 | 1 | Single host routes |

The Debugging Ladder

Is the host reachable at all?  → ping
  └─ No  → check L1/L2: arp -n, check switch
  └─ Yes → Is the port open?  → nc -zv host port
             └─ No  → check firewall, ss -tuln on target
             └─ Yes → Is the app responding? → curl -v
                        └─ No  → check app logs, L7 config
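The middle rung of the ladder ("is the port open?") can be scripted when nc is not installed. This is a rough Python equivalent of `nc -zv host port`; the function name `port_open` and the 2-second timeout are assumptions. It reports only whether a TCP handshake completes, and it cannot tell a closed port from a firewall that silently drops packets beyond the fact that the latter takes the full timeout.

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port can be established within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:          # refused, timed out, unreachable, or name lookup failed
        return False

# Self-contained demo: open a listener on loopback, probe it, close it, probe again
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
demo_port = listener.getsockname()[1]

while_listening = port_open("127.0.0.1", demo_port)
listener.close()
after_close = port_open("127.0.0.1", demo_port)
```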

Next up: TCP & UDP Deep Dive — handshakes, flow control, congestion, and why TCP’s reliability costs you latency.