Type System & Protocols
Python’s type system is a layered thing — duck typing at the bottom, structural protocols in the middle, full generics at the top. You don’t have to live at any one layer. The skill is knowing which layer fits the problem.
Python’s Type Philosophy
Python started as a duck-typed language: if an object has the methods you need, it works, regardless of its declared type. No inheritance required. This is intentional, and it’s still the default.
Gradual typing is the formal name for what Python does: you can add type hints to any subset of your codebase, and everything still runs. A function with no hints and a function with full generics can coexist in the same file. The type checker ignores what you didn’t annotate.
The critical thing to internalize: type hints are not enforced at runtime by default. CPython ignores them completely. They exist for mypy, pyright, your IDE, and your colleagues. This is a feature, not a bug — it lets you adopt types without rewriting everything.
def greet(name: str) -> str:
    return f"Hello {name}"
greet(42)  # runs fine, no error: Python doesn't check the hint
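To make "not enforced" concrete, here is a minimal sketch. The function name `describe` is made up for illustration. It shows that hints are stored as plain metadata on the function and that nothing consults them when the call happens.

```python
def describe(count: int) -> str:
    # The annotation says int, but nothing checks it at call time.
    return f"count={count}"

# Hints are stored as ordinary metadata on the function object.
print(describe.__annotations__)

# A str is passed where int is annotated; the call still succeeds.
result = describe("three")
print(result)
```

A static checker such as mypy would be expected to flag the `describe("three")` call; the interpreter does not.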
The type spectrum from least to most strict:
| Level | What it looks like | Who benefits |
|---|---|---|
| No hints | def process(data): | Prototypes, scripts |
| Basic hints | def process(data: list) -> dict: | IDE autocomplete |
| Typed collections | def process(data: list[str]) -> dict[str, int]: | Mypy basic checks |
| Generics | def process(data: list[T]) -> T: | Library authors |
| Runtime validation | pydantic.BaseModel | API boundaries, user input |
ELI5: Type hints are like writing “FRAGILE” on a box. The postal service (Python) ignores it. But the person reading your label (your IDE, mypy) can use it to catch problems before the box gets thrown around.
Core Type Hints
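Primitives are straightforward: int, str, float, bool, bytes, None. In an annotation, None already denotes the type of the None value, so write -> None rather than type[None]; the class itself is types.NoneType (3.10+). Most of the time, though, what you actually want is an optional type such as str | None.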
Collections use the built-in names directly in Python 3.9+:
# Prefer this (3.9+)
def process(items: list[str], mapping: dict[str, int]) -> tuple[int, ...]:
    ...

# Old way (still valid, just verbose)
from typing import List, Dict, Tuple
def process(items: List[str], mapping: Dict[str, int]) -> Tuple[int, ...]:
    ...
tuple[int, str] means exactly two elements, an int then a str. tuple[int, ...] means a tuple of any length, all ints.
Optional and Union:
# These are identical; prefer | in 3.10+
from pathlib import Path
from typing import Optional, Union
def get_user(id: int) -> Optional[str]: ...  # old
def get_user(id: int) -> str | None: ...  # modern
def load(src: Union[str, Path]) -> bytes: ...  # old
def load(src: str | Path) -> bytes: ...  # modern
Any is the escape hatch. When you use Any, you’re telling the type checker “stop looking here.” It’s contagious — Any in, Any out, type safety gone.
from typing import Any
def legacy_api(data: Any) -> Any:  # type checker gives up on both sides
    return data["result"]
Literal constrains a parameter to specific values without a full enum:
from typing import Literal
Mode = Literal["read", "write", "append"]
def open_file(path: str, mode: Mode) -> None: ...
open_file("data.txt", "read") # ok
open_file("data.txt", "delete") # mypy error
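Since Literal is only checked statically, a string arriving from outside (a CLI flag, a config file) still needs a runtime check. A small sketch of one way to do that, reusing the Literal's own values via `typing.get_args`; the helper name `parse_mode` is hypothetical.

```python
from typing import Literal, cast, get_args

Mode = Literal["read", "write", "append"]

def parse_mode(raw: str) -> Mode:
    """Hypothetical helper: check an untrusted string against the Literal's values."""
    allowed = get_args(Mode)  # the tuple of values declared in the Literal
    if raw not in allowed:
        raise ValueError(f"mode must be one of {allowed}, got {raw!r}")
    # The membership test is not understood by checkers as narrowing,
    # so the result is cast after the runtime check has passed.
    return cast(Mode, raw)

print(parse_mode("read"))
```

This keeps the list of valid values in one place: adding a value to `Mode` updates both the static check and the runtime check.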
TypeAlias and the type statement (3.12+):
# 3.10-3.11
from typing import TypeAlias
Vector: TypeAlias = list[float]
# 3.12+ — cleaner, no import needed
type Vector = list[float]
type Matrix = list[Vector]
ELI5: Literal is like a dropdown menu instead of a free-text field. The type checker enforces that you only pick from the list.
Common mistake: Using list instead of list[T] and thinking you get type safety. list alone is equivalent to list[Any] — the checker won’t warn about what’s inside.
Generics and TypeVar
When a function’s output type depends on its input type, you need generics. Without them, you’re forced to return Any or overload.
from typing import TypeVar
T = TypeVar("T")
def first(items: list[T]) -> T:  # T flows from input to output
    return items[0]
x: int = first([1, 2, 3])  # T = int, mypy knows the result is int
y: str = first(["a", "b"])  # T = str
Bound TypeVar — constrain what T can be:
from typing import Any, Protocol, TypeVar
class Comparable(Protocol):
    def __lt__(self, other: Any) -> bool: ...
C = TypeVar("C", bound=Comparable)
def minimum(a: C, b: C) -> C:  # C must support __lt__
    return a if a < b else b
Constrained TypeVar — T can only be exactly one of the listed types:
Numeric = TypeVar("Numeric", int, float)  # solved to exactly int or float; a subclass argument is treated as its listed base
def double(x: Numeric) -> Numeric:
    return x * 2
Bound vs constrained is a common confusion point:
| | Bound | Constrained |
|---|---|---|
| T = TypeVar("T", bound=X) | T is X or any subclass of X | n/a |
| T = TypeVar("T", X, Y) | n/a | T is exactly X or exactly Y |
| Typical use | “must support interface” | “only these exact types” |
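The difference is easiest to see side by side. A minimal sketch, assuming two made-up classes `Animal` and `Dog` and two illustrative functions `rename` and `twice`; the comments describe what a type checker is expected to infer, which the runtime itself does not verify.

```python
from typing import TypeVar

class Animal:
    def __init__(self, name: str) -> None:
        self.name = name

class Dog(Animal):
    pass

A = TypeVar("A", bound=Animal)   # bound: Animal or any subclass; the subclass is preserved
N = TypeVar("N", int, float)     # constrained: solved to exactly int or exactly float

def rename(pet: A, name: str) -> A:
    pet.name = name
    return pet        # a Dog argument comes back typed as Dog, not Animal

def twice(x: N) -> N:
    return x * 2

dog = rename(Dog("Rex"), "Max")
print(type(dog).__name__, dog.name)

# bool is a subclass of int, so this call is accepted,
# but a checker solves N to int rather than bool.
print(twice(True))
```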
Generic classes:
from typing import Generic, TypeVar
T = TypeVar("T")
class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: list[T] = []
    def push(self, item: T) -> None:
        self._items.append(item)
    def pop(self) -> T:
        return self._items.pop()
s: Stack[int] = Stack()
s.push(1)
s.push("oops")  # mypy error
Python 3.12 syntax eliminates the TypeVar declaration noise:
# 3.12+: no TypeVar import needed
def first[T](items: list[T]) -> T:
    return items[0]
class Stack[T]:
    def push(self, item: T) -> None: ...
    def pop(self) -> T: ...
ParamSpec lets you type decorators that preserve the wrapped function’s signature:
from typing import ParamSpec, Callable, TypeVar
P = ParamSpec("P")
R = TypeVar("R")
def logged(fn: Callable[P, R]) -> Callable[P, R]:
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        print(f"Calling {fn.__name__}")
        return fn(*args, **kwargs)
    return wrapper
@logged
def add(x: int, y: int) -> int:
    return x + y
add(1, "oops")  # mypy still catches this: the signature is preserved
ELI5: TypeVar is a blank label that gets filled in based on usage. If you pass a list of ints, the label says “int” for that call. Generic classes are like templates with blank labels on the blueprint.
Variance: Covariance, Contravariance, Invariance
This is where most people’s eyes glaze over. But it matters when you use containers as function arguments.
The problem: If Dog is a subtype of Animal, is list[Dog] a subtype of list[Animal]?
The answer is no — and the reason is mutation:
def add_cat(animals: list[Animal]) -> None:
    animals.append(Cat())  # valid for list[Animal]
dogs: list[Dog] = [Dog()]
add_cat(dogs)  # now dogs contains a Cat: broken!
list[T] is invariant: you can’t substitute list[Dog] for list[Animal] in either direction. The type checker rejects both.
Covariance — substitution is safe in the “out” direction (read-only). Sequence[Dog] IS safely usable as Sequence[Animal] because Sequence is read-only:
from typing import Sequence
def print_all(animals: Sequence[Animal]) -> None:
    for a in animals:
        print(a.name)  # just reading, never writing
dogs: Sequence[Dog] = [Dog()]
print_all(dogs)  # safe: Dog has everything Animal has
Contravariance — substitution is safe in the “in” direction. A function that handles Animal can handle Dog too (since Dog is an Animal):
from typing import Callable
def apply_to_dog(fn: Callable[[Dog], None]) -> None:
    fn(Dog())
def handle_animal(a: Animal) -> None:
    print(a.name)
apply_to_dog(handle_animal)  # safe: handle_animal works on any Animal, including Dog
Summary table:
| Type | Direction | Example | Why |
|---|---|---|---|
| Invariant | Neither | list[T] | Supports read and write |
| Covariant | Subtype flows up | Sequence[T], Iterator[T] | Read-only |
| Contravariant | Supertype flows down | Callable[[T], None] | Write/consume only |
ELI5: Covariance: a bag of apples can serve as a bag of fruit when someone just wants to grab and eat. Contravariance: a machine designed to process any fruit can process apples — you’d never send a machine that only handles apples to process all fruit. Invariance: a vending machine slot for size-A cans won’t accept size-B even if size-B is “almost the same.”
For your own generics, mark variance explicitly:
T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)
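A sketch of how those two TypeVars might be used in practice. The classes `ReadOnlyBox` and `Sink` are invented for illustration: one only produces values (so it can be covariant), the other only consumes them (so it can be contravariant).

```python
from typing import Generic, TypeVar

T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)

class Animal:
    name = "animal"

class Dog(Animal):
    name = "dog"

class ReadOnlyBox(Generic[T_co]):
    """Only hands out T_co values, so covariance is safe."""
    def __init__(self, item: T_co) -> None:
        self._item = item
    def get(self) -> T_co:
        return self._item

class Sink(Generic[T_contra]):
    """Only accepts T_contra values, so contravariance is safe."""
    def __init__(self) -> None:
        self.seen: list[str] = []
    def send(self, item: T_contra) -> None:
        self.seen.append(type(item).__name__)

def show(box: ReadOnlyBox[Animal]) -> str:
    return box.get().name

def feed_dog(sink: Sink[Dog]) -> None:
    sink.send(Dog())

dog_box: ReadOnlyBox[Dog] = ReadOnlyBox(Dog())
print(show(dog_box))       # ReadOnlyBox[Dog] is accepted as ReadOnlyBox[Animal]

animal_sink: Sink[Animal] = Sink()
feed_dog(animal_sink)      # Sink[Animal] is accepted as Sink[Dog]
print(animal_sink.seen)
```

If `ReadOnlyBox` grew a `set(item: T_co)` method, a checker would be expected to reject the class definition, which is the variance declaration doing its job.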
The Liskov Substitution Principle (LSP) is the underlying rule: a subtype must be usable wherever its parent is, without breaking the program. Variance rules are how mypy enforces LSP in containers.
Protocol — Structural Subtyping
ABC requires explicit inheritance. Protocol does not — if an object has the right methods and attributes, it satisfies the protocol. This is the formal way to express duck typing with type checking.
from typing import Protocol
class Drawable(Protocol):
    def draw(self) -> None: ...
class Circle:
    def draw(self) -> None:
        print("O")
class Square:
    def draw(self) -> None:
        print("□")
def render(shape: Drawable) -> None:
    shape.draw()
render(Circle())  # works: Circle has draw()
render(Square())  # works: Square has draw()
# Neither inherits from Drawable
runtime_checkable adds isinstance() support, but only checks method names, not signatures:
from typing import runtime_checkable, Protocol
@runtime_checkable
class Closeable(Protocol):
    def close(self) -> None: ...
class Connection:
    def close(self) -> None: ...
isinstance(Connection(), Closeable)  # True
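The "names only" caveat is worth seeing once. In this sketch the classes `WrongSignature` and `NoClose` are made up; the first has a `close` method with an incompatible signature and still passes the isinstance() check.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Closeable(Protocol):
    def close(self) -> None: ...

class WrongSignature:
    # Requires an argument and returns a str: not a valid Closeable statically.
    def close(self, force: bool) -> str:
        return "closed"

class NoClose:
    pass

print(isinstance(WrongSignature(), Closeable))  # True: only the presence of `close` is checked
print(isinstance(NoClose(), Closeable))         # False: there is no `close` attribute at all
```

So a passing isinstance() check against a protocol tells you the attribute exists, not that calling it the protocol's way will work.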
Building real protocols:
from typing import Any, Protocol, TypeVar
T = TypeVar("T")
class Repository(Protocol[T]):
    def get(self, id: int) -> T: ...
    def save(self, entity: T) -> None: ...
    def delete(self, id: int) -> bool: ...
class Serializable(Protocol):
    def to_dict(self) -> dict[str, Any]: ...
    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "Serializable": ...
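To show what satisfying such a protocol looks like, here is a sketch with a hypothetical `User` dataclass and an `InMemoryUserRepo` that never mentions Repository, yet can be passed wherever a Repository[User] is expected.

```python
from dataclasses import dataclass
from typing import Protocol, TypeVar

T = TypeVar("T")

class Repository(Protocol[T]):
    def get(self, id: int) -> T: ...
    def save(self, entity: T) -> None: ...
    def delete(self, id: int) -> bool: ...

@dataclass
class User:
    id: int
    name: str

class InMemoryUserRepo:
    """Hypothetical implementation; note that it does not inherit from Repository."""
    def __init__(self) -> None:
        self._rows: dict[int, User] = {}
    def get(self, id: int) -> User:
        return self._rows[id]
    def save(self, entity: User) -> None:
        self._rows[entity.id] = entity
    def delete(self, id: int) -> bool:
        # True if a row was removed, False if the id was unknown.
        return self._rows.pop(id, None) is not None

def rename_user(repo: Repository[User], id: int, name: str) -> User:
    user = repo.get(id)
    user.name = name
    repo.save(user)
    return user

repo = InMemoryUserRepo()
repo.save(User(1, "Alice"))
print(rename_user(repo, 1, "Alicia"))
```

Swapping in a database-backed class later requires no change to `rename_user`, as long as the new class has the same three methods.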
Protocol vs ABC decision framework:
| Question | Lean Protocol | Lean ABC |
|---|---|---|
| Do you control all implementors? | No | Yes |
| Is inheritance the right model? | No | Yes |
| Do you need shared implementation? | No | Yes |
| Is it a third-party integration? | Yes | — |
| Do you want duck typing + type safety? | Yes | — |
Think of it this way: ABC is a job description you must explicitly sign. Protocol is a skills assessment — if you can do the work, you qualify, regardless of who hired you or what your title is.
Common mistake: Making every Protocol @runtime_checkable. It adds overhead and only checks method existence, not signatures. Use it only when you actually call isinstance().
Advanced Typing Features
@overload — different return types for different input types:
from typing import overload
@overload
def load(path: str) -> str: ...
@overload
def load(path: bytes) -> bytes: ...
def load(path: str | bytes) -> str | bytes:  # actual implementation
    if isinstance(path, str):
        return path.upper()
    return path.upper()
Callers are checked only against the @overload signatures; the implementation’s own signature is invisible to them, although its body is still type-checked.
TypeGuard and TypeIs — narrow types inside conditionals:
from typing import TypeGuard
def is_str_list(val: list[object]) -> TypeGuard[list[str]]:
    return all(isinstance(x, str) for x in val)
items: list[object] = ["a", "b"]
if is_str_list(items):
    print(items[0].upper())  # mypy knows items is list[str] here, so str methods are allowed
TypeIs (3.13+) is stricter — it narrows in both the true and false branches.
Never and NoReturn:
from typing import Never, NoReturn
def raise_error(msg: str) -> NoReturn:  # function never returns normally
    raise RuntimeError(msg)
def assert_never(value: Never) -> Never:  # called only if type narrowing missed a case
    raise AssertionError(f"Unexpected value: {value}")
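Here is how assert_never is typically used for exhaustiveness checking. This sketch defines the helper locally with NoReturn so that it also runs on versions before 3.11 (on 3.11+ `typing.assert_never` and `Never` serve the same purpose); the `Color` enum and `to_hex` function are made up.

```python
from enum import Enum
from typing import NoReturn

def assert_never(value: NoReturn) -> NoReturn:
    # Reached at runtime only if a case was missed.
    raise AssertionError(f"Unexpected value: {value!r}")

class Color(Enum):
    RED = "red"
    GREEN = "green"

def to_hex(color: Color) -> str:
    if color is Color.RED:
        return "#ff0000"
    elif color is Color.GREEN:
        return "#00ff00"
    else:
        # If a new Color member is added and not handled above, a type checker
        # is expected to flag this call, because `color` is no longer narrowed
        # to the empty type here.
        assert_never(color)

print(to_hex(Color.GREEN))
```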
Self — return type for methods that return self:
from typing import Self
class Builder:
    def set_name(self, name: str) -> Self:  # returns the actual subclass, not Builder
        self.name = name
        return self
class AdvancedBuilder(Builder):
    pass
b: AdvancedBuilder = AdvancedBuilder().set_name("x")  # mypy knows the type is AdvancedBuilder
Without Self, you’d have to use TypeVar("T", bound="Builder") — much more verbose.
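For comparison, the pre-Self spelling looks roughly like this. The TypeVar name `TBuilder` and the `set_level` method are illustrative additions; the point is the explicit annotation on `self`.

```python
from typing import TypeVar

TBuilder = TypeVar("TBuilder", bound="Builder")

class Builder:
    def __init__(self) -> None:
        self.name = ""

    # Annotating `self` with the TypeVar makes the return type follow the subclass.
    def set_name(self: TBuilder, name: str) -> TBuilder:
        self.name = name
        return self

class AdvancedBuilder(Builder):
    def set_level(self, level: int) -> "AdvancedBuilder":
        self.level = level
        return self

# Chaining works because set_name is typed as returning AdvancedBuilder here.
b = AdvancedBuilder().set_name("x").set_level(3)
print(type(b).__name__, b.name, b.level)
```

Both spellings behave identically at runtime; Self just removes the TypeVar declaration and the annotated `self`.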
TypedDict — type safety for dict-shaped data:
from typing import TypedDict, NotRequired  # NotRequired needs Python 3.11+
class UserRecord(TypedDict):
    id: int
    name: str
    email: NotRequired[str]  # optional key
def process(user: UserRecord) -> None:
    print(user["name"])  # type-safe access
process({"id": 1, "name": "Alice"})  # ok
process({"id": 1, "name": "Alice", "email": "a@b.c"})  # ok
process({"id": 1})  # mypy error: missing 'name'
TypedDict is the right tool when you’re working with JSON-shaped data and can’t use dataclasses or pydantic — common in API response parsing.
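One caveat when parsing JSON: a TypedDict is a plain dict at runtime and validates nothing. The sketch below uses a reduced two-key `UserRecord` and a made-up JSON payload to show that a wrong value type passes straight through.

```python
import json
from typing import TypedDict, cast

class UserRecord(TypedDict):
    id: int
    name: str

# Note that "id" arrives as a string in this payload.
raw = json.loads('{"id": "1", "name": "Alice"}')

# cast() is a static claim only; nothing is checked or converted.
user = cast(UserRecord, raw)

print(type(user) is dict)             # TypedDict instances are ordinary dicts
print(repr(user["id"]))               # still the string from the payload
print(UserRecord.__required_keys__)   # key names are available for manual checks
```

If the payload is untrusted, pair the TypedDict with an explicit runtime check at the point where the data enters.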
ELI5: TypedDict is a blueprint for a dictionary. You specify exactly which keys exist and what type their values are. The type checker checks every access and construction against the blueprint.
Runtime Type Checking
Type hints don’t run. But sometimes you need actual runtime enforcement — API boundaries, user-uploaded configs, deserialized JSON.
isinstance() and issubclass() are the baseline. For ordinary classes and built-in types they are cheap. For a Protocol marked @runtime_checkable, they only check that the required members exist, not their signatures.
typing.get_type_hints() resolves annotations at runtime, handling forward references:
import typing
class User:
    name: str
    age: int
print(typing.get_type_hints(User))
# {'name': <class 'str'>, 'age': <class 'int'>}
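As a rough illustration of what the runtime-validation libraries build on, here is a hand-rolled check driven by get_type_hints(). The helper `check_fields` is hypothetical and deliberately limited: it only handles hints that are plain classes and skips anything generic.

```python
import typing

class User:
    name: str
    age: int

def check_fields(cls: type, data: dict[str, object]) -> list[str]:
    """Hypothetical helper: compare a dict against a class's annotations.

    Only hints that are plain classes (str, int, ...) are checked; generic
    hints such as list[str] are skipped because isinstance() cannot test them.
    """
    problems: list[str] = []
    for field, expected in typing.get_type_hints(cls).items():
        if field not in data:
            problems.append(f"{field}: missing")
            continue
        is_plain_class = typing.get_origin(expected) is None and isinstance(expected, type)
        if is_plain_class and not isinstance(data[field], expected):
            got = type(data[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {got}")
    return problems

print(check_fields(User, {"name": "Alice", "age": 30}))
print(check_fields(User, {"name": "Alice", "age": "30"}))
```

Real libraries also handle nested and generic types, coercion, and error reporting, which is why reaching for one is usually better than extending a helper like this.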
Library comparison:
| Library | How it works | Overhead | Best for |
|---|---|---|---|
| pydantic | Validates on construction | Medium | API schemas, config |
| beartype | Decorator, checks at call time | Low | Library boundaries |
| typeguard | Decorator, checks at call time | Medium | Testing only |
| cattrs | Explicit converters | Low | Serialization |
When runtime checking is worth it:
- At trust boundaries: user input, config files, external API responses
- In library public APIs when callers can’t be trusted
- During testing to catch type contract violations early
When it’s not worth it:
- Internal function calls within a module you control
- Hot paths where overhead matters
- When pydantic models would require restructuring clean code
ELI5: Static type checking is like proofreading before you send an email. Runtime checking is like a spell-check that runs the moment you click send. The first is free; the second costs CPU on every send.
Common mistake: Adding beartype or typeguard to every function in the codebase. Type checking has overhead. Reserve it for the entry points where untrusted data enters your system.
Type Checking Tools
mypy — the reference implementation. Strictest, most complete, slowest. Configure in pyproject.toml:
[tool.mypy]
strict = true
python_version = "3.12"
warn_return_any = true
warn_unused_ignores = true
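--strict enables most of the checks you want for production code, including disallow_untyped_defs, disallow_any_generics, warn_return_any, and warn_unused_ignores (so the last two lines above are redundant alongside strict = true, though harmless). warn_unreachable is not part of --strict and has to be enabled separately.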
pyright — Microsoft’s checker, what VS Code/Pylance uses under the hood. Faster than mypy, slightly different rules. Better for real-time IDE feedback.
pytype — Google’s checker. Infers types from usage, good for adding types to untyped code. Slower.
| Tool | Speed | Inference | IDE | Strictness |
|---|---|---|---|---|
| mypy | Slow | Manual | Limited | Highest |
| pyright | Fast | Some | VS Code native | High |
| pytype | Very slow | Strong | Limited | Medium |
Gradual adoption strategy — don’t try to type everything at once:
- Run mypy --ignore-missing-imports and fix only errors in files you’re editing
- Add # type: ignore sparingly for legacy code you won’t touch
- Enable strict for new files via per-module config:
[[tool.mypy.overrides]]
module = "myapp.new_module.*"
strict = true
[[tool.mypy.overrides]]
module = "myapp.legacy.*"
ignore_errors = true
- Tighten the legacy exceptions over time as you touch those files
Decision Table
| Situation | What to use |
|---|---|
| Function input/output can be different types | @overload |
| Return type is the same class as self | Self |
| Need an interface without forcing inheritance | Protocol |
| Need shared implementation in the interface | ABC |
| Dict with known keys | TypedDict |
| String parameter with limited values | Literal |
| Generic function/class | TypeVar or 3.12 [T] syntax |
| Validate data at runtime | pydantic or beartype |
| Escape hatch for untyped third-party code | Any with a comment |
| Type narrowing in if-blocks | TypeGuard / TypeIs |
| Function that never returns | NoReturn |
| Exhaustive match checking | assert_never(Never) |