
Object Model & Data Model

Python’s data model is the contract between your code and the interpreter. Master it and everything clicks — descriptors, super(), property, @classmethod, slots, all of it. Skip it and you cargo-cult patterns you don’t understand.


Everything Is an Object

In Python, “everything is an object” isn’t marketing copy — it’s a precise statement. Functions, classes, modules, None, integers, types themselves — all are instances of some class, all have an identity, a type, and a value.

>>> type(int)        # int is an instance of type
<class 'type'>
>>> type(type)       # type is an instance of itself
<class 'type'>
>>> isinstance(int, object)  # and, like everything else, an instance of object
True

The four identity and equality tools you must know:

| Tool | What it gives you | When to use |
| --- | --- | --- |
| id(x) | Memory address (CPython) | Debugging identity |
| type(x) | The exact class, no inheritance | Type dispatch |
| x is y | Same object in memory | None checks, singletons |
| x == y | Calls __eq__, may be overridden | Value equality |

is and == are not interchangeable. Use is only for None, True, False, and sentinel objects. For everything else, use ==.

CPython interning: the trap

CPython caches small integers (-5 to 256) and many short strings as a performance optimization. This creates a subtle trap:

a = 256; b = 256
a is b   # True — same cached object

a = 257
b = int("257")   # built at runtime, so the compiler cannot reuse one constant
a is b   # False in CPython: two separate objects (an implementation detail)

ELI5: Integers -5 to 256 are like pre-printed forms in a government office — everyone gets the same sheet because they’re so common. Bigger numbers are printed fresh each time. is checks if you got the same sheet of paper; == checks if the paper says the same thing.

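Common mistake: Testing if x is 1 or if x is "hello". This can work by coincidence, because CPython caches small integers, interns some strings, and reuses identical constants within one compiled block of code. None of that is guaranteed, so the same test can fail on values computed at runtime. Python 3.8+ flags it with a SyntaxWarning.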
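
A minimal sketch of the distinction, using lists so the result does not depend on any caching. The names _MISSING and lookup are hypothetical, introduced only for illustration:

```python
# Two equal lists built separately: equal values, distinct objects.
a = [1, 2, 3]
b = [1, 2, 3]
same_value = (a == b)    # True: list.__eq__ compares contents
same_object = (a is b)   # False: two separate list objects

# A sentinel is a legitimate use of `is`: it distinguishes
# "no default passed" from "None passed explicitly".
_MISSING = object()

def lookup(mapping, key, default=_MISSING):
    value = mapping.get(key, _MISSING)
    if value is _MISSING:            # identity check against the sentinel
        if default is _MISSING:
            raise KeyError(key)
        return default
    return value
```

Here lookup({"a": None}, "a") returns None rather than treating it as missing, which an == None test could not express.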


The Attribute Lookup Chain

When you write obj.attr, Python doesn’t just look in one place. It runs a multi-step algorithm that most people don’t know:

1. type(obj).__mro__  → search for a DATA DESCRIPTOR in the class hierarchy
2. obj.__dict__       → instance dictionary
3. type(obj).__mro__  → search for non-data descriptor or plain class attribute
4. type(obj).__getattr__(obj, 'attr')  → fallback if defined
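
The four steps above can be observed directly. This is a small sketch with hypothetical class names (DataDesc, NonDataDesc, Demo); it writes straight into the instance dict to set up the name clashes:

```python
class DataDesc:
    """Defines __get__ and __set__, so it is a data descriptor."""
    def __get__(self, obj, objtype=None):
        return "data descriptor"
    def __set__(self, obj, value):
        raise AttributeError("read-only")

class NonDataDesc:
    """Defines only __get__, so it is a non-data descriptor."""
    def __get__(self, obj, objtype=None):
        return "non-data descriptor"

class Demo:
    a = DataDesc()
    b = NonDataDesc()
    c = "class attribute"

    def __getattr__(self, name):      # step 4: only when everything else fails
        return f"fallback for {name}"

d = Demo()
# Bypass the descriptors and plant same-named entries in the instance dict.
d.__dict__.update(a="instance", b="instance", c="instance")

step1 = d.a          # data descriptor beats the instance dict
step2 = (d.b, d.c)   # instance dict beats non-data descriptor and class attr
del d.__dict__["b"]
step3 = d.b          # with no instance entry, the non-data descriptor is used
step4 = d.missing    # nothing found anywhere, so __getattr__ runs
```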

__getattribute__ intercepts everything

Every attribute access on an object calls type(obj).__getattribute__(obj, name). You almost never override this directly — it’s the engine running the lookup chain above. Override __getattr__ instead, which is only called when normal lookup fails.

class Strict:
    def __getattr__(self, name):
        raise AttributeError(f"No dynamic attributes allowed: {name}")

ELI5: __getattribute__ is the receptionist who handles every visitor. __getattr__ is the lost-and-found office — only called when the receptionist can’t find where you’re supposed to go.

Data descriptors vs non-data descriptors

This is where lookup priority gets non-obvious:

| Type | Has __get__ | Has __set__/__delete__ | Priority vs instance dict |
| --- | --- | --- | --- |
| Data descriptor | Yes | Yes | Higher than instance dict |
| Non-data descriptor | Yes | No | Lower than instance dict |
| Plain class attr | No | No | Lower than instance dict |

property is a data descriptor (it has __get__ and __set__). This is why setting an instance attribute with the same name as a property doesn’t shadow it — the property wins.

class Foo:
    @property
    def x(self): return 42

f = Foo()
f.__dict__['x'] = 99   # force write to instance dict
f.x                    # still 42 — data descriptor wins

ELI5: Imagine a hotel safe (data descriptor) and a drawer in the room (instance dict). The hotel safe always takes priority because it’s controlled by the hotel. A note left in the drawer doesn’t override the safe.


Descriptors Deep Dive

Descriptors are the mechanism under property, classmethod, staticmethod, and Django/SQLAlchemy fields. Once you understand them, you stop treating these as magic.

The protocol

class Descriptor:
    def __set_name__(self, owner, name):   # called at class creation
        self.name = name

    def __get__(self, obj, objtype=None):  # obj is None when accessed from class
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):         # makes this a DATA descriptor
        obj.__dict__[self.name] = value

    def __delete__(self, obj):
        del obj.__dict__[self.name]

__set_name__ is called automatically when the owning class is created, right after its body has executed. This is how a descriptor learns its own attribute name without being told.

Real example: type-checked attribute

class Typed:
    def __set_name__(self, owner, name):
        self.name = name
        self.private = f"_{name}"

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private, None)

    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"{self.name} must be {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        setattr(obj, self.private, value)

class Int(Typed):
    expected_type = int

class Person:
    age = Int()   # descriptor instance lives on the CLASS

    def __init__(self, age):
        self.age = age  # calls Int.__set__

p = Person(30)   # fine
p.age = "old"    # TypeError
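
To see where the value actually lives, here is the same pattern as a self-contained sketch (the expected_type default on Typed is an addition for illustration):

```python
class Typed:
    expected_type = object   # assumption: a permissive default for the base

    def __set_name__(self, owner, name):
        self.name = name
        self.private = f"_{name}"

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private, None)

    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"{self.name} must be {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        setattr(obj, self.private, value)

class Int(Typed):
    expected_type = int

class Person:
    age = Int()

    def __init__(self, age):
        self.age = age

p = Person(30)
stored = dict(vars(p))     # the value sits under the private name
descriptor = Person.age    # class access returns the descriptor itself

try:
    p.age = "old"
except TypeError as exc:
    message = str(exc)
```

One caveat of isinstance-based checks: bool is a subclass of int, so Person(True) would be accepted.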

Non-data descriptor: lazy property pattern

class lazy:
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        # write to instance dict — next access bypasses descriptor
        obj.__dict__[self.func.__name__] = value
        return value

class Circle:
    def __init__(self, r): self.r = r

    @lazy
    def area(self):
        return 3.14159 * self.r ** 2

This works because lazy has no __set__ (non-data descriptor), so after the first call the instance dict entry shadows it.

ELI5: A lazy property is like a package that’s only assembled when you first open the box. After that, the assembled item sits there — you never rebuild it. A regular property is like a factory that rebuilds it every time you ask.
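
The standard library ships the same idea as functools.cached_property (Python 3.8+). A short sketch, with a calls counter added purely to make the caching visible:

```python
from functools import cached_property

class Circle:
    def __init__(self, r):
        self.r = r
        self.calls = 0   # illustration only: counts how often area is computed

    @cached_property
    def area(self):
        self.calls += 1
        return 3.14159 * self.r ** 2

c = Circle(2)
first = c.area
second = c.area                # served from the instance dict, no recompute
calls_after_two_reads = c.calls
cached = "area" in vars(c)     # the cached value lives in c.__dict__

del c.area                     # drop the cached value to force a recompute
third = c.area
calls_after_invalidation = c.calls
```

Both the hand-written lazy descriptor and cached_property need an instance __dict__ to write into, so neither works on classes that use __slots__ without a __dict__ slot.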


Method Resolution Order (MRO)

Python uses C3 linearization to compute the MRO. You don’t need to memorize the algorithm, but you need to understand what it guarantees:

  1. A class always comes before its parents
  2. The order in which bases are listed in a class statement is preserved (for D(B, C), B comes before C)
  3. If no ordering satisfies both rules, class creation raises TypeError

class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

D.__mro__
# (<class 'D'>, <class 'B'>, <class 'C'>, <class 'A'>, <class 'object'>)

The diamond is handled: A appears once, after both B and C.
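
The third guarantee is easy to trigger. A minimal sketch (Broken is a hypothetical class name) that asks for an impossible ordering:

```python
class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

mro_names = [cls.__name__ for cls in D.__mro__]

mro_error = None
try:
    # Listing A before B demands "A ahead of B", but B is a subclass of A
    # and must come before it. No linearization satisfies both rules.
    class Broken(A, B):
        pass
except TypeError as exc:
    mro_error = exc
```

Swapping the bases to (B, A) resolves the conflict.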

super() is next-in-MRO, not parent

This is the single most misunderstood thing about super():

class B(A):
    def method(self):
        super().method()   # NOT "call A.method"
                           # "call the next class after B in the MRO of the actual instance"

If the actual instance is a D, and D.__mro__ is (D, B, C, A, object), then super() inside B.method calls C.method, not A.method.
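
A quick way to convince yourself is to record the call order. In this sketch the calls list is an illustration device, not part of any pattern:

```python
calls = []

class A:
    def method(self):
        calls.append("A")

class B(A):
    def method(self):
        calls.append("B")
        super().method()   # next class after B in type(self).__mro__

class C(A):
    def method(self):
        calls.append("C")
        super().method()

class D(B, C):
    def method(self):
        calls.append("D")
        super().method()

D().method()
order_for_d = list(calls)   # B hands off to C because the instance is a D

calls.clear()
B().method()
order_for_b = list(calls)   # the same B.method hands off to A for a plain B
```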

Cooperative multiple inheritance

The pattern that makes super() work across diamond hierarchies:

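class A:
    def setup(self, **kwargs):
        # A introduces setup(), so the chain ends here: object has no
        # setup(), and calling super().setup() would raise AttributeError.
        if kwargs:
            raise TypeError(f"unexpected arguments: {sorted(kwargs)}")

class B(A):
    def setup(self, b_param=None, **kwargs):
        self.b = b_param
        super().setup(**kwargs)

class C(A):
    def setup(self, c_param=None, **kwargs):
        self.c = c_param
        super().setup(**kwargs)

class D(B, C):
    def setup(self, **kwargs):
        super().setup(**kwargs)

Every class in the chain must accept **kwargs, consume its own parameters, and pass the rest along with super(). The one exception is the root class that introduces the method (A here): it terminates the chain, because object has no setup(). If any other class swallows **kwargs without calling super(), the classes after it in the MRO never run.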

ELI5: MRO is like a relay race order decided before the race starts. super() doesn’t say “pass the baton to my parent” — it says “pass to whoever is next in the relay order.” That might not be your direct parent.

Common mistake: Writing super(ClassName, self) in Python 3. It still works, but it is redundant and goes stale when the class is renamed. Just write super(); inside a method, Python 3 fills in the class and instance automatically.


__slots__

What it does

By default, every instance stores its attributes in a __dict__ (a full Python dictionary). __slots__ replaces that with a fixed-size C-level struct:

class Point:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

Memory comparison

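The per-instance __dict__ adds overhead that depends on the CPython version: roughly one to a few hundred bytes on older releases, less on recent ones that share keys between instances and create the dict lazily. A slot costs roughly 8 bytes per attribute (one pointer) on a 64-bit build.

The figures below are illustrative; measure on your own interpreter.

| 1M instances of Point(x, y) | Memory |
| --- | --- |
| With __dict__ | ~350 MB |
| With __slots__ | ~56 MB |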

Trade-offs

| With __slots__ | Without __slots__ |
| --- | --- |
| Fixed attribute set | Dynamic attributes |
| ~6x memory savings | Flexible |
| Slightly faster access | Compatible with __dict__-based tools |
| Complicates inheritance | Simple inheritance |
| No __weakref__ by default | Weak references work |

Inheritance complication: If a parent doesn’t define __slots__, subclasses get __dict__ anyway. You need __slots__ at every level for full savings.

class Base:
    __slots__ = ()          # empty slots, no __dict__ allocated

class Point(Base):
    __slots__ = ('x', 'y')  # now truly no __dict__
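
The effect of each level can be checked at runtime. In this sketch LoosePoint is a hypothetical subclass that forgets to declare __slots__:

```python
class Base:
    __slots__ = ()

class Point(Base):
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

class LoosePoint(Point):
    pass   # no __slots__ here, so instances get a __dict__ again

p = Point(1, 2)
try:
    p.z = 3                    # not a declared slot
    rejected = False
except AttributeError:
    rejected = True

lp = LoosePoint(1, 2)
lp.z = 3                       # lands in the regained instance dict
point_has_dict = hasattr(p, "__dict__")
loose_has_dict = hasattr(lp, "__dict__")
loose_dict = dict(vars(lp))    # only z; x and y still live in the slots
```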

ELI5: __dict__ is like carrying an expandable backpack for each object — flexible but heavy. __slots__ is a clipboard with labeled fields — rigid but much lighter. When you have a million clipboards, the weight difference matters.

When to use: Only when you’re creating millions of instances of the same shape and memory is measurably a problem. Profile first. Don’t add __slots__ defensively.
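
One way to "profile first" is tracemalloc from the standard library. This is a rough sketch; DictPoint, SlotPoint, and allocated_bytes are hypothetical names, and the absolute numbers vary by CPython version and platform:

```python
import tracemalloc

class DictPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class SlotPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

def allocated_bytes(cls, count=20_000):
    """Bytes still allocated after building `count` instances of cls."""
    tracemalloc.start()
    points = [cls(i, i) for i in range(count)]   # keep them alive while measuring
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current

dict_bytes = allocated_bytes(DictPoint)
slot_bytes = allocated_bytes(SlotPoint)
print(f"with __dict__:  {dict_bytes / 1e6:.1f} MB")
print(f"with __slots__: {slot_bytes / 1e6:.1f} MB")
```

The measurement includes the list and the integer objects, which are the same for both classes, so the difference between the two totals is what __slots__ buys you.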


Dunder Protocol Methods

The data model lets your objects participate in Python syntax. Don’t implement protocol methods unless you’re building a type that fits that protocol.

Container protocol

| Method | Triggered by | Notes |
| --- | --- | --- |
| __len__ | len(x), bool(x) if no __bool__ | Return int >= 0 |
| __getitem__ | x[key], iteration fallback | Raise IndexError to end fallback iteration; KeyError for a missing mapping key |
| __setitem__ | x[key] = val | |
| __delitem__ | del x[key] | |
| __contains__ | val in x | If not defined, in falls back to scanning via __iter__ (then __getitem__) |
| __iter__ | for item in x, iter(x) | Return an iterator |
| __reversed__ | reversed(x) | Optional; falls back to __len__+__getitem__ |
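
The fallbacks in the table mean two methods already go a long way. A minimal sketch with a hypothetical Deck class that defines only __len__ and __getitem__:

```python
class Deck:
    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        return len(self._cards)

    def __getitem__(self, index):
        # Delegating to the list raises IndexError past the end,
        # which is what stops the fallback iteration.
        return self._cards[index]

deck = Deck("abc")
as_list = list(deck)             # iteration via __getitem__(0), (1), ...
has_b = "b" in deck              # membership via the same fallback scan
backwards = list(reversed(deck)) # reversed() via __len__ + __getitem__
empty_is_falsy = not Deck("")    # bool() via __len__
```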

Numeric protocol

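Python tries the left operand first (__add__), then the right operand's reflected method (__radd__). The one exception: if the right operand's type is a subclass of the left operand's type and overrides the reflected method, the right operand is tried first. If both return NotImplemented, Python raises TypeError.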

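class Vector:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        if isinstance(other, Vector):
            return Vector(self.x + other.x, self.y + other.y)
        return NotImplemented    # return NotImplemented, do NOT raise TypeError

    def __radd__(self, other):   # handles other + vector when other gives up
        if other == 0:           # sum() starts from 0, so 0 + vector must work
            return self
        return NotImplemented

    def __iadd__(self, other):   # handles +=; mutate in place and return self
        if not isinstance(other, Vector):
            return NotImplemented
        self.x += other.x
        self.y += other.y
        return self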

ELI5: __add__ is “can you handle self + other?” and __radd__ is “can you handle other + self?” when other doesn’t know how to add your type. NotImplemented means “I can’t do this, ask the other side” — it’s not an exception.
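
The hand-off is easiest to see with a type that also accepts plain numbers. A small sketch using a hypothetical Meters class:

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        return NotImplemented          # let Python try the other operand

    __radd__ = __add__                 # addition is commutative for this type

# sum() starts with 0 + Meters(1): int.__add__ returns NotImplemented,
# so Python falls back to Meters.__radd__(Meters(1), 0).
total = sum([Meters(1), Meters(2)])

try:
    Meters(1) + "x"                    # neither side can handle it
    both_gave_up = False
except TypeError:
    both_gave_up = True
```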

__repr__ vs __str__

| Method | Called by | Purpose | Rule |
| --- | --- | --- | --- |
| __repr__ | repr(), REPL display, !r format | Developer view, should be unambiguous | Should look like a constructor call if possible |
| __str__ | print(), str(), !s format | User view, readable | Falls back to __repr__ if not defined |

If you only implement one, implement __repr__. __str__ falls back to __repr__, not the reverse.

class Point:
    def __repr__(self):
        return f"Point({self.x!r}, {self.y!r})"  # unambiguous, reproducible

    def __str__(self):
        return f"({self.x}, {self.y})"            # clean for display
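
A self-contained sketch of the two methods side by side; the __init__ is added here so the example runs:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        return f"({self.x}, {self.y})"

p = Point(1, "a")
developer_view = repr(p)     # the !r keeps the quotes around 'a'
user_view = str(p)
in_fstring = f"{p} / {p!r}"  # plain {} uses __str__, !r uses __repr__
in_container = str([p])      # containers always show elements with repr()
```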

__hash__ and __eq__ consistency rule

Python's rule: objects that compare equal must have the same hash. The interpreter cannot verify this for you, but it does guard against the most common slip:

# If you define __eq__ without __hash__, Python sets __hash__ = None (unhashable)
# Define __hash__ explicitly to keep your objects hashable

class Point:
    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented       # let Python try the other operand
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))   # must use same fields as __eq__

Common mistake: Defining __eq__ and then being surprised that your objects can’t be used as dict keys. Python is protecting you from a hash table invariant violation.

ELI5: __hash__ and __eq__ are like the library catalog system — books on the same shelf (same hash bucket) might not be the same book, but two books that ARE the same must always be in the same place. If you redefine “same book” (__eq__) without updating the shelving rule (__hash__), the catalog breaks.
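
Both halves of the rule can be checked in a few lines. EqOnly is a hypothetical class that defines __eq__ alone:

```python
class EqOnly:
    def __eq__(self, other):
        return isinstance(other, EqOnly)

hash_disabled = EqOnly.__hash__ is None   # set automatically by Python
try:
    {EqOnly()}                            # building a set needs hash()
    unhashable = False
except TypeError:
    unhashable = True

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))

unique = {Point(1, 2), Point(1, 2), Point(3, 4)}   # equal points collapse
found = Point(1, 2) in {Point(1, 2): "origin-ish"} # equal key is found
```

Hashing on mutable fields is still risky: if x or y changes while the object sits in a set or dict, lookups can silently miss it.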


Object Creation: __new__ vs __init__

Most Python developers write __init__ and never think about __new__. Here’s when that matters:

obj = MyClass(args)
# Python actually does (inside type.__call__):
# 1. obj = MyClass.__new__(MyClass, args)   ← allocates + creates the instance
# 2. MyClass.__init__(obj, args)            ← initializes it, only if obj is an instance of MyClass
# 3. returns obj
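
The sequence can be traced with a couple of throwaway classes. Traced, Odd, and the events list are hypothetical names used only for this sketch:

```python
events = []

class Traced:
    def __new__(cls, *args, **kwargs):
        events.append("__new__")
        return super().__new__(cls)   # object.__new__ takes only cls here

    def __init__(self, value):
        events.append("__init__")
        self.value = value

t = Traced(1)
order = list(events)

class Odd:
    def __new__(cls):
        return 42                     # not an instance of Odd

    def __init__(self):
        events.append("Odd.__init__") # never runs: __new__ returned an int

result = Odd()
```

The second class shows the condition on step 2: __init__ is skipped when __new__ returns something that is not an instance of the class.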

When you need __new__

Immutable types: You can’t change an immutable object in __init__ because by then it’s already created. __new__ is your only chance.

class UpperStr(str):
    def __new__(cls, value):
        # must happen here: by the time __init__ runs, the str value is fixed
        return super().__new__(cls, value.upper())

Singletons:

class Singleton:
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance
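
One caveat with this pattern: __new__ returns the same object each time, but __init__ still runs on every call. A sketch with a hypothetical Config class:

```python
class Config:
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self, name="default"):
        self.name = name   # re-executed on every Config(...) call

first = Config("first")
second = Config("second")
same_object = first is second
name_now = first.name      # overwritten by the second call's __init__
```

If re-initialization is unwanted, guard __init__ with a flag or move the setup into __new__; a module-level instance is often the simpler alternative.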

ELI5: __new__ is the architect who designs and builds the building. __init__ is the interior decorator who furnishes it. For most buildings, you only care about the furniture. But if the building itself has weird shape constraints (like “must be made of marble” i.e., immutable), the architect needs explicit instructions.


Copy Semantics

Assignment is binding, not copying

a = [1, 2, 3]
b = a           # b points to the SAME list
b.append(4)
print(a)        # [1, 2, 3, 4] — a is affected

Shallow vs deep copy

import copy

original = [[1, 2], [3, 4]]

shallow = copy.copy(original)    # new list, same inner lists
deep = copy.deepcopy(original)   # new list, new inner lists

shallow[0].append(99)
print(original[0])   # [1, 2, 99] — shallow copy shares inner objects

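# use index 1 here: original[0] was already changed through the shallow copy
deep[1].append(99)
print(original[1])   # [3, 4]: deep copy is fully independent

| Operation | Creates new outer? | Creates new inner? | Use when |
| --- | --- | --- | --- |
| = | No | No | Always binding |
| copy.copy() | Yes | No | Flat containers, performance-sensitive |
| copy.deepcopy() | Yes | Yes | Nested mutable structures |
| list[:], dict.copy() | Yes | No | Idiomatic shallow copy |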
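
The idiomatic shallow copies in the last row behave exactly like copy.copy(). A short sketch of what is and is not shared:

```python
original = {"tags": ["a"], "count": 1}
clone = original.copy()

clone["count"] = 2           # rebinding a key affects only the clone
clone["tags"].append("b")    # mutating a shared inner list affects both

rows = [[0, 0], [1, 1]]
rows_copy = rows[:]
new_outer = rows_copy is not rows          # the outer list is new
shared_inner = rows_copy[0] is rows[0]     # the inner lists are the same objects
```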

The mutable default argument trap

# WRONG — default list is created ONCE at function definition time
def add_item(item, lst=[]):
    lst.append(item)
    return lst

add_item(1)  # [1]
add_item(2)  # [1, 2] ← surprise

# RIGHT
def add_item(item, lst=None):
    if lst is None:
        lst = []
    lst.append(item)
    return lst

ELI5: A mutable default argument is like putting a communal notepad in your office — every call shares the same notepad. What the previous caller wrote is still there. Use None and create a fresh notepad inside the function.

Common mistake: This also bites in class definitions — class Foo: items = [] means all instances share the same list unless you assign self.items = [] in __init__.
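
The class-level version of the trap, sketched with two hypothetical classes:

```python
class SharedBag:
    items = []               # one list, stored on the class

    def add(self, item):
        self.items.append(item)   # mutates the class-level list

class OwnBag:
    def __init__(self):
        self.items = []      # a fresh list per instance

    def add(self, item):
        self.items.append(item)

a, b = SharedBag(), SharedBag()
a.add(1)
leaked = b.items             # b sees what a added

c, d = OwnBag(), OwnBag()
c.add(1)
isolated = d.items           # d is unaffected
```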


Summary: When to Reach for What

| You want to… | Use |
| --- | --- |
| Validate attributes on assignment | Data descriptor or property |
| Compute an attribute once and cache it | Non-data descriptor (lazy property) |
| Save memory for millions of simple instances | __slots__ |
| Create an immutable type subclass | Override __new__ |
| Make objects work with +, *, len(), in | Implement the relevant dunder protocol |
| Control attribute lookup globally | Override __getattribute__ (rarely) |
| Handle missing attributes gracefully | Override __getattr__ |
| Debug MRO issues | Print ClassName.__mro__ |
| Make objects hashable after defining __eq__ | Also define __hash__ using same fields |
| Avoid shared mutable state in defaults | Use None sentinel, create inside function |