Type a data structure
The question. You have some named-field data — a user record, an address, a config block — and you want to annotate its shape. Python has four candidates: plain dict[K, V], TypedDict, NamedTuple, and @dataclass. They overlap, and each earns its place for different reasons.
The short answer: default to @dataclass. Reach for TypedDict when the data stays a dict (JSON, YAML, config files); reach for NamedTuple for tiny immutable records you'll unpack; reach for dict[K, V] when the keys are data, not named fields.
# The default: @dataclass for records with named fields
from dataclasses import dataclass, field

@dataclass
class Address:
    street: str
    postcode: str

@dataclass
class User:
    id: int
    name: str
    address: Address  # nested dataclass
    active: bool = True
    tags: list[str] = field(default_factory=list)  # mutable default

    def display(self) -> str:
        return f'{self.name} (id={self.id}) @ {self.address.postcode}'

alice = User(
    id=1,
    name='Alice',
    address=Address('42 High St', 'SW1A 1AA'),
    tags=['admin'],
)
print(alice.display())
print(alice)  # auto-generated __repr__

# Typed collections of records compose naturally
users: list[User] = [alice]
by_id: dict[int, User] = {u.id: u for u in users}
print(by_id[1].name)

# Variant: TypedDict, for data that is already a dict (JSON, YAML, CSV rows)
from typing import TypedDict

try:
    from typing import NotRequired  # Python 3.11+
except ImportError:
    from typing_extensions import NotRequired

class UserRecord(TypedDict):
    id: int
    name: str
    active: bool
    email: NotRequired[str]  # this key may be missing

# A separate name: re-annotating `alice` with a second type would fail type-checking
alice_record: UserRecord = {'id': 1, 'name': 'Alice', 'active': True}
print(alice_record['name'])  # still a dict at runtime, so bracket access

# Variant: NamedTuple, for small, immutable, unpackable records
from typing import NamedTuple

class Point(NamedTuple):
    x: float
    y: float

p = Point(3.0, 4.0)
print(p.x, p.y)  # attribute access
x, y = p  # tuple unpacking works
print('sum:', x + y)
print('== tuple:', p == (3.0, 4.0))  # equal to plain tuples: feature or footgun

# Variant: plain dict[K, V], for when the keys are DATA, not named fields
# Counters, caches, indexes, lookups: all best expressed as a dict[K, V]
char_counts: dict[str, int] = {}
for ch in 'hello':
    char_counts[ch] = char_counts.get(ch, 0) + 1
print(char_counts)
Why @dataclass is the default
A dataclass gives you __init__, __repr__, and __eq__ for free, plus attribute access (user.name, not user['name']), methods, inheritance, and per-field type hints, all with almost no boilerplate. Nested records work by using another dataclass as the field type, mutable defaults go through field(default_factory=list), and immutability is a decorator parameter (frozen=True). The IDE experience is also better: rename-symbol works reliably on attributes, but usually not on string dict keys.
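As a rough illustration of the generated __eq__, frozen=True, and default_factory behaviour described above, here is a minimal sketch. The Settings and Basket classes are hypothetical names made up for this example.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class Settings:  # hypothetical immutable record
    host: str
    port: int = 8080


@dataclass
class Basket:  # hypothetical mutable record
    items: list[str] = field(default_factory=list)  # fresh list per instance


a = Settings("localhost")
b = Settings("localhost", 8080)
same = a == b  # generated __eq__ compares the fields of same-class instances

try:
    a.port = 9090  # frozen=True makes assignment raise at runtime
    mutated = True
except FrozenInstanceError:
    mutated = False

first, second = Basket(), Basket()
first.items.append("apple")
shared = first.items is second.items  # the factory ran once per instance

print(same, mutated, shared)  # True False False
```

A type checker should also flag the assignment to a.port before the code ever runs, which is usually where you want to catch it.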
TypedDict earns its place when the data is already a dict at the boundary — parsed JSON, csv.DictReader rows, yaml.safe_load output. Converting it to a class means two places for the types to live (the class and the JSON schema) and runtime cost on every conversion. TypedDict adds type-checking without changing the runtime shape.
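A small sketch of that boundary case, assuming a made-up JSON payload and a hypothetical UserPayload shape. Note that cast() only informs the type checker; it validates nothing and copies nothing.

```python
import json
from typing import TypedDict, cast


class UserPayload(TypedDict):  # hypothetical wire shape
    id: int
    name: str
    active: bool


raw = '{"id": 7, "name": "Bob", "active": false}'  # made-up payload

# The checker now treats the parsed value as UserPayload;
# at runtime it is the very dict that json.loads returned.
payload = cast(UserPayload, json.loads(raw))

print(type(payload) is dict)  # True
print(payload["name"])        # Bob
print(json.dumps(payload))    # serialises back with no conversion step
```

Because the runtime object never changes shape, it can be handed straight back to json.dumps or any other dict-consuming API.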
NamedTuple earns its place for small, genuinely immutable records where tuple-like behaviour (unpacking, equality with plain tuples, hashability) is an asset rather than a surprise. Two or three fields; coordinates, RGB values, (key, value) pairs with names.
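To make the tuple-like behaviour concrete, the sketch below uses a hypothetical RGB record as a dict key, unpacks it, and shows that assignment is rejected. The class and the colour table are invented for illustration.

```python
from typing import NamedTuple


class RGB(NamedTuple):  # hypothetical three-field record
    red: int
    green: int
    blue: int


# Hashable, so instances work as dict keys or set members
names: dict[RGB, str] = {RGB(255, 0, 0): "red", RGB(0, 0, 255): "blue"}

colour = RGB(255, 0, 0)
r, g, b = colour                   # tuple unpacking
label = names[colour]              # lookup by record
via_tuple = names[(0, 0, 255)]     # a plain tuple hashes and compares the same

darker = colour._replace(red=128)  # returns a new record; the original is unchanged

try:
    colour.red = 0                 # fields are read-only
    assigned = True
except AttributeError:
    assigned = False

print(label, via_tuple, darker, assigned)
```

The lookup by plain tuple works at runtime, although a strict type checker may object to the key type.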
Plain dict[K, V] is the right call when the keys are data — word counts, user-by-id lookups, request caches. When you find yourself typing dict[str, str | int | bool | list[...]], the keys have become named fields and you should promote to a class.
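One way to picture the difference, as a sketch with invented names (logins_by_user, loose, ServerConfig): the first dict has open-ended keys and one value type, while the second has a fixed key set with a different type per key, which is the signal to promote.

```python
from __future__ import annotations  # lets the `|` annotation below parse on older Pythons

from dataclasses import dataclass

# Keys are data: any user id may appear, and every value is an int.
logins_by_user: dict[int, int] = {}
for user_id in [3, 1, 3, 3]:
    logins_by_user[user_id] = logins_by_user.get(user_id, 0) + 1

# Keys are named fields in disguise: fixed keys, mixed value types.
loose: dict[str, str | int | bool] = {"host": "localhost", "port": 8080, "debug": False}


@dataclass
class ServerConfig:  # hypothetical promoted record
    host: str
    port: int
    debug: bool = False


config = ServerConfig(host="localhost", port=8080)

print(logins_by_user)   # {3: 3, 1: 1}
print(config.port + 1)  # 8081
```

With the dataclass, the checker knows config.port is an int. With the loose dict, loose["port"] has the whole union type, so arithmetic on it would not type-check without a narrowing step.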
Trade-offs
TypedDict can't do methods. It describes a dict to the type checker; at runtime the value is a plain dict and nothing is validated. You can't attach behaviour to it, can't inherit from non-TypedDict bases, and the IDE sees bracket access, not attribute access. For anything more than wire-format documentation, convert to a dataclass once you've validated the payload.
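The validate-then-convert step might look something like the sketch below. The user_from_record function and its hand-rolled checks are assumptions for illustration, and User here is a simplified stand-in; a real project might use a validation library instead.

```python
from dataclasses import dataclass
from typing import TypedDict


class UserRecord(TypedDict):  # wire shape: documents the payload
    id: int
    name: str
    active: bool


@dataclass
class User:  # domain shape: carries behaviour
    id: int
    name: str
    active: bool = True

    def display(self) -> str:
        return f"{self.name} (id={self.id})"


def user_from_record(record: UserRecord) -> User:
    """Hypothetical boundary function: check the payload, then build a User."""
    # TypedDict enforces nothing at runtime, so the checks are explicit.
    if not isinstance(record["id"], int):
        raise TypeError("id must be an int")
    if not isinstance(record["name"], str):
        raise TypeError("name must be a str")
    return User(id=record["id"], name=record["name"], active=bool(record["active"]))


user = user_from_record({"id": 1, "name": "Alice", "active": True})
print(user.display())  # Alice (id=1)
```

Here the two shapes have distinct roles: the TypedDict exists only at the boundary, and everything past user_from_record works with the dataclass.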
NamedTuple's tuple-ness leaks. Point(3, 4) == (3, 4) is True, which is occasionally useful and occasionally a silent bug when you're comparing heterogeneous records. Iteration, indexing, and slicing all work; that's the design.
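The heterogeneous-record case can be sketched as follows. Size, PointDC, and SizeDC are hypothetical classes added only to contrast the two equality rules.

```python
from dataclasses import dataclass
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float


class Size(NamedTuple):  # hypothetical unrelated record
    width: float
    height: float


@dataclass(frozen=True)
class PointDC:
    x: float
    y: float


@dataclass(frozen=True)
class SizeDC:
    width: float
    height: float


# Tuple equality ignores the class and the field names
tuples_equal = Point(3, 4) == Size(3, 4)

# The generated dataclass __eq__ only compares instances of the same class
dataclasses_equal = PointDC(3, 4) == SizeDC(3, 4)

# Equal and equally hashed, so a set keeps only one of them
merged = {Point(3, 4), Size(3, 4)}

print(tuples_equal, dataclasses_equal, len(merged))  # True False 1
```

If a point and a size must never be confused, that difference alone can be a reason to prefer a frozen dataclass over a NamedTuple.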
Don't convert between shapes casually. Each conversion is code you have to maintain and a place where types can drift from runtime. If a TypedDict and a @dataclass of the same shape exist side by side, one of them is almost certainly in the wrong role.
Decision flow, in one line:
- Keys are data? → dict[K, V]
- Already a dict at the boundary? → TypedDict
- Tiny, immutable, want to unpack? → NamedTuple
- Everything else → @dataclass
Related reading
- Choose between @dataclass, NamedTuple, and a plain class: the class-side view of the same decision.
- @dataclass parameters reference: every decorator option in one place.
- Type a function signature: how the same types look in parameter and return positions.