Work with nested structures
The question. You have flat records — rows from a CSV, JSON entries from an API, dicts read from a log — and you want them organised by one or more categories so you can iterate "for each department, for each person, …" The flipside is reaching into an already-nested structure without crashing on missing keys.
Two patterns cover most of this: group flat records using dict.setdefault(...), and safely access nested values using chained dict.get(...) calls with {} fallbacks.
# Task: group flat records into a nested {category: [items...]} structure
records = [
    {'department': 'Engineering', 'name': 'Alice'},
    {'department': 'Marketing', 'name': 'Bob'},
    {'department': 'Engineering', 'name': 'Charlie'},
    {'department': 'Marketing', 'name': 'Diana'},
]
grouped: dict[str, list[str]] = {}
for record in records:
    dept = record['department']
    # setdefault returns the existing list, or inserts and returns a new one
    grouped.setdefault(dept, []).append(record['name'])
for dept, names in grouped.items():
    print(f'{dept}: {", ".join(names)}')
# Safe access into deep nesting, without try/except
# data.get(a, {}).get(b, {}).get(c) returns None if any layer is missing
school = {
    'Year 10': {
        'Alice': {'maths': 85, 'science': 90},
        'Bob': {'maths': 72, 'science': 68},
    },
    'Year 11': {
        'Charlie': {'maths': 91, 'science': 88},
    },
}
alice_maths = school.get('Year 10', {}).get('Alice', {}).get('maths')
missing = school.get('Year 12', {}).get('Diana', {}).get('maths')
print('Alice maths:', alice_maths)
print('missing:', missing)
Why it works
dict.setdefault(key, default) is the idiomatic "give me the list at this key, creating it if needed". It does the lookup and the insert in a single call, so there is no two-step "check if it exists, else insert" dance. The value it returns is the same object that is stored in the dict, so .append(...) on it mutates the dict's value in place.
For deeper grouping (by two or more keys), collections.defaultdict is often cleaner: defaultdict(lambda: defaultdict(list)) gives you two-level auto-creation, and you can skip the setdefault boilerplate entirely. setdefault wins for one-level grouping because it avoids the import and the subtle gotcha that merely reading a missing key from a defaultdict with d[key] creates that key.
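A minimal sketch of the two-level case follows. It assumes each record also carries a 'team' field, which is hypothetical and not part of the records shown earlier.

```python
from collections import defaultdict

# Hypothetical records: the 'team' field is an assumption added for illustration.
records = [
    {'department': 'Engineering', 'team': 'Backend', 'name': 'Alice'},
    {'department': 'Engineering', 'team': 'Frontend', 'name': 'Charlie'},
    {'department': 'Marketing', 'team': 'Brand', 'name': 'Bob'},
    {'department': 'Engineering', 'team': 'Backend', 'name': 'Diana'},
]

# A missing outer key creates an inner defaultdict; a missing inner key creates a list.
by_dept_team = defaultdict(lambda: defaultdict(list))
for record in records:
    by_dept_team[record['department']][record['team']].append(record['name'])

# Convert both levels to plain dicts so later reads cannot create keys by accident.
plain = {dept: dict(teams) for dept, teams in by_dept_team.items()}

for dept, teams in plain.items():
    for team, names in teams.items():
        print(f'{dept} / {team}: {", ".join(names)}')
```

The final comprehension is optional, but it keeps the auto-creating behaviour confined to the grouping step.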
The safe-access pattern — d.get(a, {}).get(b, {}).get(c) — falls out of the fact that .get(key, default) never raises KeyError, and {} chains cleanly with more .get() calls. If any layer is missing, the final answer is None (or whatever default you pass to the last .get()). For code paths where a missing key is genuinely exceptional, a try/except KeyError block still reads fine — but the chained .get() is what you want when missing is normal.
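One caveat with the chain: the {} fallback is only used when a key is absent. If a key is present but holds None (or any non-dict value), the next .get() raises AttributeError. The sketch below shows one possible way to cover that case with a small helper. The name deep_get and the sample data are hypothetical, not something the standard library provides.

```python
from typing import Any


def deep_get(data: Any, *keys: str, default: Any = None) -> Any:
    """Walk nested dicts key by key.

    Returns `default` as soon as a layer is missing or is not a dict.
    """
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical data: Bob's entry exists but is None rather than a dict.
school = {
    'Year 10': {
        'Alice': {'maths': 85},
        'Bob': None,
    },
}

alice = deep_get(school, 'Year 10', 'Alice', 'maths')
bob = deep_get(school, 'Year 10', 'Bob', 'maths')
diana = deep_get(school, 'Year 12', 'Diana', 'maths', default=0)
print(alice, bob, diana)
```

For the common case where every layer is either a dict or absent, the plain chained .get() remains the shorter option.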
Trade-offs
setdefault builds the default every time. Python evaluates the argument before the call happens, so d.setdefault(k, expensive_call()) runs expensive_call() on every call, even when the key already exists and the result is thrown away. For expensive defaults, guard explicitly or use defaultdict.
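The eager evaluation can be observed with a counter. In this sketch, expensive_default is a hypothetical stand-in for any costly constructor.

```python
calls = 0


def expensive_default() -> list[str]:
    """Hypothetical stand-in for a costly default; counts how often it runs."""
    global calls
    calls += 1
    return []


cache: dict[str, list[str]] = {}
for key in ['a', 'a', 'a']:
    # The argument is evaluated before setdefault is called, so the
    # function runs on every pass, even once 'a' is already present.
    cache.setdefault(key, expensive_default()).append('x')
eager_calls = calls

calls = 0
guarded: dict[str, list[str]] = {}
for key in ['a', 'a', 'a']:
    # Explicit guard: the default is only built when the key is missing.
    if key not in guarded:
        guarded[key] = expensive_default()
    guarded[key].append('x')
guarded_calls = calls

print(eager_calls, guarded_calls)  # 3 1
```

With a cheap default such as [] the wasted construction is negligible, which is why the one-line form is fine for simple grouping.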
defaultdict has side-effects. Reading a missing key creates it. d[missing] is both a read and a write; len(d) silently grows. For dictionaries you're iterating and debugging, this can be surprising. Convert to a plain dict (dict(dd)) before handing it off; note that dict(dd) converts only the top level, so nested defaultdicts need converting too.
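A short illustration of the read-is-a-write behaviour, using made-up keys:

```python
from collections import defaultdict

members = defaultdict(list)
members['Engineering'].append('Alice')

size_before = len(members)
_ = members['Sales']          # looks like a read, but inserts 'Sales': []
size_after = len(members)

# .get() and `in` do not trigger the default factory.
probe = defaultdict(list)
found = probe.get('Sales')
present = 'Sales' in probe

# A plain dict raises KeyError on a missing key instead of growing.
plain = dict(members)
try:
    plain['HR']
    grew = True
except KeyError:
    grew = False

print(size_before, size_after, found, present, len(probe), grew)
```

Using .get() or a membership test is the way to inspect a defaultdict without changing it.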
Chained .get() hides the shape. If "missing" for one layer should be an error but missing for another should be None, the chain can't express that. Unpack the access step by step when the rules differ.
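As a sketch of unpacking step by step, suppose (purely as an assumed rule set) that an unknown year group is a bug worth raising on, while an unknown student or a missing score is a normal None. The function name maths_score is hypothetical.

```python
from typing import Optional


def maths_score(school: dict, year: str, student: str) -> Optional[int]:
    """Assumed rules: unknown year is an error; unknown student or score is None."""
    if year not in school:
        # This layer must exist, so fail loudly with a clear message.
        raise KeyError(f'unknown year group: {year!r}')
    student_scores = school[year].get(student)
    if student_scores is None:
        # A missing student is an expected case here.
        return None
    return student_scores.get('maths')


school = {
    'Year 10': {
        'Alice': {'maths': 85, 'science': 90},
        'Bob': {'science': 68},
    },
}

print(maths_score(school, 'Year 10', 'Alice'))
print(maths_score(school, 'Year 10', 'Bob'))
print(maths_score(school, 'Year 10', 'Zed'))
```

Each layer now states its own policy, which a single chained expression cannot do.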
Nested structures beyond two or three levels usually want a real model. A dataclass or a Pydantic model makes the shape explicit, gives you attribute access (school.year_10.alice.maths), and centralises validation. Reach for one once the dicts grow a third level of nesting.
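One possible shape for such a model, using only the standard library's dataclasses, is sketched below. The class names and the from_nested constructor are hypothetical. Year and student names are data rather than fixed fields, so this sketch keeps them as dict keys and uses attribute access only for the scores.

```python
from dataclasses import dataclass, field


@dataclass
class Scores:
    maths: int
    science: int


@dataclass
class YearGroup:
    students: dict[str, Scores] = field(default_factory=dict)


@dataclass
class School:
    years: dict[str, YearGroup] = field(default_factory=dict)

    @classmethod
    def from_nested(cls, raw: dict[str, dict[str, dict[str, int]]]) -> 'School':
        """Build the model from nested dicts.

        Scores(**scores) raises TypeError if a subject is missing or unexpected.
        """
        return cls(years={
            year: YearGroup(students={
                name: Scores(**scores) for name, scores in students.items()
            })
            for year, students in raw.items()
        })


raw = {
    'Year 10': {
        'Alice': {'maths': 85, 'science': 90},
        'Bob': {'maths': 72, 'science': 68},
    },
    'Year 11': {
        'Charlie': {'maths': 91, 'science': 88},
    },
}

school = School.from_nested(raw)
print(school.years['Year 10'].students['Alice'].maths)
```

Plain dataclasses only check that the expected fields are present. Type and range validation would need extra code or a library such as Pydantic.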
Related reading
- Convert between data structures: the "transpose records" example is grouping's flipside.
- Choose the right data structure: when to stop nesting dicts and reach for a dataclass.
- Merge and compare dictionaries: merging is a special case of nesting.