Chain and group iterables¶

The question. You have several iterables you'd like to treat as one, or one iterable whose adjacent values should be grouped, or two iterables you want to walk in parallel. You want to do this without materialising the data into a list first.

The answer: reach for itertools.chain and itertools.groupby. Together they cover concatenating iterables end to end and grouping runs of equal values. The built-in zip handles the parallel case, and product, combinations, and permutations handle the combinatoric cases; see the variant sections below.

# Chain several iterables end-to-end, then group adjacent equal values.
# Classic worked example: two teams' scores, combined and grouped by score.
from itertools import chain, groupby

team_a = [('Alice', 12), ('Bob', 18), ('Carol', 9)]
team_b = [('Dan', 12), ('Eve', 20), ('Fern', 18)]

# chain: concatenate lazily. Each argument can be any iterable.
# sorted() then builds a list, ordered by score (highest first).
combined = sorted(chain(team_a, team_b), key=lambda p: -p[1])

# groupby: runs of ADJACENT equal values. Sort first if you need SQL-style GROUP BY.
for score, players in groupby(combined, key=lambda p: p[1]):
    names = [name for name, _ in players]   # materialise inside the loop
    if len(names) > 1:
        print(f'tied at {score}: {names}')

Variant: `zip` and `zip_longest` for parallel iteration¶

Built-in zip yields tuples from each iterable in lockstep, stopping at the shortest. Use strict=True (Python 3.10+) to fail loudly on length mismatch rather than silently truncating. itertools.zip_longest pads missing values instead of stopping.

from itertools import zip_longest

names  = ['Ada', 'Grace', 'Linus']
scores = [95, 88, 72]

for name, score in zip(names, scores, strict=True):
    print(f'{name}: {score}')

# Pad ragged inputs:
print(list(zip_longest([1, 2, 3, 4], ['x', 'y'], fillvalue='?')))

# Transpose for free:
rows = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
print(list(zip(*rows)))

Variant: `product`, `combinations`, `permutations` for combinatorics¶

When you need every combination across several iterables, itertools.product is the cross-join equivalent. combinations and permutations are the order-doesn't-matter and order-matters variants of pick-from-one-iterable.

from itertools import product, combinations, permutations

sizes = ['S', 'M', 'L']
colours = ['red', 'blue']

# Every (size, colour) pair
for s, c in product(sizes, colours):
    print(f'{s}-{c}')

# All 3-digit binary strings
print(list(product('01', repeat=3))[:4], '...')

# Pick 2 items: order doesn't matter (combinations) vs. does (permutations)
print(list(combinations('abcd', 2)))
print(list(permutations('abc', 2)))
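
The output sizes of these functions grow quickly, which is one reason to iterate them lazily rather than calling list on large inputs. A small sketch checking the counts against the usual formulas, using math.comb and math.perm (Python 3.8+) and an arbitrary four-letter input:

```python
from itertools import product, combinations, permutations
from math import comb, perm

letters = 'abcd'
n, k = len(letters), 2

n_product = sum(1 for _ in product(letters, repeat=k))   # n ** k
n_combos  = sum(1 for _ in combinations(letters, k))     # n choose k
n_perms   = sum(1 for _ in permutations(letters, k))     # n! / (n - k)!

print(n_product, n ** k)        # 16 16
print(n_combos, comb(n, k))     # 6 6
print(n_perms, perm(n, k))      # 12 12
```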

Why this works¶

chain returns a lazy iterator: it yields everything from its first argument, then its second, and so on. Nothing is copied. The arguments could be lists, generator expressions, file handles, or the output of other itertools calls, and chain would still use O(1) extra memory.
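
To make the laziness concrete, here is a minimal sketch that chains four different kinds of iterable and then chains onto an infinite iterator. The countdown generator is a hypothetical helper written only for this illustration.

```python
from itertools import chain, count, islice

def countdown(n):
    """Hypothetical generator used only for this illustration."""
    while n > 0:
        yield n
        n -= 1

# A list, a generator expression, a generator, and a string, end to end.
mixed = chain([1, 2], (x * x for x in range(3)), countdown(2), 'ab')
result = list(mixed)
print(result)        # [1, 2, 0, 1, 4, 2, 1, 'a', 'b']

# chain never copies its inputs, so an infinite argument is fine
# as long as the consumer stops early.
first_five = list(islice(chain([0], count(1)), 5))
print(first_five)    # [0, 1, 2, 3, 4]
```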

groupby is a streaming cousin of SQL GROUP BY: it groups only adjacent equal values, the same semantics as the Unix uniq command. If your input isn't sorted by the group key, sort first. Each yielded sub-iterator shares the underlying iterator with groupby itself, so it is only valid during its own iteration of the outer loop: as soon as you advance to the next (key, group) pair, the previous group can no longer yield anything. That's why the main example materialises names inside the loop.
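
The adjacency rule is easiest to see on unsorted input. A small sketch, with a made-up list of letters:

```python
from itertools import groupby

data = ['a', 'a', 'b', 'a', 'b', 'b']

# Unsorted: every run of equal neighbours becomes its own group,
# so the same key can appear more than once.
runs = [(key, len(list(group))) for key, group in groupby(data)]
print(runs)      # [('a', 2), ('b', 1), ('a', 1), ('b', 2)]

# Sorted first: one group per distinct key, as in SQL GROUP BY.
groups = [(key, len(list(group))) for key, group in groupby(sorted(data))]
print(groups)    # [('a', 3), ('b', 3)]
```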

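The two compose cleanly because they both speak the iterator protocol: chain feeds into sorted, and sorted feeds into groupby. The only eager step is sorted, which consumes its whole input and returns a list. If the combined stream already arrives ordered by the group key, you can drop the sort and the whole pipeline stays lazy. Note that chaining two individually sorted inputs does not, in general, produce a sorted stream.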
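
As a sketch of the fully lazy case, assume two hypothetical event sources whose concatenation is already ordered by hour (the generator names and data are invented for illustration):

```python
from itertools import chain, groupby

def morning():
    """Hypothetical source, already ordered by hour."""
    yield ('09:00', 'login')
    yield ('09:00', 'click')
    yield ('10:00', 'click')

def afternoon():
    """Hypothetical source whose hours all follow the morning ones."""
    yield ('13:00', 'login')
    yield ('13:00', 'logout')

# No sorted() call: the chained stream is already ordered by the key.
events = chain(morning(), afternoon())

per_hour = {}
for hour, group in groupby(events, key=lambda e: e[0]):
    # Only one group's worth of events is pulled at a time.
    per_hour[hour] = [action for _, action in group]

print(per_hour)
# {'09:00': ['login', 'click'], '10:00': ['click'], '13:00': ['login', 'logout']}
```

The dictionary is only there so the result can be printed; a streaming consumer would process each group inside the loop and keep nothing.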

Trade-offs¶

Reach for chain.from_iterable(nested) when you have an iterable of iterables. It yields the same items as chain(*nested), but it reads the outer iterable lazily instead of unpacking it all up front, so it also works when the outer iterable is a long or infinite generator. It flattens one level only; deeper flattening needs a small recursive generator.
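
A minimal sketch of both points. The batches generator and the flatten helper are hypothetical names introduced here, and flatten is just one possible way to write the recursive version; it treats only lists and tuples as nested containers.

```python
from itertools import chain

def batches():
    """Hypothetical lazy source that yields one batch at a time."""
    yield [1, 2]
    yield [3]
    yield [4, 5]

# The outer generator is consumed lazily, batch by batch.
flat = list(chain.from_iterable(batches()))
print(flat)          # [1, 2, 3, 4, 5]

# Only one level is flattened: the inner [2, 3] survives intact.
one_level = list(chain.from_iterable([[1, [2, 3]], [4]]))
print(one_level)     # [1, [2, 3], 4]

def flatten(items):
    """Recursively flatten nested lists and tuples (illustrative only)."""
    for item in items:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item

deep = list(flatten([1, [2, [3, [4]]], (5, 6)]))
print(deep)          # [1, 2, 3, 4, 5, 6]
```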

If you only need counts per group, collections.Counter beats groupby — no sort step, no adjacency requirement. groupby earns its place when you need to iterate every member of each group, not just the total.
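
For comparison, a short sketch that computes the same per-score counts both ways, using the scores from the teams example as a flat list:

```python
from collections import Counter
from itertools import groupby

scores = [12, 18, 9, 12, 20, 18]

# Counter: a single pass, no sorting, no adjacency requirement.
counts = Counter(scores)

# groupby: needs a sort first, and each group must be consumed to count it.
grouped_counts = {key: len(list(group)) for key, group in groupby(sorted(scores))}

print(dict(counts) == grouped_counts)   # True
print(counts[12], counts[9])            # 2 1
```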

Common traps: iterating the same sub-iterator twice, or saving sub-iterators for later (they'll all come back empty). The "Avoid common iterator mistakes" recipe has the full catalogue.
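
Both traps can be reproduced in a few lines. This is a sketch with a made-up word list; first_letter is a helper defined only for the example.

```python
from itertools import groupby

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

def first_letter(word):
    return word[0]

# Trap 1: iterating the same sub-iterator twice. The second pass is empty.
passes = [(list(group), list(group)) for _, group in groupby(words, key=first_letter)]
print(passes[0])     # (['apple', 'avocado'], [])

# Trap 2: saving sub-iterators for later. By the time the saved group
# is read, groupby has moved on and the group yields nothing.
saved = list(groupby(words, key=first_letter))
stale = list(saved[0][1])
print(stale)         # []

# Fix: materialise each group while the outer loop is still on it.
by_letter = {letter: list(group) for letter, group in groupby(words, key=first_letter)}
print(by_letter['b'])   # ['banana', 'blueberry']
```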

Related reading¶

Combine generators into a pipeline: chain and groupby are stages you'd wire into a larger pipeline.
Avoid common iterator mistakes: the groupby-materialisation trap in detail.
itertools cheatsheet: every itertools function at a glance.
