Convert between data structures¶

The question. You've got data in one shape — a list of tuples, a dictionary, a CSV line — and you need it in another. You want a single reference for the common conversions that doesn't make you guess at constructor names.

The short version: list, tuple, and set accept any iterable, and dict accepts any iterable of key/value pairs, so most conversions are a one-liner. The two that take a little more are the "pair two sequences" and "transpose records" cases, both shown below.

# The common conversions, one after another

# list <-> tuple: same sequence, different mutability
nums = [1, 2, 3]
frozen = tuple(nums)
back = list(frozen)
print('list -> tuple -> list:', frozen, back)

# list <-> set: strips duplicates; set is unordered
names = ['Alice', 'Bob', 'Alice', 'Charlie', 'Bob']
unique = set(names)
ordered = sorted(unique)            # sort when you need a stable order
print('list -> set -> sorted list:', unique, ordered)

# dict <-> list: three useful views via keys(), values(), items()
prices = {'apples': 1.50, 'bread': 1.20, 'milk': 0.95}
print('keys:  ', list(prices.keys()))
print('values:', list(prices.values()))
print('items: ', list(prices.items()))

# list-of-pairs -> dict: dict() accepts any iterable of key/value pairs
pairs = [('apples', 1.50), ('bread', 1.20), ('milk', 0.95)]
print('pairs -> dict:', dict(pairs))

# two parallel lists -> dict: zip() pairs them, dict() consumes the pairs
keys = ['name', 'age', 'city']
vals = ['Alice', 30, 'London']
print('zipped dict:', dict(zip(keys, vals)))

# string <-> list: split on a delimiter, join with one
csv_line = 'Alice,30,London'
fields = csv_line.split(',')
print('split:', fields, '| rejoined:', ' | '.join(fields))

# list-of-records -> column dict (transpose): one entry per field
records = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob',   'age': 25},
    {'name': 'Charlie', 'age': 35},
]
columns: dict[str, list] = {}
for r in records:
    for k, v in r.items():
        columns.setdefault(k, []).append(v)
print('transposed:', columns)

Why it works¶

Every one of these conversions leans on the same principle: the built-in container constructors take iterables. list, tuple, and set accept any iterable; dict accepts any iterable whose items are key/value pairs (or another mapping). You don't need a dedicated API to go from a list of pairs to a dict: dict(pairs) is enough, because a list of two-element tuples iterates as exactly what dict wants.
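As a small sketch of that principle (the sample values and variable names here are made up for illustration), the same constructors take a range, a string, or a generator expression, not only other containers:

```python
# Any iterable works as input: a range, a string, a generator expression.
from_range = list(range(3))                      # [0, 1, 2]
from_string = tuple('abc')                       # ('a', 'b', 'c')
from_generator = set(n % 3 for n in range(10))   # {0, 1, 2}

# dict has a shape requirement: each item it consumes must be a key/value pair.
from_pairs = dict([('a', 1), ('b', 2)])                    # {'a': 1, 'b': 2}
from_pair_generator = dict((ch, ord(ch)) for ch in 'ab')   # {'a': 97, 'b': 98}

print(from_range, from_string, from_generator)
print(from_pairs, from_pair_generator)
```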

zip(a, b) produces an iterable of pairs by walking two sequences in lockstep; feeding that to dict() is the canonical "pair two lists" move. dict.keys(), dict.values(), and dict.items() return views — live references to the dict's contents — so wrapping them in list() gives you a snapshot you can mutate independently.
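The difference between a live view and a list snapshot is easiest to see by changing the dict after taking both. A minimal sketch, reusing the prices example with illustrative variable names:

```python
prices = {'apples': 1.50, 'bread': 1.20}

live_keys = prices.keys()        # view: keeps tracking the dict
snapshot = list(prices.keys())   # list: a copy taken at this moment

prices['milk'] = 0.95            # change the dict afterwards

print(list(live_keys))   # ['apples', 'bread', 'milk']
print(snapshot)          # ['apples', 'bread']
```

The snapshot is also the safer thing to loop over if the loop body adds or removes keys, since changing a dict's size while iterating over one of its views raises a RuntimeError.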

The transposition case is the only one that isn't a single constructor call, because "list of records → column dict" has to visit every field of every record. setdefault keeps the loop body to one line: if the column list doesn't exist yet, create it; either way, return the list so we can .append to it. The loop assumes every record carries the same keys; a record with a missing field would leave that column one entry short.
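To see what setdefault is doing inside that loop, here is a step-by-step sketch on a single column. The variable names first and second are only for illustration:

```python
columns = {}

# Key missing: setdefault stores the new list and returns it.
first = columns.setdefault('name', [])
first.append('Alice')

# Key present: setdefault ignores the default and returns the stored list.
second = columns.setdefault('name', [])
second.append('Bob')

print(columns)           # {'name': ['Alice', 'Bob']}
print(first is second)   # True: both names point at the same list
```

collections.defaultdict(list) is a common alternative when you would rather not pass the default on every call; setdefault has the advantage of leaving you with a plain dict.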

Trade-offs¶

Sets don't preserve order. set(names) gives you uniqueness but no promise about iteration order. If order matters, reach for list(dict.fromkeys(names)) — dicts remember insertion order, and fromkeys builds a dict from any iterable, dropping duplicates on the way.
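A short sketch of the order-preserving variant, using the same names list as the main example:

```python
names = ['Alice', 'Bob', 'Alice', 'Charlie', 'Bob']

# dict.fromkeys keeps the first occurrence of each name, in insertion order.
unique_in_order = list(dict.fromkeys(names))

print(unique_in_order)   # ['Alice', 'Bob', 'Charlie']
```

Like set(), this only works when the items are hashable.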

zip stops at the shorter sequence. Pairing a 3-item list with a 5-item list gives you three pairs, not five. Use itertools.zip_longest when missing values should be padded rather than truncated.
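The truncation is easy to miss, so here is a minimal sketch comparing the two behaviours. The short vals list is an invented example with one value missing:

```python
from itertools import zip_longest

keys = ['name', 'age', 'city']
vals = ['Alice', 30]             # one value short

truncated = dict(zip(keys, vals))
padded = dict(zip_longest(keys, vals, fillvalue=None))

print(truncated)   # {'name': 'Alice', 'age': 30}
print(padded)      # {'name': 'Alice', 'age': 30, 'city': None}
```

If a length mismatch should be treated as a bug rather than padded over, Python 3.10 and later also accept zip(keys, vals, strict=True), which raises a ValueError instead of truncating.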

dict(pairs) silently overwrites. If two pairs share a key, the later one wins. For counting or grouping instead of mapping, reach for collections.Counter or a small loop with setdefault.
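The three outcomes can be compared side by side. This is a sketch on made-up pairs with a deliberately repeated key:

```python
from collections import Counter

pairs = [('fruit', 'apple'), ('veg', 'carrot'), ('fruit', 'pear')]

# Mapping: the later 'fruit' pair replaces the earlier one.
mapped = dict(pairs)

# Grouping: keep every value that shares a key.
grouped = {}
for key, value in pairs:
    grouped.setdefault(key, []).append(value)

# Counting: how many times each key appears.
counts = Counter(key for key, _ in pairs)

print(mapped)    # {'fruit': 'pear', 'veg': 'carrot'}
print(grouped)   # {'fruit': ['apple', 'pear'], 'veg': ['carrot']}
print(counts)    # Counter({'fruit': 2, 'veg': 1})
```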

Converting is a one-liner, but it isn't free. list(big_tuple) walks the whole tuple and builds a new list, so the cost grows with the size of the data. For genuinely large data, convert once and work with the target shape; don't round-trip in a hot loop.

Related reading¶

Choose the right data structure: which shape fits the problem in the first place.
Merge and compare dictionaries: conversions that keep you inside the dict world.
Work with nested structures: transposition at more depth.
