The iteration protocol
A for loop in Python looks simple, but under the hood it's powered by a small, explicit contract between two kinds of objects. Once you see that contract, you can make any object work with for loops, list comprehensions, sum(), max(), unpacking, itertools — all of it.
This notebook covers that contract: the iterator protocol.
What for really does
When you write:
for x in things:
    ...
Python doesn't just "loop over things". It does something quite specific:
- Call iter(things) to get an iterator.
- Repeatedly call next(iterator) to get the next value.
- When next() raises StopIteration, stop.
Let's do that by hand.
numbers = [10, 20, 30]
it = iter(numbers) # step 1: get an iterator
print(next(it)) # step 2: get values one at a time
print(next(it))
print(next(it))
The list is iterable — it knows how to produce an iterator. The thing iter() returned is the iterator itself. They are not the same object.
print(type(numbers)) # list — the iterable
print(type(it)) # list_iterator — the iterator
One more next() call and the iterator is exhausted. Python signals that by raising StopIteration.
try:
    next(it)
except StopIteration:
    print('done: no more values')
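Putting the three steps together, a for loop can be approximated by a while loop around next(). This is a rough sketch rather than what the interpreter literally executes, and manual_for is a hypothetical helper name introduced only for illustration.

```python
def manual_for(iterable, body):
    """Roughly what `for x in iterable: body(x)` does (hypothetical helper)."""
    iterator = iter(iterable)      # step 1: ask the iterable for an iterator
    while True:
        try:
            x = next(iterator)     # step 2: pull the next value
        except StopIteration:
            break                  # step 3: exhaustion ends the loop
        body(x)                    # the loop body runs outside the try block


seen = []
manual_for([10, 20, 30], seen.append)
print(seen)  # [10, 20, 30]
```

Keeping body(x) outside the try block means a StopIteration raised by the body itself is not mistaken for the end of the iteration.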
Iterable vs iterator: the distinction that matters
- Iterable: an object you can get an iterator from. Lists, tuples, strings, sets, dicts, files, ranges, and generators are all iterable. The test is whether iter(obj) works.
- Iterator: the stateful thing that actually produces values via next(). It remembers how far through the sequence you are.
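Both definitions can be checked in code. The sketch below assumes a hypothetical helper, is_iterable, that applies the "does iter(obj) work" test directly, and uses the abstract base classes in collections.abc to tell iterators apart from plain iterables.

```python
from collections.abc import Iterable, Iterator


def is_iterable(obj):
    """Return True if iter(obj) succeeds (hypothetical helper)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


numbers = [10, 20, 30]
it = iter(numbers)

print(is_iterable(numbers))           # True
print(is_iterable(42))                # False: int has no way to produce an iterator
print(isinstance(numbers, Iterator))  # False: a list has no __next__
print(isinstance(it, Iterator))       # True
print(isinstance(it, Iterable))       # True: every iterator is also iterable
```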
One iterable can produce many independent iterators. A single iterator is consumed once — after it's exhausted, it stays exhausted.
xs = [1, 2, 3]
a = iter(xs)
b = iter(xs) # a and b are independent
print(next(a), next(a)) # 1 2
print(next(b)) # 1 — b has its own position
This distinction explains a common beginner trap: iterating over an iterator twice.
it = iter([1, 2, 3])
first_sum = sum(it) # consumes the iterator completely
second_sum = sum(it) # iterator is now empty
print(first_sum, second_sum)
The second sum(it) returns 0. It isn't a bug — it's the protocol working as specified. If you need to iterate twice, either keep the iterable around (re-call iter() each time) or materialise the values into a list.
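Here is a small sketch of both remedies, using a plain list as the source. The variable names are illustrative only.

```python
data = [1, 2, 3]

# Option 1: keep the iterable and let each consumer call iter() for itself.
first_sum = sum(data)
second_sum = sum(data)
print(first_sum, second_sum)      # 6 6

# Option 2: if all you have is an iterator, materialise it once.
it = iter(data)
values = list(it)                 # consumes `it`, but the list can be reused
print(sum(values), sum(values))   # 6 6
print(list(it))                   # [] because the original iterator stays exhausted
```

Materialising costs memory proportional to the number of values, so it suits small or bounded inputs better than very long streams.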
The dunder methods
An iterable defines __iter__. An iterator defines both __iter__ and __next__. An iterator's __iter__ returns self — that's how for loops transparently accept both iterables and iterators.
We'll build a tiny iterator from scratch to make this concrete.
class Countdown:
    '''Iterable. Each iter() call produces a fresh iterator.'''

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    '''Iterator. Stateful; consumed once.'''

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self  # iterators are their own iterator

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


for n in Countdown(3):
    print(n)
Two independent iterations of the same Countdown both work, because each for loop calls iter() and gets a fresh CountdownIterator.
c = Countdown(3)
print(list(c)) # first pass
print(list(c)) # second pass — still works
Whereas a bare CountdownIterator is used up after one pass:
ci = CountdownIterator(3)
print(list(ci)) # first pass consumes it
print(list(ci)) # empty
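The rule that an iterator's __iter__ returns self can also be observed on built-in types. The sketch below uses a plain list so that it runs on its own. It also shows a practical consequence: a for loop or comprehension handed a partially consumed iterator continues from the current position instead of restarting.

```python
numbers = [10, 20, 30, 40]
it = iter(numbers)

print(iter(it) is it)            # True: an iterator hands back itself
print(iter(numbers) is numbers)  # False: the list builds a separate iterator object

first = next(it)                 # consume one value by hand
rest = [n for n in it]           # the comprehension resumes where next() left off
print(first, rest)               # 10 [20, 30, 40]
```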
Anything that obeys the protocol plugs into everything
for is just one consumer. Any function or construct that iterates uses the same protocol. That's why a custom iterable such as Countdown works with list(), sum(), max(), tuple(), unpacking, comprehensions, in, and the itertools module, all without extra work on your part.
c = Countdown(5)
print(list(c)) # list() calls iter/next
print(sum(Countdown(5))) # 5+4+3+2+1
print(max(Countdown(5))) # 5
print(tuple(Countdown(3))) # (3, 2, 1)
a, b, c_ = Countdown(3) # unpacking uses the protocol
print(a, b, c_)
print(2 in Countdown(5)) # 'in' iterates until it finds a match
This is the payoff. Implement __iter__ once and your type slots into every iteration-aware tool Python has. We'll lean on this throughout the guide — it's why generators (next notebook) and itertools (notebook 3) feel so composable.
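The same holds for tools not listed above. The sketch below redeclares a condensed Countdown so that it runs on its own. As an assumption made for brevity, its __iter__ delegates to the iterator of a built-in range instead of a hand-written iterator class; any object that returns a valid iterator from __iter__ would behave the same way.

```python
import itertools


class Countdown:
    """Condensed stand-in for Countdown (assumption: delegates to range)."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A fresh range iterator per call, counting start, start-1, ..., 1.
        return iter(range(self.start, 0, -1))


print(list(enumerate(Countdown(3))))                      # [(0, 3), (1, 2), (2, 1)]
print(list(zip(Countdown(3), "abc")))                     # [(3, 'a'), (2, 'b'), (1, 'c')]
print(sorted(Countdown(4)))                               # [1, 2, 3, 4]
print(list(itertools.islice(Countdown(10), 3)))           # [10, 9, 8]
print(list(itertools.chain(Countdown(2), Countdown(2))))  # [2, 1, 2, 1]
```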
Quick check: implement an iterator
Build an iterable Repeat(value, n) that yields the same value n times. The class should work with a for loop and with list() called twice on the same instance.
Hints:
- You'll need two classes (one iterable, one iterator) or one class whose __iter__ returns a new iterator each call.
- The iterator's __next__ should count down how many yields remain.
# Your turn — fill these in:
class Repeat:
    def __init__(self, value, n):
        ...

    def __iter__(self):
        ...
# Expected behaviour (uncomment once implemented):
# r = Repeat('hi', 3)
# print(list(r)) # ['hi', 'hi', 'hi']
# print(list(r)) # ['hi', 'hi', 'hi'] — independent iteration
# print(sum(Repeat(5, 4))) # 20
One working solution
class Repeat:
    def __init__(self, value, n):
        self.value = value
        self.n = n

    def __iter__(self):
        return _RepeatIterator(self.value, self.n)


class _RepeatIterator:
    def __init__(self, value, remaining):
        self.value = value
        self.remaining = remaining

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        self.remaining -= 1
        return self.value
r = Repeat('hi', 3)
print(list(r))
print(list(r))
print(sum(Repeat(5, 4)))
Summary
- for x in obj: calls iter(obj) then next(...) repeatedly, stopping on StopIteration.
- Iterable and iterator are different. An iterable creates iterators; an iterator is the one-shot state.
- Implementing __iter__ on your class lets it plug into every iteration-aware tool in the standard library.
- Because iterators are consumed, iterating the same iterator twice silently yields nothing the second time, which is one of the most common iteration bugs.
Next up: generator functions. Writing iterator classes by hand is rare in practice, because yield gives you the same behaviour in a few lines.
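As a brief preview, and only a sketch of what the next notebook covers properly, here is a Countdown whose __iter__ is written as a generator method. Each call to __iter__ returns a new generator object, so no separate iterator class is needed.

```python
class Countdown:
    """Iterable whose __iter__ is a generator method (preview sketch)."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        current = self.start
        while current > 0:
            yield current      # hand out one value, then pause here
            current -= 1


c = Countdown(3)
print(list(c))  # [3, 2, 1]
print(list(c))  # [3, 2, 1] because each pass gets a fresh generator
```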