Custom iterators¶

Generator functions cover most of what you need. So when would you write an iterator class by hand?

The honest answer is: rarely. But "rarely" isn't "never". This notebook walks through the cases where a class is the right shape, the mechanics of the protocol, and a few patterns (restartable iteration, attaching state, and integrating with len() or indexing) that don't fit comfortably into a generator function.

The protocol, recap¶

Two methods:

__iter__(self) — return something with a __next__. For a class that is the iterator, return self. For a class that just makes iterators, return a fresh iterator object.
__next__(self) — return the next value, or raise StopIteration when there are no more.

That's it. There is no other contract — no length, no indexing, no rewind. Anything more is something you're choosing to add.

class Counter:
    '''A simple iterator that counts from 1 to stop.'''
    def __init__(self, stop):
        self.stop = stop
        self.current = 0

    def __iter__(self):
        return self          # this object IS its own iterator

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        self.current += 1
        return self.current


for x in Counter(3):
    print(x)
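
To make the protocol concrete, here is a rough sketch of what a for loop does with those two methods. The Countdown class and the manual_for helper are illustrative names invented for this sketch, and the real loop machinery lives inside the interpreter, so treat this as an approximation rather than the actual implementation.

```python
class Countdown:
    """Hypothetical iterator used only for this sketch: counts n, n-1, ..., 1."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self              # the object is its own iterator

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


def manual_for(iterable):
    """Roughly what `for x in iterable: collected.append(x)` does."""
    collected = []
    it = iter(iterable)          # calls iterable.__iter__()
    while True:
        try:
            x = next(it)         # calls it.__next__()
        except StopIteration:    # the loop ends quietly on StopIteration
            break
        collected.append(x)
    return collected


c = Countdown(3)
first_pass = manual_for(c)
second_pass = manual_for(c)      # same object, already exhausted
print(first_pass)                # [3, 2, 1]
print(second_pass)               # []
```

The empty second pass is the price of returning self from __iter__: the object carries its position with it, so there is nothing left to iterate.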

When to reach for a class instead of a generator¶

Three situations make a class the better choice.

1. You want indexing or `len()` alongside iteration¶

A generator function is purely sequential. If callers also want to ask "how many items?" or "give me the third one", you need a class that implements __len__ and __getitem__ as well as iteration.

class Range3D:
    '''A 3D grid of (x, y, z) coordinates — iterable, indexable, sized.'''
    def __init__(self, nx, ny, nz):
        self.nx, self.ny, self.nz = nx, ny, nz

    def __len__(self):
        return self.nx * self.ny * self.nz

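    def __getitem__(self, i):
        # reject out-of-range indexes instead of returning bogus coordinates
        if not 0 <= i < len(self):
            raise IndexError(i)
        # decode flat index into (x, y, z)
        z, rem = divmod(i, self.nx * self.ny)
        y, x   = divmod(rem, self.nx)
        return (x, y, z)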

    def __iter__(self):
        for i in range(len(self)):
            yield self[i]


grid = Range3D(2, 2, 2)
print(len(grid))
print(grid[3])
print(list(grid))

Notice the trick in __iter__: even when you have __getitem__, you can still write __iter__ as a generator function inside the class. The two approaches aren't mutually exclusive.

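(In fact, if you don't define __iter__, Python falls back to calling __getitem__ with the integer keys 0, 1, 2 and so on until it raises IndexError. That fallback is brittle, since a __getitem__ that never raises IndexError iterates forever, and is best avoided. Define __iter__ explicitly.)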
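
A small sketch of that fallback, in case it helps to see it run. The Squares class is a hypothetical example, not part of the notebook; it defines only __getitem__, and the isinstance check at the end shows one reason the fallback is brittle: standard type checks do not recognise such an object as iterable.

```python
from collections.abc import Iterable


class Squares:
    """Hypothetical class with __getitem__ but no __iter__."""
    def __init__(self, n):
        self.n = n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)   # the fallback stops only on IndexError
        return i * i


sq = Squares(4)
values = list(sq)                          # works via the fallback
looks_iterable = isinstance(sq, Iterable)  # the ABC only checks for __iter__
print(values)                              # [0, 1, 4, 9]
print(looks_iterable)                      # False
```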

2. You want naturally re-iterable behaviour¶

A generator function returns a one-shot iterator. If you want a for loop over the same object to work more than once, you need something that holds the configuration and produces a fresh iterator each time. The cleanest version is two classes: an outer "iterable" and an inner "iterator". The outer's __iter__ returns a new instance of the inner.

class Chunks:
    '''Iterable: split an iterable into chunks of `size` items. Re-iterable when the source is.'''
    def __init__(self, source, size):
        self.source = source
        self.size = size

    def __iter__(self):
        return _ChunksIterator(self.source, self.size)


class _ChunksIterator:
    def __init__(self, source, size):
        self.source = iter(source)   # store the underlying iterator
        self.size = size

    def __iter__(self):
        return self

    def __next__(self):
        chunk = []
        for _ in range(self.size):
            try:
                chunk.append(next(self.source))
            except StopIteration:
                if chunk:
                    return chunk
                raise
        return chunk


c = Chunks([1, 2, 3, 4, 5, 6, 7], 3)
print(list(c))    # [[1,2,3], [4,5,6], [7]]
print(list(c))    # [[1,2,3], [4,5,6], [7]] — works again

You could almost do this with a generator function:

def chunks(source, size):
    chunk = []
    for x in source:
        chunk.append(x)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

…but chunks(my_list, 3) returns a one-shot generator. The first list(...) call on it would exhaust it, and a second call would return an empty list. The class form is naturally re-iterable because each for loop calls __iter__ and gets a fresh _ChunksIterator.
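
The one-shot behaviour is easy to check. The snippet below repeats the chunks generator so that it runs on its own; the variable names are just for illustration.

```python
def chunks(source, size):
    """Same generator as above, repeated so this snippet runs on its own."""
    chunk = []
    for x in source:
        chunk.append(x)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


gen = chunks([1, 2, 3, 4, 5, 6, 7], 3)
first = list(gen)
second = list(gen)     # the generator object is already exhausted
print(first)           # [[1, 2, 3], [4, 5, 6], [7]]
print(second)          # []

# Calling the function again is what gives you a fresh pass.
again = list(chunks([1, 2, 3, 4, 5, 6, 7], 3))
print(again == first)  # True
```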

3. The iterator owns external state — files, sockets, database cursors¶

If your iterator wraps a resource that needs explicit setup or teardown (open a file, dial a connection), the class form lets you add __enter__ / __exit__ (or, as a last resort, __del__) to manage that resource. Generators can do this with try/finally, but a class makes the lifecycle visible.

class LineReader:
    '''Iterate over the lines of a file. Closes the file on exhaustion.'''
    def __init__(self, path):
        self.path = path
        self._file = None

    def __iter__(self):
        # open lazily so that constructing a LineReader doesn't open the file;
        # reuse an already-open handle so a second iter() call doesn't leak one
        if self._file is None:
            self._file = open(self.path)
        return self

    def __next__(self):
        if self._file is None:
            raise StopIteration
        line = self._file.readline()
        if not line:
            self._file.close()
            self._file = None
            raise StopIteration
        return line.rstrip('\n')


# (Skipping the live demo — would need a real file. The pattern is what matters.)
print('LineReader defined')

For most file work in Python you'd just write with open(path) as f: for line in f: ... — files are already iterable. The pattern above is what you'd reach for when you're wrapping something that isn't already a file but feels like one (an HTTP stream, a custom protocol parser).
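
LineReader closes its file only when iteration runs to the end, so a loop that stops early leaves the handle open. One possible way to handle that is to add __enter__ / __exit__, as sketched below. ManagedLines is a hypothetical name, and the sketch assumes the wrapped object offers readline() and close(); io.StringIO stands in for a real resource so that the example runs without a file.

```python
import io


class ManagedLines:
    """Hypothetical sketch: iterate over lines from any object that has
    readline() and close(), and close it when the with-block exits."""
    def __init__(self, stream):
        self._stream = stream

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False              # do not swallow exceptions

    def close(self):
        if self._stream is not None:
            self._stream.close()
            self._stream = None

    def __iter__(self):
        return self

    def __next__(self):
        if self._stream is None:
            raise StopIteration
        line = self._stream.readline()
        if not line:
            self.close()          # exhausted: release the resource
            raise StopIteration
        return line.rstrip('\n')


stream = io.StringIO('alpha\nbeta\ngamma\n')   # stand-in for a real resource
with ManagedLines(stream) as lines:
    first = next(lines)           # stop early on purpose
print(first)                      # alpha
print(stream.closed)              # True
```

The with block guarantees the close even though only one line was read, which exhaustion-based cleanup alone cannot do.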

Patterns that come up often¶

A peekable iterator¶

Sometimes you want to look at the next value without consuming it — for parsing, for instance. A class lets you cache the look-ahead in an attribute.

class Peekable:
    '''Wraps any iterable; adds a peek() that returns the next value
    without advancing.'''
    _SENTINEL = object()

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = self._SENTINEL

    def __iter__(self):
        return self

    def __next__(self):
        if self._cache is not self._SENTINEL:
            v, self._cache = self._cache, self._SENTINEL
            return v
        return next(self._it)

    def peek(self, default=_SENTINEL):
        if self._cache is self._SENTINEL:
            try:
                self._cache = next(self._it)
            except StopIteration:
                if default is self._SENTINEL:
                    raise
                return default
        return self._cache


p = Peekable([10, 20, 30])
print(p.peek())   # 10 — non-destructive
print(p.peek())   # 10 — still
print(next(p))    # 10
print(next(p))    # 20
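
As an illustration of the parsing use case, here is a small sketch that uses peek() to decide whether the next character still belongs to the current number. Peekable is repeated so the snippet is self-contained; the tokenize helper is a hypothetical function written for this example, not part of the notebook.

```python
class Peekable:
    """Copy of the class above so the snippet runs on its own."""
    _SENTINEL = object()

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = self._SENTINEL

    def __iter__(self):
        return self

    def __next__(self):
        if self._cache is not self._SENTINEL:
            v, self._cache = self._cache, self._SENTINEL
            return v
        return next(self._it)

    def peek(self, default=_SENTINEL):
        if self._cache is self._SENTINEL:
            try:
                self._cache = next(self._it)
            except StopIteration:
                if default is self._SENTINEL:
                    raise
                return default
        return self._cache


def tokenize(text):
    """Hypothetical helper: split text into integers and one-character symbols."""
    chars = Peekable(text)
    tokens = []
    for ch in chars:
        if ch.isspace():
            continue
        if ch.isdigit():
            digits = ch
            # Look ahead: keep consuming only while the next char is a digit.
            # The '' default makes the check safe at the end of the input.
            while chars.peek('').isdigit():
                digits += next(chars)
            tokens.append(int(digits))
        else:
            tokens.append(ch)
    return tokens


tokens = tokenize('12 + 345*6')
print(tokens)     # [12, '+', 345, '*', 6]
```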

A counting iterator¶

When you want to know "how many of those did I just process?" without doing a second pass, wrap the iterator in something that keeps a running count.

class Counted:
    '''Wraps an iterable and exposes how many items have been yielded.'''
    def __init__(self, iterable):
        self._it = iter(iterable)
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        v = next(self._it)         # propagates StopIteration
        self.count += 1
        return v


nums = Counted(range(100))
total = sum(x for x in nums if x % 7 == 0)
print(f'sum={total}, processed={nums.count}')

Restartable iterator over a callable source¶

If the data isn't already a sequence — say it comes from calling a function each time — you can build a re-iterable around the function. Each __iter__ call creates a fresh iterator that calls the function again.

import random

class Sampled:
    '''Re-iterable: each iteration draws a fresh sample of the same shape.'''
    def __init__(self, sample_fn, n):
        self.sample_fn = sample_fn
        self.n = n

    def __iter__(self):
        for _ in range(self.n):
            yield self.sample_fn()


rnd = random.Random(0)
s = Sampled(lambda: rnd.randint(1, 10), 5)
print(list(s))
print(list(s))   # different sample, but same shape and source

Generator-as-method — the hybrid¶

Often the cleanest approach is a class that holds the configuration and a generator method. You get:

A re-iterable object (because __iter__ is a generator function — calling it returns a fresh generator each time).
Tidy initialisation in __init__.
Other methods on the same object for related behaviour.

This is by far the most common shape in real code.

class FibUpTo:
    '''Iterable: Fibonacci numbers up to a cap. Re-iterable, but not sized:
    values are computed lazily, so there is no __len__.'''
    def __init__(self, cap):
        self.cap = cap

    def __iter__(self):
        a, b = 0, 1
        while a <= self.cap:
            yield a
            a, b = b, a + b

    def first(self):
        '''Convenience — return just the first value.'''
        return next(iter(self))


f = FibUpTo(50)
print(list(f))    # works
print(list(f))    # still works
print(f.first())  # 0

This pattern is the "right" answer most of the time you find yourself wanting a custom iterator. Skip the boilerplate __next__ unless you need it.
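
One consequence of the generator-method shape that is easy to overlook: every call to __iter__ builds an independent generator, so the same object can be iterated in nested loops. A minimal sketch, using a hypothetical UpTo class invented for this example:

```python
class UpTo:
    """Hypothetical re-iterable: yields 0, 1, ..., n - 1 on every pass."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # Generator method: every call builds a new, independent generator.
        i = 0
        while i < self.n:
            yield i
            i += 1


nums = UpTo(3)
a, b = iter(nums), iter(nums)
independent = a is not b
# Nested loops over the same object only work because each `for`
# gets its own iterator.
pairs = [(x, y) for x in nums for y in nums if x < y]
print(independent)   # True
print(pairs)         # [(0, 1), (0, 2), (1, 2)]
```

With a class whose __iter__ returns self, the inner loop would consume the shared position and the outer loop would end after its first item.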

Quick check — sliding-window iterator¶

Implement a class Window(iterable, size) that, when iterated, yields tuples representing a sliding window of size elements over the source. So Window([1,2,3,4,5], 3) yields (1,2,3), (2,3,4), (3,4,5).

Requirements:

The class is re-iterable: list(w) should work twice, provided the source itself can be iterated more than once. Accept any iterable as the source.
Use the generator-method pattern.
Use a collections.deque(maxlen=size) to maintain the window.

from collections import deque

class Window:
    def __init__(self, iterable, size):
        ...

    def __iter__(self):
        ...


# Expected:
# w = Window([1, 2, 3, 4, 5], 3)
# print(list(w))    # [(1,2,3), (2,3,4), (3,4,5)]
# print(list(w))    # same — re-iterable

Working solution¶

from collections import deque

class Window:
    def __init__(self, iterable, size):
        self.iterable = iterable
        self.size = size

    def __iter__(self):
        buf = deque(maxlen=self.size)
        for x in self.iterable:
            buf.append(x)
            if len(buf) == self.size:
                yield tuple(buf)


w = Window([1, 2, 3, 4, 5], 3)
print(list(w))
print(list(w))
print(list(Window(range(6), 4)))
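
The re-iterability here is only as good as the source. The sketch below repeats the solution and compares a list source with a one-shot generator source; the variable names are illustrative.

```python
from collections import deque


class Window:
    """Same solution as above, repeated so the snippet runs on its own."""
    def __init__(self, iterable, size):
        self.iterable = iterable
        self.size = size

    def __iter__(self):
        buf = deque(maxlen=self.size)
        for x in self.iterable:
            buf.append(x)
            if len(buf) == self.size:
                yield tuple(buf)


from_list = Window([1, 2, 3, 4], 2)
from_gen = Window((n for n in [1, 2, 3, 4]), 2)   # one-shot source

list_passes = (list(from_list), list(from_list))
gen_passes = (list(from_gen), list(from_gen))
print(list_passes[1])   # [(1, 2), (2, 3), (3, 4)]
print(gen_passes[0])    # [(1, 2), (2, 3), (3, 4)]
print(gen_passes[1])    # []
```

If a one-shot source has to be supported, one option is to copy it into a list or tuple in __init__, at the cost of holding everything in memory.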

Summary¶

The iterator protocol needs only __iter__ and __next__. Anything else (__len__, __getitem__, peeking) is something you choose to add.
For most cases a generator function is shorter and clearer than an iterator class.
Reach for a class when you need indexing/sizing alongside iteration, when you want an object that can be iterated more than once, or when iteration owns external state (files, connections).
The most common practical pattern is a class with a generator method — config in __init__, behaviour in __iter__ defined with yield.

That closes the Learn track. The Recipes section has worked examples — streaming a large file, building pipelines, common iterator mistakes — and the Reference is where to look for the protocol, generator syntax, and the full itertools table.
