{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c037be9c",
   "metadata": {},
   "source": "# Custom iterators\n\nGenerator functions cover most of what you need. So when *would* you write an iterator class by hand?\n\nThe honest answer is: rarely. But \"rarely\" isn't \"never\". This notebook walks through the cases where a class is the right shape, the mechanics of the protocol, and a couple of patterns — restartable iteration, attaching state, and integrating with `len()` or indexing — that don't fit comfortably into a generator function."
  },
  {
   "cell_type": "markdown",
   "id": "93eacf00",
   "metadata": {},
   "source": "## The protocol, recap\n\nTwo methods:\n\n- `__iter__(self)` — return *something with a `__next__`*. For a class that *is* the iterator, return `self`. For a class that just makes iterators, return a fresh iterator object.\n- `__next__(self)` — return the next value, or raise `StopIteration` when there are no more.\n\nThat's it. There is no other contract — no length, no indexing, no rewind. Anything more is something you're choosing to add."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f413f979",
   "metadata": {},
   "outputs": [],
   "source": "class Counter:\n    '''A simple iterator that counts from 1 to stop.'''\n    def __init__(self, stop):\n        self.stop = stop\n        self.current = 0\n\n    def __iter__(self):\n        return self          # this object IS its own iterator\n\n    def __next__(self):\n        if self.current >= self.stop:\n            raise StopIteration\n        self.current += 1\n        return self.current\n\n\nfor x in Counter(3):\n    print(x)"
  },
  {
   "cell_type": "markdown",
   "id": "d23cc7b2",
   "metadata": {},
   "source": "## When to reach for a class instead of a generator\n\nThree situations make a class the better choice."
  },
  {
   "cell_type": "markdown",
   "id": "a9a7093f",
   "metadata": {},
   "source": "### 1. You want indexing or `len()` alongside iteration\n\nA generator function is purely sequential. If callers also want to ask \"how many items?\" or \"give me the third one\", you need a class that implements `__len__` and `__getitem__` as well as iteration."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "69d70495",
   "metadata": {},
   "outputs": [],
   "source": "class Range3D:\n    '''A 3D grid of (x, y, z) coordinates — iterable, indexable, sized.'''\n    def __init__(self, nx, ny, nz):\n        self.nx, self.ny, self.nz = nx, ny, nz\n\n    def __len__(self):\n        return self.nx * self.ny * self.nz\n\n    def __getitem__(self, i):\n        # decode flat index into (x, y, z)\n        z, rem = divmod(i, self.nx * self.ny)\n        y, x   = divmod(rem, self.nx)\n        return (x, y, z)\n\n    def __iter__(self):\n        for i in range(len(self)):\n            yield self[i]\n\n\ngrid = Range3D(2, 2, 2)\nprint(len(grid))\nprint(grid[3])\nprint(list(grid))"
  },
  {
   "cell_type": "markdown",
   "id": "86152818",
   "metadata": {},
   "source": "Notice the trick on the last line: even when you have `__getitem__`, you can still write `__iter__` as a generator function inside the class. The two ways aren't mutually exclusive."
  },
  {
   "cell_type": "markdown",
   "id": "522d2455",
   "metadata": {},
   "source": "(In fact, Python will *fall back* to `__getitem__`-with-integer-keys-starting-at-0 if you don't define `__iter__`, but that fallback is brittle and best avoided. Define `__iter__` explicitly.)"
  },
  {
   "cell_type": "markdown",
   "id": "9c39ac7a",
   "metadata": {},
   "source": "### 2. You want naturally re-iterable behaviour\n\nA generator function returns a one-shot iterator. If you want `for` over the same object to work multiple times, you need *something* to hold the configuration and produce a fresh iterator each time. The cleanest version is two classes: an outer \"iterable\" and an inner \"iterator\". The outer's `__iter__` returns a new instance of the inner."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c3287a79",
   "metadata": {},
   "outputs": [],
   "source": "class Chunks:\n    '''Iterable: split an iterable into chunks of size n. Re-iterable.'''\n    def __init__(self, source, size):\n        self.source = source\n        self.size = size\n\n    def __iter__(self):\n        return _ChunksIterator(self.source, self.size)\n\n\nclass _ChunksIterator:\n    def __init__(self, source, size):\n        self.source = iter(source)   # store the underlying iterator\n        self.size = size\n\n    def __iter__(self):\n        return self\n\n    def __next__(self):\n        chunk = []\n        for _ in range(self.size):\n            try:\n                chunk.append(next(self.source))\n            except StopIteration:\n                if chunk:\n                    return chunk\n                raise\n        return chunk\n\n\nc = Chunks([1, 2, 3, 4, 5, 6, 7], 3)\nprint(list(c))    # [[1,2,3], [4,5,6], [7]]\nprint(list(c))    # [[1,2,3], [4,5,6], [7]] — works again"
  },
  {
   "cell_type": "markdown",
   "id": "f3e81783",
   "metadata": {},
   "source": "You could *almost* do this with a generator function:\n\n```python\ndef chunks(source, size):\n    chunk = []\n    for x in source:\n        chunk.append(x)\n        if len(chunk) == size:\n            yield chunk\n            chunk = []\n    if chunk:\n        yield chunk\n```\n\n…but `chunks(my_list, 3)` returns a one-shot generator. Calling `list(...)` on it twice would empty it the first time. The class form is naturally re-iterable because each `for` loop calls `__iter__` and gets a fresh `_ChunksIterator`."
  },
  {
   "cell_type": "markdown",
   "id": "40a2a409",
   "metadata": {},
   "source": "### 3. The iterator owns external state — files, sockets, database cursors\n\nIf your iterator wraps a resource that needs explicit setup or teardown (open a file, dial a connection), the class form gives you `__enter__` / `__exit__` and `__del__` to manage that resource. Generators can do this with `try`/`finally`, but a class makes the lifecycle visible."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e323126",
   "metadata": {},
   "outputs": [],
   "source": "class LineReader:\n    '''Iterate over the lines of a file. Closes the file on exhaustion.'''\n    def __init__(self, path):\n        self.path = path\n        self._file = None\n\n    def __iter__(self):\n        # open lazily so that constructing a LineReader doesn't open the file\n        self._file = open(self.path)\n        return self\n\n    def __next__(self):\n        if self._file is None:\n            raise StopIteration\n        line = self._file.readline()\n        if not line:\n            self._file.close()\n            self._file = None\n            raise StopIteration\n        return line.rstrip('\\n')\n\n\n# (Skipping the live demo — would need a real file. The pattern is what matters.)\nprint('LineReader defined')"
  },
  {
   "cell_type": "markdown",
   "id": "ac948269",
   "metadata": {},
   "source": "For *most* file work in Python you'd just write `with open(path) as f: for line in f: ...` — files are already iterable. The pattern above is what you'd reach for when you're wrapping something that *isn't* already a file but feels like one (an HTTP stream, a custom protocol parser)."
  },
  {
   "cell_type": "markdown",
   "id": "d4d90d5e",
   "metadata": {},
   "source": "## Patterns that come up often"
  },
  {
   "cell_type": "markdown",
   "id": "9812d993",
   "metadata": {},
   "source": "### A peekable iterator\n\nSometimes you want to look at the next value without consuming it — for parsing, for instance. A class lets you cache the look-ahead in an attribute."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a9a09ecd",
   "metadata": {},
   "outputs": [],
   "source": "class Peekable:\n    '''Wraps any iterable; adds a peek() that returns the next value\n    without advancing.'''\n    _SENTINEL = object()\n\n    def __init__(self, iterable):\n        self._it = iter(iterable)\n        self._cache = self._SENTINEL\n\n    def __iter__(self):\n        return self\n\n    def __next__(self):\n        if self._cache is not self._SENTINEL:\n            v, self._cache = self._cache, self._SENTINEL\n            return v\n        return next(self._it)\n\n    def peek(self, default=_SENTINEL):\n        if self._cache is self._SENTINEL:\n            try:\n                self._cache = next(self._it)\n            except StopIteration:\n                if default is self._SENTINEL:\n                    raise\n                return default\n        return self._cache\n\n\np = Peekable([10, 20, 30])\nprint(p.peek())   # 10 — non-destructive\nprint(p.peek())   # 10 — still\nprint(next(p))    # 10\nprint(next(p))    # 20"
  },
  {
   "cell_type": "markdown",
   "id": "4f388ce6",
   "metadata": {},
   "source": "### A counting iterator\n\nWhen you want to know \"how many of those did I just process?\" without doing a second pass, wrap the iterator in something that tracks it."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13e9e92a",
   "metadata": {},
   "outputs": [],
   "source": "class Counted:\n    '''Wraps an iterable and exposes how many items have been yielded.'''\n    def __init__(self, iterable):\n        self._it = iter(iterable)\n        self.count = 0\n\n    def __iter__(self):\n        return self\n\n    def __next__(self):\n        v = next(self._it)         # propagates StopIteration\n        self.count += 1\n        return v\n\n\nnums = Counted(range(100))\ntotal = sum(x for x in nums if x % 7 == 0)\nprint(f'sum={total}, processed={nums.count}')"
  },
  {
   "cell_type": "markdown",
   "id": "85421078",
   "metadata": {},
   "source": "### Restartable iterator over a callable source\n\nIf the data isn't already a sequence — say it comes from calling a function each time — you can build a re-iterable around the function. Each `__iter__` call creates a fresh iterator that calls the function again."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a4deb5f9",
   "metadata": {},
   "outputs": [],
   "source": "import random\n\nclass Sampled:\n    '''Re-iterable: each iteration draws a fresh sample of the same shape.'''\n    def __init__(self, sample_fn, n):\n        self.sample_fn = sample_fn\n        self.n = n\n\n    def __iter__(self):\n        for _ in range(self.n):\n            yield self.sample_fn()\n\n\nrnd = random.Random(0)\ns = Sampled(lambda: rnd.randint(1, 10), 5)\nprint(list(s))\nprint(list(s))   # different sample, but same shape and source"
  },
  {
   "cell_type": "markdown",
   "id": "12ff9f92",
   "metadata": {},
   "source": "## Generator-as-method — the hybrid\n\nOften the cleanest approach is a class that holds the configuration and a generator method. You get:\n\n- A re-iterable object (because `__iter__` is a generator function — calling it returns a fresh generator each time).\n- Tidy initialisation in `__init__`.\n- Other methods on the same object for related behaviour.\n\nThis is by far the most common shape in real code."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a63eb2f7",
   "metadata": {},
   "outputs": [],
   "source": "class FibUpTo:\n    '''Iterable: Fibonacci numbers up to a cap. Re-iterable. Sized? No —\n    we don't precompute. But re-iteration works.'''\n    def __init__(self, cap):\n        self.cap = cap\n\n    def __iter__(self):\n        a, b = 0, 1\n        while a <= self.cap:\n            yield a\n            a, b = b, a + b\n\n    def first(self):\n        '''Convenience — return just the first value.'''\n        return next(iter(self))\n\n\nf = FibUpTo(50)\nprint(list(f))    # works\nprint(list(f))    # still works\nprint(f.first())  # 0"
  },
  {
   "cell_type": "markdown",
   "id": "7d998138",
   "metadata": {},
   "source": "This pattern is the \"right\" answer most of the time you find yourself wanting a custom iterator. Skip the boilerplate `__next__` unless you need it."
  },
  {
   "cell_type": "markdown",
   "id": "6ce603c1",
   "metadata": {},
   "source": "## Quick check — sliding-window iterator\n\nImplement a class `Window(iterable, size)` that, when iterated, yields tuples representing a sliding window of `size` elements over the source. So `Window([1,2,3,4,5], 3)` yields `(1,2,3)`, `(2,3,4)`, `(3,4,5)`.\n\nRequirements:\n\n- The class is **re-iterable** — `list(w)` should work twice (assume the source iterable can also be iterated twice; for cleanness, accept any iterable).\n- Use the generator-method pattern.\n- Use a `collections.deque(maxlen=size)` to maintain the window."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2e3d761a",
   "metadata": {},
   "outputs": [],
   "source": "from collections import deque\n\nclass Window:\n    def __init__(self, iterable, size):\n        ...\n\n    def __iter__(self):\n        ...\n\n\n# Expected:\n# w = Window([1, 2, 3, 4, 5], 3)\n# print(list(w))    # [(1,2,3), (2,3,4), (3,4,5)]\n# print(list(w))    # same — re-iterable"
  },
  {
   "cell_type": "markdown",
   "id": "8688190b",
   "metadata": {},
   "source": "### Working solution"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b00ef161",
   "metadata": {},
   "outputs": [],
   "source": "from collections import deque\n\nclass Window:\n    def __init__(self, iterable, size):\n        self.iterable = iterable\n        self.size = size\n\n    def __iter__(self):\n        buf = deque(maxlen=self.size)\n        for x in self.iterable:\n            buf.append(x)\n            if len(buf) == self.size:\n                yield tuple(buf)\n\n\nw = Window([1, 2, 3, 4, 5], 3)\nprint(list(w))\nprint(list(w))\nprint(list(Window(range(6), 4)))"
  },
  {
   "cell_type": "markdown",
   "id": "39675766",
   "metadata": {},
   "source": "## Summary\n\n- The iterator protocol needs only `__iter__` and `__next__`. Anything else (`__len__`, `__getitem__`, peeking) is something you choose to add.\n- For most cases a generator function is shorter and clearer than an iterator class.\n- Reach for a class when you need indexing/sizing alongside iteration, when you want re-iterable behaviour without an outer wrapper, or when iteration owns external state (files, connections).\n- The most common practical pattern is a class with a generator *method* — config in `__init__`, behaviour in `__iter__` defined with `yield`.\n\nThat closes the Learn track. The [Recipes](https://agilearn.co.uk/guides/iterators-and-generators/recipes) section has worked examples — streaming a large file, building pipelines, common iterator mistakes — and the [Reference](https://agilearn.co.uk/guides/iterators-and-generators/reference) is where to look for the protocol, generator syntax, and the full `itertools` table."
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}