{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Chain and group iterables\n",
    "\n",
    "**The question.** You have several iterables you'd like to treat as one, or one iterable whose adjacent values should be grouped, or two iterables you want to walk in parallel. You want to do this without materialising the data into a list first.\n",
    "\n",
    "The answer: reach for `itertools.chain` and `itertools.groupby`. Together they handle the end-to-end-concatenate and runs-of-equal-values cases. `zip` handles the parallel case and `product`/`combinations`/`permutations` handle the combinatoric cases — see the extra cells.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Chain several iterables end-to-end, then group adjacent equal values.\n",
    "# Classic worked example: two teams' scores, combined and grouped by score.\n",
    "from itertools import chain, groupby\n",
    "\n",
    "team_a = [('Alice', 12), ('Bob', 18), ('Carol', 9)]\n",
    "team_b = [('Dan', 12), ('Eve', 20), ('Fern', 18)]\n",
    "\n",
    "# chain: concatenate without materialising. Each argument can be any iterable.\n",
    "combined = sorted(chain(team_a, team_b), key=lambda p: -p[1])\n",
    "\n",
    "# groupby: runs of ADJACENT equal values. Sort first if you need SQL-style GROUP BY.\n",
    "for score, players in groupby(combined, key=lambda p: p[1]):\n",
    "    names = [name for name, _ in players]   # materialise inside the loop\n",
    "    if len(names) > 1:\n",
    "        print(f'tied at {score}: {names}')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Variant: `zip` and `zip_longest` for parallel iteration\n",
    "\n",
    "Built-in `zip` yields tuples from each iterable in lockstep, stopping at the shortest. Use `strict=True` (Python 3.10+) to fail loudly on length mismatch rather than silently truncating. `itertools.zip_longest` pads missing values instead of stopping.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from itertools import zip_longest\n",
    "\n",
    "names  = ['Ada', 'Grace', 'Linus']\n",
    "scores = [95, 88, 72]\n",
    "\n",
    "for name, score in zip(names, scores, strict=True):\n",
    "    print(f'{name}: {score}')\n",
    "\n",
    "# Pad ragged inputs:\n",
    "print(list(zip_longest([1, 2, 3, 4], ['x', 'y'], fillvalue='?')))\n",
    "\n",
    "# Transpose for free:\n",
    "rows = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]\n",
    "print(list(zip(*rows)))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Variant: `product`, `combinations`, `permutations` for combinatorics\n",
    "\n",
    "When you need every combination across several iterables, `itertools.product` is the cross-join equivalent. `combinations` and `permutations` are the order-doesn't-matter and order-matters variants of pick-from-one-iterable.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from itertools import product, combinations, permutations\n",
    "\n",
    "sizes = ['S', 'M', 'L']\n",
    "colours = ['red', 'blue']\n",
    "\n",
    "# Every (size, colour) pair\n",
    "for s, c in product(sizes, colours):\n",
    "    print(f'{s}-{c}')\n",
    "\n",
    "# All 3-digit binary strings\n",
    "print(list(product('01', repeat=3))[:4], '...')\n",
    "\n",
    "# Pick 2 from 'abcd' — order doesn't matter vs. does\n",
    "print(list(combinations('abcd', 2)))\n",
    "print(list(permutations('abc', 2)))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Why this works\n",
    "\n",
    "`chain` is a generator: it yields from its first argument, then its second, and so on. Nothing is copied — the arguments could be lists, generator expressions, file handles, or the output of other `itertools` calls, and `chain` would still use O(1) extra memory.\n",
    "\n",
    "`groupby` is the catch-and-release version of SQL `GROUP BY`: it groups *adjacent* equal values only. That's the same semantics as the Unix `uniq` command. If your input isn't sorted by the group key, sort first. The yielded sub-iterator is only valid while the outer loop is on that iteration — as soon as you advance to the next `(key, group)` pair, the previous group is silently exhausted. That's why the canonical answer materialises `names` inside the loop.\n",
    "\n",
    "The two compose cleanly because they both speak the iterator protocol. `chain` feeds into `sorted` (which is eager but returns a list), `sorted` feeds into `groupby`. The only eager step is `sorted` — if your data is already ordered, you can drop it and the whole pipeline stays lazy.\n"
   ]
  },
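  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of the two `groupby` behaviours described above. The `readings` list is hypothetical sample data, not part of the recipe. It shows that unsorted input splits one key across several groups, and that sub-iterators saved for later have nothing left to yield once the outer loop has moved on.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from itertools import groupby\n",
    "\n",
    "# Hypothetical sample data: the key 'a' appears in two separate runs.\n",
    "readings = ['a', 'a', 'b', 'a', 'c', 'c']\n",
    "\n",
    "# Unsorted input: only ADJACENT equal values are merged, so 'a' shows up twice.\n",
    "print([(key, len(list(group))) for key, group in groupby(readings)])\n",
    "\n",
    "# Sorted input: one group per distinct key, like SQL GROUP BY.\n",
    "print([(key, len(list(group))) for key, group in groupby(sorted(readings))])\n",
    "\n",
    "# Trap: keep the sub-iterators and read them only after the loop has moved on.\n",
    "saved = [group for _, group in groupby(readings)]\n",
    "print([list(group) for group in saved])   # the saved groups have nothing left\n",
    "\n",
    "# Fix: materialise each group while the outer loop is still on it.\n",
    "kept = [list(group) for _, group in groupby(readings)]\n",
    "print(kept)\n"
   ]
  },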
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Trade-offs\n",
    "\n",
    "Reach for `chain.from_iterable(nested)` when you have an iterable *of* iterables — it's identical to `chain(*nested)` but doesn't force the outer iterable into memory. It flattens one level only; deeper flattening needs a small recursive generator.\n",
    "\n",
    "If you only need counts per group, `collections.Counter` beats `groupby` — no sort step, no adjacency requirement. `groupby` earns its place when you need to iterate every member of each group, not just the total.\n",
    "\n",
    "Common traps: iterating the same sub-iterator twice, or saving sub-iterators for later (they'll all come back empty). The [avoid common iterator mistakes](https://agilearn.co.uk/guides/iterators-and-generators/recipes/avoid-common-iterator-mistakes) recipe has the full catalogue.\n"
   ]
  },
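  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A short sketch of these trade-offs, under stated assumptions: `nested`, `scores`, and the `flatten` helper are hypothetical names introduced for illustration. It shows `chain.from_iterable` flattening one level from a lazy outer iterable, one possible recursive generator for deeper nesting (it only descends into lists and tuples), and `Counter` producing per-key counts without a sort.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter\n",
    "from itertools import chain, groupby\n",
    "\n",
    "# Hypothetical nested data, produced lazily so the outer iterable is never a list.\n",
    "nested = (range(n) for n in (1, 2, 3))\n",
    "print(list(chain.from_iterable(nested)))\n",
    "\n",
    "# One level only: the inner list [2, 3] survives as a single item.\n",
    "print(list(chain.from_iterable([[1, [2, 3]], [4]])))\n",
    "\n",
    "\n",
    "def flatten(items):\n",
    "    # Hypothetical helper: recurse into lists and tuples, yield everything else.\n",
    "    for item in items:\n",
    "        if isinstance(item, (list, tuple)):\n",
    "            yield from flatten(item)\n",
    "        else:\n",
    "            yield item\n",
    "\n",
    "\n",
    "print(list(flatten([[1, [2, 3]], [4]])))\n",
    "\n",
    "# Counts per key: Counter needs no sort and no adjacency.\n",
    "scores = [12, 18, 9, 12, 20, 18]   # hypothetical, deliberately unsorted\n",
    "print(Counter(scores))\n",
    "\n",
    "# groupby yields the same counts only if the input is sorted first.\n",
    "print({score: len(list(group)) for score, group in groupby(sorted(scores))})\n"
   ]
  },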
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Related reading\n",
    "\n",
    "- [Combine generators into a pipeline](https://agilearn.co.uk/guides/iterators-and-generators/recipes/combine-generators) — `chain` and `groupby` are stages you'd wire into a larger pipeline.\n",
    "- [Avoid common iterator mistakes](https://agilearn.co.uk/guides/iterators-and-generators/recipes/avoid-common-iterator-mistakes) — the groupby-materialisation trap in detail.\n",
    "- [itertools cheatsheet](https://agilearn.co.uk/guides/iterators-and-generators/reference/itertools-cheatsheet) — every `itertools` function at a glance.\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}