{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "9879cec8",
   "metadata": {},
   "source": "# Generics and collections\n\nA `list` is a list of *something*. For a type-checker to be useful, it needs to know what that something is — a `list[int]` is very different from a `list[str]` when it comes to what methods you can safely call on its elements.\n\nThis notebook covers generic types over collections (`list[int]`, `dict[str, int]`, etc.) and the abstract-container types (`Iterable`, `Sequence`, `Mapping`) that make function signatures more flexible."
  },
  {
   "cell_type": "markdown",
   "id": "dc39b7fe",
   "metadata": {},
   "source": "## Typing built-in collections\n\nPut the element type in square brackets. These are all valid on Python 3.9+:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "37d4a5d2",
   "metadata": {},
   "outputs": [],
   "source": "scores: list[int] = [85, 92, 78]\nnames: set[str] = {\"Alice\", \"Bob\"}\nages: dict[str, int] = {\"Alice\": 30, \"Bob\": 25}\n\nprint(scores)\nprint(names)\nprint(ages)"
  },
  {
   "cell_type": "markdown",
   "id": "36921772",
   "metadata": {},
   "source": "`dict[K, V]` takes two parameters — the key type and the value type. `tuple` can take any number of them, one per position (covered below).\n\nOn Python 3.8 and earlier you had to import these forms from the `typing` module: `List[int]`, `Dict[str, int]`, `Set[str]`. The modern built-in forms are shorter and should be preferred. The [`typing` reference](https://agilearn.co.uk/guides/type-hints/reference/typing-module-reference) covers the legacy syntax if you need it."
  },
  {
   "cell_type": "markdown",
   "id": "d92c06a1",
   "metadata": {},
   "source": "## Tuples — two different shapes\n\nTuples come in two flavours with different annotation syntax:\n\n**Fixed-length, heterogeneous**: `tuple[str, int, bool]` — a 3-tuple with specific types in each position:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "eb0e8f6f",
   "metadata": {},
   "outputs": [],
   "source": "user: tuple[str, int, bool] = (\"Alice\", 30, True)\n# user = (\"Alice\", 30)          # would be a type error — wrong length\n# user = (30, \"Alice\", True)    # would be a type error — wrong types\nprint(user)"
  },
  {
   "cell_type": "markdown",
   "id": "c5432270",
   "metadata": {},
   "source": "**Variable-length, homogeneous**: `tuple[int, ...]` — a tuple of any number of `int`s. The `...` (literally an ellipsis) is part of the syntax."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9ccd5ec3",
   "metadata": {},
   "outputs": [],
   "source": "primes: tuple[int, ...] = (2, 3, 5, 7, 11, 13)\nprint(primes)\nprint(primes[0])"
  },
  {
   "cell_type": "markdown",
   "id": "7e87e5a8",
   "metadata": {},
   "source": "The heterogeneous form (`tuple[str, int, bool]`) is closer to a named-tuple or record — each position has a meaning. For \"variable-length list of integers that happens to be immutable\", use `tuple[int, ...]`. They're genuinely different shapes."
  },
  {
   "cell_type": "markdown",
   "id": "873305ea",
   "metadata": {},
   "source": "## Nesting\n\nGenerics compose — a list of dicts, a dict of lists, whatever you need:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c10f57b1",
   "metadata": {},
   "outputs": [],
   "source": "users: list[dict[str, str]] = [\n    {\"name\": \"Alice\", \"email\": \"alice@example.com\"},\n    {\"name\": \"Bob\", \"email\": \"bob@example.com\"},\n]\n\nscores_by_team: dict[str, list[int]] = {\n    \"red\": [12, 18, 9],\n    \"blue\": [15, 11, 14],\n}\n\nprint(users[0][\"name\"])\nprint(scores_by_team[\"red\"])"
  },
  {
   "cell_type": "markdown",
   "id": "313503e9",
   "metadata": {},
   "source": "A `dict[str, list[int]]` is a dict where keys are strings and values are lists of integers. You can go as deep as you need, but at some point the annotation gets unreadable — past two levels of nesting, consider a `TypedDict` or a dataclass instead (covered in the next notebook and the [data structure recipe](https://agilearn.co.uk/guides/type-hints/recipes/type-a-data-structure))."
  },
  {
   "cell_type": "markdown",
   "id": "4da57ef8",
   "metadata": {},
   "source": "## Abstract containers — `Iterable`, `Sequence`, `Mapping`\n\nA function that iterates over its argument doesn't really need a `list` — any iterable will do. Annotating it `list[int]` is too restrictive; the abstract `Iterable[int]` is more flexible:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8c85cf9b",
   "metadata": {},
   "outputs": [],
   "source": "from collections.abc import Iterable\n\ndef total(numbers: Iterable[int]) -> int:\n    return sum(numbers)\n\n# Accepts a list, tuple, set, generator — anything iterable\nprint(total([1, 2, 3]))\nprint(total((1, 2, 3)))\nprint(total({1, 2, 3}))\nprint(total(i for i in range(4)))"
  },
  {
   "cell_type": "markdown",
   "id": "0fc168c2",
   "metadata": {},
   "source": "The common hierarchy — each row accepts everything below it:\n\n| Type | Supports | Use when you need |\n| --- | --- | --- |\n| `Iterable[T]` | `for x in xs:` | To iterate once |\n| `Iterator[T]` | `next(xs)` | To pull elements lazily |\n| `Collection[T]` | `len(xs)`, `in`, iteration | Size and membership |\n| `Sequence[T]` | `xs[0]`, `xs[1:5]` | Indexed access (list, tuple) |\n| `MutableSequence[T]` | `.append()`, `.insert()` | To modify (list only) |\n| `Mapping[K, V]` | `xs[k]`, `k in xs` | Read-only dict-like |\n| `MutableMapping[K, V]` | `xs[k] = v`, `.update()` | To modify the mapping |\n\n**Rule of thumb**: annotate parameters with the most abstract type that supports what the function actually needs; annotate return types with the concrete type so callers know what they get. `def items(d: Mapping[K, V]) -> list[V]:` — accept a broad range, return a specific thing.\n\nImports come from `collections.abc` (preferred) or `typing` (legacy but still works)."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d806839b",
   "metadata": {},
   "outputs": [],
   "source": "from collections.abc import Mapping\n\ndef get_or_default(data: Mapping[str, int], key: str, default: int = 0) -> int:\n    return data.get(key, default)\n\n# Accepts any mapping — dict, ChainMap, MappingProxyType, etc.\nprint(get_or_default({\"a\": 1, \"b\": 2}, \"a\"))\nprint(get_or_default({\"a\": 1, \"b\": 2}, \"c\", default=99))"
  },
  {
   "cell_type": "markdown",
   "id": "7644eab2",
   "metadata": {},
   "source": "## Custom generics with `TypeVar`\n\nIf you write a function that takes a list and returns one of its elements, you want the element type in the return to match the element type of the input. A `TypeVar` is how you express that relationship:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ce9c7a6b",
   "metadata": {},
   "outputs": [],
   "source": "from typing import TypeVar\nfrom collections.abc import Sequence\n\nT = TypeVar(\"T\")\n\ndef first(items: Sequence[T]) -> T:\n    return items[0]\n\n# The return type matches the input's element type\nprint(first([1, 2, 3]))            # int\nprint(first([\"a\", \"b\", \"c\"]))        # str\nprint(first((True, False)))        # bool"
  },
  {
   "cell_type": "markdown",
   "id": "71fc5560",
   "metadata": {},
   "source": "The `T` is the same type across the annotation — if you pass a `list[str]`, you get a `str` back. Without a `TypeVar`, you'd have to write `items: Sequence[Any]) -> Any`, losing all the type information.\n\nOne subtlety: each call-site binds `T` to a single concrete type. A function annotated `def pair(a: T, b: T) -> tuple[T, T]` won't accept `pair(1, \"a\")` — `T` can't be both `int` and `str` at once. Use `T | U` or two separate TypeVars if you need independent types.\n\nOn Python 3.12+ there's a cleaner syntax — `def first[T](https://agilearn.co.uk/guides/type-hints/learn/items: Sequence[T]) -> T:` — the type parameter is declared in the function signature directly. The `TypeVar` form works on every version from 3.5 onwards, so we'll keep using it here."
  },
  {
   "cell_type": "markdown",
   "id": "486711ae",
   "metadata": {},
   "source": "## Constraining a `TypeVar`\n\nBy default `T` accepts any type. You can constrain it two ways:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "52cf45e3",
   "metadata": {},
   "outputs": [],
   "source": "# Bound: T must be a subtype of this\nfrom typing import TypeVar\n\nNumber = TypeVar(\"Number\", bound=float)    # T can be float or any subtype (int, etc.)\n\ndef double(x: Number) -> Number:\n    return x * 2\n\nprint(double(5))        # int — still matches because int is a subtype of float in typing\nprint(double(3.14))\n\n# Constraint set: T must be one of these exact types\nTextLike = TypeVar(\"TextLike\", str, bytes)\n\ndef first_char(s: TextLike) -> TextLike:\n    return s[:1]\n\nprint(first_char(\"hello\"))\nprint(first_char(b\"world\"))"
  },
  {
   "cell_type": "markdown",
   "id": "af5c5fe4",
   "metadata": {},
   "source": "Use `bound=` when you want \"any subtype of X\" (allowing subclasses). Use a constraint set when you want \"one of these specific types\" and want the type-checker to treat them as separate branches. Most of the time, `bound=` is the one you reach for."
  },
  {
   "cell_type": "markdown",
   "id": "c48e78a0",
   "metadata": {},
   "source": "## Exercise\n\nAnnotate the following functions and variables. Use the most abstract container type that supports what the function actually does."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7d8924b2",
   "metadata": {},
   "outputs": [],
   "source": "# 1. A function that counts unique elements\ndef count_unique(items):\n    return len(set(items))\n\n# 2. A function that returns the last element of a list, tuple, or string\ndef last(items):\n    return items[-1]\n\n# 3. A dict from student names to their exam scores (multiple scores per student)\ngrades = {\n    \"Alice\": [85, 92, 78],\n    \"Bob\": [70, 88, 82],\n}\n\n# 4. A function that takes a dict-like of string->int and returns the largest value\ndef max_value(data):\n    return max(data.values())\n\nprint(count_unique([1, 2, 2, 3]))\nprint(last(\"hello\"))\nprint(max_value({\"a\": 10, \"b\": 5}))"
  },
  {
   "cell_type": "markdown",
   "id": "4bad5fb2",
   "metadata": {},
   "source": "<details>\n<summary>Solution</summary>\n\n```python\nfrom collections.abc import Iterable, Sequence, Mapping\nfrom typing import TypeVar\n\n# 1. Any iterable works, and we don't care about element type\ndef count_unique(items: Iterable[object]) -> int:\n    return len(set(items))\n\n# 2. Needs indexing; TypeVar preserves the element type\nT = TypeVar(\"T\")\ndef last(items: Sequence[T]) -> T:\n    return items[-1]\n\n# 3. Variable annotation\ngrades: dict[str, list[int]] = {\n    \"Alice\": [85, 92, 78],\n    \"Bob\": [70, 88, 82],\n}\n\n# 4. Mapping for read-only access\ndef max_value(data: Mapping[str, int]) -> int:\n    return max(data.values())\n```\n\nNotes:\n\n- `count_unique` uses `Iterable[object]` because the function doesn't call any methods on the elements beyond hashing — any object works. You could also write `Iterable[Any]` or even just `Iterable` — all equivalent for this case.\n- `last`'s `TypeVar` means the return type matches the element type: pass a `list[int]`, get an `int`.\n- `max_value` uses `Mapping`, not `dict` — a `Mapping` is anything dict-like, which is all the function needs.\n</details>"
  },
  {
   "cell_type": "markdown",
   "id": "aa5696c2",
   "metadata": {},
   "source": "## Recap\n\n- Generic built-ins: `list[T]`, `set[T]`, `dict[K, V]`. Python 3.9+.\n- Tuples: `tuple[str, int, bool]` for fixed-length; `tuple[int, ...]` for variable-length.\n- Prefer abstract `Iterable`, `Sequence`, `Mapping` for *parameters*; concrete `list`, `dict` for *return types*.\n- `TypeVar` expresses \"the return type depends on the argument type\" — use when you need to preserve element types across a call.\n- Import abstract containers from `collections.abc` (preferred) or `typing` (legacy).\n\nNext: [Optional, Union, and friends](https://agilearn.co.uk/guides/type-hints/learn/04-typing-special-forms) — the expressive forms (`Literal`, `Callable`, `TypedDict`, `Any`) for trickier shapes."
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}