{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ac7f9100",
   "metadata": {},
   "source": "# Data classes\n\nA lot of classes exist for one reason: to bundle a few values together under a name. Each needs the same parts: the fields, an `__init__` that assigns each one to `self`, an `__eq__` that compares them, and a `__repr__` that lists them. It's all boilerplate, and you write it the same way every time.\n\n`@dataclass` (added in Python 3.7) generates all of that from a class body that just lists the field names with their types. This notebook covers `@dataclass` in depth, then introduces `NamedTuple` and `TypedDict` as lighter-weight alternatives. The [decision recipe](https://agilearn.co.uk/guides/classes-and-objects/recipes/choose-between-dataclass-namedtuple-class) ties the three together."
  },
  {
   "cell_type": "markdown",
   "id": "beb9a532",
   "metadata": {},
   "source": "## A worked example: `Point` as a data class\n\nThe hand-rolled `Point` from the [previous notebook](https://agilearn.co.uk/guides/classes-and-objects/learn/02-dunder-methods) was about fifteen lines. The dataclass version is four:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2c8212a9",
   "metadata": {},
   "outputs": [],
   "source": "from dataclasses import dataclass\n\n@dataclass\nclass Point:\n    x: float\n    y: float\n\np1 = Point(3, 4)\np2 = Point(3, 4)\n\nprint(p1)            # __repr__ for free\nprint(p1 == p2)      # __eq__ for free"
  },
  {
   "cell_type": "markdown",
   "id": "e4a37786",
   "metadata": {},
   "source": "From a list of *annotated* fields, `@dataclass` generated:\n\n- `__init__` that takes `x` and `y` as parameters and assigns them to `self`.\n- `__repr__` that returns a string like `Point(x=3, y=4)`.\n- `__eq__` that compares the fields as a tuple, in declaration order, and only treats two objects as equal if they are instances of the same class.\n\nThe annotations (`x: float`, `y: float`) aren't enforced at runtime; they're hints, the same as anywhere else in Python. That's why `Point(3, 4)` happily stores the integers `3` and `4`. But an annotation is required for `@dataclass` to recognise a field. A bare `x = 0` line would stay an ordinary class attribute and be left out of the generated methods."
  },
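  {
   "cell_type": "markdown",
   "id": "c91d4e27",
   "metadata": {},
   "source": "To check which names `@dataclass` actually treats as fields, one option is `dataclasses.fields()`. The sketch below uses a hypothetical `Sample` class (it isn't part of this guide's running examples) in which `y` has no annotation:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f0a6b18",
   "metadata": {},
   "outputs": [],
   "source": "from dataclasses import dataclass, fields\n\n@dataclass\nclass Sample:      # hypothetical class, for illustration only\n    x: float       # annotated, so it becomes a field\n    y = 0          # no annotation: a plain class attribute, not a field\n\nprint([f.name for f in fields(Sample)])   # only 'x' is listed\nprint(Sample(1.5))                        # the generated __repr__ leaves y out"
  },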
  {
   "cell_type": "markdown",
   "id": "6bb58049",
   "metadata": {},
   "source": "## Defaults and `field()`\n\nSimple defaults work as you'd expect:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8f604622",
   "metadata": {},
   "outputs": [],
   "source": "@dataclass\nclass Page:\n    title: str\n    word_count: int = 0\n    published: bool = False\n\nprint(Page(\"Untitled\"))\nprint(Page(\"Hello\", word_count=42, published=True))"
  },
  {
   "cell_type": "markdown",
   "id": "182049ec",
   "metadata": {},
   "source": "Just like parameters in an ordinary function signature, fields with defaults must come after fields without. If you put a field without a default after one that has a default, `@dataclass` raises a `TypeError` when the class is defined."
  },
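  {
   "cell_type": "markdown",
   "id": "7a35d0ce",
   "metadata": {},
   "source": "A minimal sketch of that failure, using a hypothetical `Draft` class. The error comes from the decorator, at class-definition time, so the `try` has to wrap the class statement itself. The exact wording of the message varies between Python versions:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e06b82f3",
   "metadata": {},
   "outputs": [],
   "source": "from dataclasses import dataclass\n\ntry:\n    @dataclass\n    class Draft:               # hypothetical class, for illustration only\n        word_count: int = 0\n        title: str             # no default, but it follows a defaulted field\nexcept TypeError as e:\n    print(f\"{type(e).__name__}: {e}\")"
  },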
  {
   "cell_type": "markdown",
   "id": "8066862d",
   "metadata": {},
   "source": "### Mutable defaults need `default_factory`\n\nYou shouldn't use a mutable object (`[]`, `{}`, `set()`) as a default value, because every instance would share the same object. `@dataclass` catches `list`, `dict`, and `set` defaults and raises a `ValueError` when the class is defined:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dcf495f3",
   "metadata": {},
   "outputs": [],
   "source": "from dataclasses import field\n\ntry:\n    @dataclass\n    class Bag:\n        items: list = []\nexcept ValueError as e:\n    print(f\"{type(e).__name__}: {e}\")"
  },
  {
   "cell_type": "markdown",
   "id": "24c4665e",
   "metadata": {},
   "source": "Use `field(default_factory=...)` instead. The factory is called fresh for each new instance:"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2d3f235e",
   "metadata": {},
   "outputs": [],
   "source": "@dataclass\nclass Bag:\n    items: list = field(default_factory=list)\n\na = Bag()\nb = Bag()\na.items.append(\"apple\")\nprint(a.items, b.items)   # b is unaffected"
  },
  {
   "cell_type": "markdown",
   "id": "9ffa1521",
   "metadata": {},
   "source": "## `frozen=True` — immutable instances\n\nPass `frozen=True` to the decorator and you get an immutable dataclass. Trying to assign to a field after construction raises `dataclasses.FrozenInstanceError`, a subclass of `AttributeError`. As a bonus, frozen dataclasses get a generated `__hash__` (with the default `eq=True`), so instances can go in `set`s and be used as `dict` keys, provided every field value is itself hashable."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "effbc9d4",
   "metadata": {},
   "outputs": [],
   "source": "from dataclasses import FrozenInstanceError\n\n@dataclass(frozen=True)\nclass Coord:\n    lat: float\n    lon: float\n\nhome = Coord(51.5, -0.1)\n\ntry:\n    home.lat = 52.0\nexcept FrozenInstanceError as e:   # a subclass of AttributeError\n    print(f\"{type(e).__name__}: {e}\")\n\nprint({Coord(51.5, -0.1), Coord(51.5, -0.1)})   # hashable; one survives"
  },
  {
   "cell_type": "markdown",
   "id": "d2eca341",
   "metadata": {},
   "source": "Use `frozen=True` for value-like types: coordinates, money, configuration records — anything where two instances with the same fields *are* the same thing."
  },
  {
   "cell_type": "markdown",
   "id": "a8bf8bc0",
   "metadata": {},
   "source": "## `slots=True` — smaller, stricter instances\n\nPass `slots=True` (added in Python 3.10) and the generated class uses `__slots__` instead of an instance `__dict__`. Two practical effects:\n\n- Instances use noticeably less memory — useful when you're creating millions of them.\n- Setting an attribute that wasn't declared raises `AttributeError`, instead of silently adding it."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ba4897de",
   "metadata": {},
   "outputs": [],
   "source": "@dataclass(slots=True)\nclass Pixel:\n    x: int\n    y: int\n    colour: str\n\np = Pixel(0, 0, \"red\")\n\ntry:\n    p.alpha = 0.5      # not declared — refused\nexcept AttributeError as e:\n    print(f\"{type(e).__name__}: {e}\")"
  },
  {
   "cell_type": "markdown",
   "id": "ba8f1c1d",
   "metadata": {},
   "source": "## `order=True` — free comparisons\n\nPass `order=True` and `@dataclass` adds `__lt__`, `__le__`, `__gt__`, and `__ge__`. Comparison is tuple-style: it compares fields left-to-right, in declaration order."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "54d29e6e",
   "metadata": {},
   "outputs": [],
   "source": "@dataclass(order=True)\nclass Task:\n    priority: int\n    title: str\n\ntasks = [Task(2, \"write tests\"), Task(1, \"fix bug\"), Task(2, \"update docs\")]\nfor t in sorted(tasks):\n    print(t)"
  },
  {
   "cell_type": "markdown",
   "id": "476d8953",
   "metadata": {},
   "source": "Field declaration order matters here. `Task(1, ...)` sorts before any `Task(2, ...)` because priority is the first field. If you want to sort by `title` first, declare `title` first."
  },
  {
   "cell_type": "markdown",
   "id": "9e8373dd",
   "metadata": {},
   "source": "## `__post_init__` — derived fields and validation\n\nWhen the generated `__init__` isn't enough, because you need a derived field or want to validate the inputs, define `__post_init__`. The generated `__init__` calls it as its last step, after every field has been assigned."
  },
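  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7eaf6b63",
   "metadata": {},
   "outputs": [],
   "source": "from dataclasses import dataclass, field\n\n@dataclass\nclass Rectangle:\n    width: float\n    height: float\n    # Derived field: not an __init__ parameter, but still part of __repr__ and __eq__.\n    area: float = field(init=False)\n\n    def __post_init__(self):\n        # Validate first, then compute the derived value.\n        if self.width <= 0 or self.height <= 0:\n            raise ValueError(\"sides must be positive\")\n        self.area = self.width * self.height\n\nr = Rectangle(3, 4)\nprint(r)          # area shows up in the repr\nprint(r.area)\n\ntry:\n    Rectangle(-1, 4)\nexcept ValueError as e:\n    print(f\"{type(e).__name__}: {e}\")"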
  },
  {
   "cell_type": "markdown",
   "id": "19e0c089",
   "metadata": {},
   "source": "## `NamedTuple` — when you really do want a tuple\n\nIf your record-like type is genuinely tuple-shaped — small, immutable, sometimes unpacked — `typing.NamedTuple` is even lighter than a frozen dataclass. It *is* a tuple, with attribute access added on top."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a0dec619",
   "metadata": {},
   "outputs": [],
   "source": "from typing import NamedTuple\n\nclass Coord(NamedTuple):\n    lat: float\n    lon: float\n\nhome = Coord(51.5, -0.1)\nprint(home.lat, home.lon)        # attribute access\nlat, lon = home                  # tuple unpacking\nprint(lat, lon)\nprint(home == (51.5, -0.1))      # equal to a plain tuple with same contents!"
  },
  {
   "cell_type": "markdown",
   "id": "d493d314",
   "metadata": {},
   "source": "Two things that follow from \"it's a tuple\":\n\n- Always immutable. No `frozen=True` decision to make.\n- Equal to plain tuples with the same contents — `Coord(51.5, -0.1) == (51.5, -0.1)` is `True`. Sometimes useful, sometimes a footgun.\n\nReach for `NamedTuple` when the type is small (two or three fields), immutable, and you genuinely want tuple-like behaviour. For anything bigger or mutable, prefer `@dataclass`."
  },
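  {
   "cell_type": "markdown",
   "id": "1d8c5a7b",
   "metadata": {},
   "source": "The footgun shows up when two unrelated named tuples happen to hold the same values. A minimal sketch, with a hypothetical `Size` type defined alongside `Coord` purely for illustration. The generated dataclass `__eq__`, by contrast, also requires both operands to be the same class, so the equivalent dataclasses would compare unequal."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "92fe3c60",
   "metadata": {},
   "outputs": [],
   "source": "from typing import NamedTuple\n\nclass Coord(NamedTuple):\n    lat: float\n    lon: float\n\nclass Size(NamedTuple):        # hypothetical type, for illustration only\n    width: float\n    height: float\n\n# Both are tuples, so equality only looks at the contents, not the type.\nprint(Coord(3.0, 4.0) == Size(3.0, 4.0))   # True"
  },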
  {
   "cell_type": "markdown",
   "id": "62c30dda",
   "metadata": {},
   "source": "## `TypedDict` — typed dicts, not classes\n\nIf your data really is a dict (it comes from JSON, or gets passed to a library that expects a dict) but you want type-checker support for the keys, use `typing.TypedDict`. You declare it with class syntax, but at runtime the values are plain `dict`s and nothing is validated; the declaration exists for type checkers like mypy and pyright. The [type hints guide](https://agilearn.co.uk/guides/type-hints) covers it in detail."
  },
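  {
   "cell_type": "markdown",
   "id": "b64e0d35",
   "metadata": {},
   "source": "A minimal sketch, assuming a hypothetical `PageInfo` shape for a decoded JSON payload (the name and keys are made up for illustration). Nothing is checked when this runs: the value is an ordinary `dict`, and a misspelled key or wrong value type would only be reported by a static type checker."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0c7fa1e9",
   "metadata": {},
   "outputs": [],
   "source": "import json\nfrom typing import TypedDict\n\nclass PageInfo(TypedDict):     # hypothetical shape, for illustration only\n    title: str\n    word_count: int\n\nraw = json.loads('{\"title\": \"Hello\", \"word_count\": 42}')\npage: PageInfo = raw           # a type checker now knows the keys and value types\n\nprint(page[\"title\"], page[\"word_count\"])\nprint(type(page) is dict)      # True: at runtime it is a plain dict"
  },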
  {
   "cell_type": "markdown",
   "id": "ef5e7aa0",
   "metadata": {},
   "source": "## Choosing between the three\n\nA quick rule of thumb:\n\n- **Reach for `@dataclass`** for almost everything: mutable record types, value types (with `frozen=True`), anything with more than three fields, anything that needs `__post_init__` validation.\n- **Use `NamedTuple`** when the type is small, immutable, and tuple-like: coordinate pairs, key-value entries, points in time.\n- **Use `TypedDict`** when the data is genuinely a dict and you only need static type information.\n- **Hand-write the class** when none of the three fits, because you need behaviour that's hard to express through dataclass field declarations: heavy custom dunders, descriptors, complex constructor logic.\n\nThe [choose-between recipe](https://agilearn.co.uk/guides/classes-and-objects/recipes/choose-between-dataclass-namedtuple-class) goes deeper into the trade-offs."
  },
  {
   "cell_type": "markdown",
   "id": "dfde2437",
   "metadata": {},
   "source": "## Exercise\n\nDefine an `Order` dataclass for a small e-commerce system:\n\n- `id: str`\n- `customer: str`\n- `items: list[str]` (default to an empty list — remember `default_factory`)\n- `discount: float = 0.0`\n- A `__post_init__` that raises `ValueError` if `discount` is outside `0.0 <= discount <= 1.0`.\n- Make it `frozen=True` and `slots=True`.\n\nTest that an order with a 0.1 discount works, that the items field is independent across instances, and that a discount of 1.5 raises."
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d45261a3",
   "metadata": {},
   "outputs": [],
   "source": "# Your code here\n"
  },
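  {
   "cell_type": "markdown",
   "id": "feb4b933",
   "metadata": {},
   "source": "<details>\n<summary>Solution</summary>\n\n```python\nfrom dataclasses import dataclass, field\n\n@dataclass(frozen=True, slots=True)\nclass Order:\n    id: str\n    customer: str\n    items: list[str] = field(default_factory=list)\n    discount: float = 0.0\n\n    def __post_init__(self):\n        if not 0.0 <= self.discount <= 1.0:\n            raise ValueError(\"discount must be between 0 and 1\")\n\n# The three checks the exercise asks for.\nprint(Order(\"A1\", \"Ada\", discount=0.1))   # a valid discount works\n\na = Order(\"A2\", \"Bob\")\nb = Order(\"A3\", \"Cy\")\na.items.append(\"book\")\nprint(a.items, b.items)                   # b is unaffected\n\ntry:\n    Order(\"A4\", \"Dee\", discount=1.5)\nexcept ValueError as e:\n    print(f\"{type(e).__name__}: {e}\")\n\n# Note: frozen=True stops you rebinding `items` to a different list,\n# but the list itself can still be mutated, as `append` shows above.\n# frozen freezes the attribute binding, not the object it points to.\n```\n</details>"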
  },
  {
   "cell_type": "markdown",
   "id": "939a7604",
   "metadata": {},
   "source": "## Recap\n\n- `@dataclass` generates `__init__`, `__repr__`, and `__eq__` from an annotated field list.\n- Mutable defaults must use `field(default_factory=...)`.\n- `frozen=True` makes instances immutable and, as long as every field value is hashable, hashable too.\n- `slots=True` saves memory and refuses undeclared attributes.\n- `order=True` adds `__lt__` and friends; comparison follows declaration order.\n- `__post_init__` runs at the end of the generated `__init__`; use it for validation or derived fields.\n- For small immutable tuple-like types, `typing.NamedTuple` is even lighter.\n- For dict-shaped data, `typing.TypedDict` gives static type support.\n\nNext: [Inheritance and composition](https://agilearn.co.uk/guides/classes-and-objects/learn/04-inheritance-and-composition), where we'll see why Python programmers reach for inheritance less often than other-language programmers expect."
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}