{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Async and await\n",
    "\n",
    "`async`/`await` is Python's third concurrency tool, and the one that scales I/O-bound work the furthest. It runs **everything on a single thread**, with no GIL contention and no locks — yet it can juggle thousands of simultaneous connections. The trick is *cooperative* multitasking: tasks voluntarily hand control back whenever they hit a wait, and a scheduler called the **event loop** runs whichever task is ready.\n",
    "\n",
    "That cooperation is also the catch. A task only yields control at an `await`. If any piece of code runs for a long time *without* awaiting — a heavy computation, or a blocking call like `time.sleep` — it freezes every other task. The whole notebook comes down to: await the right things, and never block the loop.\n",
    "\n",
    "> Unlike threads and processes, these examples **do** run in the in-browser sandbox. Because the notebook already runs inside an event loop, the cells use `await main()` directly; in a normal `.py` script you'd write `asyncio.run(main())` instead — shown in comments throughout. The one exception is the final `asyncio.to_thread` example: it hands work to a real thread, so it's marked *\"runs locally only\"* and has no Run button."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Coroutines: `async def` and `await`\n",
    "\n",
    "A function defined with `async def` is a **coroutine function**. Calling it does *not* run the body — it returns a **coroutine object**, much like calling a generator function returns a generator. To actually run it, you `await` it (from inside another coroutine) or hand it to the event loop."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "execution_count": null,
   "outputs": [],
   "source": [
    "import asyncio\n",
    "\n",
    "async def greet(name):\n",
    "    print(f'hello, {name}')\n",
    "    await asyncio.sleep(1)        # yields control for ~1s instead of blocking\n",
    "    print(f'goodbye, {name}')\n",
    "    return name.upper()\n",
    "\n",
    "coro = greet('Ada')\n",
    "print(type(coro))                 # a coroutine object — body hasn't run yet\n",
    "\n",
    "result = await coro               # in a .py script: asyncio.run(greet('Ada'))\n",
    "print('returned:', result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The key line is `await asyncio.sleep(1)`. This is the async cousin of `time.sleep` — but where `time.sleep` *blocks the whole thread*, `asyncio.sleep` *yields* to the event loop, letting other tasks run during the second. **Inside async code you must use the async-aware versions of waits** (`asyncio.sleep`, async libraries' I/O calls); a stray `time.sleep` freezes everything."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## `asyncio.run` is the front door\n",
    "\n",
    "In a script, `asyncio.run(main())` starts the event loop, runs the `main()` coroutine to completion, and shuts the loop down. It's the single entry point from ordinary synchronous code into the async world. You call it **once**, at the top of your program.\n",
    "\n",
    "```python\n",
    "async def main():\n",
    "    await greet('Ada')\n",
    "\n",
    "if __name__ == '__main__':\n",
    "    asyncio.run(main())     # the one place sync code enters async code\n",
    "```\n",
    "\n",
    "Because this notebook is *already* inside a running loop, calling `asyncio.run` here would raise `RuntimeError: asyncio.run() cannot be called from a running event loop`. That's why the cells `await` directly instead."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Running things concurrently with `gather`\n",
    "\n",
    "Awaiting coroutines one after another is still sequential — each finishes before the next starts. To overlap them, hand several to **`asyncio.gather`**, which schedules them all on the loop at once and waits for them all, returning their results **in order**."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "execution_count": null,
   "outputs": [],
   "source": [
    "async def fetch(url):\n",
    "    print(f'start  {url}')\n",
    "    await asyncio.sleep(1)        # pretend this is a 1s network call\n",
    "    print(f'finish {url}')\n",
    "    return f'{url} -> ok'\n",
    "\n",
    "async def main():\n",
    "    import time\n",
    "    start = time.perf_counter()\n",
    "    results = await asyncio.gather(\n",
    "        fetch('A'), fetch('B'), fetch('C'),\n",
    "    )\n",
    "    print(results)\n",
    "    print(f'elapsed: {time.perf_counter() - start:.2f}s')   # ~1s, not 3s\n",
    "\n",
    "await main()                      # in a .py script: asyncio.run(main())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Three one-second waits finished in about one second — they overlapped. Notice all three `start` lines print before any `finish`: each `fetch` ran up to its `await`, yielded, and the loop moved on to the next."
   ]
  },
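  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The overlap depends entirely on the wait being awaitable. As a minimal sketch (the names `polite`, `rude` and `timed` are illustrative, not part of any library), the cell below runs the same three-way `gather` twice: once with `asyncio.sleep`, once with a blocking `time.sleep`. The blocking version should take roughly three times as long, because each call holds the thread until it returns and the loop never gets a chance to switch tasks."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "execution_count": null,
   "outputs": [],
   "source": [
    "import asyncio\n",
    "import time\n",
    "\n",
    "async def polite(name):\n",
    "    await asyncio.sleep(0.2)      # yields to the loop while it waits\n",
    "    return name\n",
    "\n",
    "async def rude(name):\n",
    "    time.sleep(0.2)               # blocks the thread; no other task can run\n",
    "    return name\n",
    "\n",
    "async def timed(worker):\n",
    "    # Run three copies of the worker through gather and time the batch.\n",
    "    start = time.perf_counter()\n",
    "    await asyncio.gather(worker('a'), worker('b'), worker('c'))\n",
    "    return time.perf_counter() - start\n",
    "\n",
    "async def main():\n",
    "    overlapped = await timed(polite)\n",
    "    serialised = await timed(rude)\n",
    "    print(f'asyncio.sleep: {overlapped:.2f}s')   # the three waits overlap\n",
    "    print(f'time.sleep:    {serialised:.2f}s')   # the three waits run back to back\n",
    "\n",
    "await main()                      # in a .py script: asyncio.run(main())"
   ]
  },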
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## `create_task` schedules work to run in the background\n",
    "\n",
    "`gather` is convenient when you have the list of coroutines up front. When you want a coroutine to **start running now** while you do other things, wrap it in a **task** with `asyncio.create_task`. The task begins on the next `await`; you collect its result by awaiting the task later."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "execution_count": null,
   "outputs": [],
   "source": [
    "async def main():\n",
    "    task = asyncio.create_task(fetch('background'))  # scheduled to run\n",
    "    print('task created; doing other work meanwhile')\n",
    "    await asyncio.sleep(0.2)\n",
    "    print('other work done; now waiting on the task')\n",
    "    result = await task                              # get its result\n",
    "    print(result)\n",
    "\n",
    "await main()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A coroutine you simply call and never `await` or wrap in a task **never runs** — and Python warns `coroutine was never awaited`. Creating a task is also how you get true overlap between a background job and foreground work."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## `TaskGroup`: the modern way to run a batch (Python 3.11+)\n",
    "\n",
    "`asyncio.TaskGroup` is the recommended structured replacement for `gather`. You create tasks inside an `async with` block; the block doesn't exit until they all finish. Its big advantage is **error handling**: if one task fails, the others are cancelled and the error propagates cleanly, so you never leak half-finished tasks."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "execution_count": null,
   "outputs": [],
   "source": [
    "async def main():\n",
    "    results = []\n",
    "    async with asyncio.TaskGroup() as tg:          # Python 3.11+\n",
    "        for url in ('A', 'B', 'C'):\n",
    "            tg.create_task(fetch(url))\n",
    "        # the 'async with' block exits only when all tasks are done\n",
    "    print('all tasks complete')\n",
    "\n",
    "await main()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On Python 3.10 or earlier, use `asyncio.gather` instead. For new code on 3.11+, prefer `TaskGroup` for anything beyond a one-off `gather`."
   ]
  },
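  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see that error handling in action, here is a small sketch. The `work` coroutine and its `fail` flag are made up for this example. One task raises after 0.1s; the `TaskGroup` should then cancel the slower sibling and raise an `ExceptionGroup`, which the `except*` syntax (also Python 3.11+) unpacks by exception type."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "execution_count": null,
   "outputs": [],
   "source": [
    "import asyncio\n",
    "\n",
    "async def work(name, delay, fail=False):\n",
    "    try:\n",
    "        await asyncio.sleep(delay)\n",
    "    except asyncio.CancelledError:\n",
    "        print(f'{name} was cancelled')\n",
    "        raise                               # always re-raise cancellation\n",
    "    if fail:\n",
    "        raise ValueError(f'{name} failed')\n",
    "    return name\n",
    "\n",
    "async def main():\n",
    "    try:\n",
    "        async with asyncio.TaskGroup() as tg:           # Python 3.11+\n",
    "            tg.create_task(work('fast-fail', 0.1, fail=True))\n",
    "            tg.create_task(work('slow', 1.0))\n",
    "    except* ValueError as group:                        # matches errors inside the ExceptionGroup\n",
    "        for exc in group.exceptions:\n",
    "            print('caught:', exc)\n",
    "\n",
    "await main()                      # in a .py script: asyncio.run(main())"
   ]
  },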
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bounded concurrency with a `Semaphore`\n",
    "\n",
    "\"Fetch 10,000 URLs concurrently\" rarely means *all at once* — you'd exhaust sockets or get rate-limited. A **`Semaphore`** caps how many tasks are in the critical section simultaneously: acquire before the work, release after, and only N proceed at a time."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "execution_count": null,
   "outputs": [],
   "source": [
    "async def fetch_limited(url, sem):\n",
    "    async with sem:                  # at most N of these run concurrently\n",
    "        await asyncio.sleep(0.5)\n",
    "        return f'{url} done'\n",
    "\n",
    "async def main():\n",
    "    sem = asyncio.Semaphore(3)       # cap at 3 in flight\n",
    "    urls = [f'url-{i}' for i in range(9)]\n",
    "    results = await asyncio.gather(*(fetch_limited(u, sem) for u in urls))\n",
    "    print(f'{len(results)} done')    # 9 tasks, 3 at a time -> ~1.5s\n",
    "\n",
    "await main()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Timeouts: don't let one slow task stall the batch\n",
    "\n",
    "Wrap an await in `asyncio.timeout` (3.11+) to cancel it if it runs too long; it raises `TimeoutError`. On older versions, `asyncio.wait_for(coro, timeout)` does the same job."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {},
   "execution_count": null,
   "outputs": [],
   "source": [
    "async def slow():\n",
    "    await asyncio.sleep(5)\n",
    "    return 'eventually'\n",
    "\n",
    "async def main():\n",
    "    try:\n",
    "        async with asyncio.timeout(1):     # 3.11+; else: await asyncio.wait_for(slow(), 1)\n",
    "            await slow()\n",
    "    except TimeoutError:\n",
    "        print('gave up after 1s')\n",
    "\n",
    "await main()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The cardinal rule: never block the loop\n",
    "\n",
    "Everything above relies on tasks yielding at `await`. A long **synchronous** call — `time.sleep`, a CPU-heavy loop, a blocking `requests.get`, a synchronous DB driver — doesn't yield, so it freezes *every* task until it returns. This is the single most common async bug.\n",
    "\n",
    "When you must call blocking code from async, push it onto a thread with **`asyncio.to_thread`**, which runs it in a thread pool and gives you back an awaitable. (For CPU-bound work, send it to a `ProcessPoolExecutor` via `loop.run_in_executor` instead — threads won't help there.)"
   ]
  },
  {
   "cell_type": "code",
   "metadata": {
    "tags": [
     "no-run"
    ]
   },
   "execution_count": null,
   "outputs": [],
   "source": [
    "import time\n",
    "\n",
    "def blocking_call(n):\n",
    "    time.sleep(1)            # a synchronous library you can't change\n",
    "    return n * 2\n",
    "\n",
    "async def main():\n",
    "    # WRONG: calling blocking_call(3) directly would freeze the loop for 1s.\n",
    "    # RIGHT: run it in a worker thread so the loop stays responsive.\n",
    "    result = await asyncio.to_thread(blocking_call, 3)\n",
    "    print('result:', result)\n",
    "\n",
    "await main()"
   ]
  },
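  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the CPU-bound case mentioned above, a rough sketch of the `ProcessPoolExecutor` route might look like the following. `math.factorial` is only a stand-in for a heavy computation, chosen because worker processes can import it by name; like the previous cell, this one *runs locally only*, since it needs real processes."
   ]
  },
  {
   "cell_type": "code",
   "metadata": {
    "tags": [
     "no-run"
    ]
   },
   "execution_count": null,
   "outputs": [],
   "source": [
    "import asyncio\n",
    "import math\n",
    "from concurrent.futures import ProcessPoolExecutor\n",
    "\n",
    "async def main():\n",
    "    loop = asyncio.get_running_loop()\n",
    "    with ProcessPoolExecutor() as pool:\n",
    "        # The computation runs in another process; the loop stays free meanwhile.\n",
    "        value = await loop.run_in_executor(pool, math.factorial, 2000)\n",
    "    print('bits in result:', value.bit_length())\n",
    "\n",
    "await main()   # in a .py script: asyncio.run(main()) under an if __name__ == '__main__' guard"
   ]
  },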
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Recap\n",
    "\n",
    "- `async def` defines a coroutine; calling it returns a coroutine object that does nothing until **awaited**.\n",
    "- `asyncio.run(main())` is the one entry point from sync code (in a notebook, `await main()`).\n",
    "- Use **async-aware waits** (`asyncio.sleep`, async I/O libraries) — a `time.sleep` freezes the loop.\n",
    "- Overlap work with `gather`, background it with `create_task`, and prefer **`TaskGroup`** (3.11+) for batches.\n",
    "- Bound fan-out with a `Semaphore`; bound duration with `asyncio.timeout`/`wait_for`.\n",
    "- **Never block the loop** — offload blocking calls with `asyncio.to_thread`.\n",
    "\n",
    "That completes the tour of all three tools. The [Recipes](https://agilearn.co.uk/guides/concurrency/recipes) put them to work on concrete tasks, and the [Concepts](https://agilearn.co.uk/guides/concurrency/concepts) explain the GIL and how to choose between the three for any given problem."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}