{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "intro",
   "metadata": {},
   "source": [
    "# Working with paths\n",
    "\n",
    "The `pathlib` module is the modern, object-oriented way to work with file system paths in Python. It provides a cleaner and more intuitive interface than string-based path manipulation, and it works consistently across different operating systems.\n",
    "\n",
    "**Time commitment:** 15–20 minutes\n",
    "\n",
    "**Prerequisites:**\n",
    "\n",
    "- Basic Python knowledge\n",
    "- Completion of [Reading files](https://agilearn.co.uk/guides/file-handling/learn/01-reading-files) and [Writing files](https://agilearn.co.uk/guides/file-handling/learn/02-writing-files)\n",
    "\n",
    "## Learning objectives\n",
    "\n",
    "By the end of this tutorial, you will be able to:\n",
    "\n",
    "- Create `Path` objects from strings\n",
    "- Combine paths using the `/` operator\n",
    "- Access path components (name, stem, suffix, parent)\n",
    "- Check whether files and directories exist\n",
    "- List directory contents\n",
    "- Create and remove directories"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "creating-heading",
   "metadata": {},
   "source": [
    "## Creating `Path` objects\n",
    "\n",
    "You create a `Path` object by passing a string to the `Path` constructor. The `Path` class also provides class methods for common locations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "creating-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "current = Path(\".\")\n",
    "home = Path.home()\n",
    "file_path = Path(\"data\") / \"reports\" / \"summary.txt\"\n",
    "\n",
    "print(f\"Current directory: {current.resolve()}\")\n",
    "print(f\"Home directory: {home}\")\n",
    "print(f\"File path: {file_path}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "combining-heading",
   "metadata": {},
   "source": [
    "## Combining paths with the `/` operator\n",
    "\n",
    "The `/` operator joins path components together. This is platform-independent &ndash; it uses the correct separator for each operating system."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "combining-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "base = Path(\"projects\")\n",
    "project = base / \"my-app\"\n",
    "config = project / \"config\" / \"settings.txt\"\n",
    "\n",
    "print(config)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "combining-explain",
   "metadata": {},
   "source": [
    "This is much cleaner and safer than concatenating strings with `\"/\"` or `\"\\\\\"`, which can break on different platforms."
   ]
  },
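  {
   "cell_type": "markdown",
   "id": "combining-separators-note",
   "metadata": {},
   "source": [
    "To see the separator handling without switching machines, here is a small sketch using `PurePosixPath` and `PureWindowsPath`. These are `pathlib` classes that build paths in a fixed style without touching the file system. The directory and file names are illustrative only."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "combining-separators-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import PurePosixPath, PureWindowsPath\n",
    "\n",
    "# Pure paths never touch the file system, so both styles work on any OS.\n",
    "posix_path = PurePosixPath(\"projects\") / \"my-app\" / \"settings.txt\"\n",
    "windows_path = PureWindowsPath(\"projects\") / \"my-app\" / \"settings.txt\"\n",
    "\n",
    "print(f\"POSIX style: {posix_path}\")\n",
    "print(f\"Windows style: {windows_path}\")\n",
    "\n",
    "# A hard-coded separator only matches one of the two conventions.\n",
    "hard_coded = \"projects/my-app/settings.txt\"\n",
    "print(f\"Hard-coded matches POSIX: {hard_coded == str(posix_path)}\")\n",
    "print(f\"Hard-coded matches Windows: {hard_coded == str(windows_path)}\")"
   ]
  },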
  {
   "cell_type": "markdown",
   "id": "components-heading",
   "metadata": {},
   "source": [
    "## Path components\n",
    "\n",
    "A `Path` object gives you easy access to the different parts of a file path."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "components-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "path = Path(\"documents\") / \"reports\" / \"annual-report.pdf\"\n",
    "\n",
    "print(f\"Full path: {path}\")\n",
    "print(f\"Name: {path.name}\")\n",
    "print(f\"Stem: {path.stem}\")\n",
    "print(f\"Suffix: {path.suffix}\")\n",
    "print(f\"Parent: {path.parent}\")\n",
    "print(f\"Parts: {path.parts}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "components-explain",
   "metadata": {},
   "source": [
    "Here is what each property returns:\n",
    "\n",
    "- `name` -- the final component of the path (file name with extension)\n",
    "- `stem` -- the file name without the extension\n",
    "- `suffix` -- the file extension (including the dot)\n",
    "- `parent` -- the directory containing the file\n",
    "- `parts` -- a tuple of all the path components"
   ]
  },
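  {
   "cell_type": "markdown",
   "id": "components-suffixes-note",
   "metadata": {},
   "source": [
    "File names with more than one extension are a common stumbling block. As a small supplementary sketch (the file name `backup.tar.gz` is made up for illustration), `suffix` returns only the last extension, while the `suffixes` property returns all of them as a list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "components-suffixes-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "# Hypothetical archive name with two extensions.\n",
    "archive = Path(\"downloads\") / \"backup.tar.gz\"\n",
    "\n",
    "print(f\"Name: {archive.name}\")\n",
    "print(f\"Suffix: {archive.suffix}\")\n",
    "print(f\"Suffixes: {archive.suffixes}\")\n",
    "print(f\"Stem: {archive.stem}\")"
   ]
  },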
  {
   "cell_type": "markdown",
   "id": "checking-heading",
   "metadata": {},
   "source": [
    "## Checking whether files and directories exist\n",
    "\n",
    "The `Path` class provides methods to check the status of a path."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "checking-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "Path(\"test-file.txt\").write_text(\"Hello\", encoding=\"utf-8\")\n",
    "\n",
    "print(f\"test-file.txt exists: {Path('test-file.txt').exists()}\")\n",
    "print(f\"test-file.txt is a file: {Path('test-file.txt').is_file()}\")\n",
    "print(f\"test-file.txt is a directory: {Path('test-file.txt').is_dir()}\")\n",
    "print(f\"nonexistent.txt exists: {Path('nonexistent.txt').exists()}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "resolve-heading",
   "metadata": {},
   "source": [
    "## Resolving and absolute paths\n",
    "\n",
    "The `resolve()` method converts a relative path to an absolute path. The `is_absolute()` method tells you whether a path is already absolute."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "resolve-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "relative = Path(\"test-file.txt\")\n",
    "absolute = relative.resolve()\n",
    "\n",
    "print(f\"Relative: {relative}\")\n",
    "print(f\"Absolute: {absolute}\")\n",
    "print(f\"Is absolute: {absolute.is_absolute()}\")"
   ]
  },
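  {
   "cell_type": "markdown",
   "id": "resolve-dotdot-note",
   "metadata": {},
   "source": [
    "As a further sketch, `resolve()` also tidies up `..` steps in a path. The `data` directory here is hypothetical and does not need to exist, because `resolve()` is not strict by default."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "resolve-dotdot-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "# A relative path with a redundant '..' step ('data' is a made-up name).\n",
    "messy = Path(\"data\") / \"..\" / \"test-file.txt\"\n",
    "resolved = messy.resolve()\n",
    "\n",
    "print(f\"As written: {messy}\")\n",
    "print(f\"Is absolute: {messy.is_absolute()}\")\n",
    "print(f\"Resolved: {resolved}\")\n",
    "print(f\"Still contains '..': {'..' in resolved.parts}\")"
   ]
  },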
  {
   "cell_type": "markdown",
   "id": "mkdir-heading",
   "metadata": {},
   "source": [
    "## Creating directories\n",
    "\n",
    "The `mkdir()` method creates a directory. Use `parents=True` to create all intermediate directories, and `exist_ok=True` to avoid errors if the directory already exists."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "mkdir-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "Path(\"example-dir\").mkdir(exist_ok=True)\n",
    "Path(\"nested/sub/dir\").mkdir(parents=True, exist_ok=True)\n",
    "\n",
    "print(f\"example-dir exists: {Path('example-dir').exists()}\")\n",
    "print(f\"nested/sub/dir exists: {Path('nested/sub/dir').exists()}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "listing-heading",
   "metadata": {},
   "source": [
    "## Listing directory contents\n",
    "\n",
    "The `iterdir()` method iterates over all items in a directory. The `glob()` method filters items by a pattern."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "listing-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "# Create some files in the example directory\n",
    "(Path(\"example-dir\") / \"file1.txt\").write_text(\"Content 1\", encoding=\"utf-8\")\n",
    "(Path(\"example-dir\") / \"file2.txt\").write_text(\"Content 2\", encoding=\"utf-8\")\n",
    "(Path(\"example-dir\") / \"data.csv\").write_text(\"a,b,c\", encoding=\"utf-8\")\n",
    "\n",
    "print(\"All items in example-dir:\")\n",
    "for item in sorted(Path(\"example-dir\").iterdir()):\n",
    "    print(f\"  {item.name} (file: {item.is_file()})\")\n",
    "\n",
    "print(\"\\nOnly .txt files:\")\n",
    "for item in sorted(Path(\"example-dir\").glob(\"*.txt\")):\n",
    "    print(f\"  {item.name}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "rename-heading",
   "metadata": {},
   "source": [
    "## Renaming and deleting\n",
    "\n",
    "The `rename()` method renames a file or directory, and the `unlink()` method deletes a file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "rename-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "Path(\"old-name.txt\").write_text(\"Some content\", encoding=\"utf-8\")\n",
    "Path(\"old-name.txt\").rename(\"new-name.txt\")\n",
    "\n",
    "print(f\"old-name.txt exists: {Path('old-name.txt').exists()}\")\n",
    "print(f\"new-name.txt exists: {Path('new-name.txt').exists()}\")\n",
    "\n",
    "Path(\"new-name.txt\").unlink()\n",
    "print(f\"After deletion: {Path('new-name.txt').exists()}\")"
   ]
  },
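  {
   "cell_type": "markdown",
   "id": "rmdir-note",
   "metadata": {},
   "source": [
    "The learning objectives also mention removing directories. Here is a minimal sketch, assuming you only need to remove an empty directory: `rmdir()` deletes a directory, but raises `OSError` if it still contains anything. The names `scratch-dir` and `note.txt` are hypothetical, and the example works inside a temporary directory so that it leaves nothing behind."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "rmdir-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "import tempfile\n",
    "from pathlib import Path\n",
    "\n",
    "with tempfile.TemporaryDirectory() as tmp:\n",
    "    scratch = Path(tmp) / \"scratch-dir\"\n",
    "    scratch.mkdir()\n",
    "    note = scratch / \"note.txt\"\n",
    "    note.write_text(\"temporary\", encoding=\"utf-8\")\n",
    "\n",
    "    try:\n",
    "        scratch.rmdir()  # Fails: the directory is not empty\n",
    "        removed_while_full = True\n",
    "    except OSError:\n",
    "        removed_while_full = False\n",
    "\n",
    "    note.unlink()  # Empty the directory first\n",
    "    scratch.rmdir()  # Now the removal succeeds\n",
    "    exists_after = scratch.exists()\n",
    "\n",
    "print(f\"Removed while it still held a file: {removed_while_full}\")\n",
    "print(f\"Exists after emptying and rmdir(): {exists_after}\")"
   ]
  },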
  {
   "cell_type": "markdown",
   "id": "withsuffix-heading",
   "metadata": {},
   "source": [
    "## Changing path components\n",
    "\n",
    "The `with_name()`, `with_stem()`, and `with_suffix()` methods return new `Path` objects with modified components."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "withsuffix-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "path = Path(\"documents\") / \"report.txt\"\n",
    "\n",
    "print(f\"Original: {path}\")\n",
    "print(f\"Change suffix: {path.with_suffix('.md')}\")\n",
    "print(f\"Change name: {path.with_name('notes.txt')}\")\n",
    "print(f\"Change stem: {path.with_stem('summary')}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "practical-heading",
   "metadata": {},
   "source": [
    "## Practical example: organising files by extension\n",
    "\n",
    "Here is a practical function that groups files in a directory by their extension."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "practical-demo",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "\n",
    "def list_files_by_extension(directory: str | Path) -> dict[str, list[str]]:\n",
    "    \"\"\"Group files in a directory by their extension.\n",
    "\n",
    "    Args:\n",
    "        directory: The path to the directory to scan.\n",
    "\n",
    "    Returns:\n",
    "        A dictionary mapping extensions to lists of file names.\n",
    "    \"\"\"\n",
    "    path = Path(directory)\n",
    "    result: dict[str, list[str]] = {}\n",
    "    for item in path.iterdir():\n",
    "        if item.is_file():\n",
    "            ext = item.suffix if item.suffix else \"(no extension)\"\n",
    "            result.setdefault(ext, []).append(item.name)\n",
    "    return result\n",
    "\n",
    "\n",
    "grouped = list_files_by_extension(\"example-dir\")\n",
    "for ext, files in sorted(grouped.items()):\n",
    "    print(f\"{ext}: {files}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "exercises-heading",
   "metadata": {},
   "source": [
    "## Exercises\n",
    "\n",
    "Try these exercises to practise what you have learned.\n",
    "\n",
    "**Exercise 1:** Write a function that takes a directory path and returns the total number of files (not directories) in it.\n",
    "\n",
    "**Exercise 2:** Write a function that takes a file path and returns a new path with a different extension (for example, change `.txt` to `.md`).\n",
    "\n",
    "**Exercise 3:** Create a directory structure `project/src` and `project/tests`, create a sample `.py` file in each, then list all `.py` files recursively using `rglob()`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "solutions-heading",
   "metadata": {},
   "source": [
    "### Solutions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "solution-1",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "\n",
    "def count_files(directory: str | Path) -> int:\n",
    "    \"\"\"Count the number of files in a directory.\n",
    "\n",
    "    Args:\n",
    "        directory: The path to the directory to count files in.\n",
    "\n",
    "    Returns:\n",
    "        The number of files (not directories) in the directory.\n",
    "    \"\"\"\n",
    "    path = Path(directory)\n",
    "    return sum(1 for item in path.iterdir() if item.is_file())\n",
    "\n",
    "\n",
    "print(f\"Files in example-dir: {count_files('example-dir')}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "solution-2",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "\n",
    "def change_extension(filepath: str | Path, new_extension: str) -> Path:\n",
    "    \"\"\"Return a new path with a different file extension.\n",
    "\n",
    "    Args:\n",
    "        filepath: The original file path.\n",
    "        new_extension: The new extension (including the dot).\n",
    "\n",
    "    Returns:\n",
    "        A new Path object with the extension changed.\n",
    "    \"\"\"\n",
    "    return Path(filepath).with_suffix(new_extension)\n",
    "\n",
    "\n",
    "original = Path(\"documents/report.txt\")\n",
    "changed = change_extension(original, \".md\")\n",
    "print(f\"Original: {original}\")\n",
    "print(f\"Changed: {changed}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "solution-3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "Path(\"project/src\").mkdir(parents=True, exist_ok=True)\n",
    "Path(\"project/tests\").mkdir(parents=True, exist_ok=True)\n",
    "\n",
    "(Path(\"project/src\") / \"main.py\").write_text(\n",
    "    'print(\"Hello\")\\n', encoding=\"utf-8\"\n",
    ")\n",
    "(Path(\"project/tests\") / \"test_main.py\").write_text(\n",
    "    \"import unittest\\n\", encoding=\"utf-8\"\n",
    ")\n",
    "\n",
    "print(\"All .py files:\")\n",
    "for py_file in sorted(Path(\"project\").rglob(\"*.py\")):\n",
    "    print(f\"  {py_file}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cleanup",
   "metadata": {},
   "outputs": [],
   "source": [
    "import shutil\n",
    "from pathlib import Path\n",
    "\n",
    "Path(\"test-file.txt\").unlink(missing_ok=True)\n",
    "for d in [\"example-dir\", \"nested\", \"project\"]:\n",
    "    if Path(d).exists():\n",
    "        shutil.rmtree(d)\n",
    "\n",
    "print(\"Temporary files and directories removed.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "summary",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "In this tutorial, you learned how to use `pathlib.Path` for file and directory operations. Here are the key takeaways:\n",
    "\n",
    "- `pathlib.Path` provides an object-oriented way to work with file paths\n",
    "- The `/` operator joins paths in a platform-independent way\n",
    "- Path components (`name`, `stem`, `suffix`, `parent`) let you inspect paths easily\n",
    "- `exists()`, `is_file()`, and `is_dir()` check the status of a path\n",
    "- `mkdir()` creates directories (use `parents=True` for nested directories)\n",
    "- `iterdir()` and `glob()` list directory contents\n",
    "- `with_suffix()`, `with_name()`, and `with_stem()` create modified path copies\n",
    "\n",
    "In the [next tutorial](https://agilearn.co.uk/guides/file-handling/learn/04-csv-and-json-files), you will learn how to work with CSV and JSON files."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.12.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}