CSV and JSON files
CSV and JSON are two of the most common formats for storing and exchanging structured data. Python provides built-in modules for working with both formats, making it straightforward to read, write, and process structured data.
Time commitment: 15–20 minutes
Prerequisites:
- Basic Python knowledge (dictionaries, lists)
- Completion of Reading files, Writing files, and Working with paths
Learning objectives
By the end of this tutorial, you will be able to:
- Read CSV files using csv.reader()
- Write CSV files using csv.writer()
- Use csv.DictReader() and csv.DictWriter() for dictionary-based access
- Read JSON files using json.load()
- Write JSON files using json.dump()
- Convert between Python objects and JSON strings
Part 1: CSV files
CSV (Comma-Separated Values) files store tabular data as plain text. Each line represents a row, and values within a row are separated by commas.
Reading CSV files
First, create a sample CSV file. Then read it back using csv.reader().
from pathlib import Path
csv_content = """name,age,city
Alice,30,London
Bob,25,Manchester
Charlie,35,Edinburgh"""
Path("people.csv").write_text(csv_content, encoding="utf-8")
print("CSV file created.")

import csv

with open("people.csv", "r", encoding="utf-8", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
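Each row is returned as a list of strings. Note the newline="" argument: the csv module documentation recommends it whenever you open a CSV file, so that the module can handle line endings itself. Without it, line breaks inside quoted fields may be misread, and on Windows an extra carriage return can be written after each row.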
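To make the effect of newline="" concrete, the sketch below round-trips a field that contains a line break. The file name notes.csv and the sample rows are invented for illustration, and the file lives in a temporary directory so nothing is left behind. Treat it as a minimal demonstration, not a full account of line-ending handling.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical sample data: the second field of the data row contains a line break.
rows = [["name", "note"], ["Alice", "first line\nsecond line"]]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.csv"  # illustrative file name

    # newline="" lets the csv module write its own line endings unchanged.
    with open(path, "w", encoding="utf-8", newline="") as f:
        csv.writer(f).writerows(rows)

    # newline="" on reading keeps the embedded line break inside the quoted field.
    with open(path, "r", encoding="utf-8", newline="") as f:
        loaded = list(csv.reader(f))

print(loaded == rows)
```

The writer quotes the multi-line field automatically, and the reader reassembles it into a single value, so the data read back should match what was written.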
Skipping the header row
The first row of a CSV file typically contains column names. Use next() to read and skip the header.
import csv

with open("people.csv", "r", encoding="utf-8", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    print(f"Columns: {header}")
    for row in reader:
        print(f"{row[0]} is {row[1]} years old and lives in {row[2]}")
Using csv.DictReader() for named access
csv.DictReader() automatically uses the first row as field names. Each row is returned as a dictionary, which makes the code easier to read.
import csv

with open("people.csv", "r", encoding="utf-8", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(f"{row['name']} is {row['age']} years old")
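One detail worth keeping in mind: the csv module returns every value as a string, including numbers such as age. The sketch below converts the age column to int before doing arithmetic. It uses an in-memory io.StringIO copy of the sample data (an assumption made so the example runs on its own), and the variable names are illustrative.

```python
import csv
import io

# In-memory stand-in for people.csv so the sketch is self-contained.
sample = io.StringIO("name,age,city\nAlice,30,London\nBob,25,Manchester\n")

reader = csv.DictReader(sample)

# Every value arrives as a string, so convert "age" explicitly.
people = [{**row, "age": int(row["age"])} for row in reader]

average_age = sum(person["age"] for person in people) / len(people)
print(average_age)  # 27.5
```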
Writing CSV files
Use csv.writer() to write data to a CSV file. The writerow() method writes a single row, and writerows() writes multiple rows at once.
import csv
from pathlib import Path

data = [
    ["product", "price", "quantity"],
    ["Apples", "1.50", "10"],
    ["Bread", "2.00", "5"],
    ["Milk", "1.20", "8"],
]

with open("products.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(data)

print(Path("products.csv").read_text(encoding="utf-8"))
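The example above uses writerows(). For completeness, here is a small sketch of writerow(), which writes one row per call. It writes to an in-memory io.StringIO buffer instead of a file (an assumption made to keep the example self-contained) and includes a value containing a comma to show that the writer adds quotes where needed.

```python
import csv
import io

# In-memory buffer standing in for a file opened with newline="".
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)

writer.writerow(["product", "price"])    # header row
writer.writerow(["Bread, sliced", 2.0])  # the comma forces quoting; 2.0 is converted with str()

output = buffer.getvalue()
print(output)
```

By default the writer ends each row with a carriage return and line feed, which is why the file should be opened with newline="" so that Python does not translate those line endings a second time.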
Writing CSV files with csv.DictWriter()
csv.DictWriter() writes dictionaries to a CSV file. You specify the field names, and the writer outputs the columns in that order.
import csv
from pathlib import Path

people = [
    {"name": "Diana", "age": 28, "city": "Bristol"},
    {"name": "Edward", "age": 42, "city": "Leeds"},
]

with open("new-people.csv", "w", encoding="utf-8", newline="") as f:
    fieldnames = ["name", "age", "city"]
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(people)

print(Path("new-people.csv").read_text(encoding="utf-8"))
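The fieldnames list controls both which columns are written and their order, regardless of the key order in each dictionary. The sketch below illustrates this with an in-memory buffer (an assumption, to avoid creating another file). It also passes extrasaction="ignore", because by default DictWriter raises ValueError when a dictionary contains a key that is not in fieldnames.

```python
import csv
import io

buffer = io.StringIO(newline="")

# Only "city" and "name" are written, in that order; "age" is silently skipped.
writer = csv.DictWriter(buffer, fieldnames=["city", "name"], extrasaction="ignore")
writer.writeheader()
writer.writerow({"name": "Diana", "age": 28, "city": "Bristol"})

lines = buffer.getvalue().splitlines()
print(lines)  # ['city,name', 'Bristol,Diana']
```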
Part 2: JSON files
JSON (JavaScript Object Notation) is a lightweight data format that is easy for both humans and machines to read. It maps naturally to Python dictionaries and lists.
Reading JSON files
Use json.load() to read JSON data from a file. The data is automatically converted to Python objects: dictionaries, lists, strings, numbers, booleans, and None (for JSON null).
from pathlib import Path

json_content = """{
    "name": "Alice",
    "age": 30,
    "city": "London",
    "hobbies": ["reading", "cycling", "cooking"]
}"""

Path("person.json").write_text(json_content, encoding="utf-8")
print("JSON file created.")

import json
from pathlib import Path

with Path("person.json").open("r", encoding="utf-8") as f:
    data = json.load(f)

print(data)
print(f"Name: {data['name']}")
print(f"Hobbies: {', '.join(data['hobbies'])}")
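To see how JSON values map to Python types, the sketch below loads a small document and records the type of each value. The keys and values are invented for illustration, and an io.StringIO object stands in for an open file so the example runs without touching the disk.

```python
import io
import json

# Hypothetical JSON document covering the basic JSON value types.
text = '{"active": true, "nickname": null, "score": 9.5, "visits": 3, "tags": ["a", "b"]}'

# json.load() accepts any object with a read() method, including StringIO.
data = json.load(io.StringIO(text))

types = {key: type(value).__name__ for key, value in data.items()}
print(types)
```

JSON true becomes True, null becomes None, numbers with a fractional part become float, whole numbers become int, and arrays become lists.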
Writing JSON files
Use json.dump() to write Python objects to a JSON file. The indent parameter produces human-readable output.
import json
from pathlib import Path

student = {
    "name": "Bob",
    "age": 25,
    "subjects": ["Mathematics", "Physics", "Chemistry"],
    "graduated": False,
}

with Path("student.json").open("w", encoding="utf-8") as f:
    json.dump(student, f, indent=4)

print(Path("student.json").read_text(encoding="utf-8"))
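Besides indent, json.dump() accepts other formatting options. The sketch below shows two that may be useful: ensure_ascii=False keeps non-ASCII characters readable instead of escaping them, and sort_keys=True writes keys in alphabetical order. The sample record is made up, and in-memory buffers are used in place of files for brevity.

```python
import io
import json

author = {"name": "Charlotte Brontë", "born": 1816}  # illustrative data

# Default settings: non-ASCII characters are written as \uXXXX escapes.
default_buffer = io.StringIO()
json.dump(author, default_buffer)
print(default_buffer.getvalue())

# ensure_ascii=False keeps "ë" as-is; sort_keys=True orders the keys alphabetically.
readable_buffer = io.StringIO()
json.dump(author, readable_buffer, ensure_ascii=False, sort_keys=True)
print(readable_buffer.getvalue())
```

When writing non-ASCII output to a real file with ensure_ascii=False, opening the file with encoding="utf-8" (as the examples in this tutorial do) keeps the result portable.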
Working with JSON strings
The json module also provides functions for working with JSON strings directly, without files:
- json.loads(): parse a JSON string into Python objects
- json.dumps(): convert Python objects to a JSON string
import json
json_string = '{"temperature": 21.5, "unit": "Celsius"}'
data = json.loads(json_string)
print(data)
print(type(data))
back_to_string = json.dumps(data, indent=2)
print(back_to_string)
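When the input is not valid JSON, json.loads() raises json.JSONDecodeError. The sketch below shows one way to handle that. The malformed string (single quotes instead of double quotes) is a made-up example, and how you respond to the error depends on your application.

```python
import json

# Invalid JSON: strings and keys must use double quotes, not single quotes.
bad_string = "{'temperature': 21.5}"

try:
    data = json.loads(bad_string)
    parsed_ok = True
except json.JSONDecodeError as exc:
    parsed_ok = False
    # lineno, colno, and msg describe where and why parsing failed.
    error_message = f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    print(error_message)
```

json.JSONDecodeError is a subclass of ValueError, so existing code that catches ValueError will also catch it.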
Working with lists of objects
JSON files often contain lists of objects. Reading and writing these works the same way.
import json
from pathlib import Path

books = [
    {"title": "Pride and Prejudice", "author": "Jane Austen", "year": 1813},
    {"title": "1984", "author": "George Orwell", "year": 1949},
    {"title": "Great Expectations", "author": "Charles Dickens", "year": 1861},
]

with Path("books.json").open("w", encoding="utf-8") as f:
    json.dump(books, f, indent=4)

with Path("books.json").open("r", encoding="utf-8") as f:
    loaded_books = json.load(f)

for book in loaded_books:
    print(f"{book['title']} by {book['author']} ({book['year']})")
Exercises
Try these exercises to practise what you have learned.
Exercise 1: Create a CSV file with a list of five books (title, author, year), then read it back using csv.DictReader() and print each book.
Exercise 2: Write a function that reads a JSON file containing a list of dictionaries and returns only items where a given key matches a given value.
Exercise 3: Write a function that converts a CSV file to a JSON file.
Solutions
# Solution to Exercise 1: write five books to a CSV file, then read them back.
import csv

books_data = [
    {"title": "Emma", "author": "Jane Austen", "year": "1815"},
    {"title": "Dracula", "author": "Bram Stoker", "year": "1897"},
    {"title": "Frankenstein", "author": "Mary Shelley", "year": "1818"},
    {"title": "Jane Eyre", "author": "Charlotte Brontë", "year": "1847"},
    {"title": "Wuthering Heights", "author": "Emily Brontë", "year": "1847"},
]

with open("books.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "author", "year"])
    writer.writeheader()
    writer.writerows(books_data)

with open("books.csv", "r", encoding="utf-8", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(f"{row['title']} by {row['author']} ({row['year']})")

# Solution to Exercise 2: filter a JSON list of dictionaries by key and value.
# Uses books.json from "Working with lists of objects".
import json
from pathlib import Path


def filter_json(
    filepath: str | Path, key: str, value: object
) -> list[dict]:
    """Read a JSON file and return items matching a key-value pair.

    Args:
        filepath: The path to the JSON file.
        key: The key to filter on.
        value: The value to match.

    Returns:
        A list of dictionaries where the key matches the value.
    """
    path = Path(filepath)
    with path.open("r", encoding="utf-8") as f:
        data = json.load(f)
    return [item for item in data if item.get(key) == value]


results = filter_json("books.json", "author", "Jane Austen")
for book in results:
    print(book)

# Solution to Exercise 3: convert a CSV file to a JSON file.
import csv
import json
from pathlib import Path


def csv_to_json(csv_path: str | Path, json_path: str | Path) -> None:
    """Convert a CSV file to a JSON file.

    Each row in the CSV becomes a dictionary in a JSON array.

    Args:
        csv_path: The path to the input CSV file.
        json_path: The path to the output JSON file.
    """
    with open(csv_path, "r", encoding="utf-8", newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
    with Path(json_path).open("w", encoding="utf-8") as f:
        json.dump(rows, f, indent=4)


csv_to_json("books.csv", "books-converted.json")
print(Path("books-converted.json").read_text(encoding="utf-8"))
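Because CSV values are always read as strings, a plain CSV-to-JSON conversion writes numbers such as the year as JSON strings. The sketch below is one possible variant that casts selected columns to int first. The function name csv_to_typed_json and its int_fields parameter are hypothetical, not part of the exercise, and the demonstration runs in a temporary directory with made-up data.

```python
import csv
import json
import tempfile
from pathlib import Path


def csv_to_typed_json(csv_path, json_path, int_fields):
    """Hypothetical variant of csv_to_json that casts the named columns to int."""
    with open(csv_path, "r", encoding="utf-8", newline="") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        for field in int_fields:
            row[field] = int(row[field])  # raises ValueError if the value is not a whole number
    with Path(json_path).open("w", encoding="utf-8") as f:
        json.dump(rows, f, indent=4)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "books.csv"
    target = Path(tmp) / "books.json"
    source.write_text("title,year\nEmma,1815\n", encoding="utf-8")

    csv_to_typed_json(source, target, int_fields=["year"])
    converted = json.loads(target.read_text(encoding="utf-8"))

print(converted)  # [{'title': 'Emma', 'year': 1815}]
```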

# Clean up: remove the files created in this tutorial.
from pathlib import Path

for filename in [
    "people.csv", "products.csv", "new-people.csv",
    "person.json", "student.json", "books.json",
    "books.csv", "books-converted.json",
]:
    Path(filename).unlink(missing_ok=True)

print("Temporary files removed.")
Summary
In this tutorial, you learned how to work with two of the most common structured data formats. Here are the key takeaways:
- csv.reader() and csv.writer() handle CSV data as lists of strings
- csv.DictReader() and csv.DictWriter() provide dictionary-based access for cleaner code
- Always use newline="" when opening CSV files
- json.load() and json.dump() read and write JSON files
- json.loads() and json.dumps() work with JSON strings
- The indent parameter makes JSON output human-readable
- Always specify encoding="utf-8" for consistent behaviour
Congratulations! You have completed all four tutorials. You now have a solid foundation in file handling with Python. Explore the Recipes for more advanced techniques, or consult the Reference section for detailed technical documentation.