Reading files
Reading files is one of the most fundamental and practical skills you will learn in Python. Whether you need to process data, read configuration, or analyse text, file reading is where it all begins.
Time commitment: 15–20 minutes
Prerequisites:
- Basic Python knowledge (variables, strings, lists, the print() function)
Learning objectives
By the end of this tutorial, you will be able to:
- Open text files using the built-in open() function
- Use with statements to manage files safely
- Read entire file contents with the read() method
- Read files line by line for memory efficiency
- Use pathlib.Path for a modern approach to file reading
Setting up a sample file
Before you can read a file, you need a file to read. The following code creates a sample text file that you will use throughout this tutorial.
from pathlib import Path
sample_content = """Line one of the sample file.
Line two has some different text.
Line three is the final line."""
Path("sample.txt").write_text(sample_content, encoding="utf-8")
print("Sample file created successfully.")
Opening a file with open()
The built-in open() function is the primary way to open files in Python. It returns a file object that you can use to read or write data.
The recommended way to open a file is with a with statement. This ensures the file is closed automatically when you are finished, even if an error occurs.
with open("sample.txt", "r", encoding="utf-8") as f:
    content = f.read()
print(content)
Here is what happened in that code:
- The with statement opens the file and assigns the file object to f
- "r" means read mode, which is the default mode
- encoding="utf-8" ensures consistent behaviour across different operating systems
- The file is automatically closed when the with block ends
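To make the automatic closing concrete, here is a minimal sketch that checks the file object's closed attribute after a normal with block and after one that raises. The file name demo.txt, the temporary directory, and the simulated RuntimeError are assumptions made purely for illustration.

```python
import tempfile
from pathlib import Path

# Work in a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.txt"  # hypothetical file name
    demo_path.write_text("hello\n", encoding="utf-8")

    # Normal case: the file is closed as soon as the block ends.
    with open(demo_path, "r", encoding="utf-8") as f:
        f.read()
    closed_after_block = f.closed

    # Error case: the file is still closed, even though the block raised.
    try:
        with open(demo_path, "r", encoding="utf-8") as g:
            raise RuntimeError("simulated failure while reading")
    except RuntimeError:
        pass
    closed_after_error = g.closed

print(closed_after_block)  # True
print(closed_after_error)  # True
```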
You should always specify the encoding explicitly. Without it, Python uses the platform default, which varies between operating systems and can lead to unexpected results.
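As a rough illustration of what "unexpected results" can look like, the sketch below writes a word as UTF-8 and then reads it back with a different encoding. The file name accent.txt and the choice of cp1252 as the mismatched encoding are assumptions for the demonstration; the default on your machine may differ.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "accent.txt"  # hypothetical file name
    demo_path.write_text("café", encoding="utf-8")

    # Reading with the encoding that was used for writing works as expected.
    with open(demo_path, "r", encoding="utf-8") as f:
        right = f.read()

    # Reading the same bytes as cp1252 silently produces the wrong characters.
    with open(demo_path, "r", encoding="cp1252") as f:
        wrong = f.read()

print(right)  # café
print(wrong)  # cafÃ©
```

No error is raised in the second case, which is why a mismatched encoding can go unnoticed until the text is displayed.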
Reading the entire file with read()
The read() method reads the entire file content as a single string. This is convenient for small files, but it is not ideal for very large files because the entire content must fit in memory.
with open("sample.txt", "r", encoding="utf-8") as f:
    content = f.read()
print(type(content))
print(repr(content))
Notice that read() returns a single string containing the entire file, including newline characters (\n).
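When a file is too large to hold in memory at once, one possible alternative is to pass a size to read() and process the file in pieces. The sketch below assumes a hypothetical demo.txt and a deliberately tiny chunk size of 8 characters so that the chunks are easy to see; a real program would use a much larger value.

```python
import tempfile
from pathlib import Path

CHUNK_SIZE = 8  # characters per read; deliberately tiny for the demo

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.txt"  # hypothetical file name
    demo_path.write_text("abcdefghijklmnopqrst", encoding="utf-8")

    chunks = []
    with open(demo_path, "r", encoding="utf-8") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:  # read() returns "" at the end of the file
                break
            chunks.append(chunk)

print(chunks)  # ['abcdefgh', 'ijklmnop', 'qrst']
```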
Reading all lines with readlines()
The readlines() method returns a list of strings, one for each line in the file. Each string keeps its trailing newline character (\n); the only exception is the last line when the file does not end with a newline, as is the case for sample.txt.
with open("sample.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()
print(lines)
print(f"Number of lines: {len(lines)}")
Most of these strings still carry their trailing newline character. You can remove it using the strip() method with a list comprehension. Note that strip() also removes any other leading and trailing whitespace.
with open("sample.txt", "r", encoding="utf-8") as f:
    lines = [line.strip() for line in f.readlines()]
print(lines)
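If leading whitespace matters (for example, indented text), rstrip("\n") may be a better fit than strip(), because it removes only the newline. The following sketch compares the two on a hypothetical file, indented.txt, whose contents are made up for this example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "indented.txt"  # hypothetical file name
    demo_path.write_text("top\n    indented\nlast", encoding="utf-8")

    with open(demo_path, "r", encoding="utf-8") as f:
        raw = f.readlines()

# strip() removes all leading and trailing whitespace, including indentation.
stripped = [line.strip() for line in raw]
# rstrip("\n") removes only the trailing newline and keeps the indentation.
newline_only = [line.rstrip("\n") for line in raw]

print(raw)           # ['top\n', '    indented\n', 'last']
print(stripped)      # ['top', 'indented', 'last']
print(newline_only)  # ['top', '    indented', 'last']
```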
Reading line by line
You can iterate directly over a file object to read it line by line. This is the most memory-efficient approach, and the recommended one for large files, because Python reads the file incrementally and your code holds only the current line at a time.
with open("sample.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line.strip())
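Line-by-line iteration combines naturally with enumerate() when you need line numbers, for instance to report where matching lines occur. This is a small sketch under assumed inputs: the file name events.log, its contents, and the "error" prefix are all invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "events.log"  # hypothetical file name
    log_path.write_text(
        "start\nerror: disk full\nok\nerror: timeout\n", encoding="utf-8"
    )

    matches = []
    with open(log_path, "r", encoding="utf-8") as f:
        # enumerate(..., start=1) gives human-friendly line numbers.
        for number, line in enumerate(f, start=1):
            if line.startswith("error"):
                matches.append((number, line.rstrip("\n")))

print(matches)  # [(2, 'error: disk full'), (4, 'error: timeout')]
```

Only the matching lines are kept in the list, so memory use stays small even if the file itself is large.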
Using readline() for one line at a time
The readline() method reads a single line each time it is called and returns an empty string once the end of the file is reached. This is useful when you only need the first few lines of a file.
with open("sample.txt", "r", encoding="utf-8") as f:
    first_line = f.readline()
    second_line = f.readline()
print(f"First line: {first_line.strip()}")
print(f"Second line: {second_line.strip()}")
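One detail worth seeing in action is how readline() signals the end of a file: it returns an empty string, whereas a blank line inside the file comes back as "\n". The sketch below uses a hypothetical file, blank.txt, with a blank second line to show the difference.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "blank.txt"  # hypothetical file name
    demo_path.write_text("first\n\nthird\n", encoding="utf-8")

    lines_read = []
    with open(demo_path, "r", encoding="utf-8") as f:
        while True:
            line = f.readline()
            if line == "":  # an empty string means end of file
                break
            lines_read.append(line)
        # Calling readline() again at the end of the file is harmless.
        after_end = f.readline()

print(lines_read)       # ['first\n', '\n', 'third\n']
print(repr(after_end))  # ''
```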
A modern approach with pathlib
The pathlib module provides an object-oriented approach to file system paths. The Path class offers convenient methods for reading files.
Path.read_text() is the simplest way to read an entire text file. It opens the file, reads it, and closes it – all in one step.
from pathlib import Path
content = Path("sample.txt").read_text(encoding="utf-8")
print(content)
For more control, you can use Path.open(), which works like the built-in open() but is called on a Path object.
from pathlib import Path
path = Path("sample.txt")
with path.open("r", encoding="utf-8") as f:
    for line in f:
        print(line.strip())
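For small files, Path.read_text() can also be combined with the string method splitlines() to get a list of lines without trailing newlines. This is one possible shortcut rather than a replacement for line-by-line iteration, since the whole file is still read into memory. The file name notes.txt and its contents are assumptions for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    notes_path = Path(tmp) / "notes.txt"  # hypothetical file name
    notes_path.write_text("alpha\nbeta\ngamma\n", encoding="utf-8")

    # Read the whole file, then split it into lines without the newlines.
    lines = notes_path.read_text(encoding="utf-8").splitlines()

print(lines)  # ['alpha', 'beta', 'gamma']
```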
Exercises
Try these exercises to practise what you have learned.
Exercise 1: Create a text file called greeting.txt containing three lines (your name, your favourite colour, and your favourite food), then read the file and print each line.
Exercise 2: Write a function called count_lines that takes a file path and returns the number of lines in the file. Use type hints.
Exercise 3: Write a function called find_longest_line that takes a file path and returns the longest line (stripped of whitespace). Use pathlib.Path.
Solutions
# Solution to Exercise 1
from pathlib import Path
Path("greeting.txt").write_text(
    "Alice\nBlue\nPasta\n", encoding="utf-8"
)
with open("greeting.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line.strip())
# Solution to Exercise 2
from pathlib import Path
def count_lines(filepath: str | Path) -> int:
    """Count the number of lines in a text file.

    Args:
        filepath: The path to the file to count lines in.

    Returns:
        The number of lines in the file.
    """
    path = Path(filepath)
    with path.open("r", encoding="utf-8") as f:
        return sum(1 for _ in f)
result = count_lines("sample.txt")
print(f"The file has {result} lines.")
# Solution to Exercise 3
from pathlib import Path
def find_longest_line(filepath: str | Path) -> str:
    """Find the longest line in a text file.

    Args:
        filepath: The path to the file to search.

    Returns:
        The longest line with leading and trailing whitespace removed,
        or an empty string if the file is empty.
    """
    path = Path(filepath)
    with path.open("r", encoding="utf-8") as f:
        # default="" avoids a ValueError when the file has no lines.
        return max((line.strip() for line in f), key=len, default="")
longest = find_longest_line("sample.txt")
print(f"Longest line: {longest}")
# Clean up: remove the files created during this tutorial
from pathlib import Path
Path("sample.txt").unlink(missing_ok=True)
Path("greeting.txt").unlink(missing_ok=True)
print("Temporary files removed.")
Summary
In this tutorial, you learned how to read text files in Python. Here are the key takeaways:
- The open() function opens files for reading with the "r" mode
- with statements ensure files are closed properly, even if an error occurs
- read() reads the entire file as a single string
- readlines() returns a list of lines, each keeping its trailing newline if it has one
- Iterating over a file object reads line by line, which is memory-efficient
- pathlib.Path.read_text() is the simplest way to read a text file
- Always specify encoding="utf-8" for consistent behaviour across platforms
In the next tutorial, you will learn how to write and append content to files.