Your first pattern¶
In this tutorial, you will write your first regular expression pattern and learn how to use Python's re module to search for text. By the end, you will be comfortable using re.search(), re.match(), and re.fullmatch() to find patterns in strings.
Time commitment: 15–20 minutes
Prerequisites:
- Basic Python knowledge (strings, variables, and functions)
- Python 3.12 or later installed
Learning objectives¶
By the end of this tutorial, you will be able to:
- Import and use the re module
- Search for literal text using re.search()
- Understand the difference between re.search(), re.match(), and re.fullmatch()
- Work with match objects to extract matched text
- Use raw strings for regex patterns
- Use re.compile() for reusable patterns
Getting started with the re module¶
Python includes a powerful regular expression module called re in its standard library. You do not need to install anything — simply import it.
import re
Searching for literal text¶
The simplest type of regular expression is a literal pattern — a pattern that matches exact text. The re.search() function scans through a string looking for the first location where the pattern matches.
result = re.search(r'hello', 'Say hello to the world')
print(result)
The re.search() function returns a match object when it finds a match, or None when it does not. The match object tells you where the match was found and what text was matched.
Let us look at what happens when a pattern is not found.
result = re.search(r'goodbye', 'Say hello to the world')
print(result)
When no match is found, re.search() returns None. This makes it easy to use in conditional statements.
text = 'The quick brown fox jumps over the lazy dog'
if re.search(r'fox', text):
    print('Found "fox" in the text!')
else:
    print('"fox" was not found.')
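The conditional above works because a match object is always truthy and None is falsy. If you prefer to make the None test explicit, the following is a minimal sketch, assuming Python 3.8 or later for the := operator. The helper name find_word is hypothetical and not part of the re module.

```python
import re

def find_word(word, text):
    """Return the matched text, or None when `word` does not occur in `text`.

    Hypothetical helper for illustration. `word` is used as a literal
    pattern, so it should not contain regex metacharacters.
    """
    if (match := re.search(word, text)) is not None:
        return match.group()
    return None

text = 'The quick brown fox jumps over the lazy dog'
print(find_word('fox', text))   # fox
print(find_word('cat', text))   # None
```

Returning None for "not found" mirrors what re.search() itself does, so callers can use the same if-check on the helper's result.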
Working with match objects¶
When re.search() finds a match, it returns a match object that contains useful information. You can use the following methods:
- .group(): returns the matched text
- .start(): returns the start position of the match
- .end(): returns the end position of the match
- .span(): returns a tuple of (start, end) positions
text = 'My postcode is SW1A 1AA'
match = re.search(r'SW1A', text)
if match:
    print(f'Matched text: {match.group()}')
    print(f'Start position: {match.start()}')
    print(f'End position: {match.end()}')
    print(f'Span: {match.span()}')
Notice that the start position is inclusive and the end position is exclusive, just like Python string slicing. You can verify this:
text = 'My postcode is SW1A 1AA'
match = re.search(r'SW1A', text)
if match:
    start, end = match.span()
    print(f'Using string slicing: "{text[start:end]}"')
    print(f'Using .group(): "{match.group()}"')
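It is worth remembering that re.search() reports only the first location where the pattern matches. The short sketch below uses a made-up sentence in which the word appears twice, to show that the positions refer to the first occurrence only.

```python
import re

text = 'one fish, two fish'
match = re.search(r'fish', text)

# Only the first 'fish' is reported, even though the word appears twice.
print(match.span())                      # (4, 8)
print(text[match.start():match.end()])   # fish
print(text.count('fish'))                # 2
```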
Why use raw strings?¶
You may have noticed that the patterns above use r'...' syntax. This creates a raw string in Python. Raw strings treat backslashes as literal characters rather than escape sequences.
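This matters because regular expressions use backslashes extensively. For example, \d matches any digit. In a normal string, Python processes backslash escapes before the re module ever sees the pattern: a sequence such as \b silently becomes a backspace character, and an unrecognised one such as \d is kept as written but triggers a warning in recent Python versions.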
# With a raw string (recommended)
print(r'\d+') # Python sees: \d+
# Without a raw string (not recommended for regex)
print('\\d+') # You need to double the backslash
Always use raw strings for regex patterns. It makes your patterns easier to read and avoids subtle bugs.
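To see the kind of subtle bug a raw string avoids, here is a small sketch using \b, which is regex syntax for a word boundary. It is used here only as an illustration, and the sample text is made up.

```python
import re

# In a normal string, '\b' is the backspace character (one character).
# In a raw string, r'\b' is a backslash followed by 'b' (two characters),
# which the re module reads as a word boundary.
print(len('\b'))    # 1
print(len(r'\b'))   # 2

text = 'a cat sat'
print(re.search('\bcat\b', text))            # None
print(re.search(r'\bcat\b', text).group())   # cat
```

The first search fails without any error message, because the pattern it receives contains literal backspace characters that do not occur in the text.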
Let us see a practical example. The pattern \d matches any digit character (0–9).
text = 'Order number: 42'
match = re.search(r'\d', text)
if match:
    print(f'First digit found: {match.group()}')
re.search() versus re.match() versus re.fullmatch()¶
The re module provides three functions for checking whether a pattern matches a string. Each behaves differently:
| Function | Behaviour |
|---|---|
| re.search() | Scans the entire string for the first match anywhere |
| re.match() | Checks for a match only at the beginning of the string |
| re.fullmatch() | Checks whether the entire string matches the pattern |
Let us see how they differ.
text = 'hello world'
# re.search() finds 'world' anywhere in the string
print('search for "world":', re.search(r'world', text))
# re.match() only checks the beginning, so 'world' is not found
print('match for "world":', re.match(r'world', text))
# re.match() finds 'hello' because it is at the beginning
print('match for "hello":', re.match(r'hello', text))
text = 'hello'
# re.fullmatch() requires the entire string to match
print('fullmatch for "hello":', re.fullmatch(r'hello', text))
print('fullmatch for "hell":', re.fullmatch(r'hell', text))
A common mistake is using re.match() when you mean re.search(). Remember:
- Use re.search() to find a pattern anywhere in a string
- Use re.match() to check if a string starts with a pattern
- Use re.fullmatch() to check if the whole string matches a pattern
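One way to keep the three functions apart is to run them side by side on the same input. The sketch below does this with a hypothetical helper named compare, which is not part of the re module. It only reports whether each function found a match.

```python
import re

def compare(pattern, text):
    """Report which of the three functions find a match (hypothetical helper)."""
    return {
        'search': re.search(pattern, text) is not None,
        'match': re.match(pattern, text) is not None,
        'fullmatch': re.fullmatch(pattern, text) is not None,
    }

print(compare(r'hello', 'hello'))        # all three succeed
print(compare(r'hello', 'hello world'))  # search and match succeed
print(compare(r'world', 'hello world'))  # only search succeeds
```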
Case sensitivity¶
By default, pattern matching is case-sensitive. The pattern r'hello' will not match 'Hello'.
print('Case-sensitive:', re.search(r'hello', 'Hello world'))
# Use the re.IGNORECASE flag for case-insensitive matching
print('Case-insensitive:', re.search(r'hello', 'Hello world', re.IGNORECASE))
The re.IGNORECASE flag (often shortened to re.I) makes the pattern match regardless of case. You will learn more about flags in later tutorials.
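The following short sketch shows two details about case-insensitive matching: the short and long flag names are interchangeable, and the match object returns the text as it appears in the string, not as it is written in the pattern.

```python
import re

# re.I is a shorter name for re.IGNORECASE.
print(re.I == re.IGNORECASE)   # True

match = re.search(r'hello', 'Hello world', re.I)
# .group() returns the text found in the string, with its original case.
print(match.group())           # Hello
```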
Compiling patterns with re.compile()¶
If you use the same pattern multiple times, you can compile it into a pattern object using re.compile(). This keeps the pattern and its flags in one place, and it can save a little work on each call, although the gain is usually small because the re module already caches recently used patterns.
# Compile the pattern once
pattern = re.compile(r'python', re.IGNORECASE)
# Use the compiled pattern multiple times
texts = [
    'I love Python programming',
    'PYTHON is versatile',
    'Java is also popular',
    'python scripts are useful',
]
for text in texts:
    match = pattern.search(text)
    if match:
        print(f'Found "{match.group()}" in: {text}')
    else:
        print(f'No match in: {text}')
A compiled pattern object offers methods with the same names as the module-level functions (.search(), .match(), .fullmatch(), and more), but you do not need to pass the pattern string each time.
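Because a compiled pattern is an ordinary object, you can store it in a variable and reuse it wherever a test is needed. The sketch below, with made-up sample strings, reads the original pattern string back through the .pattern attribute and filters a list in a single expression.

```python
import re

pattern = re.compile(r'python', re.IGNORECASE)

# The original pattern string stays available on the compiled object.
print(pattern.pattern)   # python

texts = [
    'I love Python programming',
    'Java is also popular',
    'PYTHON is versatile',
]

# Keep only the strings in which the compiled pattern matches somewhere.
matching = [text for text in texts if pattern.search(text)]
print(matching)
```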
Introducing metacharacters: the dot¶
So far, every character in our patterns has been a literal character that matches itself. Regular expressions become powerful when you use metacharacters — characters with special meanings.
The most basic metacharacter is the dot (.), which matches any single character except a newline.
# The dot matches any single character
print(re.search(r'h.t', 'hat')) # matches 'hat'
print(re.search(r'h.t', 'hot')) # matches 'hot'
print(re.search(r'h.t', 'hit')) # matches 'hit'
print(re.search(r'h.t', 'hoot')) # no match (two characters between h and t)
The pattern r'h.t' matches the letter h, followed by any single character, followed by the letter t.
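The two limits of the dot (exactly one character, and no newline by default) can be checked directly. This is a minimal sketch. The re.DOTALL flag used in it is a standard flag that this tutorial does not otherwise cover.

```python
import re

# By default the dot does not match a newline character.
print(re.search(r'h.t', 'h\nt'))                          # None

# With the re.DOTALL flag, the dot matches a newline as well.
print(re.search(r'h.t', 'h\nt', re.DOTALL) is not None)   # True

# The dot must consume exactly one character, so 'ht' does not match.
print(re.search(r'h.t', 'ht'))                            # None
```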
If you need to match a literal dot, you must escape it with a backslash: r'\.'.
# Matching a literal dot
print(re.search(r'example\.com', 'Visit example.com')) # matches
print(re.search(r'example\.com', 'Visit exampleXcom')) # no match
# Without escaping, the dot matches any character
print(re.search(r'example.com', 'Visit exampleXcom')) # matches (dot matches X)
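Exercise 1¶
Use re.search() to find the word "regex" in the following string. If a match is found, print the matched text.
text = 'Learning regex is rewarding'
# Your code here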
Exercise 2¶
Use re.match() to check whether the following string starts with "Error". Print "Starts with Error" or "Does not start with Error" accordingly.
log_line = 'Error: file not found'
# Your code here
Exercise 3¶
Use re.fullmatch() with the re.IGNORECASE flag to check whether each of the strings "yes", "yes please", and "YES" is an exact match for the pattern "yes". Print the result for each string.
strings_to_test = ['yes', 'yes please', 'YES']
# Your code here
Exercise 4¶
Write a compiled pattern that matches the word "python" (case-insensitive). Use it to search through the list of strings below and print which ones contain a match.
texts = [
    'Python is great',
    'I prefer JavaScript',
    'PYTHON is versatile',
    'Learning python is fun',
]
# Your code here
Solutions¶
# Exercise 1
text = 'Learning regex is rewarding'
match = re.search(r'regex', text)
if match:
    print(f'Found: {match.group()}')
# Exercise 2
log_line = 'Error: file not found'
if re.match(r'Error', log_line):
    print('Starts with Error')
else:
    print('Does not start with Error')
# Exercise 3
strings_to_test = ['yes', 'yes please', 'YES']
for s in strings_to_test:
    result = re.fullmatch(r'yes', s, re.IGNORECASE)
    if result:
        print(f'"{s}" is an exact match')
    else:
        print(f'"{s}" is not an exact match')
# Exercise 4
texts = [
    'Python is great',
    'I prefer JavaScript',
    'PYTHON is versatile',
    'Learning python is fun',
]
pattern = re.compile(r'python', re.IGNORECASE)
for text in texts:
    if pattern.search(text):
        print(f'Match found in: "{text}"')
    else:
        print(f'No match in: "{text}"')
Summary¶
In this tutorial, you learned:
- Importing re: The re module is built into Python and requires no installation
- re.search(): Finds the first match anywhere in a string
- re.match(): Checks for a match at the beginning of a string
- re.fullmatch(): Checks whether the entire string matches the pattern
- Match objects: Use .group(), .start(), .end(), and .span() to inspect matches
- Raw strings: Always use r'...' for regex patterns to avoid backslash issues
- re.compile(): Compile frequently used patterns for cleaner, faster code
- The dot metacharacter: . matches any single character (except newlines)
Next steps¶
In the next tutorial, Character classes and quantifiers, you will learn how to build more flexible patterns that match ranges of characters and control how many times a pattern repeats.