Parsing and formatting
Datetimes spend most of their lives as strings — in JSON payloads, log files, CSV columns, filenames. This notebook covers turning strings into datetimes and back again. The two main tools are fromisoformat/isoformat for the canonical ISO 8601 format, and strptime/strftime for everything else.
ISO 8601: use this where you can
The format is YYYY-MM-DDTHH:MM:SS, with optional microseconds and time zone offset. It sorts correctly as a string (as long as every value uses the same offset), it's unambiguous, and it's the format most APIs expect. Python parses and emits it natively.
from datetime import date, datetime
d = date.fromisoformat("2026-04-21")
dt = datetime.fromisoformat("2026-04-21T14:30:00")
print(d)
print(dt)
And the round-trip:
print(d.isoformat())
print(dt.isoformat())
print(dt.isoformat(sep=" ", timespec="minutes"))
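Because the fields run from most significant to least significant and are zero-padded, sorting ISO 8601 strings gives the same order as sorting the datetimes they represent. A minimal check, assuming naive timestamps in one consistent layout (the sample values are made up for illustration):

```python
from datetime import datetime

# Illustrative timestamps, all naive and in the same layout.
stamps = [
    "2026-04-21T14:30:00",
    "2025-12-31T23:59:59",
    "2026-04-21T09:05:00",
]

by_string = sorted(stamps)                                # plain text sort
by_datetime = sorted(stamps, key=datetime.fromisoformat)  # chronological sort

print(by_string)
print(by_string == by_datetime)
```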
Two notes:
- Python 3.11+ accepts almost any ISO 8601 string, including time-zone offsets like 2026-04-21T14:30:00+01:00 and Z (UTC). Earlier versions are stricter: they don't accept the Z suffix or all offset forms. If you're targeting older Python, stick with strptime for anything beyond plain YYYY-MM-DDTHH:MM:SS.
- fromisoformat is much faster than strptime. If your data is ISO 8601, use it.
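If the only obstacle on an older Python is the Z suffix, one common workaround is to rewrite it as an explicit +00:00 offset before parsing. The helper below is a sketch of that idea; parse_iso is a hypothetical name, not part of the standard library, and it does not cover every ISO 8601 variant:

```python
from datetime import datetime, timedelta

def parse_iso(s):
    """Parse an ISO 8601 timestamp, tolerating a trailing 'Z' on older Pythons."""
    # Hypothetical helper: only the 'Z' suffix is normalised here.
    if s.endswith("Z"):
        s = s[:-1] + "+00:00"
    return datetime.fromisoformat(s)

utc_dt = parse_iso("2026-04-21T14:30:00Z")
offset_dt = parse_iso("2026-04-21T14:30:00+01:00")

print(utc_dt)
print(offset_dt)
print(offset_dt.utcoffset() == timedelta(hours=1))
```

Both results are time-zone-aware datetimes, so they can be compared with each other but not with naive ones.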
strptime: parsing arbitrary formats
When the string isn't ISO 8601, use strptime(string, format). The format string uses the same directives as strftime — %Y, %m, %d, and so on — see the format codes reference for the full table.
from datetime import datetime
dt = datetime.strptime("21/04/2026 14:30", "%d/%m/%Y %H:%M")
print(dt)
dt = datetime.strptime("April 21, 2026", "%B %d, %Y")
print(dt)
dt = datetime.strptime("2026-111", "%Y-%j")  # day-of-year: day 111 of 2026 is 21 April
print(dt)
If the format doesn't match the string, strptime raises ValueError. Catch that if you're processing untrusted data.
try:
    datetime.strptime("21-04-2026", "%d/%m/%Y")
except ValueError as e:
    print(f"{type(e).__name__}: {e}")
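When bad input is expected rather than exceptional, it can be convenient to wrap the try/except in a small helper that returns None on failure. This is a sketch under that assumption; try_parse is a hypothetical name, and whether None is the right failure value depends on the caller:

```python
from datetime import datetime

def try_parse(s, fmt):
    """Return the parsed datetime, or None if s does not match fmt."""
    try:
        return datetime.strptime(s, fmt)
    except ValueError:
        return None

FMT = "%d/%m/%Y"
print(try_parse("21/04/2026", FMT))        # matches the format
print(try_parse("21-04-2026", FMT))        # wrong separators
print(try_parse("31/02/2026", FMT))        # right shape, impossible date
print(try_parse("21/04/2026 extra", FMT))  # trailing text after the date
```

The last two cases show that strptime rejects more than mismatched separators: impossible dates and leftover text also raise ValueError.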
strftime: formatting datetimes as strings
strftime(format) is the inverse — turn a datetime into a string using format directives.
dt = datetime(2026, 4, 21, 14, 30)
print(dt.strftime("%d/%m/%Y")) # 21/04/2026
print(dt.strftime("%A, %d %B %Y")) # Tuesday, 21 April 2026
print(dt.strftime("%Y-%m-%d %H:%M")) # 2026-04-21 14:30
print(dt.strftime("%G-W%V-%u")) # ISO week date: 2026-W17-2
f-strings work too — f"{dt:%d/%m/%Y}" is equivalent to dt.strftime("%d/%m/%Y") and shorter in the common case:
print(f"{dt:%A %d %B %Y at %H:%M}")
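A note on the ISO week example above: %V (ISO week number) pairs with %G (ISO year), not %Y (calendar year). The two years agree for most dates, including 21 April 2026, but they can differ in the days around 1 January. A small sketch of the difference, assuming a platform whose strftime supports %G and %V:

```python
from datetime import datetime

# 1 January 2027 is a Friday that ISO 8601 assigns to the last week of 2026.
new_year = datetime(2027, 1, 1)

iso_label = new_year.strftime("%G-W%V-%u")    # ISO year + ISO week
mixed_label = new_year.strftime("%Y-W%V-%u")  # calendar year + ISO week

print(iso_label)    # 2026-W53-5
print(mixed_label)  # 2027-W53-5, a week that does not exist in 2027
print(tuple(new_year.isocalendar()))
```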
The directives you actually use
The full table is in the reference page. The handful you'll use most:
| Directive | Meaning | Example |
|---|---|---|
| %Y | 4-digit year | 2026 |
| %m | Zero-padded month | 04 |
| %d | Zero-padded day of month | 21 |
| %H | Zero-padded hour (24-hour clock) | 14 |
| %M | Zero-padded minute | 30 |
| %S | Zero-padded second | 00 |
| %A | Full weekday name | Tuesday |
| %B | Full month name | April |
| %j | Day of year | 111 |
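To see what each directive produces, it can help to render them one at a time. The sketch below applies every directive from the table to the example datetime used earlier (21 April 2026, 14:30); the weekday and month names assume the program has not changed the default locale:

```python
from datetime import datetime

dt = datetime(2026, 4, 21, 14, 30)

directives = ("%Y", "%m", "%d", "%H", "%M", "%S", "%A", "%B", "%j")

# Map each directive to the text it produces for dt.
rendered = {code: dt.strftime(code) for code in directives}

for code, text in rendered.items():
    print(f"{code}  {text}")
```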
Locale and weekday names
%A and %B follow the current locale. A Python program starts in the default "C" locale, so you get English names unless something changes it. Locale-specific names ("mardi", "avril") require calling locale.setlocale, which changes global state and depends on which locales the platform has installed, so it is usually more trouble than it's worth. For internationalised output, prefer the Babel library, or compute the day of the week as an integer and look up the localised string yourself.
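The look-it-up-yourself approach can be sketched in a few lines. The French name tables and the format_french function below are illustrative assumptions, not part of any library; a real application would more likely load such names from translation files or use Babel:

```python
from datetime import date

# Illustrative lookup tables, indexed by date.weekday() (Monday is 0)
# and by month number minus one.
FRENCH_DAYS = (
    "lundi", "mardi", "mercredi", "jeudi", "vendredi", "samedi", "dimanche",
)
FRENCH_MONTHS = (
    "janvier", "février", "mars", "avril", "mai", "juin",
    "juillet", "août", "septembre", "octobre", "novembre", "décembre",
)

def format_french(d):
    """Format a date with French names without touching the global locale."""
    return f"{FRENCH_DAYS[d.weekday()]} {d.day} {FRENCH_MONTHS[d.month - 1]} {d.year}"

print(format_french(date(2026, 4, 21)))  # mardi 21 avril 2026
```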
Parsing data with mixed formats
Real-world date columns often contain several formats. The common pattern is to try each format in turn and use the first one that works:
FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%b-%Y")
def parse_date(s):
    # Try each known format in order; the first one that parses wins.
    for fmt in FORMATS:
        try:
            return datetime.strptime(s, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"no format matched {s!r}")
print(parse_date("2026-04-21"))
print(parse_date("21/04/2026"))
print(parse_date("21-Apr-2026"))
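One caveat with first-match-wins: when two formats can both parse the same string, the order of the format list decides the result. The sketch below makes that visible by returning the matching format alongside the date; parse_with_formats is a hypothetical variant of parse_date written for this illustration:

```python
from datetime import date, datetime

def parse_with_formats(s, formats):
    """Return (date, format) for the first format that parses s."""
    for fmt in formats:
        try:
            return datetime.strptime(s, fmt).date(), fmt
        except ValueError:
            continue
    raise ValueError(f"no format matched {s!r}")

day_first = ("%d/%m/%Y", "%m/%d/%Y")
month_first = ("%m/%d/%Y", "%d/%m/%Y")

# Ambiguous input: both formats parse it, so list order decides.
print(parse_with_formats("04/05/2026", day_first))    # read as 4 May
print(parse_with_formats("04/05/2026", month_first))  # read as 5 April

# Unambiguous input: month 21 is invalid, so the second format is used.
print(parse_with_formats("21/04/2026", month_first))
```

If a column might mix day-first and month-first dates, no ordering is safe for the ambiguous values, and they are better flagged for review than silently guessed.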
The parse-a-messy-date-column recipe takes this idea further — handling missing values, ambiguous dates, and the dataframe case.
Exercise
You receive log lines that look like:
2026-04-21T14:30:42 [INFO] User logged in
2026-04-21T14:31:05 [ERROR] Database timeout
Write a function parse_log_line(line) that returns a tuple of (timestamp, level, message), where timestamp is a datetime object, level is the level string, and message is the rest. Test it on the two lines above.
# Your code here
Solution
from datetime import datetime
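def parse_log_line(line):
    # The timestamp is everything up to the first space.
    ts_str, rest = line.split(" ", 1)
    timestamp = datetime.fromisoformat(ts_str)
    # The level sits in square brackets; the message is what follows.
    level_part, message = rest.split("] ", 1)
    level = level_part.lstrip("[")
    return timestamp, level, message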
print(parse_log_line("2026-04-21T14:30:42 [INFO] User logged in"))
print(parse_log_line("2026-04-21T14:31:05 [ERROR] Database timeout"))
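The split-based solution assumes every line is well formed. If malformed lines are possible, a regular expression can validate the overall shape first. This is one possible variant, not the only reasonable one; the pattern and the name parse_log_line_strict are assumptions made for this sketch, and the pattern only accepts upper-case level names:

```python
import re
from datetime import datetime

# Assumed line shape: "<timestamp> [<LEVEL>] <message>"
LOG_LINE = re.compile(r"(?P<ts>\S+) \[(?P<level>[A-Z]+)\] (?P<message>.*)")

def parse_log_line_strict(line):
    """Parse a log line, raising ValueError if it does not fit the pattern."""
    match = LOG_LINE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    timestamp = datetime.fromisoformat(match["ts"])  # may also raise ValueError
    return timestamp, match["level"], match["message"]

print(parse_log_line_strict("2026-04-21T14:30:42 [INFO] User logged in"))
print(parse_log_line_strict("2026-04-21T14:31:05 [ERROR] Database timeout"))
```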
Recap
- fromisoformat/isoformat for ISO 8601, the canonical format. Use this whenever possible.
- strptime(string, format) to parse arbitrary formats.
- strftime(format) (or the f-string {dt:format}) to produce arbitrary formats.
- %Y/%m/%d/%H/%M/%S cover most of what you need; the reference has the rest.
- For mixed-format inputs, try each format until one works.
Next: Time zones with zoneinfo, where datetimes get a lot more interesting.