Count and tally items
The question. You have a pile of items — words, log lines, events, purchases — and you want to know how often each occurs, which are the most common, or how two tallies compare.
The answer is Counter. Below are the patterns you'll use most: a basic frequency count, the top-N, counting only items that meet a condition, and combining or comparing tallies with Counter arithmetic.
Frequency count from any iterable
Pass the iterable straight to Counter. Anything iterable works — a list, a string, a generator, the lines of a file.
from collections import Counter
orders = ['tea', 'coffee', 'tea', 'juice', 'coffee', 'tea']
counts = Counter(orders)
print(counts) # Counter({'tea': 3, 'coffee': 2, 'juice': 1})
print(counts['tea']) # 3
print(counts['water']) # 0 — unseen items are zero, not an error
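The string and file cases mentioned above might look like the sketch below. The `io.StringIO` buffer is only a stand-in for a real file opened with `open()`, and the log contents are made up for illustration.

```python
from collections import Counter
import io

# Counting the characters of a string
letters = Counter('banana')
print(letters)  # Counter({'a': 3, 'n': 2, 'b': 1})

# Hypothetical log contents; io.StringIO stands in for an opened file
fake_log = io.StringIO('INFO\nWARN\nINFO\nERROR\nINFO\n')

# Iterating over a file yields its lines; strip the trailing newline before counting
levels = Counter(line.strip() for line in fake_log)
print(levels['INFO'])  # 3
```

Because the generator is consumed line by line, the whole file never has to be held in memory at once.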
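The top N
most_common(n) returns the n highest-count items as (item, count) pairs, ranked from most to least common; items with equal counts keep the order in which they were first seen. Leave out n for the full ranking; slice from the end for the rarest.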
from collections import Counter
text = 'the quick brown fox the lazy dog the end'
freq = Counter(text.split())
print(freq.most_common(2)) # [('the', 3), ('quick', 1)]
print(freq.most_common()[-1]) # ('end', 1) — the least common
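To get several of the rarest items rather than just the last one, a reversed slice of the full ranking is one option. This is a minimal sketch; the `votes` data and the variable `n` are illustrative.

```python
from collections import Counter

votes = Counter('a b a c b a d'.split())

n = 2
# Walk the full ranking backwards and stop after n entries
rarest = votes.most_common()[:-n - 1:-1]
print(rarest)  # [('d', 1), ('c', 1)]
```

The result is ordered from rarest upward, so tied items appear in the reverse of the order in which they were first seen.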
Counting only what matches a condition
Feed Counter a generator expression to count a filtered or transformed view — no intermediate list needed.
from collections import Counter
words = ['Apple', 'apricot', 'Banana', 'avocado', 'Cherry', 'almond']
# count first letters, case-insensitively, for words longer than 5 letters
first_letters = Counter(w[0].lower() for w in words if len(w) > 5)
print(first_letters) # Counter({'a': 3, 'b': 1, 'c': 1})
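The same idea works when the transformation is a function that maps each item to a category. The sketch below assumes a hypothetical `bucket` helper and made-up latency figures; the thresholds are arbitrary.

```python
from collections import Counter

# Hypothetical response times in milliseconds
latencies_ms = [120, 340, 95, 800, 410, 60]

def bucket(ms):
    """Map a latency to a coarse label (thresholds are illustrative)."""
    if ms < 100:
        return 'fast'
    if ms < 500:
        return 'ok'
    return 'slow'

# Count the labels, not the raw numbers
buckets = Counter(bucket(ms) for ms in latencies_ms)
print(buckets)  # Counter({'ok': 3, 'fast': 2, 'slow': 1})
```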
Combining and comparing tallies
Counter arithmetic adds counts (+), subtracts them and keeps only the positive results (-), takes the per-item minimum (&), or takes the per-item maximum (|). This is handy for rolling up daily counts or diffing two inventories.
from collections import Counter
monday = Counter(tea=8, coffee=5, juice=2)
tuesday = Counter(tea=6, coffee=7, water=3)
print(monday + tuesday) # combined: Counter({'tea': 14, 'coffee': 12, 'water': 3, 'juice': 2})
print(tuesday - monday) # grew on Tuesday: Counter({'water': 3, 'coffee': 2})
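The & and | operators can be sketched with the same two days of data. The subtract() call at the end is included as a contrast: unlike the - operator, it updates the counter in place and keeps zero and negative counts.

```python
from collections import Counter

monday = Counter(tea=8, coffee=5, juice=2)
tuesday = Counter(tea=6, coffee=7, water=3)

# Per-item minimum: what was sold on both days
print(monday & tuesday)  # Counter({'tea': 6, 'coffee': 5})

# Per-item maximum: the best day for each item
print(monday | tuesday)  # Counter({'tea': 8, 'coffee': 7, 'water': 3, 'juice': 2})

# subtract() keeps negative counts, so declines stay visible
change = Counter(tuesday)
change.subtract(monday)
print(change['tea'])  # -2
```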
Total, unique count, and items over a threshold
total() sums every count (it requires Python 3.10 or later); len() gives the number of distinct items; filtering most_common() (or .items()) finds items above a cut-off.
from collections import Counter
counts = Counter('mississippi')
print(counts.total()) # 11 — total letters
print(len(counts)) # 4 — distinct letters
print([item for item, n in counts.items() if n >= 4]) # ['i', 's'] — appear 4+ times
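On interpreters older than Python 3.10, summing the values gives the same grand total. The sketch below also derives each item's share of the total, which is one common next step; the `shares` name is illustrative.

```python
from collections import Counter

counts = Counter('mississippi')

# Same result as counts.total(), but works before Python 3.10
total = sum(counts.values())
print(total)  # 11

# Share of the total for each letter, most common first
shares = {letter: n / total for letter, n in counts.most_common()}
print(round(shares['s'], 2))  # 0.36
```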
In short
- Counter(iterable) tallies anything iterable in one line.
- most_common(n) ranks; slice the full list for the rarest.
- A generator expression inside Counter(...) counts a filtered/transformed view.
- + - & | roll up and compare tallies; total() and len() give the grand total and the distinct count.