Run blocking calls in a thread pool¶

The question. You have a slow, blocking function — it downloads a URL, queries a database, reads a file, calls an API — and you need to run it over many inputs. Run sequentially, the waits stack up; you want them to overlap.

The answer: this is I/O-bound work, so reach for concurrent.futures.ThreadPoolExecutor. Submit one call per input, cap concurrency with max_workers, and collect the results, either in input order with map or as they finish with as_completed. Below are both forms, the second with per-task error handling, followed by guidance on sizing the pool.

Spawns real threads — run locally to see the timings.

The simplest form: `map` for ordered results¶

When every input maps to one output and you want them back in order, executor.map is the cleanest. It's the built-in map, run concurrently.

from concurrent.futures import ThreadPoolExecutor
import time

def download(url):
    time.sleep(0.5)                 # stand-in for a blocking network call
    return f'{url}: 200 OK ({len(url)} bytes)'

urls = [f'https://example.com/page/{i}' for i in range(10)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(download, urls))   # ordered; raises on first error

print(results[0])
print(f'10 downloads in {time.perf_counter() - start:.2f}s')   # ~1s, not 5s

map is concise but has one drawback: if any call raises, iterating the results re-raises that exception at that input's position, and the results that come after it can no longer be read from the iterator. When some failures are expected, use as_completed instead.
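To make that drawback concrete, here is a minimal sketch. The `flaky` function and its integer inputs are hypothetical stand-ins chosen for illustration, with input 2 hard-coded to fail.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def flaky(n):
    """Hypothetical stand-in for a blocking call; input 2 always fails."""
    time.sleep(0.05)
    if n == 2:
        raise ValueError(f'input {n} failed')
    return n * 10

collected = []
caught = None
with ThreadPoolExecutor(max_workers=4) as pool:
    try:
        for value in pool.map(flaky, range(5)):
            collected.append(value)      # results arrive in input order
    except ValueError as exc:
        caught = exc                     # raised when iteration reaches input 2

# Inputs 3 and 4 may well have run, but their results
# can no longer be read from the map iterator.
print(collected)
print(caught)
```

Only the results that precede the failing input are collected, which is why map suits batches where any failure should abort the whole run.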

`as_completed`: results as they finish, with per-task error handling¶

Submit each call, keep a dict mapping the future back to its input, and process each result the moment it's ready. Wrap .result() in try/except so one failure doesn't sink the batch — collect successes and failures separately.

from concurrent.futures import ThreadPoolExecutor, as_completed
import random
import time

# Reuses the urls list defined in the first example.

def download(url):
    time.sleep(random.uniform(0.1, 0.6))
    if url.endswith('7'):
        raise ConnectionError('timed out')   # simulate a failure
    return f'{url}: ok'

results, errors = {}, {}
with ThreadPoolExecutor(max_workers=5) as pool:
    future_to_url = {pool.submit(download, u): u for u in urls}
    for future in as_completed(future_to_url):
        url = future_to_url[future]
        try:
            results[url] = future.result()
        except Exception as exc:             # catch per task, keep going
            errors[url] = repr(exc)

print(f'{len(results)} ok, {len(errors)} failed')
print('failures:', errors)

This is the pattern to default to for real network work: every task either lands in results or errors, nothing is lost, and one bad URL doesn't abort the other nine.
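If you use this pattern in several places, it may be worth wrapping it in a small helper. The sketch below is one possible shape, not a standard-library API: `run_all` and `parse_port` are hypothetical names, and the helper assumes every input is hashable so it can serve as a dict key.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_all(fn, inputs, max_workers=5):
    """Run fn over inputs in a thread pool.

    Returns (results, errors), two dicts keyed by input.
    Hypothetical helper; assumes inputs are hashable and unique.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_input = {pool.submit(fn, item): item for item in inputs}
        for future in as_completed(future_to_input):
            item = future_to_input[future]
            try:
                results[item] = future.result()
            except Exception as exc:     # catch per task, keep going
                errors[item] = exc       # keep the exception object itself
    return results, errors

def parse_port(text):
    """Hypothetical task: raises ValueError on non-numeric input."""
    return int(text)

ok, failed = run_all(parse_port, ['80', '443', 'http'])
print(ok)
print(failed)
```

Storing the exception object rather than its repr lets the caller inspect the type later, for example to retry only connection errors.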

Choosing `max_workers`¶

For I/O-bound work the pool can be much larger than your core count — the threads are mostly waiting, not computing. Sensible starting points:

Network/API calls: tens to low hundreds, but respect the service's rate limits and connection caps. More threads past the point of saturation just add overhead.
Local disk: small (4–8); disks don't parallelise well and too many readers thrash.

Leaving max_workers unset defaults to min(32, os.cpu_count() + 4) on Python 3.8 through 3.12 (3.13 counts CPUs with os.process_cpu_count() instead), a fine default for light I/O. When in doubt, measure a couple of values rather than guessing.
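Measuring can be as simple as timing the same batch at a few pool sizes. The sketch below assumes a hypothetical `fake_io` task that sleeps for 50 ms. Swap in your real blocking call, since actual services saturate in ways a sleep cannot show.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fake_io(_):
    time.sleep(0.05)   # hypothetical stand-in for a blocking call

def time_pool(workers, n_tasks=20):
    """Return the wall-clock seconds to run n_tasks with the given pool size."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(fake_io, range(n_tasks)))   # drain the iterator
    return time.perf_counter() - start

timings = {w: time_pool(w) for w in (2, 5, 10, 20)}
for w, elapsed in timings.items():
    print(f'max_workers={w:>2}: {elapsed:.2f}s')
```

With a pure sleep the time keeps falling as the pool grows. Against a real service, expect the curve to flatten or turn upward once the service or your connection is saturated.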

When this is the wrong tool¶

If download were CPU-bound instead (parsing, hashing or compressing in pure Python), threads would give you little or no speedup, because on standard CPython builds the GIL lets only one thread run Python bytecode at a time. Swap ThreadPoolExecutor for ProcessPoolExecutor (see that recipe). If you're making thousands of concurrent network calls and your libraries are async-capable, asyncio scales further on a single thread.
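For a sense of what the asyncio alternative looks like, here is a minimal sketch. It assumes a hypothetical `fetch` coroutine that stands in for an async-capable client call, with `asyncio.sleep` in place of real network I/O, and uses a semaphore to play the role that max_workers plays in a thread pool.

```python
import asyncio

async def fetch(url, limit):
    """Hypothetical coroutine standing in for an async HTTP client call."""
    async with limit:                 # cap the number of calls in flight
        await asyncio.sleep(0.01)     # stand-in for awaitable network I/O
        return f'{url}: ok'

async def main(urls, max_in_flight=100):
    limit = asyncio.Semaphore(max_in_flight)
    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(fetch(u, limit) for u in urls))

urls = [f'https://example.com/page/{i}' for i in range(1000)]
pages = asyncio.run(main(urls))
print(len(pages), pages[0])
```

All 1000 calls share one thread. In a notebook, where an event loop is already running, use `await main(urls)` instead of `asyncio.run`.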
