Modules and imports
Every import statement you write is doing more than meets the eye —
finding code, loading it, caching it, and binding names into your
current namespace. This tutorial unpacks what import actually does,
so the rest of this guide rests on a solid foundation.
Time commitment: 15 minutes
Prerequisites:
- Comfortable with functions and basic data structures
- You've used import math or similar at some point
Learning objectives
By the end of this tutorial, you will be able to:
- Explain what a module is and how Python finds one
- Use the four common forms of the import statement deliberately
- Inspect sys.modules and sys.path to debug an import
- Explain the if __name__ == "__main__": idiom
What is a module?
A module is, in the simplest case, a .py file. The filename, minus
the extension, is the module's name. When you write import math,
Python looks for a module named math: usually a file called math.py
or, as in this case, a compiled equivalent that ships with the
interpreter.
Run the cell below to import the standard library's math module
and use it.
import math
print(math.pi)
print(math.sqrt(2))
Notice that you have to write math.pi, not just pi. The import
statement doesn't dump the module's contents into your namespace — it
creates a single name (math) that refers to the module, and you
reach into it with the dot.
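To make the "single name" point concrete, here is a small sketch that runs the import inside a fresh, empty namespace and then lists what appeared in it. The namespace dictionary and the use of exec() are illustrative choices for this demonstration, not something you would do in ordinary code.

```python
import types

# Run the import in a fresh, empty namespace so we can see exactly
# which names it creates.
namespace = {}
exec("import math", namespace)

# exec() adds __builtins__ on its own, so ignore dunder names.
created = sorted(name for name in namespace if not name.startswith("__"))
print(created)  # ['math']

module = namespace["math"]
print(isinstance(module, types.ModuleType))  # True
print("pi" in namespace)                     # False: pi was not copied out
print(hasattr(module, "pi"))                 # True: it lives on the module
```

The only new name is math itself; pi is reachable only through it.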
Four forms of import
Almost everything you'll see in real code uses one of four forms.
# 1. Import the whole module under its own name.
import math
math.pi
# 2. Import specific names from the module into your namespace.
from math import pi, sqrt
pi, sqrt(2)
# 3. Rename on import — useful when names clash, or for conventional
# short aliases like `import numpy as np`.
import math as m
m.pi
# 4. Import a name and rename it.
from math import sqrt as square_root
square_root(9)
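The four forms differ only in which names they bind, not in what those names point at. The short check below, a sketch using the same names as the cell above, suggests that every alias refers to the one module object and the one function object.

```python
import math
import math as m
from math import pi, sqrt
from math import sqrt as square_root

# An alias is just another name for the same module object.
print(m is math)            # True

# from-imports bind the very same objects that live on the module.
print(sqrt is math.sqrt)    # True
print(square_root is sqrt)  # True
print(pi == math.pi)        # True
```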
A fifth form — from math import * — also exists, but it's almost
always a mistake outside of an interactive session. It pulls every
public name from the module into your namespace, where it can shadow
your own variables silently.
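A minimal sketch of that silent shadowing follows. The variable e and its value are made up for illustration, and the code runs inside a throwaway namespace via exec() so the star import does not leak into the rest of your session.

```python
# A short variable name that happens to collide with a name in math.
source = """
e = "an error message I wanted to keep"
from math import *
"""

namespace = {}
exec(source, namespace)

# The star import replaced our string with math.e, with no warning.
print(namespace["e"])       # 2.718281828459045
print("sqrt" in namespace)  # True: everything public came along
```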
What import actually does
When Python encounters import math for the first time in a session,
roughly this happens:
- Check sys.modules, an in-memory cache of already-imported modules. If math is there, bind it to the local name and stop.
- Otherwise, walk through sys.path (a list of directories), looking for a matching file or package.
- Execute the module's code top-to-bottom, building up its namespace.
- Store the result in sys.modules so the next import math is a cheap dictionary lookup.
That cache is why running import math ten times in a script costs
almost nothing after the first.
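The steps above can be approximated in a few lines with importlib. This is only a sketch, assuming a top-level module name: sketch_import is a hypothetical helper, not Python's actual implementation, and colorsys is simply a small standard-library module picked for the demonstration. One detail differs from the simplified list: the real machinery registers the module in sys.modules before running its code, and the sketch mirrors that order.

```python
import importlib.util
import sys


def sketch_import(name):
    """Rough approximation of `import name` for a top-level module."""
    # Step 1: return the cached module if it was imported before.
    if name in sys.modules:
        return sys.modules[name]

    # Step 2: ask the import system to locate the module.
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}")

    # Steps 3 and 4: create the module, cache it, then run its code.
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Do not leave a half-initialised module in the cache.
        sys.modules.pop(name, None)
        raise
    return module


first = sketch_import("colorsys")
second = sketch_import("colorsys")
print(first is second)                   # True: the second call hit the cache
print(sys.modules["colorsys"] is first)  # True
```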
import sys
# math is now in the cache.
print("math in cache?", "math" in sys.modules)
print("modules cached:", len(sys.modules))
# The first few entries of sys.path — the search list.
for entry in sys.path[:5]:
    print(repr(entry))
The first entry is typically the directory of the script you ran (or
an empty string in an interactive session, meaning "current
directory"). The rest are the standard library and any installed
packages. When an import mysteriously fails with ModuleNotFoundError,
the question to ask is almost always: "is the module's parent
directory on sys.path?"
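Here is a hedged sketch of that failure and its fix. It writes a made-up module, demo_helper_mod, into a temporary directory, shows the import failing while that directory is missing from sys.path, and then succeeding once it is added. Editing sys.path by hand is done here only to illustrate the lookup; the module name and its contents are assumptions for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A hypothetical module in a directory Python does not search yet.
    Path(tmp, "demo_helper_mod.py").write_text(
        "GREETING = 'hello'\n", encoding="utf-8"
    )

    try:
        import demo_helper_mod
        error_before = None
    except ModuleNotFoundError as exc:
        error_before = str(exc)
    print("before:", error_before)  # before: No module named 'demo_helper_mod'

    # Put the module's parent directory on the search list and retry.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_helper_mod
        greeting = demo_helper_mod.GREETING
    finally:
        # Undo the changes so the demo leaves no trace behind.
        sys.path.remove(tmp)
        sys.modules.pop("demo_helper_mod", None)

print("after:", greeting)  # after: hello
```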
The if __name__ == "__main__": idiom
Every module has a __name__ attribute. When a module is imported,
__name__ is set to the module's name (e.g. "math"). When a module
is run directly — python myscript.py — __name__ is set to the
special string "__main__".
# In a notebook, the cell's __name__ is "__main__" — the notebook
# itself is the top-level script.
print(__name__)
That's the basis for the idiom you'll see at the bottom of well-structured scripts:
def main():
    ...

if __name__ == "__main__":
    main()
It means "only run main() when this file is executed directly, not
when it's imported by something else". Without it, simply importing
the module would run its main routine — which is rarely what an
importing caller wants.
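To see both cases side by side, the sketch below writes a small script to a temporary directory, then runs it directly and imports it, each in a separate Python process. The file name name_demo.py and its contents are invented for this illustration, and the import relies on the usual behaviour that python -c searches the current working directory.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

SOURCE = '''\
def main():
    print("main() ran")

print("__name__ is", __name__)

if __name__ == "__main__":
    main()
'''

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp, "name_demo.py")
    script.write_text(SOURCE, encoding="utf-8")

    # Case 1: execute the file directly, like `python name_demo.py`.
    direct = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True, text=True, check=True,
    ).stdout

    # Case 2: import it from another program started in the same directory.
    imported = subprocess.run(
        [sys.executable, "-c", "import name_demo"],
        cwd=tmp, capture_output=True, text=True, check=True,
    ).stdout

print(direct)    # __name__ is __main__, followed by main() ran
print(imported)  # __name__ is name_demo, and main() does not run
```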
Reloading during development
Imports are cached, so re-running import mymodule after you've edited
mymodule.py won't pick up your changes (mymodule stands in for any
module of your own). The standard library's importlib.reload() exists
for this case, but in practice you'll almost always restart the Python
process instead: it is much simpler, and it avoids the surprises that
come with half-reloaded state, such as objects created before the
reload that still refer to the old code.
Notebooks add their own wrinkle: IPython's autoreload extension can
re-import edited modules automatically before a cell runs, which is
convenient until it isn't. Keep importlib.reload() in your back
pocket; reach for it sparingly.
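The caching behaviour and the effect of importlib.reload() can be observed with a throwaway module. This is a sketch under stated assumptions: reload_demo is a hypothetical module written to a temporary directory, and sys.path is edited by hand only so the demo can find it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    module_file = Path(tmp, "reload_demo.py")
    module_file.write_text("VALUE = 1\n", encoding="utf-8")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import reload_demo
        value_first = reload_demo.VALUE

        # Edit the file on disk. The new source has a different length,
        # so stale cached bytecode cannot be mistaken for current.
        module_file.write_text("VALUE = 22\n", encoding="utf-8")

        # A second import is only a sys.modules lookup: nothing re-runs.
        import reload_demo
        value_cached = reload_demo.VALUE

        # reload() re-executes the module's code in the same module object.
        importlib.reload(reload_demo)
        value_reloaded = reload_demo.VALUE
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("reload_demo", None)

print(value_first, value_cached, value_reloaded)  # 1 1 22
```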
Recap and next steps
- A module is a .py file; its name is the filename without the extension.
- import x binds x to the module; from x import y binds the inner name into your namespace.
- Imports are cached in sys.modules; module files are searched along sys.path.
- __name__ == "__main__" distinguishes "I'm being run directly" from "I've been imported".
Next up: when one .py file isn't enough — Packages and namespaces.