Overview
In this reading we’ll introduce Python modules. A module is a reusable chunk of Python code: code that is intended to be used in multiple programs.
Topics
- Modules
- The import keyword
- Writing your own modules
- Some useful modules
Modules
Some programs are meant to be stand-alone, i.e. the code in the program is intended to be used only for that program. But it’s not uncommon to write Python functions (and, as we’ll see later, Python classes) that are more generally useful and can be used in multiple programs. In order to make it easy to use the same code in multiple programs, programming languages have developed the idea of a "module". A module is like a code "library" that you can "check out" and use in any program that needs it. (Modules are often informally called "libraries".)
Modules are important because the best code is code you don’t have to write! If someone has already written the code that you need and put it into a module, it’s usually better to use the module than rewrite the code yourself.
Module contents
A module can contain any kind of Python code whatsoever. Most of the time, though, a module will contain:
- functions
- values (generally intended to be constants)
- classes
We’ll talk about classes later in the course. For now, we’re mainly interested in modules that contain functions we might want to use.
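As a small illustration of the first two kinds of contents, the sketch below looks at one function (sqrt) and one constant (pi) from Python’s standard math module. The import line is explained in the next section, and the variable names are chosen just for this example.

```python
import math

# A value stored in the module (intended to be used as a constant).
circle_ratio = math.pi

# A function stored in the module.
root = math.sqrt(16)

print(circle_ratio)   # approximately 3.14159
print(root)           # 4.0

# Functions can be called; plain values cannot.
print(callable(math.sqrt), callable(math.pi))   # True False
```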
Using modules
When you think of code that is likely to be re-used in multiple programs, one
thing you might think of is common math functions, like square root, sine,
cosine, etc. In this section, let’s assume that we want to use these
functions. If there is a module that contains these functions, we have to
import the module, which means to load the module into the computer’s memory
so that we can use its functions. In Python, there is a predefined module
called math which contains all the standard mathematical functions; that’s the
module we’ll be importing.
The import keyword and qualified names
The standard way to import a module is to use the import keyword:
import math
Once you’ve done this, you get access to all the functions in the math
module. One of these is the square root function, called sqrt in Python.
However, in order to use this function we have to include the name of the module
in the function call:
>>> import math
>>> math.sqrt(4)
2.0
>>> math.sqrt(2)
1.4142135623730951
The sqrt function always returns a floating-point (approximate real number)
value, which is why math.sqrt(4) returns 2.0 and not 2. [1]
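A quick way to confirm this is to ask Python for the type of the result. The sketch below is only an illustration; converting with int is one possible way to get an integer back when you know the root is exact.

```python
import math

result = math.sqrt(4)
print(result)         # 2.0
print(type(result))   # <class 'float'>

# int() converts the float to an integer when an exact whole number is expected.
print(int(result))    # 2
```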
One thing to be aware of is that if you use import this way, you have to put
the name of the module before the name of the function, separated by a dot.
(This is exactly the "dot syntax" already described for objects, but here
instead of <object>.<method> it means <module>.<function>.) If you try to
leave it out, you get an error:
>>> import math
>>> sqrt(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'sqrt' is not defined
What the error message is really telling you is that you need to write
math.sqrt, not just sqrt.
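If you write math.sqrt(2) without first importing the math module, you get a different error:
>>> math.sqrt(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'math' is not defined
The error message lets us know that Python doesn’t recognize the name math, because the module hasn’t been imported yet.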
Names like math.sqrt are called qualified names; here, it means "the sqrt
function that is defined in the math module". The reason Python’s import
statement works this way is that it’s nice to have imported names
"compartmentalized" inside their modules. This matters because other modules
may define different sqrt functions. For instance, there is a module called
cmath which provides math functions that work on complex numbers.
A complex number is a kind of number that contains a real and an imaginary
part. The real part is just a real number. The imaginary part is a real number
multiplied by the square root of -1, which is usually called i in mathematics; Python writes it as j:
>>> c = 1.0+2.0j
>>> c * c
(-3+4j)
We can import both modules at the same time and use both sqrt functions,
because we have to qualify the names:
>>> import math
>>> math.sqrt(-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: math domain error
>>> import cmath
>>> cmath.sqrt(-1)
1j
The math.sqrt function doesn’t work on negative inputs (hence the error
message) but the cmath.sqrt function works fine on negative inputs (giving
the result 1j, not just j).
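Because the two functions keep their qualified names, a program can use both side by side. The sketch below is one possible way to combine them: it tries math.sqrt first and falls back to cmath.sqrt when Python reports a ValueError. The function name flexible_sqrt is made up for this example, and the try/except statement is used here on the assumption that you have seen it or are willing to take it on faith for now.

```python
import cmath
import math


def flexible_sqrt(x):
    """Return the square root of a real number x, using cmath for negative x."""
    try:
        return math.sqrt(x)
    except ValueError:
        # math.sqrt rejects negative inputs, so fall back to the complex version.
        return cmath.sqrt(x)


print(flexible_sqrt(4))    # 2.0
print(flexible_sqrt(-4))   # 2j
```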
We can even import both modules on the same line:
import math, cmath
More generally, you can import as many modules as you like on one line by
separating them with commas. On the other hand, if you have a lot of modules to
import, it’s more readable to have each import on a separate line:
import math
import cmath
In fact, importing multiple modules on one line is considered poor Python style. [3]
The from X import Y syntax
Sometimes it’s annoying to have to qualify names. Maybe you are going to be
using the sqrt function over and over again, and having to write math.sqrt
each time seems too verbose (and maybe also too hard to read). There is a
way to import a function that doesn’t require that the function’s name be
qualified: you use the from X import Y syntax.
>>> from math import sqrt
>>> sqrt(2.0)
1.4142135623730951
You can even import multiple names:
>>> from math import sin, cos, pi
>>> sin(pi/2) + cos(pi/2)
1.0
Or you can go nuts and import every name in a module:
>>> from math import *
>>> sqrt(sin(pi/3))
0.9306048591020996
The asterisk character (*) is called "star" in this context (it’s not
multiplication) and means "everything", so this will import every name in the
module, which you can then use without having to write the module name as a
qualifier (so sqrt and not math.sqrt). [4]
Beginning Python programmers love this form, because it’s very convenient to use. However, most of the time it’s also bad programming practice. The problem is that different modules will sometimes define the same name to mean different things:
>>> from math import sin
>>> sin(1.0)
0.8414709848078965
>>> from evil import sin
>>> sin('bear false witness')
'The check is in the mail!'
[The second example is made-up, of course.] If we use the from X import *
syntax routinely, we can get name clashes, where the same name is imported
from more than one module. In this case, the name refers to whichever version
was imported last.
>>> from math import *
>>> from evil import *
Now sin means evil.sin, and the sin function from math can no longer be reached by that name.
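The evil module is imaginary, but the same effect can be reproduced with two real modules. The sketch below uses math and cmath, which both define sqrt; the second import silently replaces the first.

```python
from math import sqrt
print(sqrt(4))     # 2.0

from cmath import sqrt   # same name, so this replaces the earlier sqrt
print(sqrt(4))     # (2+0j)
print(sqrt(-1))    # 1j
```

Nothing warns you that sqrt changed meaning between the first and second call, which is why bugs of this kind can be hard to track down.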
Name clashes can lead to difficult-to-find bugs. Our advice is therefore:
Don’t use the from X import * syntax routinely.
The import X as Y syntax
OK, so hopefully you’re convinced that indiscriminate use of the from X import *
syntax is a bad idea. Still, if a module has a long name it’s really
annoying to have to write it over and over as the first part of qualified names.
Fortunately, Python provides another way to do this: you can rename a module
when you import it using the import X as Y syntax. Most of the time, you
rename a module to a much shorter name:
>>> import math as m
>>> m.sqrt(2.0)
1.4142135623730951
The benefit of this is that you’re protected from name clashes as long as you
choose module names that are different. The downside of this is that m.sqrt
is perhaps less readable than either math.sqrt or sqrt. Nevertheless, we
think that it’s a good compromise.
Many popular Python libraries, such as NumPy and
Pandas (for multidimensional data analysis), and
matplotlib (for plotting and visualization) are
normally used with the import X as Y syntax:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Modules are first-class
It might surprise you to know that modules are actually also Python objects! The technical way of saying this is that modules are "first-class" objects, but all that means is that they are objects like any other object. [5] Because of this, you can store modules in variables, you can pass them as arguments to functions, etc.
>>> import math
>>> m = math
>>> m.sqrt(2.0)
1.4142135623730951
So the import math as m syntax has the same effect as import math followed
by m = math, except that import math as m does not also define the name math.
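Passing a module as an argument works the same way as passing any other object. In the sketch below, root_of_four is a hypothetical helper written only for this example; it calls sqrt from whichever module it is given.

```python
import cmath
import math


def root_of_four(module):
    """Call the sqrt function from whichever module is passed in."""
    return module.sqrt(4)


print(root_of_four(math))    # 2.0
print(root_of_four(cmath))   # (2+0j)

# Modules can also be stored in ordinary data structures such as lists.
for mod in [math, cmath]:
    print(mod.__name__)      # math, then cmath
```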
Module documentation and the help function
Modules can contain a lot of different functions, values, etc. How do we learn about what’s in a module?
One way is to go to the Python web site and look in the library documentation. The place to go is https://docs.python.org/library. This is the preferred approach if the module is part of Python’s standard library, which will be the case for most of the modules we’ll use in this course.
Another way is to use the help function built in to Python. The help function
can take a module, a function, or a class as its argument and will print out the
documentation associated with that thing.
>>> import math
>>> help(math.sqrt)
Help on built-in function sqrt in module math:

sqrt(x, /)
    Return the square root of x.
(Don’t worry about the / in the sqrt(x, /) line. It’s a recent Python
addition that means that the argument x is "positional only". sqrt only
takes one argument.)
You can get documentation for entire modules, too:
>>> import math
>>> help(math)
Help on module math:

NAME
    math

MODULE REFERENCE
    https://docs.python.org/3.9/library/math
...
(This goes on for many pages.)
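If many pages of output is more than you need, one alternative is the built-in dir function, which returns a list of the names defined in a module. dir is not otherwise covered in this reading, so treat the following as an optional sketch; the variable names are chosen just for this example.

```python
import math

# dir returns the names defined in the module as a list of strings.
names = dir(math)
print('sqrt' in names)   # True
print('pi' in names)     # True

# Names that start with an underscore are internal details, so skip them.
public_names = [name for name in names if not name.startswith('_')]
print(len(public_names))
```

Once you have spotted an interesting name in the list, you can call help on just that name.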
One thing that’s interesting is that the help function is not "magical" or
"special syntax" in any way; it’s just a regular Python function. The reason
it can take a function or a module as its argument follows from the fact that
functions and modules are themselves Python objects. On the other hand, if you
wanted help on (say) the def keyword in Python, this won’t work:
>>> help(def)
  File "<stdin>", line 1
    help(def)
         ^
SyntaxError: invalid syntax
Since def is a keyword and not a Python object, this doesn’t work. You have
to go to the online documentation to learn everything about how def works.
(Or just keep reading these readings.)
The help function only works if the function or module argument is known to
Python:
>>> help(math.sqrt)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'math' is not defined
>>> import math
>>> help(math.sqrt)
Help on built-in function sqrt in module math:

sqrt(x, /)
    Return the square root of x.
You can also write documentation for your own functions so that the help
function will work on them. We’ll see how to do that in the next reading.
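The help function can also be called with no arguments:
>>> help()
... [long welcome message] ...
help>
When invoked this way, help starts an interactive session with its own help> prompt, where you can type the name of whatever you want help on:
>>> help()
... [long welcome message] ...
help> math.sqrt
Help on built-in function sqrt in math:

math.sqrt = sqrt(x, /)
    Return the square root of x.
If you need help on multiple topics, calling help this way lets you look them up one after another at the help> prompt.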
Writing your own modules
Writing a Python module is very easy. You just need to do two things:
- Write your code in a file.
- Make sure that the name of the file ends in .py.
Also, by convention, module names should be short and consist of all lowercase letters and (if necessary) underscores. [6]
Let’s write a simple module called greetings which provides functions to print
out greeting messages. The module’s file will be named greetings.py. Here’s
our first attempt:
# Module: greetings
# Filename: greetings.py

def greet(name):
    print('Hi there, {}!'.format(name))

def insult(name):
    print('Get lost, {}!'.format(name))
Once this code is saved into the file greetings.py, you can import it like any
other module.
>>> import greetings
>>> greetings.greet('El')
Hi there, El!
>>> greetings.insult('Mike')
Get lost, Mike!
Or, if you like, you can import it the other way:
>>> from greetings import *
>>> greet('El')
Hi there, El!
Or even:
>>> import greetings as g
>>> g.greet('El')
Hi there, El!
The point of all this is that writing normal Python code in a file is exactly the same as writing a Python module. So Python modules are incredibly easy to create and use.
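To see both steps in a single runnable script, the sketch below writes a tiny module to a temporary directory and then imports it. The module name farewells and its farewell function are made up for this example. The script also assumes one detail not covered above: Python searches the directories listed in sys.path when importing, so the temporary directory is added to that list first. (When your module file sits in the same directory as your program, this extra step normally isn’t needed.)

```python
import importlib
import os
import sys
import tempfile

# The module's source code, as it would appear in farewells.py.
source = '''
def farewell(name):
    return 'Goodbye, {}!'.format(name)
'''

# Step 1 and 2: save the code in a file whose name ends in .py.
folder = tempfile.mkdtemp()
with open(os.path.join(folder, 'farewells.py'), 'w') as f:
    f.write(source)

# Tell Python to look in that directory when importing.
sys.path.insert(0, folder)
importlib.invalidate_caches()

import farewells

print(farewells.farewell('El'))   # Goodbye, El!
```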
In the next reading, we will see how to add documentation to our modules so
that the help function will work on them.
Some useful modules
There are lots of useful modules included with Python. Some of the ones we will be using include:
- math (math functions)
- cmath (complex number math functions)
- string (functions on strings)
- random (random number functions)
- sys (system functions)
- os (operating system specific functions)
- re (regular expressions)
- tkinter (simple graphical user interfaces)
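To give a first taste of a few of these, here is a short sketch that makes one call into each of several modules from the list. The particular functions shown are just a sample picked for illustration; each module contains many more.

```python
import os
import random
import re
import string
import sys

# string: handy constants such as the lowercase alphabet.
print(string.ascii_lowercase[:5])               # abcde

# random: pick a random integer from 1 to 6, like rolling a die.
roll = random.randint(1, 6)
print(1 <= roll <= 6)                           # True

# re: find every run of digits in a string.
print(re.findall(r'\d+', 'room 12, floor 3'))   # ['12', '3']

# os: build a file path using the right separator for this operating system.
print(os.path.join('readings', 'modules.txt'))

# sys: the major version of Python that is running.
print(sys.version_info.major)
```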
There are many, many more useful modules besides these. Go to the Python module documentation for a full list. There are also many important and useful modules that aren’t included with Python. Some of these include:
- numpy (numerical programming with multidimensional arrays)
- scipy (scientific functions)
- pandas (data analysis)
- matplotlib (plotting and visualization tools)
These modules are used extensively by data scientists and are one of the main reasons that Python is so popular. [7]
[End of reading]