CS 1 Reading 6: Modules

Overview

In this reading we’ll introduce Python modules. A module is a term we use for a reusable chunk of Python code, which means code that is intended to be used in multiple programs.

Topics

Modules
The import keyword
Writing your own modules
Some useful modules

Modules

Some programs are meant to be stand-alone i.e. the code in the program is intended to be used only for that program. But it’s not uncommon to write Python functions (and, as we’ll see later, Python classes) that are more generally useful and can be used in multiple programs. In order to make it easy to use the same code in multiple programs, programming languages have developed the idea of a "module". A modules is like a code "library" that you can "check out" and use in any program that needs it. (Modules are often informally called "libraries".)

Modules are important because the best code is code you don’t have to write! If someone has already written the code that you need and put it into a module, it’s usually better to use the module than rewrite the code yourself.

Module contents

A module can contain any kind of Python code whatsoever. Most of the time, though, a module will contain

functions
values (generally intended to be constants)
classes

We’ll talk about classes later in the course. For now, we’re mainly interested in modules that contain functions we might want to use.

Using modules

When you think of code that is likely to be re-used in multiple programs, one thing you might think of is common math functions, like square root, sine, cosine, etc. In this section, let’s assume that we want to use these functions. If there is a module that contains these functions, we have to import the module, which means to load the module into the computer’s memory so that we can use its functions. In Python, there is a predefined module called math which contains all the standard mathematical functions; that’s the module we’ll be importing.

The `import` keyword and qualified names

The standard way to import a module is to use the import keyword:

import math

Once you’ve done this, you get access to all the functions in the math module. One of these is the square root function, called sqrt in Python. However, in order to use this function we have to include the name of the module in the function call:

>>> import math
>>> math.sqrt(4)
2.0
>>> math.sqrt(2)
1.4142135623730951

The sqrt function always returns a floating-point (approximate real number) value, which is why math.sqrt(4) returns 2.0 and not 2. ^[1]

One thing to be aware of is that if you use import this way, you have to put the name of the module before the name of the function, separated by a dot. (This is exactly the "dot syntax" already described for objects, but here instead of <object>.<method> it means <module>.<function>.) If you try to leave it out, you get an error:

>>> import math
>>> sqrt(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'sqrt' is not defined

What the error message is really telling you is that you need to write math.sqrt, not just sqrt.

If you write math.sqrt but forget the import line it also fails:

>>> math.sqrt(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'math' is not defined

The error message lets us know that Python doesn’t recognize the name math until it’s been imported.

Names like math.sqrt are called qualified names; here, it means "the sqrt function that is defined in the math module". The reason Python’s import statement works this way is that it’s nice to have imported names "compartmentalized" inside their modules. This matters because other modules may define different sqrt functions. For instance, there is a module called cmath which provides math functions that work on complex numbers.

A complex number is a kind of number that contains a real and an imaginary part. The real part is just a real number. The imaginary part is a real number multiplied by the square root of -1, which is usually called i or j (Python calls it j). Python allows you to write literal complex numbers: ^[2]

>>> c = 1.0+2.0j
>>> c * c
(-3+4j)

We can import both modules at the same time and use both sqrt functions, because we have to qualify the names:

>>> import math
>>> math.sqrt(-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: math domain error
>>> import cmath
>>> cmath.sqrt(-1)
1j

The math.sqrt function doesn’t work on negative inputs (hence the error message) but the cmath.sqrt function works fine on negative inputs (giving the result 1j, not just j).

We can even import both modules on the same line:

import math, cmath

More generally, you can import as many modules as you like on one line by separating them with commas. On the other hand, if you have a lot of modules to import, it’s more readable to have each import on a separate line:

import math
import cmath

In fact, importing multiple modules on one line is considered poor Python style. ^[3]

The `from X import Y` syntax

Sometimes it’s annoying to have to qualify names. Maybe you are going to be using the sqrt function over and over again, and having to write math.sqrt each time seems too verbose (and maybe also too hard to read). There is a way to import a function that doesn’t require that the function’s name be qualified: you use the from X import Y syntax.

>>> from math import sqrt
>>> sqrt(2.0)
1.4142135623730951

You can even import multiple names:

>>> from math import sin, cos, pi
>>> sin(pi/2) + cos(pi/2)
1.0

Or you can go nuts and import every name in a module:

>>> from math import *
>>> sqrt(sin(pi/3))
0.9306048591020996

The asterisk character (*) is called "star" in this context (it’s not multiplication) and means "everything", so this will import every name in the module, which you can then use without having to write the module name as a qualifier (so sqrt and not math.sqrt). ^[4]

Beginning Python programmers love this form, because it’s very convenient to use. However, most of the time it’s also bad programming practice. The problem is that different modules will sometimes define the same name to mean different things:

>>> from math import sin
>>> sin(1.0)
0.8414709848078965
>>> from evil import sin
>>> sin('bear false witness')
'The check is in the mail!'

[The second example is made-up, of course.] If we use the from X import * syntax routinely, we can get name clashes, where you import the same name multiple times. In this case, Python will only allow you to use the last name that was imported.

>>> from math import *
>>> from evil import *

Now sin means evil.sin, and math.sin can’t be used.

Name clashes can lead to difficult-to-find bugs. Our advice is therefore:

Don’t use the from X import * syntax unless there is a really good reason to.

The `import X as Y` syntax

OK, so hopefully you’re convinced that indiscriminate use of the from X import * syntax is a bad idea. Still, if a module has a long name it’s really annoying to have to write it over and over as the first part of qualified names. Fortunately, Python provides another way to do this: you can rename a module when you import it using the import X as Y syntax. Most of the time, you rename a module to a much shorter name:

>>> import math as m
>>> m.sqrt(2.0)
1.4142135623730951

The benefit of this is that you’re protected from name clashes as long as you choose module names that are different. The downside of this is that m.sqrt is perhaps less readable than either math.sqrt or sqrt. Nevertheless, we think that it’s a good compromise.

Many popular Python libraries, such as NumPy and Pandas (for multidimensional data analysis), and matplotlib (for plotting and visualization) are normally used with the import X as Y syntax:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Modules are first-class

It might surprise you to know that modules are actually also Python objects! The technical way of saying this is that modules are "first-class" objects, but all that means is that they are objects like any other object. ^[5] Because of this, you can store modules in variables, you can pass them as arguments to functions, etc.

>>> import math
>>> m = math
>>> m.sqrt(2.0)
1.4142135623730951

So the import math as m syntax is exactly equivalent to import math followed by m = math.

Module documentation and the `help` function

Modules can contain a lot of different functions, values, etc. How do we learn about what’s in a module?

One way is to go to the Python web site and look in the library documentation. The place to go is https://docs.python.org/library. This is the preferred approach if the module is part of Python’s standard libraries, which will be the case for most of the modules we’ll use in this course.

Another way is to use the help function built in to Python. The help function can take a module, a function, or a class as its argument and will print out the documentation associated with that thing.

>>> import math
>>> help(math.sqrt)
Help on built-in function sqrt in module math:

sqrt(x, /)
    Return the square root of x.

(Don’t worry about the / in the sqrt(x, /) line. It’s a recent Python addition that means that the argument x is "positional only". sqrt only takes one argument.)

You can get documentation for entire modules, too:

>>> import math
>>> help(math)
Help on module math:

NAME
    math

MODULE REFERENCE
    https://docs.python.org/3.9/library/math
...

(This goes on for many pages.)

One thing that’s interesting is that the help function is not "magical" or "special syntax" in any way; it’s just a regular Python function. The reason it can take a function or a module as its argument follows from the fact that functions and modules are themselves Python objects. On the other hand, if you wanted help on (say) the def keyword in Python, this won’t work:

>>> help(def)
  File "<stdin>", line 1
    help(def)
         ^
SyntaxError: invalid syntax

Since def is a keyword and not a Python object, this doesn’t work. You have to go to the online documentation to learn everything about how def works. (Or just keep reading these readings )

The help function only works if the function or module argument is known to Python:

>>> help(math.sqrt)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'math' is not defined
>>> import math
>>> help(math.sqrt)
Help on built-in function sqrt in module math:

sqrt(x, /)
    Return the square root of x.

You can also write documentation for your own functions so that the help function will work on them. We’ll see how to do that in the next reading.

The help function can also be called with no arguments:

>>> help()
... [long welcome message] ...
help>

When invoked this way, help puts you into an interpreted environment which runs on top of the interactive Python interpreter (as is suggested by the help> prompt, which is distinct from the usual Python >>> prompt). Inside this prompt, you can enter anything you want help on: a module, a function, a value, etc. This even works if you haven’t loaded a module. Imagine you start the Python interpreter and don’t import any modules. This still works:

>>> help()
... [long welcome message] ...
help> math.sqrt
Help on built-in function sqrt in math:

math.sqrt = sqrt(x, /)
    Return the square root of x.

If you need help on multiple topics, calling help this way is often useful. To get out of the help system, type quit at the prompt.

Writing your own modules

Writing a Python module is very easy. You just need to do two things:

Write your code in a file.
Make sure that the name of the file ends in .py.

Also, by convention, module names should be short and consist of all lowercase letters and (if necessary) underscores. ^[6]

Let’s write a simple module called greetings which provides functions to print out greeting messages. The module’s file will be named greetings.py. Here’s our first attempt:

# Module: greetings
# Filename: greetings.py

def greet(name):
    print('Hi there, {}!'.format(name))

def insult(name):
    print('Get lost, {}!'.format(name))

Once this code is saved into the file greetings.py, you can import it like any other module.

>>> import greetings
>>> greetings.greet('El')
Hi there, El!
>>> greetings.insult('Mike')
Get lost, Mike!

Or, if you like, you can import it the other way:

>>> from greetings import *
>>> greet('El')
Hi there, El!

Or even:

>>> import greetings as g
>>> g.greet('El')
Hi there, El!

The point of all this is that writing normal Python code in a file is exactly the same as writing a Python module. So Python modules are incredibly easy to create and use.

In the next reading, we will see how to add documentation to our modules so that the help function can print it out when requested.

Some useful modules

There are lots of useful modules included with Python. Some of the ones we will be using include:

math (math functions)
cmath (complex number math functions)
string (functions on strings)
random (random number functions)
sys (system functions)
os (operating system specific functions)
re (regular expressions)
tkinter (simple graphical user interfaces)

There are many, many more useful modules besides these. Go to the Python module documentation for a full list. There are also many important and useful modules that aren’t included with Python. Some of these include:

numpy (numerical programming with multidimensional arrays)
scipy (scientific functions)
pandas (data analysis)
matplotlib (plotting and visualization tools)

These modules are used extensively by data scientists and are one of the main reasons that Python is so popular. ^[7]

[End of reading]

1. In programming languages, the integer "2" and the floating-point number "2.0" are different, because the way they are represented internally is different.

2. To be completely accurate, Python allows you to write literal imaginary numbers, which means a regular number with the suffix j. When you write 1.0+2.0j, it’s actually a Python addition expression: the real number 1.0 added to the imaginary number 2.0j.

3. According to PEP8, which is a set of standard coding style conventions for Python.

4. This use of * is called "globbing" and derives ultimately from Unix shell syntax. It’s also used in a somewhat similar way in the Java language.

5. Functions are also first-class Python objects, and you can pass functions as arguments to other functions. This turns out to be very powerful and leads to a style of programming called "functional programming" which we’ll look at toward the end of the course.

6. This is also part of the PEP8 style guidelines. We will see PEP8 again.

7. They are also very easy to install using the pip tool. For instance, to install numpy all you need to do is type pip install numpy at a terminal prompt.

CS 1 Reading 6: Modules

Overview

Topics

Modules

Module contents

Using modules

The import keyword and qualified names

The from X import Y syntax

The import X as Y syntax

Modules are first-class

Module documentation and the help function

Writing your own modules

Some useful modules

The `import` keyword and qualified names

The `from X import Y` syntax

The `import X as Y` syntax

Module documentation and the `help` function