Reading 6: Modules
It would appear that we have reached the limits of what it is possible to achieve with computer technology, although one should be careful with such statements, as they tend to sound pretty silly in 5 years.
- John von Neumann
Overview
In this reading we'll introduce Python modules. A module is a term we use for a reusable chunk of Python code, which means code that is intended to be used in multiple programs.
Topics
- Modules
- The
import
keyword - Writing your own modules
- Some useful modules
Modules
Some programs are meant to be stand-alone i.e. the code in the program is intended to be used only for that program. But it's not uncommon to write Python functions (and, as we'll see later, Python classes) that are more generally useful and can be used in multiple programs. In order to make it easy to use the same code in multiple programs, programming languages have developed the idea of a "module". A module is like a code "library" that you can "check out" and use in any program that needs it. (Modules are often informally called "libraries".)
Modules are important because the best code is code you don't have to write! If someone has already written the code that you need, tested it, debugged it, and put it into a module, it's usually better to use the module than rewrite the code yourself.
Module contents
A module can contain any kind of Python code whatsoever. Most of the time, though, a module will contain
- functions
- values (generally intended to be constants)
- classes
We'll talk about classes later in the course. For now, we're mainly interested in modules that contain functions we might want to use.
Using modules
When you think of code that is likely to be re-used in multiple programs,
one thing you might think of is common math functions,
like square root, sine, cosine, etc.
In this section, let's assume that we want to use these functions.
If there is a module that contains these functions,
we have to import the module,
which means to load the module into the computer's memory
so that we can use its functions.
In Python, there is a predefined module called math
which contains all the standard mathematical functions;
that's the module we'll be importing.
The import
keyword and qualified names
The standard way to import a module is to use the import
keyword:
Once you've done this,
you get access to all the functions in the math
module.
One of these is the square root function, called sqrt
in Python.
However, in order to use this function
we have to include the name of the module in the function call:
The sqrt
function always returns a floating-point
(approximate real number) value,
which is why math.sqrt(4)
returns 2.0
and not 2
.1
One thing to be aware of is that if you use import
this way,
you have to put the name of the module before the name of the function,
separated by a dot.
(This is exactly the "dot syntax" already described for objects,
but here instead of <object>.<method>
it means <module>.<function>
.)
If you try to leave it out, you get an error:
>>> import math
>>> sqrt(2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'sqrt' is not defined
What the error message is really telling you
is that you need to write math.sqrt
, not just sqrt
.
Note
If you write math.sqrt
but forget the import
line it also fails:
>>> math.sqrt(2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'math' is not defined
The error message lets us know that
Python doesn't recognize the name math
until it's been imported.
Names like math.sqrt
are called qualified names;
here, it means "the sqrt
function that is defined in the math
module".
The reason Python's import
statement works this way
is that it's nice to have imported names "compartmentalized"
inside their modules.
This matters because other modules may define different sqrt
functions.
For instance, there is a module called cmath
which provides math functions that work on complex numbers.
Note
A complex number is a kind of number
that contains a real and an imaginary part.
The real part is just a real number.
The imaginary part is a real number
multiplied by the square root of -1,
which is usually called i
or j
(Python calls it j
).
Python allows you to write literal complex numbers:2
We can import
both modules at the same time
and use both sqrt
functions,
because we have to qualify the names:
>>> import math
>>> math.sqrt(-1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: math domain error
>>> import cmath
>>> cmath.sqrt(-1)
1j
The math.sqrt
function doesn't work on negative inputs
(hence the error message)
but the cmath.sqrt
function works fine on negative inputs.
We can even import both modules on the same line:
More generally,
you can import as many modules as you like on one line
by separating them with commas.
On the other hand, if you have a lot of modules to import,
it's more readable to have each import
on a separate line:
In fact, importing multiple modules on one line is considered poor Python style.3
The from X import Y
syntax
Sometimes it's annoying to have to qualify names.
Maybe you are going to be using the sqrt
function over and over again,
and having to write math.sqrt
each time seems too verbose
(and maybe also too hard to read).
There is a way to import a function
that doesn't require that the function's name be qualified:
you use the from X import Y
syntax.
You can even import multiple names:
Or you can go nuts and import every name in a module:
The asterisk character (*
) is called "star" in this context
(it's not multiplication) and means "everything",
so this will import every name in the module,
which you can then use without having
to write the module name as a qualifier
(so sqrt
and not math.sqrt
).
Beginning Python programmers love this form, because it's very convenient to use. However, most of the time it's also bad programming practice. The problem is that different modules will sometimes define the same name to mean different things:
>>> from math import sin
>>> sin(1.0)
0.8414709848078965
>>> from evil import sin
>>> sin('bear false witness')
'The check is in the mail!'
(The second example is made-up, of course.)
If we use the from X import *
syntax routinely,
we can get name clashes,
where you import the same name multiple times.
In this case, Python will only allow you to use
the last name that was imported.
Now sin
means evil.sin
, and math.sin
can't be used.
Name clashes can lead to difficult-to-find bugs. Our advice is therefore:
Tip
Don't use the from X import *
syntax
unless there is a really good reason to.
If you think you want to use this form,
try using from X import Y
i.e. specify
exactly which name(s) you want to import.
That's much less likely to cause problems,
and if a name clash happens, it will probably be easier to spot.
The import X as Y
syntax
OK, so hopefully you're convinced that
indiscriminate use of the from X import *
syntax is a bad idea.
Still, if a module has a long name
it's really annoying to have to write it over and over
as the first part of qualified names.
Fortunately, Python provides another way to do this:
you can rename a module when you import it
using the import X as Y
syntax.
Most of the time, you rename a module to a much shorter name:
The benefit of this is that you're protected from name clashes
as long as you choose module names that are different.
The downside of this is that m.sqrt
is perhaps less readable than either math.sqrt
or sqrt
.
Nevertheless, we think that it's a good compromise.
Note
Many popular Python libraries,
such as NumPy
and Pandas
(for multidimensional data analysis),
and matplotlib
(for plotting and visualization)
are normally used with the import X as Y
syntax:
Modules are first-class
It might surprise you to know that modules are actually also Python objects! The technical way of saying this is that modules are "first-class" objects, but all that means is that they are objects like any other object.4 Because of this, you can store modules in variables, you can pass them as arguments to functions, etc.
So the import math as m
syntax
is exactly equivalent to import math
followed by m = math
.
Module documentation and the help
function
Modules can contain a lot of different functions, values, etc. How do we learn about what's in a module?
One way is to go to the Python web site and look in the library documentation. The place to go is https://docs.python.org/library. This is the preferred approach if the module is part of Python's standard libraries, which will be the case for most of the modules we'll use in this course.
Another way is to use the help
function built in to Python.
The help function can take a module, a function,
or a class as its argument
and will print out the documentation associated with that thing.
>>> import math
>>> help(math.sqrt)
Help on built-in function sqrt in module math:
sqrt(x, /)
Return the square root of x.
Note
Don't worry about the /
in the sqrt(x, /)
line.
It means that the argument x
is "positional only".
This will mean more to you
when we talk about "keyword arguments" in later readings.
You can get documentation for entire modules, too:
>>> import math
>>> help(math)
Help on module math:
NAME
math
MODULE REFERENCE
https://docs.python.org/3.12/library/math.html
...
(This goes on for many pages.)
One thing that's interesting is that
the help
function is not "magical" in any way;
it's just a regular Python function.
The reason it can take a function or a module as its argument
follows from the fact that functions and modules
are themselves Python objects.
On the other hand, if you wanted help on (say)
the def
keyword in Python, this won't work:
Since def
is a keyword and not a Python object, this doesn't work.
You have to go to the online documentation
to learn everything about how def
works.
The help
function only works
if the function or module argument is known to Python:
>>> help(math.sqrt)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'math' is not defined
>>> import math
>>> help(math.sqrt)
Help on built-in function sqrt in module math:
sqrt(x, /)
Return the square root of x.
You can also write documentation for your own functions
so that the help
function will work on them.
We'll see how to do that in the next reading.
Note
The help
function can also be called with no arguments:
When invoked this way,
help
puts you into an interpreted environment
which runs on top of the interactive Python interpreter
(as is suggested by the help>
prompt,
which is distinct from the usual Python >>>
prompt).
Inside this prompt,
you can enter anything you want help on:
a module, a function, a value, etc.
This even works if you haven't loaded a module.
Imagine you start the Python interpreter and don't import any modules.
This still works:
>>> help()
... [long welcome message] ...
help> math.sqrt
Help on built-in function sqrt in math:
math.sqrt = sqrt(x, /)
Return the square root of x.
If you need help on multiple topics,
calling help
this way is often useful.
To get out of the help
system, type quit
at the prompt.
Writing your own modules
Writing a Python module is very easy. You just need to do two things:
- Write your code in a file. (Make sure it's a plain text file.)
- Make sure that the name of the file ends in
.py
.
Also, by convention, module names should be short and consist of all lowercase letters and (if necessary) underscores.5
Let's write a simple module called greetings
which provides functions to print out greeting messages.
The module's file will be named greetings.py
.
Here's our first attempt:
# Module: greetings
# Filename: greetings.py
def greet(name):
print('Hi there, {}!'.format(name))
def insult(name):
print('Get lost, {}!'.format(name))
Once this code is saved into the file greetings.py
,
you can import it like any other module.
>>> import greetings
>>> greetings.greet('Adam')
Hi there, Adam!
>>> greetings.insult('Mike')
Get lost, Mike!
Or, if you like, you can import it the other way:
Or even:
The point of all this is that writing normal Python code in a file is exactly the same as writing a Python module. So Python modules are incredibly easy to create and use.
Note
In the next reading,
we will see how to add documentation to our modules
so that the help
function can print it out when requested.
Some useful modules
There are lots of useful modules included with Python. Some commonly-used ones include:
math
(math functions)cmath
(complex number math functions)string
(functions on strings)random
(random number functions)sys
(system functions)os
(operating system specific functions)re
(regular expressions)
There are many, many more useful modules besides these. Go to the Python module documentation for a full list. There are also many important and useful modules that aren't included with Python. Some of these include:
numpy
(numerical programming with multidimensional arrays)scipy
(scientific functions)pandas
(data frames and data analysis)matplotlib
(plotting and visualization tools)
These modules are used extensively by data scientists and are one of the main reasons that Python is so popular.6
-
In programming languages, the integer
2
and the floating-point number2.0
are different, because the way they are represented internally is different. ↩ -
To be completely accurate, Python allows you to write literal imaginary numbers, which means a regular number with the suffix
j
. When you write1.0+2.0j
, it's actually a Python addition expression: the real number1.0
added to the imaginary number2.0j
. ↩ -
According to PEP8, which is a set of standard coding style conventions for Python. ↩
-
Functions are also first-class Python objects, and you can pass functions as arguments to other functions. This turns out to be very powerful and leads to a style of programming called "functional programming". Functional programming is the topic of CS 4 and CS 115. ↩
-
This is also part of the PEP8 style guidelines. We will see PEP8 again. ↩
-
They are also very easy to install using the
pip
tool. For instance, to installnumpy
all you need to do is typepip install numpy
at a terminal prompt. ↩