Reading 7: Documenting your code
The protean nature of the computer is such that it can act like a machine or like a language to be shaped and exploited.
- Alan Kay
Overview
Documenting your code means writing comments and explanations to make your code easier for other programmers to understand. Documentation doesn't affect the way your code runs, but it's still extremely important. Many beginning programmers comment sparsely or not at all, but when you write programs as part of a team (which is what most professional programmers do), good documentation is essential and can't be neglected. Remember: programs are not just written to make the computer understand what to do; they also must be able to communicate to other programmers what you intended. Those other programmers may be in charge of maintaining or extending your code, and you need to make sure that they understand the code they're working on.
There are two ways to document code. The first way is using plain old comments. The second way uses Python docstrings, which is the preferred way to add documentation to your Python functions and modules.
Topics
-
Commenting and comment style
-
Docstrings
Comments
Syntax and style
We've already seen that in Python,
comments begin with a pound sign (#
)1
and go to the end of the line.
When you write a comment,
please include at least one space after the #
character.
This is good:
and this is bad:
This last comment is perfectly legal in Python, but it's hard to read.
Note
There are a lot of "style guidelines" like this that will help you write more readable code. The definitive set of style guidelines for Python is "PEP8", which you can read here. However, PEP8 is a bit overwhelming for new programmers, so we'll try to point out the most important style guidelines as we come to them.2 If you're ever confused about how to format your code, you can consult PEP8.
You can put comments after a line of code on the same line:
This is OK as long as the line doesn't get too long. Lines shouldn't be longer than about 80 characters, because otherwise they are hard to read.3 If a comment can't easily fit on the end of a line, put it on the line before the line being commented on, not after. So this is good:4
and this is bad:
Multiline comments?
Many computer languages (like Java and C++) also have special symbols for multiline comments (comments that span multiple lines), but Python does not. The closest Python comes to multiline comments are docstrings, which we discuss in the next section, but technically docstrings are not comments.
If you do want a multiline comment, what you need to do is to make a bunch of single-line comments one after the other, like this:
Most text editors with Python support have some way of selecting multiple lines and commenting them all out.
When to use comments
Comments are mainly used inside of functions to explain what some code does if its meaning isn't obvious. Don't use a comment to restate something obvious about the code. However, if you are doing something clever or tricky, or even something you think that someone reading the code might not understand, adding a comment is a good idea.
Here's an example of bad and good comments stolen from PEP8. This is bad:
Here, you are restating something obvious about the code, so the comment is completely unnecessary.
This is good:
Not knowing what the context of the program is, I don't know what "Compensate for border" might mean, but at least it isn't something that is obvious from the code.
An example of tricky code that might deserve a comment is this:
Hopefully you won't be writing much code like this.
Note that if you never comment your code, or if you comment poorly, we will take marks off of your assignments.
Docstrings
Traditionally, one of the most common places to put a comment is right before a function, so you can tell the reader what the function does:
The only problem with this kind of comment
is that Python's help
function can't use them.
Once Python reads the comment, it discards it and it's gone.
It would be great if there were a way to write a "comment"
that would somehow be usable by Python's help
function.
And there is such a way!
It's called a docstring, which is short for "documentation string".
Docstring syntax
A docsting is just a regular Python string which happens to be the first thing inside:
- a function body
- a module
- a class (covered later in the course)
When you execute a docstring, the docstring itself doesn't do anything. However, Python "knows" about docstrings, so when it sees a string in one of those places it stores it as part of the function (or module, or class). Then it can use it later, as we'll see.
Docstrings are normally written as triple-quoted strings, because very often the information you want to write spans more than one string. Python does not require docstrings to be triple-quoted, but the PEP8 style guidelines do.
Function docstrings
Here's our previous example with a docstring instead of a comment:
Notice that even though this is a one-line docstring,
we wrote it with triple quotes.
That way, if we ever want to add more information
and it needs to span more lines we don't have to change the quotes.
We strongly recommend that you use triple-quoted strings for all docstrings.
We also recommend that you use
three double-quote characters ("""
)
not three single-quote characters ('''
),
since this is the PEP8 convention.
Let's assume we wrote the greet
function in a file called greetings.py
.
That makes it a Python module, so inside the Python interpreter,
we can import
it and run it:
as expected.
Here's the good part: we can also get help
on the function:
>>> help(greetings.greet)
Help on function greet in module greetings:
greet(name)
Print out a hearty greeting.
Note
Python actually uses what's called a "pager"
to print out the help message.
This allows you to read very long help messages a page at a time.
Hit the space bar to advance to the next page if there is one,
and type q
to dismiss the help message.
Notice that the help message for the greet
function
is exactly the docstring that you wrote.
Whatever you write in the docstring
immediately becomes available to the help
function.
Module docstrings
Let's look at the entire module (in the file greetings.py
),
with docstrings for each function:
# Module: greetings
# Filename: greetings.py
def greet(name):
"""Print out a hearty greeting."""
print('Hi there, {}!'.format(name))
def insult(name):
"""Print out a nasty insult."""
print('Get lost, {}!'.format(name))
It looks good, but the comments at the beginning of the module seem out of place. In addition, they are completely unnecessary, so we'll start by fixing that:
# This module contains functions to print out various kinds of
# greeting messages.
def greet(name):
"""Print out a hearty greeting."""
print('Hi there, {}!'.format(name))
def insult(name):
"""Print out a nasty insult."""
print('Get lost, {}!'.format(name))
That's a better comment, but the help
function can't use it.
So now we'll turn the comment into a module docstring:
"""
This module contains functions to print out various kinds of
greeting messages.
"""
def greet(name):
"""Print out a hearty greeting."""
print('Hi there, {}!'.format(name))
def insult(name):
"""Print out a nasty insult."""
print('Get lost, {}!'.format(name))
Now it's looking professional. Let's fire up the interpreter:
>>> import greetings
>>> help(greetings)
Help on module greetings:
NAME
greetings
DESCRIPTION
This module contains functions to print out various kinds of
greeting messages.
FUNCTIONS
greet(name)
Print out a hearty greeting.
insult(name)
Print out a nasty insult.
FILE
/home/mvanier/cs1/greetings.py
This gives us all the information about the module we might want (assuming we've written good docstrings).
What to put in docstrings
OK, that's the "how" of docstrings. More interesting is the "what" of docstrings i.e. "what should I write in a docstring?"
The first rule is write docstrings for every function and every module you create. Don't skip writing a docstring because you think the function is obvious; that may be true, but docstrings can also be used to generate external documentation for a module (like a web page), and in that case, you want somebody reading the documentation to know about all the functions in the module.5
Docstrings are good documentation
- for you
- for you in the future (when you've forgotten what your code does)
- for anyone else reading or using your code
The most important thing that needs to be in docstrings is information that will let people understand how to correctly use the code.
In a function docstring, you should describe:
- what the function does (not how it does it)
- what the function arguments mean (ideally one line per argument)
- what the function returns
- any other effect that the function has when run (e.g. printing to the terminal, writing to a file, etc.)
What you shouldn't put in the docstring is a description of how the function works. That should go inside the function body in comments, and may be omitted entirely if the code is simple or "obvious". I generally want to write comments in a function when the code is tricky or is doing something unintuitive. Of course, that's a judgment call.
Warning
Another thing you should never do is copy the problem description verbatim into a docstring. This is sometimes done by students who are told to write docstrings but are too lazy or rushed to write a real one. If you do this, expect to get no credit for the docstring.
If we use these guidelines ultra-strictly, our docstrings for our functions might look like this:
"""
This module contains functions to print out various kinds of
greeting messages.
"""
def greet(name):
"""Print out a hearty greeting.
Arguments:
name : the name to print with the greeting
Return value: none
Side effects: the greeting gets printed to the terminal.
"""
print('Hi there, {}!'.format(name))
def insult(name):
"""Print out a nasty insult.
Arguments:
name : the name to print with the insult
Return value: none
Side effects: the insult gets printed to the terminal.
"""
print('Get lost, {}!'.format(name))
However, not every function needs this level of detail. Very simple functions (like these) can get by with simpler docstrings, as long as they explain what the function does.
Note
In fact, for functions this simple, docstrings that are this detailed are overkill. It's possible to get obsessive about documentation; try to resist this tendency. Too much documentation is better than too little, but try to find the right level of detail.
In a module docstring, you should describe:
- the purpose of the module
- a general description of the kinds of functions in the module
What you shouldn't put in the docstring is a detailed description of any of the functions. That should go into the function docstrings, and there is no need to repeat that information.
The __main__
module
Let's try a dumb experiment: writing a function with a docstring in the Python interpreter. (It's dumb because functions written directly in the interpreter are throwaway functions because they disappear when you quit the interpreter.)
>>> def double(x):
... """This function returns twice the value of the argument `x`."""
... return 2 * x
Now let's get some help
on it:
>>> help(double)
Help on function double in module __main__:
double(x)
This function returns twice the value of the argument `x`.
What does this module __main__
thing referred to mean?
module __main__
is the name that Python gives to a module which is either
- the interactive interpreter
- the module that was directly invoked by Python (if you're running Python on a file)
All other modules are referred to by their own names.
Here, module __main__
refers to the interactive interpreter.
But if we ran Python on a file of code,
it would refer to the file's module.
Consider a file test.py
containing this code:
"""
Test module.
"""
def double(x):
"""This function returns twice the value of the argument `x`."""
return 2 * x
help(double)
When you run this from the terminal, you get this output:
$ python test.py
Help on function double in module __main__:
double(x)
This function returns twice the value of the argument `x`.
Notice that it says module __main__
, not module test
.
The __name__
special variable
Python stores the name of the currently-executing module
in a special variable called __name__
.
Let's change the file test.py
just a bit:
"""
Test module.
"""
def double(x):
"""This function returns twice the value of the argument `x`."""
return 2 * x
print('My name is: {}'.format(__name__))
Let's run it as a standalone file from the terminal:
Now let's import
it from the interactive interpreter:
We'll see a good use for the __name__
variable in a future reading.
Note
We haven't mentioned this before,
but Python can run arbitrary code when you import a module.
Most of the time, the only code in a module are function definitions
(or maybe class or variable definitions)
which don't produce any output when run.
But if there is e.g. a print
statement
in the module at the top level,
it gets run when the module is imported and you will see the result.
(But if you try to import it a second time,
Python doesn't reload it and you don't see the result.
Importing is not the same as reloading.6)
Improving our docstrings
The docstrings we've written above are pretty uninspiring.
We should take our own advice!
So let's improve the test.py
file by writing better docstrings.
"""
This module is a test of Python's docstring functionality.
"""
def double(x):
"""
This function returns twice the value of the argument `x`.
Arguments:
x : an integer
Return value: an integer
"""
return 2 * x
Here, we've split up the function docstring into multiple lines and given a general statement about what the function does, followed by a description of the argument and the return value.
We aren't going to insist on this format for every single function you write, but it's definitely appropriate for longer functions (say, more than ten lines).
-
#
is also sometimes called a "hash sign". ↩ -
The "PEP" in "PEP8" means "Python Enhancement Proposal" and is mainly used for proposing and discussing possible improvements to the language. ↩
-
PEP8 says they should be no more than 79 characters. ↩
-
When referring to a Python variable name in a comment, we surround it with backticks, so we write
`a`
and not justa
. You don't have to do this, but it helps to make it clear that this is a Python name and not a regular word. This convention comes from Markdown, which is a syntax used for writing documents that get converted into web pages. ↩ -
The only exception might be for a function which is never intended to be used outside of the module. But for this course, write docstrings for all functions. ↩
-
There is a way to reload a module, but you don't need to know about it yet, or probably ever. ↩