Reading 17: Exception handling, part 2
Nature provides exceptions to every rule.
- Margaret Fuller
Overview
In this reading we'll continue our discussion of Python's exception handling facilities, focusing on what happens when an exception is raised in one function and caught in another function. Along the way, we will introduce the runtime stack, which is an important feature of programming language implementations. Understanding the runtime stack will help you understand exception handling as well as other aspects of programming (such as recursion).
Topics
- "Remote" exception handling
- The runtime stack
- Frames and tracebacks
Error scenarios
In the last reading, we saw this function:
def sum_numbers_in_file(filename):
    sum_nums = 0
    file = open(filename, "r")
    for line in file:
        line = line.strip()
        sum_nums += int(line)
    file.close()
    return sum_nums
The purpose of sum_numbers_in_file is to take a filename and return a number: the sum of the numbers in that file. The file should exist and contain only integers (one per line).
However, errors can happen:
- the file may not exist!
- the file may contain lines with non-numbers
- the file may contain lines with multiple numbers
- the file may have blank lines
Let's explore these cases one by one.
Case 1: The file doesn't exist
What if we call the function on a file that doesn't exist?
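A minimal sketch of this case is shown below. The path is built inside a fresh temporary directory so that the file is guaranteed not to exist; the name nums.data and the variable names are assumptions made for the illustration.

```python
import os
import tempfile


def sum_numbers_in_file(filename):
    sum_nums = 0
    file = open(filename, "r")
    for line in file:
        line = line.strip()
        sum_nums += int(line)
    file.close()
    return sum_nums


# A new temporary directory is empty, so this path cannot exist.
with tempfile.TemporaryDirectory() as folder:
    missing = os.path.join(folder, "nums.data")
    try:
        sum_numbers_in_file(missing)
        error_name = None
    except FileNotFoundError as e:
        # open() raised the exception before any line was read.
        error_name = type(e).__name__
        print("{}: {}".format(error_name, e))
```

The exception comes from the call to open, so none of the code after that line in sum_numbers_in_file gets to run.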
Case 2: A line contains a non-number
What if a line in the file contains something other than a number? For instance, imagine this is one section of a file being read:
Let's run this file through the function:
The error comes when trying to execute this line:
    sum_nums += int(line)
where line is "thirty-five". The int function can't convert the string "thirty-five" to an actual integer (it's not that smart!).
Here, the exception class is ValueError, and the data associated with the exception is the string
invalid literal for int() with base 10: 'thirty-five'
(in other words, the error message). This message gives more information about what caused the exception.
So any time you attempt to convert a line to an integer using the int function, Python will raise a ValueError exception if the line can't be converted to an int.
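The exception class and its message can be inspected directly by catching the exception. This is only a small sketch; the variable names are chosen for the illustration.

```python
try:
    int("thirty-five")
    exception_class = None
    message = None
except ValueError as e:
    exception_class = type(e).__name__  # the exception class
    message = str(e)                    # the data carried by the exception
    print(exception_class)
    print(message)
```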
Case 3: The file contains blank lines
For another example of this, if we read in a blank line, it would be read in as the string "\n", and then the strip method will remove the newline character to yield the empty string "". If we give that to the int function, it fails with the message invalid literal for int() with base 10: ''.
Again, we get a ValueError. In general, a ValueError means that a function got the correct type of input (here, a string), but an incorrect input value (here, a string which can't be converted to an int).
Case 4: There are multiple numbers on a line
Assume that an input line is e.g. 1 2 3, all on one line. The sum_numbers_in_file function will eventually have to compute int("1 2 3").
Once again, we get the same exception, because our function is trying to convert the entire string to an int (after stripping off whitespace from both sides of the string). Since the string "1 2 3" doesn't represent a single number, we get a ValueError again.
In all of these cases, Python couldn't convert the contents of the line into an integer that could be added to the sum sum_nums.
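The three ValueError cases can be checked side by side. The helper try_convert below is hypothetical (it is not part of the reading's code); it applies the same strip-then-int steps as sum_numbers_in_file to one raw line at a time.

```python
def try_convert(raw_line):
    """Return the int value of a line, or the string "ValueError" on failure."""
    line = raw_line.strip()
    try:
        return int(line)
    except ValueError:
        return "ValueError"


# One good line, then a non-number, a blank line, and multiple numbers.
raw_lines = ["35\n", "thirty-five\n", "\n", "1 2 3\n"]
results = [try_convert(raw) for raw in raw_lines]
for raw, result in zip(raw_lines, results):
    print(repr(raw), "->", result)
```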
Summary
To sum up, the function sum_numbers_in_file:
- takes in an input argument, representing a file of numbers, one per line
- outputs the sum of all the numbers in the file
- may raise a FileNotFoundError if the file doesn't exist
- may raise a ValueError if a line in the file isn't formatted correctly
So we need to be able to handle two different kinds of error situations as well as the normal case!
Handling the exceptions
Previously, we've seen how to handle exceptions using try and except:
- We put code that can raise exceptions inside a try block.
- We handle the exceptions inside an except block.
We can have multiple except blocks, one per exception class to be handled. For instance, we can have one for FileNotFoundError and one for ValueError.
Let's rewrite our code accordingly.
def sum_numbers_in_file(filename):
    try:
        sum_nums = 0
        file = open(filename, "r")
        for line in file:
            line = line.strip()
            sum_nums += int(line)
        file.close()
        return sum_nums
    except FileNotFoundError:
        # What should we do here?
        pass  # placeholder so that the block is valid Python
    except ValueError:
        # What should we do here?
        pass  # placeholder so that the block is valid Python
Design decisions
What code do we put in the except blocks? There are multiple ways to handle any particular error.
For a FileNotFoundError, we could:
- interactively prompt the user for a different file name, or
- immediately return 0 since the file name is no good, or
- abort entirely (don't handle the exception) and let the program terminate
For a ValueError, we could:
- assume the line is no good, just use 0 as the number and keep going, or
- assume the entire file is corrupt, return 0 from the function, or
- abort entirely (don't handle the exception) and let the program terminate
There are a lot of choices to make! (9 different possible combinations of exception handling strategies in one little function!) And one further question: Should this function be in charge of deciding what happens in the event of an error?
One can imagine situations in which every possible way of handling errors would be appropriate. For instance:
- If we are reading in a small number of files, it might be OK to prompt the user for a new filename in case the filename given doesn't exist.
- Conversely, if we are trying to read a large number of files, this would not be appropriate.
Case 1 (interactive)
In this situation, we are using the function interactively from the Python shell:
If there is no file named nums.data, it's reasonable to ask the user for another filename. And then the program continues, using mynums.data as the filename to read from.
Case 2 (non-interactive)
In this situation, we are using the function from inside a larger Python program which reads from a large number of files:
filenames = []
for i in range(1, 1001):  # 1000 files to read in
    filenames.append("day{}.data".format(i))
# Now filenames is ["day1.data", "day2.data", ...]
total = 0
for name in filenames:
    total += sum_numbers_in_file(name)
Here, it's better to just ignore missing files (maybe printing an error message when that happens) and continue reading.
The point
Even though you can handle exceptions inside the function where they are raised, it's often better not to. Instead, you can handle the exceptions in the function which called the function where the exceptions might be raised. Then, the calling function could decide how to handle the exceptions based on its needs. This is a very flexible approach, as we'll see.
Case 1 again
When running sum_numbers_in_file interactively, we would like to say:
- If sum_numbers_in_file raises a FileNotFoundError exception, tell the user the file doesn't exist, prompt the user for a different filename, and try again.
- If sum_numbers_in_file raises a ValueError exception, tell the user the file has invalid lines, prompt the user for a different filename, and try again.
Case 2 again
When running sum_numbers_in_file on a large number of filenames, we would like to say:
- If sum_numbers_in_file raises a FileNotFoundError exception, assume the file doesn't exist and keep going with the next file.
- If sum_numbers_in_file raises a ValueError exception, assume the file is invalid and keep going with the next file.
The runtime stack
Python (like all languages that have exception handling) allows you to catch exceptions outside of the function in which the exception was raised. But to understand how this works, first we have to understand something called the runtime stack. This will involve some details about how Python works internally. Trust us, it's necessary knowledge!
Functions calling functions
Consider these functions:
def func1(x, y):
    z = func2(x * x, y * y)
    return x + y + z

def func2(a, b):
    return (func3(a) + func3(b))

def func3(n):
    return (3 * n + 10)
Here, we can see that func1 calls func2, which in turn calls func3.
Notice that when func1 calls func2:
- func1 has to wait for func2 to return its return value back to it before func1 can complete
- While func2 is running, the values of x and y in func1 have to be stored somewhere in memory, to be used again once func2 returns
Where are x and y from func1 stored while func2 is running?
Answer: the values x and y from func1 are stored in an internal Python data structure called the runtime stack (or "the stack" for short). But to talk about the runtime stack, we first have to talk about frames.
Frames
Every time a function is called, a data structure is created which holds:
- the values of all of the arguments of the function
- the values of all of its local variables.
For func1, this includes the function's arguments x and y, as well as the local variable z. This data structure is called a frame. It's kind of like a dictionary in that it maps names (like x, y, z) to their current values in this particular call of the function.
Frames are temporary data structures. They are created when a function is called, and they go away when the function returns. They are used by Python to look up the current value of any local variable or function argument while evaluating the body of a function.
If we called func1
like this:
then the frame for this function call would contain these mappings:
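For instance, taking func1(2, 3) as an illustrative call (the arguments are an assumption made for this sketch), the contents of the frame can be viewed with Python's built-in locals function, which returns the names in the current frame as a dictionary. The snapshots list is a hypothetical helper used only to record what locals reports.

```python
snapshots = []  # hypothetical helper: records what the frame contained


def func3(n):
    return (3 * n + 10)


def func2(a, b):
    return (func3(a) + func3(b))


def func1(x, y):
    z = func2(x * x, y * y)
    # Copy the name-to-value mappings of func1's frame at this point.
    snapshots.append(dict(locals()))
    return x + y + z


result = func1(2, 3)
print(snapshots[0])
print(result)
```

The snapshot is taken after z has been assigned, so it contains all three names. Before func2 returns, the frame would only hold x and y.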
What happens when one function (like func1) calls another function (like func2)? The first function hasn't returned yet, so its frame still exists. The second function gets a frame too, as soon as it gets called. How do we handle the two frames?
The frames from the different functions are kept on a stack of frames:
- Every time a function is called, a new frame is created and pushed onto the stack.
- Every time a function returns, its frame is popped off the stack.
This stack of frames is called the runtime stack, because it's a stack which is used while the program is running.
Frames also keep track of the exact location in the code where a function (like func1) called another function (like func2), so that when func2 returns, Python knows where to continue in func1. In this case, when func2 returns, we assign the return value of func2 to z and continue evaluating the body of func1.
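The standard library's inspect module can show which frames are on the stack at a given moment. The sketch below records the function names seen from inside func3; the call func1(2, 3) and the observed list are assumptions made for the illustration, and the list is trimmed at func1 because whatever sits below it depends on how the code is run.

```python
import inspect

observed = []  # hypothetical helper: one entry per call to func3


def func3(n):
    # inspect.stack() lists the frames on the stack, most recent first.
    names = [info.function for info in inspect.stack()]
    # Keep only the part of the stack from func3 down to func1.
    observed.append(names[:names.index("func1") + 1])
    return (3 * n + 10)


def func2(a, b):
    return (func3(a) + func3(b))


def func1(x, y):
    z = func2(x * x, y * y)
    return x + y + z


result = func1(2, 3)
for names in observed:
    print(names)
```

Both calls to func3 see the same three frames, with func3 on top and func1 at the bottom, because func1 and func2 are still waiting for their callees to return.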
A simple example
Recall:
def func3(n):
    return (3 * n + 10)
When we call this function directly:
func3(10)
then a frame is created which has only one entry: n bound to 10. This frame is pushed onto the runtime stack.
func3 computes its return value (40) and then returns to its caller. The frame is popped off the runtime stack, because it's not needed anymore. The stack is now empty.
A more complicated example:
Recall:
def func2(a, b):
    return (func3(a) + func3(b))

def func3(n):
    return (3 * n + 10)
Let's do this:
func2(10, 20)
This creates a frame with a bound to 10 and b bound to 20. This frame is pushed onto the stack.
Now func2 calls func3(a), which is func3(10) because a has the value 10 in the frame for func2. This causes a new frame to be created for func3, with n bound to 10. This frame gets pushed onto the stack above func2's frame.
Now func3(10) evaluates to 40. This value is returned to func2, and the frame for func3 is removed ("popped") from the stack.
Now func3(b) is called. b is 20, so func3(20) is what's really called. A new frame for func3 is created with n bound to 20, and pushed onto the stack above func2's frame. Then the body of func3 is evaluated. The stack now holds two frames: func2's frame at the bottom and func3's frame on top.
func3(20) evaluates to 70. This value is returned to func2, and the frame for func3 is removed ("popped") from the stack.
Now that func2 has the results of the two function calls to func3 (40 and 70, respectively), it can add them up and return the result, which is 110.
Then the frame for func2 is removed, and now the stack is empty.
This is how function calls work in Python (and in almost every computer language).
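The pushes and pops in the walkthrough above can be modelled with an ordinary Python list. This is only a simulation written for illustration: the stack, history, push, and pop names are hypothetical, and Python's real runtime stack is managed by the interpreter, not by code like this.

```python
stack = []    # our model of the runtime stack: a list of (name, frame) pairs
history = []  # the function names on the stack after every push and pop


def push(name, frame):
    stack.append((name, frame))
    history.append([n for n, _ in stack])


def pop():
    stack.pop()
    history.append([n for n, _ in stack])


def func3(n):
    push("func3", {"n": n})  # a frame is created when the call starts
    result = 3 * n + 10
    pop()                    # and removed when the function returns
    return result


def func2(a, b):
    push("func2", {"a": a, "b": b})
    result = func3(a) + func3(b)
    pop()
    return result


total = func2(10, 20)
for snapshot in history:
    print(snapshot)
print(total)
```

The printed history shows func2's frame staying at the bottom while a func3 frame is pushed and popped twice, and the last snapshot is the empty stack.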
Note
You might wonder where the func3 return values 40 and 70 are stored while the computation is taking place. Different languages handle this differently, and we don't want to get into a long discussion of Python internals here. However, if you are interested in how programming languages are implemented, consider taking CS 131 (Interpreters) and CS 164 (Compilers) to learn all about this and many other features of programming language implementation, which is a fascinating subject!
Summary
Frames store the values of function arguments and local variables. Frames are pushed onto and popped off of the runtime stack. The top of the runtime stack contains the frame for the currently executing function. When one function calls another function, which calls another function, etc., the stack grows, with one stack frame per function call. As the function calls complete their work and return, stack frames get popped off the stack. Once all the stack frames have been removed, the computation is finished.
Exceptions and the runtime stack
How does any of this relate to exception handling? Recall:
- We have a function called sum_numbers_in_file.
- It takes a filename, opens a file, reads and sums up all the numbers in the file, and returns the sum.
- It raises a FileNotFoundError exception if the file corresponding to the filename doesn't exist.
- It raises a ValueError exception if the file contains lines without numbers, or with more than one number per line.
- We don't want to handle these exceptions in the function itself, since there are too many different ways to do it.
Because of the last point, we don't have a try block inside the sum_numbers_in_file function. Instead, we will have a try block inside the function that calls the sum_numbers_in_file function. (The try block doesn't have to be inside a function, but it usually is.)
Let's write two different functions that call sum_numbers_in_file and handle the possible exceptions.
Example 1: sum_numbers_in_file_interactive
This function will try to run sum_numbers_in_file on a single filename. If an exception happens, it will prompt for a new filename and will try again. If no exception happens, it will return the sum of the numbers in the file. This version of the function is intended for interactive use, so we'll call it sum_numbers_in_file_interactive:
def sum_numbers_in_file_interactive(filename):
    while True:
        try:
            total = sum_numbers_in_file(filename)
            return total
        except FileNotFoundError:
            print("Couldn't read file: {}".format(filename))
            filename = input("Enter new filename: ")
        except ValueError:
            print("Invalid line in file: {}".format(filename))
            filename = input("Enter new filename: ")
In this function, sum_numbers_in_file is called inside a try block. If an exception is raised during the execution of sum_numbers_in_file, it will not be caught in that function. Instead, the exception will go back down in the runtime stack to the function that called it (sum_numbers_in_file_interactive), and it will get handled inside one of the except blocks of this function.
The function sum_numbers_in_file_interactive will continue to run sum_numbers_in_file on every filename supplied by the user until a valid filename is given (i.e. a filename for which a sum can be computed). Once a sum is computed, the function will return, which breaks us out of the while True: loop.
Unwinding the stack
If an exception can be handled inside a try/except statement in the function where it was raised, it will be, and the runtime stack doesn't matter. The runtime stack is important when dealing with exceptions that are not caught inside the function in which they were raised. These exceptions "propagate back" in the runtime stack to the function that called the function where the exception was raised, i.e. they pop the stack and go down one stack frame (like a special kind of return but without a return value), looking for an except block that can handle the exception.
If the exception is not handled in the calling function either, it will propagate back again by popping one more stack frame, looking for an except block that can catch it. It will keep doing this until either
- it finds a suitable except block, in which case it will execute the code in that block, or
- it reaches the top level of the program (no more stack frames), in which case the program will halt and a traceback will be printed.
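What is a "suitable except block"? It's an except block labeled with the same name as the exception that was raised (or, as we'll see when we discuss exceptions and classes, with the name of one of its base classes). So if a ValueError is raised, a suitable except block is one that starts with except ValueError: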
This process, by which exceptions propagate back down the stack, looking for a suitable except block, is called unwinding the stack. Once the exception has left a function's frame, that frame is popped off the stack, and that function call is over (just as if the function had returned). Exceptions continue to unwind the stack until they are caught (by a suitable except block) or until they reach the top level of the program.
If the exception goes all the way to the top level (no more stack frames to pop), then it halts the program and prints a traceback. A traceback is a record of exactly where the program was when the exception was raised, including the position in the file of every function call corresponding to each frame that got popped off the stack.
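Unwinding can be watched in a small sketch. The functions inner, middle, and outer and the events list are hypothetical names used only for this illustration: inner raises a ValueError, middle has an except block that is not suitable for it, and outer has one that is.

```python
events = []  # hypothetical helper: records what actually ran


def inner():
    events.append("inner: raising ValueError")
    raise ValueError("bad value")


def middle():
    try:
        inner()
    except KeyError:
        # Not suitable for a ValueError, so this block is skipped.
        events.append("middle: caught KeyError")
    # Never reached: the exception leaves middle's frame first.
    events.append("middle: finished normally")


def outer():
    try:
        middle()
    except ValueError as e:
        events.append("outer: caught ValueError ({})".format(e))


outer()
for event in events:
    print(event)
```

Nothing from middle appears in the output: its frame was popped while the exception was looking for a suitable except block, which it found in outer.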
Tracebacks
Consider a Python module called dumb.py:
What will happen when we run this file? We will get this traceback:
$ python dumb.py
Traceback (most recent call last):
  File "/home/user/dumb.py", line 11, in <module>
    print(func1(42))
          ^^^^^^^^^
  File "/home/user/dumb.py", line 2, in func1
    return (2 * func2(n))
                ^^^^^^^^
  File "/home/user/dumb.py", line 5, in func2
    return (3 * func3(n))
                ^^^^^^^^
  File "/home/user/dumb.py", line 8, in func3
    return (1 / 0) # oops! divide by zero!
            ~~^~~
ZeroDivisionError: division by zero
The traceback tells us that:
- print(func1(42)), on line 11, was called first. This called func1.
- On line 2, func1 called func2.
- On line 5, func2 called func3.
- On line 8, func3 evaluated 1 / 0, which raised a ZeroDivisionError exception.
Since this exception wasn't caught in func3, func2, or func1, or at the top level of the program, the program halted and the traceback was printed out, along with
- the exception class (ZeroDivisionError)
- the error message from the ZeroDivisionError exception: division by zero.
A traceback is only printed if the exception is not caught by a suitable except block.
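The same chain of calls can be examined without halting the program by catching the exception and reading its traceback with the standard traceback module. The three functions below are reconstructed from the lines quoted in the traceback above; the def lines, and in particular the parameter name of func3, are assumptions.

```python
import traceback


def func1(n):
    return (2 * func2(n))


def func2(n):
    return (3 * func3(n))


def func3(n):  # parameter name assumed; the traceback doesn't show it
    return (1 / 0)  # oops! divide by zero!


try:
    print(func1(42))
except ZeroDivisionError as e:
    # extract_tb gives one entry per frame the exception passed through,
    # starting at the frame that caught it and ending where it was raised.
    call_chain = [entry.name for entry in traceback.extract_tb(e.__traceback__)]
    summary = "{}: {}".format(type(e).__name__, e)
    print(call_chain)
    print(summary)
```

Because the exception is caught here, Python prints no traceback of its own; the information is still available on the exception object.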
Example 2: sum_numbers_in_multiple_files
We showed how to write a function which calls sum_numbers_in_file and asks for a new filename when the file isn't found or contains an invalid line. If we were dealing with a large number of files, this wouldn't be practical! Let's write a function to deal with this case.
When an exception occurs:
- an error message will be printed,
- but execution will continue.
Here's the new function:
def sum_numbers_in_multiple_files(filenames):
    total = 0
    for filename in filenames:
        try:
            total += sum_numbers_in_file(filename)
        except FileNotFoundError:
            print("Couldn't read file: {}".format(filename))
        except ValueError:
            print("Invalid line in file: {}".format(filename))
    return total
When this function executes, exceptions in sum_numbers_in_file due to bad or missing files simply cause those files to be ignored. This would be more appropriate for a program which normally runs non-interactively. (In a real program, though, error messages would probably be written out to a log file.)
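Here is a sketch of the function in use on three files created in a temporary directory: one good file, one missing file, and one with an invalid line. The file names and contents are made up for the illustration. This version of sum_numbers_in_file also differs slightly from the reading's: it opens the file in a with statement, so that the file is closed even when an exception is raised partway through.

```python
import os
import tempfile


def sum_numbers_in_file(filename):
    sum_nums = 0
    # "with" closes the file even if int() raises (a change for this sketch).
    with open(filename, "r") as file:
        for line in file:
            sum_nums += int(line.strip())
    return sum_nums


def sum_numbers_in_multiple_files(filenames):
    total = 0
    for filename in filenames:
        try:
            total += sum_numbers_in_file(filename)
        except FileNotFoundError:
            print("Couldn't read file: {}".format(filename))
        except ValueError:
            print("Invalid line in file: {}".format(filename))
    return total


with tempfile.TemporaryDirectory() as folder:
    good = os.path.join(folder, "good.data")
    bad = os.path.join(folder, "bad.data")
    missing = os.path.join(folder, "missing.data")  # never created
    with open(good, "w") as f:
        f.write("1\n2\n3\n")
    with open(bad, "w") as f:
        f.write("4\nthirty-five\n")
    total = sum_numbers_in_multiple_files([good, missing, bad])

print(total)
```

Only the good file contributes to the total. The 4 in the bad file is lost along with the invalid line, because the exception ends that call to sum_numbers_in_file before it can return anything.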
[OPTIONAL] Expert topic: resumable exceptions
There is one problem with the way we handled ValueError exceptions: any time a ValueError exception occurs, the entire file it occurs in will be ignored! This seems too drastic in case e.g. a single line was badly formatted. What if you wanted to just ignore that line and continue?
You could always rewrite sum_numbers_in_file to just handle ValueError exceptions:
def sum_numbers_in_file(filename):
    sum_nums = 0
    file = open(filename, "r")
    for line in file:
        line = line.strip()
        try:
            sum_nums += int(line)
        except ValueError:
            # what to put here?
            pass  # placeholder so that the block is valid Python
    file.close()
    return sum_nums
But this doesn't solve your problem! You still have to choose what to do:
- interactively prompt for a result value from the line?
- ignore the entire line?
- ignore the entire file?
Ideally, sum_numbers_in_file shouldn't be where this decision is made!
What you would really like to do would be to write a function that does something like this:
- call sum_numbers_in_file
- if it returns a value, use that
- if it raises a FileNotFoundError exception, skip that file
- if it raises a ValueError exception, catch it, figure out what you want to do with the line, go back into the sum_numbers_in_file function where you left off, provide a return value for int(line) in the line sum_nums += int(line), and continue from there!
This is called a "resumable exception". Bad news: Python doesn't have resumable exceptions! Very few languages do!
The bottom line is this: Python's exception-handling system has some limitations which make it less powerful than you would like it to be in some cases. When you unwind the stack, there is no way to "rewind" it back to a previous state and fix whatever went wrong. We just have to live with this as long as we are using Python (or until resumable exceptions get added to Python).
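One common workaround, which is not a resumable exception, is to let the caller pass in a function that decides what to do with a bad line. The sketch below is an assumption-laden illustration rather than part of the reading's code: sum_lines, on_bad_line, skip_bad_line, look_up_word, and WORDS are all hypothetical names, and the function works on a list of lines instead of a file to keep the example short.

```python
def sum_lines(lines, on_bad_line):
    """Sum integer lines; ask on_bad_line what value to use for a bad line."""
    total = 0
    for line in lines:
        line = line.strip()
        try:
            total += int(line)
        except ValueError:
            # The caller's function supplies the value to use instead.
            total += on_bad_line(line)
    return total


def skip_bad_line(line):
    return 0  # treat a bad line as contributing nothing


WORDS = {"thirty-five": 35}  # a made-up lookup table for the example


def look_up_word(line):
    return WORDS.get(line, 0)


lines = ["1\n", "thirty-five\n", "2\n"]
skipped = sum_lines(lines, skip_bad_line)
looked_up = sum_lines(lines, look_up_word)
print(skipped, looked_up)
```

The exception is still caught inside the summing function, so the stack is never unwound past it; the caller only gets to choose the policy in advance, not to step in after the fact.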
Coming up
- Object-oriented programming
  - Classes
  - Class "inheritance"
- Exceptions and classes
  - creating your own exceptions
  - raising exceptions explicitly
  - the Exception base class