PEP 419 – Protecting cleanup statements from interruptions
- Author: Paul Colomiets <paul at colomiets.name>
- Status: Deferred
- Type: Standards Track
- Created: 06-Apr-2012
- Python-Version: 3.3
Abstract
This PEP proposes a way to protect Python code from being interrupted inside a finally clause or during context manager cleanup.
PEP Deferral
Further exploration of the concepts covered in this PEP has been deferred for lack of a current champion interested in promoting the goals of the PEP and collecting and incorporating feedback, and with sufficient available time to do so effectively.
Rationale
Python has two nice ways to do cleanup. One is a finally statement and the other is a context manager (usually called using a with statement). However, neither is protected from interruption by KeyboardInterrupt or GeneratorExit caused by generator.throw(). For example:
lock.acquire()
try:
    print('starting')
    do_something()
finally:
    print('finished')
    lock.release()
If KeyboardInterrupt occurs just after the second print() call, the lock will not be released. Similarly, the following code using the with statement is affected:
from threading import Lock

class MyLock:

    def __init__(self):
        self._lock_impl = Lock()

    def __enter__(self):
        self._lock_impl.acquire()
        print("LOCKED")

    def __exit__(self, exc_type, exc_value, traceback):
        # __exit__ always receives the exception triple
        print("UNLOCKING")
        self._lock_impl.release()

lock = MyLock()
with lock:
    do_something()
If KeyboardInterrupt occurs near any of the print() calls, the lock will never be released.
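The failure mode can be reproduced deterministically, without pressing Ctrl-C. The following is an illustrative sketch only, not part of the proposal: the hypothetical helper log_and_get_interrupted() stands in for a print() call during which the signal handler raises KeyboardInterrupt.

```python
import threading

lock = threading.Lock()

def log_and_get_interrupted(message):
    # Hypothetical stand-in for print(): pretend the SIGINT handler
    # raises KeyboardInterrupt while this call is running.
    raise KeyboardInterrupt

def critical_section():
    lock.acquire()
    try:
        pass  # do_something() would run here
    finally:
        log_and_get_interrupted('finished')  # interrupted here ...
        lock.release()                       # ... so this never runs

try:
    critical_section()
except KeyboardInterrupt:
    pass

still_locked = lock.locked()  # the lock was never released
```

After the interrupt has propagated, still_locked is true: the finally clause started but did not complete.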
Coroutine Use Case
A similar case occurs with coroutines. Usually coroutine libraries want to interrupt the coroutine with a timeout. The generator.throw() method works for this use case, but there is no way of knowing if the coroutine is currently suspended from inside a finally clause.
An example that uses yield-based coroutines follows. The code looks similar using any of the popular coroutine libraries Monocle [1], Bluelet [2], or Twisted [3].
def run_locked():
    yield connection.sendall('LOCK')
    try:
        yield do_something()
        yield do_something_else()
    finally:
        yield connection.sendall('UNLOCK')

with timeout(5):
    yield run_locked()
In the example above, yield something means to pause executing the current coroutine and to execute coroutine something until it finishes execution. Therefore, the coroutine library itself needs to maintain a stack of generators. The connection.sendall() call waits until the socket is writable and does a similar thing to what socket.sendall() does.
The with statement ensures that all code is executed within a 5-second timeout. It does so by registering a callback in the main loop, which calls generator.throw() on the top-most frame in the coroutine stack when a timeout happens.
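The hazard can be shown with a plain generator and no coroutine library. In this sketch, Timeout and the sent list are hypothetical stand-ins for a library's timeout exception and for traffic on the connection. The generator is suspended inside its finally clause when the exception is thrown in.

```python
class Timeout(Exception):
    """Hypothetical exception a coroutine library might throw."""

sent = []  # records what the toy coroutine "sent" over the connection

def run_locked():
    sent.append('LOCK')
    try:
        yield 'working'
    finally:
        yield 'flushing'        # suspended inside the finally clause
        sent.append('UNLOCK')   # skipped if an exception is thrown in

coroutine = run_locked()
first = next(coroutine)         # runs up to the first yield
second = next(coroutine)        # leaves try, suspends inside finally

try:
    coroutine.throw(Timeout())  # the timeout fires during cleanup
except Timeout:
    pass
```

Afterwards sent contains only 'LOCK'. The caller of throw() had no way to tell that the generator was in the middle of its cleanup.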
The greenlets extension works in a similar way, except that it doesn’t need yield to enter a new stack frame. Otherwise, the considerations are similar.
Specification
Frame Flag ‘f_in_cleanup’
A new flag on the frame object is proposed. It is set to True if this frame is currently executing a finally clause. Internally, the flag must be implemented as a counter of nested finally statements currently being executed.
The internal counter also needs to be incremented during execution of the SETUP_WITH and WITH_CLEANUP bytecodes, and decremented when execution for these bytecodes is finished. This also makes it possible to protect __enter__() and __exit__() methods.
Function ‘sys.setcleanuphook’
A new function for the sys module is proposed. This function sets a callback which is executed every time f_in_cleanup becomes false. Callbacks get a frame object as their sole argument, so that they can figure out where they are called from.
The setting is thread local and must be stored in the PyThreadState structure.
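The intended interplay between the counter and the hook can be modelled in pure Python. This is only a toy model under stated assumptions: ToyFrame, enter_cleanup(), leave_cleanup() and set_cleanup_hook() are hypothetical names, and a real implementation would live in the interpreter's C code and in PyThreadState rather than in threading.local.

```python
import threading

_state = threading.local()  # models the per-thread hook setting

def set_cleanup_hook(callback):
    # Hypothetical stand-in for the proposed sys.setcleanuphook().
    _state.hook = callback

class ToyFrame:
    """Toy model of a frame; not a real frame object."""

    def __init__(self):
        self._cleanup_depth = 0  # counter of nested cleanup regions

    @property
    def f_in_cleanup(self):
        return self._cleanup_depth > 0

    def enter_cleanup(self):
        self._cleanup_depth += 1

    def leave_cleanup(self):
        self._cleanup_depth -= 1
        hook = getattr(_state, 'hook', None)
        # The hook runs only when the flag becomes false.
        if self._cleanup_depth == 0 and hook is not None:
            hook(self)

calls = []
set_cleanup_hook(calls.append)

frame = ToyFrame()
frame.enter_cleanup()            # outer finally entered
frame.enter_cleanup()            # nested finally entered
frame.leave_cleanup()            # still inside the outer finally
flag_after_inner = frame.f_in_cleanup
frame.leave_cleanup()            # flag becomes false, hook runs once
```

In this model the hook is called exactly once, with the frame as its sole argument, when the outermost cleanup region is left.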
Inspect Module Enhancements
Two new functions are proposed for the inspect module: isframeincleanup() and getcleanupframe().
isframeincleanup(), given a frame or generator object as its sole argument, returns the value of the f_in_cleanup attribute of a frame itself or of the gi_frame attribute of a generator.
getcleanupframe(), given a frame object as its sole argument, returns the innermost frame which has a true value of f_in_cleanup, or None if no frames in the stack have a nonzero value for that attribute. It starts to inspect from the specified frame and walks to outer frames using f_back pointers, just like getouterframes() does.
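The described behaviour of the two functions can be sketched in pure Python. The names is_frame_in_cleanup() and get_cleanup_frame() are hypothetical, and the sketch works on SimpleNamespace objects that merely imitate frames, since real frame objects have no f_in_cleanup attribute.

```python
from types import SimpleNamespace

def is_frame_in_cleanup(obj):
    # Accept either a frame-like object or a generator-like object.
    frame = getattr(obj, 'gi_frame', obj)
    return frame is not None and bool(frame.f_in_cleanup)

def get_cleanup_frame(frame):
    # Walk outward through f_back; the first hit is the innermost one.
    while frame is not None:
        if frame.f_in_cleanup:
            return frame
        frame = frame.f_back
    return None

# A three-frame toy stack: only the outermost frame is in cleanup.
outer = SimpleNamespace(f_in_cleanup=1, f_back=None)
middle = SimpleNamespace(f_in_cleanup=0, f_back=outer)
inner = SimpleNamespace(f_in_cleanup=0, f_back=middle)

idle = SimpleNamespace(f_in_cleanup=0, f_back=None)
generator_like = SimpleNamespace(gi_frame=outer)

found = get_cleanup_frame(inner)
```

Here found is the outer frame, while get_cleanup_frame(idle) returns None because no frame in that stack is in cleanup.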
Example
An example implementation of a SIGINT handler that interrupts safely might look like:
import inspect, sys, functools

def sigint_handler(sig, frame):
    if inspect.getcleanupframe(frame) is None:
        raise KeyboardInterrupt()
    sys.setcleanuphook(functools.partial(sigint_handler, 0))
A coroutine example is out of scope for this document, because its implementation depends very much on the trampoline (or main loop) used by the coroutine library.
Unresolved Issues
Interruption Inside With Statement Expression
Given the statement
with open(filename):
    do_something()
Python can be interrupted after open() is called, but before the SETUP_WITH bytecode is executed. There are two possible decisions:
- Protect with expressions. This would require another bytecode, since currently there is no way of recognizing the start of the with expression.
- Let the user write a wrapper if they consider it important for the use case. A safe wrapper might look like this:

  class FileWrapper(object):

      def __init__(self, filename, mode):
          self.filename = filename
          self.mode = mode

      def __enter__(self):
          self.file = open(self.filename, self.mode)
          return self.file

      def __exit__(self, exc_type, exc_value, traceback):
          self.file.close()

  Alternatively it can be written using the contextmanager() decorator:

  from contextlib import contextmanager

  @contextmanager
  def open_wrapper(filename, mode):
      file = open(filename, mode)
      try:
          yield file
      finally:
          file.close()

  This code is safe, as the first part of the generator (before yield) is executed inside the SETUP_WITH bytecode of the caller.
Exception Propagation
Sometimes a finally clause or an __enter__()/__exit__() method can raise an exception. Usually this is not a problem, since more important exceptions like KeyboardInterrupt or SystemExit should be raised instead. But it may be nice to be able to keep the original exception inside a __context__ attribute. So the cleanup hook signature may grow an exception argument:
def sigint_handler(sig, frame):
    if inspect.getcleanupframe(frame) is None:
        raise KeyboardInterrupt()
    sys.setcleanuphook(retry_sigint)

def retry_sigint(frame, exception=None):
    if inspect.getcleanupframe(frame) is None:
        raise KeyboardInterrupt() from exception
Note
There is no need for three arguments as in the __exit__ method, since exceptions carry a __traceback__ attribute in Python 3.
However, this will set the __cause__ for the exception, which is not exactly what’s intended. So some hidden interpreter logic may be used to put a __context__ attribute on every exception raised in a cleanup hook.
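The difference between the two attributes can be observed in current Python. The sketch below uses only standard exception chaining. The ValueError is an arbitrary stand-in for an exception raised during cleanup.

```python
original = ValueError('cleanup failed')

# Explicit chaining: "raise ... from ..." sets __cause__.
try:
    raise KeyboardInterrupt() from original
except KeyboardInterrupt as exc:
    explicit = exc

# Implicit chaining: raising while another exception is being
# handled sets only __context__.
try:
    try:
        raise original
    except ValueError:
        raise KeyboardInterrupt()
except KeyboardInterrupt as exc:
    implicit = exc
```

With explicit chaining, explicit.__cause__ is the original exception and explicit.__context__ is None. With implicit chaining, implicit.__cause__ is None and implicit.__context__ is the original exception, which is the outcome the text above describes as the intended one.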
Interruption Between Acquiring Resource and Try Block
The example from the first section is not totally safe. Let’s take a closer look:
lock.acquire()
try:
    do_something()
finally:
    lock.release()
The problem might occur if the code is interrupted just after lock.acquire() is executed but before the try block is entered.
There is no way to fix this without modifying the code. The actual fix depends very much on the use case. Usually code can be fixed using a with statement:
with lock:
    do_something()
However, for coroutines one usually can’t use the with statement because you need to yield for both the acquire and release operations. So the code might be rewritten like this:
try:
    yield lock.acquire()
    do_something()
finally:
    yield lock.release()
The actual locking code might need more code to support this use case, but the implementation is usually trivial: release() checks whether the lock has been acquired and unlocks it only if so.
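A minimal sketch of such a lock follows. OwnedLock and guarded() are hypothetical names, and the yield-based plumbing of a real coroutine library is left out so that the example stays runnable on its own.

```python
class OwnedLock:
    """Hypothetical lock whose release() is a no-op when not held."""

    def __init__(self):
        self.held = False
        self.releases = 0  # counts releases that actually unlocked

    def acquire(self):
        self.held = True

    def release(self):
        # Check whether the lock has been acquired; unlock only if so.
        if self.held:
            self.held = False
            self.releases += 1

def guarded(lock, interrupt_before_acquire):
    try:
        if interrupt_before_acquire:
            raise KeyboardInterrupt  # simulated interruption
        lock.acquire()
    finally:
        lock.release()  # safe whether or not acquire() ran

normal = OwnedLock()
guarded(normal, interrupt_before_acquire=False)

interrupted = OwnedLock()
try:
    guarded(interrupted, interrupt_before_acquire=True)
except KeyboardInterrupt:
    pass
```

In both runs the lock ends up released, and in the interrupted run release() did nothing because the lock was never taken.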
Handling EINTR Inside a Finally
Even if a signal handler is prepared to check the f_in_cleanup flag, InterruptedError might be raised in the cleanup handler, because the respective system call returned an EINTR error. The primary use cases are prepared to handle this:
- POSIX mutexes never return EINTR
- Networking libraries are always prepared to handle EINTR
- Coroutine libraries are usually interrupted with the throw() method, not with a signal
The platform-specific function siginterrupt() might be used to remove the need to handle EINTR. However, it may have consequences that are hard to predict; for example, a SIGINT handler is never called if the main thread is stuck inside an IO routine.
A better approach would be to have code that is typically used in cleanup handlers be prepared to handle InterruptedError explicitly. An example of such code might be a file-based lock implementation.
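Handling InterruptedError explicitly usually amounts to a retry loop. The helper retry_on_eintr() below is a hypothetical sketch, and flaky_unlock() simulates a system call that is interrupted twice before it succeeds.

```python
def retry_on_eintr(func, *args, **kwargs):
    # Call func again for as long as it is interrupted by a signal.
    while True:
        try:
            return func(*args, **kwargs)
        except InterruptedError:
            continue

attempts = []

def flaky_unlock():
    # Simulated cleanup call: interrupted on the first two attempts.
    attempts.append(1)
    if len(attempts) < 3:
        raise InterruptedError
    return 'unlocked'

result = retry_on_eintr(flaky_unlock)
```

Since Python 3.5 (PEP 475), most standard-library wrappers around system calls retry automatically when the signal handler does not raise, so a loop like this matters mainly for older versions or for custom code.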
signal.pthread_sigmask can be used to block signals inside cleanup handlers which can be interrupted with EINTR.
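One way to package this is a small context manager. The name sigint_blocked() is hypothetical. signal.pthread_sigmask() is only available on Unix platforms, so this sketch assumes such a system.

```python
import signal
from contextlib import contextmanager

@contextmanager
def sigint_blocked():
    # Block SIGINT for the calling thread and remember the old mask.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
    try:
        yield
    finally:
        # Restore the previous mask; a pending SIGINT is delivered
        # once it is unblocked again.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)

# Passing an empty set to SIG_BLOCK only queries the current mask.
before = signal.pthread_sigmask(signal.SIG_BLOCK, [])
with sigint_blocked():
    during = signal.pthread_sigmask(signal.SIG_BLOCK, [])
after = signal.pthread_sigmask(signal.SIG_BLOCK, [])
```

A cleanup handler would wrap only the system calls that must not see EINTR, keeping the blocked region as short as possible.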
Setting Interruption Context Inside Finally Itself
Some coroutine libraries may need to set a timeout for the finally clause itself. For example:
try:
    do_something()
finally:
    with timeout(0.5):
        try:
            yield do_slow_cleanup()
        finally:
            yield do_fast_cleanup()
With current semantics, timeout will either protect the whole with block or nothing at all, depending on the implementation of each library. What the author intended is to treat do_slow_cleanup as ordinary code, and do_fast_cleanup as a cleanup (a non-interruptible one).
A similar case might occur when using greenlets or tasklets.
This case can be fixed by exposing f_in_cleanup as a counter, and by calling a cleanup hook on each decrement. A coroutine library may then remember the value at timeout start, and compare it on each hook execution.
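The comparison can be sketched with a toy counter. CleanupCounter and ToyTimeout are hypothetical names, the counter is adjusted by hand where the interpreter would do it, and the brief increments during the with statement's own setup are ignored for simplicity.

```python
class CleanupCounter:
    """Toy stand-in for a frame exposing f_in_cleanup as a counter."""

    def __init__(self):
        self.f_in_cleanup = 0

class ToyTimeout:
    def __init__(self, frame):
        self.frame = frame
        # Remember the nesting depth at the moment the timeout starts.
        self.start_depth = frame.f_in_cleanup

    def may_interrupt(self):
        # Deeper than at start means a newer finally clause is running.
        return self.frame.f_in_cleanup <= self.start_depth

frame = CleanupCounter()
frame.f_in_cleanup += 1                # outer finally entered
timeout = ToyTimeout(frame)            # "with timeout(0.5):" starts
during_slow = timeout.may_interrupt()  # do_slow_cleanup(): ordinary code
frame.f_in_cleanup += 1                # inner finally entered
during_fast = timeout.may_interrupt()  # do_fast_cleanup(): protected
frame.f_in_cleanup -= 1                # inner finally left; hook would run
after_fast = timeout.may_interrupt()
```

In this model the timeout may fire during the slow cleanup and after the fast cleanup, but not while the fast cleanup is running.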
But in practice, the example is considered to be too obscure to take into account.
Modifying KeyboardInterrupt
It should be decided if the default SIGINT handler should be modified to use the described mechanism. The initial proposition is to keep old behavior, for two reasons:
- Most applications do not care about cleanup on exit (either they do not have external state, or they modify it in a crash-safe way).
- Cleanup may take too much time, not giving the user a chance to interrupt an application.
The latter case can be fixed by allowing an unsafe break if a SIGINT handler is called twice, but it seems not worth the complexity.
Alternative Python Implementations Support
We consider f_in_cleanup an implementation detail. The actual implementation may have some fake frame-like object passed to the signal handler and cleanup hook, and returned from getcleanupframe(). The only requirement is that the inspect module functions work as expected on these objects. For this reason, we also allow passing a generator object to the isframeincleanup() function, which removes the need to use the gi_frame attribute.
It might be necessary to specify that getcleanupframe() must return the same object that will be passed to the cleanup hook at the next invocation.
Alternative Names
The original proposal had an f_in_finally frame attribute, as the original intention was to protect finally clauses. But as it grew to protect __enter__ and __exit__ methods too, the f_in_cleanup name seems better. Although the __enter__ method is not a cleanup routine, it at least relates to cleanup done by context managers.
setcleanuphook, isframeincleanup and getcleanupframe could be spelled more readably as set_cleanup_hook, is_frame_in_cleanup and get_cleanup_frame, although the current names follow the naming convention of their respective modules.
Alternative Proposals
Propagating ‘f_in_cleanup’ Flag Automatically
This can make getcleanupframe() unnecessary. But for yield-based coroutines you need to propagate it yourself. Making it writable leads to somewhat unpredictable behavior of setcleanuphook().
Add Bytecodes ‘INCR_CLEANUP’, ‘DECR_CLEANUP’
These bytecodes can be used to protect the expression inside the with statement, as well as making counter increments more explicit and easy to debug (visible inside a disassembly). Some middle ground might be chosen, like END_FINALLY and SETUP_WITH implicitly decrementing the counter (END_FINALLY is present at the end of every with suite).
However, adding new bytecodes must be considered very carefully.
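For orientation, the bytecode that the running interpreter emits for a with statement can be listed with the dis module. Opcode names vary between CPython versions: SETUP_WITH and END_FINALLY are the names used at the time of this PEP and may not appear in newer releases, and the proposed INCR_CLEANUP and DECR_CLEANUP do not exist in any release. The function sample() and the file name are placeholders. The function is only disassembled, never called.

```python
import dis

def sample():
    # Never executed; only its compiled bytecode is inspected.
    with open('example.txt') as handle:
        handle.read()

# Collect the opcode names in the order they appear in the function.
opnames = [instruction.opname for instruction in dis.get_instructions(sample)]

for name in opnames:
    print(name)
```

Explicit counter bytecodes would show up in such a listing, which is the debugging benefit mentioned above.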
Expose ‘f_in_cleanup’ as a Counter
The original intention was to expose a minimum of needed functionality. However, as we consider the frame flag f_in_cleanup an implementation detail, we may expose it as a counter.
Similarly, if we have a counter we may need to have the cleanup hook called on every counter decrement. It’s unlikely to have much performance impact as nested finally clauses are an uncommon case.
Add code object flag ‘CO_CLEANUP’
As an alternative to setting the flag inside the SETUP_WITH and WITH_CLEANUP bytecodes, we can introduce a flag CO_CLEANUP. When the interpreter starts to execute code with CO_CLEANUP set, it sets f_in_cleanup for the whole function body. This flag is set for code objects of __enter__ and __exit__ special methods. Technically it might be set on functions called __enter__ and __exit__.
This seems to be a less clear solution. It also covers the case where __enter__ and __exit__ are called manually. This may be accepted either as a feature or as an unnecessary side-effect (or, though unlikely, as a bug).
It may also pose a problem when __enter__ or __exit__ functions are implemented in C, as there is no code object on which to check for the flag.
Have Cleanup Callback on Frame Object Itself
The frame object may be extended to have an f_cleanup_callback member which is called when f_in_cleanup is reset to 0. This would help to register different callbacks to different coroutines.
Despite its apparent beauty, this solution doesn’t add anything, as the two primary use cases are:
- Setting the callback in a signal handler. The callback is inherently a single one for this case.
- Use a single callback per loop for the coroutine use case. Here, in almost all cases, there is only one loop per thread.
No Cleanup Hook
The original proposal included no cleanup hook specification, as there are a few ways to achieve the same using current tools:
- Using sys.settrace() and the f_trace callback. This may pose problems for debugging, and has a big performance impact (although interrupting doesn’t happen very often).
- Sleeping a bit more and trying again. For a coroutine library this is easy. For signals it may be achieved using signal.alarm().
Both methods are considered too impractical, so a way to catch exit from finally clauses is proposed.
References
[4] Original discussion https://mail.python.org/pipermail/python-ideas/2012-April/014705.html
[5] Implementation of PEP 419 https://github.com/python/cpython/issues/58935
Copyright
This document has been placed in the public domain.
Source: https://github.com/python/peps/blob/main/peps/pep-0419.rst
Last modified: 2023-09-09 17:39:29 GMT