PEP: 419 Title: Protecting cleanup statements from interruptions
Version: $Revision$ Last-Modified: $Date$ Author: Paul Colomiets
<paul@colomiets.name> Status: Deferred Type: Standards Track
Content-Type: text/x-rst Created: 06-Apr-2012 Python-Version: 3.3

Abstract

This PEP proposes a way to protect Python code from being interrupted
inside a finally clause or during context manager cleanup.

PEP Deferral

Further exploration of the concepts covered in this PEP has been
deferred for lack of a current champion interested in promoting the
goals of the PEP and collecting and incorporating feedback, and with
sufficient available time to do so effectively.

Rationale

Python has two nice ways to do cleanup. One is a finally statement and
the other is a context manager (usually called using a with statement).
However, neither is protected from interruption by KeyboardInterrupt or
GeneratorExit caused by generator.throw(). For example:

    lock.acquire()
    try:
        print('starting')
        do_something()
    finally:
        print('finished')
        lock.release()

If KeyboardInterrupt occurs just after the second print() call, the lock
will not be released. Similarly, the following code using the with
statement is affected:

    from threading import Lock

    class MyLock:

        def __init__(self):
            self._lock_impl = Lock()

        def __enter__(self):
            self._lock_impl.acquire()
            print("LOCKED")

        def __exit__(self):
            print("UNLOCKING")
            self._lock_impl.release()

    lock = MyLock()
    with lock:
        do_something

If KeyboardInterrupt occurs near any of the print() calls, the lock will
never be released.

Coroutine Use Case

A similar case occurs with coroutines. Usually coroutine libraries want
to interrupt the coroutine with a timeout. The generator.throw() method
works for this use case, but there is no way of knowing if the coroutine
is currently suspended from inside a finally clause.

An example that uses yield-based coroutines follows. The code looks
similar using any of the popular coroutine libraries Monocle[1],
Bluelet[2], or Twisted[3]. :

    def run_locked():
        yield connection.sendall('LOCK')
        try:
            yield do_something()
            yield do_something_else()
        finally:
            yield connection.sendall('UNLOCK')

    with timeout(5):
        yield run_locked()

In the example above, yield something means to pause executing the
current coroutine and to execute coroutine something until it finishes
execution. Therefore, the coroutine library itself needs to maintain a
stack of generators. The connection.sendall() call waits until the
socket is writable and does a similar thing to what socket.sendall()
does.

The with statement ensures that all code is executed within 5 seconds
timeout. It does so by registering a callback in the main loop, which
calls generator.throw() on the top-most frame in the coroutine stack
when a timeout happens.

The greenlets extension works in a similar way, except that it doesn't
need yield to enter a new stack frame. Otherwise considerations are
similar.

Specification

Frame Flag 'f_in_cleanup'

A new flag on the frame object is proposed. It is set to True if this
frame is currently executing a finally clause. Internally, the flag must
be implemented as a counter of nested finally statements currently being
executed.

The internal counter also needs to be incremented during execution of
the SETUP_WITH and WITH_CLEANUP bytecodes, and decremented when
execution for these bytecodes is finished. This allows to also protect
__enter__() and __exit__() methods.

Function 'sys.setcleanuphook'

A new function for the sys module is proposed. This function sets a
callback which is executed every time f_in_cleanup becomes false.
Callbacks get a frame object as their sole argument, so that they can
figure out where they are called from.

The setting is thread local and must be stored in the PyThreadState
structure.

Inspect Module Enhancements

Two new functions are proposed for the inspect module:
isframeincleanup() and getcleanupframe().

isframeincleanup(), given a frame or generator object as its sole
argument, returns the value of the f_in_cleanup attribute of a frame
itself or of the gi_frame attribute of a generator.

getcleanupframe(), given a frame object as its sole argument, returns
the innermost frame which has a true value of f_in_cleanup, or None if
no frames in the stack have a nonzero value for that attribute. It
starts to inspect from the specified frame and walks to outer frames
using f_back pointers, just like getouterframes() does.

Example

An example implementation of a SIGINT handler that interrupts safely
might look like:

    import inspect, sys, functools

    def sigint_handler(sig, frame):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt()
        sys.setcleanuphook(functools.partial(sigint_handler, 0))

A coroutine example is out of scope of this document, because its
implementation depends very much on a trampoline (or main loop) used by
coroutine library.

Unresolved Issues

Interruption Inside With Statement Expression

Given the statement :

    with open(filename):
        do_something()

Python can be interrupted after open() is called, but before the
SETUP_WITH bytecode is executed. There are two possible decisions:

-   Protect with expressions. This would require another bytecode, since
    currently there is no way of recognizing the start of the with
    expression.

-   Let the user write a wrapper if he considers it important for the
    use-case. A safe wrapper might look like this:

        class FileWrapper(object):

            def __init__(self, filename, mode):
                self.filename = filename
                self.mode = mode

            def __enter__(self):
                self.file = open(self.filename, self.mode)

            def __exit__(self):
                self.file.close()

    Alternatively it can be written using the contextmanager()
    decorator:

        @contextmanager
        def open_wrapper(filename, mode):
            file = open(filename, mode)
            try:
                yield file
            finally:
                file.close()

    This code is safe, as the first part of the generator (before yield)
    is executed inside the SETUP_WITH bytecode of the caller.

Exception Propagation

Sometimes a finally clause or an __enter__()/__exit__() method can raise
an exception. Usually this is not a problem, since more important
exceptions like KeyboardInterrupt or SystemExit should be raised
instead. But it may be nice to be able to keep the original exception
inside a __context__ attribute. So the cleanup hook signature may grow
an exception argument:

    def sigint_handler(sig, frame)
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt()
        sys.setcleanuphook(retry_sigint)

    def retry_sigint(frame, exception=None):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt() from exception

Note

There is no need to have three arguments like in the __exit__ method
since there is a __traceback__ attribute in exception in Python 3.

However, this will set the __cause__ for the exception, which is not
exactly what's intended. So some hidden interpreter logic may be used to
put a __context__ attribute on every exception raised in a cleanup hook.

Interruption Between Acquiring Resource and Try Block

The example from the first section is not totally safe. Let's take a
closer look:

    lock.acquire()
    try:
        do_something()
    finally:
        lock.release()

The problem might occur if the code is interrupted just after
lock.acquire() is executed but before the try block is entered.

There is no way the code can be fixed unmodified. The actual fix depends
very much on the use case. Usually code can be fixed using a with
statement:

    with lock:
        do_something()

However, for coroutines one usually can't use the with statement because
you need to yield for both the acquire and release operations. So the
code might be rewritten like this:

    try:
        yield lock.acquire()
        do_something()
    finally:
        yield lock.release()

The actual locking code might need more code to support this use case,
but the implementation is usually trivial, like this: check if the lock
has been acquired and unlock if it is.

Handling EINTR Inside a Finally

Even if a signal handler is prepared to check the f_in_cleanup flag,
InterruptedError might be raised in the cleanup handler, because the
respective system call returned an EINTR error. The primary use cases
are prepared to handle this:

-   Posix mutexes never return EINTR
-   Networking libraries are always prepared to handle EINTR
-   Coroutine libraries are usually interrupted with the throw() method,
    not with a signal

The platform-specific function siginterrupt() might be used to remove
the need to handle EINTR. However, it may have hardly predictable
consequences, for example SIGINT a handler is never called if the main
thread is stuck inside an IO routine.

A better approach would be to have the code, which is usually used in
cleanup handlers, be prepared to handle InterruptedError explicitly. An
example of such code might be a file-based lock implementation.

signal.pthread_sigmask can be used to block signals inside cleanup
handlers which can be interrupted with EINTR.

Setting Interruption Context Inside Finally Itself

Some coroutine libraries may need to set a timeout for the finally
clause itself. For example:

    try:
        do_something()
    finally:
        with timeout(0.5):
            try:
                yield do_slow_cleanup()
            finally:
                yield do_fast_cleanup()

With current semantics, timeout will either protect the whole with block
or nothing at all, depending on the implementation of each library. What
the author intended is to treat do_slow_cleanup as ordinary code, and
do_fast_cleanup as a cleanup (a non-interruptible one).

A similar case might occur when using greenlets or tasklets.

This case can be fixed by exposing f_in_cleanup as a counter, and by
calling a cleanup hook on each decrement. A coroutine library may then
remember the value at timeout start, and compare it on each hook
execution.

But in practice, the example is considered to be too obscure to take
into account.

Modifying KeyboardInterrupt

It should be decided if the default SIGINT handler should be modified to
use the described mechanism. The initial proposition is to keep old
behavior, for two reasons:

-   Most application do not care about cleanup on exit (either they do
    not have external state, or they modify it in crash-safe way).
-   Cleanup may take too much time, not giving user a chance to
    interrupt an application.

The latter case can be fixed by allowing an unsafe break if a SIGINT
handler is called twice, but it seems not worth the complexity.

Alternative Python Implementations Support

We consider f_in_cleanup an implementation detail. The actual
implementation may have some fake frame-like object passed to signal
handler, cleanup hook and returned from getcleanupframe(). The only
requirement is that the inspect module functions work as expected on
these objects. For this reason, we also allow to pass a generator object
to the isframeincleanup() function, which removes the need to use the
gi_frame attribute.

It might be necessary to specify that getcleanupframe() must return the
same object that will be passed to cleanup hook at the next invocation.

Alternative Names

The original proposal had a f_in_finally frame attribute, as the
original intention was to protect finally clauses. But as it grew up to
protecting __enter__ and __exit__ methods too, the f_in_cleanup name
seems better. Although the __enter__ method is not a cleanup routine, it
at least relates to cleanup done by context managers.

setcleanuphook, isframeincleanup and getcleanupframe can be unobscured
to set_cleanup_hook, is_frame_in_cleanup and get_cleanup_frame, although
they follow the naming convention of their respective modules.

Alternative Proposals

Propagating 'f_in_cleanup' Flag Automatically

This can make getcleanupframe() unnecessary. But for yield-based
coroutines you need to propagate it yourself. Making it writable leads
to somewhat unpredictable behavior of setcleanuphook().

Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP'

These bytecodes can be used to protect the expression inside the with
statement, as well as making counter increments more explicit and easy
to debug (visible inside a disassembly). Some middle ground might be
chosen, like END_FINALLY and SETUP_WITH implicitly decrementing the
counter (END_FINALLY is present at end of every with suite).

However, adding new bytecodes must be considered very carefully.

Expose 'f_in_cleanup' as a Counter

The original intention was to expose a minimum of needed functionality.
However, as we consider the frame flag f_in_cleanup an implementation
detail, we may expose it as a counter.

Similarly, if we have a counter we may need to have the cleanup hook
called on every counter decrement. It's unlikely to have much
performance impact as nested finally clauses are an uncommon case.

Add code object flag 'CO_CLEANUP'

As an alternative to set the flag inside the SETUP_WITH and WITH_CLEANUP
bytecodes, we can introduce a flag CO_CLEANUP. When the interpreter
starts to execute code with CO_CLEANUP set, it sets f_in_cleanup for the
whole function body. This flag is set for code objects of __enter__ and
__exit__ special methods. Technically it might be set on functions
called __enter__ and __exit__.

This seems to be less clear solution. It also covers the case where
__enter__ and __exit__ are called manually. This may be accepted either
as a feature or as an unnecessary side-effect (or, though unlikely, as a
bug).

It may also impose a problem when __enter__ or __exit__ functions are
implemented in C, as there is no code object to check for the
f_in_cleanup flag.

Have Cleanup Callback on Frame Object Itself

The frame object may be extended to have a f_cleanup_callback member
which is called when f_in_cleanup is reset to 0. This would help to
register different callbacks to different coroutines.

Despite its apparent beauty, this solution doesn't add anything, as the
two primary use cases are:

-   Setting the callback in a signal handler. The callback is inherently
    a single one for this case.
-   Use a single callback per loop for the coroutine use case. Here, in
    almost all cases, there is only one loop per thread.

No Cleanup Hook

The original proposal included no cleanup hook specification, as there
are a few ways to achieve the same using current tools:

-   Using sys.settrace() and the f_trace callback. This may impose some
    problem to debugging, and has a big performance impact (although
    interrupting doesn't happen very often).
-   Sleeping a bit more and trying again. For a coroutine library this
    is easy. For signals it may be achieved using signal.alert.

Both methods are considered too impractical and a way to catch exit from
finally clauses is proposed.

References

[4] Original discussion
https://mail.python.org/pipermail/python-ideas/2012-April/014705.html

[5] Implementation of PEP 419
https://github.com/python/cpython/issues/58935

Copyright

This document has been placed in the public domain.

[1] Monocle https://github.com/saucelabs/monocle

[2] Bluelet https://github.com/sampsyo/bluelet

[3] Twisted: inlineCallbacks
https://twisted.org/documents/8.1.0/api/twisted.internet.defer.html