PEP: 707 Title: A simplified signature for __exit__ and __aexit__
Author: Irit Katriel <irit@python.org> Discussions-To:
https://discuss.python.org/t/24402 Status: Rejected Type: Standards
Track Content-Type: text/x-rst Created: 18-Feb-2023 Python-Version: 3.12
Post-History: 02-Mar-2023, Resolution:
https://discuss.python.org/t/pep-707-a-simplified-signature-for-exit-and-aexit/24402/46

Rejection Notice

Per the SC:

  We discussed the PEP and have decided to reject it. Our thinking was
  the magic and risk of potential breakage didn’t warrant the benefits.
  We are totally supportive, though, of exploring a potential context
  manager v2 API or __leave__.

Abstract

This PEP proposes to make the interpreter accept context managers whose
~py3.11:object.__exit__ / ~py3.11:object.__aexit__ method takes only a
single exception instance, while continuing to also support the current
(typ, exc, tb) signature for backwards compatibility.

This proposal is part of an ongoing effort to remove the redundancy of
the 3-item exception representation from the language, a relic of
earlier Python versions which now confuses language users while adding
complexity and overhead to the interpreter.

The proposed implementation uses introspection, which is tailored to the
requirements of this use case. The solution ensures the safety of the
new feature by supporting it only in non-ambiguous cases. In particular,
any signature that could accept three arguments is assumed to expect
them.

Because reliable introspection of callables is not currently possible in
Python, the solution proposed here is limited in that only the common
types of single-arg callables will be identified as such, while some of
the more esoteric ones will continue to be called with three arguments.
This imperfect solution was chosen among several imperfect alternatives
in the spirit of practicality. It is my hope that the discussion about
this PEP will explore the other options and lead us to the best way
forward, which may well be to remain with our imperfect status quo.

Motivation

In the past, an exception was represented in many parts of Python by a
tuple of three elements: the type of the exception, its value, and its
traceback. While there were good reasons for this design at the time,
they no longer hold because the type and traceback can now be reliably
deduced from the exception instance. Over the last few years we saw
several efforts to simplify the representation of exceptions.

Since 3.10 in CPython PR #70577, the py3.11:traceback module's functions
accept either a 3-tuple as described above, or just an exception
instance as a single argument.

Internally, the interpreter no longer represents exceptions as a
triplet. This was removed for the handled exception in 3.11 and for the
raised exception in 3.12. As a consequence, several APIs that expose the
triplet can now be replaced by simpler alternatives:

                                                         Legacy API                 Alternative
  ------------------------------------------------------ -------------------------- ---------------------------
  Get handled exception (Python)                         py3.12:sys.exc_info        py3.12:sys.exception
  Get handled exception (C)                              PyErr_GetExcInfo           PyErr_GetHandledException
  Set handled exception (C)                              PyErr_SetExcInfo           PyErr_SetHandledException
  Get raised exception (C)                               PyErr_Fetch                PyErr_GetRaisedException
  Set raised exception (C)                               PyErr_Restore              PyErr_SetRaisedException
  Construct an exception instance from the 3-tuple (C)   PyErr_NormalizeException   N/A

The current proposal is a step in this process, and considers the way
forward for one more case in which the 3-tuple representation has leaked
to the language. The motivation for all this work is twofold.

Simplify the implementation of the language

The simplification gained by reducing the interpreter's internal
representation of the handled exception to a single object was
significant. Previously, the interpreter needed to push onto/pop from
the stack three items whenever it did anything with exceptions. This
increased stack depth (adding pressure on caches and registers) and
complicated some of the bytecodes. Reducing this to one item removed
about 100 lines of code from ceval.c (the interpreter's eval loop
implementation), and it was later followed by the removal of the
POP_EXCEPT_AND_RERAISE opcode which has become simple enough to be
replaced by generic stack manipulation instructions. Micro-benchmarks
showed a speedup of about 10% for catching and raising an exception, as
well as for creating generators. To summarize, removing this redundancy
in Python's internals simplified the interpreter and made it faster.

The performance of invoking __exit__/__aexit__ when leaving a context
manager can be also improved by replacing a multi-arg function call with
a single-arg one. Micro-benchmarks showed that entering and exiting a
context manager with single-arg __exit__ is about 13% faster.

Simplify the language itself

One of the reasons for the popularity of Python is its simplicity. The
py3.11:sys.exc_info triplet is cryptic for new learners, and the
redundancy in it is confusing for those who do understand it.

It will take multiple releases to get to a point where we can think of
deprecating sys.exc_info(). However, we can relatively quickly reach a
stage where new learners do not need to know about it, or about the
3-tuple representation, at least until they are maintaining legacy code.

Rationale

The only reason to object today to the removal of the last remaining
appearances of the 3-tuple from the language is the concerns about
disruption that such changes can bring. The goal of this PEP is to
propose a safe, gradual and minimally disruptive way to make this change
in the case of __exit__, and with this to initiate a discussion of our
options for evolving its method signature.

In the case of the py3.11:traceback module's API, evolving the functions
to have a hybrid signature is relatively straightforward and safe. The
functions take one positional and two optional arguments, and interpret
them according to their types. This is safe when sentinels are used for
default values. The signatures of callbacks, which are defined by the
user's program, are harder to evolve.

The safest option is to make the user explicitly indicate which
signature the callback is expecting, by marking it with an additional
attribute or giving it a different name. For example, we could make the
interpreter look for a __leave__ method on the context manager, and call
it with a single arg if it exists (otherwise, it looks for __exit__ and
continues as it does now). The introspection-based alternative proposed
here intends to make it more convenient for users to write new code,
because they can just use the single-arg version and remain unaware of
the legacy API. However, if the limitations of introspection are found
to be too severe, we should consider an explicit option. Having both
__exit__ and __leave__ around for 5-10 years with similar functionality
is not ideal, but it is an option.

Let us now examine the limitations of the current proposal. It
identifies 2-arg python functions and METH_O C functions as having a
single-arg signature, and assumes that anything else is expecting 3
args. Obviously it is possible to create false negatives for this
heuristic (single-arg callables that it will not identify). Context
managers written in this way won't work, they will continue to fail as
they do now when their __exit__ function will be called with three
arguments.

I believe that it will not be a problem in practice. First, all working
code will continue to work, so this is a limitation on new code rather
than a problem impacting existing code. Second, exotic callable types
are rarely used for __exit__ and if one is needed, it can always be
wrapped by a plain vanilla method that delegates to the callable. For
example, we can write this:

    class C:
       __enter__ = lambda self: self
       __exit__ = ExoticCallable()

as follows:

    class CM:
       __enter__ = lambda self: self
       _exit = ExoticCallable()
       __exit__ = lambda self, exc: CM._exit(exc)

While discussing the real-world impact of the problem in this PEP, it is
worth noting that most __exit__ functions don't do anything with their
arguments. Typically, a context manager is implemented to ensure that
some cleanup actions take place upon exit. It is rarely appropriate for
the __exit__ function to handle exceptions raised within the context,
and they are typically allowed to propagate out of __exit__ to the
calling function. This means that most __exit__ functions do not access
their arguments at all, and we should take this into account when trying
to assess the impact of different solutions on Python's userbase.

Specification

A context manager's __exit__/__aexit__ method can have a single-arg
signature, in which case it is invoked by the interpreter with the
argument equal to an exception instance or None:

    >>> class C:
    ...     def __enter__(self):
    ...         return self
    ...     def __exit__(self, exc):
    ...         print(f'__exit__ called with: {exc!r}')
    ...
    >>> with C():
    ...     pass
    ...
    __exit__ called with: None
    >>> with C():
    ...     1/0
    ...
    __exit__ called with: ZeroDivisionError('division by zero')
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    ZeroDivisionError: division by zero

If __exit__/__aexit__ has any other signature, it is invoked with the
3-tuple (typ, exc, tb) as happens now:

    >>> class C:
    ...     def __enter__(self):
    ...         return self
    ...     def __exit__(self, *exc):
    ...         print(f'__exit__ called with: {exc!r}')
    ...
    >>> with C():
    ...     pass
    ...
    __exit__ called with: (None, None, None)
    >>> with C():
    ...     1/0
    ...
    __exit__ called with: (<class 'ZeroDivisionError'>, ZeroDivisionError('division by zero'), <traceback object at 0x1039cb570>)
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    ZeroDivisionError: division by zero

These __exit__ methods will also be called with a 3-tuple:

    def __exit__(self, typ, *exc):
        pass

    def __exit__(self, typ, exc, tb):
        pass

A reference implementation is provided in CPython PR #101995.

When the interpreter reaches the end of the scope of a context manager,
and it is about to call the relevant __exit__ or __aexit__ function, it
instrospects this function to determine whether it is the single-arg or
the legacy 3-arg version. In the draft PR, this introspection is
performed by the is_legacy___exit__ function:

    static int is_legacy___exit__(PyObject *exit_func) {
        if (PyMethod_Check(exit_func)) {
            PyObject *func = PyMethod_GET_FUNCTION(exit_func);
            if (PyFunction_Check(func)) {
                PyCodeObject *code = (PyCodeObject*)PyFunction_GetCode(func);
                if (code->co_argcount == 2 && !(code->co_flags & CO_VARARGS)) {
                    /* Python method that expects self + one more arg */
                    return false;
                }
            }
        }
        else if (PyCFunction_Check(exit_func)) {
            if (PyCFunction_GET_FLAGS(exit_func) == METH_O) {
                /* C function declared as single-arg */
                return false;
             }
        }
        return true;
    }

It is important to note that this is not a generic introspection
function, but rather one which is specifically designed for our use
case. We know that exit_func is an attribute of the context manager
class (taken from the type of the object that provided __enter__), and
it is typically a function. Furthermore, for this to be useful we need
to identify enough single-arg forms, but not necessarily all of them.
What is critical for backwards compatibility is that we will never
misidentify a legacy exit_func as a single-arg one. So, for example,
__exit__(self, *args) and __exit__(self, exc_type, *args) both have the
legacy form, even though they could be invoked with one arg.

In summary, an exit_func will be invoke with a single arg if:

-   It is a PyMethod with argcount 2 (to count self) and no vararg, or
-   it is a PyCFunction with the METH_O flag.

Note that any performance cost of the introspection can be mitigated via
specialization <659>, so it won't be a problem if we need to make it
more sophisticated than this for some reason.

Backwards Compatibility

All context managers that previously worked will continue to work in the
same way because the interpreter will call them with three args whenever
they can accept three args. There may be context managers that
previously did not work because their exit_func expected one argument,
so the call to __exit__ would have caused a TypeError exception to be
raised, and now the call would succeed. This could theoretically change
the behaviour of existing code, but it is unlikely to be a problem in
practice.

The backwards compatibility concerns will show up in some cases when
libraries try to migrate their context managers from the multi-arg to
the single-arg signature. If __exit__ or __aexit__ is called by any code
other than the interpreter's eval loop, the introspection does not
automatically happen. For example, this will occur where a context
manager is subclassed and its __exit__ method is called directly from
the derived __exit__. Such context managers will need to migrate to the
single-arg version with their users, and may choose to offer a parallel
API rather than breaking the existing one. Alternatively, a superclass
can stay with the signature __exit__(self, *args), and support both one
and three args. Since most context managers do not use the value of the
arguments to __exit__, and simply allow the exception to propagate
onward, this is likely to be the common approach.

Security Implications

I am not aware of any.

How to Teach This

The language tutorial will present the single-arg version, and the
documentation for context managers will include a section on the legacy
signatures of __exit__ and __aexit__.

Reference Implementation

CPython PR #101995 implements the proposal of this PEP.

Rejected Ideas

Support __leave__(self, exc)

It was considered to support a method by a new name, such as __leave__,
with the new signature. This basically makes the programmer explicitly
declare which signature they are intending to use, and avoid the need
for introspection.

Different variations of this idea include different amounts of magic
that can help automate the equivalence between __leave__ and __exit__.
For example, Mark Shannon suggested that the type constructor would add
a default implementation for each of __exit__ and __leave__ whenever one
of them is defined on a class. This default implementation acts as a
trampoline that calls the user's function. This would make inheritance
work seamlessly, as well as the migration from __exit__ to __leave__ for
particular classes. The interpreter would just need to call __leave__,
and that would call __exit__ whenever necessary.

While this suggestion has several advantages over the current proposal,
it has two drawbacks. The first is that it adds a new dunder name to the
data model, and we would end up with two dunders that mean the same
thing, and only slightly differ in their signatures. The second is that
it would require the migration of every __exit__ to __leave__, while
with introspection it would not be necessary to change the many
__exit__(*arg) methods that do not access their args. While it is not as
simple as a grep for __exit__, it is possible to write an AST visitor
that detects __exit__ methods that can accept multiple arguments, and
which do access them.

Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.