PEP: 403 Title: General purpose decorator clause (aka "@in" clause)
Version: $Revision$ Last-Modified: $Date$ Author: Alyssa Coghlan
<ncoghlan@gmail.com> Status: Deferred Type: Standards Track
Content-Type: text/x-rst Created: 13-Oct-2011 Python-Version: 3.4
Post-History: 13-Oct-2011

Abstract

This PEP proposes the addition of a new @in decorator clause that makes
it possible to override the name binding step of a function or class
definition.

The new clause accepts a single simple statement that can make a forward
reference to decorated function or class definition.

This new clause is designed to be used whenever a "one-shot" function or
class is needed, and placing the function or class definition before the
statement that uses it actually makes the code harder to read. It also
avoids any name shadowing concerns by making sure the new name is
visible only to the statement in the @in clause.

This PEP is based heavily on many of the ideas in PEP 3150 (Statement
Local Namespaces) so some elements of the rationale will be familiar to
readers of that PEP. Both PEPs remain deferred for the time being,
primarily due to the lack of compelling real world use cases in either
PEP.

Basic Examples

Before diving into the long history of this problem and the detailed
rationale for this specific proposed solution, here are a few simple
examples of the kind of code it is designed to simplify.

As a trivial example, a weakref callback could be defined as follows:

    @in x = weakref.ref(target, report_destruction)
    def report_destruction(obj):
        print("{} is being destroyed".format(obj))

This contrasts with the current (conceptually) "out of order" syntax for
this operation:

    def report_destruction(obj):
        print("{} is being destroyed".format(obj))

    x = weakref.ref(target, report_destruction)

That structure is OK when you're using the callable multiple times, but
it's irritating to be forced into it for one-off operations.

If the repetition of the name seems especially annoying, then a
throwaway name like f can be used instead:

    @in x = weakref.ref(target, f)
    def f(obj):
        print("{} is being destroyed".format(obj))

Similarly, a sorted operation on a particularly poorly defined type
could now be defined as:

    @in sorted_list = sorted(original, key=f)
    def f(item):
        try:
            return item.calc_sort_order()
        except NotSortableError:
            return float('inf')

Rather than:

    def force_sort(item):
        try:
            return item.calc_sort_order()
        except NotSortableError:
            return float('inf')

    sorted_list = sorted(original, key=force_sort)

And early binding semantics in a list comprehension could be attained
via:

    @in funcs = [adder(i) for i in range(10)]
    def adder(i):
        return lambda x: x + i

Proposal

This PEP proposes the addition of a new @in clause that is a variant of
the existing class and function decorator syntax.

The new @in clause precedes the decorator lines, and allows forward
references to the trailing function or class definition.

The trailing function or class definition is always named - the name of
the trailing definition is then used to make the forward reference from
the @in clause.

The @in clause is allowed to contain any simple statement (including
those that don't make any sense in that context, such as pass - while
such code would be legal, there wouldn't be any point in writing it).
This permissive structure is easier to define and easier to explain, but
a more restrictive approach that only permits operations that "make
sense" would also be possible (see PEP 3150 for a list of possible
candidates).

The @in clause will not create a new scope - all name binding operations
aside from the trailing function or class definition will affect the
containing scope.

The name used in the trailing function or class definition is only
visible from the associated @in clause, and behaves as if it was an
ordinary variable defined in that scope. If any nested scopes are
created in either the @in clause or the trailing function or class
definition, those scopes will see the trailing function or class
definition rather than any other bindings for that name in the
containing scope.

In a very real sense, this proposal is about making it possible to
override the implicit "name = <defined function or class>" name binding
operation that is part of every function or class definition,
specifically in those cases where the local name binding isn't actually
needed.

Under this PEP, an ordinary class or function definition:

    @deco2
    @deco1
    def name():
        ...

can be explained as being roughly equivalent to:

    @in name = deco2(deco1(name))
    def name():
        ...

Syntax Change

Syntactically, only one new grammar rule is needed:

    in_stmt: '@in' simple_stmt decorated

Grammar: http://hg.python.org/cpython/file/default/Grammar/Grammar

Design Discussion

Background

The question of "multi-line lambdas" has been a vexing one for many
Python users for a very long time, and it took an exploration of Ruby's
block functionality for me to finally understand why this bugs people so
much: Python's demand that the function be named and introduced before
the operation that needs it breaks the developer's flow of thought. They
get to a point where they go "I need a one-shot operation that does
<X>", and instead of being able to just say that directly, they instead
have to back up, name a function to do <X>, then call that function from
the operation they actually wanted to do in the first place. Lambda
expressions can help sometimes, but they're no substitute for being able
to use a full suite.

Ruby's block syntax also heavily inspired the style of the solution in
this PEP, by making it clear that even when limited to one anonymous
function per statement, anonymous functions could still be incredibly
useful. Consider how many constructs Python has where one expression is
responsible for the bulk of the heavy lifting:

-   comprehensions, generator expressions, map(), filter()
-   key arguments to sorted(), min(), max()
-   partial function application
-   provision of callbacks (e.g. for weak references or asynchronous IO)
-   array broadcast operations in NumPy

However, adopting Ruby's block syntax directly won't work for Python,
since the effectiveness of Ruby's blocks relies heavily on various
conventions in the way functions are defined (specifically, using Ruby's
yield syntax to call blocks directly and the &arg mechanism to accept a
block as a function's final argument).

Since Python has relied on named functions for so long, the signatures
of APIs that accept callbacks are far more diverse, thus requiring a
solution that allows one-shot functions to be slotted in at the
appropriate location.

The approach taken in this PEP is to retain the requirement to name the
function explicitly, but allow the relative order of the definition and
the statement that references it to be changed to match the developer's
flow of thought. The rationale is essentially the same as that used when
introducing decorators, but covering a broader set of applications.

Relation to PEP 3150

PEP 3150 (Statement Local Namespaces) describes its primary motivation
as being to elevate ordinary assignment statements to be on par with
class and def statements where the name of the item to be defined is
presented to the reader in advance of the details of how the value of
that item is calculated. This PEP achieves the same goal in a different
way, by allowing the simple name binding of a standard function
definition to be replaced with something else (like assigning the result
of the function to a value).

Despite having the same author, the two PEPs are in direct competition
with each other. PEP 403 represents a minimalist approach that attempts
to achieve useful functionality with a minimum of change from the status
quo. This PEP instead aims for a more flexible standalone statement
design, which requires a larger degree of change to the language.

Note that where PEP 403 is better suited to explaining the behaviour of
generator expressions correctly, this PEP is better able to explain the
behaviour of decorator clauses in general. Both PEPs support adequate
explanations for the semantics of container comprehensions.

Keyword Choice

The proposal definitely requires some kind of prefix to avoid parsing
ambiguity and backwards compatibility problems with existing constructs.
It also needs to be clearly highlighted to readers, since it declares
that the following piece of code is going to be executed only after the
trailing function or class definition has been executed.

The in keyword was chosen as an existing keyword that can be used to
denote the concept of a forward reference.

The @ prefix was included in order to exploit the fact that Python
programmers are already used to decorator syntax as an indication of out
of order execution, where the function or class is actually defined
first and then decorators are applied in reverse order.

For functions, the construct is intended to be read as "in <this
statement that references NAME> define NAME as a function that does
<operation>".

The mapping to English prose isn't as obvious for the class definition
case, but the concept remains the same.

Better Debugging Support for Functions and Classes with Short Names

One of the objections to widespread use of lambda expressions is that
they have a negative effect on traceback intelligibility and other
aspects of introspection. Similar objections are raised regarding
constructs that promote short, cryptic function names (including this
one, which requires that the name of the trailing definition be supplied
at least twice, encouraging the use of shorthand placeholder names like
f).

However, the introduction of qualified names in PEP 3155 means that even
anonymous classes and functions will now have different representations
if they occur in different scopes. For example:

    >>> def f():
    ...     return lambda: y
    ...
    >>> f()
    <function f.<locals>.<lambda> at 0x7f6f46faeae0>

Anonymous functions (or functions that share a name) within the same
scope will still share representations (aside from the object ID), but
this is still a major improvement over the historical situation where
everything except the object ID was identical.

Possible Implementation Strategy

This proposal has at least one titanic advantage over PEP 3150:
implementation should be relatively straightforward.

The @in clause will be included in the AST for the associated function
or class definition and the statement that references it. When the @in
clause is present, it will be emitted in place of the local name binding
operation normally implied by a function or class definition.

The one potentially tricky part is changing the meaning of the
references to the statement local function or namespace while within the
scope of the in statement, but that shouldn't be too hard to address by
maintaining some additional state within the compiler (it's much easier
to handle this for a single name than it is for an unknown number of
names in a full nested suite).

Explaining Container Comprehensions and Generator Expressions

One interesting feature of the proposed construct is that it can be used
as a primitive to explain the scoping and execution order semantics of
both generator expressions and container comprehensions:

    seq2 = [x for x in y if q(x) for y in seq if p(y)]

    # would be equivalent to

    @in seq2 = f(seq):
    def f(seq)
        result = []
        for y in seq:
            if p(y):
                for x in y:
                    if q(x):
                        result.append(x)
        return result

The important point in this expansion is that it explains why
comprehensions appear to misbehave at class scope: only the outermost
iterator is evaluated at class scope, while all predicates, nested
iterators and value expressions are evaluated inside a nested scope.

An equivalent expansion is possible for generator expressions:

    gen = (x for x in y if q(x) for y in seq if p(y))

    # would be equivalent to

    @in gen = g(seq):
    def g(seq)
        for y in seq:
            if p(y):
                for x in y:
                    if q(x):
                        yield x

More Examples

Calculating attributes without polluting the local namespace (from
os.py):

    # Current Python (manual namespace cleanup)
    def _createenviron():
        ... # 27 line function

    environ = _createenviron()
    del _createenviron

    # Becomes:
    @in environ = _createenviron()
    def _createenviron():
        ... # 27 line function

Loop early binding:

    # Current Python (default argument hack)
    funcs = [(lambda x, i=i: x + i) for i in range(10)]

    # Becomes:
    @in funcs = [adder(i) for i in range(10)]
    def adder(i):
        return lambda x: x + i

    # Or even:
    @in funcs = [adder(i) for i in range(10)]
    def adder(i):
        @in return incr
        def incr(x):
            return x + i

A trailing class can be used as a statement local namespace:

    # Evaluate subexpressions only once
    @in c = math.sqrt(x.a*x.a + x.b*x.b)
    class x:
        a = calculate_a()
        b = calculate_b()

A function can be bound directly to a location which isn't a valid
identifier:

    @in dispatch[MyClass] = f
    def f():
        ...

Constructs that verge on decorator abuse can be eliminated:

    # Current Python
    @call
    def f():
        ...

    # Becomes:
    @in f()
    def f():
        ...

Reference Implementation

None as yet.

Acknowledgements

Huge thanks to Gary Bernhardt for being blunt in pointing out that I had
no idea what I was talking about in criticising Ruby's blocks, kicking
off a rather enlightening process of investigation.

Rejected Concepts

To avoid retreading previously covered ground, some rejected
alternatives are documented in this section.

Omitting the decorator prefix character

Earlier versions of this proposal omitted the @ prefix. However, without
that prefix, the bare in keyword didn't associate the clause strongly
enough with the subsequent function or class definition. Reusing the
decorator prefix and explicitly characterising the new construct as a
kind of decorator clause is intended to help users link the two concepts
and see them as two variants of the same idea.

Anonymous Forward References

A previous incarnation of this PEP (see[1]) proposed a syntax where the
new clause was introduced with : and the forward reference was written
using @. Feedback on this variant was almost universally negative, as it
was considered both ugly and excessively magical:

    :x = weakref.ref(target, @)
    def report_destruction(obj):
        print("{} is being destroyed".format(obj))

A more recent variant always used ... for forward references, along with
genuinely anonymous function and class definitions. However, this
degenerated quickly into a mass of unintelligible dots in more complex
cases:

    in funcs = [...(i) for i in range(10)]
    def ...(i):
      in return ...
      def ...(x):
          return x + i

    in c = math.sqrt(....a*....a + ....b*....b)
    class ...:
      a = calculate_a()
      b = calculate_b()

Using a nested suite

The problems with using a full nested suite are best described in PEP
3150. It's comparatively difficult to implement properly, the scoping
semantics are harder to explain and it creates quite a few situations
where there are two ways to do it without clear guidelines for choosing
between them (as almost any construct that can be expressed with
ordinary imperative code could instead be expressed using a given
statement). While the PEP does propose some new PEP 8 guidelines to help
address that last problem, the difficulties in implementation are not so
easily dealt with.

By contrast, the decorator inspired syntax in this PEP explicitly limits
the new feature to cases where it should actually improve readability,
rather than harming it. As in the case of the original introduction of
decorators, the idea of this new syntax is that if it can be used (i.e.
the local name binding of the function is completely unnecessary) then
it probably should be used.

Another possible variant of this idea is to keep the decorator based
semantics of this PEP, while adopting the prettier syntax from PEP 3150:

    x = weakref.ref(target, report_destruction) given:
        def report_destruction(obj):
            print("{} is being destroyed".format(obj))

There are a couple of problems with this approach. The main issue is
that this syntax variant uses something that looks like a suite, but
really isn't one. A secondary concern is that it's not clear how the
compiler will know which name(s) in the leading expression are forward
references (although that could potentially be addressed through a
suitable definition of the suite-that-is-not-a-suite in the language
grammar).

However, a nested suite has not yet been ruled out completely. The
latest version of PEP 3150 uses explicit forward reference and name
binding schemes that greatly simplify the semantics of the statement,
and it does offer the advantage of allowing the definition of arbitrary
subexpressions rather than being restricted to a single function or
class definition.

References

Copyright

This document has been placed in the public domain.

[1] Start of python-ideas thread:
https://mail.python.org/pipermail/python-ideas/2011-October/012276.html