PEP: 487
Title: Simpler customisation of class creation
Version: $Revision$
Last-Modified: $Date$
Author: Martin Teichmann <lkb.teichmann@gmail.com>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Feb-2015
Python-Version: 3.6
Post-History: 27-Feb-2015, 05-Feb-2016, 24-Jun-2016, 02-Jul-2016, 13-Jul-2016
Replaces: 422
Resolution: https://mail.python.org/pipermail/python-dev/2016-July/145629.html

Abstract

Currently, customising class creation requires the use of a custom
metaclass. This custom metaclass then persists for the entire lifecycle
of the class, creating the potential for spurious metaclass conflicts.

This PEP proposes to instead support a wide range of customisation
scenarios through a new __init_subclass__ hook in the class body, and a
hook to initialize attributes.

The new mechanism should be easier to understand and use than
implementing a custom metaclass, and thus should provide a gentler
introduction to the full power of Python's metaclass machinery.

Background

Metaclasses are a powerful tool to customize class creation. They have,
however, the problem that there is no automatic way to combine
metaclasses. If one wants to use two metaclasses for a class, a new
metaclass combining those two needs to be created, typically manually.

This need often occurs as a surprise to a user: inheriting from two base
classes coming from two different libraries suddenly raises the
necessity to manually create a combined metaclass, where typically one
is not interested in those details about the libraries at all. This
becomes even worse if one library starts to make use of a metaclass
which it has not done before. While the library itself continues to work
perfectly, suddenly every code combining those classes with classes from
another library fails.
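The conflict described above can be reproduced in a few lines. This is only a sketch: MetaA, MetaB and the two "library" classes are hypothetical stand-ins, not taken from any real library.

```python
class MetaA(type):
    """Stand-in for a metaclass used by one library (hypothetical)."""


class MetaB(type):
    """Stand-in for an unrelated metaclass from another library (hypothetical)."""


class FromLibraryA(metaclass=MetaA):
    pass


class FromLibraryB(metaclass=MetaB):
    pass


conflict = None
try:
    # Neither metaclass is a subclass of the other, so class creation fails.
    class Combined(FromLibraryA, FromLibraryB):
        pass
except TypeError as exc:
    conflict = exc


# The manual fix: derive a metaclass from both and name it explicitly.
class MetaAB(MetaA, MetaB):
    pass


class Combined(FromLibraryA, FromLibraryB, metaclass=MetaAB):
    pass
```

The user of the two libraries has to write MetaAB, even though neither metaclass is of any interest to them.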

Proposal

While there are many possible ways to use a metaclass, the vast majority
of use cases fall into just three categories: running some initialization
code after class creation, initializing descriptors, and keeping the
order in which class attributes were defined.

The first two categories can easily be achieved by having simple hooks
into the class creation:

1.  An __init_subclass__ hook that initializes all subclasses of a given
    class.
2.  Upon class creation, a __set_name__ hook is called on all the
    attributes (descriptors) defined in the class.

The third category is the topic of another PEP, PEP 520.

As an example, the first use case looks as follows:

    >>> class QuestBase:
    ...    # this is implicitly a @classmethod (see below for motivation)
    ...    def __init_subclass__(cls, swallow, **kwargs):
    ...        cls.swallow = swallow
    ...        super().__init_subclass__(**kwargs)

    >>> class Quest(QuestBase, swallow="african"):
    ...    pass

    >>> Quest.swallow
    'african'

The base class object contains an empty __init_subclass__ method which
serves as an endpoint for cooperative multiple inheritance. Note that
this method has no keyword arguments, meaning that all methods which are
more specialized have to process all keyword arguments.
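As a sketch of what this means in practice, a keyword argument that no __init_subclass__ in the hierarchy consumes ends up at the default implementation, which rejects it. The keyword name coconut is made up for this illustration.

```python
class QuestBase:
    def __init_subclass__(cls, swallow, **kwargs):
        cls.swallow = swallow
        # Whatever is left over travels on to object.__init_subclass__.
        super().__init_subclass__(**kwargs)


error = None
try:
    # "coconut" is not consumed by any hook, so the default
    # implementation receives it and raises TypeError.
    class Quest(QuestBase, swallow="african", coconut=True):
        pass
except TypeError as exc:
    error = exc
```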

This general proposal is not a new idea (it was first suggested for
inclusion in the language definition more than 10 years ago, and a
similar mechanism has long been supported by Zope's ExtensionClass), but
the situation has changed sufficiently in recent years that the idea is
worth reconsidering for inclusion.

The second part of the proposal adds a __set_name__ initializer for
class attributes, especially if they are descriptors. Descriptors are
defined in the body of a class, but they do not know anything about that
class; they do not even know the name they are accessed with. They do
get to know their owner once __get__ is called, but they still do not
know their name. This is unfortunate: for example, they cannot put their
associated value into their object's __dict__ under their name, since
they do not know that name. This problem has been solved many times, and
is one of the most important reasons to have a metaclass in a library.
While it would be easy to implement such a mechanism using the first
part of the proposal, it makes sense to have one solution for this
problem for everyone.

To give an example of its usage, imagine a descriptor representing weak
referenced values:

    import weakref

    class WeakAttribute:
        def __get__(self, instance, owner):
            return instance.__dict__[self.name]()

        def __set__(self, instance, value):
            instance.__dict__[self.name] = weakref.ref(value)

        # this is the new initializer:
        def __set_name__(self, owner, name):
            self.name = name

Such a WeakAttribute may, for example, be used in a tree structure where
one wants to avoid cyclic references via the parent:

    class TreeNode:
        parent = WeakAttribute()

        def __init__(self, parent):
            self.parent = parent

Note that the parent attribute is used like a normal attribute, yet the
tree contains no cyclic references and can thus be easily garbage
collected when out of use. The parent attribute magically becomes None
once the parent ceases to exist.
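The behaviour can be checked with a short script. It repeats the two classes above so that it runs on its own; the Root class and the demo function are hypothetical helpers added for this sketch. A separate Root class is used because the descriptor as written cannot store None (weakref.ref(None) raises TypeError).

```python
import gc
import weakref


class WeakAttribute:
    def __get__(self, instance, owner):
        # Dereference the stored weak reference; yields None once it is dead.
        return instance.__dict__[self.name]()

    def __set__(self, instance, value):
        instance.__dict__[self.name] = weakref.ref(value)

    def __set_name__(self, owner, name):
        self.name = name


class TreeNode:
    parent = WeakAttribute()

    def __init__(self, parent):
        self.parent = parent


class Root:
    """Hypothetical top-level object that has no parent of its own."""


def demo():
    root = Root()
    child = TreeNode(root)
    was_alive = child.parent is root
    del root      # drop the only strong reference to the parent
    gc.collect()  # not needed on CPython, harmless elsewhere
    return was_alive, child.parent
```

Calling demo() is expected to return (True, None): the child sees its parent while it is alive, and None afterwards.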

While this example looks trivial, it should be noted that until now
such an attribute could not be defined without the use of a metaclass.
And given that such a metaclass can make life very hard, this kind of
attribute does not exist yet.

Initializing descriptors could simply be done in the __init_subclass__
hook. But this would mean that descriptors can only be used in classes
that have the proper hook; a generic descriptor like the one in the
example would not work in arbitrary classes. One could also call
__set_name__ from within the base implementation of
object.__init_subclass__. But given that it is a common mistake to
forget to call super(), it would happen too often that descriptors are
suddenly not initialized.
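A small sketch of the consequence of this choice: because __set_name__ is called by type.__new__ itself, a descriptor is initialized even when a base class forgets to call super() in its __init_subclass__. The names Named, ForgetfulBase and Child are hypothetical.

```python
class Named:
    """Hypothetical descriptor that only records the name it was bound to."""

    name = None

    def __set_name__(self, owner, name):
        self.name = name


class ForgetfulBase:
    def __init_subclass__(cls, **kwargs):
        # Deliberately omits super().__init_subclass__(**kwargs).
        pass


class Child(ForgetfulBase):
    field = Named()


# The descriptor still learned its name.
bound_name = vars(Child)["field"].name
```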

Key Benefits

Easier inheritance of definition time behaviour

Understanding Python's metaclasses requires a deep understanding of the
type system and the class construction process. This is legitimately
seen as challenging, due to the need to keep multiple moving parts (the
code, the metaclass hint, the actual metaclass, the class object,
instances of the class object) clearly distinct in your mind. Even when
you know the rules, it's still easy to make a mistake if you're not
being extremely careful.

Understanding the proposed implicit class initialization hook only
requires ordinary method inheritance, which isn't quite as daunting a
task. The new hook provides a more gradual path towards understanding
all of the phases involved in the class definition process.

Reduced chance of metaclass conflicts

One of the big issues that makes library authors reluctant to use
metaclasses (even when they would be appropriate) is the risk of
metaclass conflicts. These occur whenever two unrelated metaclasses are
used by the desired parents of a class definition. This risk also makes
it very difficult to add a metaclass to a class that has previously been
published without one.

By contrast, adding an __init_subclass__ method to an existing type
poses a similar level of risk to adding an __init__ method: technically,
there is a risk of breaking poorly implemented subclasses, but when that
occurs, it is recognised as a bug in the subclass rather than the
library author breaching backwards compatibility guarantees.

New Ways of Using Classes

Subclass registration

Especially when writing a plugin system, one likes to register new
subclasses of a plugin baseclass. This can be done as follows:

    class PluginBase:
        subclasses = []

        def __init_subclass__(cls, **kwargs):
            super().__init_subclass__(**kwargs)
            cls.subclasses.append(cls)

In this example, PluginBase.subclasses will contain a plain list of all
subclasses in the entire inheritance tree. One should note that this
also works nicely as a mixin class.
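For illustration, here is the registry in use. The plugin class names are invented for this sketch. Note that PluginBase itself is not registered, since the hook only runs for subclasses.

```python
class PluginBase:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # cls.subclasses resolves to the single list defined on PluginBase.
        cls.subclasses.append(cls)


class AudioPlugin(PluginBase):
    pass


class Mp3Plugin(AudioPlugin):  # an indirect subclass is registered too
    pass


class VideoPlugin(PluginBase):
    pass
```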

Trait descriptors

There are many designs of Python descriptors in the wild which, for
example, check boundaries of values. Often those "traits" need some
support from a metaclass to work. This is how it would look with
this PEP:

    class Trait:
        def __init__(self, minimum, maximum):
            self.minimum = minimum
            self.maximum = maximum

        def __get__(self, instance, owner):
            return instance.__dict__[self.key]

        def __set__(self, instance, value):
            if self.minimum < value < self.maximum:
                instance.__dict__[self.key] = value
            else:
                raise ValueError("value not in range")

        def __set_name__(self, owner, name):
            self.key = name
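A possible use of such a trait is sketched below. The Oven class and its temperature range are made up for the example, and the Trait class is repeated so the snippet runs on its own. Note that the comparison in __set__ is strict, so the bounds themselves are rejected.

```python
class Trait:
    def __init__(self, minimum, maximum):
        self.minimum = minimum
        self.maximum = maximum

    def __get__(self, instance, owner):
        return instance.__dict__[self.key]

    def __set__(self, instance, value):
        if self.minimum < value < self.maximum:
            instance.__dict__[self.key] = value
        else:
            raise ValueError("value not in range")

    def __set_name__(self, owner, name):
        self.key = name


class Oven:
    temperature = Trait(0, 300)  # hypothetical range


oven = Oven()
oven.temperature = 180  # accepted and stored under the key "temperature"

rejected = False
try:
    oven.temperature = 300  # the upper bound is excluded
except ValueError:
    rejected = True
```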

Implementation Details

The hooks are called in the following order: type.__new__ calls the
__set_name__ hooks on the descriptors after the new class has been
initialized. Then it calls __init_subclass__ on the base class, on
super(), to be precise. This means that subclass initializers already
see the fully initialized descriptors. This way, __init_subclass__ users
can adjust the descriptors again if needed.
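The order can be observed by recording each hook call. This is a sketch; the events list and the class names are hypothetical.

```python
events = []


class Recorder:
    """Hypothetical descriptor that logs when it is given its name."""

    def __set_name__(self, owner, name):
        events.append(("set_name", owner.__name__, name))


class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        events.append(("init_subclass", cls.__name__))


class Child(Base):
    first = Recorder()
    second = Recorder()
```

After the class statement for Child has run, events holds the two set_name entries (in definition order) followed by the init_subclass entry.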

Another option would have been to call __set_name__ in the base
implementation of object.__init_subclass__. This way it would even be
possible to prevent __set_name__ from being called. Most of the
time, however, such a prevention would be accidental, as it often
happens that a call to super() is forgotten.

As a third option, all the work could have been done in type.__init__.
Most metaclasses do their work in __new__, as this is recommended by the
documentation. Many metaclasses modify their arguments before they pass
them over to super().__new__. For compatibility with those kinds of
classes, the hooks should be called from __new__.

Another small change should be made: in the current implementation of
CPython, type.__init__ explicitly forbids the use of keyword arguments,
while type.__new__ allows its arguments to be passed as keyword
arguments. This is inconsistent, and thus passing them to type.__new__
as keywords should be forbidden as well. While it would be possible to
retain the current behavior, it would be better to fix this, as it is
probably not used at all: the only use case would be that a metaclass
calls its super().__new__ with name, bases and dict (yes, dict, not
namespace or ns as mostly used with modern metaclasses) as keyword
arguments. This should not be done. This little change simplifies the
implementation of this PEP significantly, while improving the coherence
of Python overall.

As a second change, the new type.__init__ just ignores keyword
arguments. Currently, it insists that no keyword arguments are given.
This leads to a (wanted) error if one gives keyword arguments to a class
declaration if the metaclass does not process them. Metaclass authors
that do want to accept keyword arguments must filter them out by
overriding __init__.

In the new code, it is not __init__ that complains about keyword
arguments, but __init_subclass__, whose default implementation takes no
arguments. In a classical inheritance scheme using the method resolution
order, each __init_subclass__ may take out its keyword arguments until
none are left, which is checked by the default implementation of
__init_subclass__.
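A minimal sketch of this cooperative scheme, with two hypothetical mixins that each consume one keyword argument and pass the rest along the method resolution order:

```python
class Colored:
    def __init_subclass__(cls, color="black", **kwargs):
        # Take out "color", hand everything else to the next class in the MRO.
        super().__init_subclass__(**kwargs)
        cls.color = color


class Sized:
    def __init_subclass__(cls, size=1, **kwargs):
        # Take out "size"; nothing should be left for object.__init_subclass__.
        super().__init_subclass__(**kwargs)
        cls.size = size


class Widget(Colored, Sized, color="red", size=3):
    pass
```

Here Colored.__init_subclass__ runs first, its super() call reaches Sized.__init_subclass__, and the default implementation finally receives no keyword arguments at all.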

For readers who prefer reading Python over English, this PEP proposes to
replace the current type and object with the following:

    import types

    class NewType(type):
        def __new__(cls, *args, **kwargs):
            if len(args) != 3:
                return super().__new__(cls, *args)
            name, bases, ns = args
            init = ns.get('__init_subclass__')
            if isinstance(init, types.FunctionType):
                ns['__init_subclass__'] = classmethod(init)
            self = super().__new__(cls, name, bases, ns)
            for k, v in self.__dict__.items():
                func = getattr(v, '__set_name__', None)
                if func is not None:
                    func(self, k)
            super(self, self).__init_subclass__(**kwargs)
            return self

        def __init__(self, name, bases, ns, **kwargs):
            super().__init__(name, bases, ns)

    class NewObject(object):
        @classmethod
        def __init_subclass__(cls):
            pass

Reference Implementation

The reference implementation for this PEP is attached to issue 27366.

Backward compatibility issues

The exact calling sequence in type.__new__ is slightly changed, raising
concerns about backwards compatibility. Tests should ensure that
common use cases behave as desired.

The following class definitions (except the one defining the metaclass)
continue to fail with a TypeError as superfluous class arguments are
passed:

    class MyMeta(type):
        pass

    class MyClass(metaclass=MyMeta, otherarg=1):
        pass

    MyMeta("MyClass", (), {}, otherarg=1)

    import types
    types.new_class("MyClass", (), dict(metaclass=MyMeta, otherarg=1))
    types.prepare_class("MyClass", (), dict(metaclass=MyMeta, otherarg=1))

A metaclass defining only a __new__ method which is interested in
keyword arguments now does not need to define an __init__ method
anymore, as the default type.__init__ ignores keyword arguments. This is
nicely in line with the recommendation to override __new__ in
metaclasses instead of __init__. The following code does not fail
anymore:

    class MyMeta(type):
        def __new__(cls, name, bases, namespace, otherarg):
            return super().__new__(cls, name, bases, namespace)

    class MyClass(metaclass=MyMeta, otherarg=1):
        pass

Only defining an __init__ method in a metaclass continues to fail with
TypeError if keyword arguments are given:

    class MyMeta(type):
        def __init__(self, name, bases, namespace, otherarg):
            super().__init__(name, bases, namespace)

    class MyClass(metaclass=MyMeta, otherarg=1):
        pass

Defining both __init__ and __new__ continues to work fine.

About the only thing that stops working is passing the arguments of
type.__new__ as keyword arguments:

    class MyMeta(type):
        def __new__(cls, name, bases, namespace):
            return super().__new__(cls, name=name, bases=bases,
                                   dict=namespace)

    class MyClass(metaclass=MyMeta):
        pass

This will now raise TypeError, but this is weird code, and easy to fix
even if someone used this feature.

Rejected Design Options

Calling the hook on the class itself

Adding an __autodecorate__ hook that would be called on the class itself
was the proposed idea of PEP 422. Most examples work the same way or
even better if the hook is called only on strict subclasses. In general,
it is much easier to arrange to explicitly call the hook on the class in
which it is defined (to opt in to such a behavior) than to opt out (by
remembering to check for cls is __class__ in the hook body) when one
does not want the hook to be called on the class it is defined in.

This becomes most evident if the class in question is designed as a
mixin: it is very unlikely that the code of the mixin is to be executed
for the mixin class itself, as it is not supposed to be a complete class
on its own.

The original proposal also made major changes in the class
initialization process, rendering it impossible to back-port the
proposal to older Python versions.

When it's desired to also call the hook on the base class, two
mechanisms are available:

1.  Introduce an additional mixin class just to hold the
    __init_subclass__ implementation. The original "base" class can then
    list the new mixin as its first parent class.
2.  Implement the desired behaviour as an independent class decorator,
    and apply that decorator explicitly to the base class, and then
    implicitly to subclasses via __init_subclass__.

Calling __init_subclass__ explicitly from a class decorator will
generally be undesirable, as this will also typically call
__init_subclass__ a second time on the parent class, which is unlikely
to be desired behaviour.
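The second mechanism above can be sketched as follows. The register decorator and the registry list are hypothetical names chosen for this illustration.

```python
registry = []


def register(cls):
    """Hypothetical class decorator: record the class and return it unchanged."""
    registry.append(cls)
    return cls


@register  # explicit application to the base class
class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        register(cls)  # implicit application to every subclass


class Child(Base):
    pass
```

The decorator does not call __init_subclass__ itself, so no hook runs a second time on a parent class.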

Other variants of calling the hooks

Other names for the hook were presented, namely __decorate__ or
__autodecorate__. This proposal opts for __init_subclass__ as it is very
close to the __init__ method, just for the subclass, while it is not
very close to decorators, as it does not return the class.

For the __set_name__ hook other names have been proposed as well,
__set_owner__, __set_ownership__ and __init_descriptor__.

Requiring an explicit decorator on __init_subclass__

One could require the explicit use of @classmethod on the
__init_subclass__ method. It was made implicit since there's no
sensible interpretation for leaving it out, and that case would need to
be detected anyway in order to give a useful error message.

This decision was reinforced after noticing that the user experience of
defining __prepare__ and forgetting the @classmethod decorator is
singularly incomprehensible (particularly since PEP 3115 documents it as
an ordinary method, and the current documentation doesn't explicitly say
anything one way or the other).
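The implicit conversion can be observed directly in the class namespace. This is a small sketch with hypothetical class names; it inspects what is stored in the class __dict__ after the class has been created.

```python
class Base:
    def __init_subclass__(cls, **kwargs):  # no @classmethod written here
        super().__init_subclass__(**kwargs)
        cls.initialized_by = "Base"


class Child(Base):
    pass


# The plain function was wrapped in a classmethod during class creation.
stored = vars(Base)["__init_subclass__"]
```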

A more __new__-like hook

In PEP 422 the hook worked more like the __new__ method than the
__init__ method, meaning that it returned a class instead of modifying
one. This allows a bit more flexibility, but at the cost of much harder
implementation and undesired side effects.

Adding a class attribute with the attribute order

This got its own PEP 520.

History

This used to be a competing proposal to PEP 422 by Alyssa Coghlan and
Daniel Urban. PEP 422 intended to achieve the same goals as this PEP,
but with a different way of implementation. In the meantime, PEP 422 has
been withdrawn in favour of this approach.

Copyright

This document has been placed in the public domain.