PEP: 575
Title: Unifying function/method classes
Author: Jeroen Demeyer <J.Demeyer@UGent.be>
Status: Withdrawn
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Mar-2018
Python-Version: 3.8
Post-History: 31-Mar-2018, 12-Apr-2018, 27-Apr-2018, 05-May-2018

Withdrawal notice

See PEP 580 for a better solution to allowing fast calling of custom
classes.

See PEP 579 for a broader discussion of some of the other issues from
this PEP.

Abstract

Reorganize the class hierarchy for functions and methods with the goal
of reducing the difference between built-in functions (implemented in C)
and Python functions. Mainly, make built-in functions behave more like
Python functions without sacrificing performance.

A new base class base_function is introduced and the various function
classes, as well as method (renamed to bound_method), inherit from it.

We also allow subclassing the Python function class.

Motivation

Currently, CPython has two different function classes: the first is
Python functions, which is what you get when defining a function with
def or lambda. The second is built-in functions such as len, isinstance
or numpy.dot. These are implemented in C.

These two classes are implemented completely independently and have
different functionality. In particular, it is currently not possible to
implement a function efficiently in C (only built-in functions can do
that) while still allowing introspection like inspect.signature or
inspect.getsourcefile (only Python functions can do that). This is a
problem for projects like Cython[1] that want to do exactly that.
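
The gap between the two classes can be observed in current CPython. The following is a minimal sketch for illustration only, not part of the proposal; the names py_func and can_locate_source are made up for this example.

```python
import inspect
import types


def py_func(x):
    return x


def can_locate_source(func):
    """Return True if inspect.getsourcefile() accepts the object at all."""
    try:
        inspect.getsourcefile(func)
    except TypeError:
        # Raised for built-in functions: they carry no source information.
        return False
    return True


# Two unrelated classes: neither is a subclass of the other.
python_class = type(py_func)   # types.FunctionType
builtin_class = type(len)      # types.BuiltinFunctionType

# Only Python functions carry a code object for introspection.
python_has_code = hasattr(py_func, "__code__")
builtin_has_code = hasattr(len, "__code__")
```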

In Cython, this was worked around by inventing a new function class
called cyfunction. Unfortunately, a new function class creates problems:
the inspect module does not recognize such functions as being
functions[2] and the performance is worse (CPython has specific
optimizations for calling built-in functions).

A second motivation is more generally making built-in functions and
methods behave more like Python functions and methods. For example,
Python unbound methods are just functions but unbound methods of
extension types (e.g. dict.get) are a distinct class. Bound methods of
Python classes have a __func__ attribute, bound methods of extension
types do not.
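
Both differences are visible in current CPython. This is a small sketch of the existing behaviour, assuming Python 3.7 or later for types.MethodDescriptorType; the class Example is invented for the illustration.

```python
import types


class Example:
    def method(self):
        return "called"


# Unbound: a plain function for Python classes,
# a distinct class for extension types.
python_unbound = type(Example.method)   # function
builtin_unbound = type(dict.get)        # method_descriptor

# Bound: only the Python bound method exposes __func__.
python_bound = Example().method
builtin_bound = {}.get
```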

Third, this PEP allows great customization of functions. The function
class becomes subclassable and custom function subclasses are also
allowed for functions implemented in C. In the latter case, this can be
done with the same performance as true built-in functions. All functions
can access the function object (the self in __call__), paving the way
for PEP 573.

New classes

This is the new class hierarchy for functions and methods:

                  object
                     |
                     |
              base_function
             /       |     \
            /        |      \
           /         |   defined_function
          /          |        \
    cfunction (*)    |         \
                     |       function
                     |
               bound_method (*)

The two classes marked with (*) do not allow subclassing; the others do.

There is no difference between functions and unbound methods, while
bound methods are instances of bound_method.

base_function

The class base_function becomes a new base class for all function types.
It is based on the existing builtin_function_or_method class, but with
the following differences and new features:

1.  It acts as a descriptor implementing __get__ to turn a function into
    a method if m_self is NULL. If m_self is not NULL, then this is a
    no-op: the existing function is returned instead.
2.  A new read-only attribute __parent__, represented in the C structure
    as m_parent. If this attribute exists, it represents the defining
    object. For methods of extension types, this is the defining class
    (__class__ in plain Python) and for functions of a module, this is
    the defining module. In general, it can be any Python object. If
    __parent__ is a class, it carries special semantics: in that case,
    the function must be called with self being an instance of that
    class. Finally, __qualname__ and __reduce__ will use __parent__ as
    namespace (instead of __self__ before).
3.  A new attribute __objclass__ which equals __parent__ if __parent__
    is a class. Otherwise, accessing __objclass__ raises AttributeError.
    This is meant to be backwards compatible with method_descriptor.
4.  The field ml_doc and the attributes __doc__ and __text_signature__
    (see Argument Clinic, PEP 436) are not supported.
5.  A new flag METH_PASS_FUNCTION for ml_flags. If this flag is set, the
    C function stored in ml_meth is called with an additional first
    argument equal to the function object.
6.  A new flag METH_BINDING for ml_flags which only applies to functions
    of a module (not methods of a class). If this flag is set, then
    m_self is set to NULL instead of the module. This allows the
    function to behave more like a Python function as it enables
    __get__.
7.  A new flag METH_CALL_UNBOUND to disable self slicing.
8.  A new flag METH_PYTHON for ml_flags. This flag indicates that this
    function should be treated as Python function. Ideally, use of this
    flag should be avoided because it goes against the duck typing
    philosophy. It is still needed in a few places though, for example
    profiling.

The goal of base_function is that it supports all different ways of
calling functions and methods in just one structure. For example, the
new flag METH_PASS_FUNCTION will be used by the implementation of
methods.

It is not possible to directly create instances of base_function (tp_new
is NULL). However, it is legal for C code to manually create instances.

These are the relevant C structures:

    PyTypeObject PyBaseFunction_Type;

    typedef struct {
        PyObject_HEAD
        PyCFunctionDef *m_ml;     /* Description of the C function to call */
        PyObject *m_self;         /* __self__: anything, can be NULL; readonly */
        PyObject *m_module;       /* __module__: anything (typically str) */
        PyObject *m_parent;       /* __parent__: anything, can be NULL; readonly */
        PyObject *m_weakreflist;  /* List of weak references */
    } PyBaseFunctionObject;

    typedef struct {
        const char *ml_name;   /* The name of the built-in function/method */
        PyCFunction ml_meth;   /* The C function that implements it */
        int ml_flags;          /* Combination of METH_xxx flags, which mostly
                                  describe the args expected by the C func */
    } PyCFunctionDef;

Subclasses may extend PyCFunctionDef with extra fields.

The Python attribute __self__ returns m_self, except if METH_STATIC is
set. In that case or if m_self is NULL, then there is no __self__
attribute at all. For that reason, we write either m_self or __self__ in
this PEP with slightly different meanings.

cfunction

This is the new version of the old builtin_function_or_method class. The
name cfunction was chosen to avoid confusion with "built-in" in the
sense of "something in the builtins module". It also fits better with
the C API, which uses the PyCFunction prefix.

The class cfunction is a copy of base_function, with the following
differences:

1.  m_ml points to a PyMethodDef structure, extending PyCFunctionDef
    with an additional ml_doc field to implement __doc__ and
    __text_signature__ as read-only attributes:

        typedef struct {
            const char *ml_name;
            PyCFunction ml_meth;
            int ml_flags;
            const char *ml_doc;
        } PyMethodDef;

    Note that PyMethodDef is part of the Python Stable ABI (PEP 384) and
    it is used by practically all extension modules, so we absolutely
    cannot change this structure.

2.  Argument Clinic (PEP 436) is supported.

3.  __self__ always exists. In the cases where base_function.__self__
    would raise AttributeError, instead None is returned.

The type object is PyTypeObject PyCFunction_Type and we define
PyCFunctionObject as alias of PyBaseFunctionObject (except for the type
of m_ml).

defined_function

The class defined_function is an abstract base class meant to indicate
that the function has introspection support. Instances of
defined_function are required to support all attributes that Python
functions have, namely __code__, __globals__, __doc__, __defaults__,
__kwdefaults__, __closure__ and __annotations__. There is also a
__dict__ to support attributes added by the user.

None of these is required to be meaningful. In particular, __code__ need
not be a working code object: possibly only a few of its fields are
filled in. This PEP does not dictate how the various attributes are
implemented. They may be simple struct members or more complicated
descriptors. Only read-only support is required; none of the attributes
is required to be writable.

The class defined_function is mainly meant for auto-generated C code,
for example produced by Cython[3]. There is no API to create instances
of it.

The C structure is the following:

    PyTypeObject PyDefinedFunction_Type;

    typedef struct {
        PyBaseFunctionObject base;
        PyObject *func_dict;        /* __dict__: dict or NULL */
    } PyDefinedFunctionObject;

TODO: maybe find a better name for defined_function. Other proposals:
inspect_function (anything that satisfies inspect.isfunction),
builtout_function (a function that is better built out; pun on builtin),
generic_function (original proposal but conflicts with
functools.singledispatch generic functions), user_function (defined by
the user as opposed to CPython).

function

This is the class meant for functions implemented in Python. Unlike the
other function types, instances of function can be created from Python
code. This is not changed, so we do not describe the details in this
PEP.

The layout of the C structure is the following:

    PyTypeObject PyFunction_Type;

    typedef struct {
        PyBaseFunctionObject base;
        PyObject *func_dict;        /* __dict__: dict or NULL */
        PyObject *func_code;        /* __code__: code */
        PyObject *func_globals;     /* __globals__: dict; readonly */
        PyObject *func_name;        /* __name__: string */
        PyObject *func_qualname;    /* __qualname__: string */
        PyObject *func_doc;         /* __doc__: can be anything or NULL */
        PyObject *func_defaults;    /* __defaults__: tuple or NULL */
        PyObject *func_kwdefaults;  /* __kwdefaults__: dict or NULL */
        PyObject *func_closure;     /* __closure__: tuple of cell objects or NULL; readonly */
        PyObject *func_annotations; /* __annotations__: dict or NULL */
        PyCFunctionDef _ml;         /* Storage for base.m_ml */
    } PyFunctionObject;

The descriptor __name__ returns func_name. When setting __name__, also
base.m_ml->ml_name is updated with the UTF-8 encoded name.

The _ml field reserves space to be used by base.m_ml.

A base_function instance must have the flag METH_PYTHON set if and only
if it is an instance of function.

When constructing an instance of function from code and globals, an
instance is created with base.m_ml = &_ml, base.m_self = NULL.

To make subclassing easier, we also add a copy constructor: if f is an
instance of function, then types.FunctionType(f) copies f. This
conveniently allows using a custom function type as decorator:

    >>> from types import FunctionType
    >>> class CustomFunction(FunctionType):
    ...     pass
    >>> @CustomFunction
    ... def f(x):
    ...     return x
    >>> type(f)
    <class '__main__.CustomFunction'>

This also removes many use cases of functools.wraps: wrappers can be
replaced by subclasses of function.
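
For comparison, here is a sketch of how a function has to be copied in CPython without this PEP, where types.FunctionType has no copy constructor and cannot be subclassed. The helpers copy_function and try_subclass_function are hypothetical names used only here, and annotations are left out for brevity.

```python
import types


def copy_function(func):
    """Build a new function object sharing func's code and globals."""
    clone = types.FunctionType(
        func.__code__,
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )
    # These attributes are not constructor arguments; copy them by hand.
    clone.__kwdefaults__ = func.__kwdefaults__
    clone.__qualname__ = func.__qualname__
    clone.__doc__ = func.__doc__
    clone.__dict__.update(func.__dict__)
    return clone


def try_subclass_function():
    """Return True if the running interpreter lets function be subclassed."""
    try:
        class CustomFunction(types.FunctionType):
            pass
    except TypeError:
        return False
    return True


def add(x, y=1):
    """Add two numbers."""
    return x + y


add_copy = copy_function(add)
```

In CPython releases without this PEP, try_subclass_function() is expected to return False, which is the restriction the proposal would lift.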

bound_method

The class bound_method is used for all bound methods, regardless of the
class of the underlying function. It adds one new attribute on top of
base_function: __func__ points to that function.

bound_method replaces the old method class which was used only for
Python functions bound as method.

There is a complication because we want to allow constructing a method
from an arbitrary callable. This may be an already-bound method or
simply not an instance of base_function. Therefore, in practice there
are two kinds of methods:

-   For arbitrary callables, we use a single fixed PyCFunctionDef
    structure with the METH_PASS_FUNCTION flag set.
-   For methods which bind instances of base_function (more precisely,
    which have the Py_TPFLAGS_BASEFUNCTION flag set) that have self
    slicing, we instead use the PyCFunctionDef from the original
    function. This way, we don't lose any performance when calling bound
    methods. In this case, the __func__ attribute is only used to
    implement various attributes but not for calling the method.
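
Constructing a method from an arbitrary callable is already possible with the existing types.MethodType, which is the behaviour that bound_method has to preserve. A brief sketch of that existing behaviour follows; the class Greeter is invented for the example.

```python
import types

# Bind an arbitrary callable (here the built-in len) to an object.
bound_len = types.MethodType(len, "abc")


class Greeter:
    def greet(self, other):
        return (self, other)


# Binding an already-bound method again is also accepted.
g = Greeter()
rebound = types.MethodType(g.greet, "extra")
```

Calling bound_len() passes "abc" to len, and calling rebound() passes "extra" as the argument other of g.greet.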

When constructing a new method from a base_function, we check that the
self object is an instance of __objclass__ (if a class was specified as
parent) and raise a TypeError otherwise.

The C structure is:

    PyTypeObject PyMethod_Type;

    typedef struct {
        PyBaseFunctionObject base;
        PyObject *im_func;  /* __func__: function implementing the method; readonly */
    } PyMethodObject;

Calling base_function instances

We specify the implementation of __call__ for instances of
base_function.

Checking __objclass__

First of all, a type check is done if the __parent__ of the function is
a class (recall that __objclass__ then becomes an alias of __parent__):
if m_self is NULL (this is the case for unbound methods of extension
types), then the function must be called with at least one positional
argument and the first (typically called self) must be an instance of
__objclass__. If not, a TypeError is raised.

Note that bound methods have m_self != NULL, so the __objclass__ is not
checked. Instead, the __objclass__ check is done when constructing the
method.
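
The same check exists today for method_descriptor, which this protocol is meant to stay compatible with. The sketch below exercises the current behaviour; call_raises_type_error and DictSubclass are hypothetical names introduced for the example.

```python
def call_raises_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


class DictSubclass(dict):
    pass


defining_class = dict.get.__objclass__

# self is an instance of __objclass__ (or of a subclass): accepted.
plain_ok = dict.get({"a": 1}, "a")
subclass_ok = dict.get(DictSubclass(a=2), "a")

# self has the wrong type, or is missing entirely: TypeError.
wrong_self = call_raises_type_error(dict.get, [], "a")
missing_self = call_raises_type_error(dict.get)
```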

Flags

For convenience, we define a new constant: METH_CALLFLAGS combines all
flags from PyCFunctionDef.ml_flags which specify the signature of the C
function to be called. It is equal to:

    METH_VARARGS | METH_FASTCALL | METH_NOARGS | METH_O | METH_KEYWORDS | METH_PASS_FUNCTION

Exactly one of the first four flags above must be set and only
METH_VARARGS and METH_FASTCALL may be combined with METH_KEYWORDS.
Violating these rules is undefined behaviour.
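
The two rules can be written down as a small checker. This is a pure Python sketch, not code from CPython: the values of the existing flags are those in methodobject.h as far as we know, the value of METH_PASS_FUNCTION is a placeholder because that flag was never added, and call_flags_are_valid is a hypothetical helper.

```python
METH_VARARGS = 0x0001
METH_KEYWORDS = 0x0002
METH_NOARGS = 0x0004
METH_O = 0x0008
METH_FASTCALL = 0x0080
METH_PASS_FUNCTION = 0x0100  # placeholder value, assumption for this sketch

METH_CALLFLAGS = (METH_VARARGS | METH_FASTCALL | METH_NOARGS | METH_O
                  | METH_KEYWORDS | METH_PASS_FUNCTION)

# The "first four flags": exactly one of them selects the calling convention.
CONVENTIONS = (METH_VARARGS, METH_FASTCALL, METH_NOARGS, METH_O)


def call_flags_are_valid(ml_flags):
    """Check the two rules stated above for a combination of flags."""
    flags = ml_flags & METH_CALLFLAGS
    chosen = [flag for flag in CONVENTIONS if flags & flag]
    if len(chosen) != 1:
        return False
    if flags & METH_KEYWORDS:
        # Keywords are only allowed with METH_VARARGS or METH_FASTCALL.
        return chosen[0] in (METH_VARARGS, METH_FASTCALL)
    return True
```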

There are two new flags which affect calling functions, namely
METH_PASS_FUNCTION and METH_CALL_UNBOUND. Some flags are already
documented in[4]. We explain the others below.

Self slicing

If the function has m_self == NULL and the flag METH_CALL_UNBOUND is not
set, then the first positional argument (if any) is removed from *args
and instead passed as first argument to the C function. Effectively, the
first positional argument is treated as __self__. This is meant to
support unbound methods such that the C function does not see the
difference between bound and unbound method calls. This does not affect
keyword arguments in any way.

This process is called self slicing and a function is said to have self
slicing if m_self == NULL and METH_CALL_UNBOUND is not set.

Note that a METH_NOARGS function which has self slicing effectively has
one argument, namely self. Analogously, a METH_O function with self
slicing has two arguments.
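
As an illustration, self slicing can be modelled in a few lines of Python. This is only a sketch under simplifying assumptions: None stands in for a NULL m_self, keyword arguments are ignored, and call_base_function and c_impl are hypothetical names.

```python
def call_base_function(c_func, m_self, args, call_unbound=False):
    """Model of the proposed call protocol, positional arguments only."""
    if m_self is None and not call_unbound:
        # Self slicing: the first positional argument (if any) becomes self.
        if args:
            m_self, args = args[0], args[1:]
    return c_func(m_self, args)


def c_impl(self, args):
    """Stand-in for a C function; reports what it received."""
    return (self, args)


sliced = call_base_function(c_impl, None, (1, 2, 3))
bound = call_base_function(c_impl, "bound", (1, 2, 3))
unsliced = call_base_function(c_impl, None, (1, 2, 3), call_unbound=True)

# Existing CPython already behaves this way for unbound methods of
# extension types: str.upper takes no arguments besides self.
unbound_call = str.upper("abc")
```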

METH_PASS_FUNCTION

If this flag is set, then the C function is called with an additional
first argument, namely the function itself (the base_function instance).
As a special case, if the function is a bound_method, then the underlying
function of the method is passed (but not recursively: if a bound_method
wraps a bound_method, then __func__ is only applied once).

For example, an ordinary METH_VARARGS function has signature
(PyObject *self, PyObject *args). With
METH_VARARGS | METH_PASS_FUNCTION, this becomes
(PyObject *func, PyObject *self, PyObject *args).

METH_FASTCALL

This is an existing but undocumented flag. We suggest officially
supporting and documenting it.

If the flag METH_FASTCALL is set without METH_KEYWORDS, then the ml_meth
field is of type PyCFunctionFast which takes the arguments
(PyObject *self, PyObject *const *args, Py_ssize_t nargs). Such a
function takes only positional arguments and they are passed as plain C
array args of length nargs.

If the flags METH_FASTCALL | METH_KEYWORDS are set, then the ml_meth
field is of type PyCFunctionFastKeywords which takes the arguments
(PyObject *self, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames).
The positional arguments are passed as C array args of length nargs. The
values of the keyword arguments follow in that array, starting at
position nargs. The keys (names) of the keyword arguments are passed as
a tuple in kwnames. As an example, assume that 3 positional and 2
keyword arguments are given. Then args is an array of length 3 + 2 = 5,
nargs equals 3 and kwnames is a 2-tuple.
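
The argument layout for METH_FASTCALL | METH_KEYWORDS can be modelled with ordinary Python containers. The sketch below reproduces the example of 3 positional and 2 keyword arguments; pack_fastcall and unpack_fastcall are hypothetical helpers, and a Python list stands in for the C array.

```python
def pack_fastcall(args, kwargs):
    """Flatten a call into (array, nargs, kwnames) as described above."""
    kwnames = tuple(kwargs)
    # Keyword values follow the positional arguments in the same array.
    array = list(args) + [kwargs[name] for name in kwnames]
    return array, len(args), kwnames


def unpack_fastcall(array, nargs, kwnames):
    """Recover positional and keyword arguments from the flat layout."""
    positional = tuple(array[:nargs])
    keywords = dict(zip(kwnames, array[nargs:]))
    return positional, keywords


array, nargs, kwnames = pack_fastcall((1, 2, 3), {"a": 10, "b": 20})
```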

Automatic creation of built-in functions

Python automatically generates instances of cfunction for extension
types (using the PyTypeObject.tp_methods field) and modules (using the
PyModuleDef.m_methods field). The arrays PyTypeObject.tp_methods and
PyModuleDef.m_methods must be arrays of PyMethodDef structures.

Unbound methods of extension types

The type of unbound methods changes from method_descriptor to cfunction.
The object which appears as unbound method is the same object which
appears in the class __dict__. Python automatically sets the __parent__
attribute to the defining class.

Built-in functions of a module

For the case of functions of a module, __parent__ will be set to the
module. Unless the flag METH_BINDING is given, also __self__ will be set
to the module (for backwards compatibility).

An important consequence is that such functions by default do not become
methods when used as attribute (base_function.__get__ only does that if
m_self was NULL). One could consider this a bug, but this was done for
backwards compatibility reasons: in an initial post on python-ideas[5]
the consensus was to keep this misfeature of built-in functions.

However, to allow this anyway for specific or newly implemented built-in
functions, the METH_BINDING flag prevents setting __self__.
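
The behaviour being preserved can be seen in current CPython: a Python function stored in a class binds as a method, while a built-in function does not. A short sketch, with the class Container invented for the example:

```python
class Container:
    def python_attr(*args):
        # Receives the instance as an extra first argument when bound.
        return len(args)

    # A built-in function stored as a class attribute.
    builtin_attr = len


c = Container()
python_result = c.python_attr("abc")    # called with (c, "abc")
builtin_result = c.builtin_attr("abc")  # called with ("abc",) only
```

With METH_BINDING, a built-in function would instead behave like python_attr in this situation.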

Further changes

New type flag

A new PyTypeObject flag (for tp_flags) is added: Py_TPFLAGS_BASEFUNCTION
to indicate that instances of this type are functions which can be
called and bound as method like a base_function.

This is different from flags like Py_TPFLAGS_LIST_SUBCLASS because it
indicates more than just a subclass: it also indicates a default
implementation of __call__ and __get__. In particular, such subclasses
of base_function must follow the implementation from the section Calling
base_function instances.

This flag is automatically set for extension types which inherit the
tp_call and tp_descr_get implementation from base_function. Extension
types can explicitly specify it if they override __call__ or __get__ in
a compatible way. The flag Py_TPFLAGS_BASEFUNCTION must never be set for
a heap type because that would not be safe (heap types can be changed
dynamically).

C API functions

We list some relevant Python/C API macros and functions. Some of these
are existing (possibly changed) functions, some are new:

-   int PyBaseFunction_CheckFast(PyObject *op): return true if op is an
    instance of a class with the Py_TPFLAGS_BASEFUNCTION set. This is
    the function that you need to use to determine whether it is
    meaningful to access the base_function internals.
-   int PyBaseFunction_Check(PyObject *op): return true if op is an
    instance of base_function.
-   PyObject *PyBaseFunction_New(PyTypeObject *cls, PyCFunctionDef *ml, PyObject *self, PyObject *module, PyObject *parent):
    create a new instance of cls (which must be a subclass of
    base_function) from the given data.
-   int PyCFunction_Check(PyObject *op): return true if op is an
    instance of cfunction.
-   PyObject *PyCFunction_NewEx(PyMethodDef *ml, PyObject *self, PyObject *module):
    create a new instance of cfunction. As a special case, if self is
    NULL, then set self = Py_None instead (for backwards compatibility).
    If self is a module, then __parent__ is set to self. Otherwise,
    __parent__ is NULL.
-   For many existing PyCFunction_... and PyMethod_... functions, we
    define a new function PyBaseFunction_... acting on base_function
    instances. The old functions are kept as aliases of the new functions.
-   int PyFunction_Check(PyObject *op): return true if op is an instance
    of base_function with the METH_PYTHON flag set (this is equivalent
    to checking whether op is an instance of function).
-   int PyFunction_CheckFast(PyObject *op): equivalent to
    PyFunction_Check(op) && PyBaseFunction_CheckFast(op).
-   int PyFunction_CheckExact(PyObject *op): return true if the type of
    op is function.
-   PyObject *PyFunction_NewPython(PyTypeObject *cls, PyObject *code, PyObject *globals, PyObject *name, PyObject *qualname):
    create a new instance of cls (which must be a subclass of function)
    from the given data.
-   PyObject *PyFunction_New(PyObject *code, PyObject *globals): create
    a new instance of function.
-   PyObject *PyFunction_NewWithQualName(PyObject *code, PyObject *globals, PyObject *qualname):
    create a new instance of function.
-   PyObject *PyFunction_Copy(PyTypeObject *cls, PyObject *func): create
    a new instance of cls (which must be a subclass of function) by
    copying a given function.

Changes to the types module

Two types are added: types.BaseFunctionType corresponding to
base_function and types.DefinedFunctionType corresponding to
defined_function.

Apart from that, no changes to the types module are made. In particular,
types.FunctionType refers to function. However, the actual types will
change: in particular, types.BuiltinFunctionType will no longer be the
same as types.BuiltinMethodType.
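
For reference, this is the situation in CPython without this PEP, where the two names are aliases of one class. A minimal sketch of the existing behaviour:

```python
import types

# Today both names refer to builtin_function_or_method.
same_builtin_types = types.BuiltinFunctionType is types.BuiltinMethodType

# A module-level built-in and a bound built-in method share that class.
function_and_bound_share_class = type(len) is type([].append)

# The unbound method of an extension type has a different class.
unbound_class_name = type(list.append).__name__
```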

Changes to the inspect module

The new function inspect.isbasefunction checks for an instance of
base_function.

inspect.isfunction checks for an instance of defined_function.

inspect.isbuiltin checks for an instance of cfunction.

inspect.isroutine checks isbasefunction or ismethoddescriptor.
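
As a baseline for these changes, the sketch below records what the existing predicates report in current CPython. The helper classify and the function py_func are hypothetical names for the illustration; inspect.isbasefunction is not used because it exists only in this proposal.

```python
import inspect


def classify(obj):
    """Collect the results of the existing inspect predicates."""
    return {
        "isfunction": inspect.isfunction(obj),
        "isbuiltin": inspect.isbuiltin(obj),
        "ismethoddescriptor": inspect.ismethoddescriptor(obj),
        "isroutine": inspect.isroutine(obj),
    }


def py_func():
    pass


current = {
    "py_func": classify(py_func),
    "len": classify(len),
    "dict.get": classify(dict.get),
}
```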

NOTE: bpo-33261[6] should be fixed first.

Profiling

Currently, sys.setprofile supports c_call, c_return and c_exception
events for built-in functions. These events are generated when calling
or returning from a built-in function. By contrast, the call and return
events are generated by the function itself. So nothing needs to change
for the call and return events.

Since we no longer make a difference between C functions and Python
functions, we need to prevent the c_* events for Python functions. This
is done by not generating those events if the METH_PYTHON flag in
ml_flags is set.
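
The two families of events can be observed with the existing sys.setprofile. The following sketch records them for one built-in call and one Python call; record_profile_events and the small test functions are hypothetical names, and any previously installed profiler is restored afterwards.

```python
import sys


def record_profile_events(thunk):
    """Run thunk() under a profiler and return (event, name) pairs."""
    events = []

    def profiler(frame, event, arg):
        if event.startswith("c_"):
            # For c_* events, arg is the C function object being called.
            events.append((event, getattr(arg, "__name__", None)))
        else:
            events.append((event, frame.f_code.co_name))

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        thunk()
    finally:
        sys.setprofile(previous)
    return events


def py_func():
    return None


def call_builtin():
    return len("abc")


def call_python():
    return py_func()


builtin_events = record_profile_events(call_builtin)
python_events = record_profile_events(call_python)
```

The built-in call shows up as c_call and c_return, the Python call as call and return; the METH_PYTHON check is what would keep this split intact once both kinds of function share a base class.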

Non-CPython implementations

Most of this PEP is only relevant to CPython. For other implementations
of Python, the two changes that are required are the base_function base
class and the fact that function can be subclassed. The classes
cfunction and defined_function are not required.

We require base_function for consistency but we put no requirements on
it: it is acceptable if this is just a copy of object. Support for the
new __parent__ (and __objclass__) attribute is not required. If there is
no defined_function class, then types.DefinedFunctionType should be an
alias of types.FunctionType.

Rationale

Why not simply change existing classes?

One could try to solve the problem by keeping the existing classes
without introducing a new base_function class.

That might look like a simpler solution but it is not: it would require
introspection support for 3 distinct classes: function,
builtin_function_or_method and method_descriptor. For the latter two
classes, "introspection support" would mean at a minimum allowing
subclassing. But we don't want to lose performance, so we want fast
subclass checks. This would require two new flags in tp_flags. And we
want subclasses to allow __get__ for built-in functions, so we should
implement the LOAD_METHOD opcode for built-in functions too. More
generally, a lot of functionality would need to be duplicated and the
end result would be far more complex code.

It is also not clear how the introspection of built-in function
subclasses would interact with __text_signature__. Having two
independent kinds of inspect.signature support on the same class sounds
like asking for problems.

And this would not fix some of the other differences between built-in
functions and Python functions that were mentioned in the motivation.

Why __text_signature__ is not a solution

Built-in functions have an attribute __text_signature__, which gives the
signature of the function as plain text. The default values are
evaluated by ast.literal_eval. Because of this, it supports only a small
number of standard Python classes and not arbitrary Python objects.

And even if __text_signature__ would allow arbitrary signatures somehow,
that is only one piece of introspection: it does not help with
inspect.getsourcefile for example.
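
The limitation can be illustrated with the standard library as it exists today. This is a small sketch: literal_eval_accepts is a hypothetical helper, and the exact text of len.__text_signature__ is not relied upon because it may differ between versions.

```python
import ast
import inspect

# The signature of a built-in is stored as plain text ...
text_signature = len.__text_signature__

# ... which inspect.signature() parses into parameters.
parameters = list(inspect.signature(len).parameters)


def literal_eval_accepts(source):
    """Return True if ast.literal_eval() can evaluate the expression."""
    try:
        ast.literal_eval(source)
    except (ValueError, SyntaxError):
        return False
    return True


# Literals of a few standard types work; arbitrary objects do not.
accepts_literals = literal_eval_accepts("(1, 'a', None)")
accepts_call = literal_eval_accepts("object()")
accepts_attribute = literal_eval_accepts("sys.maxsize")
```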

defined_function versus function

In many places, a decision needs to be made whether the old function
class should be replaced by defined_function or the new function class.
This is done by thinking of the most likely use case:

1.  types.FunctionType refers to function because that type might be
    used to construct instances using types.FunctionType(...).
2.  inspect.isfunction() refers to defined_function because this is the
    class where introspection is supported.
3.  The C API functions must refer to function because we do not specify
    how the various attributes of defined_function are implemented. We
    expect that this is not a problem since there is typically no reason
    for introspection to be done by C extensions.

Scope of this PEP: which classes are involved?

The main motivation of this PEP is fixing function classes, so we
certainly want to unify the existing classes builtin_function_or_method
and function.

Since built-in functions and methods have the same class, it seems
natural to include bound methods too. And since there are no "unbound
methods" for Python functions, it makes sense to get rid of unbound
methods for extension types.

For now, no changes are made to the classes staticmethod, classmethod
and classmethod_descriptor. It would certainly make sense to put these
in the base_function class hierarchy and unify classmethod and
classmethod_descriptor. However, this PEP is already big enough and this
is left as a possible future improvement.

Slot wrappers for extension types like __init__ or __eq__ are quite
different from normal methods. They are also typically not called
directly because you would normally write foo[i] instead of
foo.__getitem__(i). So these are left outside the scope of this PEP.

Python also has an instancemethod class, which seems to be a relic from
Python 2, where it was used for bound and unbound methods. It is not
clear whether there is still a use case for it. In any case, there is no
reason to deal with it in this PEP.

TODO: should instancemethod be deprecated? It doesn't seem used at all
within CPython 3.7, but maybe external packages use it?

Not treating METH_STATIC and METH_CLASS

Almost nothing in this PEP refers to the flags METH_STATIC and
METH_CLASS. These flags are checked only by the automatic creation of
built-in functions. When a staticmethod, classmethod or
classmethod_descriptor is bound (i.e. __get__ is called), a
base_function instance is created with m_self != NULL. For a
classmethod, this is obvious since m_self is the class that the method
is bound to. For a staticmethod, one can take an arbitrary Python object
for m_self. For backwards compatibility, we choose m_self = __parent__
for static methods of extension types.

__self__ in base_function

It may look strange at first sight to add the __self__ slot in
base_function as opposed to bound_method. We took this idea from the
existing builtin_function_or_method class. It allows us to have a single
general implementation of __call__ and __get__ for the various function
classes discussed in this PEP.

It also makes it easy to support existing built-in functions which set
__self__ to the module (for example, sys.exit.__self__ is sys).
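
The existing values of __self__ that this design keeps working can be checked directly. A minimal sketch of current CPython behaviour:

```python
import builtins
import sys

# Functions of a module: __self__ is the module.
exit_self = sys.exit.__self__
len_self = len.__self__

# Bound methods of extension types: __self__ is the instance.
bound_self = "abc".upper.__self__
```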

Two implementations of __doc__

base_function does not support function docstrings. Instead, the classes
cfunction and function each have their own way of dealing with
docstrings (and bound_method just takes the __doc__ from the wrapped
function).

For cfunction, the docstring is stored (together with the text
signature) as C string in the read-only ml_doc field of a PyMethodDef.
For function, the docstring is stored as a writable Python object and it
does not actually need to be a string. It looks hard to unify these two
very different ways of dealing with __doc__. For backwards
compatibility, we keep the existing implementations.

For defined_function, we require __doc__ to be implemented but we do not
specify how. A subclass can implement __doc__ the same way as cfunction
or using a struct member or some other way.

Subclassing

We disallow subclassing of cfunction and bound_method to enable fast
type checks for PyCFunction_Check and PyMethod_Check.

We allow subclassing of the other classes because there is no reason to
disallow it. For Python modules, the only relevant class to subclass is
function because the others cannot be instantiated anyway.

Replacing tp_call: METH_PASS_FUNCTION and METH_CALL_UNBOUND

The new flags METH_PASS_FUNCTION and METH_CALL_UNBOUND are meant to
support cases where formerly a custom tp_call was used. It reduces the
number of special fast paths in Python/ceval.c for calling objects:
instead of treating Python functions, built-in functions and method
descriptors separately, there would only be a single check.

The signature of tp_call is essentially the signature of
PyBaseFunctionObject.m_ml.ml_meth with flags
METH_VARARGS | METH_KEYWORDS | METH_PASS_FUNCTION | METH_CALL_UNBOUND
(the only difference is an added self argument). Therefore, it should be
easy to change existing tp_call slots to use the base_function
implementation instead.

It also makes sense to use METH_PASS_FUNCTION without METH_CALL_UNBOUND
in cases where the C function simply needs access to additional metadata
from the function, such as the __parent__. This is for example needed to
support PEP 573. Converting existing methods to use METH_PASS_FUNCTION
is trivial: it only requires adding an extra argument to the C function.

Backwards compatibility

While designing this PEP, great care was taken to not break backwards
compatibility too much. Most of the potentially incompatible changes are
changes to CPython implementation details which are different anyway in
other Python interpreters. In particular, Python code which correctly
runs on PyPy will very likely continue to work with this PEP.

The standard classes and functions like staticmethod, functools.partial
or operator.methodcaller do not need to change at all.

Changes to types and inspect

The proposed changes to types and inspect are meant to minimize changes
in behaviour. However, it is unavoidable that some things change and
this can cause code which uses types or inspect to break. In the Python
standard library for example, changes are needed in the doctest module
because of this.

Also, tools which take various kinds of functions as input will need to
deal with the new function hierarchy and the possibility of custom
function classes.

Python functions

For Python functions, essentially nothing changes. The attributes that
existed before still exist and Python functions can be initialized,
called and turned into methods as before.

The name function is kept for backwards compatibility. While it might
make sense to change the name to something more specific like
python_function, that would require a lot of annoying changes in
documentation and test suites.

Built-in functions of a module

Also for built-in functions, nothing changes. We keep the old behaviour
that such functions do not bind as methods. This is a consequence of the
fact that __self__ is set to the module.
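
The behaviour being preserved can be observed in current CPython. The
following is a minimal sketch; the class Container and its attribute
names are made up for the example.

```python
import builtins


class Container:
    size = len            # built-in function stored as a class attribute

    def py_size(self):    # Python function stored as a class attribute
        return 0


c = Container()

# Looking up a built-in function on an instance returns the function itself.
builtin_is_unbound = c.size is len

# A Python function looked up the same way is bound to the instance.
python_is_bound = c.py_size.__self__ is c

# __self__ of a built-in function is the module that defines it.
owner = len.__self__
```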

Built-in bound and unbound methods

The types of built-in bound and unbound methods will change. However,
this does not affect calling such methods because the protocol in
base_function.__call__ (in particular the handling of __objclass__ and
self slicing) was specifically designed to be backwards compatible. All
attributes which existed before (like __objclass__ and __self__) still
exist.
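
For reference, this is the calling behaviour and the attributes that
are meant to stay the same, shown for current CPython. The variable
names are illustrative only.

```python
d = {"foo": 42}

unbound = dict.get        # built-in unbound method
bound = d.get             # built-in bound method

# Calling the unbound method requires passing self explicitly.
via_unbound = unbound(d, "foo")
via_bound = bound("foo")

objclass = unbound.__objclass__   # class that defines the method
receiver = bound.__self__         # object the method is bound to

# The __objclass__ check rejects a self argument of the wrong type.
try:
    unbound([], "foo")
except TypeError:
    rejected_wrong_self = True
else:
    rejected_wrong_self = False
```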

New attributes

Some objects get new special double-underscore attributes. For example,
the new attribute __parent__ appears on all built-in functions and all
methods get a __func__ attribute. The fact that __self__ is now a
special read-only attribute for Python functions caused trouble in [7].
In general, though, we expect little to break.
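
The kind of code at risk is code that infers something from the mere
presence of an attribute. The sketch below records what CPython reports
without this PEP; looks_bound is a hypothetical helper written only to
show the fragile pattern.

```python
def plain(x):
    return x


class Holder:
    def method(self):
        return self


h = Holder()

# Behaviour of CPython without this PEP:
function_has_self = hasattr(plain, "__self__")            # no __self__ today
method_has_func = hasattr(h.method, "__func__")           # Python bound method
builtin_method_has_func = hasattr([].append, "__func__")  # built-in bound method


def looks_bound(obj):
    """Fragile hypothetical check: any object with __self__ counts as bound."""
    return hasattr(obj, "__self__")


# Already misleading today: len has __self__ (its module) but is not a method.
len_looks_bound = looks_bound(len)
```

With __self__ added to Python functions, looks_bound(plain) would also
become true, which is why attribute-presence checks are less robust
than checks on the class of the object.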

method_descriptor and PyDescr_NewMethod

The class method_descriptor and the constructor PyDescr_NewMethod should
be deprecated. They are no longer used by CPython itself but are still
supported.

Two-phase Implementation

TODO: this section is optional. If this PEP is accepted, it should be
decided whether to apply this two-phase implementation or not.

As mentioned above, the changes to types and inspect can break some
existing code. In order to further minimize breakage, this PEP could be
implemented in two phases.

Phase one: keep existing classes but add base classes

Initially, implement the base_function class and use it as a common
base class, keeping the existing classes (though not their
implementations).

In this proposal, the class hierarchy would become:

                                  object
                                     |
                                     |
                              base_function
                             /       |     \
                            /        |      \
                           /         |       \
                  cfunction          |     defined_function
                   |     |           |         \
                   |     |      bound_method    \
                   |     |                       \
                   |  method_descriptor       function
                   |
        builtin_function_or_method

The leaf classes builtin_function_or_method, method_descriptor,
bound_method and function correspond to the existing classes (with
method renamed to bound_method).
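
The phase-one hierarchy in the diagram above can be modelled with empty
Python classes to check which subclass relations would hold. These are
stand-ins that only reuse the names; they are not the real CPython
types.

```python
# Empty stand-in classes; the names mirror the diagram, not real CPython types.
class base_function: pass
class cfunction(base_function): pass
class defined_function(base_function): pass
class bound_method(base_function): pass
class method_descriptor(cfunction): pass
class builtin_function_or_method(cfunction): pass
class function(defined_function): pass


leaves = [builtin_function_or_method, method_descriptor, bound_method, function]

# Every existing class would gain base_function as a common ancestor.
all_share_base = all(issubclass(cls, base_function) for cls in leaves)

# In this phase a built-in bound method is a cfunction, not a bound_method.
builtin_is_bound_method = issubclass(builtin_function_or_method, bound_method)
```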

Functions created automatically in modules become instances of
builtin_function_or_method. Unbound methods of extension types become
instances of method_descriptor.

The class method_descriptor is a copy of cfunction except that __get__
returns a builtin_function_or_method instead of a bound_method.

The class builtin_function_or_method has the same C structure as a
bound_method, but it inherits from cfunction. The __func__ attribute is
not mandatory: it is only defined when binding a method_descriptor.

We keep the implementation of the inspect functions as they are. Because
of this and because the existing classes are kept, backwards
compatibility is ensured for code doing type checks.

Since showing an actual DeprecationWarning would affect a lot of
correctly-functioning code, any deprecations would only appear in the
documentation. Another reason is that it is hard to show warnings for
calling isinstance(x, t) (but it could be done using __instancecheck__
hacking) and impossible for type(x) is t.
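
To make the difference concrete, here is a minimal sketch of the
__instancecheck__ approach. The metaclass WarnOnInstanceCheck and the
classes legacy_type and derived are hypothetical names for this example
only. CPython skips the hook when type(x) is exactly the class being
tested, so the sketch checks an instance of a subclass.

```python
import warnings


class WarnOnInstanceCheck(type):
    """Hypothetical metaclass that warns whenever isinstance() consults it."""

    def __instancecheck__(cls, obj):
        warnings.warn(
            f"isinstance() checks against {cls.__name__} are deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        return super().__instancecheck__(obj)


class legacy_type(metaclass=WarnOnInstanceCheck):
    """Stand-in for a class whose use in type checks is being discouraged."""


class derived(legacy_type):
    pass


x = derived()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    via_isinstance = isinstance(x, legacy_type)   # goes through the hook
    via_identity = type(x) is legacy_type         # identity test, no hook runs
```

Only the isinstance call produces a warning; the identity comparison
never reaches any code that could emit one.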

Phase two

Phase two is what is actually described in the rest of this PEP. In
terms of implementation, it would be a relatively small change compared
to phase one.

Reference Implementation

Most of this PEP has been implemented for CPython at
https://github.com/jdemeyer/cpython/tree/pep575

There are four steps, corresponding to the commits on that branch. After
each step, CPython is in a mostly working state.

1.  Add the base_function class and make it a base class of cfunction.
    This is by far the biggest step as the complete __call__ protocol is
    implemented in this step.
2.  Rename method to bound_method and make it a subclass of
    base_function. Change unbound methods of extension types to be
    instances of cfunction such that bound methods of extension types
    are also instances of bound_method.
3.  Implement defined_function and function.
4.  Changes to other parts of Python, such as the standard library and
    testsuite.

Appendix: current situation

NOTE: This section is more useful during the draft period of the PEP, so
feel free to remove this once the PEP has been accepted.

For reference, we describe in detail the relevant existing classes in
CPython 3.7.

Each of the classes involved is an "orphan" class: it has no
non-trivial subclasses or superclasses.

builtin_function_or_method: built-in functions and bound methods

These are of type PyCFunction_Type with structure PyCFunctionObject:

    typedef struct {
        PyObject_HEAD
        PyMethodDef *m_ml; /* Description of the C function to call */
        PyObject    *m_self; /* Passed as 'self' arg to the C func, can be NULL */
        PyObject    *m_module; /* The __module__ attribute, can be anything */
        PyObject    *m_weakreflist; /* List of weak references */
    } PyCFunctionObject;

    struct PyMethodDef {
        const char  *ml_name;   /* The name of the built-in function/method */
        PyCFunction ml_meth;    /* The C function that implements it */
        int         ml_flags;   /* Combination of METH_xxx flags, which mostly
                                   describe the args expected by the C func */
        const char  *ml_doc;    /* The __doc__ attribute, or NULL */
    };

where PyCFunction is a C function pointer (there are various forms of
this, the most basic takes two arguments for self and *args).

This class is used both for functions and bound methods: for a method,
the m_self slot points to the object:

    >>> dict(foo=42).get
    <built-in method get of dict object at 0x...>
    >>> dict(foo=42).get.__self__
    {'foo': 42}

In some cases, a function is considered a "method" of the module
defining it:

    >>> import os
    >>> os.kill
    <built-in function kill>
    >>> os.kill.__self__
    <module 'posix' (built-in)>
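
That a single class serves both roles can also be checked through the
types module, as in this short sketch for current CPython:

```python
import types

# A built-in function and a built-in bound method share one class.
same_class = type(len) is type({}.get)
class_name = type(len).__name__

# The types module exposes that class under two names.
aliases_match = types.BuiltinFunctionType is types.BuiltinMethodType
```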

method_descriptor: built-in unbound methods

These are of type PyMethodDescr_Type with structure PyMethodDescrObject:

    typedef struct {
        PyDescrObject d_common;
        PyMethodDef *d_method;
    } PyMethodDescrObject;

    typedef struct {
        PyObject_HEAD
        PyTypeObject *d_type;
        PyObject *d_name;
        PyObject *d_qualname;
    } PyDescrObject;
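
From Python, the fields of PyDescrObject are visible as attributes of
the descriptor, and binding goes through __get__. A small sketch for
current CPython (3.7 or later, for types.MethodDescriptorType):

```python
import types

descr = dict.get
d = {"foo": 42}

# Explicit descriptor binding, equivalent to the attribute lookup d.get.
bound = descr.__get__(d, dict)

is_descriptor = type(descr) is types.MethodDescriptorType
binds_to_builtin = type(bound) is types.BuiltinMethodType

# d_name, d_qualname and d_type as seen from Python.
name_and_owner = (descr.__name__, descr.__qualname__, descr.__objclass__)
```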

function: Python functions

These are of type PyFunction_Type with structure PyFunctionObject:

    typedef struct {
        PyObject_HEAD
        PyObject *func_code;        /* A code object, the __code__ attribute */
        PyObject *func_globals;     /* A dictionary (other mappings won't do) */
        PyObject *func_defaults;    /* NULL or a tuple */
        PyObject *func_kwdefaults;  /* NULL or a dict */
        PyObject *func_closure;     /* NULL or a tuple of cell objects */
        PyObject *func_doc;         /* The __doc__ attribute, can be anything */
        PyObject *func_name;        /* The __name__ attribute, a string object */
        PyObject *func_dict;        /* The __dict__ attribute, a dict or NULL */
        PyObject *func_weakreflist; /* List of weak references */
        PyObject *func_module;      /* The __module__ attribute, can be anything */
        PyObject *func_annotations; /* Annotations, a dict or NULL */
        PyObject *func_qualname;    /* The qualified name */

        /* Invariant:
         *     func_closure contains the bindings for func_code->co_freevars, so
         *     PyTuple_Size(func_closure) == PyCode_GetNumFree(func_code)
         *     (func_closure may be NULL if PyCode_GetNumFree(func_code) == 0).
         */
    } PyFunctionObject;
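
The invariant in the comment above can be observed from Python through
__closure__ and __code__.co_freevars. The function make_adder is an
arbitrary example chosen for this sketch.

```python
def make_adder(n):
    def add(x):
        return x + n      # n is a free variable of add
    return add


add3 = make_adder(3)

code = add3.__code__                  # func_code
freevars = code.co_freevars           # names of the free variables
closure = add3.__closure__            # func_closure, a tuple of cells
captured = closure[0].cell_contents   # value bound to the first free variable

# A function without free variables has no closure at all.
no_closure = make_adder.__closure__
```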

In Python 3, there is no "unbound method" class: an unbound method is
just a plain function.
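
This is easy to confirm in current CPython; the class Greeter below is
only an example.

```python
import types


class Greeter:
    def greet(self):
        return "hello"


unbound = Greeter.greet

# Accessing the method on the class yields the function object unchanged.
is_plain_function = type(unbound) is types.FunctionType
same_object = Greeter.__dict__["greet"] is unbound

# It is called like any function, with self passed explicitly.
result = unbound(Greeter())
```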

method: Python bound methods

These are of type PyMethod_Type with structure PyMethodObject:

    typedef struct {
        PyObject_HEAD
        PyObject *im_func;   /* The callable object implementing the method */
        PyObject *im_self;   /* The instance it is bound to */
        PyObject *im_weakreflist; /* List of weak references */
    } PyMethodObject;
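
The im_func and im_self fields are exposed to Python as __func__ and
__self__. The sketch below, using a made-up Counter class, also builds
a bound method by hand with types.MethodType.

```python
import types


class Counter:
    def __init__(self):
        self.value = 0

    def bump(self):
        self.value += 1
        return self.value


c = Counter()
m = c.bump

is_method = type(m) is types.MethodType
wraps_function = m.__func__ is Counter.bump   # im_func
bound_to = m.__self__ is c                    # im_self

# A bound method can also be constructed directly from a function and an object.
manual = types.MethodType(Counter.bump, c)
first_call = manual()
```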

References

Copyright

This document has been placed in the public domain.




[1] Cython (http://cython.org/)

[2] Python bug 30071, Duck-typing inspect.isfunction()
(https://bugs.python.org/issue30071)

[3] Cython (http://cython.org/)

[4] PyMethodDef documentation
(https://docs.python.org/3.7/c-api/structures.html#c.PyMethodDef)

[5] PEP proposal: unifying function/method classes
(https://mail.python.org/pipermail/python-ideas/2018-March/049398.html)

[6] Python bug 33261, inspect.isgeneratorfunction fails on hand-created
methods (https://bugs.python.org/issue33261 and
https://github.com/python/cpython/pull/6448)

[7] Python bug 33265, contextlib.ExitStack abuses __self__
(https://bugs.python.org/issue33265 and
https://github.com/python/cpython/pull/6456)