PEP: 573 Title: Module State Access from C Extension Methods Version:
$Revision$ Last-Modified: $Date$ Author: Petr Viktorin
<encukou@gmail.com>, Alyssa Coghlan <ncoghlan@gmail.com>, Eric Snow
<ericsnowcurrently@gmail.com>, Marcel Plch <gmarcel.plch@gmail.com>
BDFL-Delegate: Stefan Behnel Discussions-To: import-sig@python.org
Status: Final Type: Standards Track Content-Type: text/x-rst Created:
02-Jun-2016 Python-Version: 3.9 Post-History:

Abstract

This PEP proposes to add a way for CPython extension methods to access
context, such as the state of the modules they are defined in.

This will allow extension methods to use direct pointer dereferences
rather than PyState_FindModule for looking up module state, reducing or
eliminating the performance cost of using module-scoped state over
process global state.

This fixes one of the remaining roadblocks for adoption of PEP 3121
(Extension module initialization and finalization) and PEP 489
(Multi-phase extension module initialization).

While this PEP takes an additional step towards fully solving the
problems that PEP 3121 and PEP 489 started tackling, it does not attempt
to resolve all remaining concerns. In particular, access to the module
state from slot methods (nb_add, etc) is not solved.

Terminology

Process-Global State

C-level static variables. Since this is very low-level memory storage,
it must be managed carefully.

Per-module State

State local to a module object, allocated dynamically as part of a
module object's initialization. This isolates the state from other
instances of the module (including those in other subinterpreters).

Accessed by PyModule_GetState().

Static Type

A type object defined as a C-level static variable, i.e. a compiled-in
type object.

A static type needs to be shared between module instances and has no
information of what module it belongs to. Static types do not have
__dict__ (although their instances might).

Heap Type

A type object created at run time.

Defining Class

The defining class of a method (either bound or unbound) is the class on
which the method was defined. A class that merely inherits the method
from its base is not the defining class.

For example, int is the defining class of True.to_bytes, True.__floor__
and int.__repr__.

In C, the defining class is the one defined with the corresponding
tp_methods or "tp slots"[1] entry. For methods defined in Python, the
defining class is saved in the __class__ closure cell.

C-API

The "Python/C API" as described in Python documentation. CPython
implements the C-API, but other implementations exist.

Rationale

PEP 489 introduced a new way to initialize extension modules, which
brings several advantages to extensions that implement it:

-   The extension modules behave more like their Python counterparts.
-   The extension modules can easily support loading into pre-existing
    module objects, which paves the way for extension module support for
    runpy or for systems that enable extension module reloading.
-   Loading multiple modules from the same extension is possible, which
    makes it possible to test module isolation (a key feature for proper
    sub-interpreter support) from a single interpreter.

The biggest hurdle for adoption of PEP 489 is allowing access to module
state from methods of extension types. Currently, the way to access this
state from extension methods is by looking up the module via
PyState_FindModule (in contrast to module level functions in extension
modules, which receive a module reference as an argument). However,
PyState_FindModule queries the thread-local state, making it relatively
costly compared to C level process global access and consequently
deterring module authors from using it.

Also, PyState_FindModule relies on the assumption that in each
subinterpreter, there is at most one module corresponding to a given
PyModuleDef. This assumption does not hold for modules that use PEP
489's multi-phase initialization, so PyState_FindModule is unavailable
for these modules.

A faster, safer way of accessing module-level state from extension
methods is needed.

Background

The implementation of a Python method may need access to one or more of
the following pieces of information:

-   The instance it is called on (self)
-   The underlying function
-   The defining class, i. e. the class the method was defined in
-   The corresponding module
-   The module state

In Python code, the Python-level equivalents may be retrieved as:

    import sys

    class Foo:
        def meth(self):
            instance = self
            module_globals = globals()
            module_object = sys.modules[__name__]  # (1)
            underlying_function = Foo.meth         # (1)
            defining_class = Foo                   # (1)
            defining_class = __class__             # (2)

Note

The defining class is not type(self), since type(self) might be a
subclass of Foo.

The statements marked (1) implicitly rely on name-based lookup via the
function's __globals__: either the Foo attribute to access the defining
class and Python function object, or __name__ to find the module object
in sys.modules.

In Python code, this is feasible, as __globals__ is set appropriately
when the function definition is executed, and even if the namespace has
been manipulated to return a different object, at worst an exception
will be raised.

The __class__ closure, (2), is a safer way to get the defining class,
but it still relies on __closure__ being set appropriately.

By contrast, extension methods are typically implemented as normal C
functions. This means that they only have access to their arguments and
C level thread-local and process-global states. Traditionally, many
extension modules have stored their shared state in C-level process
globals, causing problems when:

-   running multiple initialize/finalize cycles in the same process
-   reloading modules (e.g. to test conditional imports)
-   loading extension modules in subinterpreters

PEP 3121 attempted to resolve this by offering the PyState_FindModule
API, but this still has significant problems when it comes to extension
methods (rather than module level functions):

-   it is markedly slower than directly accessing C-level process-global
    state
-   there is still some inherent reliance on process global state that
    means it still doesn't reliably handle module reloading

It's also the case that when looking up a C-level struct such as module
state, supplying an unexpected object layout can crash the interpreter,
so it's significantly more important to ensure that extension methods
receive the kind of object they expect.

Proposal

Currently, a bound extension method (PyCFunction or
PyCFunctionWithKeywords) receives only self, and (if applicable) the
supplied positional and keyword arguments.

While module-level extension functions already receive access to the
defining module object via their self argument, methods of extension
types don't have that luxury: they receive the bound instance via self,
and hence have no direct access to the defining class or the module
level state.

The additional module level context described above can be made
available with two changes. Both additions are optional; extension
authors need to opt in to start using them:

-   Add a pointer to the module to heap type objects.

-   Pass the defining class to the underlying C function.

    In CPython, the defining class is readily available at the time the
    built-in method object (PyCFunctionObject) is created, so it can be
    stored in a new struct that extends PyCFunctionObject.

The module state can then be retrieved from the module object via
PyModule_GetState.

Note that this proposal implies that any type whose methods need to
access per-module state must be a heap type, rather than a static type.
This is necessary to support loading multiple module objects from a
single extension: a static type, as a C-level global, has no information
about which module object it belongs to.

Slot methods

The above changes don't cover slot methods, such as tp_iter or nb_add.

The problem with slot methods is that their C API is fixed, so we can't
simply add a new argument to pass in the defining class. Two possible
solutions have been proposed to this problem:

-   Look up the class through walking the MRO. This is potentially
    expensive, but will be usable if performance is not a problem (such
    as when raising a module-level exception).
-   Storing a pointer to the defining class of each slot in a separate
    table, __typeslots__[2]. This is technically feasible and fast, but
    quite invasive.

Modules affected by this concern also have the option of using
thread-local state or PEP 567 context variables as a caching mechanism,
or else defining their own reload-friendly lookup caching scheme.

Solving the issue generally is deferred to a future PEP.

Specification

Adding module references to heap types

A new factory method will be added to the C-API for creating modules:

    PyObject* PyType_FromModuleAndSpec(PyObject *module,
                                       PyType_Spec *spec,
                                       PyObject *bases)

This acts the same as PyType_FromSpecWithBases, and additionally
associates the provided module object with the new type. (In CPython,
this will set ht_module described below.)

Additionally, an accessor, PyObject * PyType_GetModule(PyTypeObject *)
will be provided. It will return the type's associated module if one is
set, otherwise it will set TypeError and return NULL. When given a
static type, it will always set TypeError and return NULL.

To implement this in CPython, the PyHeapTypeObject struct will get a new
member, PyObject *ht_module, that will store a pointer to the associated
module. It will be NULL by default and should not be modified after the
type object is created.

The ht_module member will not be inherited by subclasses; it needs to be
set using PyType_FromSpecWithBases for each individual type that needs
it.

Usually, creating a class with ht_module set will create a reference
cycle involving the class and the module. This is not a problem, as
tearing down modules is not a performance-sensitive operation, and
module-level functions typically also create reference cycles. The
existing "set all module globals to None" code that breaks function
cycles through f_globals will also break the new cycles through
ht_module.

Passing the defining class to extension methods

A new signature flag, METH_METHOD, will be added for use in
PyMethodDef.ml_flags. Conceptually, it adds defining_class to the
function signature. To make the initial implementation easier, the flag
can only be used as (METH_FASTCALL | METH_KEYWORDS | METH_METHOD). (It
can't be used with other flags like METH_O or bare METH_FASTCALL, though
it may be combined with METH_CLASS or METH_STATIC).

C functions for methods defined using this flag combination will be
called using a new C signature called PyCMethod:

    PyObject *PyCMethod(PyObject *self,
                        PyTypeObject *defining_class,
                        PyObject *const *args,
                        size_t nargsf,
                        PyObject *kwnames)

Additional combinations like (METH_VARARGS | METH_METHOD) may be added
in the future (or even in the initial implementation of this PEP).
However, METH_METHOD should always be an additional flag, i.e., the
defining class should only be passed in if needed.

In CPython, a new structure extending PyCFunctionObject will be added to
hold the extra information:

    typedef struct {
        PyCFunctionObject func;
        PyTypeObject *mm_class; /* Passed as 'defining_class' arg to the C func */
    } PyCMethodObject;

The PyCFunction implementation will pass mm_class into a PyCMethod C
function when it finds the METH_METHOD flag being set. A new macro
PyCFunction_GET_CLASS(cls) will be added for easier access to mm_class.

C methods may continue to use the other METH_* signatures if they do not
require access to their defining class/module. If METH_METHOD is not
set, casting to PyCMethodObject is invalid.

Argument Clinic

To support passing the defining class to methods using Argument Clinic,
a new converter called defining_class will be added to CPython's
Argument Clinic tool.

Each method may only have one argument using this converter, and it must
appear after self, or, if self is not used, as the first argument. The
argument will be of type PyTypeObject *.

When used, Argument Clinic will select
METH_FASTCALL | METH_KEYWORDS | METH_METHOD as the calling convention.
The argument will not appear in __text_signature__.

The new converter will initially not be compatible with __init__ and
__new__ methods, which cannot use the METH_METHOD convention.

Helpers

Getting to per-module state from a heap type is a very common task. To
make this easier, a helper will be added:

    void *PyType_GetModuleState(PyObject *type)

This function takes a heap type and on success, it returns pointer to
the state of the module that the heap type belongs to.

On failure, two scenarios may occur. When a non-type object, or a type
without a module is passed in, TypeError is set and NULL returned. If
the module is found, the pointer to the state, which may be NULL, is
returned without setting any exception.

Modules Converted in the Initial Implementation

To validate the approach, the _elementtree module will be modified
during the initial implementation.

Summary of API Changes and Additions

The following will be added to Python C-API:

  -   PyType_FromModuleAndSpec function
  -   PyType_GetModule function
  -   PyType_GetModuleState function
  -   METH_METHOD call flag
  -   PyCMethod function signature

The following additions will be added as CPython implementation details,
and won't be documented:

  -   PyCFunction_GET_CLASS macro
  -   PyCMethodObject struct
  -   ht_module member of _heaptypeobject
  -   defining_class converter in Argument Clinic

Backwards Compatibility

One new pointer is added to all heap types. All other changes are adding
new functions and structures, or changes to private implementation
details.

Implementation

An initial implementation is available in a Github repository[3]; a
patchset is at[4].

Possible Future Extensions

Slot methods

A way of passing defining class (or module state) to slot methods may be
added in the future.

A previous version of this PEP proposed a helper function that would
determine a defining class by searching the MRO for a class that defines
a slot to a particular function. However, this approach would fail if a
class is mutated (which is, for heap types, possible from Python code).
Solving this problem is left to future discussions.

Easy creation of types with module references

It would be possible to add a PEP 489 execution slot type to make
creating heap types significantly easier than calling
PyType_FromModuleAndSpec. This is left to a future PEP.

It may be good to add a good way to create static exception types from
the limited API. Such exception types could be shared between
subinterpreters, but instantiated without needing specific module state.
This is also left to possible future discussions.

Optimization

As proposed here, methods defined with the METH_METHOD flag only support
one specific signature.

If it turns out that other signatures are needed for performance
reasons, they may be added.

References

Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

[1] https://docs.python.org/3/c-api/typeobj.html#tp-slots

[2] [Import-SIG] On singleton modules, heap types, and subinterpreters
(https://mail.python.org/pipermail/import-sig/2015-July/001035.html)

[3] https://github.com/Dormouse759/cpython/tree/pep-c-rebase_newer

[4] https://github.com/Dormouse759/cpython/compare/master...Dormouse759:pep-c-rebase_newer