Following system colour scheme Selected dark colour scheme Selected light colour scheme

Python Enhancement Proposals

PEP 793 – PyModExport: A new entry point for C extension modules

Author:
Petr Viktorin <encukou at gmail.com>
Discussions-To:
Discourse thread
Status:
Draft
Type:
Standards Track
Created:
23-May-2025
Python-Version:
3.15
Post-History:
14-Mar-2025, 27-May-2025

Table of Contents

Abstract

In this PEP, we propose a new entry point for C extension modules, by which one can define a module using an array of PyModuleDef_Slot structures without an enclosing PyModuleDef structure. This allows extension authors to avoid using a statically allocated PyObject, lifting the most common obstacle to making one compiled library file usable with both regular and free-threaded builds of CPython.

To make this viable, we also specify new module slot types to replace PyModuleDef’s fields, and to allow adding a token similar to the Py_tp_token used for type objects.

We also add an API for defining modules from slots dynamically.

The existing API (PyInit_*) is soft-deprecated. (That is: it will continue to work without warnings, and it’ll be fully documented and supported, but we plan to not add any new features to it.)

Background & Motivation

The memory layout of Python objects differs between regular and free-threading builds. So, an ABI that supports both regular and free-threading builds cannot include the current PyObject memory layout. To stay compatible with existing ABI (and API), it cannot support statically allocated Python objects.

There is one type of object that is needed in most extension modules and is allocated statically in virtually all cases: the PyModuleDef returned from the module export hooks (that is, PyInit_* functions).

Module export hooks (PyInit_* functions) can return two kinds of objects:

  1. A fully initialized module object (for so-called single-phase initialization). This was the only option in 3.4 and below. Modules created this way have surprising (but backwards-compatible) behaviour around multiple interpreters or repeated loading. (Specifically, the contents of such a module’s __dict__ are shared across all instances of the module object.)

    The module returned is typically created using the PyModule_Create function, which requires a statically allocated (or at least long-lived) PyModuleDef struct.

    It is possible to bypass this using the lower-level PyModule_New* API. This avoids the need for PyModuleDef, but offers much less functionality.

  2. A PyModuleDef object containing a description of how to create a module object. This option, multi-phase initialization, was introduced in PEP 489; see its motivation for why it exists.

The interpreter cannot distinguish between these cases before the export hook is called.

The interpreter switch

Python 3.12 added a way for modules to mark whether they may be loaded in a subinterpreter: the Py_mod_multiple_interpreters slot. Setting it to the “not supported” value signals that an extension can only be loaded in the main interpreter.

Unfortunately, Python can only get this information by calling the module export hook. For single-phase modules, that creates the module object and runs arbitrary initialization code. For modules that set Py_mod_multiple_interpreters to “not supported”, this initialization needs to happen in the main interpreter.

To make this work, if a new module is loaded in a sub-interpreter, Python temporarily switches to the main interpreter, calls the export hook there, and then either switches back and redoes the import, or fails.

This unnecessary and fragile extra work highlights the underlying design issue: Python has no way to get information about an extension before the extension can, potentially, fully initialize itself.

Rationale

For avoiding the module export hook requiring a statically allocated PyObject*, two options come to mind:

  • Returning a dynamically allocated object, whose ownership is transferred to the interpreter. This stucture could be very similar to the existing PyModuleDef, since it needs to contain the same data. Unlike the existing PyModuleDef, this one would need to be reference-counted so that it both outlives “its” module and does not leak.
  • Adding a new export hook, which does not return a PyObject*.

    This was considered already for Python 3.5 in PEP 489, but rejected:

    Keeping only the PyInit hook name, even if it’s not entirely appropriate for exporting a definition, yielded a much simpler solution.

    Alas, after a decade of fixing the implications of this choice, the solution is no longer simple.

A new hook will also allow Python to avoid the second issue mentioned in Motivation – the interpreter switch. Effectivelly, it will add a new phase to multi-phase initialization, in which Python can check whether the module is compatible.

Using slots without a wrapper struct

The existing PyModuleDef is a struct with some fixed fields and a “slots” array. Unlike slots, the fixed fields cannot be individually deprecated and replaced. This proposal does away with fixed fields and proposes using a slots array directly, without a wrapper struct.

The PyModuleDef_Slot struct does have some downsides compared to fixed fields. We believe these are fixable, but leave that out of scope of this PEP (see “Improving slots in general” in the Possible Future Directions section).

Tokens

A static PyModuleDef has another purpose besides describing how a module should be created. As a statically allocated singleton that remains attached to the module object, it allows extension authors to check whether a given Python module is “theirs”: if a module object has a known PyModuleDef, its module state will have a known memory layout.

An analogous issue was solved for types by adding Py_tp_token. This proposal adds the same mechanism to modules.

Unlike types, the import mechanism often has a pointer that’s known to be suitable as a token value; in these cases it can provide a default token. Thus, module tokens do not need a variant of the inelegant Py_TP_USE_SPEC.

To help extensions that straddle Python versions, PyModuleDef addresses are used as default tokens, and where it’s reasonable, they are made interchangeable with tokens.

Soft-deprecating the existing export hook

The only reason for authors of existing extensions to switch to the API proposed here is that it allows a single module for both free-threaded and non-free-threaded builds. It is important that Python allows that, but for many existing modules, it is nowhere near worth losing compatibility with 3.14 and lower versions.

It is much too early to plan deprecation of the old API.

Instead, this PEP proposes to stop adding new features to the PyInit_* scheme. After all, the perfect time for extension authors to switch is when they want to modify module initialization anyway.

Specification

The export hook

When importing an extension module, Python will now first look for an export hook like this:

PyModuleDef_Slot *PyModExport_<NAME>(PyObject *spec);

where <NAME> is the name of the module. For non-ASCII names, it will instead look for PyModExportU_<NAME>, with <NAME> encoded as for existing PyInitU_* hooks (that is, punycode-encoded with hyphens replaced by underscores).

If not found, the import will continue as in previous Python versions (that is, by looking up a PyInit_* or PyInitU_* function).

If found, Python will call the hook with the appropriate importlib.machinery.ModuleSpec object as spec. To support duck-typing, extensions should not type-check this object, and if possible, implement fallbacks for any missing attributes. (The argument is mainly meant for introspection, testing, or use with specialized loaders.)

On failure, the export hook must return NULL with an exception set. This will cause the import to fail. (Python will not fall back to PyInit_* on error.)

On success, the hook must return a pointer to an array of PyModuleDef_Slot structs. Python will then create a module based on the given slots by calling functions proposed below: PyModule_FromSlotsAndSpec and PyModule_Exec. See their description for requirements on the slots array.

The returned array and all data it points to (recursively) must remain valid and constant until runtime shutdown. (We expect functions to export a static constant, or one of several constants chosen depending on, for example, Py_Version. Dynamic behaviour should generally happen in the Py_mod_create and Py_mod_exec functions.)

Dynamic creation

A new function will be added to create a module from an array of slots:

PyObject *PyModule_FromSlotsAndSpec(PyModuleDef_Slot *slots, PyObject *spec)

The slots argument must point to an array of PyModuleDef_Slot structures, terminated by a slot with slot=0 (typically written as {0} in C). There are no required slots, though slots must not be NULL. It follows that minimal input contains only the terminator slot.

The spec argument is a duck-typed ModuleSpec-like object, meaning that any attributes defined for importlib.machinery.ModuleSpec have matching semantics. The name attribute is required, but this limitation may be lifted in the future. The name will be used instead of the Py_mod_name slot (just like PyModule_FromDefAndSpec ignores PyModuleDef.m_name).

The slots arrays for both PyModule_FromSlotsAndSpec and the new export hook will only allow up to one Py_mod_exec slot. Arrays in PyModuleDef.m_slots may have more; this will not change. This limitation is easy to work around and multiple exec slots are rarely used [1].

For modules created without a PyModuleDef, the Py_mod_create function will be called with NULL for the second argument (def). (In the future, if we find a use case for passing the input slots array, a new slot with an updated signature can be added.)

Unlike the PyModExport_* hook, the slots array may be changed or destroyed after the PyModule_FromSlotsAndSpec call. (That is, Python must take a copy of all input data.) As an exception, any PyMethodDef array given by Py_mod_methods must be statically allocated (or be otherwise guaranteed to outlive the objects created from it). This limitation may be lifted in the future.

A new function, PyModule_Exec, will be added to run the exec slot(s) for a module. This acts like PyModule_ExecDef, but supports modules created using slots, and does not take an explicit def:

int PyModule_Exec(PyObject *module)

Calling this is required to fully initialize a module. PyModule_FromSlotsAndSpec will not run it (just like PyModule_FromDefAndSpec does not call PyModule_ExecDef).

For modules created from a def, calling this is equivalent to calling PyModule_ExecDef(module, PyModule_GetDef(module)).

Tokens

Module objects will optionally store a “token”: a void* pointer similar to Py_tp_token for types.

If specified, using a new Py_mod_token slot, the module token must:

  • outlive the module, so it’s not reused for something else while the module exists; and
  • “belong” to the extension module where the module lives, so it will not clash with other extension modules.

(Typically, it should point to a static constant.)

When the address of a PyModuleDef is used as a module’s token, the module should behave as if it was created from that PyModuleDef. In particular, the module state must have matching layout and semantics.

Modules created using the PyModule_FromSlotsAndSpec or the PyModExport_<NAME> export hook can use a new Py_mod_token slot to set the token.

Modules created from a PyModuleDef will have the token set to that definition. An explicit Py_mod_token slot will we rejected for these. (This allows implementations to share storage for the token and def.)

For modules created via the new export hook, the token will be set to the address of the slots array by default. (This does not apply to modules created by PyModule_FromSlotsAndSpec, as that function’s input might not outlive the module.)

The token will not be set for non-PyModuleType instances.

A PyModule_GetToken function will be added to get the token. Since the result may be NULL, it will be passed via a pointer; the function will return 0 on success and -1 on failure:

int PyModule_GetToken(PyObject *, void **token_p)

A new PyType_GetModuleByToken function will be added, with a signature like the existing PyType_GetModuleByDef but a void *token argument, and the same behaviour except matching tokens rather than only defs.

For easier backwards compatibility, the existing PyType_GetModuleByDef will be changed to work exactly like PyType_GetModuleByToken – that is, it will allow a token (cast to a PyModuleDef * pointer) as the def argument. (The PyModule_GetDef function will not get a similar change, as users may access members of its result.)

New slots

For each field of the PyModuleDef struct, except ones from PyModuleDef_HEAD_INIT, a new slot ID will be provided: Py_mod_name, Py_mod_doc, Py_mod_clear, etc. Slots related to the module state rather than the module object will use a Py_mod_state_ prefix. See New API summary for a full list.

All new slots – these and Py_tp_token discussed above – may not be repeated in the slots array, and may not be used in a PyModuleDef.m_slots array. They may not have a NULL value (instead, the slot can be omitted entirely).

Note that currently, for modules created from a spec (that is, using PyModule_FromDefAndSpec), the PyModuleDef.m_name member is ignored and the name from the spec is used instead. All API proposed in this document creates modules from a spec, and it will ignore Py_mod_name in the same way. The slot will be optional, but extension authors are strongly encouraged to include it for the benefit of future APIs, external tooling, debugging, and introspection.

Bits & Pieces

A PyMODEXPORT_FUNC macro will be added, similar to the PyMODINIT_FUNC macro but with PyModuleDef_Slot * as the return type.

A PyModule_GetStateSize function will be added to retrieve the size set by Py_mod_state_size or PyModuleDef.m_size. Since the result may be -1 (for single-phase-init modules), it will be output via a pointer; the function will return 0 on success and -1 on failure:

int PyModule_GetStateSize(PyObject *, Py_ssize_t *result);

Soft-deprecating the existing export hook

The PyInit_* export hook will be soft-deprecated.

New API summary

Python will load a new module export hook, with two variants:

PyModuleDef_Slot *PyModExport_<NAME>(PyObject *spec);
PyModuleDef_Slot *PyModExportU_<ENCODED_NAME>(PyObject *spec);

The following functions will be added:

PyObject *PyModule_FromSlotsAndSpec(PyModuleDef_Slot *, PyObject *spec)
int PyModule_Exec(PyObject *)
int PyModule_GetToken(PyObject *, void**)
PyObject *PyType_GetModuleByToken(PyTypeObject *type, void *token)
int PyModule_GetStateSize(PyObject *, Py_ssize_t *result);

A new macro will be added:

PyMODEXPORT_FUNC

And new slot types (#defined names for small integers):

  • Py_mod_name (equivalent to PyModuleDef.m_name)
  • Py_mod_doc (equivalent to PyModuleDef.m_doc)
  • Py_mod_state_size (equivalent to PyModuleDef.m_size)
  • Py_mod_methods (equivalent to PyModuleDef.m_methods)
  • Py_mod_state_traverse (equivalent to PyModuleDef.m_traverse)
  • Py_mod_state_clear (equivalent to PyModuleDef.m_clear)
  • Py_mod_state_free (equivalent to PyModuleDef.m_free)
  • Py_mod_token (see above)

All this will be added to the Limited API.

Backwards Compatibility

If an existing module is ported to use the new mechanism, then PyModule_GetDef will start returning NULL for it. (This matches PyModule_GetDef’s current documentation.) We claim that how a module was defined is an implementation detail of that module, so this should not be considered a breaking change.

Similarly, the PyType_GetModuleByDef function may stop matching modules whose definition changed. Module authors may avoid this by explicitly setting a def as the token.

PyType_GetModuleByDef will now accept a module token as the def argument. We specify a suitable restriction on using PyModuleDef addresses as tokens, and non-PyModuleDef pointers were previously invalid input, so this is not a backwards-compatibility issue.

The Py_mod_create function may now be called with NULL for the second argument. This could trip people porting from def to slots, so it needs to be mentioned in porting notes.

Forward compatibility

If a module defines the new export hook, CPython versions that implement this PEP will ignore the traditional PyInit_* hook.

Extensions that straddle Python versions are expected to define both hooks; each build of CPython will “pick” the newest one that it supports.

Porting guide

Here is a guide to convert an existing module to the new API, including some tricky edge cases. It should be moved to a HOWTO in the documentation.

This guide is meant for hand-written modules. For code generators and language wrappers, the Backwards compatibility shim below may be more useful.

  1. Scan your code for uses of PyModule_GetDef. This function will return NULL for modules that use the new mechanism. Instead:
    • For getting the contents of the module’s PyModuleDef, use the C struct directly. Alternatively, get attributes from the module using, for example, PyModule_GetNameObject, the __doc__ attribute, and PyModule_GetStateSize. (Note that Python code can mutate a module’s attributes.)
    • For testing if a module object is “yours”, use PyModule_GetToken instead. Later in this guide, you’ll set the token to be the existing PyModuleDef structure.
  2. Optionally, scan your code for uses of PyType_GetModuleByDef, and replace them with PyType_GetModuleByToken. Later in this guide, you’ll set the token to be the existing PyModuleDef structure.

    (You may skip this step if targetting Python versions that don’t expose PyType_GetModuleByToken, since PyType_GetModuleByDef is backwards-compatible.)

  3. Look at the function identified by Py_mod_create, if any. Make sure that it does not use its second argument (PyModuleDef), as it will be called with NULL. Instead of the argument, use the existing PyModuleDef struct directly.
  4. If using multiple Py_mod_exec slots, consolidate them: pick one of the functions, or write a new one, and call the others from it. Remove all but one Py_mod_exec slots.
  5. Make a copy of the existing PyModuleDef_Slot array pointed to by the m_slots member of your PyModuleDef. If you don’t have an existing slots array, create one like this:
    static PyModuleDef_Slot module_slots[] = {
        {0}
    };
    

    Give this array a unique name. Further examples will assume that you’ve named it module_slots.

  6. Add slots for all members of the existing PyModuleDef structure. See New API summary for a list of the new slots. For example, to add a name and docstring:
    static PyModuleDef_Slot module_slots[] = {
        {Py_mod_name, "mymodule"},
        {Py_mod_doc, (char*)PyDoc_STR("my docstring")},
        // ... (keep existing slots here)
        {0}
    };
    
  7. If you switched from PyModule_GetDef to PyModule_GetToken, and/or if you use PyType_GetModuleByDef or PyType_GetModuleByToken, add a Py_mod_token slot pointing to the existing PyModuleDef struct:
    static PyModuleDef_Slot module_slots[] = {
        // ... (keep existing slots here)
        {Py_mod_token, &your_module_def},
        {0}
    };
    
  8. Add a new export hook.
    PyMODEXPORT_FUNC PyModExport_examplemodule(PyObject);
    
    PyMODEXPORT_FUNC
    PyModExport_examplemodule(PyObject *spec)
    {
        return module_slots;
    }
    

The new export hook will be used on Python 3.15 and above. Once your module no longer supports lower versions:

  1. Delete the PyInit_ function.
  2. If the existing PyModuleDef struct is used only for Py_mod_token and/or PyType_GetModuleByToken, you may remove the Py_mod_token line and replace &your_module_def with module_slots everywhere else.
  3. Delete any unused data. The PyModuleDef struct and the original slots array are likely to be unused.

Backwards compatibility shim

It is possible to write a generic function that implements the “old” export hook (PyInit_) in terms of the API proposed here.

The following implementation can be copied and pasted to a project; only the names PyInit_examplemodule (twice) and PyModExport_examplemodule should need adjusting.

When added to the Example below and compiled with a non-free-threaded build of this PEP’s reference implementation, the resulting extension is compatible with non-free-threading 3.9+ builds, in addition to a free-threading build of the reference implementation. (The module must be named without a version tag, e.g. examplemodule.so, and be placed on sys.path.)

Full support for creating such modules will require backports of some new API, and support in build/install tools. This is out of scope of this PEP. (In particular, the demo “cheats” by using a subset of Limited API 3.15 that happens to work on 3.9; a proper implementation would use Limited API 3.9 with backport shims for new API like Py_mod_name.)

This implementation places a few additional requirements on the slots array:

  • Slots that correspond to PyModuleDef members must come first.
  • A Py_mod_name slot is required.
  • Any Py_mod_token must be set to &module_def_and_token, defined here.

It also passes NULL as spec to the PyModExport export hook. A proper implementation would pass None instead.

#include <string.h>     // memset

PyMODINIT_FUNC PyInit_examplemodule(void);

static PyModuleDef module_def_and_token;

PyMODINIT_FUNC
PyInit_examplemodule(void)
{
    PyModuleDef_Slot *slot = PyModExport_examplemodule(NULL);

    if (module_def_and_token.m_name) {
        // Take care to only set up the static PyModuleDef once.
        // (PyModExport might theoretically return different data each time.)
        return PyModuleDef_Init(&module_def_and_token);
    }
    int copying_slots = 1;
    for (/* slot set above */; slot->slot; slot++) {
        switch (slot->slot) {
        // Set PyModuleDef members from slots. These slots must come first.
#       define COPYSLOT_CASE(SLOT, MEMBER, TYPE)                            \
            case SLOT:                                                      \
                if (!copying_slots) {                                       \
                    PyErr_SetString(PyExc_SystemError,                      \
                                    #SLOT " must be specified earlier");    \
                    goto error;                                             \
                }                                                           \
                module_def_and_token.MEMBER = (TYPE)(slot->value);          \
                break;                                                      \
            /////////////////////////////////////////////////////////////////
        COPYSLOT_CASE(Py_mod_name, m_name, char*)
        COPYSLOT_CASE(Py_mod_doc, m_doc, char*)
        COPYSLOT_CASE(Py_mod_state_size, m_size, Py_ssize_t)
        COPYSLOT_CASE(Py_mod_methods, m_methods, PyMethodDef*)
        COPYSLOT_CASE(Py_mod_state_traverse, m_traverse, traverseproc)
        COPYSLOT_CASE(Py_mod_state_clear, m_clear, inquiry)
        COPYSLOT_CASE(Py_mod_state_free, m_free, freefunc)
        case Py_mod_token:
            // With PyInit_, the PyModuleDef is used as the token.
            if (slot->value != &module_def_and_token) {
                PyErr_SetString(PyExc_SystemError,
                                "Py_mod_token must be set to "
                                "&module_def_and_token");
                goto error;
            }
            break;
        default:
            // The remaining slots become m_slots in the def.
            // (`slot` now points to the "rest" of the original
            //  zero-terminated array.)
            if (copying_slots) {
                module_def_and_token.m_slots = slot;
            }
            copying_slots = 0;
            break;
        }
    }
    if (!module_def_and_token.m_name) {
        // This function needs m_name as the "is initialized" marker.
        PyErr_SetString(PyExc_SystemError, "Py_mod_name slot is required");
        goto error;
    }
    return PyModuleDef_Init(&module_def_and_token);

error:
    memset(&module_def_and_token, 0, sizeof(module_def_and_token));
    return NULL;
}

Security Implications

None known

How to Teach This

In addition to regular reference docs, the Porting guide should be added as a new HOWTO.

Example

/*
Example module with C-level module-global state, and a simple function to
update and query it.
Once compiled and renamed to not include a version tag (for example
examplemodule.so on Linux), this will run succesfully on both regular
and free-threaded builds.

Python usage:

import examplemodule
print(examplemodule.increment_value())  # 0
print(examplemodule.increment_value())  # 1
print(examplemodule.increment_value())  # 2
print(examplemodule.increment_value())  # 3

*/

// Avoid CPython-version-specific ABI (inline functions & macros):
#define Py_LIMITED_API 0x030f0000  // 3.15

#include <Python.h>

typedef struct {
    int value;
} examplemodule_state;

static PyObject *
increment_value(PyObject *module, PyObject *_ignored)
{
    examplemodule_state *state = PyModule_GetState(module);
    int result = ++(state->value);
    return PyLong_FromLong(result);
}

static PyMethodDef examplemodule_methods[] = {
    {"increment_value", increment_value, METH_NOARGS},
    {NULL}
};

static int
examplemodule_exec(PyObject *module) {
    examplemodule_state *state = PyModule_GetState(module);
    state->value = -1;
    return 0;
}

PyDoc_STRVAR(examplemodule_doc, "Example extension.");

static PyModuleDef_Slot examplemodule_slots[] = {
    {Py_mod_name, "examplemodule"},
    {Py_mod_doc, (char*)examplemodule_doc},
    {Py_mod_methods, examplemodule_methods},
    {Py_mod_state_size, (void*)sizeof(examplemodule_state)},
    {Py_mod_exec, (void*)examplemodule_exec},
    {0}
};

// Avoid "implicit declaration of function" warning:
PyMODEXPORT_FUNC PyModExport_examplemodule(PyObject *);

PyMODEXPORT_FUNC
PyModExport_examplemodule(PyObject *spec)
{
    return examplemodule_slots;
}

Reference Implementation

A draft implementation is available in a GitHub branch.

Open Issues

(Add yours!)

Rejected Ideas

Exporting a data pointer rather than a function

This proposes a new module export function, which is expected to return static constant data. That data could be exported directly as a data pointer.

With a function, we avoid dealing with a new kind of exported symbol.

A function also allows the extension to introspect its environment in a limited way – for example, to tailor the returned data to the current Python version.

Possible Future Directions

These ideas are out of scope for this proposal.

Improving slots in general

Slots – and specifically the existing PyModuleDef_Slot – do have a few shortcomings. The most important are:

  • Type safety: void * is used for data pointers, function pointers and small integers, requiring casting that is technically undefined behaviour in C – but works in practice on all relevant architectures. (For example: Py_tp_doc marks a string; Py_mod_gil an integer.)
  • Limited forward compatibility: if an extension provides a slot ID that’s unknown to the current interpreter, module creation will fail. This makes it cumbersome to use “optional” features – ones that should only take effect if the interpreter supports them. (The recently added slots Py_mod_gil and Py_mod_multiple_interpreters are good examples.)

    One workaround is to check Py_Version in the export function, and return a slot array suitable for the current interpreter.

Updating defaults

With a new API, we could update defaults for the Py_mod_multiple_interpreters and Py_mod_gil slots.

The inittab

We’ll need to allow PyModuleDef-less slots in the inittab – that is, add a new variant of PyImport_ExtendInittab. Should that be part of this PEP?

The inittab is used for embedding, where a common/stable ABI is not that important. So, it might be OK to leave this to a later change.

Footnotes


Source: https://github.com/python/peps/blob/main/peps/pep-0793.rst

Last modified: 2025-06-11 11:43:59 GMT