PEP 788 – Protecting the C API from Interpreter Finalization

Author: Peter Bierma <peter at python.org>
Sponsor: Victor Stinner <vstinner at python.org>
Discussions-To: Discourse thread
Status: Draft
Type: Standards Track
Created: 23-Apr-2025
Python-Version: 3.15
Post-History: 10-Mar-2025, 27-Apr-2025, 28-May-2025, 03-Oct-2025

Table of Contents

Abstract
Background
Motivation
Rationale
- Preventing Interpreter Shutdown
- Compatibility Shim for PyGILState_Ensure
Specification
Backwards Compatibility
Security Implications
How to Teach This
- Examples
Reference Implementation
Open Issues
- How Should the APIs Fail?
Rejected Ideas
Acknowledgements
Copyright

Abstract

This PEP introduces a suite of functions in the C API to safely attach to an interpreter by preventing finalization. For example:

static int
thread_function(PyInterpreterView view)
{
    // Prevent the interpreter from finalizing
    PyInterpreterGuard guard = PyInterpreterGuard_FromView(view);
    if (guard == 0) {
        return -1;
    }

    // Analogous to PyGILState_Ensure(), but this is thread-safe.
    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        return -1;
    }

    // Now we can call Python code, without worrying about the thread
    // hanging due to finalization.
    if (PyRun_SimpleString("print('My hovercraft is full of eels')") < 0) {
        PyErr_Print();
    }

    // Destroy the thread state and allow the interpreter to finalize.
    PyThreadState_Release(thread_view);
    PyInterpreterGuard_Close(guard);
    return 0;
}

In addition, the APIs in the PyGILState family are deprecated by this proposal.

Background

In the C API, threads can interact with an interpreter by holding an attached thread state for the current thread. This can get complicated when it comes to creating and attaching thread states in a safe manner, because any non-Python thread (one not created via the threading module) is considered to be “daemon”, meaning that the interpreter won’t wait on that thread before shutting down. Instead, the interpreter will hang the thread when it attempts to attach a thread state, making the thread unusable thereafter.

Attaching a thread state can happen at any point when invoking Python, such as in-between bytecode instructions (to yield the GIL to a different thread), or when a C function exits a Py_BEGIN_ALLOW_THREADS block, so simply guarding against whether the interpreter is finalizing isn’t enough to safely call Python code. (Note that hanging the thread is a relatively new behavior; in older versions, the thread would exit, but the issue is the same.)

Currently, the C API doesn’t provide any way to ensure that an interpreter is in a state that won’t cause a thread to hang when trying to attach. This can be a frustrating issue in large applications that need to execute Python code alongside other native code.

In addition, a typical pattern among users creating non-Python threads is to use PyGILState_Ensure(), which was introduced in PEP 311. This has been very unfortunate for subinterpreters, because PyGILState_Ensure() tends to create a thread state for the main interpreter rather than the current interpreter. This leads to thread-safety issues when extensions create threads that interact with the Python interpreter, because assumptions about the GIL are incorrect.

Motivation

Non-Python Threads Always Hang During Finalization

Many large libraries might need to call Python code in highly asynchronous situations where the desired interpreter could be finalizing or deleted, but want to continue running code after invoking the interpreter. This desire has been brought up by users. For example, a callback that wants to call Python code might be invoked when:

- A kernel has finished running on a GPU.
- A network packet was received.
- A thread has quit, and a native library is executing static finalizers for thread-local storage.

Generally, this pattern would look something like this:

static void
some_callback(void *closure)
{
    /* Do some work */
    /* ... */

    PyGILState_STATE gstate = PyGILState_Ensure();
    /* Invoke the C API to do some computation */
    PyGILState_Release(gstate);

    /* ... */
}

This means that any non-Python thread may be terminated at any point, which severely limits users who want to do more than just execute Python code in their stream of calls.

`Py_IsFinalizing` Cannot Be Used Atomically

Due to the problem mentioned previously, the docs currently recommend Py_IsFinalizing() to guard against termination of the thread:

Calling this function from a thread when the runtime is finalizing will terminate the thread, even if the thread was not created by Python. You can use Py_IsFinalizing() or sys.is_finalizing() to check if the interpreter is in process of being finalized before calling this function to avoid unwanted termination.

Unfortunately, this doesn’t work reliably, because of time-of-check to time-of-use (TOCTOU) issues; the interpreter might not be finalizing during the call to Py_IsFinalizing(), but it might start finalizing immediately afterward, which would cause the attachment of a thread state to hang the thread.
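
The race can be replayed deterministically with a toy model. This is only a sketch under stated assumptions: toy_interp and every toy_* function are hypothetical stand-ins written for illustration, not CPython APIs, and the "other thread" is simulated inline so that the bad interleaving always occurs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for an interpreter; every name here is hypothetical. */
typedef struct {
    atomic_bool finalizing;   /* set once shutdown begins */
} toy_interp;

/* Mirrors a Py_IsFinalizing()-style query: the answer can be stale
   by the time the caller acts on it. */
static bool
toy_is_finalizing(toy_interp *interp)
{
    return atomic_load(&interp->finalizing);
}

static void
toy_begin_finalization(toy_interp *interp)
{
    atomic_store(&interp->finalizing, true);
}

/* Replays the bad interleaving: returns true when the caller would
   have attached after finalization had already begun. */
static bool
toy_check_then_attach_races(toy_interp *interp)
{
    if (toy_is_finalizing(interp)) {
        return false;   /* the caller backs off, so nothing hangs */
    }
    /* Simulated: another thread starts finalization in this window. */
    toy_begin_finalization(interp);
    /* The caller now attaches based on a stale answer. */
    return toy_is_finalizing(interp);
}
```

The check itself is atomic, yet the check and the later attachment are two separate steps, and that gap is what a guard-style API is meant to close.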

Users have expressed a desire for an atomic way to call Py_IsFinalizing in the past.

Locks in Native Extensions Can Be Unusable During Finalization

When acquiring locks in a native API, it’s common to release the GIL (or critical sections on the free-threaded build) to avoid lock-ordering deadlocks. This can be problematic during finalization, because threads holding locks might be hung. For example:

1. A thread goes to acquire a lock, first detaching its thread state to avoid deadlocks.
2. The main thread begins finalization and tells all thread states to hang upon attachment.
3. The thread acquires the lock it was waiting on, but then hangs while attempting to reattach its thread state via Py_END_ALLOW_THREADS.
4. The main thread can no longer acquire the lock, because the thread holding it has hung.

This affects CPython itself, and there’s not much that can be done to fix it with the current API. For example, python/cpython#129536 remarks that the ssl module can emit a fatal error when used at finalization, because a daemon thread got hung while holding the lock for sys.stderr, and then a finalizer tried to write to it. Ideally, a thread should be able to temporarily prevent the interpreter from hanging it while it holds the lock.

Finalization Behavior for `PyGILState_Ensure` Cannot Change

There will always have to be a point in a Python program where PyGILState_Ensure() can no longer attach a thread state. If the interpreter is long dead, then Python obviously can’t give a thread a way to invoke it. PyGILState_Ensure() doesn’t have any meaningful way to return a failure, so it has no choice but to terminate the thread or emit a fatal error, as noted in python/cpython#124622:

I think a new GIL acquisition and release C API would be needed. The way the existing ones get used in existing C code is not amenible to suddenly bolting an error state onto; none of the existing C code is written that way. After the call they always just assume they have the GIL and can proceed. The API was designed as “it’ll block and only return once it has the GIL” without any other option.

As a result, CPython can’t make any real changes to how PyGILState_Ensure() works during finalization, because it would break existing code.

The Term “GIL” Is Tricky for Free-threading

A significant issue with the term “GIL” in the C API is that it is semantically misleading. This was noted in python/cpython#127989, created by the author of this PEP:

The biggest issue is that for free-threading, there is no GIL, so users erroneously call the C API inside Py_BEGIN_ALLOW_THREADS blocks or omit PyGILState_Ensure in fresh threads.

Again, PyGILState_Ensure() gets an attached thread state for the thread on both with-GIL and free-threaded builds. An attached thread state is always needed to call the C API, so PyGILState_Ensure() still needs to be called on free-threaded builds, but with a name like “ensure GIL”, it’s not immediately clear that that’s true.

`PyGILState_Ensure` Doesn’t Guess the Correct Interpreter

As noted in the documentation, the PyGILState functions aren’t officially supported in subinterpreters:

Note that the PyGILState_* functions assume there is only one global interpreter (created automatically by Py_Initialize()). Python supports the creation of additional interpreters (using Py_NewInterpreter()), but mixing multiple interpreters and the PyGILState_* API is unsupported.

This is because PyGILState_Ensure() doesn’t have any way to know which interpreter created the thread, and as such, it has to assume that it was the main interpreter. There isn’t any way to detect this at runtime, so spurious races are bound to come up in threads created by subinterpreters, because synchronization for the wrong interpreter will be used on objects shared between the threads.

For example, if the thread had access to object A, which belongs to a subinterpreter, but then called PyGILState_Ensure(), the thread would have an attached thread state pointing to the main interpreter, not the subinterpreter. This means that any GIL assumptions about the object are wrong, because there is no synchronization between the two GILs.

There’s no great way to solve this, other than introducing a new API that explicitly takes an interpreter from the caller.

Subinterpreters Can Concurrently Deallocate

The other way of creating a non-Python thread, PyThreadState_New() and PyThreadState_Swap(), is a lot better for supporting subinterpreters (because PyThreadState_New() takes an explicit interpreter, rather than assuming that the main interpreter was requested), but is still limited by the current hanging problems in the C API, and is subject to crashes when the subinterpreter finalizes before the thread has a chance to start. This is because in subinterpreters, the PyInterpreterState * structure is allocated on the heap, whereas the main interpreter is statically allocated on the Python runtime state.

Rationale

Preventing Interpreter Shutdown

This PEP takes an approach in which an interpreter includes a guarding API that prevents it from shutting down. Holding an interpreter guard ensures it is safe to call the C API without worrying about the thread being hung by finalization.

This means that interfacing with Python (for example, in a C++ library) will need a guard to the interpreter in order to safely call the object, which is more inconvenient than assuming the main interpreter is the right choice, but there’s not really another option.

This proposal also comes with “views” to an interpreter that can be used to safely poke at an interpreter that may be dead or alive. Using a view, users can create an interpreter guard at any point during its lifecycle, and it will safely fail if the interpreter can no longer support calling Python code.

Compatibility Shim for `PyGILState_Ensure`

This proposal comes with PyUnstable_InterpreterView_FromDefault() as a compatibility hack for some users of PyGILState_Ensure(). It is a thread-safe way to obtain a view of the main (or “default”) interpreter, from which a guard can then be created.

The main drawback to porting existing code to PyThreadState_Ensure() is that it isn’t a drop-in replacement for PyGILState_Ensure(), as it needs an interpreter guard argument. In some large applications, refactoring to use a PyInterpreterGuard everywhere might be tricky, so this function serves as a last resort for users who explicitly want to disallow support for subinterpreters.

Specification

Interpreter Guards

type PyInterpreterGuard

An opaque interpreter guard.

By holding an interpreter guard, the caller can ensure that the interpreter will not finalize until the guard is destroyed.

This is similar to a “readers-writers” lock; threads may hold an interpreter’s guard concurrently, and the interpreter will have to wait until all threads have destroyed their guards before it can enter finalization.

This type is guaranteed to be pointer-sized.
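
The readers-writers analogy above can be sketched as a small toy model. This is an illustration only, not the reference implementation: toy_interp, TOY_FINALIZING, and the toy_* functions are hypothetical names, and where a real runtime would block until all guards are closed, this version merely reports whether finalization may begin.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Top bit of the state word: finalization has started. */
#define TOY_FINALIZING ((uintptr_t)1 << (sizeof(uintptr_t) * CHAR_BIT - 1))

typedef struct {
    /* Low bits: number of live guards. Top bit: TOY_FINALIZING. */
    atomic_uintptr_t state;
} toy_interp;

/* Try to take a guard; fails once finalization has begun. */
static bool
toy_guard_acquire(toy_interp *interp)
{
    uintptr_t old = atomic_load(&interp->state);
    do {
        if (old & TOY_FINALIZING) {
            return false;
        }
        /* On CAS failure, `old` is refreshed and the flag is rechecked. */
    } while (!atomic_compare_exchange_weak(&interp->state, &old, old + 1));
    return true;
}

/* Drop a guard previously taken with toy_guard_acquire(). */
static void
toy_guard_close(toy_interp *interp)
{
    atomic_fetch_sub(&interp->state, 1);
}

/* Finalization may begin only while no guards are held. */
static bool
toy_try_begin_finalization(toy_interp *interp)
{
    uintptr_t expected = 0;
    return atomic_compare_exchange_strong(&interp->state, &expected,
                                          TOY_FINALIZING);
}
```

Because the guard count and the finalizing flag live in one atomic word, taking a guard and checking for finalization happen as a single step, which is the property that a separate Py_IsFinalizing() check cannot provide.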

PyInterpreterGuard PyInterpreterGuard_FromCurrent(void)

Create a finalization guard for the current interpreter.

On success, this function guards the interpreter and returns an opaque reference to the guard; on failure, it returns 0 with an exception set.

The caller must hold an attached thread state.

PyInterpreterGuard PyInterpreterGuard_FromView(PyInterpreterView view)

Create a finalization guard for an interpreter through a view.

On success, this function returns a guard to the interpreter represented by view. The view is still valid after calling this function.

If the interpreter no longer exists or cannot safely run Python code, this function returns 0 without setting an exception.

The caller does not need to hold an attached thread state.

PyInterpreterState *PyInterpreterGuard_GetInterpreter(PyInterpreterGuard guard)

Return the PyInterpreterState pointer protected by guard.

This function cannot fail, and the caller doesn’t need to hold an attached thread state.

PyInterpreterGuard PyInterpreterGuard_Copy(PyInterpreterGuard guard)

Duplicate an interpreter guard.

On success, this function returns a copy of guard; on failure, it returns 0 without an exception set.

The caller does not need to hold an attached thread state.

void PyInterpreterGuard_Close(PyInterpreterGuard guard)

Destroy an interpreter guard, allowing the interpreter to enter finalization if no other guards remain. If an interpreter guard is never closed, the interpreter will wait indefinitely when trying to enter finalization.

This function cannot fail, and the caller doesn’t need to hold an attached thread state.

Interpreter Views

type PyInterpreterView

An opaque view of an interpreter.

This is a thread-safe way to access an interpreter that may be finalized in another thread.

This type is guaranteed to be pointer-sized.
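
One way to picture a view is as a handle to a small control block that outlives the interpreter itself. The sketch below models only that lifetime relationship. It is single-threaded for brevity (a real implementation would need atomic operations), and toy_control and all toy_* functions are hypothetical names that do not describe how CPython implements views.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical control block shared by an interpreter and its views. */
typedef struct {
    int refs;     /* one for the interpreter, plus one per open view */
    bool alive;   /* cleared when the interpreter goes away */
} toy_control;

/* Drop one reference; the block is freed with the last one. */
static void
toy_unref(toy_control *ctl)
{
    if (--ctl->refs == 0) {
        free(ctl);
    }
}

/* Create the control block for a new toy interpreter (NULL on failure). */
static toy_control *
toy_interp_new(void)
{
    toy_control *ctl = malloc(sizeof(*ctl));
    if (ctl == NULL) {
        return NULL;
    }
    ctl->refs = 1;
    ctl->alive = true;
    return ctl;
}

/* A view keeps the control block allocated, not the interpreter. */
static toy_control *
toy_view_new(toy_control *ctl)
{
    ctl->refs++;
    return ctl;
}

/* Safe to call even after the interpreter has been deleted. */
static bool
toy_view_is_usable(toy_control *view)
{
    return view->alive;
}

static void
toy_view_close(toy_control *view)
{
    toy_unref(view);
}

/* Deleting the interpreter leaves open views valid but unusable. */
static void
toy_interp_delete(toy_control *ctl)
{
    ctl->alive = false;
    toy_unref(ctl);
}
```

In this model, a view that is never closed keeps the control block allocated forever, which matches the leak described for PyInterpreterView_Close().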

PyInterpreterView PyInterpreterView_FromCurrent(void)

Create a view to the current interpreter.

This function is generally meant to be used in tandem with PyInterpreterGuard_FromView().

On success, this function returns a view to the current interpreter; on failure, it returns 0 with an exception set.

The caller must hold an attached thread state.

PyInterpreterView PyInterpreterView_Copy(PyInterpreterView view)

Duplicate a view to an interpreter.

On success, this function returns a non-zero copy of view; on failure, it returns 0 without an exception set.

The caller does not need to hold an attached thread state.

void PyInterpreterView_Close(PyInterpreterView view)

Delete an interpreter view. If an interpreter view is never closed, the view’s memory will never be freed.

This function cannot fail, and the caller doesn’t need to hold an attached thread state.

PyInterpreterView PyUnstable_InterpreterView_FromDefault(void)

Create a view for an arbitrary “main” interpreter.

This function only exists for exceptional cases where a specific interpreter can’t be saved.

On success, this function returns a view to the main interpreter; on failure, it returns 0 without an exception set.

The caller does not need to hold an attached thread state.

Ensuring and Releasing Thread States

This proposal includes two new high-level threading APIs that are intended to replace PyGILState_Ensure() and PyGILState_Release().

type PyThreadView

An opaque view of a thread state.

In this PEP, a thread view provides no additional properties beyond a PyThreadState* pointer. However, APIs for PyThreadView may be added in the future.

This type is guaranteed to be pointer-sized.

PyThreadView PyThreadState_Ensure(PyInterpreterGuard guard)

Ensure that the thread has an attached thread state for the interpreter protected by guard, and thus can safely invoke that interpreter. It is OK to call this function if the thread already has an attached thread state, as long as there is a subsequent call to PyThreadState_Release() that matches this one.

Nested calls to this function will only sometimes create a new thread state. If there is no attached thread state, then this function will check for the most recent attached thread state used by this thread. If none exists or it doesn’t match guard, a new thread state is created. If it does match guard, it is reattached. If there is an attached thread state, then a similar check occurs; if the interpreter matches guard, it is attached, and otherwise a new thread state is created.
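
The decision described above can be written out as a small pure function. This is one reading of the paragraph, not the reference implementation: interpreters are modeled as non-zero integers (0 meaning "none"), and toy_action and toy_ensure_action are hypothetical names.

```c
#include <assert.h>

typedef enum {
    TOY_REUSE_ATTACHED,    /* attached thread state already matches guard */
    TOY_REATTACH_CACHED,   /* most recently used thread state matches guard */
    TOY_CREATE_NEW         /* no match: a fresh thread state is needed */
} toy_action;

/* attached_interp: interpreter of the currently attached thread state.
   cached_interp:   interpreter of the thread state most recently
                    attached by this thread.
   guard_interp:    interpreter protected by the guard.
   A value of 0 means "none" for the first two arguments. */
static toy_action
toy_ensure_action(int attached_interp, int cached_interp, int guard_interp)
{
    if (attached_interp != 0) {
        return attached_interp == guard_interp ? TOY_REUSE_ATTACHED
                                               : TOY_CREATE_NEW;
    }
    if (cached_interp != 0 && cached_interp == guard_interp) {
        return TOY_REATTACH_CACHED;
    }
    return TOY_CREATE_NEW;
}
```

Under this reading, a new thread state is created only when neither the attached nor the cached thread state belongs to the guarded interpreter.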

Return a non-zero thread view of the old thread state on success, and 0 on failure.

void PyThreadState_Release(PyThreadView view)

Release a PyThreadState_Ensure() call. If this function is not called, the thread state created by PyThreadState_Ensure(), if any, will leak.

The attached thread state before the corresponding PyThreadState_Ensure() call is guaranteed to be restored upon returning. The cached thread state (the “GIL-state”), as used by PyThreadState_Ensure() and PyGILState_Ensure(), will also be restored.

This function cannot fail.

Deprecation of `PyGILState` APIs

This PEP deprecates all of the existing PyGILState APIs in favor of the existing and new PyThreadState APIs. Namely:

- PyGILState_Ensure(): use PyThreadState_Ensure() instead.
- PyGILState_Release(): use PyThreadState_Release() instead.
- PyGILState_GetThisThreadState(): use PyThreadState_Get() or PyThreadState_GetUnchecked() instead.
- PyGILState_Check(): use PyThreadState_GetUnchecked() != NULL instead.

All of the PyGILState APIs are to be removed from the non-limited C API in Python 3.20. They will remain available in the stable ABI for compatibility.

Backwards Compatibility

This PEP specifies a breaking change with the removal of all the PyGILState APIs from the public headers of the non-limited C API in Python 3.20.

Security Implications

This PEP has no known security implications.

How to Teach This

As with all C API functions, all the new APIs in this PEP will be documented in the C API documentation, ideally under the “Non-Python created threads” section. The existing PyGILState documentation should be updated accordingly to point to the new APIs.

Examples

These examples are here to help understand the APIs described in this PEP. They could be reused in the documentation.

Example: A Library Interface

Imagine that you’re developing a C library for logging. You might want to provide an API that allows users to log to a Python file object.

With this PEP, you would implement it like this:

int
LogToPyFile(PyInterpreterView view,
            PyObject *file,
            PyObject *text)
{
    PyInterpreterGuard guard = PyInterpreterGuard_FromView(view);
    if (guard == 0) {
        /* Python interpreter has shut down */
        return -1;
    }

    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        fputs("Cannot call Python.\n", stderr);
        return -1;
    }

    const char *to_write = PyUnicode_AsUTF8(text);
    if (to_write == NULL) {
        // Since the exception may be destroyed upon calling PyThreadState_Release(),
        // print out the exception ourselves.
        PyErr_Print();
        PyThreadState_Release(thread_view);
        PyInterpreterGuard_Close(guard);
        return -1;
    }
    int res = PyFile_WriteString(to_write, file);
    if (res < 0) {
        PyErr_Print();
    }

    PyThreadState_Release(thread_view);
    PyInterpreterGuard_Close(guard);
    return res < 0 ? -1 : 0;
}

Example: A Single-threaded Ensure

This example shows how to acquire a C lock in a Python method defined from C.

If this were called from a daemon thread, the interpreter could hang the thread while reattaching its thread state, leaving us with the lock held. Any future finalizer that attempts to acquire the lock would be deadlocked.

static PyObject *
my_critical_operation(PyObject *self, PyObject *Py_UNUSED(args))
{
    assert(PyThreadState_GetUnchecked() != NULL);
    PyInterpreterGuard guard = PyInterpreterGuard_FromCurrent();
    if (guard == 0) {
        /* Python interpreter has shut down */
        return NULL;
    }

    Py_BEGIN_ALLOW_THREADS;
    acquire_some_lock();

    /* Do something while holding the lock.
       The interpreter won't finalize during this period. */
    // ...

    release_some_lock();
    Py_END_ALLOW_THREADS;
    PyInterpreterGuard_Close(guard);
    Py_RETURN_NONE;
}

Example: Transitioning From the Legacy Functions

The following code uses the PyGILState APIs:

static int
thread_func(void *arg)
{
    PyGILState_STATE gstate = PyGILState_Ensure();
    /* It's not an issue in this example, but we just attached
       a thread state for the main interpreter. If my_method() was
       originally called in a subinterpreter, then we would be unable
       to safely interact with any objects from it. */
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyGILState_Release(gstate);
    return 0;
}

static PyObject *
my_method(PyObject *self, PyObject *unused)
{
    PyThread_handle_t handle;
    PyThread_ident_t ident;

    if (PyThread_start_joinable_thread(thread_func, NULL, &ident, &handle) < 0) {
        return NULL;
    }
    Py_BEGIN_ALLOW_THREADS;
    PyThread_join_thread(handle);
    Py_END_ALLOW_THREADS;
    Py_RETURN_NONE;
}

This is the same code, rewritten to use the new functions:

static int
thread_func(void *arg)
{
    PyInterpreterGuard guard = (PyInterpreterGuard)arg;
    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        return -1;
    }
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    PyInterpreterGuard_Close(guard);
    return 0;
}

static PyObject *
my_method(PyObject *self, PyObject *unused)
{
    PyThread_handle_t handle;
    PyThread_ident_t ident;

    PyInterpreterGuard guard = PyInterpreterGuard_FromCurrent();
    if (guard == 0) {
        return NULL;
    }

    if (PyThread_start_joinable_thread(thread_func, (void *)guard, &ident, &handle) < 0) {
        PyInterpreterGuard_Close(guard);
        return NULL;
    }
    Py_BEGIN_ALLOW_THREADS
    PyThread_join_thread(handle);
    Py_END_ALLOW_THREADS
    Py_RETURN_NONE;
}

Example: A Daemon Thread

With this PEP, daemon threads are very similar to how non-Python threads work in the C API today. After calling PyThreadState_Ensure(), simply close the interpreter guard to allow the interpreter to shut down (and hang the current thread forever).

static int
thread_func(void *arg)
{
    PyInterpreterGuard guard = (PyInterpreterGuard)arg;
    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        return -1;
    }
    /* Close the interpreter guard, allowing it to
       finalize. This means that print(42) can hang this thread. */
    PyInterpreterGuard_Close(guard);
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    return 0;
}

static PyObject *
my_method(PyObject *self, PyObject *unused)
{
    PyThread_handle_t handle;
    PyThread_ident_t ident;

    PyInterpreterGuard guard = PyInterpreterGuard_FromCurrent();
    if (guard == 0) {
        return NULL;
    }

    if (PyThread_start_joinable_thread(thread_func, (void *)guard, &ident, &handle) < 0) {
        PyInterpreterGuard_Close(guard);
        return NULL;
    }
    Py_RETURN_NONE;
}

Example: An Asynchronous Callback

typedef struct {
    PyInterpreterView view;
} ThreadData;

static int
async_callback(void *arg)
{
    ThreadData *tdata = (ThreadData *)arg;
    PyInterpreterView view = tdata->view;
    PyInterpreterGuard guard = PyInterpreterGuard_FromView(view);
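    if (guard == 0) {
        fputs("Python has shut down!\n", stderr);
        // Release the view and callback data on failure, too.
        PyInterpreterView_Close(view);
        PyMem_RawFree(tdata);
        return -1;
    }

    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        PyInterpreterView_Close(view);
        PyMem_RawFree(tdata);
        return -1;
    }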
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    PyInterpreterGuard_Close(guard);
    PyInterpreterView_Close(view);
    PyMem_RawFree(tdata);
    return 0;
}

static PyObject *
setup_callback(PyObject *self, PyObject *unused)
{
    // View to the interpreter. It won't wait on the callback
    // to finalize.
    ThreadData *tdata = PyMem_RawMalloc(sizeof(ThreadData));
    if (tdata == NULL) {
        PyErr_NoMemory();
        return NULL;
    }
    PyInterpreterView view = PyInterpreterView_FromCurrent();
    if (view == 0) {
        PyMem_RawFree(tdata);
        return NULL;
    }
    tdata->view = view;
    register_callback(async_callback, tdata);

    Py_RETURN_NONE;
}

Example: Calling Python Without a Callback Parameter

There are a few cases where callback functions don’t take a callback parameter (void *arg), so it’s difficult to create a guard for any specific interpreter. The solution to this problem is to obtain a view of the main interpreter through PyUnstable_InterpreterView_FromDefault() and create a guard from that view.

static void
call_python(void)
{
    PyInterpreterView view = PyUnstable_InterpreterView_FromDefault();
    if (view == 0) {
        fputs("Python has shut down.", stderr);
        return;
    }

    PyInterpreterGuard guard = PyInterpreterGuard_FromView(view);
    if (guard == 0) {
        fputs("Python has shut down.", stderr);
        // The view is still open; close it before bailing out.
        PyInterpreterView_Close(view);
        return;
    }

    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        PyInterpreterView_Close(view);
        return;
    }
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    PyInterpreterGuard_Close(guard);
    PyInterpreterView_Close(view);
}

Reference Implementation

A reference implementation of this PEP can be found at python/cpython#133110.

Open Issues

How Should the APIs Fail?

There is some disagreement over how the PyInterpreter[Guard|View] APIs should indicate a failure to the caller. There are two competing ideas:

- Return -1 to indicate failure, and 0 to indicate success. On success, functions will assign to a PyInterpreter[Guard|View] pointer passed as an argument.
- Directly return a PyInterpreter[Guard|View], with a value of 0 being equivalent to NULL, indicating failure.

Currently, the PEP specifies the latter.
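
To make the two options concrete, the sketch below shows both signature styles side by side for a toy handle type. All names are hypothetical, and the `alive` parameter merely stands in for whatever condition decides whether a guard can be created.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t toy_handle;   /* hypothetical pointer-sized handle */

/* Shared toy logic: a handle exists only while `alive` is non-zero. */
static toy_handle
toy_make(int alive)
{
    return alive ? (toy_handle)1 : 0;
}

/* Style 1: return 0 or -1, and pass the result through an out-parameter.
   On failure, *out is left untouched. */
static int
toy_guard_from_view_status(int alive, toy_handle *out)
{
    toy_handle handle = toy_make(alive);
    if (handle == 0) {
        return -1;
    }
    *out = handle;
    return 0;
}

/* Style 2: return the handle directly, with 0 meaning failure. */
static toy_handle
toy_guard_from_view_value(int alive)
{
    return toy_make(alive);
}
```

The first style matches the usual integer-returning C API convention, while the second keeps call sites shorter at the cost of reserving 0 as an invalid handle value.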

Rejected Ideas

Interpreter Reference Counting

There were two iterations of this proposal that both specified that an interpreter maintain a reference count and would wait for that count to reach zero before shutting down.

The first iteration of this idea did this by adding implicit reference counting to PyInterpreterState * pointers. A function known as PyInterpreterState_Hold would increment the reference count (making it a “strong reference”), and PyInterpreterState_Release would decrement it. An interpreter’s ID (a standalone int64_t) was used as a form of weak reference, which could be used to look up an interpreter state and atomically increment its reference count. These ideas were ultimately rejected because they seemed to make things very confusing. All existing uses of PyInterpreterState * would be borrowed, making it difficult for developers to understand which parts of their code require or use a strong reference.

In response to that pushback, this PEP specified PyInterpreterRef APIs that would also mimic reference counting, but in a more explicit manner that made it easier for developers. PyInterpreterRef was analogous to PyInterpreterGuard in this PEP. Similarly, the older revision included PyInterpreterWeakRef, which was analogous to PyInterpreterView.

Eventually, the notion of reference counting was completely abandoned from this proposal for a few reasons:

- There was contention over overcomplication in the API design; the reference-counting design looked very similar to HPy’s, which had no precedent in CPython. There was fear that this proposal was being overcomplicated to look more like HPy.
- Unlike traditional reference-counting APIs, acquiring a strong reference to an interpreter could fail at any time, and an interpreter would not be deallocated immediately when its reference count reached zero.
- There was prior discussion about adding “true” reference counting to interpreters (which would deallocate upon reaching zero), which would have been very confusing if there was an existing API in CPython titled PyInterpreterRef that did something different.

Non-daemon Thread States

In earlier revisions of this PEP, interpreter guards were a property of a thread state rather than a property of an interpreter. This meant that PyThreadState_Ensure() kept an interpreter guard held, and it was closed upon calling PyThreadState_Release(). A thread state that had a guard to an interpreter was known as a “non-daemon thread state.” At first, this seemed like an improvement because it shifted the management of a guard’s lifetime to the thread rather than the user, which eliminated some boilerplate.

However, this ended up making the proposal significantly more complex and hurt the proposal’s goals:

- Most importantly, non-daemon thread states place too much emphasis on daemon threads as the problem, which made the PEP confusing. Additionally, the phrase “non-daemon” added extra confusion, because non-daemon Python threads are explicitly joined. In contrast, a non-daemon C thread is only waited on until it destroys its guard.
- In many cases, an interpreter guard should outlive a singular thread state. Stealing the interpreter guard in PyThreadState_Ensure() was particularly troublesome for these cases. If PyThreadState_Ensure() didn’t steal a guard with non-daemon thread states, it would muddy the ownership story of the interpreter guard, leading to a more confusing API.

Exposing an `Activate`/`Deactivate` API Instead of `Ensure`/`Release`

In prior discussions of this API, it was suggested to provide actual PyThreadState pointers in the API in an attempt to make the ownership and lifetime of the thread state more straightforward:

More importantly though, I think this makes it clearer who owns the thread state - a manually created one is controlled by the code that created it, and once it’s deleted it can’t be activated again.

This was ultimately rejected for two reasons:

- The proposed API has closer usage to PyGILState_Ensure() & PyGILState_Release(), which helps ease the transition for old codebases.
- It’s significantly easier for code-generators like Cython to use, as there isn’t any additional complexity with tracking PyThreadState pointers around.

Using `PyStatus` for the Return Value of `PyThreadState_Ensure`

In prior iterations of this API, PyThreadState_Ensure() returned a PyStatus instead of an integer to denote failures, which had the benefit of providing an error message.

This was rejected because it’s not clear that an error message would be all that useful; all the conceived use-cases for this API wouldn’t really care about a message indicating why Python can’t be invoked. As such, the API would only be needlessly more complex to use, which in turn would hurt the transition from PyGILState_Ensure().

In addition, PyStatus isn’t commonly used in the C API. A few functions related to interpreter initialization use it (simply because they can’t raise exceptions), and PyThreadState_Ensure() does not fall under that category.

Acknowledgements

This PEP is based on prior work, feedback, and discussions from many people, including Victor Stinner, Antoine Pitrou, David Woods, Sam Gross, Matt Page, Ronald Oussoren, Matt Wozniski, Eric Snow, Steve Dower, Petr Viktorin, Gregory P. Smith, and Alyssa Coghlan.

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.

Source: https://github.com/python/peps/blob/main/peps/pep-0788.rst

Last modified: 2025-12-02 18:33:36 GMT