PEP 788 – Protecting the C API from Interpreter Finalization
- Author: Peter Bierma <zintensitydev at gmail.com>
- Sponsor: Victor Stinner <vstinner at python.org>
- Discussions-To: Discourse thread
- Status: Draft
- Type: Standards Track
- Created: 23-Apr-2025
- Python-Version: 3.15
- Post-History: 10-Mar-2025, 27-Apr-2025, 28-May-2025, 03-Oct-2025
Table of Contents
- Abstract
- Background
- Motivation
  - Non-Python Threads Always Hang During Finalization
  - Locks in Native Extensions Can Be Unusable During Finalization
  - Finalization Behavior for PyGILState_Ensure Cannot Change
  - The Term “GIL” Is Tricky for Free-threading
  - PyGILState_Ensure Doesn’t Guess the Correct Interpreter
  - Subinterpreters Can Concurrently Deallocate
- Rationale
- Specification
- Backwards Compatibility
- Security Implications
- How to Teach This
- Reference Implementation
- Open Issues
- Rejected Ideas
- Acknowledgements
- Copyright
Abstract
This PEP introduces a suite of functions in the C API to safely attach to an interpreter by preventing finalization. For example:
static int
thread_function(PyInterpreterView view)
{
    // Prevent the interpreter from finalizing
    PyInterpreterGuard guard = PyInterpreterGuard_FromView(view);
    if (guard == 0) {
        return -1;
    }
    // Analogous to PyGILState_Ensure(), but this is thread-safe.
    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        return -1;
    }
    // Now we can call Python code, without worrying about the thread
    // hanging due to finalization.
    if (PyRun_SimpleString("print('My hovercraft is full of eels')") < 0) {
        PyErr_Print();
    }
    // Destroy the thread state and allow the interpreter to finalize.
    PyThreadState_Release(thread_view);
    PyInterpreterGuard_Close(guard);
    return 0;
}
In addition, the APIs in the PyGILState family are deprecated by this
proposal.
Background
In the C API, threads can interact with an interpreter by holding an
attached thread state for the current thread. This can get complicated
when it comes to creating and attaching thread states
in a safe manner, because any non-Python thread (one not created via the
threading module) is considered to be “daemon”, meaning that the interpreter
won’t wait on that thread before shutting down. Instead, the interpreter will hang the
thread when it attempts to attach a thread state, making the thread unusable
thereafter.
Attaching a thread state can happen at any point when invoking Python, such
as in-between bytecode instructions (to yield the GIL to a different thread),
or when a C function exits a Py_BEGIN_ALLOW_THREADS block, so simply
guarding against whether the interpreter is finalizing isn’t enough to safely
call Python code. (Note that hanging the thread is a relatively new behavior;
in older versions, the thread would exit, but the issue is the same.)
Currently, the C API doesn’t provide any way to ensure that an interpreter is in a state that won’t cause a thread to hang when trying to attach. This can be a frustrating issue in large applications that need to execute Python code alongside other native code.
In addition, a typical pattern among users creating non-Python threads is to
use PyGILState_Ensure(), which was introduced in PEP 311. This has
been very unfortunate for subinterpreters, because PyGILState_Ensure()
tends to create a thread state for the main interpreter rather than the
current interpreter. This leads to thread-safety issues when extensions create
threads that interact with the Python interpreter, because assumptions about
the GIL are incorrect.
Motivation
Non-Python Threads Always Hang During Finalization
Many large libraries might need to call Python code in highly asynchronous situations where the desired interpreter could be finalizing or deleted, but want to continue running code after invoking the interpreter. This desire has been brought up by users. For example, a callback that wants to call Python code might be invoked when:
- A kernel has finished running on a GPU.
- A network packet was received.
- A thread has quit, and a native library is executing static finalizers for thread-local storage.
Generally, this pattern would look something like this:
static void
some_callback(void *closure)
{
    /* Do some work */
    /* ... */
    PyGILState_STATE gstate = PyGILState_Ensure();
    /* Invoke the C API to do some computation */
    PyGILState_Release(gstate);
    /* ... */
}
This means that any non-Python thread may be terminated at any point, which severely limits users who want to do more than just execute Python code in their stream of calls.
Py_IsFinalizing Is Not Atomic
Due to the problem mentioned previously, the docs
currently recommend Py_IsFinalizing() to guard against termination of
the thread:
Calling this function from a thread when the runtime is finalizing will terminate the thread, even if the thread was not created by Python. You can use Py_IsFinalizing() or sys.is_finalizing() to check if the interpreter is in process of being finalized before calling this function to avoid unwanted termination.
Unfortunately, this doesn’t work reliably, because of time-of-check to time-of-use
issues; the interpreter might not be finalizing during the call to
Py_IsFinalizing(), but it might start finalizing immediately
afterward, which would cause the attachment of a thread state to hang the
thread.
Users have expressed a desire for an
atomic way to call Py_IsFinalizing in the past.
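What an atomic alternative needs is for the finalization check and the thread's registration to happen as one indivisible step. The following is a minimal sketch of that idea using C11 atomics. It is a toy model rather than CPython's implementation, and every name in it (toy_runtime, toy_is_finalizing, toy_try_enter, toy_leave, toy_try_finalize) is hypothetical. In particular, a real interpreter would wait for threads to leave instead of refusing to finalize.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model, not CPython code. One word holds both pieces of
   state so they can be updated together: the top bit means "finalizing",
   and the remaining bits count the threads that are currently inside. */
#define TOY_FINALIZING ((uint32_t)1u << 31)

typedef struct {
    _Atomic uint32_t state;
} toy_runtime;

/* Racy pattern: the flag can flip between this check and the later use. */
static bool
toy_is_finalizing(toy_runtime *rt)
{
    return (atomic_load(&rt->state) & TOY_FINALIZING) != 0;
}

/* Atomic pattern: check the flag and register in one compare-and-swap. */
static bool
toy_try_enter(toy_runtime *rt)
{
    uint32_t old = atomic_load(&rt->state);
    do {
        if (old & TOY_FINALIZING) {
            return false;  /* too late, the caller must not proceed */
        }
        /* On failure, the CAS reloads `old`, so the flag is re-checked. */
    } while (!atomic_compare_exchange_weak(&rt->state, &old, old + 1));
    return true;
}

/* Undo a successful toy_try_enter(). */
static void
toy_leave(toy_runtime *rt)
{
    assert((atomic_load(&rt->state) & ~TOY_FINALIZING) > 0);
    atomic_fetch_sub(&rt->state, 1);
}

/* Finalization may only begin while nobody is inside. */
static bool
toy_try_finalize(toy_runtime *rt)
{
    uint32_t expected = 0;
    return atomic_compare_exchange_strong(&rt->state, &expected,
                                          TOY_FINALIZING);
}
```

With toy_is_finalizing() alone, a caller can observe "not finalizing" and still lose the race. With toy_try_enter(), a true result means finalization cannot start until the matching toy_leave().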
Locks in Native Extensions Can Be Unusable During Finalization
When acquiring locks in a native API, it’s common to release the GIL (or critical sections on the free-threaded build) to avoid lock-ordering deadlocks. This can be problematic during finalization, because threads holding locks might be hung. For example:
- A thread goes to acquire a lock, first detaching its thread state to avoid deadlocks.
- The main thread begins finalization and tells all thread states to hang upon attachment.
- The thread acquires the lock it was waiting on, but then hangs while attempting
to reattach its thread state via Py_END_ALLOW_THREADS.
- The main thread can no longer acquire the lock, because the thread holding it has hung.
This affects CPython itself, and there’s not much that can be done
to fix it with the current API. For example,
python/cpython#129536
remarks that the ssl module can emit a fatal error when used at
finalization, because a daemon thread got hung while holding the lock
for sys.stderr, and then a finalizer tried to write to it.
Ideally, a thread should be able to temporarily prevent the interpreter
from hanging it while it holds the lock.
Finalization Behavior for PyGILState_Ensure Cannot Change
There will always have to be a point in a Python program where
PyGILState_Ensure() can no longer attach a thread state.
If the interpreter is long dead, then Python obviously can’t give a
thread a way to invoke it. PyGILState_Ensure() doesn’t have any
meaningful way to return a failure, so it has no choice but to terminate
the thread or emit a fatal error, as noted in
python/cpython#124622:
I think a new GIL acquisition and release C API would be needed. The way the existing ones get used in existing C code is not amenable to suddenly bolting an error state onto; none of the existing C code is written that way. After the call they always just assume they have the GIL and can proceed. The API was designed as “it’ll block and only return once it has the GIL” without any other option.
As a result, CPython can’t make any real changes to how PyGILState_Ensure()
works during finalization, because it would break existing code.
The Term “GIL” Is Tricky for Free-threading
A significant issue with the term “GIL” in the C API is that it is semantically misleading. This was noted in python/cpython#127989, created by the author of this PEP:
The biggest issue is that for free-threading, there is no GIL, so users erroneously call the C API inside Py_BEGIN_ALLOW_THREADS blocks or omit PyGILState_Ensure in fresh threads.
Again, PyGILState_Ensure() gets an attached thread state for the
thread on both with-GIL and free-threaded builds. An attached thread state is
always needed to call the C API, so PyGILState_Ensure() still needs
to be called on free-threaded builds, but with a name like “ensure GIL”, it’s
not immediately clear that that’s true.
PyGILState_Ensure Doesn’t Guess the Correct Interpreter
As noted in the documentation,
the PyGILState functions aren’t officially supported in subinterpreters:
Note that the PyGILState_* functions assume there is only one global interpreter (created automatically by Py_Initialize()). Python supports the creation of additional interpreters (using Py_NewInterpreter()), but mixing multiple interpreters and the PyGILState_* API is unsupported.
This is because PyGILState_Ensure() doesn’t have any way
to know which interpreter created the thread, and as such, it has to assume
that it was the main interpreter. There isn’t any way to detect this at
runtime, so spurious races are bound to come up in threads created by
subinterpreters, because synchronization for the wrong interpreter will be
used on objects shared between the threads.
For example, if the thread had access to object A, which belongs to a
subinterpreter, but then called PyGILState_Ensure(), the thread would
have an attached thread state pointing to the main interpreter,
not the subinterpreter. This means that any GIL assumptions about the
object are wrong, because there is no synchronization between the two GILs.
There’s no great way to solve this, other than introducing a new API that explicitly takes an interpreter from the caller.
Subinterpreters Can Concurrently Deallocate
The other way of creating a non-Python thread, PyThreadState_New() and
PyThreadState_Swap(), is a lot better for supporting subinterpreters
(because PyThreadState_New() takes an explicit interpreter, rather than
assuming that the main interpreter was requested), but is still limited by the
current hanging problems in the C API, and is subject to crashes when the
subinterpreter finalizes before the thread has a chance to start. This is because
in subinterpreters, the PyInterpreterState * structure is allocated on the
heap, whereas the main interpreter is statically allocated on the Python runtime
state.
Rationale
Preventing Interpreter Shutdown
This PEP takes an approach in which an interpreter includes a guarding API that prevents it from shutting down. Holding an interpreter guard ensures it is safe to call the C API without worrying about the thread being hung by finalization.
This means that interfacing with Python (for example, in a C++ library) will need a guard to the interpreter in order to safely call the object, which is more inconvenient than assuming the main interpreter is the right choice, but there’s not really another option.
This proposal also comes with “views” to an interpreter that can be used to safely poke at an interpreter that may be dead or alive. Using a view, users can create an interpreter guard at any point during its lifecycle, and it will safely fail if the interpreter can no longer support calling Python code.
Compatibility Shim for PyGILState_Ensure
This proposal comes with PyUnstable_InterpreterView_FromDefault() as a
compatibility hack for some users of PyGILState_Ensure(). It is a
thread-safe way to obtain a view of the main (or “default”) interpreter,
from which a guard can then be created.
The main drawback to porting existing code to PyThreadState_Ensure() is that
it isn’t a drop-in replacement for PyGILState_Ensure(), as it needs
an interpreter guard argument. In some large applications, refactoring to
use a PyInterpreterGuard everywhere might be tricky, so this function
serves as a last resort for users who explicitly want to disallow support for
subinterpreters.
Specification
Interpreter Guards
- 
type PyInterpreterGuard
- An opaque interpreter guard. By holding an interpreter guard, the caller can ensure that the interpreter will not finalize until the guard is destroyed. This is similar to a “readers-writers” lock; threads may hold an interpreter’s guard concurrently, and the interpreter will have to wait until all threads have destroyed their guards before it can enter finalization. This type is guaranteed to be pointer-sized.
- 
PyInterpreterGuard PyInterpreterGuard_FromCurrent(void)
- Create a finalization guard for the current interpreter. On success, this function guards the interpreter and returns an opaque reference to the guard; on failure, it returns 0 with an exception set. The caller must hold an attached thread state.
- 
PyInterpreterGuard PyInterpreterGuard_FromView(PyInterpreterView view)
- Create a finalization guard for an interpreter through a view. On success, this function returns a guard to the interpreter represented by view. The view is still valid after calling this function. If the interpreter no longer exists or cannot safely run Python code, this function returns 0 without setting an exception. The caller does not need to hold an attached thread state.
- 
PyInterpreterState *PyInterpreterGuard_GetInterpreter(PyInterpreterGuard guard)
- Return the PyInterpreterState pointer protected by guard. This function cannot fail, and the caller doesn’t need to hold an attached thread state.
- 
PyInterpreterGuard PyInterpreterGuard_Copy(PyInterpreterGuard guard)
- Duplicate an interpreter guard. On success, this function returns a copy of guard; on failure, it returns 0 without an exception set. The caller does not need to hold an attached thread state.
- 
void PyInterpreterGuard_Close(PyInterpreterGuard guard)
- Destroy an interpreter guard, allowing the interpreter to enter
finalization if no other guards remain. This function cannot fail, and the caller doesn’t need to hold an attached thread state.
Interpreter Views
- 
type PyInterpreterView
- An opaque view of an interpreter. This is a thread-safe way to access an interpreter that may be finalized in another thread. This type is guaranteed to be pointer-sized.
- 
PyInterpreterView PyInterpreterView_FromCurrent(void)
- Create a view to the current interpreter. This function is generally meant to be used in tandem with PyInterpreterGuard_FromView(). On success, this function returns a view to the current interpreter; on failure, it returns 0 with an exception set. The caller must hold an attached thread state.
- 
PyInterpreterView PyInterpreterView_Copy(PyInterpreterView view)
- Duplicate a view to an interpreter. On success, this function returns a non-zero copy of view; on failure, it returns 0 without an exception set. The caller does not need to hold an attached thread state.
- 
void PyInterpreterView_Close(PyInterpreterView view)
- Delete an interpreter view. This function cannot fail, and the caller doesn’t need to hold an attached thread state.
- 
PyInterpreterView PyUnstable_InterpreterView_FromDefault(void)
- Create a view for an arbitrary “main” interpreter. This function only exists for exceptional cases where a specific interpreter can’t be saved. On success, this function returns a view to the main interpreter; on failure, it returns 0 without an exception set. The caller does not need to hold an attached thread state.
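The relationship between views and guards can be pictured as a weak handle that is promoted to a strong one. The sketch below is a single-threaded toy model under that assumption, not the reference implementation: toy_record, toy_view, toy_guard, and the toy_* functions are hypothetical names, and the locking or atomics a real implementation needs are left out. The record deliberately outlives the "interpreter" so that a stale view can still be inspected safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model, not CPython code. The record stays allocated
   for as long as any view refers to it, even after shutdown. */
typedef struct {
    bool alive;       /* false once the toy interpreter has shut down */
    unsigned guards;  /* open guards; shutdown must wait for these */
    unsigned views;   /* open views; they never block shutdown */
} toy_record;

/* Pointer-sized opaque handles where 0 means failure, as in the PEP. */
typedef uintptr_t toy_view;
typedef uintptr_t toy_guard;

static toy_view
toy_view_new(toy_record *rec)
{
    rec->views++;
    return (toy_view)rec;
}

static void
toy_view_close(toy_view view)
{
    toy_record *rec = (toy_record *)view;
    assert(rec->views > 0);
    rec->views--;
}

/* Promote a view to a guard. Fails cleanly if the interpreter is gone. */
static toy_guard
toy_guard_from_view(toy_view view)
{
    toy_record *rec = (toy_record *)view;
    if (!rec->alive) {
        return 0;
    }
    rec->guards++;
    return (toy_guard)rec;
}

static void
toy_guard_close(toy_guard guard)
{
    toy_record *rec = (toy_record *)guard;
    assert(rec->guards > 0);
    rec->guards--;
}

/* Shutdown is refused here while guards are open; a real interpreter
   would block until they are closed instead. */
static bool
toy_shutdown(toy_record *rec)
{
    if (rec->guards > 0) {
        return false;
    }
    rec->alive = false;
    return true;
}
```

The property to notice is that holding a view never delays shutdown, while holding a guard always does, and that promoting a view after shutdown yields 0 rather than touching freed state.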
Ensuring and Releasing Thread States
This proposal includes two new high-level threading APIs that intend to
replace PyGILState_Ensure() and PyGILState_Release().
- 
type PyThreadView
- An opaque view of a thread state. In this PEP, a thread view provides no additional properties beyond a PyThreadState * pointer. However, APIs for PyThreadView may be added in the future. This type is guaranteed to be pointer-sized.
- 
PyThreadView PyThreadState_Ensure(PyInterpreterGuard guard)
- Ensure that the thread has an attached thread state for the
interpreter protected by guard, and thus can safely invoke that
interpreter. It is OK to call this function if the thread already has an
attached thread state, as long as there is a subsequent call to
PyThreadState_Release() that matches this one. Nested calls to this function will only sometimes create a new thread state. If there is no attached thread state, then this function will check for the most recent attached thread state used by this thread. If none exists or it doesn’t match guard, a new thread state is created. If it does match guard, it is reattached. If there is an attached thread state, then a similar check occurs; if the interpreter matches guard, it is attached, and otherwise a new thread state is created. Return a non-zero thread view of the old thread state on success, and 0 on failure.
- 
void PyThreadState_Release(PyThreadView view)
- Release a PyThreadState_Ensure() call. The attached thread state before the corresponding PyThreadState_Ensure() call is guaranteed to be restored upon returning. The cached thread state (the “GIL-state”), as used by PyThreadState_Ensure() and PyGILState_Ensure(), will also be restored. This function cannot fail.
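The rules for when PyThreadState_Ensure() creates a new thread state can be read as a small decision function. The following sketch encodes one reading of the description above; it is not taken from the reference implementation. toy_tstate, toy_action, and toy_ensure_action are hypothetical names, and interpreters are reduced to integer IDs for illustration.

```c
#include <stddef.h>

/* Hypothetical toy model, not CPython code. */
typedef struct {
    int interp_id;  /* which interpreter this thread state belongs to */
} toy_tstate;

typedef enum {
    TOY_REUSE_ATTACHED,   /* keep using the currently attached state */
    TOY_REATTACH_CACHED,  /* reattach the most recently used state */
    TOY_CREATE_NEW        /* create a fresh state for the guard's interpreter */
} toy_action;

/* attached: the thread state attached right now, or NULL.
   cached:   the most recent thread state attached by this thread, or NULL.
   guard_interp_id: the interpreter protected by the guard. */
static toy_action
toy_ensure_action(const toy_tstate *attached,
                  const toy_tstate *cached,
                  int guard_interp_id)
{
    if (attached != NULL) {
        /* Already attached: reuse only if it is for the same interpreter. */
        return attached->interp_id == guard_interp_id
            ? TOY_REUSE_ATTACHED
            : TOY_CREATE_NEW;
    }
    if (cached != NULL && cached->interp_id == guard_interp_id) {
        /* Nothing attached, but a matching state was used previously. */
        return TOY_REATTACH_CACHED;
    }
    /* No usable state for this interpreter exists on this thread. */
    return TOY_CREATE_NEW;
}
```

In every branch, the caller would still need a matching release call that restores whatever was attached before, which is what the returned thread view is for.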
Deprecation of PyGILState APIs
This PEP deprecates all of the existing PyGILState APIs in favor of the
existing and new PyThreadState APIs. Namely:
- PyGILState_Ensure(): use PyThreadState_Ensure() instead.
- PyGILState_Release(): use PyThreadState_Release() instead.
- PyGILState_GetThisThreadState(): use PyThreadState_Get() or PyThreadState_GetUnchecked() instead.
- PyGILState_Check(): use PyThreadState_GetUnchecked() != NULL instead.
All of the PyGILState APIs are to be removed from the non-limited C API in
Python 3.20. They will remain available in the stable ABI for
compatibility.
Backwards Compatibility
This PEP specifies a breaking change with the removal of all the
PyGILState APIs from the public headers of the non-limited C API in
Python 3.20.
Security Implications
This PEP has no known security implications.
How to Teach This
As with all C API functions, all the new APIs in this PEP will be documented
in the C API documentation, ideally under the “Non-Python created threads” section.
The existing PyGILState documentation should be updated accordingly to point
to the new APIs.
Examples
These examples are here to help understand the APIs described in this PEP. They could be reused in the documentation.
Example: A Library Interface
Imagine that you’re developing a C library for logging. You might want to provide an API that allows users to log to a Python file object.
With this PEP, you would implement it like this:
int
LogToPyFile(PyInterpreterView view,
            PyObject *file,
            PyObject *text)
{
    PyInterpreterGuard guard = PyInterpreterGuard_FromView(view);
    if (guard == 0) {
        /* Python interpreter has shut down */
        return -1;
    }
    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        fputs("Cannot call Python.\n", stderr);
        return -1;
    }
    const char *to_write = PyUnicode_AsUTF8(text);
    if (to_write == NULL) {
        // Since the exception may be destroyed upon calling PyThreadState_Release(),
        // print out the exception ourselves.
        PyErr_Print();
        PyThreadState_Release(thread_view);
        PyInterpreterGuard_Close(guard);
        return -1;
    }
    int res = PyFile_WriteString(to_write, file);
    /* to_write is owned by the text object, so it must not be freed here. */
    if (res < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    PyInterpreterGuard_Close(guard);
    return res;
}
Example: A Single-threaded Ensure
This example shows how to acquire a C lock in a Python method implemented in C.
Without a guard, if this were called from a daemon thread, the interpreter could hang the thread while reattaching its thread state, leaving the lock held forever. Any future finalizer that attempts to acquire the lock would then deadlock. Holding an interpreter guard for the duration of the critical section prevents this.
static PyObject *
my_critical_operation(PyObject *self, PyObject *Py_UNUSED(args))
{
    assert(PyThreadState_GetUnchecked() != NULL);
    PyInterpreterGuard guard = PyInterpreterGuard_FromCurrent();
    if (guard == 0) {
        /* Python interpreter has shut down */
        return NULL;
    }
    Py_BEGIN_ALLOW_THREADS;
    acquire_some_lock();
    /* Do something while holding the lock.
       The interpreter won't finalize during this period. */
    // ...
    release_some_lock();
    Py_END_ALLOW_THREADS;
    PyInterpreterGuard_Close(guard);
    Py_RETURN_NONE;
}
Example: Transitioning From the Legacy Functions
The following code uses the PyGILState APIs:
static int
thread_func(void *arg)
{
    PyGILState_STATE gstate = PyGILState_Ensure();
    /* It's not an issue in this example, but we just attached
       a thread state for the main interpreter. If my_method() was
       originally called in a subinterpreter, then we would be unable
       to safely interact with any objects from it. */
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyGILState_Release(gstate);
    return 0;
}
static PyObject *
my_method(PyObject *self, PyObject *unused)
{
    PyThread_handle_t handle;
    PyThread_ident_t ident;
    if (PyThread_start_joinable_thread(thread_func, NULL, &ident, &handle) < 0) {
        return NULL;
    }
    Py_BEGIN_ALLOW_THREADS;
    PyThread_join_thread(handle);
    Py_END_ALLOW_THREADS;
    Py_RETURN_NONE;
}
This is the same code, rewritten to use the new functions:
static int
thread_func(void *arg)
{
    PyInterpreterGuard guard = (PyInterpreterGuard)arg;
    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        return -1;
    }
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    PyInterpreterGuard_Close(guard);
    return 0;
}
static PyObject *
my_method(PyObject *self, PyObject *unused)
{
    PyThread_handle_t handle;
    PyThread_ident_t ident;
    PyInterpreterGuard guard = PyInterpreterGuard_FromCurrent();
    if (guard == 0) {
        return NULL;
    }
    if (PyThread_start_joinable_thread(thread_func, (void *)guard, &ident, &handle) < 0) {
        PyInterpreterGuard_Close(guard);
        return NULL;
    }
    Py_BEGIN_ALLOW_THREADS
    PyThread_join_thread(handle);
    Py_END_ALLOW_THREADS
    Py_RETURN_NONE;
}
Example: A Daemon Thread
With this PEP, daemon threads are very similar to how non-Python threads work
in the C API today. After calling PyThreadState_Ensure(), simply
close the interpreter guard to allow the interpreter to shut down (and
hang the current thread forever).
static int
thread_func(void *arg)
{
    PyInterpreterGuard guard = (PyInterpreterGuard)arg;
    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        return -1;
    }
    /* Close the interpreter guard, allowing it to
       finalize. This means that print(42) can hang this thread. */
    PyInterpreterGuard_Close(guard);
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    return 0;
}
static PyObject *
my_method(PyObject *self, PyObject *unused)
{
    PyThread_handle_t handle;
    PyThread_ident_t ident;
    PyInterpreterGuard guard = PyInterpreterGuard_FromCurrent();
    if (guard == 0) {
        return NULL;
    }
    if (PyThread_start_joinable_thread(thread_func, (void *)guard, &ident, &handle) < 0) {
        PyInterpreterGuard_Close(guard);
        return NULL;
    }
    Py_RETURN_NONE;
}
Example: An Asynchronous Callback
typedef struct {
    PyInterpreterView view;
} ThreadData;
static int
async_callback(void *arg)
{
    ThreadData *tdata = (ThreadData *)arg;
    PyInterpreterView view = tdata->view;
    PyInterpreterGuard guard = PyInterpreterGuard_FromView(view);
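    if (guard == 0) {
        fputs("Python has shut down!\n", stderr);
        // Release the view and its container on failure as well.
        PyInterpreterView_Close(view);
        PyMem_RawFree(tdata);
        return -1;
    }
    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        PyInterpreterView_Close(view);
        PyMem_RawFree(tdata);
        return -1;
    }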
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    PyInterpreterGuard_Close(guard);
    PyInterpreterView_Close(view);
    PyMem_RawFree(tdata);
    return 0;
}
static PyObject *
setup_callback(PyObject *self, PyObject *unused)
{
    // View to the interpreter. It won't wait on the callback
    // to finalize.
    ThreadData *tdata = PyMem_RawMalloc(sizeof(ThreadData));
    if (tdata == NULL) {
        PyErr_NoMemory();
        return NULL;
    }
    PyInterpreterView view = PyInterpreterView_FromCurrent();
    if (view == 0) {
        PyMem_RawFree(tdata);
        return NULL;
    }
    tdata->view = view;
    register_callback(async_callback, tdata);
    Py_RETURN_NONE;
}
Example: Calling Python Without a Callback Parameter
There are a few cases where callback functions don’t take a callback parameter
(void *arg), so it’s difficult to create a guard for any specific
interpreter. The solution to this problem is to create a guard for the main
interpreter through PyUnstable_InterpreterView_FromDefault().
static void
call_python(void)
{
    PyInterpreterView view = PyUnstable_InterpreterView_FromDefault();
    if (view == 0) {
        fputs("Python has shut down.\n", stderr);
        return;
    }
    PyInterpreterGuard guard = PyInterpreterGuard_FromView(view);
    if (guard == 0) {
        fputs("Python has shut down.\n", stderr);
        PyInterpreterView_Close(view);
        return;
    }
    PyThreadView thread_view = PyThreadState_Ensure(guard);
    if (thread_view == 0) {
        PyInterpreterGuard_Close(guard);
        PyInterpreterView_Close(view);
        return;
    }
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    PyInterpreterGuard_Close(guard);
    PyInterpreterView_Close(view);
}
Reference Implementation
A reference implementation of this PEP can be found at python/cpython#133110.
Open Issues
How Should the APIs Fail?
There is some disagreement over how the PyInterpreter[Guard|View] APIs
should indicate a failure to the caller. There are two competing ideas:
- Return -1 to indicate failure, and 0 to indicate success. On success,
functions will assign to a PyInterpreter[Guard|View] pointer passed as an argument.
- Directly return a PyInterpreter[Guard|View], with a value of 0 being equivalent to NULL, indicating failure.
Currently, the PEP spells the latter.
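To make the two conventions concrete, here is a minimal sketch of how each would look for a generic handle-returning function. It only illustrates the calling shapes under discussion; toy_handle, toy_open_out, and toy_open_ret are hypothetical names that appear in neither this PEP nor CPython, and the `available` flag stands in for whatever condition makes the real call fail.

```c
#include <stdint.h>

/* Hypothetical toy model, not CPython code. */
typedef uintptr_t toy_handle;

/* Something for a successful handle to refer to. */
static int toy_resource;

/* Option 1: return a status code and write the handle through an
   out-parameter. The out-parameter is left untouched on failure. */
static int
toy_open_out(int available, toy_handle *out)
{
    if (!available) {
        return -1;
    }
    *out = (toy_handle)&toy_resource;
    return 0;
}

/* Option 2: return the handle directly, with 0 meaning failure. */
static toy_handle
toy_open_ret(int available)
{
    if (!available) {
        return 0;
    }
    return (toy_handle)&toy_resource;
}
```

At the call site, the first style reads `if (toy_open_out(1, &h) < 0)`, which matches most int-returning C API functions, while the second reads `if ((h = toy_open_ret(1)) == 0)`, which matches the examples in this PEP and needs no separate declaration for the result.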
Rejected Ideas
Interpreter Reference Counting
There were two iterations of this proposal that both specified that an interpreter maintain a reference count and would wait for that count to reach zero before shutting down.
The first iteration of this idea did this by adding implicit reference counting
to PyInterpreterState * pointers. A function known as PyInterpreterState_Hold
would increment the reference count (making it a “strong reference”), and
PyInterpreterState_Release would decrement it. An interpreter’s ID (a
standalone int64_t) was used as a form of weak reference, which could be
used to look up an interpreter state and atomically increment its reference
count. These ideas were ultimately rejected because they seemed to make things
very confusing. All existing uses of PyInterpreterState * would be
borrowed, making it difficult for developers to understand which
parts of their code require or use a strong reference.
In response to that pushback, this PEP specified PyInterpreterRef APIs
that would also mimic reference counting, but in a more explicit manner that
made it easier for developers. PyInterpreterRef was analogous to
PyInterpreterGuard in this PEP. Similarly, the older revision included
PyInterpreterWeakRef, which was analogous to PyInterpreterView.
Eventually, the notion of reference counting was completely abandoned from this proposal for a few reasons:
- There was contention over overcomplication in the API design; the reference-counting design looked very similar to HPy’s, which had no precedent in CPython. There was fear that this proposal was being overcomplicated to look more like HPy.
- Unlike traditional reference-counting APIs, acquiring a strong reference to an interpreter could fail at any time, and an interpreter would not be deallocated immediately when its reference count reached zero.
- There was prior discussion about adding “true” reference counting to
interpreters (which would deallocate upon reaching zero), which would have
been very confusing if there was an existing API in CPython titled
PyInterpreterRef that did something different.
Non-daemon Thread States
In earlier revisions of this PEP, interpreter guards were a property of
a thread state rather than a property of an interpreter. This meant that
PyThreadState_Ensure() kept an interpreter guard held, and
it was closed upon calling PyThreadState_Release(). A thread state
that had a guard to an interpreter was known as a “non-daemon thread
state.” At first, this seemed like an improvement because it shifted the
management of a guard’s lifetime to the thread rather than the user, which
eliminated some boilerplate.
However, this ended up making the proposal significantly more complex and hurt the proposal’s goals:
- Most importantly, non-daemon thread states place too much emphasis on daemon threads as the problem, which made the PEP confusing. Additionally, the phrase “non-daemon” added extra confusion, because non-daemon Python threads are explicitly joined. In contrast, a non-daemon C thread is only waited on until it destroys its guard.
- In many cases, an interpreter guard should outlive a singular thread
state. Stealing the interpreter guard in PyThreadState_Ensure() was particularly troublesome for these cases. If PyThreadState_Ensure() didn’t steal a guard with non-daemon thread states, it would muddy the ownership story of the interpreter guard, leading to a more confusing API.
Exposing an Activate/Deactivate API Instead of Ensure/Release
In prior discussions of this API, it was
suggested to provide actual
PyThreadState pointers in the API in an attempt to
make the ownership and lifetime of the thread state more straightforward:
More importantly though, I think this makes it clearer who owns the thread state - a manually created one is controlled by the code that created it, and once it’s deleted it can’t be activated again.
This was ultimately rejected for two reasons:
- The proposed API has closer usage to
PyGILState_Ensure() & PyGILState_Release(), which helps ease the transition for old codebases.
- It’s significantly easier
for code-generators like Cython to use, as there isn’t any additional
complexity with tracking PyThreadState pointers around.
Using PyStatus for the Return Value of PyThreadState_Ensure
In prior iterations of this API, PyThreadState_Ensure() returned a
PyStatus instead of an integer to denote failures, which had the
benefit of providing an error message.
This was rejected because it’s not clear
that an error message would be all that useful; all the conceived use-cases
for this API wouldn’t really care about a message indicating why Python
can’t be invoked. As such, the API would only be needlessly more complex to
use, which in turn would hurt the transition from PyGILState_Ensure().
In addition, PyStatus isn’t commonly used in the C API. A few
functions related to interpreter initialization use it (simply because they
can’t raise exceptions), and PyThreadState_Ensure() does not fall
under that category.
Acknowledgements
This PEP is based on prior work, feedback, and discussions from many people, including Victor Stinner, Antoine Pitrou, David Woods, Sam Gross, Matt Page, Ronald Oussoren, Matt Wozniski, Eric Snow, Steve Dower, Petr Viktorin, Gregory P. Smith, and Alyssa Coghlan.
Copyright
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0788.rst
Last modified: 2025-10-14 11:22:45 GMT