PEP 788 – Protecting the C API from Interpreter Finalization
Author: Peter Bierma <zintensitydev at gmail.com>
Sponsor: Victor Stinner <vstinner at python.org>
Discussions-To: Discourse thread
Status: Draft
Type: Standards Track
Created: 23-Apr-2025
Python-Version: 3.15
Post-History: 10-Mar-2025, 27-Apr-2025, 28-May-2025, 03-Oct-2025
Table of Contents
- Abstract
- Background
- Motivation
- Non-Python Threads Always Hang During Finalization
- Locks in Native Extensions Can Be Unusable During Finalization
- Finalization Behavior for PyGILState_Ensure Cannot Change
- The Term “GIL” Is Tricky for Free-threading
- PyGILState_Ensure Doesn’t Guess the Correct Interpreter
- Subinterpreters Can Concurrently Deallocate
- Rationale
- Specification
- Backwards Compatibility
- Security Implications
- How to Teach This
- Reference Implementation
- Open Issues
- Rejected Ideas
- Acknowledgements
- Copyright
Abstract
This PEP introduces a suite of functions in the C API to safely attach to an interpreter. For example:
static int
thread_function(PyInterpreterView view)
{
    PyInterpreterLock lock = PyInterpreterLock_FromView(view);
    if (lock == 0) {
        return -1;
    }
    PyThreadView thread_view = PyThreadState_Ensure(lock);
    if (thread_view == 0) {
        PyInterpreterLock_Release(lock);
        return -1;
    }
    /* Call Python code, without worrying about the thread hanging due to
       finalization. */
    PyThreadState_Release(thread_view);
    PyInterpreterLock_Release(lock);
    return 0;
}
In addition, the APIs in the PyGILState family are deprecated by this proposal.
Background
In the C API, threads are able to interact with an interpreter by holding an attached thread state for the current thread. This can get complicated when it comes to creating and attaching thread states in a safe manner, because any non-Python thread (one not created via the threading module) is considered to be “daemon”, meaning that the interpreter won’t wait on that thread before shutting down. Instead, the interpreter will hang the thread when it goes to attach a thread state, making the thread unusable past that point.
Attaching a thread state can happen at any point when invoking Python, such as in-between bytecode instructions (to yield the GIL to a different thread), or when a C function exits a Py_BEGIN_ALLOW_THREADS block, so simply guarding against whether the interpreter is finalizing isn’t enough to safely call Python code. (Note that hanging the thread is a relatively new behavior; in older versions, the thread would exit, but the issue is the same.)
Currently, the C API doesn’t have any way to ensure that an interpreter is in a state where it won’t hang a thread when trying to attach. This can be a frustrating issue to deal with in large applications that want to execute Python code alongside some other native code.
In addition, a common pattern among users creating non-Python threads is to use PyGILState_Ensure(), which was introduced in PEP 311. This has been very unfortunate for subinterpreters, because PyGILState_Ensure() tends to create a thread state for the main interpreter instead of the current interpreter. This leads to thread-safety issues when extensions create threads that interact with the Python interpreter, because assumptions about the GIL are incorrect.
Motivation
Non-Python Threads Always Hang During Finalization
Many large libraries might need to call Python code in highly-asynchronous situations where the desired interpreter could be finalizing or deleted, but want to continue running code after invoking the interpreter. This desire has been brought up by users. For example, a callback that wants to call Python code might be invoked when:
- A kernel has finished running on a GPU.
- A network packet was received.
- A thread has quit, and a native library is executing static finalizers of thread local storage.
Generally, this pattern would look something like this:
static void
some_callback(void *closure)
{
    /* Do some work */
    /* ... */
    PyGILState_STATE gstate = PyGILState_Ensure();
    /* Invoke the C API to do some computation */
    PyGILState_Release(gstate);
    /* ... */
}
This means that any non-Python thread may be terminated at any point, which is severely limiting for users who want to do more than just execute Python code in their stream of calls.
Py_IsFinalizing Is Not Atomic
Due to the problem mentioned previously, the docs currently recommend Py_IsFinalizing() to guard against termination of the thread:
Calling this function from a thread when the runtime is finalizing will terminate the thread, even if the thread was not created by Python. You can use Py_IsFinalizing() or sys.is_finalizing() to check if the interpreter is in process of being finalized before calling this function to avoid unwanted termination.
Unfortunately, this doesn’t work reliably, because of time-of-check to time-of-use issues; the interpreter might not be finalizing during the call to Py_IsFinalizing(), but it might start finalizing immediately afterwards, which would cause the attachment of a thread state to hang the thread.
Users have expressed a desire for an atomic way to call Py_IsFinalizing in the past.
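The race can be replayed deterministically in a small model. The sketch below is not CPython code: every race_model_* name is a hypothetical stand-in, the flag stands in for the runtime's finalization state, and the between hook simulates another thread starting finalization in the window between the check and the attach.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime's "finalizing" flag. */
static bool race_model_finalizing = false;

typedef enum {
    RACE_GUARD_REFUSED,  /* the check saw finalization and backed off */
    RACE_ATTACHED,       /* attaching succeeded */
    RACE_WOULD_HANG      /* the check passed, but attaching would hang */
} RaceModelOutcome;

/* Simulates another thread beginning finalization. */
static void
race_model_start_finalization(void)
{
    race_model_finalizing = true;
}

/* Check-then-use: `between` runs in the window right after the check. */
static RaceModelOutcome
race_model_checked_attach(void (*between)(void))
{
    if (race_model_finalizing) {
        return RACE_GUARD_REFUSED;
    }
    if (between != NULL) {
        between();
    }
    /* The real runtime would hang the thread here instead of returning. */
    return race_model_finalizing ? RACE_WOULD_HANG : RACE_ATTACHED;
}

static void
race_model_demo(void)
{
    RaceModelOutcome quiet = race_model_checked_attach(NULL);
    assert(quiet == RACE_ATTACHED);
    RaceModelOutcome raced =
        race_model_checked_attach(race_model_start_finalization);
    assert(raced == RACE_WOULD_HANG);  /* the guard passed, yet the attach hangs */
}
```

In the second call the guard passes and the attach still lands in the hanging case, which is why a separate check cannot replace an operation that checks and attaches as one step.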
Locks in Native Extensions Can Be Unusable During Finalization
When acquiring locks in a native API, it’s common to release the GIL (or critical sections on the free-threaded build) to avoid lock-ordering deadlocks. This can be problematic during finalization, because threads holding locks might be hung. For example:
- A thread goes to acquire a lock, first detaching its thread state to avoid deadlocks.
- The main thread begins finalization and tells all thread states to hang upon attachment.
- The thread acquires the lock it was waiting on, but is then hung by attempting to reattach its thread state via Py_END_ALLOW_THREADS.
- The main thread can no longer acquire the lock, because the thread holding it has been hung.
This affects CPython itself, and there’s not much that can be done to fix it with the current API. For example, python/cpython#129536 remarks that the ssl module can emit a fatal error when used at finalization, because a daemon thread got hung while holding the lock for sys.stderr, and then a finalizer tried to write to it.
Ideally, a thread should be able to temporarily prevent the interpreter
from hanging it while it holds the lock.
Finalization Behavior for PyGILState_Ensure Cannot Change
There will always have to be a point in a Python program where PyGILState_Ensure() can no longer attach a thread state. If the interpreter is long dead, then Python obviously can’t give a thread a way to invoke it. PyGILState_Ensure() doesn’t have any meaningful way to return a failure, so it has no choice but to terminate the thread or emit a fatal error, as noted in python/cpython#124622:
I think a new GIL acquisition and release C API would be needed. The way the existing ones get used in existing C code is not amenible to suddenly bolting an error state onto; none of the existing C code is written that way. After the call they always just assume they have the GIL and can proceed. The API was designed as “it’ll block and only return once it has the GIL” without any other option.
As a result, CPython can’t make any real changes to how PyGILState_Ensure() works during finalization, because it would break existing code.
The Term “GIL” Is Tricky for Free-threading
A large issue with the term “GIL” in the C API is that it is semantically misleading. This was noted in python/cpython#127989, created by the authors of this PEP:
The biggest issue is that for free-threading, there is no GIL, so users erroneously call the C API inside Py_BEGIN_ALLOW_THREADS blocks or omit PyGILState_Ensure in fresh threads.
Again, PyGILState_Ensure() gets an attached thread state for the thread on both with-GIL and free-threaded builds. An attached thread state is always needed to call the C API, so PyGILState_Ensure() still needs to be called on free-threaded builds, but with a name like “ensure GIL”, it’s not immediately clear that that’s true.
PyGILState_Ensure Doesn’t Guess the Correct Interpreter
As noted in the documentation, the PyGILState functions aren’t officially supported in subinterpreters:
Note that the PyGILState_* functions assume there is only one global interpreter (created automatically by Py_Initialize()). Python supports the creation of additional interpreters (using Py_NewInterpreter()), but mixing multiple interpreters and the PyGILState_* API is unsupported.
This is because PyGILState_Ensure() doesn’t have any way to know which interpreter created the thread, and as such, it has to assume that it was the main interpreter. There isn’t any way to detect this at runtime, so spurious races are bound to come up in threads created by subinterpreters, because synchronization for the wrong interpreter will be used on objects shared between the threads.
For example, if the thread had access to object A, which belongs to a subinterpreter, but then called PyGILState_Ensure(), the thread would have an attached thread state pointing to the main interpreter, not the subinterpreter. This means that any GIL assumptions about the object are wrong, because there isn’t any synchronization between the two GILs.
There’s not any great way to solve this, other than introducing a new API that explicitly takes an interpreter from the caller.
Subinterpreters Can Concurrently Deallocate
The other way of creating a non-Python thread, PyThreadState_New() and PyThreadState_Swap(), is a lot better for supporting subinterpreters (because PyThreadState_New() takes an explicit interpreter, rather than assuming that the main interpreter was requested), but is still limited by the current hanging problems in the C API, and is subject to crashes when the subinterpreter finalizes before the thread has a chance to start. This is because in subinterpreters, the PyInterpreterState * structure is allocated on the heap, whereas the main interpreter is statically allocated on the Python runtime state.
Rationale
Preventing Interpreter Shutdown
This PEP takes an approach where an interpreter comes with a locking API that prevents it from shutting down. Holding an interpreter lock will make it safe to call the C API without worrying about the thread being hung.
This means that code interfacing with Python (for example, a C++ library) will need a lock on the interpreter in order to safely call into it, which is less convenient than assuming the main interpreter is the right choice, but there’s not really another option.
This proposal also comes with “views” to an interpreter that can be used to safely poke at an interpreter that may be dead or alive. Using a view, users can acquire an interpreter lock at any point during its lifecycle, and will safely fail if the interpreter can no longer support calling Python code.
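One way to picture why a view can safely outlive its interpreter is to treat it as an identifier that is looked up on each use rather than as a pointer. The sketch below is only a mental model under that assumption: the registry, the view_model_* names, and the id scheme are all hypothetical, it is not how the reference implementation represents views, and it omits the synchronization a real runtime would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIEW_MODEL_SLOTS 4

/* One registry slot per interpreter; id 0 marks a free slot. */
typedef struct {
    uint64_t id;
} ViewModelSlot;

static ViewModelSlot view_model_table[VIEW_MODEL_SLOTS];
static uint64_t view_model_next_id = 1;

/* In this model a view is only an id, never a pointer. 0 means "no view". */
typedef uint64_t ViewModelView;

/* Register a new interpreter and return a view of it, or 0 if full. */
static ViewModelView
view_model_create(void)
{
    for (int i = 0; i < VIEW_MODEL_SLOTS; i++) {
        if (view_model_table[i].id == 0) {
            view_model_table[i].id = view_model_next_id++;
            return view_model_table[i].id;
        }
    }
    return 0;
}

/* Remove the interpreter; outstanding views of it become stale. */
static void
view_model_destroy(ViewModelView view)
{
    for (int i = 0; i < VIEW_MODEL_SLOTS; i++) {
        if (view != 0 && view_model_table[i].id == view) {
            view_model_table[i].id = 0;
        }
    }
}

/* Looking up a stale view never touches freed memory; it simply fails. */
static bool
view_model_is_alive(ViewModelView view)
{
    for (int i = 0; i < VIEW_MODEL_SLOTS; i++) {
        if (view != 0 && view_model_table[i].id == view) {
            return true;
        }
    }
    return false;
}

static void
view_model_demo(void)
{
    ViewModelView a = view_model_create();
    assert(view_model_is_alive(a));
    view_model_destroy(a);
    ViewModelView b = view_model_create();  /* reuses the freed slot */
    assert(!view_model_is_alive(a));        /* the stale view fails safely */
    assert(view_model_is_alive(b));
}
```

Because ids are never reused, a view of a destroyed interpreter cannot be mistaken for a newer interpreter that happens to occupy the same slot.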
Compatibility Shim for PyGILState_Ensure
This proposal comes with PyUnstable_InterpreterView_FromDefault() as a compatibility hack for some users of PyGILState_Ensure(). It is a thread-safe way to acquire a lock to the main (or “default”) interpreter.
The main drawback to porting code to PyThreadState_Ensure() is that it isn’t a drop-in replacement for PyGILState_Ensure(), as it needs an interpreter lock argument. In some large applications, refactoring to use a PyInterpreterLock everywhere might be tricky; so, this function acts as a last resort for users who explicitly want to disallow support for subinterpreters.
Specification
Interpreter Locks
-
type PyInterpreterLock
- An opaque interpreter lock.
By holding an interpreter lock, the caller can know that the interpreter will be in a state where it can safely execute Python code.
This is a special type of “readers-writers” lock; threads may hold an interpreter’s lock concurrently, and the interpreter will have to wait until all threads have released the lock before it can enter finalization.
This type is guaranteed to be pointer-sized.
-
PyInterpreterLock PyInterpreterLock_FromCurrent(void)
- Acquire a lock for the current interpreter.
On success, this function locks the interpreter and returns an opaque reference to the lock, or returns 0 with an exception set on failure.
The caller must hold an attached thread state.
-
PyInterpreterLock PyInterpreterLock_FromView(PyInterpreterView view)
- Acquire a lock to an interpreter through a view.
On success, this function returns a lock to the interpreter denoted by view. The view is still valid after calling this function.
If the interpreter no longer exists or can no longer support calling Python code safely, then this function returns 0 without an exception set.
The caller does not need to hold an attached thread state.
-
PyInterpreterState *PyInterpreterLock_GetInterpreter(PyInterpreterLock lock)
- Return the PyInterpreterState pointer denoted by lock.
This function cannot fail, and the caller doesn’t need to hold an attached thread state.
-
PyInterpreterLock PyInterpreterLock_Copy(PyInterpreterLock lock)
- Duplicate a lock to an interpreter.
On success, this function returns a lock to the interpreter denoted by lock, and returns 0 without an exception set on failure.
The caller does not need to hold an attached thread state.
-
void PyInterpreterLock_Release(PyInterpreterLock lock)
- Release an interpreter’s lock, possibly allowing it to shut down.
This function cannot fail, and the caller doesn’t need to hold an attached thread state.
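The “readers-writers” behavior described for PyInterpreterLock can be modeled with a single atomic counter. This is a sketch under stated assumptions, not the reference implementation: the rwlock_model_* names and the -1 sentinel are hypothetical, and where a real interpreter would wait for outstanding locks, the model only reports whether finalization could begin.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RWLOCK_MODEL_FINALIZING (-1L)

/* holders >= 0: number of outstanding locks; -1: finalization has begun. */
typedef struct {
    atomic_long holders;
} RwlockModelInterp;

/* Take a lock unless finalization has begun. The check and the increment
   happen as one atomic step, so there is no window between them. */
static bool
rwlock_model_acquire(RwlockModelInterp *interp)
{
    long seen = atomic_load(&interp->holders);
    while (seen != RWLOCK_MODEL_FINALIZING) {
        /* On failure, `seen` is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&interp->holders, &seen, seen + 1)) {
            return true;
        }
    }
    return false;
}

static void
rwlock_model_release(RwlockModelInterp *interp)
{
    long before = atomic_fetch_sub(&interp->holders, 1);
    assert(before > 0);
    (void)before;
}

/* Finalization may begin only when no locks are held. */
static bool
rwlock_model_try_finalize(RwlockModelInterp *interp)
{
    long expected = 0;
    return atomic_compare_exchange_strong(&interp->holders, &expected,
                                          RWLOCK_MODEL_FINALIZING);
}

static void
rwlock_model_demo(void)
{
    RwlockModelInterp interp;
    atomic_init(&interp.holders, 0);

    bool first = rwlock_model_acquire(&interp);
    bool second = rwlock_model_acquire(&interp);
    assert(first && second);           /* several holders at once */
    bool early = rwlock_model_try_finalize(&interp);
    assert(!early);                    /* locks are still held */
    rwlock_model_release(&interp);
    rwlock_model_release(&interp);
    bool started = rwlock_model_try_finalize(&interp);
    assert(started);                   /* all locks released */
    bool late = rwlock_model_acquire(&interp);
    assert(!late);                     /* acquiring now fails instead of hanging */
}
```

The last step mirrors the 0 return of PyInterpreterLock_FromView(): once finalization has started, the caller gets a failure it can handle rather than a hung thread.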
Interpreter Views
-
type PyInterpreterView
- An opaque view of an interpreter.
This is a thread-safe way to access an interpreter that may be finalized in another thread.
This type is guaranteed to be pointer-sized.
-
PyInterpreterView PyInterpreterView_FromCurrent(void)
- Create a view to the current interpreter.
This function is generally meant to be used in tandem with PyInterpreterLock_FromView().
On success, this function returns a view to the current interpreter, and returns 0 with an exception set on failure.
The caller must hold an attached thread state.
-
PyInterpreterView PyInterpreterView_Copy(PyInterpreterView view)
- Duplicate a view to an interpreter.
On success, this function returns a non-zero view to the interpreter denoted by view, and returns 0 without an exception set on failure.
The caller does not need to hold an attached thread state.
-
void PyInterpreterView_Close(PyInterpreterView view)
- Delete an interpreter view.
This function cannot fail, and the caller doesn’t need to hold an attached thread state.
-
PyInterpreterView PyUnstable_InterpreterView_FromDefault()
- Create a view for an arbitrary “main” interpreter.
This function only exists for special cases where a specific interpreter can’t be saved.
On success, this function returns a view to the main interpreter, and returns 0 without an exception set on failure.
The caller does not need to hold an attached thread state.
Ensuring and Releasing Thread States
This proposal includes two new high-level threading APIs that intend to replace PyGILState_Ensure() and PyGILState_Release().
-
type PyThreadView
- An opaque view of a thread state.
In this PEP, a thread view comes with no additional properties over a PyThreadState* pointer. APIs for PyThreadView may be added in the future.
This type is guaranteed to be pointer-sized.
-
PyThreadView PyThreadState_Ensure(PyInterpreterLock lock)
- Ensure that the thread has an attached thread state for the interpreter denoted by lock, and thus can safely invoke that interpreter. It is OK to call this function if the thread already has an attached thread state, as long as there is a subsequent call to PyThreadState_Release() that matches this one.
Nested calls to this function will only sometimes create a new thread state. If there is no attached thread state, then this function will check for the most recent attached thread state used by this thread. If none exists or it doesn’t match lock, a new thread state is created. If it does match lock, it is reattached. If there is an attached thread state, then a similar check occurs; if its interpreter matches lock, the existing thread state is used, and otherwise a new thread state is created.
Return a non-zero thread view of the old thread state on success, and 0 on failure.
-
void PyThreadState_Release(PyThreadView view)
- Release a PyThreadState_Ensure() call.
The attached thread state prior to the corresponding PyThreadState_Ensure() call is guaranteed to be restored upon returning. The cached thread state as used by PyThreadState_Ensure() and PyGILState_Ensure() will also be restored.
This function cannot fail.
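The matching rules described for PyThreadState_Ensure() can be condensed into a small decision function. The sketch below only restates those rules; interpreters are represented by plain integer ids, and the ensure_model_* names and the ENSURE_MODEL_NONE marker are hypothetical and do not appear in the proposed API.

```c
#include <assert.h>

#define ENSURE_MODEL_NONE (-1)  /* no thread state; real ids are >= 0 */

typedef enum {
    ENSURE_MODEL_USE_ATTACHED,     /* the attached state already matches */
    ENSURE_MODEL_REATTACH_CACHED,  /* the most recently attached state matches */
    ENSURE_MODEL_CREATE_NEW        /* nothing matches: create a thread state */
} EnsureModelAction;

/* attached_interp: interpreter of the currently attached thread state.
   cached_interp:   interpreter of the most recently attached thread state.
   wanted_interp:   interpreter denoted by the lock passed by the caller. */
static EnsureModelAction
ensure_model_choose(int attached_interp, int cached_interp, int wanted_interp)
{
    if (attached_interp != ENSURE_MODEL_NONE) {
        return attached_interp == wanted_interp
            ? ENSURE_MODEL_USE_ATTACHED
            : ENSURE_MODEL_CREATE_NEW;
    }
    if (cached_interp == wanted_interp) {
        return ENSURE_MODEL_REATTACH_CACHED;
    }
    return ENSURE_MODEL_CREATE_NEW;
}

static void
ensure_model_demo(void)
{
    /* Fresh thread: nothing attached, nothing cached. */
    assert(ensure_model_choose(ENSURE_MODEL_NONE, ENSURE_MODEL_NONE, 0)
           == ENSURE_MODEL_CREATE_NEW);
    /* Detached thread whose last thread state belongs to interpreter 0. */
    assert(ensure_model_choose(ENSURE_MODEL_NONE, 0, 0)
           == ENSURE_MODEL_REATTACH_CACHED);
    /* Nested call for the same interpreter. */
    assert(ensure_model_choose(0, 0, 0) == ENSURE_MODEL_USE_ATTACHED);
    /* Nested call for a different interpreter. */
    assert(ensure_model_choose(0, 0, 1) == ENSURE_MODEL_CREATE_NEW);
}
```

Whichever branch is taken, the matching PyThreadState_Release() call is what restores the thread state that was attached before.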
Deprecation of PyGILState APIs
This PEP deprecates all of the existing PyGILState APIs in favor of the existing and new PyThreadState APIs. Namely:
- PyGILState_Ensure(): use PyThreadState_Ensure() instead.
- PyGILState_Release(): use PyThreadState_Release() instead.
- PyGILState_GetThisThreadState(): use PyThreadState_Get() or PyThreadState_GetUnchecked() instead.
- PyGILState_Check(): use PyThreadState_GetUnchecked() != NULL instead.
All of the PyGILState APIs are to be removed from the non-limited C API in Python 3.20. They will remain available in the stable ABI for compatibility.
Backwards Compatibility
This PEP specifies a breaking change with the removal of all the PyGILState APIs from the public headers of the non-limited C API in Python 3.20.
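Extensions that must build across this removal typically gate code on the interpreter version. The sketch below shows how a version is packed in the layout CPython uses for PY_VERSION_HEX, so that such a comparison can be reasoned about. The version_model_* helpers are hypothetical, and the cutoff is an assumption taken from this draft's stated schedule, which may change.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a version in the PY_VERSION_HEX layout: major, minor, micro,
   release level (0xA alpha, 0xB beta, 0xC candidate, 0xF final), serial. */
static uint32_t
version_model_hex(unsigned major, unsigned minor, unsigned micro,
                  unsigned level, unsigned serial)
{
    return ((uint32_t)major << 24) | ((uint32_t)minor << 16)
         | ((uint32_t)micro << 8) | ((uint32_t)level << 4)
         | (uint32_t)serial;
}

/* Hypothetical policy: treat the PyGILState APIs as gone from the
   non-limited API starting with the first 3.20 pre-release. */
static int
version_model_gilstate_removed(uint32_t version_hex)
{
    return version_hex >= version_model_hex(3, 20, 0, 0xA, 0);
}

static void
version_model_demo(void)
{
    assert(version_model_hex(3, 20, 0, 0xF, 0) == 0x031400F0u);
    assert(!version_model_gilstate_removed(version_model_hex(3, 15, 0, 0xF, 0)));
    assert(version_model_gilstate_removed(version_model_hex(3, 20, 0, 0xF, 0)));
}
```

In real extension code the same comparison would be written against PY_VERSION_HEX in a preprocessor conditional, so that the old calls are not even compiled on newer versions.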
Security Implications
This PEP has no known security implications.
How to Teach This
As with all C API functions, all the new APIs in this PEP will be documented in the C API documentation, ideally under the Non-Python created threads section. The existing PyGILState documentation should be updated accordingly to point to the new APIs.
Examples
These examples are here to help understand the APIs described in this PEP. Ideally, they could be reused in the documentation.
Example: A Library Interface
Imagine that you’re developing a C library for logging. You might want to provide an API that allows users to log to a Python file object.
With this PEP, you would implement it like this:
int
LogToPyFile(PyInterpreterView view,
            PyObject *file,
            PyObject *text)
{
    PyInterpreterLock lock = PyInterpreterLock_FromView(view);
    if (lock == 0) {
        /* Python interpreter has shut down */
        return -1;
    }
    PyThreadView thread_view = PyThreadState_Ensure(lock);
    if (thread_view == 0) {
        PyInterpreterLock_Release(lock);
        fputs("Cannot call Python.\n", stderr);
        return -1;
    }
    const char *to_write = PyUnicode_AsUTF8(text);
    if (to_write == NULL) {
        // Since the exception may be destroyed upon calling PyThreadState_Release(),
        // print out the exception ourself.
        PyErr_Print();
        PyThreadState_Release(thread_view);
        PyInterpreterLock_Release(lock);
        return -1;
    }
    // The buffer returned by PyUnicode_AsUTF8() is owned by text,
    // so it must not be freed here.
    int res = PyFile_WriteString(to_write, file);
    if (res < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    PyInterpreterLock_Release(lock);
    // 0 on success, -1 on failure, matching the early returns above.
    return res;
}
Example: A Single-threaded Ensure
This example shows acquiring a C lock in a Python method.
If this were to be called from a daemon thread, then the interpreter could hang the thread while reattaching the thread state, leaving us with the lock held. Any future finalizer that attempted to acquire the lock would be deadlocked.
static PyObject *
my_critical_operation(PyObject *self, PyObject *Py_UNUSED(args))
{
    assert(PyThreadState_GetUnchecked() != NULL);
    PyInterpreterLock lock = PyInterpreterLock_FromCurrent();
    if (lock == 0) {
        /* Python interpreter has shut down */
        return NULL;
    }
    Py_BEGIN_ALLOW_THREADS;
    acquire_some_lock();
    /* Do something while holding the lock.
       The interpreter won't finalize during this period. */
    // ...
    release_some_lock();
    Py_END_ALLOW_THREADS;
    PyInterpreterLock_Release(lock);
    Py_RETURN_NONE;
}
Example: Transitioning From the Legacy Functions
The following code uses the PyGILState APIs:
static int
thread_func(void *arg)
{
    PyGILState_STATE gstate = PyGILState_Ensure();
    /* It's not an issue in this example, but we just attached
       a thread state for the main interpreter. If my_method() was
       originally called in a subinterpreter, then we would be unable
       to safely interact with any objects from it. */
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyGILState_Release(gstate);
    return 0;
}
static PyObject *
my_method(PyObject *self, PyObject *unused)
{
    PyThread_handle_t handle;
    PyThread_ident_t ident;
    if (PyThread_start_joinable_thread(thread_func, NULL, &ident, &handle) < 0) {
        return NULL;
    }
    Py_BEGIN_ALLOW_THREADS;
    PyThread_join_thread(handle);
    Py_END_ALLOW_THREADS;
    Py_RETURN_NONE;
}
This is the same code, rewritten to use the new functions:
static int
thread_func(void *arg)
{
    PyInterpreterLock interp = (PyInterpreterLock)arg;
    PyThreadView thread_view = PyThreadState_Ensure(interp);
    if (thread_view == 0) {
        PyInterpreterLock_Release(interp);
        return -1;
    }
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    PyInterpreterLock_Release(interp);
    return 0;
}
static PyObject *
my_method(PyObject *self, PyObject *unused)
{
    PyThread_handle_t handle;
    PyThread_ident_t ident;
    PyInterpreterLock lock = PyInterpreterLock_FromCurrent();
    if (lock == 0) {
        return NULL;
    }
    /* On success, the new thread owns the lock and releases it. */
    if (PyThread_start_joinable_thread(thread_func, (void *)lock, &ident, &handle) < 0) {
        PyInterpreterLock_Release(lock);
        return NULL;
    }
    Py_BEGIN_ALLOW_THREADS
    PyThread_join_thread(handle);
    Py_END_ALLOW_THREADS
    Py_RETURN_NONE;
}
Example: A Daemon Thread
With this PEP, daemon threads are very similar to how non-Python threads work in the C API today. After calling PyThreadState_Ensure(), simply release the interpreter lock to allow the interpreter to shut down (and hang the current thread forever).
static int
thread_func(void *arg)
{
    PyInterpreterLock lock = (PyInterpreterLock)arg;
    PyThreadView thread_view = PyThreadState_Ensure(lock);
    if (thread_view == 0) {
        PyInterpreterLock_Release(lock);
        return -1;
    }
    /* Release the interpreter lock, allowing it to
       finalize. This means that print(42) can hang this thread. */
    PyInterpreterLock_Release(lock);
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    return 0;
}
static PyObject *
my_method(PyObject *self, PyObject *unused)
{
    PyThread_handle_t handle;
    PyThread_ident_t ident;
    PyInterpreterLock lock = PyInterpreterLock_FromCurrent();
    if (lock == 0) {
        return NULL;
    }
    if (PyThread_start_joinable_thread(thread_func, (void *)lock, &ident, &handle) < 0) {
        PyInterpreterLock_Release(lock);
        return NULL;
    }
    Py_RETURN_NONE;
}
Example: An Asynchronous Callback
typedef struct {
    PyInterpreterView view;
} ThreadData;
static int
async_callback(void *arg)
{
    ThreadData *tdata = (ThreadData *)arg;
    PyInterpreterView view = tdata->view;
    PyInterpreterLock lock = PyInterpreterLock_FromView(view);
    if (lock == 0) {
        fputs("Python has shut down!\n", stderr);
        // The callback runs once, so clean up on failure too.
        PyInterpreterView_Close(view);
        PyMem_RawFree(tdata);
        return -1;
    }
    PyThreadView thread_view = PyThreadState_Ensure(lock);
    if (thread_view == 0) {
        PyInterpreterLock_Release(lock);
        PyInterpreterView_Close(view);
        PyMem_RawFree(tdata);
        return -1;
    }
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    PyInterpreterLock_Release(lock);
    PyInterpreterView_Close(view);
    PyMem_RawFree(tdata);
    return 0;
}
static PyObject *
setup_callback(PyObject *self, PyObject *unused)
{
    // View to the interpreter. It won't wait on the callback
    // to finalize.
    ThreadData *tdata = PyMem_RawMalloc(sizeof(ThreadData));
    if (tdata == NULL) {
        PyErr_NoMemory();
        return NULL;
    }
    PyInterpreterView view = PyInterpreterView_FromCurrent();
    if (view == 0) {
        PyMem_RawFree(tdata);
        return NULL;
    }
    tdata->view = view;
    register_callback(async_callback, tdata);
    Py_RETURN_NONE;
}
Example: Calling Python Without a Callback Parameter
There are a few cases where callback functions don’t take a callback parameter (void *arg), so it’s difficult to acquire a lock to any specific interpreter. The solution to this problem is to acquire a lock to the main interpreter through PyUnstable_InterpreterView_FromDefault().
static void
call_python(void)
{
    PyInterpreterView view = PyUnstable_InterpreterView_FromDefault();
    if (view == 0) {
        fputs("Python has shut down.", stderr);
        return;
    }
    PyInterpreterLock lock = PyInterpreterLock_FromView(view);
    if (lock == 0) {
        PyInterpreterView_Close(view);
        fputs("Python has shut down.", stderr);
        return;
    }
    PyThreadView thread_view = PyThreadState_Ensure(lock);
    if (thread_view == 0) {
        PyInterpreterLock_Release(lock);
        PyInterpreterView_Close(view);
        return;
    }
    if (PyRun_SimpleString("print(42)") < 0) {
        PyErr_Print();
    }
    PyThreadState_Release(thread_view);
    PyInterpreterLock_Release(lock);
    PyInterpreterView_Close(view);
}
Reference Implementation
A reference implementation of this PEP can be found at python/cpython#133110.
Open Issues
How Should the APIs Fail?
There is a bit of disagreement on how the PyInterpreter[Lock|View] APIs should indicate a failure to the caller. There are two competing ideas:
- Return -1 to indicate failure, and 0 to indicate success. On success, functions will assign to a PyInterpreter[Lock|View] pointer passed as an argument.
- Directly return a PyInterpreter[Lock|View], with a value of 0 being equivalent to NULL, indicating failure.
Currently, the PEP spells the latter.
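The two conventions can be compared side by side. The sketch below uses hypothetical fail_model_* names and a placeholder handle value of 1; neither function is part of the proposed API, and the bool parameter merely stands in for whether the interpreter can still run Python code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pointer-sized handle; 0 is never a valid handle. */
typedef uintptr_t FailModelHandle;

/* First idea: status code plus out-parameter. */
static int
fail_model_acquire_status(bool interp_alive, FailModelHandle *out)
{
    if (!interp_alive) {
        return -1;                   /* *out is left untouched */
    }
    *out = (FailModelHandle)1;       /* placeholder for a real handle */
    return 0;
}

/* Second idea (what the PEP currently spells): return the handle
   directly, with 0 signalling failure. */
static FailModelHandle
fail_model_acquire_direct(bool interp_alive)
{
    return interp_alive ? (FailModelHandle)1 : (FailModelHandle)0;
}

static void
fail_model_demo(void)
{
    FailModelHandle handle = 0;
    int status = fail_model_acquire_status(true, &handle);
    assert(status == 0 && handle != 0);

    handle = 0;
    status = fail_model_acquire_status(false, &handle);
    assert(status == -1 && handle == 0);

    assert(fail_model_acquire_direct(true) != 0);
    assert(fail_model_acquire_direct(false) == 0);
}
```

The direct-return style keeps call sites to one statement plus one comparison, while the status style separates the error channel from the handle value.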
Rejected Ideas
Interpreter Reference Counting
There were two iterations of this proposal that both specified an interpreter to have a reference count, and the interpreter would wait for that reference count to hit zero before shutting down.
The first iteration of this idea did this by adding implicit reference counting to PyInterpreterState * pointers. A function known as PyInterpreterState_Hold would increment the reference count (making it a “strong reference”), and PyInterpreterState_Release would decrement it. An interpreter’s ID (a standalone int64_t) was used as a form of weak reference, which could be used to look up an interpreter state and atomically increment its reference count. These ideas were ultimately rejected because they seemed to make things very confusing: all existing uses of PyInterpreterState * would be borrowed, which would make it difficult for developers to understand which areas of their code required/used a strong reference.
In response to that pushback, this PEP specified PyInterpreterRef APIs that would also mimic reference counting, but in a more explicit manner that made things easier for developers. PyInterpreterRef was analogous to PyInterpreterLock in this PEP. Similarly, the older revision included PyInterpreterWeakRef, which was analogous to PyInterpreterView.
Eventually, the notion of reference counting was completely abandoned from this proposal for a few reasons:
- There was contention about overcomplication in the API design; the reference counting design looked very similar to that of HPy, which had no precedent in CPython. There was fear that this proposal was being overcomplicated to look more like HPy.
- Unlike traditional reference counting APIs, acquiring a strong reference to an interpreter could arbitrarily fail, and an interpreter would not immediately deallocate when its reference count reached zero.
- There was prior discussion about adding “true” reference counting to interpreters (which would deallocate upon reaching zero), which would have been very confusing if there was an existing API in CPython titled PyInterpreterRef that did something different.
Non-daemon Thread States
In earlier revisions of this PEP, interpreter locks were a property of a thread state rather than a property of an interpreter. This meant that PyThreadState_Ensure() kept an interpreter lock held, and it was released upon calling PyThreadState_Release(). A thread state that held a lock to an interpreter was known as a “non-daemon thread state.” At first, this seemed like an improvement, because it shifted management of a lock’s lifetime to the thread instead of the user, which eliminated some boilerplate.
However, this ended up making the proposal significantly more complex and hurt the proposal’s goals:
- Most importantly, non-daemon thread states put too much emphasis on daemon threads as the problem, which hurt the clarity of the PEP. Additionally, the phrase “non-daemon” added extra confusion, because non-daemon Python threads are explicitly joined, whereas a non-daemon C thread is only waited on until it releases its lock.
- In many cases, an interpreter lock should outlive a singular thread state. Stealing the interpreter lock in PyThreadState_Ensure() was particularly troublesome for these cases. If PyThreadState_Ensure() didn’t steal a lock with non-daemon thread states, it would muddy the ownership story of the interpreter lock, leading to a more confusing API.
Exposing an Activate/Deactivate API Instead of Ensure/Clear
In prior discussions of this API, it was suggested to provide actual PyThreadState pointers in the API in an attempt to make the ownership and lifetime of the thread state clearer:
More importantly though, I think this makes it clearer who owns the thread state - a manually created one is controlled by the code that created it, and once it’s deleted it can’t be activated again.
This was ultimately rejected for two reasons:
- The proposed API has closer usage to PyGILState_Ensure() & PyGILState_Release(), which helps ease the transition for old codebases.
- It’s significantly easier for code-generators like Cython to use, as there isn’t any additional complexity with tracking PyThreadState pointers around.
Using PyStatus for the Return Value of PyThreadState_Ensure
In prior iterations of this API, PyThreadState_Ensure() returned a PyStatus instead of an integer to denote failures, which had the benefit of providing an error message.
This was rejected because it’s not clear that an error message would be all that useful; all the conceived use-cases for this API wouldn’t really care about a message indicating why Python can’t be invoked. As such, the API would only be needlessly harder to use, which in turn would hurt the transition from PyGILState_Ensure().
In addition, PyStatus isn’t commonly used in the C API. A few functions related to interpreter initialization use it (simply because they can’t raise exceptions), and PyThreadState_Ensure() does not fall under that category.
Acknowledgements
This PEP is based on prior work, feedback, and discussions from many people, including Victor Stinner, Antoine Pitrou, David Woods, Sam Gross, Matt Page, Ronald Oussoren, Matt Wozniski, Eric Snow, Steve Dower, Petr Viktorin, Gregory P. Smith, and Alyssa Coghlan.
Copyright
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0788.rst
Last modified: 2025-10-04 14:30:43 GMT