PEP 734 – Multiple Interpreters in the Stdlib

Author:
Eric Snow <ericsnowcurrently at gmail.com>
Discussions-To:
Discourse thread
Status:
Deferred
Type:
Standards Track
Created:
06-Nov-2023
Python-Version:
3.13
Post-History:
14-Dec-2023
Replaces:
554
Resolution:
Discourse message

Note

This PEP is essentially a continuation of PEP 554. That document had grown a lot of ancillary information across 7 years of discussion. This PEP is a reduction back to the essential information. Much of that extra information is still valid and useful, just not in the immediate context of the specific proposal here.

Abstract

This PEP proposes to add a new module, interpreters, to support inspecting, creating, and running code in multiple interpreters in the current process. This includes Interpreter objects that represent the underlying interpreters. The module will also provide a basic Queue class for communication between interpreters. Finally, we will add a new concurrent.futures.InterpreterPoolExecutor based on the interpreters module.

Introduction

Fundamentally, an “interpreter” is the collection of (essentially) all runtime state which Python threads must share. So, let’s first look at threads. Then we’ll circle back to interpreters.

Threads and Thread States

A Python process will have one or more OS threads running Python code (or otherwise interacting with the C API). Each of these threads interacts with the CPython runtime using its own thread state (PyThreadState), which holds all the runtime state unique to that thread. There is also some runtime state that is shared between multiple OS threads.

Any OS thread may switch which thread state it is currently using, as long as it isn’t one that another OS thread is already using (or has been using). This “current” thread state is stored by the runtime in a thread-local variable, and may be looked up explicitly with PyThreadState_Get(). It gets set automatically for the initial (“main”) OS thread and for threading.Thread objects. From the C API it is set (and cleared) by PyThreadState_Swap() and may be set by PyGILState_Ensure(). Most of the C API requires that there be a current thread state, either looked up implicitly or passed in as an argument.

The relationship between OS threads and thread states is one-to-many. Each thread state is associated with at most a single OS thread and records its thread ID. A thread state is never used for more than one OS thread. In the other direction, however, an OS thread may have more than one thread state associated with it, though, again, only one may be current.

When there’s more than one thread state for an OS thread, PyThreadState_Swap() is used in that OS thread to switch between them, with the requested thread state becoming the current one. Whatever was running in the thread using the old thread state is effectively paused until that thread state is swapped back in.

Interpreter States

As noted earlier, there is some runtime state that multiple OS threads share. Some of it is exposed by the sys module, though much is used internally and not exposed explicitly or only through the C API.

This shared state is called the interpreter state (PyInterpreterState). We’ll sometimes refer to it here as just “interpreter”, though that is also sometimes used to refer to the python executable, to the Python implementation, and to the bytecode interpreter (i.e. exec()/eval()).

CPython has supported multiple interpreters in the same process (AKA “subinterpreters”) since version 1.5 (1997). The feature has been available via the C API.

Interpreters and Threads

Thread states are related to interpreter states in much the same way that OS threads and processes are related (at a high level). To begin with, the relationship is one-to-many. A thread state belongs to a single interpreter (and stores a pointer to it). That thread state is never used for a different interpreter. In the other direction, however, an interpreter may have zero or more thread states associated with it. The interpreter is only considered active in OS threads where one of its thread states is current.

Interpreters are created via the C API using Py_NewInterpreterFromConfig() (or Py_NewInterpreter(), which is a light wrapper around Py_NewInterpreterFromConfig()). That function does the following:

  1. create a new interpreter state
  2. create a new thread state
  3. set the thread state as current (a current tstate is needed for interpreter init)
  4. initialize the interpreter state using that thread state
  5. return the thread state (still current)

Note that the returned thread state may be immediately discarded. There is no requirement that an interpreter have any thread states, except as soon as the interpreter is meant to actually be used. At that point it must be made active in the current OS thread.

To make an existing interpreter active in the current OS thread, the C API user first makes sure that interpreter has a corresponding thread state. Then PyThreadState_Swap() is called like normal using that thread state. If the thread state for another interpreter was already current then it gets swapped out like normal and execution of that interpreter in the OS thread is thus effectively paused until it is swapped back in.

Once an interpreter is active in the current OS thread like that, the thread can call any of the C API, such as PyEval_EvalCode() (i.e. exec()). This works by using the current thread state as the runtime context.

The “Main” Interpreter

When a Python process starts, it creates a single interpreter state (the “main” interpreter) with a single thread state for the current OS thread. The Python runtime is then initialized using them.

After initialization, the script or module or REPL is executed using them. That execution happens in the interpreter’s __main__ module.

When the process finishes running the requested Python code or REPL, in the main OS thread, the Python runtime is finalized in that thread using the main interpreter.

Runtime finalization has only a slight, indirect effect on still-running Python threads, whether in the main interpreter or in subinterpreters. That’s because right away it waits indefinitely for all non-daemon Python threads to finish.

While the C API may be queried, there is no mechanism by which any Python thread is directly alerted that finalization has begun, other than perhaps with “atexit” functions that may have been registered using threading._register_atexit().

Any remaining subinterpreters are themselves finalized later, but at that point they aren’t current in any OS threads.

Interpreter Isolation

CPython’s interpreters are intended to be strictly isolated from each other. That means interpreters never share objects (except in very specific cases with immortal, immutable builtin objects). Each interpreter has its own modules (sys.modules), classes, functions, and variables. Even where two interpreters define the same class, each will have its own copy. The same applies to state in C, including in extension modules. The CPython C API docs explain more.

Notably, there is some process-global state that interpreters will always share, some mutable and some immutable. Sharing immutable state presents few problems, while providing some benefits (mainly performance). However, all shared mutable state requires special management, particularly for thread-safety, some of which the OS takes care of for us.

Mutable:

  • file descriptors
  • low-level env vars
  • process memory (though allocators are isolated)
  • the list of interpreters

Immutable:

  • builtin types (e.g. dict, bytes)
  • singletons (e.g. None)
  • underlying static module data (e.g. functions) for builtin/extension/frozen modules

Existing Execution Components

There are a number of existing parts of Python that may help with understanding how code may be run in a subinterpreter.

In CPython, each component is built around one of the following C API functions (or variants):

  • PyEval_EvalCode(): run the bytecode interpreter with the given code object
  • PyRun_String(): compile + PyEval_EvalCode()
  • PyRun_File(): read + compile + PyEval_EvalCode()
  • PyRun_InteractiveOneObject(): compile + PyEval_EvalCode()
  • PyObject_Call(): calls PyEval_EvalCode()

builtins.exec()

The builtin exec() may be used to execute Python code. It is essentially a wrapper around the C API functions PyRun_String() and PyEval_EvalCode().

Here are some relevant characteristics of the builtin exec():

  • It runs in the current OS thread and pauses whatever was running there, which resumes when exec() finishes. No other OS threads are affected. (To avoid pausing the current Python thread, run exec() in a threading.Thread.)
  • It may start additional threads, which don’t interrupt it.
  • It executes against a “globals” namespace (and a “locals” namespace). At module-level, exec() defaults to using __dict__ of the current module (i.e. globals()). exec() uses that namespace as-is and does not clear it before or after.
  • It propagates any uncaught exception from the code it ran. The exception is raised from the exec() call in the Python thread that originally called exec().
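
The namespace and exception behaviour described above can be sketched with nothing but the builtin exec(). The name ns and the snippet strings are illustrative only and are not part of the proposal:

```python
# Illustrative name: "ns" stands in for a module's __dict__.
ns = {}

# exec() uses the namespace as-is and does not clear it afterwards,
# so the second call sees what the first one defined.
exec("counter = 1", ns)
exec("counter += 1", ns)
final_count = ns["counter"]

# An uncaught exception propagates out of the exec() call itself,
# in the thread that called exec().
try:
    exec("raise ValueError('boom')", ns)
except ValueError as exc:
    caught = str(exc)
else:
    caught = None

print(final_count, caught)
```

Interpreter.exec(), specified later, is described in terms of this same behaviour, applied to another interpreter's __main__ module.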

Command-line

The python CLI provides several ways to run Python code. In each case it maps to a corresponding C API call:

  • <no args>, -i - run the REPL (PyRun_InteractiveOneObject())
  • <filename> - run a script (PyRun_File())
  • -c <code> - run the given Python code (PyRun_String())
  • -m module - run the module as a script (PyEval_EvalCode() via runpy._run_module_as_main())

In each case it is essentially a variant of running exec() at the top-level of the __main__ module of the main interpreter.

threading.Thread

When a Python thread is started, it runs the “target” function with PyObject_Call() using a new thread state. The globals namespace comes from func.__globals__ and any uncaught exception is discarded.
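
As a small illustration of the exception handling, the sketch below shows that an exception raised in a thread's target never reaches the code that started the thread. It relies on threading.excepthook (available since Python 3.8), which by default just prints the error; the names seen, record, and target are illustrative:

```python
import threading

seen = []  # exception types observed by the hook (illustrative)

def record(args):
    # threading.excepthook receives an object with exc_type, exc_value,
    # exc_traceback and thread attributes.
    seen.append(args.exc_type)

def target():
    raise RuntimeError("never reaches the starting thread")

previous_hook = threading.excepthook
threading.excepthook = record
try:
    t = threading.Thread(target=target)
    t.start()
    t.join()  # returns normally; nothing is re-raised here
    joined_cleanly = True
finally:
    threading.excepthook = previous_hook

print(seen, joined_cleanly)
```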

Motivation

The interpreters module will provide a high-level interface to the multiple interpreter functionality. The goal is to make the existing multiple-interpreters feature of CPython more easily accessible to Python code. This is particularly relevant now that CPython has a per-interpreter GIL (PEP 684) and people are more interested in using multiple interpreters.

Without a stdlib module, users are limited to the C API, which restricts how much they can try out and take advantage of multiple interpreters.

The module will include a basic mechanism for communicating between interpreters. Without one, multiple interpreters are a much less useful feature.

Specification

The module will:

  • expose the existing multiple interpreter support
  • introduce a basic mechanism for communicating between interpreters

The module will wrap a new low-level _interpreters module (in the same way as the threading module). However, that low-level API is not intended for public use and thus not part of this proposal.

Using Interpreters

The module defines the following functions:

  • get_current() -> Interpreter
    Returns the Interpreter object for the currently executing interpreter.
  • list_all() -> list[Interpreter]
    Returns the Interpreter object for each existing interpreter, whether it is currently running in any OS threads or not.
  • create() -> Interpreter
    Create a new interpreter and return the Interpreter object for it. The interpreter doesn’t do anything on its own and is not inherently tied to any OS thread. That only happens when something is actually run in the interpreter (e.g. Interpreter.exec()), and only while running. The interpreter may or may not have thread states ready to use, but that is strictly an internal implementation detail.
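
A hedged sketch of the three functions follows. It assumes the module can be imported as concurrent.interpreters, which is where Python 3.14 ships it; this text calls it simply interpreters. On builds without the module the helper returns None. The helper name snapshot_interpreters is hypothetical:

```python
def snapshot_interpreters():
    """Return (current_id, new_id, all_ids), or None if unsupported."""
    try:
        # Assumption: import location used by Python 3.14; the text
        # above names the module "interpreters".
        from concurrent import interpreters
    except ImportError:
        return None
    current = interpreters.get_current()
    interp = interpreters.create()  # exists, but runs nothing yet
    try:
        all_ids = [i.id for i in interpreters.list_all()]
        return current.id, interp.id, all_ids
    finally:
        interp.close()

info = snapshot_interpreters()
if info is not None:
    current_id, new_id, all_ids = info
    print(current_id in all_ids, new_id in all_ids)
```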

Interpreter Objects

An interpreters.Interpreter object represents the interpreter (PyInterpreterState) with the corresponding unique ID. There will only be one object for any given interpreter.

If the interpreter was created with interpreters.create() then it will be destroyed as soon as all Interpreter objects with its ID (across all interpreters) have been deleted.

Interpreter objects may represent other interpreters than those created by interpreters.create(). Examples include the main interpreter (created by Python’s runtime initialization) and those created via the C-API, using Py_NewInterpreter(). Such Interpreter objects will not be able to interact with their corresponding interpreters, e.g. via Interpreter.exec() (though we may relax this in the future).

Attributes and methods:

  • id
    (read-only) A non-negative int that identifies the interpreter that this Interpreter instance represents. Conceptually, this is similar to a process ID.
  • __hash__()
    Returns the hash of the interpreter’s id. This is the same as the hash of the ID’s integer value.
  • is_running() -> bool
    Returns True if the interpreter is currently executing code in its __main__ module. This excludes sub-threads.

    This refers only to whether there is an OS thread running a script (code) in the interpreter’s __main__ module, which basically means whether Interpreter.exec() is running in some OS thread. Code running in sub-threads is ignored.

  • prepare_main(**kwargs)
    Bind one or more objects in the interpreter’s __main__ module.

    The keyword argument names will be used as the attribute names. The values will be bound as new objects, though exactly equivalent to the original. Only objects specifically supported for passing between interpreters are allowed. See Shareable Objects.

    prepare_main() is helpful for initializing the globals for an interpreter before running code in it.

  • exec(code, /)
    Execute the given source code in the interpreter (in the current OS thread), using its __main__ module. It doesn’t return anything.

    This is essentially equivalent to switching to this interpreter in the current OS thread and then calling the builtin exec() using this interpreter’s __main__ module’s __dict__ as the globals and locals.

    The code running in the current OS thread (a different interpreter) is effectively paused until Interpreter.exec() finishes. To avoid pausing it, create a new threading.Thread and call Interpreter.exec() in it (like Interpreter.call_in_thread() does).

    Interpreter.exec() does not reset the interpreter’s state nor the __main__ module, neither before nor after, so each successive call picks up where the last one left off. This can be useful for running some code to initialize an interpreter (e.g. with imports) before later performing some repeated task.

    If there is an uncaught exception, it will be propagated into the calling interpreter as an ExecutionFailed. The full error display of the original exception, generated relative to the called interpreter, is preserved on the propagated ExecutionFailed. That includes the full traceback, with all the extra info like syntax error details and chained exceptions. If the ExecutionFailed is not caught then that full error display will be shown, much like it would be if the propagated exception had been raised in the main interpreter and uncaught. Having the full traceback is particularly useful when debugging.

    If exception propagation is not desired then an explicit try-except should be used around the code passed to Interpreter.exec(). Likewise any error handling that depends on specific information from the exception must use an explicit try-except around the given code, since ExecutionFailed will not preserve that information.

  • call(callable, /)
    Call the callable object in the interpreter. The return value is discarded. If the callable raises an exception then it gets propagated as an ExecutionFailed exception, in the same way as Interpreter.exec().

    For now only plain functions are supported and only ones that take no arguments and have no cell vars. Free globals are resolved against the target interpreter’s __main__ module.

    In the future, we can add support for arguments, closures, and a broader variety of callables, at least partly via pickle. We can also consider not discarding the return value. The initial restrictions are in place to allow us to get the basic functionality of the module out to users sooner.

  • call_in_thread(callable, /) -> threading.Thread
    Essentially, apply Interpreter.call() in a new thread. Return values are discarded and exceptions are not propagated.

    call_in_thread() is roughly equivalent to:

    def task():
        interp.call(func)
    t = threading.Thread(target=task)
    t.start()
    
  • close()
    Destroy the underlying interpreter.
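
The following sketch exercises prepare_main(), exec(), and the ExecutionFailed propagation described above. It assumes the module is importable as concurrent.interpreters (the Python 3.14 location, not the name used in this text) and returns None where it is unavailable. The helper name run_snippets and the snippet strings are hypothetical:

```python
def run_snippets():
    """Return the name of the propagated error type, or None if unsupported."""
    try:
        # Assumption: Python 3.14 import location.
        from concurrent import interpreters
    except ImportError:
        return None
    interp = interpreters.create()
    try:
        # Bind a shareable object (an int) in the interpreter's __main__.
        interp.prepare_main(base=40)
        # Successive exec() calls share __main__, so state carries over.
        interp.exec("total = base + 2")
        interp.exec("assert total == 42")
        try:
            interp.exec("1 / 0")
        except interpreters.ExecutionFailed as exc:
            # The original ZeroDivisionError is not re-raised as such.
            return type(exc).__name__
        return "no error"
    finally:
        interp.close()

outcome = run_snippets()
print(outcome)
```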

Communicating Between Interpreters

The module introduces a basic communication mechanism through special queues.

There are interpreters.Queue objects, but they only proxy the actual data structure: an unbounded FIFO queue that exists outside any one interpreter. These queues have special accommodations for safely passing object data between interpreters, without violating interpreter isolation. This includes thread-safety.

As with other queues in Python, for each “put” the object is added to the back and each “get” pops the next one off the front. Every added object will be popped off in the order it was pushed on.

Only objects that are specifically supported for passing between interpreters may be sent through an interpreters.Queue. Note that the actual objects aren’t sent, but rather their underlying data. However, the popped object will still be strictly equivalent to the original. See Shareable Objects.

The module defines the following functions:

  • create_queue(maxsize=0, *, syncobj=False) -> Queue
    Create a new queue. If the maxsize is zero or negative then the queue is unbounded.

    “syncobj” is used as the default for put() and put_nowait().

Queue Objects

interpreters.Queue objects act as proxies for the underlying cross-interpreter-safe queues exposed by the interpreters module. Each Queue object represents the queue with the corresponding unique ID. There will only be one object for any given queue.

Queue implements all the methods of queue.Queue except for task_done() and join(), hence it is similar to asyncio.Queue and multiprocessing.Queue.

Attributes and methods:

  • id
    (read-only) A non-negative int that identifies the corresponding cross-interpreter queue. Conceptually, this is similar to the file descriptor used for a pipe.
  • maxsize
    (read-only) Number of items allowed in the queue. Zero means “unbounded”.
  • __hash__()
    Return the hash of the queue’s id. This is the same as the hash of the ID’s integer value.
  • empty()
    Return True if the queue is empty, False otherwise.

    This is only a snapshot of the state at the time of the call. Other threads or interpreters may cause this to change.

  • full()
    Return True if there are maxsize items in the queue.

    If the queue was initialized with maxsize=0 (the default), then full() never returns True.

    This is only a snapshot of the state at the time of the call. Other threads or interpreters may cause this to change.

  • qsize()
    Return the number of items in the queue.

    This is only a snapshot of the state at the time of the call. Other threads or interpreters may cause this to change.

  • put(obj, timeout=None, *, syncobj=None)
    Add the object to the queue.

    If maxsize > 0 and the queue is full then this blocks until a free slot is available. If timeout is a positive number then it blocks for at most that many seconds and then raises interpreters.QueueFull. Otherwise it blocks forever.

    If “syncobj” is true then the object must be shareable, which means the object’s data is passed through rather than the object itself. If “syncobj” is false then all objects are supported. However, there are some performance penalties and all objects are copies (e.g. via pickle). Thus mutable objects will never be automatically synchronized between interpreters. If “syncobj” is None (the default) then the queue’s default value is used.

    If an object is still in the queue, and the interpreter which put it in the queue (i.e. to which it belongs) is destroyed, then the object is immediately removed from the queue. (We may later add an option to replace the removed object in the queue with a sentinel or to raise an exception for the corresponding get() call.)

  • put_nowait(obj, *, syncobj=None)
    Like put() but effectively with an immediate timeout. Thus if the queue is full, it immediately raises interpreters.QueueFull.
  • get(timeout=None) -> object
    Pop the next object from the queue and return it. Block while the queue is empty. If a positive timeout is provided and an object hasn’t been added to the queue in that many seconds then raise interpreters.QueueEmpty.
  • get_nowait() -> object
    Like get(), but do not block. If the queue is not empty then return the next item. Otherwise, raise interpreters.QueueEmpty.
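
Because the method set mirrors queue.Queue, the blocking and timeout behaviour can be tried out with the stdlib class as a stand-in. This sketch uses queue.Queue, queue.Full, and queue.Empty, not interpreters.Queue, and stays within a single interpreter:

```python
import queue

# Stand-in: the stdlib queue.Queue, NOT interpreters.Queue.
q = queue.Queue(maxsize=2)
q.put("first")
q.put("second")
is_full = q.full()

# A bounded, full queue makes put() block; with a timeout it gives up.
try:
    q.put("third", timeout=0.05)
    overflowed = False
except queue.Full:
    overflowed = True

# Items come back in the order they were added (FIFO).
order = [q.get(), q.get()]

# get_nowait() on an empty queue raises immediately.
try:
    q.get_nowait()
    drained = False
except queue.Empty:
    drained = True

print(is_full, overflowed, order, drained)
```

Since QueueEmpty is specified as a subclass of the stdlib queue.Empty, an except clause written for the stdlib exception should also catch the cross-interpreter one.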

Shareable Objects

Interpreter.prepare_main() only works with “shareable” objects. The same goes for interpreters.Queue (optionally).

A “shareable” object is one which may be passed from one interpreter to another. The object is not necessarily actually directly shared by the interpreters. However, even if it isn’t, the shared object should be treated as though it were shared directly. That’s a strong equivalence guarantee for all shareable objects. (See below.)

For some types (builtin singletons), the actual object is shared. For some, the object’s underlying data is actually shared but each interpreter has a distinct object wrapping that data. For all other shareable types, a strict copy or proxy is made such that the corresponding objects continue to match exactly. In cases where the underlying data is complex but must be copied (e.g. tuple), the data is serialized as efficiently as possible.

Shareable objects must be specifically supported internally by the Python runtime. However, there is no restriction against adding support for more types later.

Here’s the initial list of supported objects:

  • str
  • bytes
  • int
  • float
  • bool (True/False)
  • None
  • tuple (only with shareable items)
  • interpreters.Queue
  • memoryview (underlying buffer actually shared)

Note that the last two on the list, queues and memoryview, are technically mutable data types, whereas the rest are not. When any interpreters share mutable data there is always a risk of data races. Cross-interpreter safety, including thread-safety, is a fundamental feature of queues.

However, memoryview does not have any native accommodations. The user is responsible for managing thread-safety, whether passing a token back and forth through a queue to indicate safety (see Synchronization), or by assigning sub-range exclusivity to individual interpreters.

Most objects will be shared through queues (interpreters.Queue), as interpreters communicate information between each other. Less frequently, objects will be shared through prepare_main() to set up an interpreter prior to running code in it. However, prepare_main() is the primary way that queues are shared, to provide another interpreter with a means of further communication.

Finally, a reminder: for a few types the actual object is shared, whereas for the rest only the underlying data is shared, whether as a copy or through a proxy. Regardless, it always preserves the strong equivalence guarantee of “shareable” objects.

The guarantee is that a shared object in one interpreter is strictly equivalent to the corresponding object in the other interpreter. In other words, the two objects will be indistinguishable from each other. The shared object should be treated as though the original had been shared directly, whether or not it actually was. That’s a slightly different and stronger promise than just equality.

The guarantee is especially important for mutable objects, like interpreters.Queue and memoryview. Mutating the object in one interpreter will always be reflected immediately in every other interpreter sharing the object.
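
The difference between sharing a buffer and passing a copy can be seen within a single interpreter. The sketch below is only an analogy: two memoryview objects over one bytearray stand in for two interpreters sharing a buffer, and a pickle round-trip stands in for an object sent as a copy. All names are illustrative:

```python
import pickle

source = bytearray(b"hello")

# Two views over the same buffer: a write through one is visible
# through the other, with no copy involved.
view_a = memoryview(source)
view_b = memoryview(source)
view_a[0] = ord("j")
shared_result = bytes(view_b)

# A pickled round-trip yields an independent copy, so later edits
# to the copy are not reflected in the original.
copied = pickle.loads(pickle.dumps(source))
copied[0] = ord("c")
original_after_copy_edit = bytes(source)

print(shared_result, original_after_copy_edit, bytes(copied))
```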

Synchronization

There are situations where two interpreters should be synchronized. That may involve sharing a resource, worker management, or preserving sequential consistency.

In threaded programming the typical synchronization primitives are types like mutexes. The threading module exposes several. However, interpreters cannot share objects which means they cannot share threading.Lock objects.

The interpreters module does not provide any such dedicated synchronization primitives. Instead, interpreters.Queue objects provide everything one might need.

For example, if there’s a shared resource that needs managed access then a queue may be used to manage it, where the interpreters pass an object around to indicate who can use the resource:

import interpreters
import threading
from mymodule import load_big_data, check_data

numworkers = 10
control = interpreters.create_queue()
data = memoryview(load_big_data())

def worker():
    interp = interpreters.create()
    interp.prepare_main(control=control, data=data)
    interp.exec("""if True:
        from mymodule import edit_data
        while True:
            token = control.get()
            edit_data(data)
            control.put(token)
        """)
threads = [threading.Thread(target=worker) for _ in range(numworkers)]
for t in threads:
    t.start()

token = 'football'
control.put(token)
while True:
    control.get()
    if not check_data(data):
        break
    control.put(token)

Exceptions

  • InterpreterError
    Indicates that some interpreter-related failure occurred.

    This exception is a subclass of Exception.

  • InterpreterNotFoundError
    Raised from Interpreter methods after the underlying interpreter has been destroyed, e.g. via the C-API.

    This exception is a subclass of InterpreterError.

  • ExecutionFailed
    Raised from Interpreter.exec() and Interpreter.call() when there’s an uncaught exception. The error display for this exception includes the traceback of the uncaught exception, which gets shown after the normal error display, much like happens for ExceptionGroup.

    Attributes:

    • type - a representation of the original exception’s class, with __name__, __module__, and __qualname__ attrs.
    • msg - str(exc) of the original exception
    • snapshot - a traceback.TracebackException object for the original exception

    This exception is a subclass of InterpreterError.

  • QueueError
    Indicates that some queue-related failure occurred.

    This exception is a subclass of Exception.

  • QueueNotFoundError
    Raised from interpreters.Queue methods after the underlying queue has been destroyed.

    This exception is a subclass of QueueError.

  • QueueEmpty
    Raised from Queue.get() (or get_nowait() with no default) when the queue is empty.

    This exception is a subclass of both QueueError and the stdlib queue.Empty.

  • QueueFull
    Raised from Queue.put() (with a timeout) or put_nowait() when the queue is already at its max size.

    This exception is a subclass of both QueueError and the stdlib queue.Full.
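
The snapshot attribute of ExecutionFailed is described as a traceback.TracebackException. As a rough illustration of what such an object offers, here is one built directly from an ordinary exception in the current interpreter; failing_task is a hypothetical function:

```python
import traceback

def failing_task():
    raise KeyError("missing")

try:
    failing_task()
except KeyError as exc:
    # Capture a self-contained snapshot that can outlive the exception.
    snapshot = traceback.TracebackException.from_exception(exc)

# The snapshot can render the full display or just the final line.
rendered = "".join(snapshot.format())
last_line = "".join(snapshot.format_exception_only()).strip()

print(last_line)
```

A snapshot like this carries only strings and frame summaries rather than live objects from the failing code.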

InterpreterPoolExecutor

Along with the new interpreters module, there will be a new concurrent.futures.InterpreterPoolExecutor. It will be a derivative of ThreadPoolExecutor, where each worker executes in its own thread, but each with its own subinterpreter.

Like the other executors, InterpreterPoolExecutor will support callables for tasks, and for the initializer. Also like the other executors, the arguments in both cases will be mostly unrestricted. The callables and arguments will typically be serialized when sent to a worker’s interpreter, e.g. with pickle, like how the ProcessPoolExecutor works. This contrasts with Interpreter.call(), which will (at least initially) be much more restricted.

Communication between workers, or between the executor (or generally its interpreter) and the workers, may still be done through interpreters.Queue objects, set with the initializer.
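
Since the new executor is described as a derivative of ThreadPoolExecutor, the initializer pattern can be sketched with that existing class as a stand-in. This example uses ThreadPoolExecutor, not InterpreterPoolExecutor; the thread-local _local plays the role that each worker interpreter's own globals would play, and init_worker and label are hypothetical names:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for per-worker state; with one interpreter per worker this
# would simply live in that interpreter's own module globals.
_local = threading.local()

def init_worker(prefix):
    # Runs once in each worker before it handles any task.
    _local.prefix = prefix

def label(item):
    return f"{_local.prefix}:{item}"

with ThreadPoolExecutor(max_workers=2, initializer=init_worker,
                        initargs=("job",)) as pool:
    # map() yields results in the order of the inputs.
    labels = list(pool.map(label, ["a", "b", "c"]))

print(labels)
```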

sys.implementation.supports_isolated_interpreters

Python implementations are not required to support subinterpreters, though most major ones do. If an implementation does support them then sys.implementation.supports_isolated_interpreters will be set to True. Otherwise it will be False. If the feature is not supported then importing the interpreters module will raise an ImportError.
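
A defensive check might look like the sketch below. It uses getattr() with a default because the attribute does not exist on Python versions that predate this proposal; the helper name interpreters_available is hypothetical:

```python
import sys

def interpreters_available():
    """Report whether this implementation advertises isolated interpreters."""
    # Older versions lack the attribute entirely, so fall back to False.
    return bool(getattr(sys.implementation,
                        "supports_isolated_interpreters", False))

supported = interpreters_available()
print(supported)
```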

Examples

The following examples demonstrate practical cases where multiple interpreters may be useful.

Example 1:

There’s a stream of requests coming in that will be handled via workers in sub-threads.

  • each worker thread has its own interpreter
  • there’s one queue to send tasks to workers and another queue to return results
  • the results are handled in a dedicated thread
  • each worker keeps going until it receives a “stop” sentinel (None)
  • the results handler keeps going until all workers have stopped

import interpreters
import threading
from mymodule import iter_requests, handle_result

tasks = interpreters.create_queue()
results = interpreters.create_queue()

numworkers = 20
threads = []

def results_handler():
    running = numworkers
    while running:
        try:
            res = results.get(timeout=0.1)
        except interpreters.QueueEmpty:
            # No workers have finished a request since last time.
            pass
        else:
            if res is None:
                # A worker has stopped.
                running -= 1
            else:
                handle_result(res)
    empty = object()
    assert results.get_nowait(empty) is empty
threads.append(threading.Thread(target=results_handler))

def worker():
    interp = interpreters.create()
    interp.prepare_main(tasks=tasks, results=results)
    interp.exec("""if True:
        from mymodule import handle_request, capture_exception

        while True:
            req = tasks.get()
            if req is None:
                # Stop!
                break
            try:
                res = handle_request(req)
            except Exception as exc:
                res = capture_exception(exc)
            results.put(res)
        # Notify the results handler.
        results.put(None)
        """)
threads.extend(threading.Thread(target=worker) for _ in range(numworkers))

for t in threads:
    t.start()

for req in iter_requests():
    tasks.put(req)
# Send the "stop" signal.
for _ in range(numworkers):
    tasks.put(None)

for t in threads:
    t.join()

Example 2:

This case is similar to the last as there are a bunch of workers in sub-threads. However, this time the code is chunking up a big array of data, where each worker processes one chunk at a time. Copying that data to each interpreter would be exceptionally inefficient, so the code takes advantage of directly sharing memoryview buffers.

  • all the interpreters share the buffer of the source array
  • each one writes its results to a second shared buffer
  • a queue is used to send tasks to workers
  • only one worker will ever read any given index in the source array
  • only one worker will ever write to any given index in the results (this is how it ensures thread-safety)

import interpreters
import threading
from mymodule import read_large_data_set, use_results

numworkers = 3
data, chunksize = read_large_data_set()
buf = memoryview(data)
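# Round up so that a final partial chunk still gets its own slot.
numchunks = (len(buf) + chunksize - 1) // chunksize
# The results buffer must be writable, so back it with a bytearray.
results = memoryview(bytearray(numchunks))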

tasks = interpreters.create_queue()

def worker():
    interp = interpreters.create()
    interp.prepare_main(data=buf, results=results, tasks=tasks)
    interp.exec("""if True:
        from mymodule import reduce_chunk

        while True:
            req = tasks.get()
            if req is None:
                # Stop!
                break
            resindex, start, end = req
            chunk = data[start: end]
            res = reduce_chunk(chunk)
            results[resindex] = res
        """)
threads = [threading.Thread(target=worker) for _ in range(numworkers)]
for t in threads:
    t.start()

for i in range(numchunks):
    # Assume there's at least one worker running still.
    start = i * chunksize
    end = start + chunksize
    if end > len(buf):
        end = len(buf)
    # Match the (resindex, start, end) order that the worker unpacks.
    tasks.put((i, start, end))
# Send the "stop" signal.
for _ in range(numworkers):
    tasks.put(None)

for t in threads:
    t.join()

use_results(results)
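
The chunk arithmetic and the one-writer-per-index rule in Example 2 can be checked without the proposed module. The following is only a sketch under stated assumptions: plain threads and queue.Queue stand in for interpreters and interpreters.create_queue(), and reduce_chunk() and run() are hypothetical names rather than part of mymodule or of this proposal.

```python
import queue
import threading


def reduce_chunk(chunk):
    # Hypothetical reduction: sum the bytes and fold the total into one byte.
    return sum(chunk) % 256


def run(data, chunksize, numworkers=3):
    buf = memoryview(data)
    # Ceiling division, so a short final chunk still gets a result slot.
    numchunks = (len(buf) + chunksize - 1) // chunksize
    # A bytearray-backed view is writable; one backed by bytes is not.
    results = memoryview(bytearray(numchunks))
    tasks = queue.Queue()  # stand-in for interpreters.create_queue()

    def worker():
        while True:
            req = tasks.get()
            if req is None:
                # Stop!
                break
            resindex, start, end = req
            # Each index is written by exactly one worker.
            results[resindex] = reduce_chunk(buf[start:end])

    threads = [threading.Thread(target=worker) for _ in range(numworkers)]
    for t in threads:
        t.start()

    for i in range(numchunks):
        start = i * chunksize
        end = min(start + chunksize, len(buf))
        tasks.put((i, start, end))
    # Send the "stop" signal, one per worker.
    for _ in range(numworkers):
        tasks.put(None)

    for t in threads:
        t.join()
    return bytes(results)
```

For instance, run(bytes(range(10)), 4) splits the ten bytes into chunks of 4, 4, and 2 items and produces one result byte per chunk.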

Rationale

A Minimal API

Since the core dev team has no real experience with how users will make use of multiple interpreters in Python code, this proposal purposefully keeps the initial API as lean and minimal as possible. The objective is to provide a well-considered foundation on which further (more advanced) functionality may be added later, as appropriate.

That said, the proposed design incorporates lessons learned from existing use of subinterpreters by the community, from existing stdlib modules, and from other programming languages. It also factors in experience from using subinterpreters in the CPython test suite and using them in concurrency benchmarks.

create(), create_queue()

Typically, users call a type to create instances of the type, at which point the object’s resources get provisioned. The interpreters module takes a different approach, where users must call create() to get a new interpreter or create_queue() for a new queue. Calling interpreters.Interpreter() directly only returns a wrapper around an existing interpreter (likewise for interpreters.Queue()).

This is because interpreters (and queues) are special resources. They exist globally in the process and are not managed/owned by the current interpreter. Thus the interpreters module makes creating an interpreter (or queue) a visibly distinct operation from creating an instance of interpreters.Interpreter (or interpreters.Queue).

Interpreter.prepare_main() Sets Multiple Variables

prepare_main() may be seen as a setter function of sorts. It supports setting multiple names at once, e.g. interp.prepare_main(spam=1, eggs=2), whereas most setters set one item at a time. The main reason is efficiency.

To set a value in the interpreter’s __main__.__dict__, the implementation must first switch the OS thread to the identified interpreter, which involves some non-negligible overhead. After setting the value it must switch back. Furthermore, there is some additional overhead to the mechanism by which it passes objects between interpreters, which can be reduced in aggregate if multiple values are set at once.

Therefore, prepare_main() supports setting multiple values at once.

Propagating Exceptions

An uncaught exception from a subinterpreter, via Interpreter.exec(), could either be (effectively) ignored, like threading.Thread() does, or propagated, like the builtin exec() does. Since Interpreter.exec() is a synchronous operation, like the builtin exec(), uncaught exceptions are propagated.

However, such exceptions are not raised directly. That’s because interpreters are isolated from each other and must not share objects, including exceptions. That could be addressed by raising a surrogate of the exception, whether a summary, a copy, or a proxy that wraps it. Any of those could preserve the traceback, which is useful for debugging. The ExecutionFailed that gets raised is such a surrogate.

There’s another concern to consider. If a propagated exception isn’t immediately caught, it will bubble up through the call stack until caught (or not). In the case that code somewhere else may catch it, it is helpful to identify that the exception came from a subinterpreter (i.e. a “remote” source), rather than from the current interpreter. That’s why Interpreter.exec() raises ExecutionFailed and why it is a plain Exception, rather than a copy or proxy with a class that matches the original exception. For example, an uncaught ValueError from a subinterpreter would never get caught in a later try: ... except ValueError: .... Instead, ExecutionFailed must be handled directly.
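
The effect of raising a surrogate can be sketched in a single interpreter. The names below (ExecutionFailedSketch, run_remote(), script(), classify()) are hypothetical and the attributes are assumptions chosen for illustration; this is not the proposed ExecutionFailed class.

```python
import traceback


class ExecutionFailedSketch(Exception):
    """Hypothetical surrogate for an exception raised elsewhere."""

    def __init__(self, exc):
        self.type_name = type(exc).__name__
        self.message = str(exc)
        # Keep only text, never the original exception object.
        self.formatted = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__))
        super().__init__(f"{self.type_name}: {self.message}")


def run_remote(func):
    """Stand-in for Interpreter.exec(): re-raise failures as a surrogate."""
    try:
        func()
    except Exception as exc:
        raise ExecutionFailedSketch(exc) from None


def script():
    raise ValueError("bad input")


def classify():
    try:
        run_remote(script)
    except ValueError:
        return "caught as ValueError"
    except ExecutionFailedSketch as failure:
        return f"caught surrogate for {failure.type_name}"
```

Because the surrogate is a plain Exception subclass, the except ValueError clause in classify() is skipped and only the handler for the surrogate runs.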

In contrast, exceptions propagated from Interpreter.call() do not involve ExecutionFailed but are raised directly, as though originating in the calling interpreter. This is because Interpreter.call() is a higher level method that uses pickle to support objects that can’t normally be passed between interpreters.

Limited Object Sharing

As noted in Interpreter Isolation, only a small number of builtin objects may be truly shared between interpreters. In all other cases objects can only be shared indirectly, through copies or proxies.

The set of objects that are shareable as copies through queues (and Interpreter.prepare_main()) is limited for the sake of efficiency.

Sharing arbitrary objects is possible (via pickle) but is not part of this proposal. For one thing, it is helpful to know that queues and Interpreter.prepare_main() only ever use an efficient implementation. Furthermore, supporting mutable objects via pickling in those cases would violate the guarantee that “shared” objects be equivalent (and stay that way).
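
A restriction of this kind might be approximated as follows. This is a sketch only: is_shareable_sketch() is a hypothetical helper, and both the list of types and the exact-type check are assumptions made for illustration, not the proposal's definition of shareable objects.

```python
# Assumed for illustration; the authoritative list is defined by the proposal.
_SHAREABLE_TYPES = (type(None), bool, bytes, str, int, float, memoryview)


def is_shareable_sketch(obj):
    """Hypothetical approximation of a shareability check."""
    if type(obj) is tuple:
        # A tuple qualifies only if every item does, recursively.
        return all(is_shareable_sketch(item) for item in obj)
    # Exact types only (an assumption): subclasses may carry mutable state.
    return type(obj) in _SHAREABLE_TYPES
```

Under these assumptions a mutable container such as a list is rejected, both on its own and when nested inside a tuple.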

Objects vs. ID Proxies

For both interpreters and queues, the low-level module makes use of proxy objects that expose the underlying state by their corresponding process-global IDs. In both cases the state is likewise process-global and will be used by multiple interpreters. Thus they aren’t suitable to be implemented as PyObject, which is only really an option for interpreter-specific data. That’s why the interpreters module instead provides objects that are weakly associated through the ID.

Rejected Ideas

See PEP 554.


Source: https://github.com/python/peps/blob/main/peps/pep-0734.rst

Last modified: 2024-04-10 21:49:06 GMT