PEP: 554
Title: Multiple Interpreters in the Stdlib
Author: Eric Snow <ericsnowcurrently@gmail.com>
Discussions-To: https://discuss.python.org/t/pep-554-multiple-interpreters-in-the-stdlib/24855
Status: Superseded
Type: Standards Track
Content-Type: text/x-rst
Created: 05-Sep-2017
Python-Version: 3.13
Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017,
              04-May-2020, 14-Mar-2023, 01-Nov-2023
Superseded-By: 734

Note

This PEP effectively continues in a cleaner form in PEP 734. This PEP is
kept as-is for the sake of the various sections of background
information and deferred/rejected ideas that have been stripped from PEP
734.

Abstract

CPython has supported multiple interpreters in the same process (AKA
"subinterpreters") since version 1.5 (1997). The feature has been
available via the C-API. [c-api] Multiple interpreters operate in
relative isolation from one another, which facilitates novel alternative
approaches to concurrency.

This proposal introduces the stdlib interpreters module. It exposes the
basic functionality of multiple interpreters already provided by the
C-API, along with basic support for communicating between interpreters.
This module is especially relevant since PEP 684 introduced a
per-interpreter GIL in Python 3.12.

Proposal

Summary:

-   add a new stdlib module: "interpreters"
-   add concurrent.futures.InterpreterPoolExecutor
-   help for extension module maintainers

The "interpreters" Module

The interpreters module will provide a high-level interface to the
multiple interpreter functionality, and wrap a new low-level
_interpreters module (in the same way that the threading module wraps
_thread). See the Examples section for concrete usage and use cases.

Along with exposing the existing (in CPython) multiple interpreter
support, the module will also support a basic mechanism for passing data
between interpreters. That involves setting "shareable" objects in the
__main__ module of a target subinterpreter. Some such objects, like
os.pipe(), may be used to communicate further. The module will also
provide a minimal implementation of "channels" as a demonstration of
cross-interpreter communication.

Note that objects are not shared between interpreters since they are
tied to the interpreter in which they were created. Instead, the
objects' data is passed between interpreters. See the Shared Data and
API For Communication sections for more details about
sharing/communicating between interpreters.

API summary for interpreters module

Here is a summary of the API for the interpreters module. For a more
in-depth explanation of the proposed classes and functions, see the
"interpreters" Module API section below.

For creating and using interpreters:

  -----------------------------------------------------------------------
  signature                      description
  ------------------------------ ----------------------------------------
  list_all() -> [Interpreter]    Get all existing interpreters.

  get_current() -> Interpreter   Get the currently running interpreter.

  get_main() -> Interpreter      Get the main interpreter.

  create() -> Interpreter        Initialize a new (idle) Python
                                 interpreter.
  -----------------------------------------------------------------------

+---------------------------+------------------------------------------+
| signature                 | description                              |
+===========================+==========================================+
| class Interpreter         | A single interpreter.                    |
+---------------------------+------------------------------------------+
| .id                       | The interpreter's ID (read-only).        |
+---------------------------+------------------------------------------+
| .is_running() -> bool     | Is the interpreter currently executing   |
|                           | code?                                    |
+---------------------------+------------------------------------------+
| .close()                  | Finalize and destroy the interpreter.    |
+---------------------------+------------------------------------------+
| .set_main_attrs(**kwargs) | Bind "shareable" objects in __main__.    |
+---------------------------+------------------------------------------+
| .get_main_attr(name)      | Get a "shareable" object from __main__.  |
+---------------------------+------------------------------------------+
| .exec(src_str, /)         | Run the given source code in the         |
|                           | interpreter                              |
|                           | (in the current thread).                 |
+---------------------------+------------------------------------------+

For communicating between interpreters:

+------------------------------------------------+------------------------------+
| signature                                      | description                  |
+================================================+==============================+
| is_shareable(obj) -> bool                      | Can the object's data be     |
|                                                | passed between interpreters? |
+------------------------------------------------+------------------------------+
| create_channel() -> (RecvChannel, SendChannel) | Create a new channel for     |
|                                                | passing data between         |
|                                                | interpreters.                |
+------------------------------------------------+------------------------------+

concurrent.futures.InterpreterPoolExecutor

An executor will be added that extends ThreadPoolExecutor to run
per-thread tasks in subinterpreters. Initially, the only supported tasks
will be whatever Interpreter.exec() takes (e.g. a str script). However,
we may also support some functions, as well as eventually a separate
method for pickling the task and arguments, to reduce friction (at the
expense of performance for short-running tasks).

Help for Extension Module Maintainers

In practice, an extension that implements multi-phase init (PEP 489) is
considered isolated and thus compatible with multiple interpreters.
Otherwise it is "incompatible".

Many extension modules are still incompatible. The maintainers and users
of such extension modules will both benefit when they are updated to
support multiple interpreters. In the meantime, users may become
confused by failures when using multiple interpreters, which could
negatively impact extension maintainers. See Concerns below.

To mitigate that impact and accelerate compatibility, we will do the
following:

-   be clear that extension modules are not required to support use in
    multiple interpreters
-   raise ImportError when an incompatible module is imported in a
    subinterpreter
-   provide resources (e.g. docs) to help maintainers reach
    compatibility
-   reach out to the maintainers of Cython and of the most used
    extension modules (on PyPI) to get feedback and possibly provide
    assistance

Examples

The examples below assume that the modules they use have already been
imported: the proposed interpreters module, textwrap (as tw), and
stdlib modules such as os, threading, pickle, io, and contextlib.

Run isolated code in current OS thread

    interp = interpreters.create()
    print('before')
    interp.exec('print("during")')
    print('after')

Run in a different thread

    interp = interpreters.create()
    def run():
        interp.exec('print("during")')
    t = threading.Thread(target=run)
    print('before')
    t.start()
    t.join()
    print('after')

Pre-populate an interpreter

    interp = interpreters.create()
    interp.exec(tw.dedent("""
        import some_lib
        import an_expensive_module
        some_lib.set_up()
        """))
    wait_for_request()
    interp.exec(tw.dedent("""
        some_lib.handle_request()
        """))

Handling an exception

    interp = interpreters.create()
    try:
        interp.exec(tw.dedent("""
            raise KeyError
            """))
    except interpreters.RunFailedError as exc:
        print(f"got the error from the subinterpreter: {exc}")

Re-raising an exception

    interp = interpreters.create()
    try:
        try:
            interp.exec(tw.dedent("""
                raise KeyError
                """))
        except interpreters.RunFailedError as exc:
            raise exc.__cause__
    except KeyError:
        print("got a KeyError from the subinterpreter")

Note that this pattern is a candidate for later improvement.

Interact with the __main__ namespace

    interp = interpreters.create()
    interp.set_main_attrs(a=1, b=2)
    interp.exec(tw.dedent("""
        res = do_something(a, b)
        """))
    res = interp.get_main_attr('res')

Synchronize using an OS pipe

    interp = interpreters.create()
    r1, s1 = os.pipe()
    r2, s2 = os.pipe()

    def task():
        interp.exec(tw.dedent(f"""
            import os
            os.read({r1}, 1)
            print('during B')
            os.write({s2}, b'x')
            """))

    t = threading.Thread(target=task)
    t.start()
    print('before')
    os.write(s1, b'x')
    print('during A')
    os.read(r2, 1)
    print('after')
    t.join()
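
The handshake above can be tried without the proposed module. The
following is a minimal sketch that uses an ordinary thread as a stand-in
for the subinterpreter (an assumption made so that it runs on current
Python versions); the events list is illustrative only. Each signal is a
single non-empty byte, since os.read(fd, 1) would keep blocking after an
empty write.

```python
import os
import threading

events = []  # records the order of the steps (illustrative only)

r1, s1 = os.pipe()
r2, s2 = os.pipe()

def task():
    # Stand-in for the code passed to interp.exec(): wait for the
    # signal, do the work, then signal back.
    os.read(r1, 1)
    events.append('during B')
    os.write(s2, b'x')

t = threading.Thread(target=task)
t.start()
events.append('before')
os.write(s1, b'x')   # wake the worker
os.read(r2, 1)       # block until the worker has finished
events.append('after')
t.join()

for fd in (r1, s1, r2, s2):
    os.close(fd)
```

Because each side blocks on a read until the other side writes, the
three recorded steps always occur in the same order.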

Sharing a file descriptor

    interp = interpreters.create()
    with open('spamspamspam') as infile:
        interp.set_main_attrs(fd=infile.fileno())
        interp.exec(tw.dedent(f"""
            import os
            for line in os.fdopen(fd):
                print(line)
            """))

Passing objects via pickle

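    interp = interpreters.create()
    r, s = os.pipe()
    interp.exec(tw.dedent(f"""
        import os
        import pickle
        reader = {r}
        """))

    def consume():
        # Each message is a 4-byte length followed by the pickled data;
        # a length of 0 marks the end of the stream.
        interp.exec(tw.dedent("""
            def read_exactly(n):
                data = b''
                while len(data) < n:
                    data += os.read(reader, n - len(data))
                return data

            size = int.from_bytes(read_exactly(4), 'big')
            while size:
                obj = pickle.loads(read_exactly(size))
                do_something(obj)
                size = int.from_bytes(read_exactly(4), 'big')
            """))

    # interp.exec() blocks, so the reader runs in its own thread.
    t = threading.Thread(target=consume)
    t.start()
    for obj in input:
        data = pickle.dumps(obj)
        os.write(s, len(data).to_bytes(4, 'big'))
        os.write(s, data)
    os.write(s, (0).to_bytes(4, 'big'))
    t.join()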
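
Message framing is the subtle part of sending pickles over a pipe:
pickled data commonly contains NUL bytes, so a delimiter byte is
fragile. The following is a minimal sketch of length-prefixed framing
that runs without the proposed module. A thread stands in for the
subinterpreter, and send_obj(), recv_obj(), and read_exactly() are
hypothetical helpers, not part of any proposed API.

```python
import os
import pickle
import struct
import threading

HEADER = struct.Struct('>I')  # 4-byte big-endian payload length

def send_obj(fd, obj):
    data = pickle.dumps(obj)
    # Small writes to a pipe complete in one call; a robust version
    # would loop until every byte has been written.
    os.write(fd, HEADER.pack(len(data)) + data)

def read_exactly(fd, n):
    chunks = []
    while n:
        chunk = os.read(fd, n)
        if not chunk:
            raise EOFError('pipe closed mid-message')
        chunks.append(chunk)
        n -= len(chunk)
    return b''.join(chunks)

def recv_obj(fd):
    (size,) = HEADER.unpack(read_exactly(fd, HEADER.size))
    return pickle.loads(read_exactly(fd, size))

r, s = os.pipe()
received = []

def reader():
    while True:
        obj = recv_obj(r)
        if obj is None:   # None serves as the end-of-stream marker here
            break
        received.append(obj)

t = threading.Thread(target=reader)
t.start()
for obj in ['spam', {'eggs': 2}, (1, 2.5)]:
    send_obj(s, obj)
send_obj(s, None)
t.join()
os.close(r)
os.close(s)
```

Using None as the terminator means None itself cannot be sent as a
payload; a dedicated zero-length header would avoid that limitation.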

Capturing an interpreter's stdout

    interp = interpreters.create()
    stdout = io.StringIO()
    with contextlib.redirect_stdout(stdout):
        interp.exec(tw.dedent("""
            print('spam!')
            """))
    assert stdout.getvalue() == 'spam!\n'

    # alternately:
    interp.exec(tw.dedent("""
        import contextlib, io
        stdout = io.StringIO()
        with contextlib.redirect_stdout(stdout):
            print('spam!')
        captured = stdout.getvalue()
        """))
    captured = interp.get_main_attr('captured')
    assert captured == 'spam!\n'

A pipe (os.pipe()) could be used similarly.
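
As a small check of what the captured text looks like, the sketch below
performs the same capture inside a single interpreter, with the builtin
exec() standing in for Interpreter.exec() (an assumption made so the
snippet runs today). Note that print() appends a newline to what it
writes.

```python
import contextlib
import io

stdout = io.StringIO()
with contextlib.redirect_stdout(stdout):
    # The builtin exec() stands in for interp.exec() in this sketch.
    exec("print('spam!')", {'__name__': '__main__'})
captured = stdout.getvalue()
```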

Running a module

    interp = interpreters.create()
    main_module = mod_name
    interp.exec(f'import runpy; runpy.run_module({main_module!r})')

Running as script (including zip archives & directories)

    interp = interpreters.create()
    main_script = path_name
    interp.exec(f"import runpy; runpy.run_path({main_script!r})")

Using a channel to communicate

    tasks_recv, tasks = interpreters.create_channel()
    results, results_send = interpreters.create_channel()

    def worker():
        interp = interpreters.create()
        interp.set_main_attrs(tasks=tasks_recv, results=results_send)
        interp.exec(tw.dedent("""
            def handle_request(req):
                ...

            def capture_exception(exc):
                ...

            while True:
                try:
                    req = tasks.recv()
                except Exception:
                    # channel closed
                    break
                try:
                    res = handle_request(req)
                except Exception as exc:
                    res = capture_exception(exc)
                results.send_nowait(res)
            """))
    threads = [threading.Thread(target=worker) for _ in range(20)]
    for t in threads:
        t.start()

    requests = ...
    for req in requests:
        tasks.send(req)
    tasks.close()

    for t in threads:
        t.join()
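
The control flow of the worker pattern above can be approximated today
with threads and queue.Queue, although that provides none of the
isolation. In this sketch the queues stand in for channels, a None
sentinel stands in for closing the channel, and handle_request() is a
hypothetical placeholder for real work.

```python
import queue
import threading

NUM_WORKERS = 4

tasks = queue.Queue()
results = queue.Queue()

def handle_request(req):
    # Hypothetical placeholder for real work.
    return req * req

def worker():
    while True:
        req = tasks.get()
        if req is None:       # sentinel: no more work
            break
        try:
            res = handle_request(req)
        except Exception as exc:
            res = repr(exc)   # pass errors back as plain data
        results.put((req, res))

threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()

requests = list(range(10))
for req in requests:
    tasks.put(req)
for _ in threads:
    tasks.put(None)           # one sentinel per worker

for t in threads:
    t.join()

# Every worker has exited, so all results are already queued.
collected = dict(results.get() for _ in requests)
```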

Sharing a memoryview (imagine map-reduce)

    data, chunksize = read_large_data_set()
    buf = memoryview(data)
    # Round up so that a final partial chunk is counted.
    numchunks = (len(buf) + chunksize - 1) // chunksize
    results = memoryview(bytearray(numchunks))

    tasks_recv, tasks = interpreters.create_channel()

    def worker():
        interp = interpreters.create()
        interp.set_main_attrs(data=buf, results=results, tasks=tasks_recv)
        interp.exec(tw.dedent("""
            while True:
                try:
                    req = tasks.recv()
                except Exception:
                    # channel closed
                    break
                resindex, start, end = req
                chunk = data[start: end]
                res = reduce_chunk(chunk)
                results[resindex] = res
            """))
    t = threading.Thread(target=worker)
    t.start()

    for i in range(numchunks):
        if not workers_running():
            raise ...
        start = i * chunksize
        end = start + chunksize
        if end > len(buf):
            end = len(buf)
        tasks.send((i, start, end))
    tasks.close()
    t.join()

    use_results(results)
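
The chunk arithmetic and the shared result buffer are easy to get wrong,
so here is a minimal runnable sketch of just that part. It uses threads
in place of subinterpreters, and reduce_chunk() (summing the bytes
modulo 256) is a made-up stand-in. The result buffer wraps a bytearray
because a memoryview over a bytes object is read-only.

```python
import threading

def reduce_chunk(chunk):
    # Hypothetical reduction: one byte of output per chunk.
    return sum(chunk) % 256

data = bytes(range(10))
chunksize = 4
buf = memoryview(data)
numchunks = (len(buf) + chunksize - 1) // chunksize   # ceiling division
results = memoryview(bytearray(numchunks))            # writable buffer

def worker(resindex, start, end):
    # Slicing a memoryview does not copy the underlying data.
    results[resindex] = reduce_chunk(buf[start:end])

threads = []
for i in range(numchunks):
    start = i * chunksize
    end = min(start + chunksize, len(buf))
    t = threading.Thread(target=worker, args=(i, start, end))
    threads.append(t)
    t.start()
for t in threads:
    t.join()

totals = list(results)
```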

Rationale

Running code in multiple interpreters provides a useful level of
isolation within the same process. This can be leveraged in a number of
ways. Furthermore, subinterpreters provide a well-defined framework in
which such isolation may be extended. (See PEP 684.)

Alyssa (Nick) Coghlan explained some of the benefits through a
comparison with multi-processing [benefits]:

    [I] expect that communicating between subinterpreters is going
    to end up looking an awful lot like communicating between
    subprocesses via shared memory.

    The trade-off between the two models will then be that one still
    just looks like a single process from the point of view of the
    outside world, and hence doesn't place any extra demands on the
    underlying OS beyond those required to run CPython with a single
    interpreter, while the other gives much stricter isolation
    (including isolating C globals in extension modules), but also
    demands much more from the OS when it comes to its IPC
    capabilities.

    The security risk profiles of the two approaches will also be quite
    different, since using subinterpreters won't require deliberately
    poking holes in the process isolation that operating systems give
    you by default.

CPython has supported multiple interpreters, with increasing levels of
support, since version 1.5. While the feature has the potential to be a
powerful tool, it has suffered from neglect because the multiple
interpreter capabilities are not readily available directly from Python.
Exposing the existing functionality in the stdlib will help reverse the
situation.

This proposal is focused on enabling the fundamental capability of
multiple interpreters, isolated from each other, in the same Python
process. This is a new area for Python so there is relative uncertainty
about the best tools to provide as companions to interpreters. Thus we
minimize the functionality we add in the proposal as much as possible.

Concerns

-   "subinterpreters are not worth the trouble"

Some have argued that subinterpreters do not add sufficient benefit to
justify making them an official part of Python. Adding features to the
language (or stdlib) has a cost in increasing the size of the language.
So an addition must pay for itself.

In this case, multiple interpreter support provides a novel concurrency
model focused on isolated threads of execution. Furthermore, it
provides an opportunity for changes in CPython that will allow
simultaneous use of multiple CPU cores (currently prevented by the
GIL; see PEP 684).

Alternatives to subinterpreters include threading, async, and
multiprocessing. Threading is limited by the GIL and async isn't the
right solution for every problem (nor for every person). Multiprocessing
is likewise valuable in some but not all situations. Direct IPC (rather
than via the multiprocessing module) provides similar benefits but with
the same caveat.

Notably, subinterpreters are not intended as a replacement for any of
the above. Certainly they overlap in some areas, but the benefits of
subinterpreters include isolation and (potentially) performance. In
particular, subinterpreters provide a direct route to an alternate
concurrency model (e.g. CSP) which has found success elsewhere and will
appeal to some Python users. That is the core value that the
interpreters module will provide.

-   "stdlib support for multiple interpreters adds extra burden on C
    extension authors"

In the Interpreter Isolation section below we identify ways in which
isolation in CPython's subinterpreters is incomplete. Most notable is
extension modules that use C globals to store internal state. (PEP 3121
and PEP 489 provide a solution to that problem, followed by some extra
APIs that improve efficiency, e.g. PEP 573).

Consequently, projects that publish extension modules may face an
increased maintenance burden as their users start using subinterpreters,
where their modules may break. This situation is limited to modules that
use C globals (or use libraries that use C globals) to store internal
state. For numpy, the reported-bug rate is one every 6 months.
[bug-rate]

Ultimately this comes down to a question of how often it will be a
problem in practice: how many projects would be affected, how often
their users will be affected, what the additional maintenance burden
will be for projects, and what the overall benefit of subinterpreters is
to offset those costs. The position of this PEP is that the actual extra
maintenance burden will be small and well below the threshold at which
subinterpreters are worth it.

-   "creating a new concurrency API deserves much more thought and
    experimentation, so the new module shouldn't go into the stdlib
    right away, if ever"

Introducing an API for a new concurrency model, as happened with
asyncio, is an extremely large project that requires a lot of careful
consideration. It is not something that can be done as simply as this
PEP proposes and likely deserves significant time on PyPI to mature.
(See Nathaniel's post on python-dev.)

However, this PEP does not propose any new concurrency API. At most it
exposes minimal tools (e.g. subinterpreters, channels) which may be used
to write code that follows patterns associated with (relatively)
new-to-Python concurrency models. Those tools could also be used as the
basis for APIs for such concurrency models. Again, this PEP does not
propose any such API.

-   "there is no point to exposing subinterpreters if they still share
    the GIL"
-   "the effort to make the GIL per-interpreter is disruptive and risky"

A common misconception is that this PEP also includes a promise that
interpreters will no longer share the GIL. When that is clarified, the
next question is "what is the point?". This is already answered at
length in this PEP. Just to be clear, the value lies in:

-   increasing exposure of the existing feature, which helps improve
    the code health of the entire CPython runtime
-   exposing the (mostly) isolated execution of interpreters
-   preparing for a per-interpreter GIL
-   encouraging experimentation
    * encourage experimentation

-   "data sharing can have a negative impact on cache performance in
    multi-core scenarios"

(See [cache-line-ping-pong].)

This shouldn't be a problem for now as we have no immediate plans to
actually share data between interpreters, instead focusing on copying.

About Subinterpreters

Concurrency

Concurrency is a challenging area of software development. Decades of
research and practice have led to a wide variety of concurrency models,
each with different goals. Most center on correctness and usability.

One class of concurrency models focuses on isolated threads of execution
that interoperate through some message passing scheme. A notable example
is Communicating Sequential Processes [CSP] (upon which Go's concurrency
is roughly based). The intended isolation inherent to CPython's
interpreters makes them well-suited to this approach.

Shared Data

CPython's interpreters are inherently isolated (with caveats explained
below), in contrast to threads. So the same
communicate-via-shared-memory approach doesn't work. Without an
alternative, effective use of concurrency via multiple interpreters is
significantly limited.

The key challenge here is that sharing objects between interpreters
faces complexity due to various constraints on object ownership,
visibility, and mutability. At a conceptual level it's easier to reason
about concurrency when objects only exist in one interpreter at a time.
At a technical level, CPython's current memory model limits how Python
objects may be shared safely between interpreters; effectively, objects
are bound to the interpreter in which they were created. Furthermore,
the complexity of object sharing increases as interpreters become more
isolated, e.g. after GIL removal (though this is mitigated somewhat for
some "immortal" objects; see PEP 683).

Consequently, the mechanism for sharing needs to be carefully
considered. There are a number of valid solutions, several of which may
be appropriate to support in Python's stdlib and C-API. Any such
solution is likely to share many characteristics with the others.

In the meantime, we propose here a minimal solution
(Interpreter.set_main_attrs()), which sets some precedent for how
objects are shared. More importantly, it facilitates the introduction of
more advanced approaches later and allows them to coexist and cooperate.
In part to demonstrate that, we will provide a basic implementation of
"channels", as a somewhat more advanced sharing solution.

Separate proposals may cover:

-   the addition of a public C-API based on the implementation of
    Interpreter.set_main_attrs()
-   the addition of other sharing approaches to the "interpreters"
    module

The fundamental enabling feature for communication is that most objects
can be converted to some encoding of underlying raw data, which is safe
to be passed between interpreters. For example, an int object can be
turned into a C long value, sent to another interpreter, and turned back
into an int object there. As another example, None may be passed as-is.

Regardless, the effort to determine the best way forward here is mostly
outside the scope of this PEP. In the meantime, this proposal describes
a basic interim solution using pipes (os.pipe()), as well as providing a
dedicated capability ("channels"). See API For Communication below.

Interpreter Isolation

CPython's interpreters are intended to be strictly isolated from each
other. Each interpreter has its own copy of all modules, classes,
functions, and variables. The same applies to state in C, including in
extension modules. The CPython C-API docs explain more. [caveats]

However, there are ways in which interpreters do share some state. First
of all, some process-global state remains shared:

-   file descriptors
-   low-level env vars
-   process memory (though allocators are isolated)
-   builtin types (e.g. dict, bytes)
-   singletons (e.g. None)
-   underlying static module data (e.g. functions) for
    builtin/extension/frozen modules

There are no plans to change this.

Second, some isolation is faulty due to bugs or implementations that did
not take subinterpreters into account. This includes things like
extension modules that rely on C globals. [cryptography] In these cases
bugs should be opened (some are already):

-   readline module hook functions (http://bugs.python.org/issue4202)
-   memory leaks on re-init (http://bugs.python.org/issue21387)

Finally, some potential isolation is missing due to the current design
of CPython. Work is under way to address gaps in this area:

-   extensions using the PyGILState_* API are somewhat incompatible
    [gilstate]

Existing Usage

Multiple interpreter support has not been a widely used feature. In
fact, there have been only a handful of documented cases of widespread
usage, including mod_wsgi, OpenStack Ceph, and JEP. On the one hand,
these cases provide confidence that existing multiple interpreter
support is relatively stable. On the other hand, there isn't much of a
sample size from which to judge the utility of the feature.

Alternate Python Implementations

I've solicited feedback from various Python implementors about support
for subinterpreters. Each has indicated that they would be able to
support multiple interpreters in the same process (if they choose to)
without a lot of trouble. Here are the projects I contacted:

-   jython ([jython])
-   ironpython (personal correspondence)
-   pypy (personal correspondence)
-   micropython (personal correspondence)

"interpreters" Module API

The module provides the following functions:

    list_all() -> [Interpreter]

       Return a list of all existing interpreters.

    get_current() -> Interpreter

       Return the currently running interpreter.

    get_main() -> Interpreter

       Return the main interpreter.  If the Python implementation
       has no concept of a main interpreter then return None.

    create() -> Interpreter

       Initialize a new Python interpreter and return it.
       It will remain idle until something is run in it and always
       run in its own thread.

    is_shareable(obj) -> bool:

       Return True if the object may be "shared" between interpreters.
       This does not necessarily mean that the actual objects will be
       shared.  Instead, it means that the objects' underlying data will
       be shared in a cross-interpreter way, whether via a proxy, a
       copy, or some other means.

The module also provides the following class:

    class Interpreter(id):

       id -> int:

          The interpreter's ID. (read-only)

       is_running() -> bool:

          Return whether or not the interpreter's "exec()" is currently
          executing code.  Code running in subthreads is ignored.
          Calling this on the current interpreter will always return True.

       close():

          Finalize and destroy the interpreter.

          This may not be called on an already running interpreter.
          Doing so results in a RuntimeError.

       set_main_attrs(iterable_or_mapping, /):
       set_main_attrs(**kwargs):

          Set attributes in the interpreter's __main__ module
          corresponding to the given name-value pairs.  Each value
          must be a "shareable" object and will be converted to a new
          object (e.g. copy, proxy) in whatever way that object's type
          defines.  If an attribute with the same name is already set,
          it will be overwritten.

          This method is helpful for setting up an interpreter before
          calling exec().

       get_main_attr(name, default=None, /):

          Return the value of the corresponding attribute of the
          interpreter's __main__ module.  If the attribute isn't set
          then the default is returned.  If it is set, but the value
          isn't "shareable" then a ValueError is raised.

          This may be used to introspect the __main__ module, as well
          as a very basic mechanism for "returning" one or more results
          from Interpreter.exec().

       exec(source_str, /):

          Run the provided Python source code in the interpreter,
          in its __main__ module.

          This may not be called on an already running interpreter.
          Doing so results in a RuntimeError.

          An "interp.exec()" call is similar to a builtin exec() call
          (or to calling a function that returns None).  Once
          "interp.exec()" completes, the code that called "exec()"
          continues executing (in the original interpreter).  Likewise,
          if there is any uncaught exception then it effectively
          (see below) propagates into the code where "interp.exec()"
          was called.  Like exec() (and threads), but unlike function
          calls, there is no return value.  If any "return" value from
          the code is needed, send the data out via a pipe (os.pipe())
          or channel or other cross-interpreter communication mechanism.

          The big difference from exec() or functions is that
          "interp.exec()" executes the code in an entirely different
          interpreter, with entirely separate state.  The interpreters
          are completely isolated from each other, so the state of the
          original interpreter (including the code it was executing in
          the current OS thread) does not affect the state of the target
          interpreter (the one that will execute the code).  Likewise,
          the target does not affect the original, nor any of its other
          threads.

          Instead, the state of the original interpreter (for this thread)
          is frozen, and the code it's executing completely blocks.
          At that point, the target interpreter is given control of the
          OS thread.  Then, when it finishes executing, the original
          interpreter gets control back and continues executing.

          So calling "interp.exec()" will effectively cause the current
          Python thread to completely pause.  Sometimes you won't want
          that pause, in which case you should make the "exec()" call in
          another thread.  To do so, add a function that calls
          "interp.exec()" and then run that function in a normal
          "threading.Thread".

          Note that the interpreter's state is never reset, neither
          before "interp.exec()" executes the code nor after.  Thus the
          interpreter state is preserved between calls to
          "interp.exec()".  This includes "sys.modules", the "builtins"
          module, and the internal state of C extension modules.

          Also note that "interp.exec()" executes in the namespace of the
          "__main__" module, just like scripts, the REPL, "-m", and
          "-c".  Just as the interpreter's state is not ever reset, the
          "__main__" module is never reset.  You can imagine
          concatenating the code from each "interp.exec()" call into one
          long script.  This is the same as how the REPL operates.

          Supported code: source text.
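
The "one long script" behavior can be mimicked with the builtin exec()
and a single namespace dict that is reused across calls. This is only an
analogy: the builtin runs in the current interpreter and offers no
isolation, and main_ns is an illustrative name.

```python
main_ns = {'__name__': '__main__'}   # plays the role of the __main__ module

exec('import math\ncounter = 1', main_ns)
exec('counter += 1', main_ns)                   # sees the earlier state
exec('area = math.pi * counter ** 2', main_ns)  # the import persists too

counter = main_ns['counter']
```

Reading a value back out of the namespace afterwards is analogous to
what get_main_attr() is described as doing.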

In addition to the functionality of Interpreter.set_main_attrs(), the
module provides a related way to pass data between interpreters:
channels. See Channels below.

Uncaught Exceptions

Regarding uncaught exceptions in Interpreter.exec(), we noted that they
are "effectively" propagated into the code where interp.exec() was
called. To prevent leaking exceptions (and tracebacks) between
interpreters, we create a surrogate of the exception and its traceback
(see traceback.TracebackException), set it to __cause__ on a new
interpreters.RunFailedError, and raise that.

Directly raising (a proxy of) the exception is problematic since it's
harder to distinguish between an error in the interp.exec() call and an
uncaught exception from the subinterpreter.
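
A rough sketch of the surrogate idea, using only what exists today: the
exception is captured as a traceback.TracebackException snapshot, and a
new error carries that snapshot. The RunFailedError class and run()
helper below are hypothetical stand-ins, and the builtin exec()
substitutes for Interpreter.exec(). Because __cause__ must itself be an
exception instance, this sketch stores the snapshot on an ordinary
attribute instead.

```python
import traceback

class RunFailedError(RuntimeError):
    """Hypothetical stand-in for interpreters.RunFailedError."""

    def __init__(self, snapshot):
        # snapshot is a traceback.TracebackException, not a live exception.
        self.snapshot = snapshot
        summary = ''.join(snapshot.format_exception_only()).strip()
        super().__init__(summary)

def run(source):
    # The builtin exec() stands in for Interpreter.exec() in this sketch.
    snapshot = None
    try:
        exec(source, {'__name__': '__main__'})
    except Exception as exc:
        snapshot = traceback.TracebackException.from_exception(exc)
    if snapshot is not None:
        # Raised outside the except block, so the original exception
        # is not attached as __context__.
        raise RunFailedError(snapshot)

try:
    run('raise KeyError("spam")')
except RunFailedError as err:
    message = str(err)
    context = err.__context__
```

The caller can always tell a failure inside the executed code apart from
other errors, since only the former arrives as RunFailedError.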

Interpreter Restrictions

Every new interpreter created by interpreters.create() now has specific
restrictions on any code it runs. This includes the following:

-   importing an extension module fails if it does not implement
    multi-phase init
-   daemon threads may not be created
-   os.fork() is not allowed (so no multiprocessing)
-   os.exec*() is not allowed (but "fork+exec", a la subprocess is okay)

Note that interpreters created with the existing C-API do not have these
restrictions. The same is true for the "main" interpreter, so existing
use of Python will not change.

We may choose to later loosen some of the above restrictions or provide
a way to enable/disable granular restrictions individually. Regardless,
requiring multi-phase init from extension modules will always be a
default restriction.

API For Communication

As discussed in Shared Data above, multiple interpreter support is less
useful without a mechanism for sharing data (communicating) between
them. Sharing actual Python objects between interpreters, however, has
enough potential problems that we are avoiding support for that in this
proposal. Nor, as mentioned earlier, are we adding anything more than a
basic mechanism for communication.

That mechanism is the Interpreter.set_main_attrs() method. It may be
used to set up global variables before Interpreter.exec() is called. The
name-value pairs passed to set_main_attrs() are bound as attributes of
the interpreter's __main__ module. The values must be "shareable". See
Shareable Types below.

Additional approaches to communicating and sharing objects are enabled
through Interpreter.set_main_attrs(). A shareable object could be
implemented which works like a queue, but with cross-interpreter safety.
In fact, this PEP does include an example of such an approach: channels.

Shareable Types

An object is "shareable" if its type supports shareable instances. The
type must implement a new internal protocol, which is used to convert an
object to interpreter-independent data and then convert that data back
into an object on the other side. Also see is_shareable() above.

A minimal set of simple, immutable builtin types will be supported
initially, including:

-   None
-   bool
-   bytes
-   str
-   int
-   float

We will also support a small number of complex types initially:

-   memoryview, to allow sharing PEP 3118 buffers
-   channels

Further builtin types may be supported later, complex or not. Limiting
the initial shareable types is a practical matter, reducing the
potential complexity of the initial implementation. There are a number
of strategies we may pursue in the future to expand supported objects,
once we have more experience with interpreter isolation.

In the meantime, a separate proposal will discuss making the internal
protocol (and C-API) used by Interpreter.set_main_attrs() public. With
that protocol, support for other types could be added by extension
modules.

Communicating Through OS Pipes

Even without a dedicated object for communication, users may already use
existing tools. For example, one basic approach for sending data between
interpreters is to use a pipe (see os.pipe()):

1.  interpreter A calls os.pipe() to get a read/write pair of file
    descriptors (both int objects)
2.  interpreter A calls interp.set_main_attrs(), binding the read FD (or
    embeds it using string formatting)
3.  interpreter A calls interp.exec() on interpreter B
4.  interpreter A writes some bytes to the write FD
5.  interpreter B reads those bytes

Several of the earlier examples demonstrate this, such as Synchronize
using an OS pipe.
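The five steps above can be sketched with the stdlib alone. As an
assumption for the sake of a runnable example, "interpreter B" is played
by a thread that runs the script with the builtin exec() in its own
namespace; with the proposed module that call would be interp.exec().

```python
import os
import threading

# Step 1: "interpreter A" creates the pipe; both ends are plain ints.
read_fd, write_fd = os.pipe()

# Step 2: the read FD is embedded in the script via string formatting.
script = f"""
import os
received = os.read({read_fd}, 1024)
"""

# Stand-in for interpreter B: a thread running the script in a private
# namespace (an assumption; not the proposed API).
namespace_b = {"__name__": "__main__"}
worker = threading.Thread(target=exec, args=(script, namespace_b))
worker.start()                    # step 3: B starts running

os.write(write_fd, b"hello")      # step 4: A writes some bytes
worker.join()                     # step 5 is done once B finishes

os.close(read_fd)
os.close(write_fd)
print(namespace_b["received"])
```

Because file descriptors are process-wide integers, nothing but an int
has to cross the boundary between the two sides.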

Channels

The interpreters module will include a dedicated solution for passing
object data between interpreters: channels. They are included in the
module in part to provide an easier mechanism than using os.pipe() and
in part to demonstrate how libraries may take advantage of
Interpreter.set_main_attrs() and the protocol it uses.

A channel is a simplex FIFO. It is a basic, opt-in data sharing
mechanism that draws inspiration from pipes, queues, and CSP's channels.
[fifo] The main difference from pipes is that channels can be associated
with zero or more interpreters on either end. Like queues, which are
also many-to-many, channels are buffered (though they also offer methods
with unbuffered semantics).

Channels have two operations: send and receive. A key characteristic of
those operations is that channels transmit data derived from Python
objects rather than the objects themselves. When objects are sent, their
data is extracted. When the "object" is received in the other
interpreter, the data is converted back into an object owned by that
interpreter.
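The "data, not objects" idea can be modeled in a few lines. The
ToyChannel class below is a hypothetical, single-interpreter toy built
on queue.Queue. It is not the proposed implementation, it handles only
three types, and it merges both channel ends into one object for
brevity.

```python
import queue


class ToyChannel:
    """Toy model: stores extracted data, never the sent object itself."""

    def __init__(self):
        self._items = queue.Queue()

    def send_nowait(self, obj):
        # Extract interpreter-independent data from the object.
        if type(obj) is str:
            data = ("str", obj.encode("utf-8"))
        elif type(obj) is bytes:
            data = ("bytes", obj)
        elif type(obj) is int:
            data = ("int", str(obj).encode("ascii"))
        else:
            raise ValueError(f"{type(obj).__name__} is not shareable here")
        self._items.put(data)

    def recv_nowait(self, default=None):
        try:
            kind, raw = self._items.get_nowait()
        except queue.Empty:
            return default
        # Rebuild a new object from the data on the receiving side.
        if kind == "str":
            return raw.decode("utf-8")
        if kind == "int":
            return int(raw)
        return raw


channel = ToyChannel()
channel.send_nowait("spam")
print(channel.recv_nowait())           # an equivalent str, rebuilt from bytes
print(channel.recv_nowait("nothing"))  # the channel is empty: default returned
```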

To make this work, the mutable shared state will be managed by the
Python runtime, not by any of the interpreters. Initially we will
support only one type of object for shared state: the channels provided
by interpreters.create_channel(). Channels, in turn, will carefully
manage passing objects between interpreters.

This approach, including keeping the API minimal, helps us avoid further
exposing any underlying complexity to Python users.

The interpreters module provides the following function related to
channels:

    create_channel() -> (RecvChannel, SendChannel):

       Create a new channel and return (recv, send), the RecvChannel
       and SendChannel corresponding to the ends of the channel.

       Both ends of the channel are supported "shared" objects (i.e.
       may be safely shared by different interpreters).  Thus they
       may be set using "Interpreter.set_main_attrs()".

The module also provides the following channel-related classes:

    class RecvChannel(id):

       The receiving end of a channel.  An interpreter may use this to
       receive objects from another interpreter.  Any type supported by
       Interpreter.set_main_attrs() will be supported here, though at
       first only a few of the simple, immutable builtin types
       will be supported.

       id -> int:

          The channel's unique ID.  The "send" end has the same one.

       recv(*, timeout=None):

          Return the next object from the channel.  If none have been
          sent then wait until the next send (or until the timeout is hit).

          At the least, the object will be equivalent to the sent object.
          That will almost always mean the same type with the same data,
          though it could also be a compatible proxy.  Regardless, it may
          use a copy of that data or actually share the data.  That's up
          to the object's type.

       recv_nowait(default=None):

          Return the next object from the channel.  If none have been
          sent then return the default.  Otherwise, this is the same
          as the "recv()" method.


    class SendChannel(id):

       The sending end of a channel.  An interpreter may use this to
       send objects to another interpreter.  Any type supported by
       Interpreter.set_main_attrs() will be supported here, though
       at first only a few of the simple, immutable builtin types
       will be supported.

       id -> int:

          The channel's unique ID.  The "recv" end has the same one.

       send(obj, *, timeout=None):

          Send the object (i.e. its data) to the "recv" end of the
          channel.  Wait until the object is received.  If the object
          is not shareable then ValueError is raised.

          The builtin memoryview is supported, so sending a buffer
          across involves first wrapping the object in a memoryview
          and then sending that.

       send_nowait(obj):

          Send the object to the "recv" end of the channel.  This
          behaves the same as "send()", except for the waiting part.
          If no interpreter is currently receiving (waiting on the
          other end) then queue the object and return False.  Otherwise
          return True.

Caveats For Shared Objects

Again, Python objects are not shared between interpreters. However, in
some cases the data those objects wrap is actually shared and not just
copied. One example might be PEP 3118 buffers.

In those cases the object in the original interpreter is kept alive
until the shared data in the other interpreter is no longer used. Then
object destruction can happen like normal in the original interpreter,
along with the previously shared data.
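A same-interpreter analogue of this behavior already exists in the
buffer protocol. The sketch below shows only that analogue (data shared
through a memoryview, with the owner constrained while the view is in
use); it does not show the cross-interpreter mechanism itself.

```python
buf = bytearray(b"spam")
view = memoryview(buf)      # wraps the buffer without copying it

view[0] = ord("S")          # a write through the view...
print(buf)                  # ...is visible in the owning object

try:
    buf.extend(b"!")        # resizing is refused while the view exists
except BufferError:
    print("buffer is pinned while exported")

view.release()              # once the shared data is no longer used...
buf.extend(b"!")            # ...the owner behaves like normal again
print(buf)
```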

Documentation

The new stdlib docs page for the interpreters module will include the
following:

-   (at the top) a clear note that support for multiple interpreters is
    not required from extension modules
-   some explanation about what subinterpreters are
-   brief examples of how to use multiple interpreters (and
    communicating between them)
-   a summary of the limitations of using multiple interpreters
-   (for extension maintainers) a link to the resources for ensuring
    multiple interpreters compatibility
-   much of the API information in this PEP

Docs about resources for extension maintainers already exist on the
Isolating Extension Modules howto page. Any extra help will be added
there. For example, it may prove helpful to discuss strategies for
dealing with linked libraries that keep their own
subinterpreter-incompatible global state.

Note that the documentation will play a large part in mitigating any
negative impact that the new interpreters module might have on extension
module maintainers.

Also, the ImportError for incompatible extension modules will be updated
to clearly say it is due to missing multiple interpreters compatibility
and that extensions are not required to provide it. This will help set
user expectations properly.

Alternative Solutions

One possible alternative to a new module is to add support for
interpreters to concurrent.futures. There are several reasons why that
wouldn't work:

-   the obvious place to look for multiple interpreters support is an
    "interpreters" module, much as with "threading", etc.
-   concurrent.futures is all about executing functions but currently we
    don't have a good way to run a function from one interpreter in
    another

Similar reasoning applies for support in the multiprocessing module.

Open Questions

-   will it be too confusing that interp.exec() runs in the current
    thread?
-   should we add pickling fallbacks right now for interp.exec(), and/or
    Interpreter.set_main_attrs() and Interpreter.get_main_attr()?
-   should we support (limited) functions in interp.exec() right now?
-   rename Interpreter.close() to Interpreter.destroy()?
-   drop Interpreter.get_main_attr(), since we have channels?
-   should channels be its own PEP?

Deferred Functionality

In the interest of keeping this proposal minimal, the following
functionality has been left out for future consideration. Note that this
is not a judgement against any of said capability, but rather a
deferment. That said, each is arguably valid.

Add convenience API

There are a number of things I can imagine would smooth out hypothetical
rough edges with the new module:

-   add something like Interpreter.run() or Interpreter.call() that
    calls interp.exec() and falls back to pickle
-   fall back to pickle in Interpreter.set_main_attrs() and
    Interpreter.get_main_attr()

These would be easy to do if this proves to be a pain point.

Avoid possible confusion about interpreters running in the current thread

One regular point of confusion has been that Interpreter.exec() executes
in the current OS thread, temporarily blocking the current Python
thread. It may be worth doing something to avoid that confusion.

Some possible solutions for this hypothetical problem:

-   by default, run in a new thread?
-   add Interpreter.exec_in_thread()?
-   add Interpreter.exec_in_current_thread()?

In earlier versions of this PEP the method was interp.run(). The simple
change to interp.exec() alone will probably reduce confusion
sufficiently, when coupled with educating users via the docs. If it
turns out to be a real problem, we can pursue one of the alternatives at
that point.

Clarify "running" vs. "has threads"

Interpreter.is_running() refers specifically to whether or not
Interpreter.exec() (or similar) is running somewhere. It does not say
anything about if the interpreter has any subthreads running. That
information might be helpful.

Some things we could do:

-   rename Interpreter.is_running() to Interpreter.is_running_main()
-   add Interpreter.has_threads(), to complement
    Interpreter.is_running()
-   expand to Interpreter.is_running(main=True, threads=False)

None of these are urgent and any could be done later, if desired.

A Dunder Method For Sharing

We could add a special method, like __xid__ to correspond to tp_xid. At
the very least, it would allow Python types to convert their instances
to some other type that implements tp_xid.

The problem is that exposing this capability to Python code presents a
degree of complexity that hasn't been explored yet, nor is there a
compelling case to investigate that complexity.

Interpreter.call()

It would be convenient to run existing functions in subinterpreters
directly. Interpreter.exec() could be adjusted to support this or a
call() method could be added:

    Interpreter.call(f, *args, **kwargs)

This suffers from the same problem as sharing objects between
interpreters via queues. The minimal solution (running a source string)
is sufficient for us to get the feature out where it can be explored.

Interpreter.run_in_thread()

This method would make an interp.exec() call for you in a thread. Doing
this using only threading.Thread and interp.exec() is relatively trivial
so we've left it out.
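As a rough sketch of how little is involved, the helper below wraps any
object that has an exec() method. The run_in_thread() function and the
FakeInterpreter class are hypothetical names used only to make the
example runnable without the proposed module.

```python
import threading


def run_in_thread(interp, code):
    """Start a thread that calls interp.exec(code); return the thread."""
    thread = threading.Thread(target=interp.exec, args=(code,))
    thread.start()
    return thread


class FakeInterpreter:
    """Stand-in exposing exec(); runs code in a private namespace."""

    def __init__(self):
        self.namespace = {"__name__": "__main__"}

    def exec(self, code):
        # The builtin exec() plays the role of the real interpreter.
        exec(code, self.namespace)


interp = FakeInterpreter()
run_in_thread(interp, "answer = 6 * 7").join()
print(interp.namespace["answer"])
```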

Synchronization Primitives

The threading module provides a number of synchronization primitives for
coordinating concurrent operations. This is especially necessary due to
the shared-state nature of threading. In contrast, interpreters do not
share state. Data sharing is restricted to the runtime's shareable
objects capability, which does away with the need for explicit
synchronization. If any sort of opt-in shared state support is added to
CPython's interpreters in the future, that same effort can introduce
synchronization primitives to meet that need.

CSP Library

A csp module would not be a large step away from the functionality
provided by this PEP. However, adding such a module is outside the
minimalist goals of this proposal.

Syntactic Support

The Go language provides a concurrency model based on CSP, so it's
similar to the concurrency model that multiple interpreters support.
However, Go also provides syntactic support, as well as several builtin
concurrency primitives, to make concurrency a first-class feature.
Conceivably, similar syntactic (and builtin) support could be added to
Python using interpreters. However, that is way outside the scope of
this PEP!

Multiprocessing

The multiprocessing module could support interpreters in the same way it
supports threads and processes. In fact, the module's maintainer, Davin
Potts, has indicated this is a reasonable feature request. However, it
is outside the narrow scope of this PEP.

C-extension opt-in/opt-out

By using the PyModuleDef_Slot introduced by PEP 489, we could easily add
a mechanism by which C-extension modules could opt out of multiple
interpreter support. Then the import machinery, when operating in a
subinterpreter, would need to check the module for support. It would
raise an ImportError if unsupported.

Alternately we could support opting in to multiple interpreters support.
However, that would probably exclude many more modules (unnecessarily)
than the opt-out approach. Also, note that PEP 489 defined that an
extension's use of the PEP's machinery implies multiple interpreters
support.

The scope of adding the ModuleDef slot and fixing up the import
machinery is non-trivial, but could be worth it. It all depends on how
many extension modules break under subinterpreters. Given that there are
relatively few cases we know of through mod_wsgi, we can leave this for
later.

Poisoning channels

CSP has the concept of poisoning a channel. Once a channel has been
poisoned, any send() or recv() call on it would raise a special
exception, effectively ending execution in the interpreter that tried to
use the poisoned channel.

This could be accomplished by adding a poison() method to both ends of
the channel. The close() method can be used in this way (mostly), but
these semantics are relatively specialized and can wait.

Resetting __main__

As proposed, every call to Interpreter.exec() will execute in the
namespace of the interpreter's existing __main__ module. This means that
data persists there between interp.exec() calls. Sometimes this isn't
desirable and you want to execute in a fresh __main__. Also, you don't
necessarily want to leak objects there that you aren't using any more.

Note that the following won't work right because it will clear too much
(e.g. __name__ and the other "__dunder__" attributes):

    interp.exec('globals().clear()')

Possible solutions include:

-   a create() arg to indicate resetting __main__ after each
    interp.exec() call
-   an Interpreter.reset_main flag to support opting in or out after the
    fact
-   an Interpreter.reset_main() method to opt in when desired
-   importlib.util.reset_globals() [reset_globals]
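The problem with globals().clear() can be reproduced with a plain dict
standing in for the __main__ namespace. The reset_namespace() helper is
a hypothetical illustration of a more selective reset, not one of the
solutions listed above.

```python
# A plain dict stands in for an interpreter's __main__ namespace.
main_ns = {"__name__": "__main__", "__doc__": None, "counter": 3}

exec("globals().clear()", main_ns)
print("__name__" in main_ns)      # False: the dunder attributes are gone


def reset_namespace(ns):
    """Hypothetical helper: drop only the non-dunder names."""
    for name in list(ns):
        if not (name.startswith("__") and name.endswith("__")):
            del ns[name]


main_ns = {"__name__": "__main__", "__doc__": None, "counter": 3}
reset_namespace(main_ns)
print(sorted(main_ns))            # ['__doc__', '__name__']
```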

Also note that resetting __main__ does nothing about state stored in
other modules. So any solution would have to be clear about the scope of
what is being reset. Conceivably we could invent a mechanism by which
any (or every) module could be reset, unlike reload() which does not
clear the module before loading into it.

Regardless, since __main__ is the execution namespace of the
interpreter, resetting it has a much more direct correlation to
interpreters and their dynamic state than does resetting other modules.
So a more generic module reset mechanism may prove unnecessary.

This isn't a critical feature initially. It can wait until later if
desirable.

Resetting an interpreter's state

It may be nice to re-use an existing subinterpreter instead of spinning
up a new one. Since an interpreter has substantially more state than
just the __main__ module, it isn't so easy to put an interpreter back
into a pristine/fresh state. In fact, there may be parts of the state
that cannot be reset from Python code.

A possible solution is to add an Interpreter.reset() method. This would
put the interpreter back into the state it was in when newly created. If
called on a running interpreter it would fail (hence the main
interpreter could never be reset). This would likely be more efficient
than creating a new interpreter, though that depends on what
optimizations will be made later to interpreter creation.

While this would potentially provide functionality that is not otherwise
available from Python code, it isn't a fundamental functionality. So in
the spirit of minimalism here, this can wait. Regardless, I doubt it
would be controversial to add it post-PEP.

Copy an existing interpreter's state

Relatedly, it may be useful to support creating a new interpreter based
on an existing one, e.g. Interpreter.copy(). This ties into the idea
that a snapshot could be made of an interpreter's memory, which would
make starting up CPython, or creating new interpreters, faster in
general. The same mechanism could be used for a hypothetical
Interpreter.reset(), as described previously.

Shareable file descriptors and sockets

Given that file descriptors and sockets are process-global resources,
making them shareable is a reasonable idea. They would be a good
candidate for the first effort at expanding the supported shareable
types. They aren't strictly necessary for the initial API.

Integration with async

Per Antoine Pitrou [async]:

    Has any thought been given to how FIFOs could integrate with async
    code driven by an event loop (e.g. asyncio)?  I think the model of
    executing several asyncio (or Tornado) applications each in their
    own subinterpreter may prove quite interesting to reconcile multi-
    core concurrency with ease of programming.  That would require the
    FIFOs to be able to synchronize on something an event loop can wait
    on (probably a file descriptor?).

The basic functionality of multiple interpreters support does not depend
on async and can be added later.

A possible solution is to provide async implementations of the blocking
channel methods (recv(), and send()).

Alternately, "readiness callbacks" could be used to simplify use in
async scenarios. This would mean adding an optional callback (kw-only)
parameter to the recv_nowait() and send_nowait() channel methods. The
callback would be called once the object was sent or received
(respectively).

(Note that making channels buffered makes readiness callbacks less
important.)

Support for iteration

Supporting iteration on RecvChannel (via __iter__() or __next__()) may be
useful. A trivial implementation would use the recv() method, similar to
how files do iteration. Since this isn't a fundamental capability and
has a simple analog, adding iteration support can wait until later.
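One way such a trivial implementation might look is shown below. The
FakeRecvChannel class is a hypothetical stand-in backed by queue.Queue,
and the end-of-stream marker (an empty bytes object, by analogy with
end-of-file) is an assumption; the PEP does not define one.

```python
import queue

STOP = b""  # assumed end-of-stream marker; not defined by the PEP


class FakeRecvChannel:
    """Stand-in with a blocking recv(), backed by queue.Queue."""

    def __init__(self, items):
        self._items = queue.Queue()
        for item in items:
            self._items.put(item)

    def recv(self):
        return self._items.get()

    def __iter__(self):
        # iter(callable, sentinel) calls recv() until it returns STOP.
        return iter(self.recv, STOP)


channel = FakeRecvChannel([b"a", b"b", STOP])
print(list(channel))
```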

Channel context managers

Context manager support on RecvChannel and SendChannel may be helpful.
The implementation would be simple, wrapping a call to close() (or maybe
release()) like files do. As with iteration, this can wait.

Pipes and Queues

With the proposed object passing mechanism of "os.pipe()", other similar
basic types aren't strictly required to achieve the minimal useful
functionality of multiple interpreters. Such types include pipes (like
unbuffered channels, but one-to-one) and queues (like channels, but more
generic). See below in Rejected Ideas for more information.

Even though these types aren't part of this proposal, they may still be
useful in the context of concurrency. Adding them later is entirely
reasonable. They could be trivially implemented as wrappers around
channels. Alternatively they could be implemented for efficiency at the
same low level as channels.

Return a lock from send()

When sending an object through a channel, you don't have a way of
knowing when the object gets received on the other end. One way to work
around this is to return a locked threading.Lock from SendChannel.send()
that unlocks once the object is received.

Alternately, the proposed SendChannel.send() (blocking) and
SendChannel.send_nowait() provide an explicit distinction that is less
likely to confuse users.

Note that returning a lock would matter for buffered channels (i.e.
queues). For unbuffered channels it is a non-issue.
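The lock-returning variant could look roughly like the following. The
LockingChannel class is a hypothetical toy built on queue.Queue and
threading.Lock within one interpreter; it is not the proposed
SendChannel.

```python
import queue
import threading


class LockingChannel:
    """Toy buffered channel whose send() returns a lock (hypothetical)."""

    def __init__(self):
        self._items = queue.Queue()

    def send(self, obj):
        received = threading.Lock()
        received.acquire()              # handed back in the locked state
        self._items.put((obj, received))
        return received

    def recv(self):
        obj, received = self._items.get()
        received.release()              # tells the sender it arrived
        return obj


channel = LockingChannel()
receipt = channel.send("spam")
print(receipt.locked())                 # True: nothing has received it yet

consumer = threading.Thread(target=channel.recv)
consumer.start()
receipt.acquire()                       # blocks until recv() releases it
consumer.join()
```

The sender decides whether to wait: ignoring the returned lock gives
buffered behavior, while acquiring it gives the blocking behavior of
send().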

Support prioritization in channels

A simple example is queue.PriorityQueue in the stdlib.
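For reference, the existing queue.PriorityQueue behaves as follows. How
(or whether) channels would expose priorities is not specified here, so
this shows only the stdlib behavior being referred to.

```python
import queue

pending = queue.PriorityQueue()
pending.put((2, "routine"))
pending.put((0, "urgent"))
pending.put((1, "soon"))

# Entries come back lowest priority number first, not in send order.
order = [pending.get()[1] for _ in range(3)]
print(order)   # ['urgent', 'soon', 'routine']
```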

Support inheriting settings (and more?)

Folks might find it useful, when creating a new interpreter, to be able
to indicate that they would like some things "inherited" by the new
interpreter. The mechanism could be a strict copy or it could be
copy-on-write. The motivating example is with the warnings module (e.g.
copy the filters).

The feature isn't critical, nor would it be widely useful, so it can
wait until there's interest. Notably, both suggested solutions will
require significant work, especially when it comes to complex objects
and most especially for mutable containers of mutable complex objects.

Make exceptions shareable

Exceptions are propagated out of interp.exec() calls, so it isn't a big
leap to make them shareable. However, as noted elsewhere, it isn't
essential (or particularly common) so we can wait on doing that.

Make everything shareable through serialization

We could use pickle (or marshal) to serialize everything and thus make
them shareable. Doing this is potentially inefficient, but it may be a
matter of convenience in the end. We can add it later, but trying to
remove it later would be significantly more painful.
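The serialization round trip that such a fallback would rely on looks
like this with pickle. Both halves run in one interpreter here; in the
scenario described above, the loads() call would happen in the receiving
interpreter.

```python
import pickle

original = {"name": "spam", "values": [1, 2, 3]}

payload = pickle.dumps(original)   # bytes: interpreter-independent data
restored = pickle.loads(payload)   # a new object built from that data

# The result is an equal copy, not the same object.
print(restored == original, restored is original)   # True False
```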

Make RunFailedError.__cause__ lazy

An uncaught exception in a subinterpreter (from interp.exec()) is copied
to the calling interpreter and set as __cause__ on a RunFailedError
which is then raised. That copying part involves some sort of
deserialization in the calling interpreter, which can be expensive (e.g.
due to imports) yet is not always necessary.

So it may be useful to use an ExceptionProxy type to wrap the serialized
exception and only deserialize it when needed. That could be via
ExceptionProxy.__getattribute__() or perhaps through
RunFailedError.resolve() (which would raise the deserialized exception
and set RunFailedError.__cause__ to the exception).

It may also make sense to have RunFailedError.__cause__ be a descriptor
that does the lazy deserialization (and set __cause__) on the
RunFailedError instance.

Return a value from interp.exec()

Currently interp.exec() always returns None. One idea is to return the
return value from whatever the subinterpreter ran. However, for now it
doesn't make sense. The only thing folks can run is a string of code
(i.e. a script). This is equivalent to PyRun_StringFlags(), exec(), or a
module body. None of those "return" anything. We can revisit this once
interp.exec() supports functions, etc.

Add a shareable synchronization primitive

This would be _threading.Lock (or something like it) where interpreters
would actually share the underlying mutex. The main concern is that
locks and isolated interpreters may not mix well (as learned in Go).

If it proves desirable, we can add this later without much trouble.

Propagate SystemExit and KeyboardInterrupt Differently

The exception types that inherit from BaseException (aside from
Exception) are usually treated specially. These types are:
KeyboardInterrupt, SystemExit, and GeneratorExit. It may make sense to
treat them specially when it comes to propagation from interp.exec().
Here are some options:

    * propagate like normal via RunFailedError
    * do not propagate (handle them somehow in the subinterpreter)
    * propagate them directly (avoid RunFailedError)
    * propagate them directly (set RunFailedError as __cause__)

We aren't going to worry about handling them differently. Threads
already ignore SystemExit, so for now we will follow that pattern.
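The existing thread behavior referred to here can be observed directly:
a SystemExit raised in a thread ends that thread quietly and leaves the
rest of the process running.

```python
import threading


def worker():
    # In the main thread this would terminate the program.
    raise SystemExit(1)


thread = threading.Thread(target=worker)
thread.start()
thread.join()

# Execution continues; only the worker thread has ended.
print("still running; worker alive:", thread.is_alive())
```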

Add an explicit release() and close() to channel end classes

It can be convenient to have an explicit way to close a channel against
further global use. Likewise it could be useful to have an explicit way
to release one of the channel ends relative to the current interpreter.
Among other reasons, such a mechanism is useful for communicating
overall state between interpreters without the extra boilerplate that
passing objects through a channel directly would require.

The challenge is getting automatic release/close right without making it
hard to understand. This is especially true when dealing with a
non-empty channel. We should be able to get by without release/close for
now.

Add SendChannel.send_buffer()

This method would allow no-copy sending of an object through a channel
if it supports the PEP 3118 buffer protocol (e.g. memoryview).

Support for this is not fundamental to channels and can be added on
later without much disruption.

Auto-run in a thread

The PEP proposes a hard separation between subinterpreters and threads:
if you want to run in a thread you must create the thread yourself and
call interp.exec() in it. However, it might be convenient if
interp.exec() could do that for you, meaning there would be less
boilerplate.

Furthermore, we anticipate that users will want to run in a thread much
more often than not. So it would make sense to make this the default
behavior. We would add a kw-only param "threaded" (default True) to
interp.exec() to allow the run-in-the-current-thread operation.

Rejected Ideas

Explicit channel association

Interpreters are implicitly associated with channels upon recv() and
send() calls. They are de-associated with release() calls. The
alternative would be explicit methods. It would be either add_channel()
and remove_channel() methods on Interpreter objects or something similar
on channel objects.

In practice, this level of management shouldn't be necessary for users.
So adding more explicit support would only add clutter to the API.

Add an API based on pipes

A pipe would be a simplex FIFO between exactly two interpreters. For
most use cases this would be sufficient. It could potentially simplify
the implementation as well. However, it isn't a big step to supporting a
many-to-many simplex FIFO via channels. Also, with pipes the API ends up
being slightly more complicated, requiring naming the pipes.

Add an API based on queues

Queues and buffered channels are almost the same thing. The main
difference is that channels have a stronger relationship with context
(i.e. the associated interpreter).

The name "Channel" was used instead of "Queue" to avoid confusion with
the stdlib queue.Queue.

"enumerate"

The list_all() function provides the list of all interpreters. In the
threading module, which partly inspired the proposed API, the function
is called enumerate(). The name is different here to avoid confusing
Python users that are not already familiar with the threading API. For
them "enumerate" is rather unclear, whereas "list_all" is clear.

Alternate solutions to prevent leaking exceptions across interpreters

In function calls, uncaught exceptions propagate to the calling frame.
The same approach could be taken with interp.exec(). However, this would
mean that exception objects would leak across the inter-interpreter
boundary. Likewise, the frames in the traceback would potentially leak.

While that might not be a problem currently, it would be a problem once
interpreters get better isolation relative to memory management (which
is necessary to stop sharing the GIL between interpreters). We've
resolved the semantics of how the exceptions propagate by raising a
RunFailedError instead, for which __cause__ wraps a safe proxy for the
original exception and traceback.

Rejected possible solutions:

-   reproduce the exception and traceback in the original interpreter
    and raise that.
-   raise a subclass of RunFailedError that proxies the original
    exception and traceback.
-   raise RuntimeError instead of RunFailedError
-   convert at the boundary (a la subprocess.CalledProcessError)
    (requires a cross-interpreter representation)
-   support customization via Interpreter.excepthook (requires a
    cross-interpreter representation)
-   wrap in a proxy at the boundary (including with support for
    something like err.raise() to propagate the traceback).
-   return the exception (or its proxy) from interp.exec() instead of
    raising it
-   return a result object (like subprocess does) [result-object]
    (unnecessary complexity?)
-   throw the exception away and expect users to deal with unhandled
    exceptions explicitly in the script they pass to interp.exec() (they
    can pass error info out via channels); with threads you have to do
    something similar

Always associate each new interpreter with its own thread

As implemented in the C-API, an interpreter is not inherently tied to
any thread. Furthermore, it will run in any existing thread, whether
created by Python or not. You only have to activate one of its thread
states (PyThreadState) in the thread first. This means that the same
thread may run more than one interpreter (though obviously not at the
same time).

The proposed module maintains this behavior. Interpreters are not tied
to threads. Only calls to Interpreter.exec() are. However, one of the
key objectives of this PEP is to provide a more human-centric
concurrency model. With that in mind, from a conceptual standpoint the
module might be easier to understand if each interpreter were associated
with its own thread.

That would mean interpreters.create() would create a new thread and
Interpreter.exec() would only execute in that thread (and nothing else
would). The benefit is that users would not have to wrap
Interpreter.exec() calls in a new threading.Thread. Nor would they be in
a position to accidentally pause the current interpreter (in the current
thread) while their interpreter executes.

The idea is rejected because the benefit is small and the cost is high.
The difference from the capability in the C-API would be potentially
confusing. The implicit creation of threads is magical. The early
creation of threads is potentially wasteful. The inability to run
arbitrary interpreters in an existing thread would prevent some valid
use cases, frustrating users. Tying interpreters to threads would
require extra runtime modifications. It would also make the module's
implementation overly complicated. Finally, it might not even make the
module easier to understand.

Only associate interpreters upon use

Associate interpreters with channel ends only once recv(), send(), etc.
are called.

Doing this is potentially confusing and can also lead to unexpected
races in which a channel is auto-closed before it can be used in the
original (creating) interpreter.

Allow multiple simultaneous calls to Interpreter.exec()

This would make sense especially if Interpreter.exec() were to manage
new threads for you (which we've rejected). Essentially, each call would
run independently, which would be mostly fine from a narrow technical
standpoint, since each interpreter can have multiple threads.

The problem is that the interpreter has only one __main__ module and
simultaneous Interpreter.exec() calls would have to sort out sharing
__main__ or we'd have to invent a new mechanism. Neither would be simple
enough to be worth doing.
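The __main__ conflict can be illustrated with a rough sketch. Here a plain dict stands in for an interpreter's single __main__ namespace, and the builtin exec() stands in for Interpreter.exec(); both are assumptions made for illustration, and no real interpreter is involved. The per-call namespaces at the end are just one possible shape a "new mechanism" might take, not something this PEP specifies.

```python
# A single dict stands in for an interpreter's one __main__ namespace
# (an assumption for illustration; no real interpreter is involved).
main_ns = {"__name__": "__main__"}

script_a = "result = 'from A'"
script_b = "result = 'from B'"

# Two calls targeting the same interpreter write to the same namespace.
exec(script_a, main_ns)
exec(script_b, main_ns)

# Script A's value is gone because script B overwrote it.
shared_result = main_ns["result"]

# Avoiding the clash would need something like one namespace per call,
# which is a hypothetical mechanism, not part of the proposal.
ns_a = {"__name__": "__main__"}
ns_b = {"__name__": "__main__"}
exec(script_a, ns_a)
exec(script_b, ns_b)
separate_results = (ns_a["result"], ns_b["result"])
```

The sketch runs the scripts one after the other; with truly simultaneous calls the order of the overwrites would also be unpredictable.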

Add a "reraise" method to RunFailedError

While having __cause__ set on RunFailedError helps produce a more useful
traceback, it's less helpful when handling the original error. To help
facilitate this, we could add RunFailedError.reraise(). This method
would enable the following pattern:

    try:
        try:
            interp.exec(script)
        except RunFailedError as exc:
            exc.reraise()
    except MyException:
        ...

This would be made even simpler if there existed a __reraise__ protocol.

All that said, this is completely unnecessary. Using __cause__ is good
enough:

    try:
        try:
            interp.exec(script)
        except RunFailedError as exc:
            raise exc.__cause__
    except MyException:
        ...

Note that in extreme cases (where __cause__ might not be set) it may
require a little extra boilerplate:

    try:
        try:
            interp.exec(script)
        except RunFailedError as exc:
            if exc.__cause__ is not None:
                raise exc.__cause__
            raise  # re-raise
    except MyException:
        ...
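A self-contained sketch of the __cause__ pattern follows. Since the proposed module may not be available, RunFailedError and fake_exec() here are hypothetical stand-ins defined only for this example; fake_exec() assumes that the original error is attached as __cause__, as described above.

```python
class RunFailedError(RuntimeError):
    """Hypothetical stand-in for the exception proposed in this PEP."""


class MyException(Exception):
    pass


def fake_exec(script):
    # Stand-in for interp.exec(): wraps the original error as __cause__.
    try:
        exec(script, {"MyException": MyException})
    except Exception as original:
        raise RunFailedError("script failed") from original


handled = None
try:
    try:
        fake_exec("raise MyException('boom')")
    except RunFailedError as exc:
        if exc.__cause__ is not None:
            # Surface the original error so callers can handle it by type.
            raise exc.__cause__
        raise  # no cause recorded; re-raise RunFailedError itself
except MyException as err:
    handled = str(err)
```

The outer handler sees the original MyException rather than the RunFailedError wrapper, which is the behavior a reraise() method would have provided.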

Implementation

The implementation of the PEP has 4 parts:

-   the high-level module described in this PEP (mostly a light wrapper
    around a low-level C extension)
-   the low-level C extension module
-   additions to the internal C-API needed by the low-level module
-   secondary fixes/changes in the CPython runtime that facilitate the
    low-level module (among other benefits)

These are at various levels of completion, with more done the lower you
go:

-   the high-level module has been, at best, roughly implemented.
    However, fully implementing it will be almost trivial.
-   the low-level module is mostly complete. The bulk of the
    implementation was merged into master in December 2018 as the
    "_xxsubinterpreters" module (for the sake of testing multiple
    interpreters functionality). Only the exception propagation
    implementation remains to be finished, which will not require
    extensive work.
-   all necessary C-API work has been finished
-   all anticipated work in the runtime has been finished

The implementation effort for PEP 554 is being tracked as part of a
larger project aimed at improving multi-core support in CPython.
[multi-core-project]

References

mp-conn

    https://docs.python.org/3/library/multiprocessing.html#connection-objects

main-thread

    https://mail.python.org/pipermail/python-ideas/2017-September/047144.html
    https://mail.python.org/pipermail/python-dev/2017-September/149566.html

petr-c-ext

    https://mail.python.org/pipermail/import-sig/2016-June/001062.html
    https://mail.python.org/pipermail/python-ideas/2016-April/039748.html

Copyright

This document has been placed in the public domain.

CSP

    https://en.wikipedia.org/wiki/Communicating_sequential_processes
    https://github.com/futurecore/python-csp

async

    https://mail.python.org/pipermail/python-dev/2017-September/149420.html
    https://mail.python.org/pipermail/python-dev/2017-September/149585.html

benefits

    https://mail.python.org/pipermail/python-ideas/2017-September/047122.html

bug-rate

    https://mail.python.org/pipermail/python-ideas/2017-September/047094.html

c-api

    https://docs.python.org/3/c-api/init.html#sub-interpreter-support

cache-line-ping-pong

    https://mail.python.org/archives/list/python-dev@python.org/message/3HVRFWHDMWPNR367GXBILZ4JJAUQ2STZ/

caveats

    https://docs.python.org/3/c-api/init.html#bugs-and-caveats

cryptography

    https://github.com/pyca/cryptography/issues/2299

fifo

    https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Pipe
    https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Queue
    https://docs.python.org/3/library/queue.html#module-queue
    http://stackless.readthedocs.io/en/2.7-slp/library/stackless/channels.html
    https://golang.org/doc/effective_go.html#sharing
    http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/

gilstate

    https://bugs.python.org/issue10915
    http://bugs.python.org/issue15751

jython

    https://mail.python.org/pipermail/python-ideas/2017-May/045771.html

multi-core-project

    https://github.com/ericsnowcurrently/multi-core-python

reset_globals

    https://mail.python.org/pipermail/python-dev/2017-September/149545.html

result-object

    https://mail.python.org/pipermail/python-dev/2017-September/149562.html