PEP: 555 Title: Context-local variables (contextvars) Version:
$Revision$ Last-Modified: $Date$ Author: Koos Zevenhoven Status:
Withdrawn Type: Standards Track Content-Type: text/x-rst Created:
06-Sep-2017 Python-Version: 3.7 Post-History: 06-Sep-2017

Abstract

Sometimes, in special cases, it is desired that code can pass
information down the function call chain to the callees without having
to explicitly pass the information as arguments to each function in the
call chain. This proposal describes a construct which allows code to
explicitly switch in and out of a context where a certain context
variable has a given value assigned to it. This is a modern alternative
to some uses of things like global variables in traditional
single-threaded (or thread-unsafe) code and of thread-local storage in
traditional concurrency-unsafe code (single- or multi-threaded). In
particular, the proposed mechanism can also be used with more modern
concurrent execution mechanisms such as asynchronously executed
coroutines, without the concurrently executed call chains interfering
with each other's contexts.

The "call chain" can consist of normal functions, awaited coroutines, or
generators. The semantics of context variable scope are equivalent in
all cases, allowing code to be refactored freely into subroutines (which
here refers to functions, sub-generators or sub-coroutines) without
affecting the semantics of context variables. Regarding implementation,
this proposal aims at simplicity and minimum changes to the CPython
interpreter and to other Python interpreters.

Rationale

Consider a modern Python call chain (or call tree), which in this
proposal refers to any chained (nested) execution of subroutines, using
any possible combinations of normal function calls, or expressions using
await or yield from. In some cases, passing necessary information down
the call chain as arguments can substantially complicate the required
function signatures, or it can even be impossible to achieve in
practice. In these cases, one may search for another place to store this
information. Let us look at some historical examples.

The most naive option is to assign the value to a global variable or
similar, where the code down the call chain can access it. However, this
immediately makes the code thread-unsafe, because with multiple threads,
all threads assign to the same global variable, and another thread can
interfere at any point in the call chain. Sooner or later, someone will
probably find a reason to run the same code in parallel threads.

A somewhat less naive option is to store the information as per-thread
information in thread-local storage, where each thread has its own
"copy" of the variable which other threads cannot interfere with.
Although non-ideal, this has been the best solution in many cases.
However, thanks to generators and coroutines, the execution of the call
chain can be suspended and resumed, allowing code in other contexts to
run concurrently. Therefore, using thread-local storage is
concurrency-unsafe, because other call chains in other contexts may
interfere with the thread-local variable.

Note that in the above two historical approaches, the stored information
has the widest available scope without causing problems. For a third
solution along the same path, one would first define an equivalent of a
"thread" for asynchronous execution and concurrency. This could be seen
as the largest amount of code and nested calls that is guaranteed to be
executed sequentially without ambiguity in execution order. This might
be referred to as concurrency-local or task-local storage. In this
meaning of "task", there is no ambiguity in the order of execution of
the code within one task. (This concept of a task is close to equivalent
to a Task in asyncio, but not exactly.) In such concurrency-locals, it
is possible to pass information down the call chain to callees without
another code path interfering with the value in the background.

Common to the above approaches is that they indeed use variables with a
wide but just-narrow-enough scope. Thread-locals could also be called
thread-wide globals---in single-threaded code, they are indeed truly
global. And task-locals could be called task-wide globals, because tasks
can be very big.

The issue here is that neither global variables, thread-locals nor
task-locals are really meant to be used for this purpose of passing
information of the execution context down the call chain. Instead of the
widest possible variable scope, the scope of the variables should be
controlled by the programmer, typically of a library, to have the
desired scope---not wider. In other words, task-local variables (and
globals and thread-locals) have nothing to do with the kind of
context-bound information passing that this proposal intends to enable,
even if task-locals can be used to emulate the desired semantics.
Therefore, in the following, this proposal describes the semantics and
the outlines of an implementation for context-local variables (or
context variables, contextvars). In fact, as a side effect of this PEP,
an async framework can use the proposed feature to implement task-local
variables.

Proposal

Because the proposed semantics are not a direct extension to anything
already available in Python, this proposal is first described in terms
of semantics and API at a fairly high level. In particular, Python with
statements are heavily used in the description, as they are a good match
with the proposed semantics. However, the underlying __enter__ and
__exit__ methods correspond to functions in the lower-level
speed-optimized (C) API. For clarity of this document, the lower-level
functions are not explicitly named in the definition of the semantics.
After describing the semantics and high-level API, the implementation is
described, going to a lower level.

Semantics and higher-level API

Core concept

A context-local variable is represented by a single instance of
contextvars.Var, say cvar. Any code that has access to the cvar object
can ask for its value with respect to the current context. In the
high-level API, this value is given by the cvar.value property:

    cvar = contextvars.Var(default="the default value",
                           description="example context variable")

    assert cvar.value == "the default value"  # default still applies

    # In code examples, all ``assert`` statements should
    # succeed according to the proposed semantics.

No assignments to cvar have been applied for this context, so cvar.value
gives the default value. Assigning new values to contextvars is done in
a highly scope-aware manner:

    with cvar.assign(new_value):
        assert cvar.value is new_value
        # Any code here, or down the call chain from here, sees:
        #     cvar.value is new_value
        # unless another value has been assigned in a
        # nested context
        assert cvar.value is new_value
    # the assignment of ``cvar`` to ``new_value`` is no longer visible
    assert cvar.value == "the default value"

Here, cvar.assign(value) returns another object, namely
contextvars.Assignment(cvar, new_value). The essential part here is that
applying a context variable assignment (Assignment.__enter__) is paired
with a de-assignment (Assignment.__exit__). These operations set the
bounds for the scope of the assigned value.

Assignments to the same context variable can be nested to override the
outer assignment in a narrower context:

    assert cvar.value == "the default value"
    with cvar.assign("outer"):
        assert cvar.value == "outer"
        with cvar.assign("inner"):
            assert cvar.value == "inner"
        assert cvar.value == "outer"
    assert cvar.value == "the default value"

Also multiple variables can be assigned to in a nested manner without
affecting each other:

    cvar1 = contextvars.Var()
    cvar2 = contextvars.Var()

    assert cvar1.value is None # default is None by default
    assert cvar2.value is None

    with cvar1.assign(value1):
        assert cvar1.value is value1
        assert cvar2.value is None
        with cvar2.assign(value2):
            assert cvar1.value is value1
            assert cvar2.value is value2
        assert cvar1.value is value1
        assert cvar2.value is None
    assert cvar1.value is None
    assert cvar2.value is None

Or with more convenient Python syntax:

    with cvar1.assign(value1), cvar2.assign(value2):
        assert cvar1.value is value1
        assert cvar2.value is value2

In another context, in another thread or otherwise concurrently executed
task or code path, the context variables can have a completely different
state. The programmer thus only needs to worry about the context at
hand.

Refactoring into subroutines

Code using contextvars can be refactored into subroutines without
affecting the semantics. For instance:

    assi = cvar.assign(new_value)
    def apply():
        assi.__enter__()
    assert cvar.value == "the default value"
    apply()
    assert cvar.value is new_value
    assi.__exit__()
    assert cvar.value == "the default value"

Or similarly in an asynchronous context where await expressions are
used. The subroutine can now be a coroutine:

    assi = cvar.assign(new_value)
    async def apply():
        assi.__enter__()
    assert cvar.value == "the default value"
    await apply()
    assert cvar.value is new_value
    assi.__exit__()
    assert cvar.value == "the default value"

Or when the subroutine is a generator:

    def apply():
        yield
        assi.__enter__()

which is called using yield from apply() or with calls to next or .send.
This is discussed further in later sections.

Semantics for generators and generator-based coroutines

Generators, coroutines and async generators act as subroutines in much
the same way that normal functions do. However, they have the additional
possibility of being suspended by yield expressions. Assignment contexts
entered inside a generator are normally preserved across yields:

    def genfunc():
        with cvar.assign(new_value):
            assert cvar.value is new_value
            yield
            assert cvar.value is new_value
    g = genfunc()
    next(g)
    assert cvar.value == "the default value"
    with cvar.assign(another_value):
        next(g)

However, the outer context visible to the generator may change state
across yields:

    def genfunc():
        assert cvar.value is value2
        yield
        assert cvar.value is value1
        yield
        with cvar.assign(value3):
            assert cvar.value is value3

    with cvar.assign(value1):
        g = genfunc()
        with cvar.assign(value2):
            next(g)
        next(g)
        next(g)
        assert cvar.value is value1

Similar semantics apply to async generators defined by
async def ... yield ... ).

By default, values assigned inside a generator do not leak through
yields to the code that drives the generator. However, the assignment
contexts entered and left open inside the generator do become visible
outside the generator after the generator has finished with a
StopIteration or another exception:

    assi = cvar.assign(new_value)
    def genfunc():
        yield
        assi.__enter__():
        yield

    g = genfunc()
    assert cvar.value == "the default value"
    next(g)
    assert cvar.value == "the default value"
    next(g)  # assi.__enter__() is called here
    assert cvar.value == "the default value"
    next(g)
    assert cvar.value is new_value
    assi.__exit__()

Special functionality for framework authors

Frameworks, such as asyncio or third-party libraries, can use additional
functionality in contextvars to achieve the desired semantics in cases
which are not determined by the Python interpreter. Some of the
semantics described in this section are also afterwards used to describe
the internal implementation.

Leaking yields

Using the contextvars.leaking_yields decorator, one can choose to leak
the context through yield expressions into the outer context that drives
the generator:

    @contextvars.leaking_yields
    def genfunc():
        assert cvar.value == "outer"
        with cvar.assign("inner"):
            yield
            assert cvar.value == "inner"
        assert cvar.value == "outer"

    g = genfunc():
    with cvar.assign("outer"):
        assert cvar.value == "outer"
        next(g)
        assert cvar.value == "inner"
        next(g)
        assert cvar.value == "outer"

Capturing contextvar assignments

Using contextvars.capture(), one can capture the assignment contexts
that are entered by a block of code. The changes applied by the block of
code can then be reverted and subsequently reapplied, even in another
context:

    assert cvar1.value is None # default
    assert cvar2.value is None # default
    assi1 = cvar1.assign(value1)
    assi2 = cvar1.assign(value2)
    with contextvars.capture() as delta:
        assi1.__enter__()
        with cvar2.assign("not captured"):
            assert cvar2.value is "not captured"
        assi2.__enter__()
    assert cvar1.value is value2
    delta.revert()
    assert cvar1.value is None
    assert cvar2.value is None
    ...
    with cvar1.assign(1), cvar2.assign(2):
        delta.reapply()
        assert cvar1.value is value2
        assert cvar2.value == 2

However, reapplying the "delta" if its net contents include
deassignments may not be possible (see also Implementation and Open
Issues).

Getting a snapshot of context state

The function contextvars.get_local_state() returns an object
representing the applied assignments to all context-local variables in
the context where the function is called. This can be seen as equivalent
to using contextvars.capture() to capture all context changes from the
beginning of execution. The returned object supports methods .revert()
and reapply() as above.

Running code in a clean state

Although it is possible to revert all applied context changes using the
above primitives, a more convenient way to run a block of code in a
clean context is provided:

    with context_vars.clean_context():
        # here, all context vars start off with their default values
    # here, the state is back to what it was before the with block.

Implementation

This section describes to a variable level of detail how the described
semantics can be implemented. At present, an implementation aimed at
simplicity but sufficient features is described. More details will be
added later.

Alternatively, a somewhat more complicated implementation offers minor
additional features while adding some performance overhead and requiring
more code in the implementation.

Data structures and implementation of the core concept

Each thread of the Python interpreter keeps its own stack of
contextvars.Assignment objects, each having a pointer to the previous
(outer) assignment like in a linked list. The local state (also returned
by contextvars.get_local_state()) then consists of a reference to the
top of the stack and a pointer/weak reference to the bottom of the
stack. This allows efficient stack manipulations. An object produced by
contextvars.capture() is similar, but refers to only a part of the stack
with the bottom reference pointing to the top of the stack as it was in
the beginning of the capture block.

Now, the stack evolves according to the assignment __enter__ and
__exit__ methods. For example:

    cvar1 = contextvars.Var()
    cvar2 = contextvars.Var()
    # stack: []
    assert cvar1.value is None
    assert cvar2.value is None

    with cvar1.assign("outer"):
        # stack: [Assignment(cvar1, "outer")]
        assert cvar1.value == "outer"

        with cvar1.assign("inner"):
            # stack: [Assignment(cvar1, "outer"),
            #         Assignment(cvar1, "inner")]
            assert cvar1.value == "inner"

            with cvar2.assign("hello"):
                # stack: [Assignment(cvar1, "outer"),
                #         Assignment(cvar1, "inner"),
                #         Assignment(cvar2, "hello")]
                assert cvar2.value == "hello"

            # stack: [Assignment(cvar1, "outer"),
            #         Assignment(cvar1, "inner")]
            assert cvar1.value == "inner"
            assert cvar2.value is None

        # stack: [Assignment(cvar1, "outer")]
        assert cvar1.value == "outer"

    # stack: []
    assert cvar1.value is None
    assert cvar2.value is None

Getting a value from the context using cvar1.value can be implemented as
finding the topmost occurrence of a cvar1 assignment on the stack and
returning the value there, or the default value if no assignment is
found on the stack. However, this can be optimized to instead be an O(1)
operation in most cases. Still, even searching through the stack may be
reasonably fast since these stacks are not intended to grow very large.

The above description is already sufficient for implementing the core
concept. Suspendable frames require some additional attention, as
explained in the following.

Implementation of generator and coroutine semantics

Within generators, coroutines and async generators, assignments and
deassignments are handled in exactly the same way as anywhere else.
However, some changes are needed in the builtin generator methods send,
__next__, throw and close. Here is the Python equivalent of the changes
needed in send for a generator (here _old_send refers to the behavior in
Python 3.6):

    def send(self, value):
        if self.gi_contextvars is LEAK:
            # If decorated with contextvars.leaking_yields.
            # Nothing needs to be done to leak context through yields :)
            return self._old_send(value)
        try:
            with contextvars.capture() as delta:
                if self.gi_contextvars:
                    # non-zero captured content from previous iteration
                    self.gi_contextvars.reapply()
                ret = self._old_send(value)
        except Exception:
            raise  # back to the calling frame (e.g. StopIteration)
        else:
            # suspending, revert context changes but save them for later
            delta.revert()
            self.gi_contextvars = delta
        return ret

The corresponding modifications to the other methods is essentially
identical. The same applies to coroutines and async generators.

For code that does not use contextvars, the additions are O(1) and
essentially reduce to a couple of pointer comparisons. For code that
does use contextvars, the additions are still O(1) in most cases.

More on implementation

The rest of the functionality, including contextvars.leaking_yields,
contextvars.capture(), contextvars.get_local_state() and
contextvars.clean_context() are in fact quite straightforward to
implement, but their implementation will be discussed further in later
versions of this proposal. Caching of assigned values is somewhat more
complicated, and will be discussed later, but it seems that most cases
should achieve O(1) complexity.

Backwards compatibility

There are no direct backwards-compatibility concerns, since a completely
new feature is proposed.

However, various traditional uses of thread-local storage may need a
smooth transition to contextvars so they can be concurrency-safe. There
are several approaches to this, including emulating task-local storage
with a little bit of help from async frameworks. A fully general
implementation cannot be provided, because the desired semantics may
depend on the design of the framework.

Another way to deal with the transition is for code to first look for a
context created using contextvars. If that fails because a new-style
context has not been set or because the code runs on an older Python
version, a fallback to thread-local storage is used.

Open Issues

Out-of-order de-assignments

In this proposal, all variable deassignments are made in the opposite
order compared to the preceding assignments. This has two useful
properties: it encourages using with statements to define assignment
scope and has a tendency to catch errors early (forgetting a .__exit__()
call often results in a meaningful error. To have this as a requirement
is beneficial also in terms of implementation simplicity and
performance. Nevertheless, allowing out-of-order context exits is not
completely out of the question, and reasonable implementation strategies
for that do exist.

Rejected Ideas

Dynamic scoping linked to subroutine scopes

The scope of value visibility should not be determined by the way the
code is refactored into subroutines. It is necessary to have
per-variable control of the assignment scope.

Acknowledgements

To be added.

References

To be added.

Copyright

This document has been placed in the public domain.