PEP: 567 Title: Context Variables Version: $Revision$ Last-Modified:
$Date$ Author: Yury Selivanov <yury@edgedb.com> Status: Final Type:
Standards Track Content-Type: text/x-rst Created: 12-Dec-2017
Python-Version: 3.7 Post-History: 12-Dec-2017, 28-Dec-2017, 16-Jan-2018

Abstract

This PEP proposes a new contextvars module and a set of new CPython C
APIs to support context variables. This concept is similar to
thread-local storage (TLS), but, unlike TLS, it also allows correctly
keeping track of values per asynchronous task, e.g. asyncio.Task.

This proposal is a simplified version of PEP 550. The key difference is
that this PEP is concerned only with solving the case for asynchronous
tasks, not for generators. There are no proposed modifications to any
built-in types or to the interpreter.

This proposal is not strictly related to Python Context Managers.
Although it does provide a mechanism that can be used by Context
Managers to store their state.

API Design and Implementation Revisions

In Python 3.7.1 the signatures of all context variables C APIs were
changed to use PyObject * pointers instead of PyContext *,
PyContextVar *, and PyContextToken *, e.g.:

    // in 3.7.0:
    PyContext *PyContext_New(void);

    // in 3.7.1+:
    PyObject *PyContext_New(void);

See[1] for more details. The C API section of this PEP was updated to
reflect the change.

Rationale

Thread-local variables are insufficient for asynchronous tasks that
execute concurrently in the same OS thread. Any context manager that
saves and restores a context value using threading.local() will have its
context values bleed to other code unexpectedly when used in async/await
code.

A few examples where having a working context local storage for
asynchronous code is desirable:

-   Context managers like decimal contexts and numpy.errstate.
-   Request-related data, such as security tokens and request data in
    web applications, language context for gettext, etc.
-   Profiling, tracing, and logging in large code bases.

Introduction

The PEP proposes a new mechanism for managing context variables. The key
classes involved in this mechanism are contextvars.Context and
contextvars.ContextVar. The PEP also proposes some policies for using
the mechanism around asynchronous tasks.

The proposed mechanism for accessing context variables uses the
ContextVar class. A module (such as decimal) that wishes to use the new
mechanism should:

-   declare a module-global variable holding a ContextVar to serve as a
    key;
-   access the current value via the get() method on the key variable;
-   modify the current value via the set() method on the key variable.

The notion of "current value" deserves special consideration: different
asynchronous tasks that exist and execute concurrently may have
different values for the same key. This idea is well known from
thread-local storage but in this case the locality of the value is not
necessarily bound to a thread. Instead, there is the notion of the
"current Context" which is stored in thread-local storage. Manipulation
of the current context is the responsibility of the task framework, e.g.
asyncio.

A Context is a mapping of ContextVar objects to their values. The
Context itself exposes the abc.Mapping interface (not
abc.MutableMapping!), so it cannot be modified directly. To set a new
value for a context variable in a Context object, the user needs to:

-   make the Context object "current" using the Context.run() method;
-   use ContextVar.set() to set a new value for the context variable.

The ContextVar.get() method looks for the variable in the current
Context object using self as a key.

It is not possible to get a direct reference to the current Context
object, but it is possible to obtain a shallow copy of it using the
contextvars.copy_context() function. This ensures that the caller of
Context.run() is the sole owner of its Context object.

Specification

A new standard library module contextvars is added with the following
APIs:

1.  The copy_context() -> Context function is used to get a copy of the
    current Context object for the current OS thread.
2.  The ContextVar class to declare and access context variables.
3.  The Context class encapsulates context state. Every OS thread stores
    a reference to its current Context instance. It is not possible to
    control that reference directly. Instead, the
    Context.run(callable, *args, **kwargs) method is used to run Python
    code in another context.

contextvars.ContextVar

The ContextVar class has the following constructor signature:
ContextVar(name, *, default=_NO_DEFAULT). The name parameter is used for
introspection and debug purposes, and is exposed as a read-only
ContextVar.name attribute. The default parameter is optional. Example:

    # Declare a context variable 'var' with the default value 42.
    var = ContextVar('var', default=42)

(The _NO_DEFAULT is an internal sentinel object used to detect if the
default value was provided.)

ContextVar.get(default=_NO_DEFAULT) returns a value for the context
variable for the current Context:

    # Get the value of `var`.
    var.get()

If there is no value for the variable in the current context,
ContextVar.get() will:

-   return the value of the default argument of the get() method, if
    provided; or
-   return the default value for the context variable, if provided; or
-   raise a LookupError.

ContextVar.set(value) -> Token is used to set a new value for the
context variable in the current Context:

    # Set the variable 'var' to 1 in the current context.
    var.set(1)

ContextVar.reset(token) is used to reset the variable in the current
context to the value it had before the set() operation that created the
token (or to remove the variable if it was not set):

    # Assume: var.get(None) is None

    # Set 'var' to 1:
    token = var.set(1)
    try:
        # var.get() == 1
    finally:
        var.reset(token)

    # After reset: var.get(None) is None,
    # i.e. 'var' was removed from the current context.

The ContextVar.reset() method raises:

-   a ValueError if it is called with a token object created by another
    variable;
-   a ValueError if the current Context object does not match the one
    where the token object was created;
-   a RuntimeError if the token object has already been used once to
    reset the variable.

contextvars.Token

contextvars.Token is an opaque object that should be used to restore the
ContextVar to its previous value, or to remove it from the context if
the variable was not set before. It can be created only by calling
ContextVar.set().

For debug and introspection purposes it has:

-   a read-only attribute Token.var pointing to the variable that
    created the token;
-   a read-only attribute Token.old_value set to the value the variable
    had before the set() call, or to Token.MISSING if the variable
    wasn't set before.

contextvars.Context

Context object is a mapping of context variables to values.

Context() creates an empty context. To get a copy of the current Context
for the current OS thread, use the contextvars.copy_context() method:

    ctx = contextvars.copy_context()

To run Python code in some Context, use Context.run() method:

    ctx.run(function)

Any changes to any context variables that function causes will be
contained in the ctx context:

    var = ContextVar('var')
    var.set('spam')

    def main():
        # 'var' was set to 'spam' before
        # calling 'copy_context()' and 'ctx.run(main)', so:
        # var.get() == ctx[var] == 'spam'

        var.set('ham')

        # Now, after setting 'var' to 'ham':
        # var.get() == ctx[var] == 'ham'

    ctx = copy_context()

    # Any changes that the 'main' function makes to 'var'
    # will be contained in 'ctx'.
    ctx.run(main)

    # The 'main()' function was run in the 'ctx' context,
    # so changes to 'var' are contained in it:
    # ctx[var] == 'ham'

    # However, outside of 'ctx', 'var' is still set to 'spam':
    # var.get() == 'spam'

Context.run() raises a RuntimeError when called on the same context
object from more than one OS thread, or when called recursively.

Context.copy() returns a shallow copy of the context object.

Context objects implement the collections.abc.Mapping ABC. This can be
used to introspect contexts:

    ctx = contextvars.copy_context()

    # Print all context variables and their values in 'ctx':
    print(ctx.items())

    # Print the value of 'some_variable' in context 'ctx':
    print(ctx[some_variable])

Note that all Mapping methods, including Context.__getitem__ and
Context.get, ignore default values for context variables (i.e.
ContextVar.default). This means that for a variable var that was created
with a default value and was not set in the context:

-   context[var] raises a KeyError,
-   var in context returns False,
-   the variable isn't included in context.items(), etc.

asyncio

asyncio uses Loop.call_soon(), Loop.call_later(), and Loop.call_at() to
schedule the asynchronous execution of a function. asyncio.Task uses
call_soon() to run the wrapped coroutine.

We modify Loop.call_{at,later,soon} and Future.add_done_callback() to
accept the new optional context keyword-only argument, which defaults to
the current context:

    def call_soon(self, callback, *args, context=None):
        if context is None:
            context = contextvars.copy_context()

        # ... some time later
        context.run(callback, *args)

Tasks in asyncio need to maintain their own context that they inherit
from the point they were created at. asyncio.Task is modified as
follows:

    class Task:
        def __init__(self, coro):
            ...
            # Get the current context snapshot.
            self._context = contextvars.copy_context()
            self._loop.call_soon(self._step, context=self._context)

        def _step(self, exc=None):
            ...
            # Every advance of the wrapped coroutine is done in
            # the task's context.
            self._loop.call_soon(self._step, context=self._context)
            ...

Implementation

This section explains high-level implementation details in pseudo-code.
Some optimizations are omitted to keep this section short and clear.

The Context mapping is implemented using an immutable dictionary. This
allows for a O(1) implementation of the copy_context() function. The
reference implementation implements the immutable dictionary using Hash
Array Mapped Tries (HAMT); see PEP 550 for analysis of HAMT
performance[2].

For the purposes of this section, we implement an immutable dictionary
using a copy-on-write approach and the built-in dict type:

    class _ContextData:

        def __init__(self):
            self._mapping = dict()

        def __getitem__(self, key):
            return self._mapping[key]

        def __contains__(self, key):
            return key in self._mapping

        def __len__(self):
            return len(self._mapping)

        def __iter__(self):
            return iter(self._mapping)

        def set(self, key, value):
            copy = _ContextData()
            copy._mapping = self._mapping.copy()
            copy._mapping[key] = value
            return copy

        def delete(self, key):
            copy = _ContextData()
            copy._mapping = self._mapping.copy()
            del copy._mapping[key]
            return copy

Every OS thread has a reference to the current Context object:

    class PyThreadState:
        context: Context

contextvars.Context is a wrapper around _ContextData:

    class Context(collections.abc.Mapping):

        _data: _ContextData
        _prev_context: Optional[Context]

        def __init__(self):
            self._data = _ContextData()
            self._prev_context = None

        def run(self, callable, *args, **kwargs):
            if self._prev_context is not None:
                raise RuntimeError(
                    f'cannot enter context: {self} is already entered')

            ts: PyThreadState = PyThreadState_Get()
            self._prev_context = ts.context
            try:
                ts.context = self
                return callable(*args, **kwargs)
            finally:
                ts.context = self._prev_context
                self._prev_context = None

        def copy(self):
            new = Context()
            new._data = self._data
            return new

        # Implement abstract Mapping.__getitem__
        def __getitem__(self, var):
            return self._data[var]

        # Implement abstract Mapping.__contains__
        def __contains__(self, var):
            return var in self._data

        # Implement abstract Mapping.__len__
        def __len__(self):
            return len(self._data)

        # Implement abstract Mapping.__iter__
        def __iter__(self):
            return iter(self._data)

        # The rest of the Mapping methods are implemented
        # by collections.abc.Mapping.

contextvars.copy_context() is implemented as follows:

    def copy_context():
        ts: PyThreadState = PyThreadState_Get()
        return ts.context.copy()

contextvars.ContextVar interacts with PyThreadState.context directly:

    class ContextVar:

        def __init__(self, name, *, default=_NO_DEFAULT):
            self._name = name
            self._default = default

        @property
        def name(self):
            return self._name

        def get(self, default=_NO_DEFAULT):
            ts: PyThreadState = PyThreadState_Get()
            try:
                return ts.context[self]
            except KeyError:
                pass

            if default is not _NO_DEFAULT:
                return default

            if self._default is not _NO_DEFAULT:
                return self._default

            raise LookupError

        def set(self, value):
            ts: PyThreadState = PyThreadState_Get()

            data: _ContextData = ts.context._data
            try:
                old_value = data[self]
            except KeyError:
                old_value = Token.MISSING

            updated_data = data.set(self, value)
            ts.context._data = updated_data
            return Token(ts.context, self, old_value)

        def reset(self, token):
            if token._used:
                raise RuntimeError("Token has already been used once")

            if token._var is not self:
                raise ValueError(
                    "Token was created by a different ContextVar")

            ts: PyThreadState = PyThreadState_Get()
            if token._context is not ts.context:
                raise ValueError(
                    "Token was created in a different Context")

            if token._old_value is Token.MISSING:
                ts.context._data = ts.context._data.delete(token._var)
            else:
                ts.context._data = ts.context._data.set(token._var,
                                                        token._old_value)

            token._used = True

Note that the in the reference implementation, ContextVar.get() has an
internal cache for the most recent value, which allows to bypass a hash
lookup. This is similar to the optimization the decimal module
implements to retrieve its context from PyThreadState_GetDict(). See PEP
550 which explains the implementation of the cache in great detail.

The Token class is implemented as follows:

    class Token:

        MISSING = object()

        def __init__(self, context, var, old_value):
            self._context = context
            self._var = var
            self._old_value = old_value
            self._used = False

        @property
        def var(self):
            return self._var

        @property
        def old_value(self):
            return self._old_value

Summary of the New APIs

Python API

1.  A new contextvars module with ContextVar, Context, and Token
    classes, and a copy_context() function.
2.  asyncio.Loop.call_at(), asyncio.Loop.call_later(),
    asyncio.Loop.call_soon(), and asyncio.Future.add_done_callback() run
    callback functions in the context they were called in. A new context
    keyword-only parameter can be used to specify a custom context.
3.  asyncio.Task is modified internally to maintain its own context.

C API

1.  PyObject * PyContextVar_New(char *name, PyObject *default): create a
    ContextVar object. The default argument can be NULL, which means
    that the variable has no default value.

2.  int PyContextVar_Get(PyObject *, PyObject *default_value, PyObject **value):
    return -1 if an error occurs during the lookup, 0 otherwise. If a
    value for the context variable is found, it will be set to the value
    pointer. Otherwise, value will be set to default_value when it is
    not NULL. If default_value is NULL, value will be set to the default
    value of the variable, which can be NULL too. value is always a new
    reference.

3.  PyObject * PyContextVar_Set(PyObject *, PyObject *): set the value
    of the variable in the current context.

4.  PyContextVar_Reset(PyObject *, PyObject *): reset the value of the
    context variable.

5.  PyObject * PyContext_New(): create a new empty context.

6.  PyObject * PyContext_Copy(PyObject *): return a shallow copy of the
    passed context object.

7.  PyObject * PyContext_CopyCurrent(): get a copy of the current
    context.

8.  int PyContext_Enter(PyObject *) and int PyContext_Exit(PyObject *)
    allow to set and restore the context for the current OS thread. It
    is required to always restore the previous context:

        PyObject *old_ctx = PyContext_Copy();
        if (old_ctx == NULL) goto error;

        if (PyContext_Enter(new_ctx)) goto error;

        // run some code

        if (PyContext_Exit(old_ctx)) goto error;

Rejected Ideas

Replicating threading.local() interface

Please refer to PEP 550 where this topic is covered in detail:[3].

Replacing Token with ContextVar.unset()

The Token API allows to get around having a ContextVar.unset() method,
which is incompatible with chained contexts design of PEP 550. Future
compatibility with PEP 550 is desired in case there is demand to support
context variables in generators and asynchronous generators.

The Token API also offers better usability: the user does not have to
special-case absence of a value. Compare:

    token = cv.set(new_value)
    try:
        # cv.get() is new_value
    finally:
        cv.reset(token)

with:

    _deleted = object()
    old = cv.get(default=_deleted)
    try:
        cv.set(blah)
        # code
    finally:
        if old is _deleted:
            cv.unset()
        else:
            cv.set(old)

Having Token.reset() instead of ContextVar.reset()

Nathaniel Smith suggested to implement the ContextVar.reset() method
directly on the Token class, so instead of:

    token = var.set(value)
    # ...
    var.reset(token)

we would write:

    token = var.set(value)
    # ...
    token.reset()

Having Token.reset() would make it impossible for a user to attempt to
reset a variable with a token object created by another variable.

This proposal was rejected for the reason of ContextVar.reset() being
clearer to the human reader of the code which variable is being reset.

Making Context objects picklable

Proposed by Antoine Pitrou, this could enable transparent cross-process
use of Context objects, so the Offloading execution to other threads
example would work with a ProcessPoolExecutor too.

Enabling this is problematic because of the following reasons:

1.  ContextVar objects do not have __module__ and __qualname__
    attributes, making straightforward pickling of Context objects
    impossible. This is solvable by modifying the API to either auto
    detect the module where a context variable is defined, or by adding
    a new keyword-only "module" parameter to ContextVar constructor.
2.  Not all context variables refer to picklable objects. Making a
    ContextVar picklable must be an opt-in.

Given the time frame of the Python 3.7 release schedule it was decided
to defer this proposal to Python 3.8.

Making Context a MutableMapping

Making the Context class implement the abc.MutableMapping interface
would mean that it is possible to set and unset variables using
Context[var] = value and del Context[var] operations.

This proposal was deferred to Python 3.8+ because of the following:

1.  If in Python 3.8 it is decided that generators should support
    context variables (see PEP 550 and PEP 568), then Context would be
    transformed into a chain-map of context variables mappings (as every
    generator would have its own mapping). That would make mutation
    operations like Context.__delitem__ confusing, as they would operate
    only on the topmost mapping of the chain.

2.  Having a single way of mutating the context (ContextVar.set() and
    ContextVar.reset() methods) makes the API more straightforward.

    For example, it would be non-obvious why the below code fragment
    does not work as expected:

        var = ContextVar('var')

        ctx = copy_context()
        ctx[var] = 'value'
        print(ctx[var])  # Prints 'value'

        print(var.get())  # Raises a LookupError

    While the following code would work:

        ctx = copy_context()

        def func():
            ctx[var] = 'value'

            # Contrary to the previous example, this would work
            # because 'func()' is running within 'ctx'.
            print(ctx[var])
            print(var.get())

        ctx.run(func)

3.  If Context was mutable it would mean that context variables could be
    mutated separately (or concurrently) from the code that runs within
    the context. That would be similar to obtaining a reference to a
    running Python frame object and modifying its f_locals from another
    OS thread. Having one single way to assign values to context
    variables makes contexts conceptually simpler and more predictable,
    while keeping the door open for future performance optimizations.

Having initial values for ContextVars

Nathaniel Smith proposed to have a required initial_value keyword-only
argument for the ContextVar constructor.

The main argument against this proposal is that for some types there is
simply no sensible "initial value" except None. E.g. consider a web
framework that stores the current HTTP request object in a context
variable. With the current semantics it is possible to create a context
variable without a default value:

    # Framework:
    current_request: ContextVar[Request] = \
        ContextVar('current_request')


    # Later, while handling an HTTP request:
    request: Request = current_request.get()

    # Work with the 'request' object:
    return request.method

Note that in the above example there is no need to check if request is
None. It is simply expected that the framework always sets the
current_request variable, or it is a bug (in which case
current_request.get() would raise a LookupError).

If, however, we had a required initial value, we would have to guard
against None values explicitly:

    # Framework:
    current_request: ContextVar[Optional[Request]] = \
        ContextVar('current_request', initial_value=None)


    # Later, while handling an HTTP request:
    request: Optional[Request] = current_request.get()

    # Check if the current request object was set:
    if request is None:
        raise RuntimeError

    # Work with the 'request' object:
    return request.method

Moreover, we can loosely compare context variables to regular Python
variables and to threading.local() objects. Both of them raise errors on
failed lookups (NameError and AttributeError respectively).

Backwards Compatibility

This proposal preserves 100% backwards compatibility.

Libraries that use threading.local() to store context-related values,
currently work correctly only for synchronous code. Switching them to
use the proposed API will keep their behavior for synchronous code
unmodified, but will automatically enable support for asynchronous code.

Examples

Converting code that uses threading.local()

A typical code fragment that uses threading.local() usually looks like
the following:

    class PrecisionStorage(threading.local):
        # Subclass threading.local to specify a default value.
        value = 0.0

    precision = PrecisionStorage()

    # To set a new precision:
    precision.value = 0.5

    # To read the current precision:
    print(precision.value)

Such code can be converted to use the contextvars module:

    precision = contextvars.ContextVar('precision', default=0.0)

    # To set a new precision:
    precision.set(0.5)

    # To read the current precision:
    print(precision.get())

Offloading execution to other threads

It is possible to run code in a separate OS thread using a copy of the
current thread context:

    executor = ThreadPoolExecutor()
    current_context = contextvars.copy_context()

    executor.submit(current_context.run, some_function)

Reference Implementation

The reference implementation can be found here:[4]. See also issue
32436[5].

Acceptance

PEP 567 was accepted by Guido on Monday, January 22, 2018[6]. The
reference implementation was merged on the same day.

References

Acknowledgments

I thank Guido van Rossum, Nathaniel Smith, Victor Stinner, Elvis
Pranskevichus, Alyssa Coghlan, Antoine Pitrou, INADA Naoki, Paul Moore,
Eric Snow, Greg Ewing, and many others for their feedback, ideas, edits,
criticism, code reviews, and discussions around this PEP.

Copyright

This document has been placed in the public domain.

[1] https://bugs.python.org/issue34762

[2] 550#appendix-hamt-performance-analysis

[3] 550#replication-of-threading-local-interface

[4] https://github.com/python/cpython/pull/5027

[5] https://bugs.python.org/issue32436

[6] https://mail.python.org/pipermail/python-dev/2018-January/151878.html