PEP: 661
Title: Sentinel Values
Author: Tal Einat <tal@python.org>
Discussions-To: https://discuss.python.org/t/pep-661-sentinel-values/9126
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 06-Jun-2021
Post-History: 20-May-2021, 06-Jun-2021

TL;DR: See the Specification and Reference Implementation.

Abstract

Unique placeholder values, commonly known as "sentinel values", are
common in programming. They have many uses, such as for:

-   Default values for function arguments, for when a value was not
    given:

        def foo(value=None):
            ...

-   Return values from functions when something is not found or
    unavailable:

        >>> "abc".find("d")
        -1

-   Missing data, such as NULL in relational databases or "N/A" ("not
    available") in spreadsheets

Python has the special value None, which is intended to be used as such
a sentinel value in most cases. However, sometimes an alternative
sentinel value is needed, usually when it needs to be distinct from None
since None is a valid value in that context. Such cases are common
enough that several idioms for implementing such sentinels have arisen
over the years, but uncommon enough that there hasn't been a clear need
for standardization. However, the common implementations, including some
in the stdlib, suffer from several significant drawbacks.

This PEP proposes adding a utility for defining sentinel values, to be
used in the stdlib and made publicly available as part of the stdlib.

Note: Changing all existing sentinels in the stdlib to be implemented
this way is not deemed necessary, and whether to do so is left to the
discretion of the maintainers.

Motivation

In May 2021, a question was brought up on the python-dev mailing list
[1] about how to better implement a sentinel value for
traceback.print_exception. The existing implementation used the
following common idiom:

    _sentinel = object()

However, this object has an uninformative and overly verbose repr,
causing the function's signature to be overly long and hard to read:

    >>> help(traceback.print_exception)
    Help on function print_exception in module traceback:

    print_exception(exc, /, value=<object object at
    0x000002825DF09650>, tb=<object object at 0x000002825DF09650>,
    limit=None, file=None, chain=True)
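
The effect is easy to reproduce. The sketch below uses a hypothetical
function, print_thing, which is not part of the stdlib, and renders its
signature with inspect.signature, which shows default values using their
repr:

```python
import inspect

_sentinel = object()


def print_thing(exc, /, value=_sentinel, tb=_sentinel, limit=None):
    """Hypothetical function; only its signature matters here."""


# Defaults are rendered with repr(), so the bare object() shows up twice.
signature_text = str(inspect.signature(print_thing))
print(signature_text)
```

The printed signature contains the same memory-address repr for both
value and tb, which tells the reader nothing about their meaning.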

Additionally, two other drawbacks of many existing sentinels were
brought up in the discussion:

1.  Some do not have a distinct type, hence it is impossible to define
    clear type signatures for functions with such sentinels as default
    values.
2.  They behave unexpectedly after being copied or unpickled, due to a
    separate instance being created and thus comparisons using is
    failing.

In the ensuing discussion, Victor Stinner supplied a list of currently
used sentinel values in the Python standard library[2]. This showed that
the need for sentinels is fairly common, that there are various
implementation methods used even within the stdlib, and that many of
these suffer from at least one of the three above drawbacks.

The discussion did not lead to any clear consensus on whether a standard
implementation method is needed or desirable, whether the drawbacks
mentioned are significant, nor which kind of implementation would be
good. The author of this PEP created an issue on bugs.python.org (now a
GitHub issue[3]) suggesting options for improvement, but that focused on
only a single problematic aspect of a few cases, and failed to gather
any support.

A poll[4] was created on discuss.python.org to get a clearer sense of
the community's opinions. After nearly two weeks, significant further
discussion, and 39 votes, the poll's results were not conclusive. 40%
had voted for "The status-quo is fine / there’s no need for consistency
in this", but most voters had voted for one or more standardized
solutions. Specifically, 37% of the voters chose "Consistent use of a
new, dedicated sentinel factory / class / meta-class, also made publicly
available in the stdlib".

With such mixed opinions, this PEP was created to facilitate making a
decision on the subject.

While working on this PEP, iterating on various options and
implementations and continuing discussions, the author has come to the
opinion that a simple, good implementation available in the standard
library would be worth having, both for use in the standard library
itself and elsewhere.

Rationale

The criteria guiding the chosen implementation were:

1.  Sentinel objects should behave as expected of a sentinel: when
    compared using the is operator, a sentinel should always be
    identical to itself and never to any other object.
2.  Creating a sentinel object should be a simple, straightforward
    one-liner.
3.  It should be simple to define as many distinct sentinel values as
    needed.
4.  The sentinel objects should have a clear and short repr.
5.  It should be possible to use clear type signatures for sentinels.
6.  The sentinel objects should behave correctly after copying and/or
    unpickling.
7.  Such sentinels should work when using CPython 3.x and PyPy3, and
    ideally also with other implementations of Python.
8.  As simple and straightforward as possible, in implementation and
    especially in use. Avoid this becoming one more special thing to
    learn when learning Python. It should be easy to find and use when
    needed, and obvious enough when reading code that one would normally
    not feel a need to look up its documentation.

With so many uses in the Python standard library[5], it would be useful
to have an implementation in the standard library, since the stdlib
cannot use implementations of sentinel objects available elsewhere (such
as the sentinels[6] or sentinel[7] PyPI packages).

After researching existing idioms and implementations, and going through
many different possible implementations, an implementation was written
which meets all of these criteria (see Reference Implementation).

Specification

A new Sentinel class will be added to a new sentinels module. Its
initializer will accept a single required argument, the name of the
sentinel object, and three optional arguments: the repr of the object,
its boolean value, and the name of its module:

    >>> from sentinels import Sentinel
    >>> NotGiven = Sentinel('NotGiven')
    >>> NotGiven
    <NotGiven>
    >>> MISSING = Sentinel('MISSING', repr='mymodule.MISSING')
    >>> MISSING
    mymodule.MISSING
    >>> MEGA = Sentinel('MEGA',
    ...                 repr='<MEGA>',
    ...                 bool_value=False,
    ...                 module_name='mymodule')
    >>> MEGA
    <MEGA>

Checking if a value is such a sentinel should be done using the is
operator, as is recommended for None. Equality checks using == will also
work as expected, returning True only when the object is compared with
itself. Identity checks such as if value is MISSING: should usually be
used rather than boolean checks such as if value: or if not value:.

Sentinel instances are truthy by default, unlike None. This parallels
the default for arbitrary classes, as well as the boolean value of
Ellipsis.
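
The difference between identity checks and boolean checks can be
sketched as follows. The _MissingType class is a hypothetical stand-in
for a sentinel type, not the proposed Sentinel class; like any plain
class instance, it is truthy:

```python
class _MissingType:
    """Stand-in for a sentinel type; truthy like any plain class instance."""

    def __repr__(self):
        return '<MISSING>'


MISSING = _MissingType()


def describe(value=MISSING):
    # Identity check: only the sentinel itself means "not given".
    if value is MISSING:
        return 'not given'
    return f'given {value!r}'


def describe_with_bool_check(value=MISSING):
    # Boolean check: misfires for falsy arguments and misses the truthy sentinel.
    if not value:
        return 'not given'
    return f'given {value!r}'


print(describe())                    # not given
print(describe(0))                   # given 0
print(describe_with_bool_check())    # given <MISSING>
print(describe_with_bool_check(0))   # not given
print(bool(MISSING), bool(Ellipsis), bool(None))  # True True False
```

The boolean check gets both cases wrong here, which is why identity
checks are the safer habit regardless of a sentinel's truthiness.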

The names of sentinels are unique within each module. When calling
Sentinel() in a module where a sentinel with that name was already
defined, the existing sentinel with that name will be returned.
Sentinels with the same name in different modules will be distinct from
each other.

Creating a copy of a sentinel object, such as by using copy.copy() or by
pickling and unpickling, will return the same object.
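
The per-module uniqueness and the copying behavior described above can
be illustrated with a small stand-in. NamedSentinel is hypothetical and
takes the module name explicitly; it is a sketch of the registry idea,
not the proposed class:

```python
import copy


class NamedSentinel:
    """Hypothetical stand-in: one instance per (module_name, name) pair."""

    _registry = {}

    def __new__(cls, name, module_name):
        key = (module_name, name)
        existing = cls._registry.get(key)
        if existing is not None:
            return existing
        instance = super().__new__(cls)
        instance._name = name
        instance._module_name = module_name
        return cls._registry.setdefault(key, instance)

    def __repr__(self):
        return f'<{self._name}>'

    def __reduce__(self):
        # Copying re-runs the constructor, which finds the registered object.
        return (self.__class__, (self._name, self._module_name))


first = NamedSentinel('MISSING', 'package.a')
again = NamedSentinel('MISSING', 'package.a')
other = NamedSentinel('MISSING', 'package.b')

print(first is again)                 # True
print(first is other)                 # False
print(copy.copy(first) is first)      # True
print(copy.deepcopy(first) is first)  # True
```

Pickling also goes through __reduce__, so under the same assumptions an
unpickled sentinel would resolve to the registered object as well.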

Type annotations for sentinel values should use
Literal[<sentinel_object>]. For example:

    def foo(value: int | Literal[MISSING] = MISSING) -> int:
        ...

The module_name optional argument should normally not need to be
supplied, as Sentinel() will usually be able to recognize the module in
which it was called. module_name should be supplied only in unusual
cases when this automatic recognition does not work as intended, such as
perhaps when using Jython or IronPython. This parallels the designs of
Enum and namedtuple. For more details, see PEP 435.

The Sentinel class may not be sub-classed, to avoid overly-clever uses
based on it, such as attempts to use it as a base for implementing
singletons. It is considered important that the addition of Sentinel to
the stdlib should add minimal complexity.
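
This PEP does not say how sub-classing would be prevented. One possible
approach, shown here only as an assumption, is to raise from
__init_subclass__. FinalSentinel is a hypothetical name:

```python
class FinalSentinel:
    """Hypothetical class that refuses to be sub-classed."""

    def __init_subclass__(cls, **kwargs):
        raise TypeError('FinalSentinel cannot be sub-classed')


try:
    class Derived(FinalSentinel):
        pass
except TypeError as error:
    blocked = True
    print(error)
else:
    blocked = False
```

The class statement for Derived fails at definition time, so no
subclass object is ever created.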

Ordering comparisons are undefined for sentinel objects.

Backwards Compatibility

While not breaking existing code, adding a new "sentinels" stdlib module
could cause some confusion with regard to existing modules named
"sentinels", and specifically with the "sentinels" package on PyPI.

The existing "sentinels" package on PyPI[8] appears to be abandoned,
with its latest release made in August 2016. Therefore, using this
name for a new stdlib module seems reasonable.

If and when this PEP is accepted, it may be worth verifying whether the
package has indeed been abandoned, and if so asking to transfer its
ownership to the CPython maintainers to reduce the potential for
confusion with the new stdlib module.

How to Teach This

The normal types of documentation of new stdlib modules and features,
namely doc-strings, module docs and a section in "What's New", should
suffice.

Security Implications

This proposal should have no security implications.

Reference Implementation

The reference implementation is found in a dedicated GitHub repo[9]. A
simplified version follows:

    import sys

    _registry = {}

    class Sentinel:
        """Unique sentinel values."""

        def __new__(cls, name, repr=None, bool_value=True, module_name=None):
            name = str(name)
            repr = str(repr) if repr else f'<{name.split(".")[-1]}>'
            bool_value = bool(bool_value)
            if module_name is None:
                try:
                    module_name = \
                        sys._getframe(1).f_globals.get('__name__', '__main__')
                except (AttributeError, ValueError):
                    module_name = __name__

            registry_key = f'{module_name}-{name}'

            sentinel = _registry.get(registry_key, None)
            if sentinel is not None:
                return sentinel

            sentinel = super().__new__(cls)
            sentinel._name = name
            sentinel._repr = repr
            sentinel._bool_value = bool_value
            sentinel._module_name = module_name

            return _registry.setdefault(registry_key, sentinel)

        def __repr__(self):
            return self._repr

        def __bool__(self):
            return self._bool_value

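        def __reduce__(self):
            # Arguments must follow the positional order of __new__,
            # so bool_value goes between repr and module_name.
            return (
                self.__class__,
                (
                    self._name,
                    self._repr,
                    self._bool_value,
                    self._module_name,
                ),
            )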

Rejected Ideas

Use NotGiven = object()

This suffers from all of the drawbacks mentioned in the Motivation
section.

Add a single new sentinel value, such as MISSING or Sentinel

Since such a value could be used for various things in various places,
one could not always be confident that it would never be a valid value
in some use cases. On the other hand, a dedicated and distinct sentinel
value can be used with confidence without needing to consider potential
edge-cases.

Additionally, it is useful to be able to provide a meaningful name and
repr for a sentinel value, specific to the context where it is used.

Finally, this was a very unpopular option in the poll[10], with only 12%
of the voters choosing it.

Use the existing Ellipsis sentinel value

This is not the original intended use of Ellipsis, though it has become
increasingly common to use it to define empty class or function blocks
instead of using pass.

Also, similar to a potential new single sentinel value, Ellipsis can't
be as confidently used in all cases, unlike a dedicated, distinct value.

Use a single-valued enum

The suggested idiom is:

    class NotGivenType(Enum):
        NotGiven = 'NotGiven'
    NotGiven = NotGivenType.NotGiven

Besides the excessive repetition, the repr is overly long:
<NotGivenType.NotGiven: 'NotGiven'>. A shorter repr can be defined, at
the expense of a bit more code and yet more repetition.

Finally, this option was the least popular among the nine options in the
poll[11], being the only option to receive no votes.
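
For reference, the enum idiom with a shorter repr might look like the
sketch below. The custom __repr__ is the extra code referred to above;
the names follow the idiom as suggested:

```python
import copy
from enum import Enum


class NotGivenType(Enum):
    NotGiven = 'NotGiven'

    def __repr__(self):
        # Replaces the default <NotGivenType.NotGiven: 'NotGiven'>.
        return f'<{self.name}>'


NotGiven = NotGivenType.NotGiven

print(repr(NotGiven))                  # <NotGiven>
print(copy.copy(NotGiven) is NotGiven)  # True
```

Enum members do survive copying, but the name NotGiven now appears four
times in the definition.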

A sentinel class decorator

The suggested idiom is:

    @sentinel(repr='<NotGiven>')
    class NotGivenType: pass
    NotGiven = NotGivenType()

While this allows for a very simple and clear implementation of the
decorator, the idiom is too verbose, repetitive, and difficult to
remember.
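
A decorator supporting this idiom could be as small as the following
sketch. It is an assumption about what such a decorator might look like,
not code taken from the discussion:

```python
def sentinel(repr):
    """Hypothetical class decorator that installs a fixed repr."""
    repr_text = str(repr)

    def decorator(cls):
        cls.__repr__ = lambda self: repr_text
        return cls

    return decorator


@sentinel(repr='<NotGiven>')
class NotGivenType:
    pass


NotGiven = NotGivenType()
print(NotGiven)  # <NotGiven>
```

The decorator itself is short, but each sentinel still takes three
lines and repeats its name several times.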

Using class objects

Since classes are inherently singletons, using a class as a sentinel
value makes sense and allows for a simple implementation.

The simplest version of this is:

    class NotGiven: pass

To have a clear repr, one would need to use a meta-class:

    class NotGiven(metaclass=SentinelMeta): pass

... or a class decorator:

    @Sentinel
    class NotGiven: pass

Using classes this way is unusual and could be confusing. The intention
of code would be hard to understand without comments. It would also
cause such sentinels to have some unexpected and undesirable behavior,
such as being callable.
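
A possible SentinelMeta, sketched here as an assumption since no
implementation is given above, only needs to override the repr of the
class object:

```python
import copy


class SentinelMeta(type):
    """Hypothetical metaclass giving class objects a short repr."""

    def __repr__(cls):
        return f'<{cls.__name__}>'


class NotGiven(metaclass=SentinelMeta):
    pass


print(repr(NotGiven))                       # <NotGiven>
print(copy.deepcopy(NotGiven) is NotGiven)  # True
print(callable(NotGiven))                   # True

# Calling the "sentinel" silently creates an unrelated instance.
accidental = NotGiven()
print(accidental is NotGiven)               # False
```

The last lines show the undesirable behavior mentioned above: the
sentinel can be called, and the result is a different object.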

Define a recommended "standard" idiom, without supplying an implementation

Most common existing idioms have significant drawbacks. So far, no idiom
has been found that is clear and concise while avoiding these drawbacks.

Also, in the poll[12] on this subject, the options for recommending an
idiom were unpopular, with the highest-voted option chosen by only 25%
of the voters.

Additional Notes

-   This PEP and the initial implementation are drafted in a dedicated
    GitHub repo[13].

-   For sentinels defined in a class scope, to avoid potential name
    clashes, one should use the fully-qualified name of the variable in
    the module. Only the part of the name after the last period will be
    used for the default repr. For example:

        >>> class MyClass:
        ...    NotGiven = Sentinel('MyClass.NotGiven')
        >>> MyClass.NotGiven
        <NotGiven>

-   One should be careful when creating sentinels in a function or
    method, since sentinels with the same name created by code in the
    same module will be identical. If distinct sentinel objects are
    needed, make sure to use distinct names.

-   There is no single desirable value for the "truthiness" of
    sentinels, i.e. their boolean value. It is sometimes useful for the
    boolean value to be True, and sometimes False. Of the built-in
    sentinels in Python, None evaluates to False, while Ellipsis (a.k.a.
    ...) evaluates to True. The desire for this to be set as needed came
    up in discussions as well.

-   The boolean value of NotImplemented is True, but using this is
    deprecated since Python 3.9 (doing so generates a deprecation
    warning). This deprecation is due to issues specific to
    NotImplemented, as described in bpo-35712[14].

-   To define multiple, related sentinel values, possibly with a defined
    ordering among them, one should instead use Enum or something
    similar.

-   There was a discussion on the typing-sig mailing list[15] about the
    typing for these sentinels, where different options were discussed.

Open Issues

-   Is adding a new stdlib module the right way to go? I could not find
    any existing module which seems like a logical place for this.
    However, adding new stdlib modules should be done judiciously, so
    perhaps choosing an existing module would be preferable even if it
    is not a perfect fit?

Footnotes

[1] Python-Dev mailing list: "The repr of a sentinel"

[2] Python-Dev mailing list: "The stdlib contains tons of sentinels"

[3] bpo-44123: Make function parameter sentinel values true singletons

[4] discuss.python.org Poll: Sentinel Values in the Stdlib

[5] Python-Dev mailing list: "The stdlib contains tons of sentinels"

[6] The "sentinels" package on PyPI

[7] The "sentinel" package on PyPI

[8] The "sentinels" package on PyPI

[9] Reference implementation at the taleinat/python-stdlib-sentinels
GitHub repo

[10] discuss.python.org Poll: Sentinel Values in the Stdlib

[11] discuss.python.org Poll: Sentinel Values in the Stdlib

[12] discuss.python.org Poll: Sentinel Values in the Stdlib

[13] Reference implementation at the taleinat/python-stdlib-sentinels
GitHub repo

[14] bpo-35712: Make NotImplemented unusable in boolean context

[15] Discussion thread about type signatures for these sentinels on the
typing-sig mailing list

Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.