PEP 661 – Sentinel Values
- Author:
- Tal Einat <tal at python.org>
- Discussions-To:
- Discourse thread
- Status:
- Draft
- Type:
- Standards Track
- Created:
- 06-Jun-2021
- Post-History:
- 20-May-2021, 06-Jun-2021
Table of Contents
- Abstract
- Motivation
- Rationale
- Specification
- Backwards Compatibility
- How to Teach This
- Security Implications
- Reference Implementation
- Rejected Ideas
- Use
NotGiven = object()
- Add a single new sentinel value, such as
MISSING
orSentinel
- Use the existing
Ellipsis
sentinel value - Use a single-valued enum
- A sentinel class decorator
- Using class objects
- Define a recommended “standard” idiom, without supplying an implementation
- Allowing customization of repr
- Using
typing.Literal
in type annotations
- Use
- Additional Notes
- Open Issues
- Footnotes
- Copyright
TL;DR: See the Specification and Reference Implementation.
Abstract
Unique placeholder values, commonly known as “sentinel values”, are common in programming. They have many uses, such as for:
- Default values for function arguments, for when a value was not given:
def foo(value=None): ...
- Return values from functions when something is not found or unavailable:
>>> "abc".find("d") -1
- Missing data, such as NULL in relational databases or “N/A” (“not available”) in spreadsheets
Python has the special value None
, which is intended to be used as such
a sentinel value in most cases. However, sometimes an alternative sentinel
value is needed, usually when it needs to be distinct from None
since
None
is a valid value in that context. Such cases are common enough that
several idioms for implementing such sentinels have arisen over the years, but
uncommon enough that there hasn’t been a clear need for standardization.
However, the common implementations, including some in the stdlib, suffer from
several significant drawbacks.
This PEP proposes adding a utility for defining sentinel values, to be used in the stdlib and made publicly available as part of the stdlib.
Note: Changing all existing sentinels in the stdlib to be implemented this way is not deemed necessary, and whether to do so is left to the discretion of the maintainers.
Motivation
In May 2021, a question was brought up on the python-dev mailing list
[1] about how to better implement a sentinel value for
traceback.print_exception
. The existing implementation used the
following common idiom:
_sentinel = object()
However, this object has an uninformative and overly verbose repr, causing the function’s signature to be overly long and hard to read:
>>> help(traceback.print_exception)
Help on function print_exception in module traceback:
print_exception(exc, /, value=<object object at
0x000002825DF09650>, tb=<object object at 0x000002825DF09650>,
limit=None, file=None, chain=True)
Additionally, two other drawbacks of many existing sentinels were brought up in the discussion:
- Some do not have a distinct type, hence it is impossible to define clear type signatures for functions with such sentinels as default values.
- They behave unexpectedly after being copied or unpickled, due to a separate
instance being created and thus comparisons using
is
failing.
In the ensuing discussion, Victor Stinner supplied a list of currently used sentinel values in the Python standard library [2]. This showed that the need for sentinels is fairly common, that there are various implementation methods used even within the stdlib, and that many of these suffer from at least one of the three above drawbacks.
The discussion did not lead to any clear consensus on whether a standard implementation method is needed or desirable, whether the drawbacks mentioned are significant, nor which kind of implementation would be good. The author of this PEP created an issue on bugs.python.org (now a GitHub issue [3]) suggesting options for improvement, but that focused on only a single problematic aspect of a few cases, and failed to gather any support.
A poll [4] was created on discuss.python.org to get a clearer sense of the community’s opinions. After nearly two weeks, significant further, discussion, and 39 votes, the poll’s results were not conclusive. 40% had voted for “The status-quo is fine / there’s no need for consistency in this”, but most voters had voted for one or more standardized solutions. Specifically, 37% of the voters chose “Consistent use of a new, dedicated sentinel factory / class / meta-class, also made publicly available in the stdlib”.
With such mixed opinions, this PEP was created to facilitate making a decision on the subject.
While working on this PEP, iterating on various options and implementations and continuing discussions, the author has come to the opinion that a simple, good implementation available in the standard library would be worth having, both for use in the standard library itself and elsewhere.
Rationale
The criteria guiding the chosen implementation were:
- The sentinel objects should behave as expected by a sentinel object: When
compared using the
is
operator, it should always be considered identical to itself but never to any other object. - Creating a sentinel object should be a simple, straightforward one-liner.
- It should be simple to define as many distinct sentinel values as needed.
- The sentinel objects should have a clear and short repr.
- It should be possible to use clear type signatures for sentinels.
- The sentinel objects should behave correctly after copying and/or unpickling.
- Such sentinels should work when using CPython 3.x and PyPy3, and ideally also with other implementations of Python.
- As simple and straightforward as possible, in implementation and especially in use. Avoid this becoming one more special thing to learn when learning Python. It should be easy to find and use when needed, and obvious enough when reading code that one would normally not feel a need to look up its documentation.
With so many uses in the Python standard library [2], it would be useful to
have an implementation in the standard library, since the stdlib cannot use
implementations of sentinel objects available elsewhere (such as the
sentinels
[5] or sentinel
[6] PyPI packages).
After researching existing idioms and implementations, and going through many different possible implementations, an implementation was written which meets all of these criteria (see Reference Implementation).
Specification
A new Sentinel
class will be added to a new sentinellib
module.
>>> from sentinellib import Sentinel
>>> MISSING = Sentinel('MISSING')
>>> MISSING
MISSING
Checking if a value is such a sentinel should be done using the is
operator, as is recommended for None
. Equality checks using ==
will
also work as expected, returning True
only when the object is compared
with itself. Identity checks such as if value is MISSING:
should usually
be used rather than boolean checks such as if value:
or if not value:
.
Sentinel instances are “truthy”, i.e. boolean evaluation will result in
True
. This parallels the default for arbitrary classes, as well as the
boolean value of Ellipsis
. This is unlike None
, which is “falsy”.
The names of sentinels are unique within each module. When calling
Sentinel()
in a module where a sentinel with that name was already
defined, the existing sentinel with that name will be returned. Sentinels
with the same name defined in different modules will be distinct from each
other.
Creating a copy of a sentinel object, such as by using copy.copy()
or by
pickling and unpickling, will return the same object.
Sentinel()
will also accept a single optional argument, module_name
.
This should normally not need to be supplied, as Sentinel()
will usually
be able to recognize the module in which it was called. module_name
should be supplied only in unusual cases when this automatic recognition does
not work as intended, such as perhaps when using Jython or IronPython. This
parallels the designs of Enum
and namedtuple
. For more details, see
PEP 435.
The Sentinel
class may not be sub-classed, to avoid the greater complexity
of supporting subclassing.
Ordering comparisons are undefined for sentinel objects.
Typing
To make usage of sentinels clear and simple in typed Python code, we propose to amend the type system with a special case for sentinel objects.
Sentinel objects may be used in
type expressions, representing themselves.
This is similar to how None
is handled in the existing type system. For
example:
from sentinels import Sentinel
MISSING = Sentinel('MISSING')
def foo(value: int | MISSING = MISSING) -> int:
...
More formally, type checkers should recognize sentinel creations of the form
NAME = Sentinel('NAME')
as creating a new sentinel object. If the name
passed to the Sentinel
constructor does not match the name the object is
assigned to, type checkers should emit an error.
Sentinels defined using this syntax may be used in type expressions. They represent a fully static type that has a single member, the sentinel object itself.
Type checkers should support narrowing union types involving sentinels
using the is
and is not
operators:
from sentinels import Sentinel
from typing import assert_type
MISSING = Sentinel('MISSING')
def foo(value: int | MISSING) -> None:
if value is MISSING:
assert_type(value, MISSING)
else:
assert_type(value, int)
To support usage in type expressions, the runtime implementation
of the Sentinel
class should have the __or__
and __ror__
methods, returning typing.Union
objects.
Backwards Compatibility
This proposal should have no backwards compatibility implications.
How to Teach This
The normal types of documentation of new stdlib modules and features, namely doc-strings, module docs and a section in “What’s New”, should suffice.
Security Implications
This proposal should have no security implications.
Reference Implementation
The reference implementation is found in a dedicated GitHub repo [7]. A simplified version follows:
_registry = {}
class Sentinel:
"""Unique sentinel values."""
def __new__(cls, name, module_name=None):
name = str(name)
if module_name is None:
module_name = sys._getframemodulename(1)
if module_name is None:
module_name = __name__
registry_key = f'{module_name}-{name}'
sentinel = _registry.get(registry_key, None)
if sentinel is not None:
return sentinel
sentinel = super().__new__(cls)
sentinel._name = name
sentinel._module_name = module_name
return _registry.setdefault(registry_key, sentinel)
def __repr__(self):
return self._name
def __reduce__(self):
return (
self.__class__,
(
self._name,
self._module_name,
),
)
Rejected Ideas
Use NotGiven = object()
This suffers from all of the drawbacks mentioned in the Rationale section.
Add a single new sentinel value, such as MISSING
or Sentinel
Since such a value could be used for various things in various places, one could not always be confident that it would never be a valid value in some use cases. On the other hand, a dedicated and distinct sentinel value can be used with confidence without needing to consider potential edge-cases.
Additionally, it is useful to be able to provide a meaningful name and repr for a sentinel value, specific to the context where it is used.
Finally, this was a very unpopular option in the poll [4], with only 12% of the votes voting for it.
Use the existing Ellipsis
sentinel value
This is not the original intended use of Ellipsis, though it has become
increasingly common to use it to define empty class or function blocks instead
of using pass
.
Also, similar to a potential new single sentinel value, Ellipsis
can’t be
as confidently used in all cases, unlike a dedicated, distinct value.
Use a single-valued enum
The suggested idiom is:
class NotGivenType(Enum):
NotGiven = 'NotGiven'
NotGiven = NotGivenType.NotGiven
Besides the excessive repetition, the repr is overly long:
<NotGivenType.NotGiven: 'NotGiven'>
. A shorter repr can be defined, at
the expense of a bit more code and yet more repetition.
Finally, this option was the least popular among the nine options in the poll [4], being the only option to receive no votes.
A sentinel class decorator
The suggested idiom is:
@sentinel
class NotGivenType: pass
NotGiven = NotGivenType()
While this allows for a very simple and clear implementation of the decorator, the idiom is too verbose, repetitive, and difficult to remember.
Using class objects
Since classes are inherently singletons, using a class as a sentinel value makes sense and allows for a simple implementation.
The simplest version of this is:
class NotGiven: pass
To have a clear repr, one would need to use a meta-class:
class NotGiven(metaclass=SentinelMeta): pass
… or a class decorator:
@Sentinel
class NotGiven: pass
Using classes this way is unusual and could be confusing. The intention of code would be hard to understand without comments. It would also cause such sentinels to have some unexpected and undesirable behavior, such as being callable.
Define a recommended “standard” idiom, without supplying an implementation
Most common existing idioms have significant drawbacks. So far, no idiom has been found that is clear and concise while avoiding these drawbacks.
Also, in the poll [4] on this subject, the options for recommending an idiom were unpopular, with the highest-voted option being voted for by only 25% of the voters.
Allowing customization of repr
This was desirable to allow using this for existing sentinel values without changing their repr. However, this was eventually dropped as it wasn’t considered worth the added complexity.
Using typing.Literal
in type annotations
This was suggested by several people in discussions and is what this PEP
first went with. However, it was pointed out that this would cause potential
confusion, due to e.g. Literal["MISSING"]
referring to the string value
"MISSING"
rather than being a forward-reference to a sentinel value
MISSING
. Using the bare name was also suggested often in discussions.
This follows the precedent and well-known pattern set by None
, and has the
advantages of not requiring an import and being much shorter.
Additional Notes
- This PEP and the initial implementation are drafted in a dedicated GitHub repo [7].
- For sentinels defined in a class scope, to avoid potential name clashes,
one should use the fully-qualified name of the variable in the module. The
full name will be used as the repr. For example:
>>> class MyClass: ... NotGiven = sentinel('MyClass.NotGiven') >>> MyClass.NotGiven MyClass.NotGiven
- One should be careful when creating sentinels in a function or method, since sentinels with the same name created by code in the same module will be identical. If distinct sentinel objects are needed, make sure to use distinct names.
- There is no single desirable value for the “truthiness” of sentinels, i.e.
their boolean value. It is sometimes useful for the boolean value to be
True
, and sometimesFalse
. Of the built-in sentinels in Python,None
evaluates toFalse
, whileEllipsis
(a.k.a....
) evaluates toTrue
. The desire for this to be set as needed came up in discussions as well. - The boolean value of
NotImplemented
isTrue
, but using this is deprecated since Python 3.9 (doing so generates a deprecation warning.) This deprecation is due to issues specific toNotImplemented
, as described in bpo-35712 [8]. - To define multiple, related sentinel values, possibly with a defined
ordering among them, one should instead use
Enum
or something similar. - There was a discussion on the typing-sig mailing list [9] about the typing for these sentinels, where different options were discussed.
Open Issues
- Is adding a new stdlib module the right way to go? I could not find any existing module which seems like a logical place for this. However, adding new stdlib modules should be done judiciously, so perhaps choosing an existing module would be preferable even if it is not a perfect fit?
Footnotes
Copyright
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0661.rst
Last modified: 2025-02-01 08:55:40 GMT