PEP: 505
Title: None-aware operators
Author: Mark E. Haase <mehaase@gmail.com>, Steve Dower <steve.dower@python.org>
Status: Deferred
Type: Standards Track
Content-Type: text/x-rst
Created: 18-Sep-2015
Python-Version: 3.8

Abstract

Several modern programming languages have so-called "null-coalescing" or
"null-aware" operators, including C#[1], Dart[2], Perl, Swift, and PHP
(starting in version 7). There are also stage 3 draft proposals for
their addition to ECMAScript (a.k.a. JavaScript)[3][4]. These operators
provide syntactic sugar for common patterns involving null references.

-   The "null-coalescing" operator is a binary operator that returns its
    left operand if it is not null. Otherwise it returns its right
    operand.
-   The "null-aware member access" operator accesses an instance member
    only if that instance is non-null. Otherwise it returns null. (This
    is also called a "safe navigation" operator.)
-   The "null-aware index access" operator accesses an element of a
    collection only if that collection is non-null. Otherwise it returns
    null. (This is another type of "safe navigation" operator.)

This PEP proposes three None-aware operators for Python, based on the
definitions above and on other languages' implementations of them.
Specifically:

-   The "None coalescing" binary operator ?? returns the left hand side
    if it evaluates to a value that is not None, or else it evaluates
    and returns the right hand side. A coalescing ??= augmented
    assignment operator is included.
-   The "None-aware attribute access" operator ?. ("maybe dot")
    evaluates the complete expression if the left hand side evaluates to
    a value that is not None.
-   The "None-aware indexing" operator ?[] ("maybe subscript") evaluates
    the complete expression if the left hand side evaluates to a value
    that is not None.
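
As a rough illustration only, the three operators can be approximated in
current Python with explicit `is None` tests. The helper names coalesce,
maybe_attr and maybe_item are hypothetical and are not part of this
proposal. The right operand is passed as a zero-argument callable so that
it is evaluated only when needed. Unlike the proposed operators, these
helpers guard a single access and do not short-circuit the rest of a
trailer chain.

```python
# Hypothetical helpers that approximate the proposed operators in current Python.

def coalesce(left, right_thunk):
    """Approximates `left ?? right`; right_thunk is called only if left is None."""
    return left if left is not None else right_thunk()

def maybe_attr(obj, name):
    """Approximates `obj?.name` for a single attribute access."""
    return None if obj is None else getattr(obj, name)

def maybe_item(obj, key):
    """Approximates `obj?[key]` for a single subscript."""
    return None if obj is None else obj[key]

# False values other than None still count as "having a value".
zero_or_default = coalesce(0, lambda: 42)
none_or_default = coalesce(None, lambda: 42)
real_part = maybe_attr(3 + 4j, "real")
missing_attr = maybe_attr(None, "real")
first_item = maybe_item([10, 20], 0)
missing_item = maybe_item(None, 0)
```

Here zero_or_default keeps the value 0, whereas an `or` expression would
have replaced it with 42.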

See the Grammar changes section for specifics and examples of the
required grammar changes.

See the Examples section for more realistic examples of code that could
be updated to use the new operators.

Syntax and Semantics

Specialness of None

The None object denotes the lack of a value. For the purposes of these
operators, the lack of a value indicates that the remainder of the
expression also lacks a value and should not be evaluated.

A rejected proposal was to treat any value that evaluates as "false" in
a Boolean context as not having a value. However, the purpose of these
operators is to propagate the "lack of value" state, rather than the
"false" state.

Some argue that this makes None special. We contend that None is already
special, and that using it as both the test and the result of these
operators does not change the existing semantics in any way.

See the Rejected Ideas section for discussions on alternate approaches.

Grammar changes

The following rules of the Python grammar are updated to read:

    augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
                '<<=' | '>>=' | '**=' | '//=' | '??=')

    power: coalesce ['**' factor]
    coalesce: atom_expr ['??' factor]
    atom_expr: ['await'] atom trailer*
    trailer: ('(' [arglist] ')' |
              '[' subscriptlist ']' |
              '?[' subscriptlist ']' |
              '.' NAME |
              '?.' NAME)

The coalesce rule

The coalesce rule provides the ?? binary operator. Unlike most binary
operators, the right-hand side is not evaluated until the left-hand side
is determined to be None.

The ?? operator binds more tightly than other binary operators, as most
existing implementations of these do not propagate None values (they
will typically raise TypeError). As a result, an expression that may
evaluate to None can be given a default value without needing
additional parentheses.

Some examples of how implicit parentheses are placed when evaluating
operator precedence in the presence of the ?? operator:

    a, b = None, None
    def c(): return None
    def ex(): raise Exception()

    (a ?? 2 ** b ?? 3) == a ?? (2 ** (b ?? 3))
    (a * b ?? c // d) == a * (b ?? c) // d
    (a ?? True and b ?? False) == (a ?? True) and (b ?? False)
    (c() ?? c() ?? True) == True
    (True ?? ex()) == True
    (c ?? ex)() == c()

Particularly for cases such as a ?? 2 ** b ?? 3, parenthesizing the
sub-expressions any other way would result in TypeError, as int.__pow__
cannot be called with None (and the fact that the ?? operator is used at
all implies that a or b may be None). However, as usual, while
parentheses are not required, they should be added if they improve
readability.
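
The TypeError claim can be checked in current Python by writing out the
None tests by hand. This sketch assumes a and b are both None, as in the
examples above, and spells the proposed grouping a ?? (2 ** (b ?? 3))
with conditional expressions. The intermediate variable names are
illustrative only.

```python
a, b = None, None

# Proposed grouping: a ?? (2 ** (b ?? 3)), written with explicit None checks.
exponent = b if b is not None else 3
proposed = a if a is not None else 2 ** exponent

# Alternative grouping: (a ?? 2) ** b, which applies ** to None when b is None.
base = a if a is not None else 2
try:
    alternative = base ** b
except TypeError:
    alternative = "TypeError"
```

With the proposed grouping the result is 8, while the alternative
grouping fails because int.__pow__ receives None.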

An augmented assignment for the ?? operator is also added. Augmented
coalescing assignment only rebinds the name if its current value is
None. If the target name already has a value, the right-hand side is not
evaluated. For example:

    a = None
    b = ''
    c = 0

    a ??= 'value'
    b ??= undefined_name
    c ??= shutil.rmtree('/')    # don't try this at home, kids

    assert a == 'value'
    assert b == ''
    assert c == 0 and any(os.scandir('/'))
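
For comparison, the same behaviour can be spelled in current Python with
an explicit if statement. This is a sketch only: expensive_default is a
hypothetical function that records each call, standing in for a costly
or side-effecting right-hand side.

```python
calls = []

def expensive_default():
    # Records that it ran, standing in for a costly or side-effecting call.
    calls.append("called")
    return "value"

a = None
b = ""

# Equivalent of `a ??= expensive_default()`
if a is None:
    a = expensive_default()

# Equivalent of `b ??= expensive_default()`; b is '' (not None), so nothing runs.
if b is None:
    b = expensive_default()
```

Only the first assignment calls expensive_default, because the empty
string is a value and is left untouched.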

The maybe-dot and maybe-subscript operators

The maybe-dot and maybe-subscript operators are added as trailers for
atoms, so that they may be used in all the same locations as the regular
operators, including as part of an assignment target (more details
below). As the existing evaluation rules are not directly embedded in
the grammar, we specify the required changes below.

Assume that the atom is always successfully evaluated. Each trailer is
then evaluated from left to right, applying its own parameter (either
its arguments, subscripts or attribute name) to produce the value for
the next trailer. Finally, if present, await is applied.

For example, await a.b(c).d[e] is currently parsed as
['await', 'a', '.b', '(c)', '.d', '[e]'] and evaluated:

    _v = a
    _v = _v.b
    _v = _v(c)
    _v = _v.d
    _v = _v[e]
    await _v

When a None-aware operator is present, the left-to-right evaluation may
be short-circuited. For example, await a?.b(c).d?[e] is evaluated:

    _v = a
    if _v is not None:
        _v = _v.b
        _v = _v(c)
        _v = _v.d
        if _v is not None:
            _v = _v[e]
    await _v
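
The expansion above can be exercised in current Python if the await is
left out. In this sketch the function name eval_chain and the
SimpleNamespace test objects are assumptions made for illustration; the
body of the function follows the transformation line for line.

```python
from types import SimpleNamespace

def eval_chain(a, c, e):
    """Hand expansion of `a?.b(c).d?[e]` (without the await)."""
    _v = a
    if _v is not None:
        _v = _v.b
        _v = _v(c)
        _v = _v.d
        if _v is not None:
            _v = _v[e]
    return _v

# b is a callable returning an object whose attribute d is a dict or None.
with_dict = SimpleNamespace(b=lambda c: SimpleNamespace(d={"key": c}))
with_none = SimpleNamespace(b=lambda c: SimpleNamespace(d=None))

full_result = eval_chain(with_dict, 5, "key")   # every step runs
inner_none = eval_chain(with_none, 5, "key")    # stops before the subscript
outer_none = eval_chain(None, 5, "key")         # stops before .b
```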

Note

await will almost certainly fail in this context, as it would in the
case where code attempts await None. We are not proposing to add a
None-aware await keyword here, and merely include it in this example for
completeness of the specification, since the atom_expr grammar rule
includes the keyword. If it were in its own rule, we would have never
mentioned it.

Parenthesised expressions are handled by the atom rule (not shown
above), which will implicitly terminate the short-circuiting behaviour
of the above transformation. For example, (a?.b ?? c).d?.e is evaluated
as:

    # a?.b
    _v = a
    if _v is not None:
        _v = _v.b

    # ... ?? c
    if _v is None:
        _v = c

    # (...).d?.e
    _v = _v.d
    if _v is not None:
        _v = _v.e
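
A runnable version of this expansion shows where the short-circuiting
ends. The function name eval_parenthesised and the test objects are
hypothetical. Note that the access to .d is unguarded, so it raises
AttributeError if both a?.b and c are None.

```python
from types import SimpleNamespace

def eval_parenthesised(a, c):
    """Hand expansion of `(a?.b ?? c).d?.e`."""
    _v = a
    if _v is not None:
        _v = _v.b
    if _v is None:
        _v = c
    _v = _v.d          # not guarded: the parentheses ended the short-circuit
    if _v is not None:
        _v = _v.e
    return _v

fallback = SimpleNamespace(d=SimpleNamespace(e="from c"))
used_fallback = eval_parenthesised(None, fallback)

try:
    eval_parenthesised(None, None)
    unguarded = "no error"
except AttributeError:
    unguarded = "AttributeError"
```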

When used as an assignment target, the None-aware operations may only be
used in a "load" context. That is, a?.b = 1 and a?[b] = 1 will raise
SyntaxError. Use earlier in the expression (a?.b.c = 1) is permitted,
though unlikely to be useful unless combined with a coalescing
operation:

    (a?.b ?? d).c = 1
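
Under the semantics described so far, that assignment would behave
roughly like the following current-Python code. The function name
assign_c and the SimpleNamespace objects are illustrative assumptions.

```python
from types import SimpleNamespace

def assign_c(a, d):
    """Hand expansion of `(a?.b ?? d).c = 1`."""
    _v = a
    if _v is not None:
        _v = _v.b
    if _v is None:
        _v = d
    _v.c = 1

inner = SimpleNamespace()
fallback = SimpleNamespace()

assign_c(SimpleNamespace(b=inner), fallback)   # a.b exists, so it receives c
inner_has_c = hasattr(inner, "c")
fallback_untouched = not hasattr(fallback, "c")

assign_c(None, fallback)                       # a is None, so d receives c
```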

Reading expressions

For the maybe-dot and maybe-subscript operators, the intention is that
expressions including these operators should be read and interpreted as
for the regular versions of these operators. In "normal" cases, the end
results are going to be identical between an expression such as a?.b?[c]
and a.b[c], and just as we do not currently read "a.b" as "read
attribute b from a if it has an attribute b or else it raises
AttributeError", there is no need to read "a?.b" as "read attribute b
from a if a is not None" (unless in a context where the listener needs
to be aware of the specific behaviour).

For coalescing expressions using the ?? operator, expressions should
either be read as "or ... if None" or "coalesced with". For example, the
expression a.get_value() ?? 100 would be read "call a dot get_value or
100 if None", or "call a dot get_value coalesced with 100".

Note

Reading code in spoken text is always lossy, and so we make no attempt
to define an unambiguous way of speaking these operators. These
suggestions are intended to add context to the implications of adding
the new syntax.

Examples

This section presents some examples of common None patterns and shows
what conversion to use None-aware operators may look like.

Standard Library

Using the find-pep505.py script[5], an analysis of the Python 3.7
standard library discovered up to 678 code snippets that could be
replaced with use of one of the None-aware operators:

    $ find /usr/lib/python3.7 -name '*.py' | xargs python3.7 find-pep505.py
    <snip>
    Total None-coalescing `if` blocks: 449
    Total [possible] None-coalescing `or`: 120
    Total None-coalescing ternaries: 27
    Total Safe navigation `and`: 13
    Total Safe navigation `if` blocks: 61
    Total Safe navigation ternaries: 8

Some of these are shown below as examples before and after converting to
use the new operators.

From bisect.py:

    def insort_right(a, x, lo=0, hi=None):
        # ...
        if hi is None:
            hi = len(a)
        # ...

After updating to use the ??= augmented assignment statement:

    def insort_right(a, x, lo=0, hi=None):
        # ...
        hi ??= len(a)
        # ...

From calendar.py:

    encoding = options.encoding
    if encoding is None:
        encoding = sys.getdefaultencoding()
    optdict = dict(encoding=encoding, css=options.css)

After updating to use the ?? operator:

    optdict = dict(encoding=options.encoding ?? sys.getdefaultencoding(),
                   css=options.css)

From email/generator.py (and importantly note that there is no way to
substitute or for ?? in this situation):

    mangle_from_ = True if policy is None else policy.mangle_from_

After updating:

    mangle_from_ = policy?.mangle_from_ ?? True
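
The reason or cannot be substituted here is that False is a legitimate
value for mangle_from_. The following sketch uses a SimpleNamespace as a
stand-in for a real policy object (an assumption made for illustration)
and compares the existing conditional expression with an or-based
rewrite.

```python
from types import SimpleNamespace

policy = SimpleNamespace(mangle_from_=False)  # stand-in for a real policy object

# Current standard-library form.
with_conditional = True if policy is None else policy.mangle_from_

# Tempting but wrong: `or` discards the legitimate value False.
with_or = (policy.mangle_from_ if policy is not None else None) or True
```

The conditional form preserves False, while the or form silently turns
it into True.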

From asyncio/subprocess.py:

    def pipe_data_received(self, fd, data):
        if fd == 1:
            reader = self.stdout
        elif fd == 2:
            reader = self.stderr
        else:
            reader = None
        if reader is not None:
            reader.feed_data(data)

After updating to use the ?. operator:

    def pipe_data_received(self, fd, data):
        if fd == 1:
            reader = self.stdout
        elif fd == 2:
            reader = self.stderr
        else:
            reader = None
        reader?.feed_data(data)

From asyncio/tasks.py:

    try:
        await waiter
    finally:
        if timeout_handle is not None:
            timeout_handle.cancel()

After updating to use the ?. operator:

    try:
        await waiter
    finally:
        timeout_handle?.cancel()

From ctypes/_aix.py:

    if libpaths is None:
        libpaths = []
    else:
        libpaths = libpaths.split(":")

After updating:

    libpaths = libpaths?.split(":") ?? []

From os.py:

    if entry.is_dir():
        dirs.append(name)
        if entries is not None:
            entries.append(entry)
    else:
        nondirs.append(name)

After updating to use the ?. operator:

    if entry.is_dir():
        dirs.append(name)
        entries?.append(entry)
    else:
        nondirs.append(name)

From importlib/abc.py:

    def find_module(self, fullname, path):
        if not hasattr(self, 'find_spec'):
            return None
        found = self.find_spec(fullname, path)
        return found.loader if found is not None else None

After partially updating:

    def find_module(self, fullname, path):
        if not hasattr(self, 'find_spec'):
            return None
        return self.find_spec(fullname, path)?.loader

After extensive updating (arguably excessive, though that's for the
style guides to determine):

    def find_module(self, fullname, path):
        return getattr(self, 'find_spec', None)?.__call__(fullname, path)?.loader

From dis.py:

    def _get_const_info(const_index, const_list):
        argval = const_index
        if const_list is not None:
            argval = const_list[const_index]
        return argval, repr(argval)

After updating to use the ?[] and ?? operators:

    def _get_const_info(const_index, const_list):
        argval = const_list?[const_index] ?? const_index
        return argval, repr(argval)

jsonify

This example is from a Python web crawler that uses the Flask framework
as its front-end. This function retrieves information about a web site
from a SQL database and formats it as JSON to send to an HTTP client:

    class SiteView(FlaskView):
        @route('/site/<id_>', methods=['GET'])
        def get_site(self, id_):
            site = db.query('site_table').find(id_)

            return jsonify(
                first_seen=site.first_seen.isoformat() if site.first_seen is not None else None,
                id=site.id,
                is_active=site.is_active,
                last_seen=site.last_seen.isoformat() if site.last_seen is not None else None,
                url=site.url.rstrip('/')
            )

Both first_seen and last_seen are allowed to be null in the database,
and they are also allowed to be null in the JSON response. JSON does not
have a native way to represent a datetime, so the server's contract
states that any non-null date is represented as an ISO-8601 string.

Without knowing the exact semantics of the first_seen and last_seen
attributes, it is impossible to know whether the attribute can be safely
or performantly accessed multiple times.
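
To make the concern concrete, the sketch below uses a hypothetical Site
class whose first_seen property counts how often it is read. The counter
stands in for a database query or other costly work and is not taken
from the crawler itself.

```python
import datetime

class Site:
    """Hypothetical model whose property counts each access."""
    def __init__(self):
        self.reads = 0

    @property
    def first_seen(self):
        self.reads += 1  # stands in for a database hit or other costly work
        return datetime.datetime(2015, 9, 18)

# Conditional expression: the attribute is read for the test and again for the call.
twice = Site()
text_twice = twice.first_seen.isoformat() if twice.first_seen is not None else None

# Temporary variable: the attribute is read once.
once = Site()
_dt = once.first_seen
text_once = None if _dt is None else _dt.isoformat()
```

Both forms produce the same string, but the conditional expression reads
the attribute twice whenever it is not None.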

One way to fix this code is to replace each conditional expression with
an explicit value assignment and a full if/else block:

    class SiteView(FlaskView):
        @route('/site/<id_>', methods=['GET'])
        def get_site(self, id_):
            site = db.query('site_table').find(id_)

            first_seen_dt = site.first_seen
            if first_seen_dt is None:
                first_seen = None
            else:
                first_seen = first_seen_dt.isoformat()

            last_seen_dt = site.last_seen
            if last_seen_dt is None:
                last_seen = None
            else:
                last_seen = last_seen_dt.isoformat()

            return jsonify(
                first_seen=first_seen,
                id=site.id,
                is_active=site.is_active,
                last_seen=last_seen,
                url=site.url.rstrip('/')
            )

This adds ten lines of code and four new code paths to the function,
dramatically increasing the apparent complexity. Rewriting using the
None-aware attribute operator results in shorter code with clearer
intent:

    class SiteView(FlaskView):
        @route('/site/<id_>', methods=['GET'])
        def get_site(self, id_):
            site = db.query('site_table').find(id_)

            return jsonify(
                first_seen=site.first_seen?.isoformat(),
                id=site.id,
                is_active=site.is_active,
                last_seen=site.last_seen?.isoformat(),
                url=site.url.rstrip('/')
            )

Grab

The next example is from a Python scraping library called Grab
<https://github.com/lorien/grab/blob/4c95b18dcb0fa88eeca81f5643c0ebfb114bf728/grab/upload.py>:

    class BaseUploadObject(object):
        def find_content_type(self, filename):
            ctype, encoding = mimetypes.guess_type(filename)
            if ctype is None:
                return 'application/octet-stream'
            else:
                return ctype

    class UploadContent(BaseUploadObject):
        def __init__(self, content, filename=None, content_type=None):
            self.content = content
            if filename is None:
                self.filename = self.get_random_filename()
            else:
                self.filename = filename
            if content_type is None:
                self.content_type = self.find_content_type(self.filename)
            else:
                self.content_type = content_type

    class UploadFile(BaseUploadObject):
        def __init__(self, path, filename=None, content_type=None):
            self.path = path
            if filename is None:
                self.filename = os.path.split(path)[1]
            else:
                self.filename = filename
            if content_type is None:
                self.content_type = self.find_content_type(self.filename)
            else:
                self.content_type = content_type

This code contains several good examples of needing to provide
default values. Rewriting to use conditional expressions reduces the
overall lines of code, but does not necessarily improve readability:

    class BaseUploadObject(object):
        def find_content_type(self, filename):
            ctype, encoding = mimetypes.guess_type(filename)
            return 'application/octet-stream' if ctype is None else ctype

    class UploadContent(BaseUploadObject):
        def __init__(self, content, filename=None, content_type=None):
            self.content = content
            self.filename = (self.get_random_filename() if filename
                is None else filename)
            self.content_type = (self.find_content_type(self.filename)
                if content_type is None else content_type)

    class UploadFile(BaseUploadObject):
        def __init__(self, path, filename=None, content_type=None):
            self.path = path
            self.filename = (os.path.split(path)[1] if filename is
                None else filename)
            self.content_type = (self.find_content_type(self.filename)
                if content_type is None else content_type)

The first ternary expression is tidy, but it reverses the intuitive
order of the operands: it should return ctype if it has a value and use
the string literal as fallback. The other ternary expressions are
unintuitive and so long that they must be wrapped. The overall
readability is worsened, not improved.

Rewriting using the None coalescing operator:

    class BaseUploadObject(object):
        def find_content_type(self, filename):
            ctype, encoding = mimetypes.guess_type(filename)
            return ctype ?? 'application/octet-stream'

    class UploadContent(BaseUploadObject):
        def __init__(self, content, filename=None, content_type=None):
            self.content = content
            self.filename = filename ?? self.get_random_filename()
            self.content_type = content_type ?? self.find_content_type(self.filename)

    class UploadFile(BaseUploadObject):
        def __init__(self, path, filename=None, content_type=None):
            self.path = path
            self.filename = filename ?? os.path.split(path)[1]
            self.content_type = content_type ?? self.find_content_type(self.filename)

This syntax has an intuitive ordering of the operands. In
find_content_type, for example, the preferred value ctype appears before
the fallback value. The terseness of the syntax also makes for fewer
lines of code and less code to visually parse, and reading from
left-to-right and top-to-bottom more accurately follows the execution
flow.

Rejected Ideas

The first three ideas in this section are oft-proposed alternatives to
treating None as special. For further background on why these are
rejected, see their treatment in PEP 531 and PEP 532 and the associated
discussions.

No-Value Protocol

The operators could be generalised to user-defined types by defining a
protocol to indicate when a value represents "no value". Such a protocol
may be a dunder method __has_value__(self) that returns True if the
value should be treated as having a value, and False if the value should
be treated as no value.

With this generalization, object would implement a dunder method
equivalent to this:

    def __has_value__(self):
        return True

NoneType would implement a dunder method equivalent to this:

    def __has_value__(self):
        return False

In the specification section, all uses of x is None would be replaced
with not x.__has_value__().
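
A minimal sketch of how such a protocol might behave is shown below.
Current Python cannot add methods to object or NoneType, so the
hypothetical helper has_value emulates the defaults described above, and
coalesce stands in for a protocol-based ?? operator. The Null class is
an illustrative domain-specific type, not one from any real package.

```python
def has_value(obj):
    """Hypothetical protocol lookup; objects without the method count as values."""
    if obj is None:
        return False  # what NoneType.__has_value__ would return
    method = getattr(type(obj), "__has_value__", None)
    return True if method is None else method(obj)

def coalesce(left, right_thunk):
    """Protocol-based version of `left ?? right`."""
    return left if has_value(left) else right_thunk()

class Null:
    """Illustrative domain-specific "no value" type."""
    def __has_value__(self):
        return False

null_result = coalesce(Null(), lambda: 123)
none_result = coalesce(None, lambda: 123)
zero_result = coalesce(0, lambda: 123)
```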

This generalization would allow for domain-specific "no-value" objects
to be coalesced just like None. For example, the pyasn1 package has a
type called Null that represents an ASN.1 null:

    >>> from pyasn1.type import univ
    >>> univ.Null() ?? univ.Integer(123)
    Integer(123)

Similarly, values such as math.nan and NotImplemented could be treated
as representing no value.

However, the "no-value" nature of these values is domain-specific, which
means they should be treated as a value by the language. For example,
math.nan.imag is well defined (it's 0.0), and so short-circuiting
math.nan?.imag to return math.nan would be incorrect.
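
The claim about math.nan.imag can be confirmed in current Python. The
variable names are illustrative.

```python
import math

nan_imag = math.nan.imag           # a float's imaginary part is always 0.0
nan_is_nan = math.isnan(math.nan)  # math.nan itself is still "not a number"
```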

As None is already defined by the language as being the value that
represents "no value", and the current specification would not preclude
switching to a protocol in the future (though changes to built-in
objects would not be compatible), this idea is rejected for now.

Boolean-aware operators

This suggestion is fundamentally the same as adding a no-value protocol,
and so the discussion above also applies.

Similar behavior to the ?? operator can be achieved with an or
expression, however or checks whether its left operand is false-y and
not specifically None. This approach is attractive, as it requires fewer
changes to the language, but ultimately does not solve the underlying
problem correctly.

Assuming the check is for truthiness rather than None, there is no
longer a need for the ?? operator. However, applying this check to the
?. and ?[] operators prevents perfectly valid operations from being
applied.

Consider the following example, where get_log_list() may return either a
list containing current log messages (potentially empty), or None if
logging is not enabled:

    lst = get_log_list()
    lst?.append('A log message')

If ?. is checking for true values rather than specifically None and the
log has not been initialized with any items, no item will ever be
appended. This violates the obvious intent of the code, which is to
append an item. The append method is available on an empty list, as are
all other list methods, and there is no reason to assume that these
members should not be used because the list is presently empty.

Further, there is no sensible result to use in place of the expression.
A normal lst.append returns None, but under this idea lst?.append may
result in either [] or None, depending on the value of lst. As with the
examples in the previous section, this makes no sense.
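
The difference can be demonstrated in current Python. In this sketch the
two helper functions are hypothetical hand expansions: one follows the
None test proposed by this PEP, the other the truthiness test discussed
here.

```python
def append_if_not_none(lst, item):
    """Hand expansion of `lst?.append(item)` as proposed."""
    return None if lst is None else lst.append(item)

def append_if_truthy(lst, item):
    """The rejected variant: skip the call for any false value."""
    return lst.append(item) if lst else lst

log_a = []
result_a = append_if_not_none(log_a, "A log message")

log_b = []
result_b = append_if_truthy(log_b, "A log message")
```

With the None test the message is appended to the empty list. With the
truthiness test the list stays empty and the expression evaluates to the
list itself rather than None.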

As checking for truthiness rather than None results in apparently valid
expressions no longer executing as intended, this idea is rejected.

Exception-aware operators

Arguably, the reason to short-circuit an expression when None is
encountered is to avoid the AttributeError or TypeError that would be
raised under normal circumstances. As an alternative to testing for
None, the ?. and ?[] operators could instead handle AttributeError and
TypeError raised by the operation and skip the remainder of the
expression.

This produces a transformation for a?.b.c?.d.e similar to this:

    _v = a
    try:
        _v = _v.b
    except AttributeError:
        pass
    else:
        _v = _v.c
        try:
            _v = _v.d
        except AttributeError:
            pass
        else:
            _v = _v.e

One open question is which value should be returned as the expression
when an exception is handled. The above example simply leaves the
partial result, but this is not helpful for replacing with a default
value. An alternative would be to force the result to None, which then
raises the question as to why None is special enough to be the result
but not special enough to be the test.

Secondly, this approach masks errors within code executed implicitly as
part of the expression. For ?., any AttributeError within a property or
__getattr__ implementation would be hidden, and similarly for ?[] and
__getitem__ implementations.

Similarly, simple typing errors such as {}?.ietms() could go unnoticed.
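
Both failure modes can be reproduced in current Python. The sketch below
assumes the variant in which a handled exception forces the result to
None. The helper exception_aware_attr and the Config class are
hypothetical.

```python
def exception_aware_attr(obj, name):
    """The rejected variant of `obj?.name`: swallow AttributeError, yield None."""
    try:
        return getattr(obj, name)
    except AttributeError:
        return None

class Config:
    @property
    def timeout(self):
        # Bug inside the property: `self.settings` was never defined.
        return self.settings["timeout"]

hidden_bug = exception_aware_attr(Config(), "timeout")
hidden_typo = exception_aware_attr({}, "ietms")
```

In both cases the caller receives None and gets no indication that the
property is broken or that the attribute name is misspelled.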

Existing conventions for handling these kinds of errors in the form of
the getattr builtin and the .get(key, default) method pattern
established by dict show that it is already possible to explicitly use
this behaviour.

As this approach would hide errors in code, it is rejected.

None-aware Function Call

The None-aware syntax applies to attribute and index access, so it seems
natural to ask if it should also apply to function invocation syntax. It
might be written as foo?(), where foo is only called if it is not None.

This has been deferred on the basis of the proposed operators being
intended to aid traversal of partially populated hierarchical data
structures, not for traversal of arbitrary class hierarchies. This is
reflected in the fact that none of the other mainstream languages that
already offer this syntax have found it worthwhile to support a similar
syntax for optional function invocations.

A workaround similar to that used by C# would be to write
maybe_none?.__call__(arguments). If the callable is None, the expression
will not be evaluated. (The C# equivalent uses ?.Invoke() on its
callable type.)

? Unary Postfix Operator

To generalize the None-aware behavior and limit the number of new
operators introduced, a unary, postfix operator spelled ? was suggested.
The idea is that ? might return a special object that would override
dunder methods to return self. For example, foo? would evaluate to foo
if it is not None, otherwise it would evaluate to an instance of
NoneQuestion:

    class NoneQuestion():
        def __call__(self, *args, **kwargs):
            return self

        def __getattr__(self, name):
            return self

        def __getitem__(self, key):
            return self

With this new operator and new type, an expression like foo?.bar[baz]
evaluates to NoneQuestion if foo is None. This is a nifty
generalization, but it's difficult to use in practice since most
existing code won't know what NoneQuestion is.

Going back to one of the motivating examples above, consider the
following:

    >>> import json
    >>> created = None
    >>> json.dumps({'created': created?.isoformat()})

The JSON serializer does not know how to serialize NoneQuestion, nor
will any other API. This proposal actually requires lots of specialized
logic throughout the standard library and any third party library.
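
The serialization problem can be reproduced in current Python. In this
sketch the function question is a hypothetical stand-in for the postfix
? operator, and NoneQuestion is the class shown above.

```python
import json

class NoneQuestion:
    def __call__(self, *args, **kwargs):
        return self

    def __getattr__(self, name):
        return self

    def __getitem__(self, key):
        return self

def question(value):
    """Stand-in for the suggested postfix `value?`."""
    return NoneQuestion() if value is None else value

created = None
chained = question(created).isoformat()  # still a NoneQuestion, not None

try:
    json.dumps({"created": chained})
    outcome = "serialized"
except TypeError:
    outcome = "TypeError"
```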

At the same time, the ? operator may also be too general, in the sense
that it can be combined with any other operator. What should the
following expressions mean?:

    >>> x? + 1
    >>> x? -= 1
    >>> x? == 1
    >>> ~x?

This degree of generalization is not useful. The operators actually
proposed herein are intentionally limited to a few operators that are
expected to make it easier to write common code patterns.

Built-in maybe

Haskell has a concept called Maybe that encapsulates the idea of an
optional value without relying on any special keyword (e.g. null) or any
special instance (e.g. None). In Haskell, the purpose of Maybe is to
avoid separate handling of "something" and "nothing".

A Python package called pymaybe provides a rough approximation. The
documentation shows the following example:

    >>> maybe('VALUE').lower()
    'value'

    >>> maybe(None).invalid().method().or_else('unknown')
    'unknown'

The function maybe() returns either a Something instance or a Nothing
instance. Similar to the unary postfix operator described in the
previous section, Nothing overrides dunder methods in order to allow
chaining on a missing value.

Note that or_else() is eventually required to retrieve the underlying
value from pymaybe's wrappers. Furthermore, pymaybe does not short
circuit any evaluation. Although pymaybe has some strengths and may be
useful in its own right, it also demonstrates why a pure Python
implementation of coalescing is not nearly as powerful as support built
into the language.
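
The lack of short-circuiting applies to any wrapper-based approach. The
sketch below uses a minimal Nothing class written for this illustration;
it is not pymaybe's actual implementation, and the function argument is
a hypothetical helper that records when it is evaluated.

```python
evaluated = []

def argument():
    # Records that the argument expression was evaluated.
    evaluated.append("argument")
    return 1

class Nothing:
    """Minimal illustrative wrapper; not pymaybe's actual implementation."""
    def __getattr__(self, name):
        return self

    def __call__(self, *args, **kwargs):
        return self

    def or_else(self, default):
        return default

# The argument is evaluated even though the call has no effect on Nothing.
result = Nothing().method(argument()).or_else("unknown")
```

Under the proposed ?. operator, the rest of the expression, including
the argument, would be skipped once the left-hand side is None.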

The idea of adding a builtin maybe type to enable this scenario is
rejected.

Just use a conditional expression

Another common way to initialize default values is to use the ternary
operator. Here is an excerpt from the popular Requests package
<https://github.com/kennethreitz/requests/blob/14a555ac716866678bf17e43e23230d81a8149f5/requests/models.py#L212>:

    data = [] if data is None else data
    files = [] if files is None else files
    headers = {} if headers is None else headers
    params = {} if params is None else params
    hooks = {} if hooks is None else hooks

This particular formulation has the undesirable effect of putting the
operands in an unintuitive order: the brain thinks, "use data if
possible and use [] as a fallback," but the code puts the fallback
before the preferred value.

The author of this package could have written it like this instead:

    data = data if data is not None else []
    files = files if files is not None else []
    headers = headers if headers is not None else {}
    params = params if params is not None else {}
    hooks = hooks if hooks is not None else {}

This ordering of the operands is more intuitive, but it requires 4 extra
characters (for "not "). It also highlights the repetition of
identifiers: data if data, files if files, etc.

When written using the None coalescing operator, the sample reads:

    data = data ?? []
    files = files ?? []
    headers = headers ?? {}
    params = params ?? {}
    hooks = hooks ?? {}

References

Copyright

This document has been placed in the public domain.

[1] C# Reference: Operators
(https://learn.microsoft.com/en/dotnet/csharp/language-reference/operators/)

[2] A Tour of the Dart Language: Operators
(https://www.dartlang.org/docs/dart-up-and-running/ch02.html#operators)

[3] Proposal: Nullish Coalescing for JavaScript
(https://github.com/tc39/proposal-nullish-coalescing)

[4] Proposal: Optional Chaining for JavaScript
(https://github.com/tc39/proposal-optional-chaining)

[5] Associated scripts
(https://github.com/python/peps/tree/master/pep-0505/)