PEP 769 – Add a ‘default’ keyword argument to ‘attrgetter’, ‘itemgetter’ and ‘getitem’

Author:
Facundo Batista <facundo at taniquetil.com.ar>
Discussions-To:
Discourse thread
Status:
Draft
Type:
Standards Track
Created:
22-Dec-2024
Python-Version:
3.14
Post-History:
07-Jan-2025

Abstract

This proposal aims to enhance the operator module by adding a default keyword argument to the attrgetter, itemgetter and getitem functions. This addition would allow these functions to return a specified default value when the targeted attribute or item is missing, thereby preventing exceptions and simplifying code that handles optional attributes or items.

Motivation

Currently, attrgetter and itemgetter raise exceptions if the specified attribute or item is absent. This limitation requires developers to implement additional error handling, leading to more complex and less readable code.

Introducing a default parameter would streamline operations involving optional attributes or items, reducing boilerplate code and enhancing code clarity.

A similar situation occurs with getitem, with the added nuance that allowing the specification of a default value in this case would resolve a longstanding asymmetry with the getattr() built-in function.
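The asymmetry can be seen with today's standard library alone. The following is a minimal sketch; the records data, the Point class and the email_or_none helper are made up for this illustration and are not part of the proposal:

```python
# Illustrative data; the record layout is an assumption for this example.
records = [{"name": "Ada", "email": "ada@example.org"}, {"name": "Bob"}]


class Point:
    x = 1


# Attribute access already accepts a fallback through the built-in getattr().
y = getattr(Point(), "y", 0)


# Generic item access has no such fallback, so a hand-written wrapper
# is needed to tolerate the missing "email" key.
def email_or_none(record):
    try:
        return record["email"]
    except KeyError:
        return None


emails = [email_or_none(record) for record in records]
```

With the proposed keyword, the wrapper could presumably be replaced by a single itemgetter("email", default=None).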

Rationale

The primary design decision is to introduce a single default parameter applicable to all specified attributes or items.

This approach maintains simplicity and avoids the complexity of assigning individual default values to multiple attributes or items. While some discussions considered allowing multiple defaults, the increased complexity and potential for confusion led to favoring a single default value for all cases (more about this below in Rejected Ideas).

Specification

Proposed behaviors:

  • attrgetter: f = attrgetter("name", default=XYZ) followed by f(obj) would return obj.name if the attribute exists, else XYZ.
  • itemgetter: f = itemgetter(2, default=XYZ) followed by f(obj) would return obj[2] if that is valid, else XYZ.
  • getitem: getitem(obj, k, XYZ) or getitem(obj, k, default=XYZ) would return obj[k] if that is valid, else XYZ.

In the first two cases, the enhancement applies to single and multiple attribute/item retrievals, with the default value returned for any missing attribute or item.

Behavior is unchanged in all three cases when the default argument is not supplied.
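The semantics listed above can be approximated in pure Python. This is only a sketch under assumptions: the names getitem_with_default and itemgetter_with_default are hypothetical stand-ins (not the operator functions themselves), and the private sentinel used to detect "no default supplied" is one possible choice, not something this PEP mandates:

```python
_MISSING = object()  # assumed sentinel meaning "no default was supplied"


def getitem_with_default(obj, key, default=_MISSING):
    """Hypothetical stand-in for the proposed getitem(obj, key, default)."""
    try:
        return obj[key]
    except (TypeError, IndexError, KeyError):
        if default is _MISSING:
            raise  # no default given: keep today's behavior
        return default


def itemgetter_with_default(*items, default=_MISSING):
    """Hypothetical stand-in for the proposed itemgetter(..., default=...)."""
    if not items:
        raise TypeError("expected at least one item")

    def getter(obj):
        values = tuple(getitem_with_default(obj, item, default) for item in items)
        # One item returns a bare value, several items return a tuple.
        return values[0] if len(items) == 1 else values

    return getter


obj = ["foo", "bar", "baz"]
single = itemgetter_with_default(5, default="XYZ")(obj)
several = itemgetter_with_default(1, 5, default="XYZ")(obj)
```

Here the single default is applied independently to each missing item, so only the out-of-range index 5 is replaced in the tuple result.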

Examples for attrgetter

The current behavior is unchanged:

>>> class C:
...   class D:
...     class X:
...       pass
...   class E:
...     pass
...
>>> attrgetter("D")(C)
<class '__main__.C.D'>
>>> attrgetter("badname")(C)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'C' has no attribute 'badname'
>>> attrgetter("D", "E")(C)
(<class '__main__.C.D'>, <class '__main__.C.E'>)
>>> attrgetter("D", "badname")(C)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'C' has no attribute 'badname'
>>> attrgetter("D.X")(C)
<class '__main__.C.D.X'>
>>> attrgetter("D.badname")(C)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'D' has no attribute 'badname'

With this PEP, using the proposed default keyword:

>>> attrgetter("D", default="noclass")(C)
<class '__main__.C.D'>
>>> attrgetter("badname", default="noclass")(C)
'noclass'
>>> attrgetter("D", "E", default="noclass")(C)
(<class '__main__.C.D'>, <class '__main__.C.E'>)
>>> attrgetter("D", "badname", default="noclass")(C)
(<class '__main__.C.D'>, 'noclass')
>>> attrgetter("D.X", default="noclass")(C)
<class '__main__.C.D.X'>
>>> attrgetter("D.badname", default="noclass")(C)
'noclass'

Examples for itemgetter

The current behavior is unchanged:

>>> obj = ["foo", "bar", "baz"]
>>> itemgetter(1)(obj)
'bar'
>>> itemgetter(5)(obj)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> itemgetter(1, 0)(obj)
('bar', 'foo')
>>> itemgetter(1, 5)(obj)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

With this PEP, using the proposed default keyword:

>>> itemgetter(1, default="XYZ")(obj)
'bar'
>>> itemgetter(5, default="XYZ")(obj)
'XYZ'
>>> itemgetter(1, 0, default="XYZ")(obj)
('bar', 'foo')
>>> itemgetter(1, 5, default="XYZ")(obj)
('bar', 'XYZ')

Examples for getitem

The current behavior is unchanged:

>>> obj = ["foo", "bar", "baz"]
>>> getitem(obj, 1)
'bar'
>>> getitem(obj, 5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

With this PEP, using the proposed extra default, positionally or with a keyword:

>>> getitem(obj, 1, "XYZ")
'bar'
>>> getitem(obj, 5, "XYZ")
'XYZ'
>>> getitem(obj, 1, default="XYZ")
'bar'
>>> getitem(obj, 5, default="XYZ")
'XYZ'

About Possible Implementations

Implementing this for attrgetter is straightforward: call getattr and catch a possible AttributeError. So attrgetter("name", default=XYZ)(obj) would be like:

try:
    value = getattr(obj, "name")
except AttributeError:
    value = XYZ

Note we cannot rely on using getattr with a default value: when an attribute chain is specified (e.g. attrgetter("foo.bar.baz", default=XYZ)), it would be impossible to tell at each step whether getattr returned a real attribute value or the default.
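To make the chain problem concrete, here is a small sketch comparing the two strategies. Both helper names (resolve_chain and resolve_chain_naive) are hypothetical and exist only for this illustration:

```python
def resolve_chain(obj, dotted_name, default):
    """Follow a dotted attribute chain; any missing step yields the default."""
    try:
        for name in dotted_name.split("."):
            obj = getattr(obj, name)
    except AttributeError:
        return default
    return obj


def resolve_chain_naive(obj, dotted_name, default):
    """Flawed variant: passes the default to getattr at every step."""
    for name in dotted_name.split("."):
        obj = getattr(obj, name, default)
    return obj


class C:
    class D:
        pass


good = resolve_chain(C, "badname.upper", "noclass")
# The naive version keeps walking the chain *on the default itself*:
# "noclass" is a str, so the next step finds str.upper.
naive = resolve_chain_naive(C, "badname.upper", "noclass")
```

The try/except version returns the default as soon as a step fails, while the naive version ends up returning a bound method of the default string.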

The implementation for itemgetter and getitem is less direct. The most straightforward approach, which is also simple to define and understand, is to attempt __getitem__ and catch a possible exception (any of the three indicated in the __getitem__ reference: TypeError, IndexError or KeyError). This way, itemgetter(123, default=XYZ)(obj) or getitem(obj, 123, default=XYZ) would be equivalent to:

try:
    value = obj[123]
except (TypeError, IndexError, KeyError):
    value = XYZ

However, for performance reasons the implementation may look more like the following, which has exactly the same behaviour:

if type(obj) is dict:
    value = obj.get(123, XYZ)
else:
    try:
        value = obj[123]
    except (TypeError, IndexError, KeyError):
        value = XYZ

Note that the check is on the exact type rather than using isinstance. This ensures identical behaviour, which would be impossible to guarantee if the object were a user-defined one that inherits from dict but overrides get (for a similar reason, the implementation does not check whether the object has a get method).

This way, performance is better, but it is just an implementation detail, so the behaviour can still be explained in terms of the simpler try/except version.
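The reason for the exact type check can be illustrated with a dict subclass. In this sketch the class ShoutingDict and both lookup helpers are hypothetical names invented for the example:

```python
class ShoutingDict(dict):
    """A dict subclass whose get() no longer mirrors __getitem__."""

    def get(self, key, default=None):
        return "overridden get"


def lookup_exact(obj, key, default):
    # Fast path only for plain dicts; subclasses use __getitem__.
    if type(obj) is dict:
        return obj.get(key, default)
    try:
        return obj[key]
    except (TypeError, IndexError, KeyError):
        return default


def lookup_isinstance(obj, key, default):
    # Flawed variant: trusts get() on any dict subclass.
    if isinstance(obj, dict):
        return obj.get(key, default)
    try:
        return obj[key]
    except (TypeError, IndexError, KeyError):
        return default


sd = ShoutingDict(a=1)
exact = lookup_exact(sd, "a", 0)
loose = lookup_isinstance(sd, "a", 0)
```

Only the exact-type variant agrees with sd["a"]; the isinstance variant returns whatever the overridden get decides.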

Corner Cases

The default only takes effect if accessing the item or attribute would fail in the normal case. In other words, the object being accessed should not handle defaults itself.

For example, the following would be redundant and confusing because defaultdict never raises when accessing a missing item:

>>> from collections import defaultdict
>>> from operator import itemgetter
>>> dd = defaultdict(int)
>>> itemgetter("foo", default=-1)(dd)
0

The same applies to any user-defined object that implements its own fallbacks in __getitem__ or __getattr__.
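Both corner cases can be reproduced without the proposed feature. In this sketch, lookup is a hypothetical stand-in for the proposed getitem with a default, and Forgiving is an invented class; getattr and defaultdict are the real built-in and standard library objects:

```python
from collections import defaultdict


def lookup(obj, key, default):
    """Hypothetical stand-in for the proposed getitem(obj, key, default)."""
    try:
        return obj[key]
    except (TypeError, IndexError, KeyError):
        return default


class Forgiving:
    """Invented class that answers every missing attribute itself."""

    def __getattr__(self, name):
        return f"<no {name}>"


dd = defaultdict(int)
item = lookup(dd, "foo", -1)   # defaultdict supplies int() == 0, not -1
inserted = "foo" in dd         # the lookup also created the key

# The built-in getattr default is bypassed in the same way.
attr = getattr(Forgiving(), "anything", "fallback")
```

In both cases the object's own fallback wins, so the supplied default is never used.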

Rejected Ideas

Multiple Default Values

The idea of allowing multiple default values for multiple attributes or items was considered.

Two alternatives were discussed: using an iterable that must have as many items as the parameters given to attrgetter/itemgetter, or using a dictionary with keys matching the names passed to attrgetter/itemgetter.

The hard problem here, which makes the feature difficult to explain and produces confusing corner cases, is what should happen when an iterable or dictionary is itself the desired default for all items. For example:

>>> itemgetter("a", default=(1, 2))({})
(1, 2)
>>> itemgetter("a", "b", default=(1, 2))({})
((1, 2), (1, 2))

If we allowed “multiple default values” through default, the first case in the example above would raise an exception because the default contains more items than there are names, and the second case would return (1, 2). This is why we considered the possibility of using a different name for multiple defaults (e.g. defaults, which is expressive but maybe error prone because it is too similar to default).
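The rejected interpretation can be sketched as follows, purely to show how the same two calls change meaning. The function itemgetter_multi is hypothetical, and raising ValueError on a length mismatch is an assumption, since the discussion did not settle on an exception type:

```python
def itemgetter_multi(*items, default):
    """Hypothetical: treat `default` as one fallback value per item."""
    defaults = tuple(default)
    if len(defaults) != len(items):
        # Assumed error type; the rejected idea never specified one.
        raise ValueError("need exactly one default per item")

    def getter(obj):
        values = []
        for item, fallback in zip(items, defaults):
            try:
                values.append(obj[item])
            except (TypeError, IndexError, KeyError):
                values.append(fallback)
        return values[0] if len(values) == 1 else tuple(values)

    return getter


two_items = itemgetter_multi("a", "b", default=(1, 2))({})

try:
    itemgetter_multi("a", default=(1, 2))
    one_item_error = None
except ValueError as exc:
    one_item_error = exc
```

Under this reading a caller who genuinely wants the tuple (1, 2) as the fallback for a single key has no way to say so, which is the ambiguity described above.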

Another proposal that would enable multiple defaults is to allow combinations of attrgetter and itemgetter, e.g.:

>>> ig_a = itemgetter("a", default=1)
>>> ig_b = itemgetter("b", default=2)
>>> ig_combined = itemgetter(ig_a, ig_b)
>>> ig_combined({"a": 999})
(999, 2)
>>> ig_combined({})
(1, 2)

However, combining itemgetter or attrgetter is totally new behavior and very complex to define. While not impossible, it is beyond the scope of this PEP.
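Code that needs per-item defaults can already compose small callables by hand. A minimal sketch, assuming two hypothetical helpers (get_or and combine) that are not part of the operator module or of this proposal:

```python
def get_or(key, default):
    """Hypothetical helper: build a getter for one key with its own default."""

    def getter(obj):
        try:
            return obj[key]
        except (TypeError, IndexError, KeyError):
            return default

    return getter


def combine(*getters):
    """Hypothetical helper: apply several getters and collect a tuple."""
    return lambda obj: tuple(getter(obj) for getter in getters)


combined = combine(get_or("a", 1), get_or("b", 2))
partial = combined({"a": 999})
empty = combined({})
```

This yields the same results as the ig_combined example above without changing how itemgetter treats callable arguments.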

In the end, having multiple default values was deemed overly complex and potentially confusing, and a single default parameter was favored for simplicity and predictability.

Tuple Return Consistency

Another rejected proposal was adding a flag to always return a tuple regardless of how many keys/names/indices were given. E.g.:

>>> letters = ["a", "b", "c"]
>>> itemgetter(1, return_tuple=True)(letters)
('b',)
>>> itemgetter(1, 2, return_tuple=True)(letters)
('b', 'c')

This would do little to make the handling of multiple default values consistent, would require further discussion, and is out of the scope of this PEP.

Open Issues

There are no open issues at this time.

How to Teach This

As the basic behavior is not modified, this new default can be avoided when teaching attrgetter and itemgetter for the first time. It can be introduced only when the functionality is needed.

Backwards Compatibility

The proposed changes are backward-compatible. The default parameter is optional; existing code without this parameter will function as before. Only code that explicitly uses the new default parameter will exhibit the new behavior, ensuring no disruption to current implementations.

Security Implications

Introducing a default parameter does not inherently introduce security vulnerabilities.


Source: https://github.com/python/peps/blob/main/peps/pep-0769.rst

Last modified: 2025-01-20 23:32:45 GMT