Following system colour scheme Selected dark colour scheme Selected light colour scheme

Python Enhancement Proposals

PEP 769 – Add a ‘default’ keyword argument to ‘attrgetter’ and ‘itemgetter’

Author:
Facundo Batista <facundo at taniquetil.com.ar>
Status:
Draft
Type:
Standards Track
Created:
22-Dec-2024
Python-Version:
3.14

Table of Contents

Abstract

This proposal aims to enhance the operator module by adding a default keyword argument to the attrgetter and itemgetter functions. This addition would allow these functions to return a specified default value when the targeted attribute or item is missing, thereby preventing exceptions and simplifying code that handles optional attributes or items.

Motivation

Currently, attrgetter and itemgetter raise exceptions if the specified attribute or item is absent. This limitation requires developers to implement additional error handling, leading to more complex and less readable code.

Introducing a default parameter would streamline operations involving optional attributes or items, reducing boilerplate code and enhancing code clarity.

Rationale

The primary design decision is to introduce a single default parameter applicable to all specified attributes or items.

This approach maintains simplicity and avoids the complexity of assigning individual default values to multiple attributes or items. While some discussions considered allowing multiple defaults, the increased complexity and potential for confusion led to favoring a single default value for all cases (more about this below in Rejected Ideas).

Specification

Proposed behaviors:

  • attrgetter: f = attrgetter("name", default=XYZ) followed by f(obj) would return obj.name if the attribute exists, else XYZ.
  • itemgetter: f = itemgetter(2, default=XYZ) followed by f(obj) would return obj[2] if that is valid, else XYZ.

This enhancement applies to single and multiple attribute/item retrievals, with the default value returned for any missing attribute or item.

No functionality change is incorporated if default is not used.

Examples for attrgetter

The current behavior is unchanged:

>>> class C:
...   class D:
...     class X:
...       pass
...   class E:
...     pass
...
>>> attrgetter("D")(C)
<class '__main__.C.D'>
>>> attrgetter("badname")(C)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'C' has no attribute 'badname'
>>> attrgetter("D", "E")(C)
(<class '__main__.C.D'>, <class '__main__.C.E'>)
>>> attrgetter("D", "badname")(C)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'C' has no attribute 'badname'
>>> attrgetter("D.X")(C)
<class '__main__.C.D.X'>
>>> attrgetter("D.badname")(C)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'D' has no attribute 'badname'

With this PEP, using the proposed default keyword:

>>> attrgetter("D", default="noclass")(C)
<class '__main__.C.D'>
>>> attrgetter("badname", default="noclass")(C)
'noclass'
>>> attrgetter("D", "E", default="noclass")(C)
(<class '__main__.C.D'>, <class '__main__.C.E'>)
>>> attrgetter("D", "badname", default="noclass")(C)
(<class '__main__.C.D'>, 'noclass')
>>> attrgetter("D.X", default="noclass")(C)
<class '__main__.C.D.X'>
>>> attrgetter("D.badname", default="noclass")(C)
'noclass'

Examples for itemgetter

The current behavior is unchanged:

>>> obj = ["foo", "bar", "baz"]
>>> itemgetter(1)(obj)
'bar'
>>> itemgetter(5)(obj)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> itemgetter(1, 0)(obj)
('bar', 'foo')
>>> itemgetter(1, 5)(obj)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

With this PEP, using the proposed default keyword:

>>> itemgetter(1, default="XYZ")(obj)
'bar'
>>> itemgetter(5, default="XYZ")(obj)
'XYZ'
>>> itemgetter(1, 0, default="XYZ")(obj)
('bar', 'foo')
>>> itemgetter(1, 5, default="XYZ")(obj)
('bar', 'XYZ')

About Possible Implementations

The implementation of attrgetter is quite direct: it implies using getattr and catching a possible AttributeError. So attrgetter("name", default=XYZ)(obj) would be like:

try:
    value = getattr(obj, "name")
except (TypeError, IndexError, KeyError):
    value = XYZ

Note we cannot rely on using getattr with a default value, as it would be impossible to distinguish what it returned on each step when an attribute chain is specified (e.g. attrgetter("foo.bar.baz", default=XYZ)).

The implementation for itemgetter is not that easy. The more straightforward way is also simple to define and understand: attempting __getitem__ and catching a possible exception (any of the three indicated in __getitem__ reference). This way, itemgetter(123, default=XYZ)(obj) would be equivalent to:

try:
    value = obj[123]
except (TypeError, IndexError, KeyError):
    value = XYZ

However, this would be not as efficient as we’d want for certain cases, e.g. using dictionaries where better performance is possible. A more complex alternative would be:

if isinstance(obj, dict):
    value = obj.get(123, XYZ)
else:
    try:
        value = obj[123]
    except (TypeError, IndexError, KeyError):
        value = XYZ

While this provides better performance, it is more complicated to implement and explain. This is the first case in the Open Issues section later.

Corner Cases

Providing a default option would only work when accessing the item/attribute would fail in the normal case. In other words, the object accessed should not handle defaults itself.

For example, the following would be redundant/confusing because defaultdict will never error out when accessing the item:

>>> from collections import defaultdict
>>> from operator import itemgetter
>>> dd = defaultdict(int)
>>> itemgetter("foo", default=-1)(dd)
0

The same applies to any user defined object that overloads __getitem__ or __getattr__ implementing its own fallbacks.

Rejected Ideas

Multiple Default Values

The idea of allowing multiple default values for multiple attributes or items was considered.

Two alternatives were discussed, using an iterable that must have the same quantity of items as parameters given to attrgetter/itemgetter, or using a dictionary with keys matching those names passed to attrgetter/itemgetter.

The really complex thing to solve here (making the feature hard to explain and with confusing corner cases), is what would happen if an iterable or dictionary is the actual default desired for all items. For example:

>>> itemgetter("a", default=(1, 2))({})
(1, 2)
>>> itemgetter("a", "b", default=(1, 2))({})
((1, 2), (1, 2))

If we allow “multiple default values” using default, the first case in the example above would raise an exception because there are more items than names in the default, and the second case would return (1, 2)). This is why we considered the possibility of using a different name for multiple defaults (e.g. defaults, which is expressive but maybe error prone because it is too similar to default).

Another proposal that would enable multiple defaults, is allowing combinations of attrgetter and itemgetter, e.g.:

>>> ig_a = itemgetter("a", default=1)
>>> ig_b = itemgetter("b", default=2)
>>> ig_combined = itemgetter(ig_a, ig_b)
>>> ig_combined({"a": 999})
(999, 2)
>>> ig_combined({})
(1, 2)

However, combining itemgetter or attrgetter is totally new behavior and very complex to define. While not impossible, it is beyond the scope of this PEP.

In the end, having multiple default values was deemed overly complex and potentially confusing, and a single default parameter was favored for simplicity and predictability.

Tuple Return Consistency

Another rejected proposal was adding a flag to always return a tuple regardless of how many keys/names/indices were given. E.g.:

>>> letters = ["a", "b", "c"]
>>> itemgetter(1, return_tuple=True)(letters)
('b',)
>>> itemgetter(1, 2, return_tuple=True)(letters)
('b', 'c')

This would be of little help for multiple default values consistency, requiring further discussion, and is out of the scope of this PEP.

Open Issues

Behavior Equivalence for itemgetter

For itemgetter, should it just attempt to access the item and capture exceptions regardless of the object’s API, or should it first validate that the object provides a get method, and if so use it to retrieve the item with a default? See examples in the About Possible Implementations subsection above.

This would help performance for the case of dictionaries, but would make the default feature somewhat more difficult to explain, and a little confusing if some object that is not a dictionary but still provides a get method. Alternatively, we could call .get only if the object is an instance of dict.

In any case, it is desirable that we do not affect performance at all if the default is not triggered. Checking for .get would be faster for dicts, but implies doing a verification in all cases. Using the try/except model would make it less efficient as possible in the case of dictionaries, but only if the default is not triggered.

Add a Default to getitem

It was proposed that we could also enhance getitem, as part of this PEP, adding the default keyword to that function as well.

This will not only improve getitem itself, but we would also gain internal consistency in the operator module and in comparison with the getattr builtin function, which also has a default.

The definition could be as simple as the try/except proposed above, so doing getitem(obj, name, default) would be equivalent to:

try:
    result = obj[name]
except (TypeError, IndexError, KeyError):
    result = default

(However see previous open issue about special case for dictionaries.)

How to Teach This

As the basic behavior is not modified, this new default can be avoided when teaching attrgetter and itemgetter for the first time. It can be introduced only when the functionality is needed.

Backwards Compatibility

The proposed changes are backward-compatible. The default parameter is optional; existing code without this parameter will function as before. Only code that explicitly uses the new default parameter will exhibit the new behavior, ensuring no disruption to current implementations.

Security Implications

Introducing a default parameter does not inherently introduce security vulnerabilities.


Source: https://github.com/python/peps/blob/main/peps/pep-0769.rst

Last modified: 2025-01-09 16:24:54 GMT