PEP 769 – Add a ‘default’ keyword argument to ‘attrgetter’, ‘itemgetter’ and ‘getitem’
- Author:
- Facundo Batista <facundo at taniquetil.com.ar>
- Discussions-To:
- Discourse thread
- Status:
- Draft
- Type:
- Standards Track
- Created:
- 22-Dec-2024
- Python-Version:
- 3.14
- Post-History:
- 07-Jan-2025
Abstract
This proposal aims to enhance the operator
module by adding a
default
keyword argument to the attrgetter
, itemgetter
and
getitem
functions. This addition would allow these functions to return a
specified default value when the targeted attribute or item is missing,
thereby preventing exceptions and simplifying code that handles optional
attributes or items.
Motivation
Currently, attrgetter
and itemgetter
raise exceptions if the
specified attribute or item is absent. This limitation requires
developers to implement additional error handling, leading to more
complex and less readable code.
Introducing a default
parameter would streamline operations involving
optional attributes or items, reducing boilerplate code and enhancing
code clarity.
A similar situation occurs with getitem
, with the added nuance that
allowing the specification of a default value in this case would resolve
a longstanding asymmetry with the getattr()
built-in function.
Rationale
The primary design decision is to introduce a single default
parameter
applicable to all specified attributes or items.
This approach maintains simplicity and avoids the complexity of assigning individual default values to multiple attributes or items. While some discussions considered allowing multiple defaults, the increased complexity and potential for confusion led to favoring a single default value for all cases (more about this below in Rejected Ideas).
Specification
Proposed behaviors:
- attrgetter:
f = attrgetter("name", default=XYZ)
followed byf(obj)
would returnobj.name
if the attribute exists, elseXYZ
. - itemgetter:
f = itemgetter(2, default=XYZ)
followed byf(obj)
would returnobj[2]
if that is valid, elseXYZ
. - getitem:
getitem(obj, k, XYZ)
orgetitem(obj, k, default=XYZ)
would returnobj[k]
if that is valid, elseXYZ
.
In the first two cases, the enhancement applies to single and multiple attribute/item retrievals, with the default value returned for any missing attribute or item.
No functionality change is incorporated in any case if the extra default (keyword) argument is not used.
Examples for attrgetter
The current behavior is unchanged:
>>> class C:
... class D:
... class X:
... pass
... class E:
... pass
...
>>> attrgetter("D")(C)
<class '__main__.C.D'>
>>> attrgetter("badname")(C)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'C' has no attribute 'badname'
>>> attrgetter("D", "E")(C)
(<class '__main__.C.D'>, <class '__main__.C.E'>)
>>> attrgetter("D", "badname")(C)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'C' has no attribute 'badname'
>>> attrgetter("D.X")(C)
<class '__main__.C.D.X'>
>>> attrgetter("D.badname")(C)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'D' has no attribute 'badname'
With this PEP, using the proposed default
keyword:
>>> attrgetter("D", default="noclass")(C)
<class '__main__.C.D'>
>>> attrgetter("badname", default="noclass")(C)
'noclass'
>>> attrgetter("D", "E", default="noclass")(C)
(<class '__main__.C.D'>, <class '__main__.C.E'>)
>>> attrgetter("D", "badname", default="noclass")(C)
(<class '__main__.C.D'>, 'noclass')
>>> attrgetter("D.X", default="noclass")(C)
<class '__main__.C.D.X'>
>>> attrgetter("D.badname", default="noclass")(C)
'noclass'
Examples for itemgetter
The current behavior is unchanged:
>>> obj = ["foo", "bar", "baz"]
>>> itemgetter(1)(obj)
'bar'
>>> itemgetter(5)(obj)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> itemgetter(1, 0)(obj)
('bar', 'foo')
>>> itemgetter(1, 5)(obj)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
With this PEP, using the proposed default
keyword:
>>> itemgetter(1, default="XYZ")(obj)
'bar'
>>> itemgetter(5, default="XYZ")(obj)
'XYZ'
>>> itemgetter(1, 0, default="XYZ")(obj)
('bar', 'foo')
>>> itemgetter(1, 5, default="XYZ")(obj)
('bar', 'XYZ')
Examples for getitem
The current behavior is unchanged:
>>> obj = ["foo", "bar", "baz"]
>>> getitem(obj, 1)
'bar'
>>> getitem(obj, 5)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
With this PEP, using the proposed extra default, positionally or with a keyword:
>>> getitem(obj, 1, "XYZ")
'bar'
>>> getitem(obj, 5, "XYZ")
'XYZ'
>>> getitem(obj, 1, default="XYZ")
'bar'
>>> getitem(obj, 5, default="XYZ")
'XYZ'
About Possible Implementations
The implementation of attrgetter
is quite direct: it implies using
getattr
and catching a possible AttributeError
. So
attrgetter("name", default=XYZ)(obj)
would be like:
try:
value = getattr(obj, "name")
except (TypeError, IndexError, KeyError):
value = XYZ
Note we cannot rely on using getattr
with a default value, as it would
be impossible to distinguish what it returned on each step when an
attribute chain is specified (e.g.
attrgetter("foo.bar.baz", default=XYZ)
).
The implementation for itemgetter
and getitem
is not that
easy. The more straightforward way is also simple to define and
understand: attempting __getitem__
and catching a possible
exception (any of the three indicated in __getitem__
reference).
This way, itemgetter(123, default=XYZ)(obj)
or
getitem(obj, 123, default=XYZ)
would be equivalent to:
try:
value = obj[123]
except (TypeError, IndexError, KeyError):
value = XYZ
However, for performance reasons the implementation may look more like the following, which has the same exact behaviour:
if type(obj) == dict:
value = obj.get(123, XYZ)
else:
try:
value = obj[123]
except (TypeError, IndexError, KeyError):
value = XYZ
Note how the verification is about the exact type and not using
isinstance
; this is to ensure the exact behaviour, which would be
impossible if the object is a user defined one that inherits dict
but overwrites get
(similar reason to not check if the object has
a get
method).
This way, performance is better but it’s just an implementation detail, so we can keep the original explanation on how it behaves.
Corner Cases
Providing a default
option would only work if accessing the
item/attribute would fail in the normal case. In other words, the
object accessed should not handle defaults itself.
For example, the following would be redundant/confusing because
defaultdict
will never error out when accessing the item:
>>> from collections import defaultdict
>>> from operator import itemgetter
>>> dd = defaultdict(int)
>>> itemgetter("foo", default=-1)(dd)
0
The same applies to any user defined object that overloads __getitem__
or __getattr__
implementing its own fallbacks.
Rejected Ideas
Multiple Default Values
The idea of allowing multiple default values for multiple attributes or items was considered.
Two alternatives were discussed, using an iterable that must have the
same quantity of items as parameters given to
attrgetter
/itemgetter
, or using a dictionary with keys matching
those names passed to attrgetter
/itemgetter
.
The really complex thing to solve here (making the feature hard to explain and with confusing corner cases), is what would happen if an iterable or dictionary is the actual default desired for all items. For example:
>>> itemgetter("a", default=(1, 2))({})
(1, 2)
>>> itemgetter("a", "b", default=(1, 2))({})
((1, 2), (1, 2))
If we allow “multiple default values” using default
, the first case
in the example above would raise an exception because there are more items
than names in the default, and the second case would return (1, 2))
. This is
why we considered the possibility of using a different name for multiple
defaults (e.g. defaults
, which is expressive but maybe error prone because
it is too similar to default
).
Another proposal that would enable multiple defaults, is allowing
combinations of attrgetter
and itemgetter
, e.g.:
>>> ig_a = itemgetter("a", default=1)
>>> ig_b = itemgetter("b", default=2)
>>> ig_combined = itemgetter(ig_a, ig_b)
>>> ig_combined({"a": 999})
(999, 2)
>>> ig_combined({})
(1, 2)
However, combining itemgetter
or attrgetter
is totally new
behavior and very complex to define. While not impossible, it is beyond
the scope of this PEP.
In the end, having multiple default values was deemed overly complex and
potentially confusing, and a single default
parameter was favored for
simplicity and predictability.
Tuple Return Consistency
Another rejected proposal was adding a flag to always return a tuple regardless of how many keys/names/indices were given. E.g.:
>>> letters = ["a", "b", "c"]
>>> itemgetter(1, return_tuple=True)(letters)
('b',)
>>> itemgetter(1, 2, return_tuple=True)(letters)
('b', 'c')
This would be of little help for multiple default values consistency, requiring further discussion, and is out of the scope of this PEP.
Open Issues
There are no open issues at this time.
How to Teach This
As the basic behavior is not modified, this new default
can be
avoided when teaching attrgetter
and itemgetter
for the first
time. It can be introduced only when the functionality is needed.
Backwards Compatibility
The proposed changes are backward-compatible. The default
parameter
is optional; existing code without this parameter will function as
before. Only code that explicitly uses the new default
parameter will
exhibit the new behavior, ensuring no disruption to current
implementations.
Security Implications
Introducing a default
parameter does not inherently introduce
security vulnerabilities.
Copyright
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0769.rst
Last modified: 2025-01-20 23:32:45 GMT