PEP 213 – Attribute Access Handlers
- Author:
- Paul Prescod <paul at prescod.net>
- Status:
- Deferred
- Type:
- Standards Track
- Created:
- 21-Jul-2000
- Python-Version:
- 2.1
- Post-History:
Introduction
It is possible (and even relatively common) in Python code and in extension modules to “trap” when an instance’s client code attempts to set an attribute and execute code instead. In other words, it is possible to allow users to use attribute assignment/ retrieval/deletion syntax even though the underlying implementation is doing some computation rather than directly modifying a binding.
This PEP describes a feature that makes it easier, more efficient and safer to implement these handlers for Python instances.
Justification
Scenario 1
You have a deployed class that works on an attribute named “stdout”. After a while, you think it would be better to check that stdout is really an object with a “write” method at the moment of assignment. Rather than change to a setstdout method (which would be incompatible with deployed code) you would rather trap the assignment and check the object’s type.
Scenario 2
You want to be as compatible as possible with an object model that has a concept of attribute assignment. It could be the W3C Document Object Model or a particular COM interface (e.g. the PowerPoint interface). In that case you may well want attributes in the model to show up as attributes in the Python interface, even though the underlying implementation may not use attributes at all.
Scenario 3
A user wants to make an attribute read-only.
In short, this feature allows programmers to separate the interface of their module from the underlying implementation for whatever purpose. Again, this is not a new feature but merely a new syntax for an existing convention.
Current Solution
To make some attributes read-only:
class foo:
def __setattr__( self, name, val ):
if name=="readonlyattr":
raise TypeError
elif name=="readonlyattr2":
raise TypeError
...
else:
self.__dict__["name"]=val
This has the following problems:
- The creator of the method must be intimately aware of whether
somewhere else in the class hierarchy
__setattr__
has also been trapped for any particular purpose. If so, she must specifically call that method rather than assigning to the dictionary. There are many different reasons to overload__setattr__
so there is a decent potential for clashes. For instance object database implementations often overload setattr for an entirely unrelated purpose. - The string-based switch statement forces all attribute handlers to be specified in one place in the code. They may then dispatch to task-specific methods (for modularity) but this could cause performance problems.
- Logic for the setting, getting and deleting must live in
__getattr__
,__setattr__
and__delattr__
. Once again, this can be mitigated through an extra level of method call but this is inefficient.
Proposed Syntax
Special methods should declare themselves with declarations of the following form:
class x:
def __attr_XXX__(self, op, val ):
if op=="get":
return someComputedValue(self.internal)
elif op=="set":
self.internal=someComputedValue(val)
elif op=="del":
del self.internal
Client code looks like this:
fooval=x.foo
x.foo=fooval+5
del x.foo
Semantics
Attribute references of all three kinds should call the method. The op parameter can be “get”/”set”/”del”. Of course this string will be interned so the actual checks for the string will be very fast.
It is disallowed to actually have an attribute named XXX in the same instance as a method named __attr_XXX__.
An implementation of __attr_XXX__ takes precedence over an
implementation of __getattr__
based on the principle that
__getattr__
is supposed to be invoked only after finding an
appropriate attribute has failed.
An implementation of __attr_XXX__ takes precedence over an
implementation of __setattr__
in order to be consistent. The
opposite choice seems fairly feasible also, however. The same
goes for __del_y__.
Proposed Implementation
There is a new object type called an attribute access handler. Objects of this type have the following attributes:
name (e.g. XXX, not __attr__XXX__)
method (pointer to a method object)
In PyClass_New, methods of the appropriate form will be detected and
converted into objects (just like unbound method objects). These are
stored in the class __dict__
under the name XXX. The original method
is stored as an unbound method under its original name.
If there are any attribute access handlers in an instance at all, a flag is set. Let’s call it “I_have_computed_attributes” for now. Derived classes inherit the flag from base classes. Instances inherit the flag from classes.
A get proceeds as usual until just before the object is returned. In addition to the current check whether the returned object is a method it would also check whether a returned object is an access handler. If so, it would invoke the getter method and return the value. To remove an attribute access handler you could directly fiddle with the dictionary.
A set proceeds by checking the “I_have_computed_attributes” flag. If
it is not set, everything proceeds as it does today. If it is set
then we must do a dictionary get on the requested object name. If it
returns an attribute access handler then we call the setter function
with the value. If it returns any other object then we discard the
result and continue as we do today. Note that having an attribute
access handler will mildly affect attribute “setting” performance for
all sets on a particular instance, but no more so than today, using
__setattr__
. Gets are more efficient than they are today with
__getattr__
.
The I_have_computed_attributes flag is intended to eliminate the performance degradation of an extra “get” per “set” for objects not using this feature. Checking this flag should have minuscule performance implications for all objects.
The implementation of delete is analogous to the implementation of set.
Caveats
- You might note that I have not proposed any logic to keep
the I_have_computed_attributes flag up to date as attributes
are added and removed from the instance’s dictionary. This is
consistent with current Python. If you add a
__setattr__
method to an object after it is in use, that method will not behave as it would if it were available at “compile” time. The dynamism is arguably not worth the extra implementation effort. This snippet demonstrates the current behavior:>>> def prn(*args):print args >>> class a: ... __setattr__=prn >>> a().foo=5 (<__main__.a instance at 882890>, 'foo', 5) >>> class b: pass >>> bi=b() >>> bi.__setattr__=prn >>> b.foo=5
- Assignment to __dict__[“XXX”] can overwrite the attribute access handler for __attr_XXX__. Typically the access handlers will store information away in private __XXX variables
- An attribute access handler that attempts to call setattr or getattr
on the object itself can cause an infinite loop (as with
__getattr__
) Once again, the solution is to use a special (typically private) variable such as __XXX.
Note
The descriptor mechanism described in PEP 252 is powerful enough to support this more directly. A ‘getset’ constructor may be added to the language making this possible:
class C:
def get_x(self):
return self.__x
def set_x(self, v):
self.__x = v
x = getset(get_x, set_x)
Additional syntactic sugar might be added, or a naming convention could be recognized.
Source: https://github.com/python/peps/blob/main/peps/pep-0213.rst
Last modified: 2023-09-09 17:39:29 GMT