PEP 357 – Allowing Any Object to be Used for Slicing
- Author: Travis Oliphant <oliphant at ee.byu.edu>
- Type: Standards Track
Table of Contents
- Implementation Plan
- Discussion Questions
- Reference Implementation
This PEP proposes adding an
nb_index slot in
PyNumberMethods and an
__index__ special method so that arbitrary objects can be used
whenever integers are explicitly needed in Python, such as in slice
syntax (from which the slot gets its name).
Currently integers and long integers play a special role in
slicing in that they are the only objects allowed in slice
syntax. In other words, if X is an object implementing the
sequence protocol, then
X[obj1:obj2] is only valid if obj1 and
obj2 are both integers or long integers. There is no way for obj1 and
obj2 to tell Python that they could be reasonably used as
indexes into a sequence. This is an unnecessary limitation.
In NumPy, for example, there are 8 different integer scalars corresponding to unsigned and signed integers of 8, 16, 32, and 64 bits. These type-objects could reasonably be used as integers in many places where Python expects true integers but cannot inherit from the Python integer type because of incompatible memory layouts. There should be some way to be able to tell Python that an object can behave like an integer.
It is not possible to use the nb_int slot (and the
__int__ special method)
for this purpose because that method is used to coerce objects
to integers. It would be inappropriate to allow every object that
can be coerced to an integer to be used as an integer everywhere
Python expects a true integer. For example, if
__int__ were used
to convert an object to an integer in slicing, then float objects
would be allowed in slicing and
x[3.2:5.8] would not raise an error
as it should.
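The distinction can be checked from Python itself. The following is a minimal sketch, assuming a current Python 3 interpreter (where int and long are unified); `CoercibleOnly` and `try_slice` are hypothetical names introduced only for illustration.

```python
class CoercibleOnly:
    """Hypothetical type that can be coerced to int but is not index-like."""

    def __int__(self):
        return 3


def try_slice(seq, start, stop):
    """Return the slice, or the name of the exception that slicing raised."""
    try:
        return seq[start:stop]
    except TypeError as exc:
        return type(exc).__name__


data = list(range(10))

float_result = try_slice(data, 3.2, 5.8)                # floats are rejected
coercible_result = try_slice(data, CoercibleOnly(), 6)  # __int__ alone is not enough
int_result = try_slice(data, 3, 5)                      # true integers work
```

Both the float bounds and the object that only defines `__int__` are refused, which is the behaviour the paragraph above argues for.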
Add an nb_index slot to
PyNumberMethods, and a corresponding
__index__ special method. Objects could define a function to
place in the
nb_index slot that returns a Python integer
(either an int or a long). This integer can
then be appropriately converted to a
Py_ssize_t value whenever
Python needs one such as in
PySequence_GetSlice, PySequence_SetSlice, and PySequence_DelSlice.
The nb_index slot will have the following signature:
PyObject *index_func (PyObject *self)
The returned object must be a Python IntType or Python
LongType. NULL should be returned on error with an appropriate error set.
The __index__ special method will have the signature:
    def __index__(self):
        return obj
where obj must be either an int or a long.
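As an illustration of the special method, here is a minimal sketch assuming a current Python 3 interpreter. `Int8` is a hypothetical class, loosely modelled on the NumPy integer scalars mentioned earlier; it is not part of any library.

```python
class Int8:
    """Hypothetical signed 8-bit integer scalar, for illustration only."""

    def __init__(self, value):
        if not -128 <= value <= 127:
            raise ValueError("value out of range for a signed 8-bit integer")
        self._value = value

    def __index__(self):
        # Must return a true Python integer, not another index-like object.
        return self._value


letters = "abcdefgh"
sliced = letters[Int8(2):Int8(5)]   # slice syntax
item = letters[Int8(-1)]            # plain subscript
repeated = "ab" * Int8(3)           # sequence repetition also needs an integer
```

The class never inherits from `int`; defining `__index__` is enough for it to be accepted in these places.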
- 3 new abstract C-API functions will be added
- The first checks to see if the object supports the index
slot and if it is filled in.
int PyIndex_Check(obj)
This will return true if the object defines the nb_index slot.
- The second is a simple wrapper around the
nb_index call that raises
PyExc_TypeError if the call is not available or if it doesn’t return an int or long. Because the
PyIndex_Check is performed inside the
PyNumber_Index call you can call it directly and manage any error rather than check for compatibility first.
PyObject *PyNumber_Index (PyObject *obj)
- The third call helps deal with the common situation of
actually needing a
Py_ssize_t value from the object to use for indexing or other needs.
Py_ssize_t PyNumber_AsSsize_t(PyObject *obj, PyObject *exc)
The function calls the
nb_index slot of obj if it is available and then converts the returned Python integer into a
Py_ssize_t value. If this goes well, then the value is returned. The second argument allows control over what happens if the integer returned from
nb_index cannot fit into a
Py_ssize_t value.
If exc is NULL, then the returned value will be clipped to
PY_SSIZE_T_MAX or
PY_SSIZE_T_MIN depending on whether the
nb_index slot of obj returned a positive or negative integer. If exc is non-NULL, then it is the error object that will be set to replace the
PyExc_OverflowError that was raised when the Python integer or long was converted to
Py_ssize_t.
- A new
operator.index(obj) function will be added that calls the equivalent of
obj.__index__() and raises an error if obj does not implement the special method.
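The Python-level entry point can be exercised as follows. This is a minimal sketch assuming a current Python 3 interpreter; `Ordinal` and `to_index` are hypothetical helpers. It follows the "call it directly and manage any error" approach described for PyNumber_Index rather than testing for compatibility first.

```python
import operator


class Ordinal:
    """Hypothetical index-like type used only for illustration."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


def to_index(obj):
    """Return operator.index(obj), or None if obj is not index-like."""
    try:
        return operator.index(obj)
    except TypeError:
        return None


from_int = to_index(7)               # integers return themselves
from_ordinal = to_index(Ordinal(4))  # __index__ is consulted
from_float = to_index(4.0)           # floats are not index-like
from_str = to_index("4")             # neither are strings
```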
Implementation Plan
- Add the
nb_index slot in object.h and modify
typeobject.c to create the
__index__ method
- Change the
ISINT macro in ceval.c to
ISINDEX and alter it to accommodate objects with the index slot defined.
- Change the
_PyEval_SliceIndex function to accommodate objects with the index slot defined.
- Change all builtin objects (e.g. lists) that use the
as_mapping slots for subscript access and use a special-check for integers to check for the slot as well.
- Add the
nb_index slot to integers and long_integers (which just return themselves)
- Add
PyNumber_Index C-API to return an integer from any Python Object that has the
nb_index slot.
- Add the
operator.index(x) function.
- Alter arrayobject.c and
mmapmodule.c to use the new C-API for their sub-scripting and other needs.
- Add unit-tests
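The plan above concerns C-level builtins, but a container written in Python can follow the same pattern: ask for a true integer instead of special-casing int. The sketch below assumes a current Python 3 interpreter; `Window` and `Pos` are hypothetical classes invented for this example.

```python
import operator


class Pos:
    """Hypothetical index-like wrapper, used only to exercise Window."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


class Window:
    """Hypothetical read-only sequence accepting any index-like subscript."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Delegate slices to the underlying list.
            return Window(self._items[key])
        # Ask for a true integer instead of checking isinstance(key, int).
        i = operator.index(key)  # TypeError for floats, strings, ...
        return self._items[i]    # the list handles negatives and range checks


w = Window("abcde")
first = w[0]
last = w[Pos(-1)]
middle = w[Pos(1):Pos(4)]
```

Subscripting with a float such as `w[1.5]` still raises TypeError, so the container stays as strict as a list.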
Discussion Questions
Implementation should not slow down Python because integers and long integers used as indexes will complete in the same number of instructions. The only change will be that what used to generate an error will now be acceptable.
Why not use
nb_int which is already there?
The nb_int method is used for coercion and so means something
fundamentally different than what is requested here. This PEP
proposes a method for something that can already be thought of as
an integer to communicate that information to Python when it needs an
integer. The biggest example of why using
nb_int would be a bad
thing is that float objects already define the
nb_int method, but
float objects should not be used as indexes in a sequence.
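The difference between the two protocols is visible on the builtin types. A short check, assuming a current Python 3 interpreter:

```python
float_has_int = hasattr(float, "__int__")      # coercion is supported
float_has_index = hasattr(float, "__index__")  # but float is not index-like
int_has_index = hasattr(int, "__index__")      # integers return themselves

truncated = int(3.9)  # coercion silently discards the fractional part
```

Coercion loses information here, which is why it is unsuitable as a signal that an object can stand in for an integer.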
Why the name __index__?
Some questions were raised regarding the name
__index__ when other
interpretations of the slot are possible. For example, the slot
can be used any time Python requires an integer internally (such
as in "mystring" * 3). The name was suggested by Guido because
slicing syntax is the biggest reason for having such a slot and
in the end no better name emerged. See the discussion thread
for examples of other names that were suggested.
Why return PyObject * from nb_index?
Initially Py_ssize_t was selected as the return type for the
nb_index slot. However, this led to an inability to track and
distinguish overflow and underflow errors without ugly and brittle
hacks. As the
nb_index slot is used in at least 3 different ways
in the Python core (to get an integer, to get a slice end-point,
and to get a sequence index), there is quite a bit of flexibility
needed to handle all these cases. The importance of having the
necessary flexibility to handle all the use cases is critical.
For example, the initial implementation that returned
Py_ssize_t for nb_index led to the discovery that on a 32-bit machine with >=2GB of RAM
s = 'x' * (2**100) works but
len(s) was clipped at 2147483647.
Several fixes were suggested but eventually it was decided that
nb_index needed to return a Python Object similar to the nb_int and
nb_long slots in order to handle overflow correctly.
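Because the slot returns a full Python integer, the caller can decide how to treat a value that does not fit. The following Python-level sketch mimics the clip-or-raise behaviour specified for PyNumber_AsSsize_t. It assumes a current Python 3 interpreter and uses `sys.maxsize` as a stand-in for PY_SSIZE_T_MAX; `as_ssize_t` is a hypothetical helper, not the C implementation.

```python
import operator
import sys

SSIZE_T_MAX = sys.maxsize        # stands in for PY_SSIZE_T_MAX
SSIZE_T_MIN = -sys.maxsize - 1   # stands in for PY_SSIZE_T_MIN


def as_ssize_t(obj, exc=None):
    """Sketch of the semantics described for PyNumber_AsSsize_t."""
    value = operator.index(obj)  # TypeError if obj is not index-like
    if SSIZE_T_MIN <= value <= SSIZE_T_MAX:
        return value
    if exc is None:
        # No error requested: clip toward the nearest representable bound.
        return SSIZE_T_MAX if value > 0 else SSIZE_T_MIN
    # Otherwise report the overflow using the caller's chosen exception.
    raise exc("integer does not fit in an index-sized value")


small = as_ssize_t(5)
clipped_high = as_ssize_t(2**100)
clipped_low = as_ssize_t(-(2**100))

# Slice end-points use the clipping behaviour in the interpreter itself.
clipped_slice = "abc"[: 2**100]
```

Passing an exception class, for example `as_ssize_t(2**100, IndexError)`, raises instead of clipping, which suits a plain sequence index rather than a slice end-point.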
Why can’t __index__ return any object with the nb_index method?
This would allow infinite recursion in many different ways that are not
easy to check for. This restriction is similar to the requirement that
__nonzero__ return an int or a bool.
Reference Implementation
Submitted as patch 1436368 to SourceForge.
This document is placed in the public domain.
Last modified: 2017-11-11 19:28:55 GMT