PEP: 276 Title: Simple Iterator for ints Author: Jim Althoff
<james_althoff@i2.com> Status: Rejected Type: Standards Track
Content-Type: text/x-rst Created: 12-Nov-2001 Python-Version: 2.3
Post-History:

Abstract

Python 2.1 added new functionality to support iterators (PEP 234).
Iterators have proven to be useful and convenient in many coding
situations. It is noted that the implementation of Python's for-loop
control structure uses the iterator protocol as of release 2.1. It is
also noted that Python provides iterators for the following builtin
types: lists, tuples, dictionaries, strings, and files. This PEP
proposes the addition of an iterator for the builtin type int
(types.IntType). Such an iterator would simplify the coding of certain
for-loops in Python.

BDFL Pronouncement

This PEP was rejected on 17 June 2005 with a note to python-dev.

Much of the original need was met by the enumerate() function which was
accepted for Python 2.3.

Also, the proposal both allowed and encouraged misuses such as:

    >>> for i in 3: print i
    0
    1
    2

Likewise, it was not helpful that the proposal would disable the syntax
error in statements like:

    x, = 1

Specification

Define an iterator for types.intType (i.e., the builtin type "int") that
is returned from the builtin function "iter" when called with an
instance of types.intType as the argument.

The returned iterator has the following behavior:

-   Assume that object i is an instance of types.intType (the builtin
    type int) and that i > 0

-   iter(i) returns an iterator object

-   said iterator object iterates through the sequence of ints
    0,1,2,...,i-1

    Example:

      iter(5) returns an iterator object that iterates through the
      sequence of ints 0,1,2,3,4

-   if i <= 0, iter(i) returns an "empty" iterator, i.e., one that
    throws StopIteration upon the first call of its "next" method

In other words, the conditions and semantics of said iterator is
consistent with the conditions and semantics of the range() and xrange()
functions.

Note that the sequence 0,1,2,...,i-1 associated with the int i is
considered "natural" in the context of Python programming because it is
consistent with the builtin indexing protocol of sequences in Python.
Python lists and tuples, for example, are indexed starting at 0 and
ending at len(object)-1 (when using positive indices). In other words,
such objects are indexed with the sequence 0,1,2,...,len(object)-1

Rationale

A common programming idiom is to take a collection of objects and apply
some operation to each item in the collection in some established
sequential order. Python provides the "for in" looping control structure
for handling this common idiom. Cases arise, however, where it is
necessary (or more convenient) to access each item in an "indexed"
collection by iterating through each index and accessing each item in
the collection using the corresponding index.

For example, one might have a two-dimensional "table" object where one
requires the application of some operation to the first column of each
row in the table. Depending on the implementation of the table it might
not be possible to access first each row and then each column as
individual objects. It might, rather, be possible to access a cell in
the table using a row index and a column index. In such a case it is
necessary to use an idiom where one iterates through a sequence of
indices (indexes) in order to access the desired items in the table.
(Note that the commonly used DefaultTableModel class in
Java-Swing-Jython has this very protocol).

Another common example is where one needs to process two or more
collections in parallel. Another example is where one needs to access,
say, every second item in a collection.

There are many other examples where access to items in a collection is
facilitated by a computation on an index thus necessitating access to
the indices rather than direct access to the items themselves.

Let's call this idiom the "indexed for-loop" idiom. Some programming
languages provide builtin syntax for handling this idiom. In Python the
common convention for implementing the indexed for-loop idiom is to use
the builtin range() or xrange() function to generate a sequence of
indices as in, for example:

    for rowcount in range(table.getRowCount()):
        print table.getValueAt(rowcount, 0)

or

    for rowcount in xrange(table.getRowCount()):
        print table.getValueAt(rowcount, 0)

From time to time there are discussions in the Python community about
the indexed for-loop idiom. It is sometimes argued that the need for
using the range() or xrange() function for this design idiom is:

-   Not obvious (to new-to-Python programmers),
-   Error prone (easy to forget, even for experienced Python
    programmers)
-   Confusing and distracting for those who feel compelled to understand
    the differences and recommended usage of xrange() vis-a-vis range()
-   Unwieldy, especially when combined with the len() function, i.e.,
    xrange(len(sequence))
-   Not as convenient as equivalent mechanisms in other languages,
-   Annoying, a "wart", etc.

And from time to time proposals are put forth for ways in which Python
could provide a better mechanism for this idiom. Recent examples include
PEP 204, "Range Literals", and PEP 212, "Loop Counter Iteration".

Most often, such proposal include changes to Python's syntax and other
"heavyweight" changes.

Part of the difficulty here is that advocating new syntax implies a
comprehensive solution for "general indexing" that has to include
aspects like:

-   starting index value
-   ending index value
-   step value
-   open intervals versus closed intervals versus half opened intervals

Finding a new syntax that is comprehensive, simple, general, Pythonic,
appealing to many, easy to implement, not in conflict with existing
structures, not excessively overloading of existing structures, etc. has
proven to be more difficult than one might anticipate.

The proposal outlined in this PEP tries to address the problem by
suggesting a simple "lightweight" solution that helps the most common
case by using a proven mechanism that is already available (as of Python
2.1): namely, iterators.

Because for-loops already use "iterator" protocol as of Python 2.1,
adding an iterator for types.IntType as proposed in this PEP would
enable by default the following shortcut for the indexed for-loop idiom:

    for rowcount in table.getRowCount():
        print table.getValueAt(rowcount, 0)

The following benefits for this approach vis-a-vis the current mechanism
of using the range() or xrange() functions are claimed to be:

-   Simpler,
-   Less cluttered,
-   Focuses on the problem at hand without the need to resort to
    secondary implementation-oriented functions (range() and xrange())

And compared to other proposals for change:

-   Requires no new syntax
-   Requires no new keywords
-   Takes advantage of the new and well-established iterator mechanism

And generally:

-   Is consistent with iterator-based "convenience" changes already
    included (as of Python 2.1) for other builtin types such as: lists,
    tuples, dictionaries, strings, and files.

Backwards Compatibility

The proposed mechanism is generally backwards compatible as it calls for
neither new syntax nor new keywords. All existing, valid Python programs
should continue to work unmodified.

However, this proposal is not perfectly backwards compatible in the
sense that certain statements that are currently invalid would, under
the current proposal, become valid.

Tim Peters has pointed out two such examples:

1)  The common case where one forgets to include range() or xrange(),
    for example:

        for rowcount in table.getRowCount():
            print table.getValueAt(rowcount, 0)

    in Python 2.2 raises a TypeError exception.

    Under the current proposal, the above statement would be valid and
    would work as (presumably) intended. Presumably, this is a good
    thing.

    As noted by Tim, this is the common case of the "forgotten range"
    mistake (which one currently corrects by adding a call to range() or
    xrange()).

2)  The (hopefully) very uncommon case where one makes a typing mistake
    when using tuple unpacking. For example:

        x, = 1

    in Python 2.2 raises a TypeError exception.

    Under the current proposal, the above statement would be valid and
    would set x to 0. The PEP author has no data as to how common this
    typing error is nor how difficult it would be to catch such an error
    under the current proposal. He imagines that it does not occur
    frequently and that it would be relatively easy to correct should it
    happen.

Issues

Extensive discussions concerning PEP 276 on the Python interest mailing
list suggests a range of opinions: some in favor, some neutral, some
against. Those in favor tend to agree with the claims above of the
usefulness, convenience, ease of learning, and simplicity of a simple
iterator for integers.

Issues with PEP 276 include:

-   Using range/xrange is fine as is.

    Response: Some posters feel this way. Other disagree.

-   Some feel that iterating over the sequence "0, 1, 2, ..., n-1" for
    an integer n is not intuitive. "for i in 5:" is considered (by some)
    to be "non-obvious", for example. Some dislike this usage because it
    doesn't have "the right feel". Some dislike it because they believe
    that this type of usage forces one to view integers as a sequences
    and this seems wrong to them. Some dislike it because they prefer to
    view for-loops as dealing with explicit sequences rather than with
    arbitrary iterators.

    Response: Some like the proposed idiom and see it as simple,
    elegant, easy to learn, and easy to use. Some are neutral on this
    issue. Others, as noted, dislike it.

-   Is it obvious that iter(5) maps to the sequence 0,1,2,3,4?

    Response: Given, as noted above, that Python has a strong convention
    for indexing sequences starting at 0 and stopping at (inclusively)
    the index whose value is one less than the length of the sequence,
    it is argued that the proposed sequence is reasonably intuitive to
    the Python programmer while being useful and practical. More
    importantly, it is argued that once learned this convention is very
    easy to remember. Note that the doc string for the range function
    makes a reference to the natural and useful association between
    range(n) and the indices for a list whose length is n.

-   Possible ambiguity

        for i in 10: print i

    might be mistaken for

        for i in (10,): print i

    Response: This is exactly the same situation with strings in current
    Python (replace 10 with 'spam' in the above, for example).

-   Too general: in the newest releases of Python there are contexts --
    as with for-loops -- where iterators are called implicitly. Some
    fear that having an iterator invoked for an integer in one of the
    context (excluding for-loops) might lead to unexpected behavior and
    bugs. The "x, = 1" example noted above is an a case in point.

    Response: From the author's perspective the examples of the above
    that were identified in the PEP 276 discussions did not appear to be
    ones that would be accidentally misused in ways that would lead to
    subtle and hard-to-detect errors.

    In addition, it seems that there is a way to deal with this issue by
    using a variation of what is outlined in the specification section
    of this proposal. Instead of adding an __iter__ method to class int,
    change the for-loop handling code to convert (in essence) from

        for i in n:  # when isinstance(n,int) is 1

    to

        for i in xrange(n):

    This approach gives the same results in a for-loop as an __iter__
    method would but would prevent iteration on integer values in any
    other context. Lists and tuples, for example, don't have __iter__
    and are handled with special code. Integer values would be one more
    special case.

-   "i in n" seems very unnatural.

    Response: Some feel that "i in len(mylist)" would be easily
    understandable and useful. Some don't like it, particularly when a
    literal is used as in "i in 5". If the variant mentioned in the
    response to the previous issue is implemented, this issue is moot.
    If not, then one could also address this issue by defining a
    __contains__ method in class int that would always raise a
    TypeError. This would then make the behavior of "i in n" identical
    to that of current Python.

-   Might dissuade newbies from using the indexed for-loop idiom when
    the standard "for item in collection:" idiom is clearly better.

    Response: The standard idiom is so nice when it fits that it needs
    neither extra "carrot" nor "stick". On the other hand, one does
    notice cases of overuse/misuse of the standard idiom (due, most
    likely, to the awkwardness of the indexed for-loop idiom), as in:

        for item in sequence:
            print sequence.index(item)

-   Why not propose even bigger changes?

The majority of disagreement with PEP 276 came from those who favor much
larger changes to Python to address the more general problem of
specifying a sequence of integers where such a specification is general
enough to handle the starting value, ending value, and stepping value of
the sequence and also addresses variations of open, closed, and
half-open (half-closed) integer intervals. Many suggestions of such were
discussed.

These include:

-   adding Haskell-like notation for specifying a sequence of integers
    in a literal list,
-   various uses of slicing notation to specify sequences,
-   changes to the syntax of for-in loops to allow the use of relational
    operators in the loop header,
-   creation of an integer-interval class along with methods that
    overload relational operators or division operators to provide
    "slicing" on integer-interval objects,
-   and more.

It should be noted that there was much debate but not an overwhelming
consensus for any of these larger-scale suggestions.

Clearly, PEP 276 does not propose such a large-scale change and instead
focuses on a specific problem area. Towards the end of the discussion
period, several posters expressed favor for the narrow focus and
simplicity of PEP 276 vis-a-vis the more ambitious suggestions that were
advanced. There did appear to be consensus for the need for a PEP for
any such larger-scale, alternative suggestion. In light of this
recognition, details of the various alternative suggestions are not
discussed here further.

Implementation

An implementation is not available at this time but is expected to be
straightforward. The author has implemented a subclass of int with an
__iter__ method (written in Python) as a means to test out the ideas in
this proposal, however.

Copyright

This document has been placed in the public domain.