PEP: 238 Title: Changing the Division Operator Author: Moshe Zadka
<moshez@zadka.site.co.il>, Guido van Rossum <guido@python.org> Status:
Final Type: Standards Track Content-Type: text/x-rst Created:
11-Mar-2001 Python-Version: 2.2 Post-History: 16-Mar-2001, 26-Jul-2001,
27-Jul-2001

Abstract

The current division (/) operator has an ambiguous meaning for numerical
arguments: it returns the floor of the mathematical result of division
if the arguments are ints or longs, but it returns a reasonable
approximation of the division result if the arguments are floats or
complex. This makes expressions expecting float or complex results
error-prone when integers are not expected but possible as inputs.

We propose to fix this by introducing different operators for different
operations: x/y to return a reasonable approximation of the mathematical
result of the division ("true division"), x//y to return the floor
("floor division"). We call the current, mixed meaning of x/y "classic
division".

Because of severe backwards compatibility issues, not to mention a major
flamewar on c.l.py, we propose the following transitional measures
(starting with Python 2.2):

-   Classic division will remain the default in the Python 2.x series;
    true division will be standard in Python 3.0.
-   The // operator will be available to request floor division
    unambiguously.
-   The future division statement, spelled
    from __future__ import division, will change the / operator to mean
    true division throughout the module.
-   A command line option will enable run-time warnings for classic
    division applied to int or long arguments; another command line
    option will make true division the default.
-   The standard library will use the future division statement and the
    // operator when appropriate, so as to completely avoid classic
    division.

Motivation

The classic division operator makes it hard to write numerical
expressions that are supposed to give correct results from arbitrary
numerical inputs. For all other operators, one can write down a formula
such as x*y**2 + z, and the calculated result will be close to the
mathematical result (within the limits of numerical accuracy, of course)
for any numerical input type (int, long, float, or complex). But
division poses a problem: if the expressions for both arguments happen
to have an integral type, it implements floor division rather than true
division.

The problem is unique to dynamically typed languages: in a statically
typed language like C, the inputs, typically function arguments, would
be declared as double or float, and when a call passes an integer
argument, it is converted to double or float at the time of the call.
Python doesn't have argument type declarations, so integer arguments can
easily find their way into an expression.

The problem is particularly pernicious since ints are perfect
substitutes for floats in all other circumstances: math.sqrt(2) returns
the same value as math.sqrt(2.0), 3.14*100 and 3.14*100.0 return the
same value, and so on. Thus, the author of a numerical routine may only
use floating point numbers to test his code, and believe that it works
correctly, and a user may accidentally pass in an integer input value
and get incorrect results.

Another way to look at this is that classic division makes it difficult
to write polymorphic functions that work well with either float or int
arguments; all other operators already do the right thing. No algorithm
that works for both ints and floats has a need for truncating division
in one case and true division in the other.

The correct work-around is subtle: casting an argument to float() is
wrong if it could be a complex number; adding 0.0 to an argument doesn't
preserve the sign of the argument if it was minus zero. The only
solution without either downside is multiplying an argument (typically
the first) by 1.0. This leaves the value and sign unchanged for float
and complex, and turns int and long into a float with the corresponding
value.

It is the opinion of the authors that this is a real design bug in
Python, and that it should be fixed sooner rather than later. Assuming
Python usage will continue to grow, the cost of leaving this bug in the
language will eventually outweigh the cost of fixing old code -- there
is an upper bound to the amount of code to be fixed, but the amount of
code that might be affected by the bug in the future is unbounded.

Another reason for this change is the desire to ultimately unify
Python's numeric model. This is the subject of PEP 228 (which is
currently incomplete). A unified numeric model removes most of the
user's need to be aware of different numerical types. This is good for
beginners, but also takes away concerns about different numeric behavior
for advanced programmers. (Of course, it won't remove concerns about
numerical stability and accuracy.)

In a unified numeric model, the different types (int, long, float,
complex, and possibly others, such as a new rational type) serve mostly
as storage optimizations, and to some extent to indicate orthogonal
properties such as inexactness or complexity. In a unified model, the
integer 1 should be indistinguishable from the floating point number 1.0
(except for its inexactness), and both should behave the same in all
numeric contexts. Clearly, in a unified numeric model, if a==b and c==d,
a/c should equal b/d (taking some liberties due to rounding for inexact
numbers), and since everybody agrees that 1.0/2.0 equals 0.5, 1/2 should
also equal 0.5. Likewise, since 1//2 equals zero, 1.0//2.0 should also
equal zero.

Variations

Aesthetically, x//y doesn't please everyone, and hence several
variations have been proposed. They are addressed here:

-   x div y. This would introduce a new keyword. Since div is a popular
    identifier, this would break a fair amount of existing code, unless
    the new keyword was only recognized under a future division
    statement. Since it is expected that the majority of code that needs
    to be converted is dividing integers, this would greatly increase
    the need for the future division statement. Even with a future
    statement, the general sentiment against adding new keywords unless
    absolutely necessary argues against this.
-   div(x, y). This makes the conversion of old code much harder.
    Replacing x/y with x//y or x div y can be done with a simple query
    replace; in most cases the programmer can easily verify that a
    particular module only works with integers so all occurrences of x/y
    can be replaced. (The query replace is still needed to weed out
    slashes occurring in comments or string literals.) Replacing x/y
    with div(x, y) would require a much more intelligent tool, since the
    extent of the expressions to the left and right of the / must be
    analyzed before the placement of the div( and ) part can be decided.
-   x \ y. The backslash is already a token, meaning line continuation,
    and in general it suggests an escape to Unix eyes. In addition (this
    due to Terry Reedy) this would make things like eval("x\y") harder
    to get right.

Alternatives

In order to reduce the amount of old code that needs to be converted,
several alternative proposals have been put forth. Here is a brief
discussion of each proposal (or category of proposals). If you know of
an alternative that was discussed on c.l.py that isn't mentioned here,
please mail the second author.

-   Let / keep its classic semantics; introduce // for true division.
    This still leaves a broken operator in the language, and invites to
    use the broken behavior. It also shuts off the road to a unified
    numeric model a la PEP 228.
-   Let int division return a special "portmanteau" type that behaves as
    an integer in integer context, but like a float in a float context.
    The problem with this is that after a few operations, the int and
    the float value could be miles apart, it's unclear which value
    should be used in comparisons, and of course many contexts (like
    conversion to string) don't have a clear integer or float
    preference.
-   Use a directive to use specific division semantics in a module,
    rather than a future statement. This retains classic division as a
    permanent wart in the language, requiring future generations of
    Python programmers to be aware of the problem and the remedies.
-   Use from __past__ import division to use classic division semantics
    in a module. This also retains the classic division as a permanent
    wart, or at least for a long time (eventually the past division
    statement could raise an ImportError).
-   Use a directive (or some other way) to specify the Python version
    for which a specific piece of code was developed. This requires
    future Python interpreters to be able to emulate exactly several
    previous versions of Python, and moreover to do so for multiple
    versions within the same interpreter. This is way too much work. A
    much simpler solution is to keep multiple interpreters installed.
    Another argument against this is that the version directive is
    almost always overspecified: most code written for Python X.Y, works
    for Python X.(Y-1) and X.(Y+1) as well, so specifying X.Y as a
    version is more constraining than it needs to be. At the same time,
    there's no way to know at which future or past version the code will
    break.

API Changes

During the transitional phase, we have to support three division
operators within the same program: classic division (for / in modules
without a future division statement), true division (for / in modules
with a future division statement), and floor division (for //). Each
operator comes in two flavors: regular, and as an augmented assignment
operator (/= or //=).

The names associated with these variations are:

-   Overloaded operator methods:

        __div__(), __floordiv__(), __truediv__();
        __idiv__(), __ifloordiv__(), __itruediv__().

-   Abstract API C functions:

        PyNumber_Divide(), PyNumber_FloorDivide(),
        PyNumber_TrueDivide();

        PyNumber_InPlaceDivide(), PyNumber_InPlaceFloorDivide(),
        PyNumber_InPlaceTrueDivide().

-   Byte code opcodes:

        BINARY_DIVIDE, BINARY_FLOOR_DIVIDE, BINARY_TRUE_DIVIDE;
        INPLACE_DIVIDE, INPLACE_FLOOR_DIVIDE, INPLACE_TRUE_DIVIDE.

-   PyNumberMethod slots:

        nb_divide, nb_floor_divide, nb_true_divide,
        nb_inplace_divide, nb_inplace_floor_divide,
        nb_inplace_true_divide.

The added PyNumberMethod slots require an additional flag in tp_flags;
this flag will be named Py_TPFLAGS_HAVE_NEWDIVIDE and will be included
in Py_TPFLAGS_DEFAULT.

The true and floor division APIs will look for the corresponding slots
and call that; when that slot is NULL, they will raise an exception.
There is no fallback to the classic divide slot.

In Python 3.0, the classic division semantics will be removed; the
classic division APIs will become synonymous with true division.

Command Line Option

The -Q command line option takes a string argument that can take four
values: old, warn, warnall, or new. The default is old in Python 2.2 but
will change to warn in later 2.x versions. The old value means the
classic division operator acts as described. The warn value means the
classic division operator issues a warning (a DeprecationWarning using
the standard warning framework) when applied to ints or longs. The
warnall value also issues warnings for classic division when applied to
floats or complex; this is for use by the fixdiv.py conversion script
mentioned below. The new value changes the default globally so that the
/ operator is always interpreted as true division. The new option is
only intended for use in certain educational environments, where true
division is required, but asking the students to include the future
division statement in all their code would be a problem.

This option will not be supported in Python 3.0; Python 3.0 will always
interpret / as true division.

(This option was originally proposed as -D, but that turned out to be an
existing option for Jython, hence the Q -- mnemonic for Quotient. Other
names have been proposed, like -Qclassic, -Qclassic-warn, -Qtrue, or
-Qold_division etc.; these seem more verbose to me without much
advantage. After all the term classic division is not used in the
language at all (only in the PEP), and the term true division is rarely
used in the language -- only in __truediv__.)

Semantics of Floor Division

Floor division will be implemented in all the Python numeric types, and
will have the semantics of:

    a // b == floor(a/b)

except that the result type will be the common type into which a and b
are coerced before the operation.

Specifically, if a and b are of the same type, a//b will be of that type
too. If the inputs are of different types, they are first coerced to a
common type using the same rules used for all other arithmetic
operators.

In particular, if a and b are both ints or longs, the result has the
same type and value as for classic division on these types (including
the case of mixed input types; int//long and long//int will both return
a long).

For floating point inputs, the result is a float. For example:

    3.5//2.0 == 1.0

For complex numbers, // raises an exception, since floor() of a complex
number is not allowed.

For user-defined classes and extension types, all semantics are up to
the implementation of the class or type.

Semantics of True Division

True division for ints and longs will convert the arguments to float and
then apply a float division. That is, even 2/1 will return a
float (2.0), not an int. For floats and complex, it will be the same as
classic division.

The 2.2 implementation of true division acts as if the float type had
unbounded range, so that overflow doesn't occur unless the magnitude of
the mathematical result is too large to represent as a float. For
example, after x = 1L << 40000, float(x) raises OverflowError (note that
this is also new in 2.2: previously the outcome was platform-dependent,
most commonly a float infinity). But x/x returns 1.0 without exception,
while x/1 raises OverflowError.

Note that for int and long arguments, true division may lose
information; this is in the nature of true division (as long as
rationals are not in the language). Algorithms that consciously use
longs should consider using //, as true division of longs retains no
more than 53 bits of precision (on most platforms).

If and when a rational type is added to Python (see PEP 239), true
division for ints and longs should probably return a rational. This
avoids the problem with true division of ints and longs losing
information. But until then, for consistency, float is the only choice
for true division.

The Future Division Statement

If from __future__ import division is present in a module, or if -Qnew
is used, the / and /= operators are translated to true division opcodes;
otherwise they are translated to classic division (until Python 3.0
comes along, where they are always translated to true division).

The future division statement has no effect on the recognition or
translation of // and //=.

See PEP 236 for the general rules for future statements.

(It has been proposed to use a longer phrase, like true_division or
modern_division. These don't seem to add much information.)

Open Issues

We expect that these issues will be resolved over time, as more feedback
is received or we gather more experience with the initial
implementation.

-   It has been proposed to call // the quotient operator, and the /
    operator the ratio operator. I'm not sure about this -- for some
    people quotient is just a synonym for division, and ratio suggests
    rational numbers, which is wrong. I prefer the terminology to be
    slightly awkward if that avoids unambiguity. Also, for some folks
    quotient suggests truncation towards zero, not towards infinity as
    floor division says explicitly.
-   It has been argued that a command line option to change the default
    is evil. It can certainly be dangerous in the wrong hands: for
    example, it would be impossible to combine a 3rd party library
    package that requires -Qnew with another one that requires -Qold.
    But I believe that the VPython folks need a way to enable true
    division by default, and other educators might need the same. These
    usually have enough control over the library packages available in
    their environment.
-   For classes to have to support all three of __div__(),
    __floordiv__() and __truediv__() seems painful; and what to do in
    3.0? Maybe we only need __div__() and __floordiv__(), or maybe at
    least true division should try __truediv__() first and __div__()
    second.

Resolved Issues

-   Issue: For very large long integers, the definition of true division
    as returning a float causes problems, since the range of Python
    longs is much larger than that of Python floats. This problem will
    disappear if and when rational numbers are supported.

    Resolution: For long true division, Python uses an internal float
    type with native double precision but unbounded range, so that
    OverflowError doesn't occur unless the quotient is too large to
    represent as a native double.

-   Issue: In the interim, maybe the long-to-float conversion could be
    made to raise OverflowError if the long is out of range.

    Resolution: This has been implemented, but, as above, the magnitude
    of the inputs to long true division doesn't matter; only the
    magnitude of the quotient matters.

-   Issue: Tim Peters will make sure that whenever an in-range float is
    returned, decent precision is guaranteed.

    Resolution: Provided the quotient of long true division is
    representable as a float, it suffers no more than 3 rounding errors:
    one each for converting the inputs to an internal float type with
    native double precision but unbounded range, and one more for the
    division. However, note that if the magnitude of the quotient is too
    small to represent as a native double, 0.0 is returned without
    exception ("silent underflow").

FAQ

When will Python 3.0 be released?

  We don't plan that long ahead, so we can't say for sure. We want to
  allow at least two years for the transition. If Python 3.0 comes out
  sooner, we'll keep the 2.x line alive for backwards compatibility
  until at least two years from the release of Python 2.2. In practice,
  you will be able to continue to use the Python 2.x line for several
  years after Python 3.0 is released, so you can take your time with the
  transition. Sites are expected to have both Python 2.x and Python 3.x
  installed simultaneously.

Why isn't true division called float division?

  Because I want to keep the door open to possibly introducing rationals
  and making 1/2 return a rational rather than a float. See PEP 239.

Why is there a need for __truediv__ and __itruediv__?

  We don't want to make user-defined classes second-class citizens.
  Certainly not with the type/class unification going on.

How do I write code that works under the classic rules as well as under the new rules without using // or a future division statement?

  Use x*1.0/y for true division, divmod(x, y) (PEP 228) for int
  division. Especially the latter is best hidden inside a function. You
  may also write float(x)/y for true division if you are sure that you
  don't expect complex numbers. If you know your integers are never
  negative, you can use int(x/y) -- while the documentation of int()
  says that int() can round or truncate depending on the C
  implementation, we know of no C implementation that doesn't truncate,
  and we're going to change the spec for int() to promise truncation.
  Note that classic division (and floor division) round towards negative
  infinity, while int() rounds towards zero, giving different answers
  for negative numbers.

How do I specify the division semantics for input(), compile(), execfile(), eval() and exec?

  They inherit the choice from the invoking module. PEP 236 now lists
  this as a resolved problem, referring to PEP 264.

What about code compiled by the codeop module?

  This is dealt with properly; see PEP 264.

Will there be conversion tools or aids?

  Certainly. While these are outside the scope of the PEP, I should
  point out two simple tools that will be released with Python 2.2a3:
  Tools/scripts/finddiv.py finds division operators (slightly smarter
  than grep /) and Tools/scripts/fixdiv.py can produce patches based on
  run-time analysis.

Why is my question not answered here?

  Because we weren't aware of it. If it's been discussed on c.l.py and
  you believe the answer is of general interest, please notify the
  second author. (We don't have the time or inclination to answer every
  question sent in private email, hence the requirement that it be
  discussed on c.l.py first.)

Implementation

Essentially everything mentioned here is implemented in CVS and will be
released with Python 2.2a3; most of it was already released with Python
2.2a2.

Copyright

This document has been placed in the public domain.