PEP: 485 Title: A Function for testing approximate equality Version:
$Revision$ Last-Modified: $Date$ Author: Christopher Barker
<PythonCHB@gmail.com> Status: Final Type: Standards Track Content-Type:
text/x-rst Created: 20-Jan-2015 Python-Version: 3.5 Post-History:
Resolution:
https://mail.python.org/pipermail/python-dev/2015-February/138598.html

Abstract

This PEP proposes the addition of an isclose() function to the standard
library math module that determines whether one value is approximately
equal or "close" to another value.

Rationale

Floating point values contain limited precision, which results in their
being unable to exactly represent some values, and for errors to
accumulate with repeated computation. As a result, it is common advice
to only use an equality comparison in very specific situations. Often an
inequality comparison fits the bill, but there are times (often in
testing) where the programmer wants to determine whether a computed
value is "close" to an expected value, without requiring them to be
exactly equal. This is common enough, particularly in testing, and not
always obvious how to do it, that it would be useful addition to the
standard library.

Existing Implementations

The standard library includes the unittest.TestCase.assertAlmostEqual
method, but it:

-   Is buried in the unittest.TestCase class
-   Is an assertion, so you can't use it as a general test at the
    command line, etc. (easily)
-   Is an absolute difference test. Often the measure of difference
    requires, particularly for floating point numbers, a relative error,
    i.e. "Are these two values within x% of each-other?", rather than an
    absolute error. Particularly when the magnitude of the values is
    unknown a priori.

The numpy package has the allclose() and isclose() functions, but they
are only available with numpy.

The statistics package tests include an implementation, used for its
unit tests.

One can also find discussion and sample implementations on Stack
Overflow and other help sites.

Many other non-python systems provide such a test, including the Boost
C++ library and the APL language[1].

These existing implementations indicate that this is a common need and
not trivial to write oneself, making it a candidate for the standard
library.

Proposed Implementation

NOTE: this PEP is the result of extended discussions on the python-ideas
list[2].

The new function will go into the math module, and have the following
signature:

    isclose(a, b, rel_tol=1e-9, abs_tol=0.0)

a and b: are the two values to be tested to relative closeness

rel_tol: is the relative tolerance -- it is the amount of error allowed,
relative to the larger absolute value of a or b. For example, to set a
tolerance of 5%, pass tol=0.05. The default tolerance is 1e-9, which
assures that the two values are the same within about 9 decimal digits.
rel_tol must be greater than 0.0

abs_tol: is a minimum absolute tolerance level -- useful for comparisons
near zero.

Modulo error checking, etc, the function will return the result of:

    abs(a-b) <= max( rel_tol * max(abs(a), abs(b)), abs_tol )

The name, isclose, is selected for consistency with the existing isnan
and isinf.

Handling of non-finite numbers

The IEEE 754 special values of NaN, inf, and -inf will be handled
according to IEEE rules. Specifically, NaN is not considered close to
any other value, including NaN. inf and -inf are only considered close
to themselves.

Non-float types

The primary use-case is expected to be floating point numbers. However,
users may want to compare other numeric types similarly. In theory, it
should work for any type that supports abs(), multiplication,
comparisons, and subtraction. However, the implementation in the math
module is written in C, and thus can not (easily) use python's duck
typing. Rather, the values passed into the function will be converted to
the float type before the calculation is performed. Passing in types (or
values) that cannot be converted to floats will raise an appropriate
Exception (TypeError, ValueError, or OverflowError).

The code will be tested to accommodate at least some values of these
types:

-   Decimal
-   int
-   Fraction
-   complex: For complex, a companion function will be added to the
    cmath module. In cmath.isclose(), the tolerances are specified as
    floats, and the absolute value of the complex values will be used
    for scaling and comparison. If a complex tolerance is passed in, the
    absolute value will be used as the tolerance.

NOTE: it may make sense to add a Decimal.isclose() that works properly
and completely with the decimal type, but that is not included as part
of this PEP.

Behavior near zero

Relative comparison is problematic if either value is zero. By
definition, no value is small relative to zero. And computationally, if
either value is zero, the difference is the absolute value of the other
value, and the computed absolute tolerance will be rel_tol times that
value. When rel_tol is less than one, the difference will never be less
than the tolerance.

However, while mathematically correct, there are many use cases where a
user will need to know if a computed value is "close" to zero. This
calls for an absolute tolerance test. If the user needs to call this
function inside a loop or comprehension, where some, but not all, of the
expected values may be zero, it is important that both a relative
tolerance and absolute tolerance can be tested for with a single
function with a single set of parameters.

There is a similar issue if the two values to be compared straddle zero:
if a is approximately equal to -b, then a and b will never be computed
as "close".

To handle this case, an optional parameter, abs_tol can be used to set a
minimum tolerance used in the case of very small or zero computed
relative tolerance. That is, the values will be always be considered
close if the difference between them is less than abs_tol

The default absolute tolerance value is set to zero because there is no
value that is appropriate for the general case. It is impossible to know
an appropriate value without knowing the likely values expected for a
given use case. If all the values tested are on order of one, then a
value of about 1e-9 might be appropriate, but that would be far too
large if expected values are on order of 1e-9 or smaller.

Any non-zero default might result in user's tests passing totally
inappropriately. If, on the other hand, a test against zero fails the
first time with defaults, a user will be prompted to select an
appropriate value for the problem at hand in order to get the test to
pass.

NOTE: that the author of this PEP has resolved to go back over many of
his tests that use the numpy allclose() function, which provides a
default absolute tolerance, and make sure that the default value is
appropriate.

If the user sets the rel_tol parameter to 0.0, then only the absolute
tolerance will effect the result. While not the goal of the function, it
does allow it to be used as a purely absolute tolerance check as well.

Implementation

A sample implementation in python is available (as of Jan 22, 2015) on
gitHub:

https://github.com/PythonCHB/close_pep/blob/master/is_close.py

This implementation has a flag that lets the user select which relative
tolerance test to apply -- this PEP does not suggest that that be
retained, but rather that the weak test be selected.

There are also drafts of this PEP and test code, etc. there:

https://github.com/PythonCHB/close_pep

Relative Difference

There are essentially two ways to think about how close two numbers are
to each-other:

Absolute difference: simply abs(a-b)

Relative difference: abs(a-b)/scale_factor[3].

The absolute difference is trivial enough that this proposal focuses on
the relative difference.

Usually, the scale factor is some function of the values under
consideration, for instance:

1)  The absolute value of one of the input values
2)  The maximum absolute value of the two
3)  The minimum absolute value of the two.
4)  The absolute value of the arithmetic mean of the two

These lead to the following possibilities for determining if two values,
a and b, are close to each other.

1)  abs(a-b) <= tol*abs(a)
2)  abs(a-b) <= tol * max( abs(a), abs(b) )
3)  abs(a-b) <= tol * min( abs(a), abs(b) )
4)  abs(a-b) <= tol * abs(a + b)/2

NOTE: (2) and (3) can also be written as:

2)  (abs(a-b) <= abs(tol*a)) or (abs(a-b) <= abs(tol*b))
3)  (abs(a-b) <= abs(tol*a)) and (abs(a-b) <= abs(tol*b))

(Boost refers to these as the "weak" and "strong" formulations[4]) These
can be a tiny bit more computationally efficient, and thus are used in
the example code.

Each of these formulations can lead to slightly different results.
However, if the tolerance value is small, the differences are quite
small. In fact, often less than available floating point precision.

How much difference does it make?

When selecting a method to determine closeness, one might want to know
how much of a difference it could make to use one test or the other --
i.e. how many values are there (or what range of values) that will pass
one test, but not the other.

The largest difference is between options (2) and (3) where the
allowable absolute difference is scaled by either the larger or smaller
of the values.

Define delta to be the difference between the allowable absolute
tolerance defined by the larger value and that defined by the smaller
value. That is, the amount that the two input values need to be
different in order to get a different result from the two tests. tol is
the relative tolerance value.

Assume that a is the larger value and that both a and b are positive, to
make the analysis a bit easier. delta is therefore:

    delta = tol * (a-b)

or:

    delta / tol = (a-b)

The largest absolute difference that would pass the test: (a-b), equals
the tolerance times the larger value:

    (a-b) = tol * a

Substituting into the expression for delta:

    delta / tol = tol * a

so:

    delta = tol**2 * a

For example, for a = 10, b = 9, tol = 0.1 (10%):

maximum tolerance tol * a == 0.1 * 10 == 1.0

minimum tolerance tol * b == 0.1 * 9.0 == 0.9

delta = (1.0 - 0.9) = 0.1 or tol**2 * a = 0.1**2 * 10 = .1

The absolute difference between the maximum and minimum tolerance tests
in this case could be substantial. However, the primary use case for the
proposed function is testing the results of computations. In that case a
relative tolerance is likely to be selected of much smaller magnitude.

For example, a relative tolerance of 1e-8 is about half the precision
available in a python float. In that case, the difference between the
two tests is 1e-8**2 * a or 1e-16 * a, which is close to the limit of
precision of a python float. If the relative tolerance is set to the
proposed default of 1e-9 (or smaller), the difference between the two
tests will be lost to the limits of precision of floating point. That
is, each of the four methods will yield exactly the same results for all
values of a and b.

In addition, in common use, tolerances are defined to 1 significant
figure -- that is, 1e-9 is specifying about 9 decimal digits of
accuracy. So the difference between the various possible tests is well
below the precision to which the tolerance is specified.

Symmetry

A relative comparison can be either symmetric or non-symmetric. For a
symmetric algorithm:

isclose(a,b) is always the same as isclose(b,a)

If a relative closeness test uses only one of the values (such as (1)
above), then the result is asymmetric, i.e. isclose(a,b) is not
necessarily the same as isclose(b,a).

Which approach is most appropriate depends on what question is being
asked. If the question is: "are these two numbers close to each other?",
there is no obvious ordering, and a symmetric test is most appropriate.

However, if the question is: "Is the computed value within x% of this
known value?", then it is appropriate to scale the tolerance to the
known value, and an asymmetric test is most appropriate.

From the previous section, it is clear that either approach would yield
the same or similar results in the common use cases. In that case, the
goal of this proposal is to provide a function that is least likely to
produce surprising results.

The symmetric approach provide an appealing consistency -- it mirrors
the symmetry of equality, and is less likely to confuse people. A
symmetric test also relieves the user of the need to think about the
order in which to set the arguments. It was also pointed out that there
may be some cases where the order of evaluation may not be well defined,
for instance in the case of comparing a set of values all against each
other.

There may be cases when a user does need to know that a value is within
a particular range of a known value. In that case, it is easy enough to
simply write the test directly:

    if a-b <= tol*a:

(assuming a > b in this case). There is little need to provide a
function for this particular case.

This proposal uses a symmetric test.

Which symmetric test?

There are three symmetric tests considered:

The case that uses the arithmetic mean of the two values requires that
the value be either added together before dividing by 2, which could
result in extra overflow to inf for very large numbers, or require each
value to be divided by two before being added together, which could
result in underflow to zero for very small numbers. This effect would
only occur at the very limit of float values, but it was decided there
was no benefit to the method worth reducing the range of functionality
or adding the complexity of checking values to determine the order of
computation.

This leaves the boost "weak" test (2)-- or using the larger value to
scale the tolerance, or the Boost "strong" (3) test, which uses the
smaller of the values to scale the tolerance. For small tolerance, they
yield the same result, but this proposal uses the boost "weak" test
case: it is symmetric and provides a more useful result for very large
tolerances.

Large Tolerances

The most common use case is expected to be small tolerances -- on order
of the default 1e-9. However, there may be use cases where a user wants
to know if two fairly disparate values are within a particular range of
each other: "is a within 200% (rel_tol = 2.0) of b? In this case, the
strong test would never indicate that two values are within that range
of each other if one of them is zero. The weak case, however would use
the larger (non-zero) value for the test, and thus return true if one
value is zero. For example: is 0 within 200% of 10? 200% of ten is 20,
so the range within 200% of ten is -10 to +30. Zero falls within that
range, so it will return True.

Defaults

Default values are required for the relative and absolute tolerance.

Relative Tolerance Default

The relative tolerance required for two values to be considered "close"
is entirely use-case dependent. Nevertheless, the relative tolerance
needs to be greater than 1e-16 (approximate precision of a python
float). The value of 1e-9 was selected because it is the largest
relative tolerance for which the various possible methods will yield the
same result, and it is also about half of the precision available to a
python float. In the general case, a good numerical algorithm is not
expected to lose more than about half of available digits of accuracy,
and if a much larger tolerance is acceptable, the user should be
considering the proper value in that case. Thus 1e-9 is expected to
"just work" for many cases.

Absolute tolerance default

The absolute tolerance value will be used primarily for comparing to
zero. The absolute tolerance required to determine if a value is "close"
to zero is entirely use-case dependent. There is also essentially no
bounds to the useful range -- expected values would conceivably be
anywhere within the limits of a python float. Thus a default of 0.0 is
selected.

If, for a given use case, a user needs to compare to zero, the test will
be guaranteed to fail the first time, and the user can select an
appropriate value.

It was suggested that comparing to zero is, in fact, a common use case
(evidence suggest that the numpy functions are often used with zero). In
this case, it would be desirable to have a "useful" default. Values
around 1e-8 were suggested, being about half of floating point precision
for values of around value 1.

However, to quote The Zen: "In the face of ambiguity, refuse the
temptation to guess." Guessing that users will most often be concerned
with values close to 1.0 would lead to spurious passing tests when used
with smaller values -- this is potentially more damaging than requiring
the user to thoughtfully select an appropriate value.

Expected Uses

The primary expected use case is various forms of testing -- "are the
results computed near what I expect as a result?" This sort of test may
or may not be part of a formal unit testing suite. Such testing could be
used one-off at the command line, in an IPython notebook, part of
doctests, or simple asserts in an if __name__ == "__main__" block.

It would also be an appropriate function to use for the termination
criteria for a simple iterative solution to an implicit function:

    guess = something
    while True:
        new_guess = implicit_function(guess, *args)
        if isclose(new_guess, guess):
            break
        guess = new_guess

Inappropriate uses

One use case for floating point comparison is testing the accuracy of a
numerical algorithm. However, in this case, the numerical analyst
ideally would be doing careful error propagation analysis, and should
understand exactly what to test for. It is also likely that ULP (Unit in
the Last Place) comparison may be called for. While this function may
prove useful in such situations, It is not intended to be used in that
way without careful consideration.

Other Approaches

unittest.TestCase.assertAlmostEqual

(https://docs.python.org/3/library/unittest.html#unittest.TestCase.assertAlmostEqual)

Tests that values are approximately (or not approximately) equal by
computing the difference, rounding to the given number of decimal places
(default 7), and comparing to zero.

This method is purely an absolute tolerance test, and does not address
the need for a relative tolerance test.

numpy isclose()

http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.isclose.html

The numpy package provides the vectorized functions isclose() and
allclose(), for similar use cases as this proposal:

isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)

  Returns a boolean array where two arrays are element-wise equal within
  a tolerance.

  The tolerance values are positive, typically very small numbers. The
  relative difference (rtol * abs(b)) and the absolute difference atol
  are added together to compare against the absolute difference between
  a and b

In this approach, the absolute and relative tolerance are added
together, rather than the or method used in this proposal. This is
computationally more simple, and if relative tolerance is larger than
the absolute tolerance, then the addition will have no effect. However,
if the absolute and relative tolerances are of similar magnitude, then
the allowed difference will be about twice as large as expected.

This makes the function harder to understand, with no computational
advantage in this context.

Even more critically, if the values passed in are small compared to the
absolute tolerance, then the relative tolerance will be completely
swamped, perhaps unexpectedly.

This is why, in this proposal, the absolute tolerance defaults to zero
-- the user will be required to choose a value appropriate for the
values at hand.

Boost floating-point comparison

The Boost project ([5] ) provides a floating point comparison function.
It is a symmetric approach, with both "weak" (larger of the two relative
errors) and "strong" (smaller of the two relative errors) options. This
proposal uses the Boost "weak" approach. There is no need to complicate
the API by providing the option to select different methods when the
results will be similar in most cases, and the user is unlikely to know
which to select in any case.

Alternate Proposals

A Recipe

The primary alternate proposal was to not provide a standard library
function at all, but rather, provide a recipe for users to refer to.
This would have the advantage that the recipe could provide and explain
the various options, and let the user select that which is most
appropriate. However, that would require anyone needing such a test to,
at the very least, copy the function into their code base, and select
the comparison method to use.

zero_tol

One possibility was to provide a zero tolerance parameter, rather than
the absolute tolerance parameter. This would be an absolute tolerance
that would only be applied in the case of one of the arguments being
exactly zero. This would have the advantage of retaining the full
relative tolerance behavior for all non-zero values, while allowing
tests against zero to work. However, it would also result in the
potentially surprising result that a small value could be "close" to
zero, but not "close" to an even smaller value. e.g., 1e-10 is "close"
to zero, but not "close" to 1e-11.

No absolute tolerance

Given the issues with comparing to zero, another possibility would have
been to only provide a relative tolerance, and let comparison to zero
fail. In this case, the user would need to do a simple absolute test:
abs(val) < zero_tol in the case where the comparison involved zero.

However, this would not allow the same call to be used for a sequence of
values, such as in a loop or comprehension. Making the function far less
useful. It is noted that the default abs_tol=0.0 achieves the same
effect if the default is not overridden.

Other tests

The other tests considered are all discussed in the Relative Error
section above.

References

  https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

Copyright

This document has been placed in the public domain.

[1] 1976. R. H. Lathwell. APL comparison tolerance. Proceedings of the
eighth international conference on APL Pages 255 - 258

http://dl.acm.org/citation.cfm?doid=800114.803685

[2] Python-ideas list discussion threads

https://mail.python.org/pipermail/python-ideas/2015-January/030947.html

https://mail.python.org/pipermail/python-ideas/2015-January/031124.html

https://mail.python.org/pipermail/python-ideas/2015-January/031313.html

[3] Wikipedia page on relative difference

http://en.wikipedia.org/wiki/Relative_change_and_difference

[4] Boost project floating-point comparison algorithms

http://www.boost.org/doc/libs/1_35_0/libs/test/doc/components/test_tools/floating_point_comparison.html

[5] Boost project floating-point comparison algorithms

http://www.boost.org/doc/libs/1_35_0/libs/test/doc/components/test_tools/floating_point_comparison.html