Velocity Reviews > Confusing math problem

# Confusing math problem

Oscar Benjamin
Guest
Posts: n/a

 02-21-2013
On 21 February 2013 22:39, Schizoid Man <(E-Mail Removed)> wrote:
> "Dave Angel" <(E-Mail Removed)> wrote in message
>>
>> On 02/21/2013 02:33 PM, Schizoid Man wrote:
>> However, there is an important inaccuracy in math.pow, because it uses
>> floats to do the work. If you have very large integers, that means some of
>> them won't be correct. The following are some examples for 2.7.3 on Linux:
>>
>> a b math.pow(a,b) a**b
>> 3 34 1.66771816997e+16 16677181699666569
>> 3 35 5.0031545099e+16 50031545098999707
>> ...
>> 5 23 1.19209289551e+16 11920928955078125
>>
>> The built-in pow, on the other hand, seems to get identical answers for
>> all these cases. So use pow() instead of math.pow()

>
> I see. I thought using the ** was shorthand for math.pow() and didn't think
> that one would be integer operations and the other floats.

Then you want operator.pow:

>>> import operator
>>> operator.pow(3, 2)

9

math.pow is basically the (double)pow(double, double) function from
the underlying C library. operator.pow(a, b) is precisely the same as
a**b.

> I'm performing
> some large integer arithmetic operations. I would normally do this my
> writing my own multiplication class and storing results as strings, but a
> friend suggested that I look at Python.

There's no need to use strings if you're working with integers in
Python. The results with int (not float) will be exact and will not
overflow since Python's ints have unlimited range (unless your machine
runs out of memory but that only happens with *really* big integers).

If you want to do computations with non-integers and high precision,
take a look at the decimal and fractions modules.
http://docs.python.org/2/library/decimal.html
http://docs.python.org/2/library/fractions.html

There is also a good third party library, sympy, for more complicated
exact algebra:
http://sympy.org/en/index.html

Oscar

Chris Angelico
Guest
Posts: n/a

 02-21-2013
On Fri, Feb 22, 2013 at 9:33 AM, Dave Angel <(E-Mail Removed)> wrote:
> On 02/21/2013 05:11 PM, Chris Angelico wrote:
>>
>>
>>> <snip>

>>
>>
>> Note how, in each case, calculating three powers that have the same
>> real-number result gives a one-element set. Three to the sixtieth
>> power can't be perfectly rendered with a 53-bit mantissa, but it's
>> rendered the same way whichever route is used to calculate it.
>>

>
> But you don't know how the floating point math library (note, it's the
> machine's C-library, not Python's that used) actually calculates that.
>
> For example, if they were to calculate 2**64 by squaring the number 6 times,
> that's likely to give a different answer than multiplying by 2 63 times.
> And you don't know how the library does it. For any integer power up to
> 128, you can do a combination of square and multiply so that the total
> operations are never more than 13, more or less. But if you then figure a =
> a*a and b = b/2, and do the same optimization, you might not do them
> exactly in the same order, and therefore might not get exactly the same
>
> Even if it's being done in the coprocessor inside the Pentium, we don't have
> a documented algorithm for it. Professor Kahn helped with the 8087, but I
> know they've tweaked their algorithms over the years (as well as repairing
> bugs). So it might not be a difference between Python versions, nor between
> OS's, but between processor chips.

I was under the impression that, on most modern FPUs, calculations
were done inside the FPU with more precision than the 53-bit that gets
stored. But in any case, I'd find it _extremely_ surprising if the
calculation actually resulted in something that wasn't one of the two
nearest possible representable values to the correct result. And I'd
call it a CPU/FPU bug.

Of course, as we know, Intel's *never* had an FPU bug before...

ChrisA

Schizoid Man
Guest
Posts: n/a

 02-21-2013
"Dave Angel" <(E-Mail Removed)> wrote

> One other test:
>
> diff = set(map(int, result1)).symmetric_difference(set(result2))
> if diff:
> print diff
> print len(diff)
>
> shows me a diff set of 15656 members. One such member:
>
> 13552527156068805425093160010874271392822265625000 00000000000000000000000000000000000000000000000000 0000000000000L

These are the results I got for len(diff): Windows machines: 15656, Mac:
and I do have a x86 Mac, so in terms of processors it's a like-for-like
comparison.

I have a follow-up question: are ** and pow() identical?

Schizoid Man
Guest
Posts: n/a

 02-21-2013

"Oscar Benjamin" <(E-Mail Removed)> wrote in

> Then you want operator.pow:
>
>>>> import operator
>>>> operator.pow(3, 2)

> 9
>
> math.pow is basically the (double)pow(double, double) function from
> the underlying C library. operator.pow(a, b) is precisely the same as
> a**b.

So how is operator.pow() different from just pow()?

> There's no need to use strings if you're working with integers in
> Python. The results with int (not float) will be exact and will not
> overflow since Python's ints have unlimited range (unless your machine
> runs out of memory but that only happens with *really* big integers).

Yes, that's the idea. I haven't really used Python before, so was just
kicking the tyres a bit and got some odd results. This forum has been great
though.

> If you want to do computations with non-integers and high precision,
> take a look at the decimal and fractions modules.
> http://docs.python.org/2/library/decimal.html
> http://docs.python.org/2/library/fractions.html
>
> There is also a good third party library, sympy, for more complicated
> exact algebra:
> http://sympy.org/en/index.html

Thanks a lot for the references, I'll definitely check them out.

Oscar Benjamin
Guest
Posts: n/a

 02-22-2013
On 21 February 2013 23:41, Schizoid Man <(E-Mail Removed)> wrote:
> "Oscar Benjamin" <(E-Mail Removed)> wrote in
>> Then you want operator.pow:
>>
>>>>> import operator
>>>>> operator.pow(3, 2)

>>
>> 9
>>
>> math.pow is basically the (double)pow(double, double) function from
>> the underlying C library. operator.pow(a, b) is precisely the same as
>> a**b.

>
> So how is operator.pow() different from just pow()?

operator.pow(a, b) calls a.__pow__(b). This is also what a**b does. If
a and b are ints then the __pow__ method will create the appropriate
int exactly without overflow and return it. You can make your own
class and give it a __pow__ function that does whatever you like:

>>> class Number(object):

.... def __pow__(self, other):
.... print('__pow__ called, returning 3')
.... return 3
....
>>> n = Number()
>>> result = n**3

__pow__ called, returning 3
>>> result

3

operator.pow will call the method:

>>> import operator
>>> operator.pow(n, 3)

__pow__ called, returning 3
3

math.pow will not:

>>> import math
>>> math.pow(n, 3)

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: a float is required

The math module essentially exposes the functions that would be
available in C after importing math.h. So when you call math.pow(a,
b), a and b are if possible converted to float (which corresponds to a
double in C) and then the result is computed and returned as a float.

Oscar

Ian Kelly
Guest
Posts: n/a

 02-22-2013
On Thu, Feb 21, 2013 at 4:41 PM, Schizoid Man <(E-Mail Removed)> wrote:
> So how is operator.pow() different from just pow()?

math.pow() is a wrapper around the C library function.

** and operator.pow() are the same thing; the latter is just a
function version of the former.

The built-in pow() is a mutant version that takes either 2 or 3
arguments. With 2 arguments, pow(a, b) is again equivalent to a ** b.
With 3 arguments, pow(a, b, c) is equivalent to but more efficient
than a ** b % c, but a and b are restricted to integers.

Chris Angelico
Guest
Posts: n/a

 02-22-2013
On Fri, Feb 22, 2013 at 9:44 AM, Schizoid Man <(E-Mail Removed)> wrote:
>
> No, I was aware to be honest. I thought ** was just short hand for
> math.pow(). Since ** is the integer operation, I suppose ^ doesn't work as
> an exponent function in Python?

^ is bitwise XOR, completely different.

ChrisA

Dave Angel
Guest
Posts: n/a

 02-22-2013
On 02/21/2013 05:44 PM, Schizoid Man wrote:
> <snip>
>
>
> No, I was aware to be honest. I thought ** was just short hand for
> math.pow(). Since ** is the integer operation

It's an integer operation because you started with two ints. Unlike
math.pow, which converts to floats, whatever you feed it.

>
> I compared the difference and got a large blob of numbers. To make a
> proper comparison I'll need to compare the base and exponent for which
> the numbers are different rather than the numbers themselves. I'm
> following Dave's suggestion of determining the symmetric difference of
> the sets.

But once you have a few that are just plain way off, it doesn't really
matter whether some others differ in the 17th place. All the other
discussion is interesting, but don't forget the main point, that trying
to represent large integers (over 17 or so digits) as floats is going to
lose precision. That can happen a number of ways, and math.pow is just
one of them.

Floats use a finite precision to store their value, and the radix is
binary, not decimal. So figuring where they start to lose precision is
tricky. If you're doing a calculation where all intermediate values are
integers, you're usually better off sticking with int/long.

There are many other kinds of constraints that come up in programming,
and Python usually has an answer for each. But in a machine of finite
size, and when we care at least a little about performance, we
frequently have to pick our algorithm, our set of functions, and our
data format carefully.

Someone else has mentioned the decimal package and the fractions. Each
of those has a lot to offer in specific situations. But none is a panacea.

--
DaveA

Steven D'Aprano
Guest
Posts: n/a

 02-22-2013
On Fri, 22 Feb 2013 08:23:27 +1100, Chris Angelico wrote:

> and you can cast out 1's in binary to find out if it's a
> multiple of 1, too.

O_o

I wanna see the numbers that aren't a multiple of 1.

--
Steven

Steven D'Aprano
Guest
Posts: n/a

 02-22-2013
On Thu, 21 Feb 2013 19:33:32 +0000, Schizoid Man wrote:

> Hi there,
>
> I run the following code in Python 3.3.0 (on a Windows 7 machine) and
> Python 2.7.3 on a Mac and I get two different results:

Others have already explained that math.pow and the ** exponentiation
operator are subtly different. However I wish to discuss your code:

> result1 = []
> result2 = []
> for a in range(2,101):
> for b in range(2,101):
> result1.append(math.pow(a,b))
> result2.append(a**b)
> result1 = list(set(result1))
> result2 = list(set(result2))
> print (len(result1))
> print (len(result2))

This is more simply written as:

result1 = set()
result2 = set()
for a in range(2, 101):
for b in range(2, 101):

print(len(result1))
print(len(result2))

No need for the pointless conversion from list to set to list again, if
all you want is the number of unique values.

More interesting is to gather the pairs of values that differ:

results = []
for a in range(2, 101):
for b in range(2, 101):
if a**b != math.pow(a, b): results.append((a, b))

You will see that integer exponentiation and floating point
exponentiation are more frequently different than not.

(8105 of the calculations differ, 1696 of them are the same.)

--
Steven