Velocity Reviews

Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   Python (http://www.velocityreviews.com/forums/f43-python.html)
-   -   Confusing math problem (http://www.velocityreviews.com/forums/t957898-confusing-math-problem.html)

Schizoid Man 02-21-2013 07:33 PM

Confusing math problem
 
Hi there,

I run the following code in Python 3.3.0 (on a Windows 7 machine) and Python
2.7.3 on a Mac and I get two different results:

result1 = []
result2 = []
for a in range(2,101):
for b in range(2,101):
result1.append(math.pow(a,b))
result2.append(a**b)
result1 = list(set(result1))
result2 = list(set(result2))
print (len(result1))
print (len(result2))

On the Windows box, I get 9183 for on both lines. However, on the Mac I get
9220 and 9183. Why this difference? Is there some sort of precision subtlety
I'm missing between ** and math.pow()?

Thank you.


Dave Angel 02-21-2013 08:25 PM

Re: Confusing math problem
 
On 02/21/2013 02:33 PM, Schizoid Man wrote:
> Hi there,
>
> I run the following code in Python 3.3.0 (on a Windows 7 machine) and
> Python 2.7.3 on a Mac and I get two different results:
>
> result1 = []
> result2 = []
> for a in range(2,101):
> for b in range(2,101):
> result1.append(math.pow(a,b))
> result2.append(a**b)
> result1 = list(set(result1))
> result2 = list(set(result2))
> print (len(result1))
> print (len(result2))
>
> On the Windows box, I get 9183 for on both lines. However, on the Mac I
> get 9220 and 9183. Why this difference? Is there some sort of precision
> subtlety I'm missing between ** and math.pow()?
>
> Thank you.


I assumed this was some difference between Python 2.x and 3.x.
However, on my 2.7.3 on Linux, I also get 9183 both times.

However, there is an important inaccuracy in math.pow, because it uses
floats to do the work. If you have very large integers, that means some
of them won't be correct. The following are some examples for 2.7.3 on
Linux:

a b math.pow(a,b) a**b
3 34 1.66771816997e+16 16677181699666569
3 35 5.0031545099e+16 50031545098999707
....
5 23 1.19209289551e+16 11920928955078125

The built-in pow, on the other hand, seems to get identical answers for
all these cases. So use pow() instead of math.pow()

One other test:

diff = set(map(int, result1)).symmetric_difference(set(result2))
if diff:
print diff
print len(diff)

shows me a diff set of 15656 members. One such member:

13552527156068805425093160010874271392822265625000 00000000000000000000000000000000000000000000000000 0000000000000L

Notice how using floats truncated lots of the digits in the value?
--
DaveA

Ian Kelly 02-21-2013 08:42 PM

Re: Confusing math problem
 
On Thu, Feb 21, 2013 at 12:33 PM, Schizoid Man
<schiz_man@21stcentury.com> wrote:
> Hi there,
>
> I run the following code in Python 3.3.0 (on a Windows 7 machine) and Python
> 2.7.3 on a Mac and I get two different results:
>
> result1 = []
> result2 = []
> for a in range(2,101):
> for b in range(2,101):
> result1.append(math.pow(a,b))
> result2.append(a**b)
> result1 = list(set(result1))
> result2 = list(set(result2))
> print (len(result1))
> print (len(result2))
>
> On the Windows box, I get 9183 for on both lines. However, on the Mac I get
> 9220 and 9183. Why this difference? Is there some sort of precision subtlety
> I'm missing between ** and math.pow()?


math.pow is basically a wrapper for the C standard pow function, which
operates on doubles. The difference you're seeing is probably a
difference in implementation in the platform's C library. The **
operator on the other hand is implemented separately as a Python
built-in and operates on any numeric data type.

Dave Angel 02-21-2013 08:49 PM

Re: Confusing math problem
 
On 02/21/2013 03:25 PM, Dave Angel wrote:
>
>> <snip>
>>

> a b math.pow(a,b) a**b
> 3 34 1.66771816997e+16 16677181699666569
> 3 35 5.0031545099e+16 50031545098999707
> ...
> 5 23 1.19209289551e+16 11920928955078125
>
> The built-in pow, on the other hand, seems to get identical answers for
> all these cases. So use pow() instead of math.pow()
>
> One other test:
>
> diff = set(map(int, result1)).symmetric_difference(set(result2))
> if diff:
> print diff
> print len(diff)
>
> shows me a diff set of 15656 members. One such member:
>
> 13552527156068805425093160010874271392822265625000 00000000000000000000000000000000000000000000000000 0000000000000L
>
>
> Notice how using floats truncated lots of the digits in the value?


Sorry, I just rechecked, and that value is correct for 50**66 power.

However, if I do:

print 3**60, "\n", int(math.pow(3,60)), "\n", pow(3,60)


I get:

42391158275216203514294433201
42391158275216203520420085760
42391158275216203514294433201


and the middle one is the one that's wrong. You can tell by casting out
9's. The middle one gets 1 instead of zero, showing that it's NOT
divisible by 3.

--
DaveA

Chris Angelico 02-21-2013 09:23 PM

Re: Confusing math problem
 
On Fri, Feb 22, 2013 at 7:49 AM, Dave Angel <davea@davea.name> wrote:
> However, if I do:
>
> print 3**60, "\n", int(math.pow(3,60)), "\n", pow(3,60)
>
>
> I get:
>
> 42391158275216203514294433201
> 42391158275216203520420085760
> 42391158275216203514294433201
>
>
> and the middle one is the one that's wrong.


In theory, a float should hold the nearest representable value to the
exact result. Considering that only one operation is being performed,
there should be no accumulation of error. The integer results show a
small number (618) of collisions, eg 2**16 and 4**8; why should some
of those NOT collide when done with floating point? My initial thought
was "Oh, this is comparing floats for equality", but after one single
operation, that should be not a problem.

> You can tell by casting out 9's. The middle one gets 1
> instead of zero, showing that it's NOT divisible by 3.


Which I thought so cool and magical and awesome, until I started
exploring other bases and found that you could cast out F's in hex and
7's in octal... and you can cast out 1's in binary to find out if it's
a multiple of 1, too.

ChrisA

Peter Pearson 02-21-2013 09:59 PM

Re: Confusing math problem
 
On Fri, 22 Feb 2013 08:23:27 +1100, Chris Angelico <rosuav@gmail.com> wrote:
> On Fri, Feb 22, 2013 at 7:49 AM, Dave Angel <davea@davea.name> wrote:
>> However, if I do:
>>
>> print 3**60, "\n", int(math.pow(3,60)), "\n", pow(3,60)
>>
>>
>> I get:
>>
>> 42391158275216203514294433201
>> 42391158275216203520420085760
>> 42391158275216203514294433201
>>
>>
>> and the middle one is the one that's wrong.

>
> In theory, a float should hold the nearest representable value to the
> exact result. Considering that only one operation is being performed,
> there should be no accumulation of error. The integer results show a
> small number (618) of collisions, eg 2**16 and 4**8; why should some
> of those NOT collide when done with floating point? My initial thought
> was "Oh, this is comparing floats for equality", but after one single
> operation, that should be not a problem.


Does this help explain it?

>>> print hex(int(math.pow(3,60))); print hex(3**60)

0x88f924eeceeda80000000000L
0x88f924eeceeda7fe92e1f5b1L


--
To email me, substitute nowhere->spamcop, invalid->net.

Chris Angelico 02-21-2013 10:11 PM

Re: Confusing math problem
 
On Fri, Feb 22, 2013 at 8:59 AM, Peter Pearson <ppearson@nowhere.invalid> wrote:
> On Fri, 22 Feb 2013 08:23:27 +1100, Chris Angelico <rosuav@gmail.com> wrote:
>> In theory, a float should hold the nearest representable value to the
>> exact result. Considering that only one operation is being performed,
>> there should be no accumulation of error. The integer results show a
>> small number (618) of collisions, eg 2**16 and 4**8; why should some
>> of those NOT collide when done with floating point? My initial thought
>> was "Oh, this is comparing floats for equality", but after one single
>> operation, that should be not a problem.

>
> Does this help explain it?
>
>>>> print hex(int(math.pow(3,60))); print hex(3**60)

> 0x88f924eeceeda80000000000L
> 0x88f924eeceeda7fe92e1f5b1L
>


I understand how the inaccuracy works, but I'm expecting it to be as
consistent as Mr Grossmith's entertainments. It doesn't matter that
math.pow(3,60) != 3**60, but the number of collisions is different
when done with floats on the OP's Mac. Here's what I'm talking about:

>>> set((3**60,9**30,27**20))

{42391158275216203514294433201}
>>> set((math.pow(3,60),math.pow(9,30),math.pow(27,20) ))

{4.23911582752162e+28}

Note how, in each case, calculating three powers that have the same
real-number result gives a one-element set. Three to the sixtieth
power can't be perfectly rendered with a 53-bit mantissa, but it's
rendered the same way whichever route is used to calculate it.

ChrisA

Dave Angel 02-21-2013 10:33 PM

Re: Confusing math problem
 
On 02/21/2013 05:11 PM, Chris Angelico wrote:
>
>> <snip>

>
> Note how, in each case, calculating three powers that have the same
> real-number result gives a one-element set. Three to the sixtieth
> power can't be perfectly rendered with a 53-bit mantissa, but it's
> rendered the same way whichever route is used to calculate it.
>


But you don't know how the floating point math library (note, it's the
machine's C-library, not Python's that used) actually calculates that.

For example, if they were to calculate 2**64 by squaring the number 6
times, that's likely to give a different answer than multiplying by 2 63
times. And you don't know how the library does it. For any integer
power up to 128, you can do a combination of square and multiply so that
the total operations are never more than 13, more or less. But if you
then figure a = a*a and b = b/2, and do the same optimization, you
might not do them exactly in the same order, and therefore might not get
exactly the same answer.

Even if it's being done in the coprocessor inside the Pentium, we don't
have a documented algorithm for it. Professor Kahn helped with the
8087, but I know they've tweaked their algorithms over the years (as
well as repairing bugs). So it might not be a difference between Python
versions, nor between OS's, but between processor chips.

--
DaveA

Schizoid Man 02-21-2013 10:39 PM

Re: Confusing math problem
 


"Dave Angel" <davea@davea.name> wrote in message
> On 02/21/2013 02:33 PM, Schizoid Man wrote:
> However, there is an important inaccuracy in math.pow, because it uses
> floats to do the work. If you have very large integers, that means some
> of them won't be correct. The following are some examples for 2.7.3 on
> Linux:
>
> a b math.pow(a,b) a**b
> 3 34 1.66771816997e+16 16677181699666569
> 3 35 5.0031545099e+16 50031545098999707
> ...
> 5 23 1.19209289551e+16 11920928955078125
>
> The built-in pow, on the other hand, seems to get identical answers for
> all these cases. So use pow() instead of math.pow()


I see. I thought using the ** was shorthand for math.pow() and didn't think
that one would be integer operations and the other floats. I'm performing
some large integer arithmetic operations. I would normally do this my
writing my own multiplication class and storing results as strings, but a
friend suggested that I look at Python.

I ran this one example and was quite surprised at the difference, since 9183
is the correct answer.

> One other test:
>
> diff = set(map(int, result1)).symmetric_difference(set(result2))
> if diff:
> print diff
> print len(diff)
>
> shows me a diff set of 15656 members. One such member:
>
> 13552527156068805425093160010874271392822265625000 00000000000000000000000000000000000000000000000000 0000000000000L
>
> Notice how using floats truncated lots of the digits in the value?


I'm running this test now, but the Mac's fan has kicked in (it's a slightly
older machine) so might it let run through the night.

I appreciate the help.

> --
> DaveA



Schizoid Man 02-21-2013 10:44 PM

Re: Confusing math problem
 
"Chris Angelico" <rosuav@gmail.com> wrote in

<snip>

> First, are you aware that ** will return int (or sometimes long on
> 2.7.3), while math.pow() will return a float? That may tell you why
> you're seeing differences. That said, though, I wasn't able to
> replicate your result using 2.7.3 and 3.3.0 both on Windows - always
> 9183, indicating 618 of the powers are considered equal. But in
> theory, at least, what you're seeing is that 37 of them compare
> different in floating point on your Mac build. Something to consider:
>
> print(set(result1)-set(result2))


No, I was aware to be honest. I thought ** was just short hand for
math.pow(). Since ** is the integer operation, I suppose ^ doesn't work as
an exponent function in Python?

I compared the difference and got a large blob of numbers. To make a proper
comparison I'll need to compare the base and exponent for which the numbers
are different rather than the numbers themselves. I'm following Dave's
suggestion of determining the symmetric difference of the sets.

Thanks for the help.



All times are GMT. The time now is 08:29 AM.

Powered by vBulletin®. Copyright ©2000 - 2014, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.