Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > Confusing math problem

Reply
Thread Tools

Confusing math problem

 
 
Schizoid Man
Guest
Posts: n/a
 
      02-21-2013
Hi there,

I run the following code in Python 3.3.0 (on a Windows 7 machine) and Python
2.7.3 on a Mac and I get two different results:

result1 = []
result2 = []
for a in range(2,101):
for b in range(2,101):
result1.append(math.pow(a,b))
result2.append(a**b)
result1 = list(set(result1))
result2 = list(set(result2))
print (len(result1))
print (len(result2))

On the Windows box, I get 9183 for on both lines. However, on the Mac I get
9220 and 9183. Why this difference? Is there some sort of precision subtlety
I'm missing between ** and math.pow()?

Thank you.

 
Reply With Quote
 
 
 
 
Dave Angel
Guest
Posts: n/a
 
      02-21-2013
On 02/21/2013 02:33 PM, Schizoid Man wrote:
> Hi there,
>
> I run the following code in Python 3.3.0 (on a Windows 7 machine) and
> Python 2.7.3 on a Mac and I get two different results:
>
> result1 = []
> result2 = []
> for a in range(2,101):
> for b in range(2,101):
> result1.append(math.pow(a,b))
> result2.append(a**b)
> result1 = list(set(result1))
> result2 = list(set(result2))
> print (len(result1))
> print (len(result2))
>
> On the Windows box, I get 9183 for on both lines. However, on the Mac I
> get 9220 and 9183. Why this difference? Is there some sort of precision
> subtlety I'm missing between ** and math.pow()?
>
> Thank you.


I assumed this was some difference between Python 2.x and 3.x.
However, on my 2.7.3 on Linux, I also get 9183 both times.

However, there is an important inaccuracy in math.pow, because it uses
floats to do the work. If you have very large integers, that means some
of them won't be correct. The following are some examples for 2.7.3 on
Linux:

a b math.pow(a,b) a**b
3 34 1.66771816997e+16 16677181699666569
3 35 5.0031545099e+16 50031545098999707
....
5 23 1.19209289551e+16 11920928955078125

The built-in pow, on the other hand, seems to get identical answers for
all these cases. So use pow() instead of math.pow()

One other test:

diff = set(map(int, result1)).symmetric_difference(set(result2))
if diff:
print diff
print len(diff)

shows me a diff set of 15656 members. One such member:

13552527156068805425093160010874271392822265625000 00000000000000000000000000000000000000000000000000 0000000000000L

Notice how using floats truncated lots of the digits in the value?
--
DaveA
 
Reply With Quote
 
 
 
 
Ian Kelly
Guest
Posts: n/a
 
      02-21-2013
On Thu, Feb 21, 2013 at 12:33 PM, Schizoid Man
<(E-Mail Removed)> wrote:
> Hi there,
>
> I run the following code in Python 3.3.0 (on a Windows 7 machine) and Python
> 2.7.3 on a Mac and I get two different results:
>
> result1 = []
> result2 = []
> for a in range(2,101):
> for b in range(2,101):
> result1.append(math.pow(a,b))
> result2.append(a**b)
> result1 = list(set(result1))
> result2 = list(set(result2))
> print (len(result1))
> print (len(result2))
>
> On the Windows box, I get 9183 for on both lines. However, on the Mac I get
> 9220 and 9183. Why this difference? Is there some sort of precision subtlety
> I'm missing between ** and math.pow()?


math.pow is basically a wrapper for the C standard pow function, which
operates on doubles. The difference you're seeing is probably a
difference in implementation in the platform's C library. The **
operator on the other hand is implemented separately as a Python
built-in and operates on any numeric data type.
 
Reply With Quote
 
Dave Angel
Guest
Posts: n/a
 
      02-21-2013
On 02/21/2013 03:25 PM, Dave Angel wrote:
>
>> <snip>
>>

> a b math.pow(a,b) a**b
> 3 34 1.66771816997e+16 16677181699666569
> 3 35 5.0031545099e+16 50031545098999707
> ...
> 5 23 1.19209289551e+16 11920928955078125
>
> The built-in pow, on the other hand, seems to get identical answers for
> all these cases. So use pow() instead of math.pow()
>
> One other test:
>
> diff = set(map(int, result1)).symmetric_difference(set(result2))
> if diff:
> print diff
> print len(diff)
>
> shows me a diff set of 15656 members. One such member:
>
> 13552527156068805425093160010874271392822265625000 00000000000000000000000000000000000000000000000000 0000000000000L
>
>
> Notice how using floats truncated lots of the digits in the value?


Sorry, I just rechecked, and that value is correct for 50**66 power.

However, if I do:

print 3**60, "\n", int(math.pow(3,60)), "\n", pow(3,60)


I get:

42391158275216203514294433201
42391158275216203520420085760
42391158275216203514294433201


and the middle one is the one that's wrong. You can tell by casting out
9's. The middle one gets 1 instead of zero, showing that it's NOT
divisible by 3.

--
DaveA
 
Reply With Quote
 
Chris Angelico
Guest
Posts: n/a
 
      02-21-2013
On Fri, Feb 22, 2013 at 7:49 AM, Dave Angel <(E-Mail Removed)> wrote:
> However, if I do:
>
> print 3**60, "\n", int(math.pow(3,60)), "\n", pow(3,60)
>
>
> I get:
>
> 42391158275216203514294433201
> 42391158275216203520420085760
> 42391158275216203514294433201
>
>
> and the middle one is the one that's wrong.


In theory, a float should hold the nearest representable value to the
exact result. Considering that only one operation is being performed,
there should be no accumulation of error. The integer results show a
small number (61 of collisions, eg 2**16 and 4**8; why should some
of those NOT collide when done with floating point? My initial thought
was "Oh, this is comparing floats for equality", but after one single
operation, that should be not a problem.

> You can tell by casting out 9's. The middle one gets 1
> instead of zero, showing that it's NOT divisible by 3.


Which I thought so cool and magical and awesome, until I started
exploring other bases and found that you could cast out F's in hex and
7's in octal... and you can cast out 1's in binary to find out if it's
a multiple of 1, too.

ChrisA
 
Reply With Quote
 
Peter Pearson
Guest
Posts: n/a
 
      02-21-2013
On Fri, 22 Feb 2013 08:23:27 +1100, Chris Angelico <(E-Mail Removed)> wrote:
> On Fri, Feb 22, 2013 at 7:49 AM, Dave Angel <(E-Mail Removed)> wrote:
>> However, if I do:
>>
>> print 3**60, "\n", int(math.pow(3,60)), "\n", pow(3,60)
>>
>>
>> I get:
>>
>> 42391158275216203514294433201
>> 42391158275216203520420085760
>> 42391158275216203514294433201
>>
>>
>> and the middle one is the one that's wrong.

>
> In theory, a float should hold the nearest representable value to the
> exact result. Considering that only one operation is being performed,
> there should be no accumulation of error. The integer results show a
> small number (61 of collisions, eg 2**16 and 4**8; why should some
> of those NOT collide when done with floating point? My initial thought
> was "Oh, this is comparing floats for equality", but after one single
> operation, that should be not a problem.


Does this help explain it?

>>> print hex(int(math.pow(3,60))); print hex(3**60)

0x88f924eeceeda80000000000L
0x88f924eeceeda7fe92e1f5b1L


--
To email me, substitute nowhere->spamcop, invalid->net.
 
Reply With Quote
 
Chris Angelico
Guest
Posts: n/a
 
      02-21-2013
On Fri, Feb 22, 2013 at 8:59 AM, Peter Pearson <(E-Mail Removed)> wrote:
> On Fri, 22 Feb 2013 08:23:27 +1100, Chris Angelico <(E-Mail Removed)> wrote:
>> In theory, a float should hold the nearest representable value to the
>> exact result. Considering that only one operation is being performed,
>> there should be no accumulation of error. The integer results show a
>> small number (61 of collisions, eg 2**16 and 4**8; why should some
>> of those NOT collide when done with floating point? My initial thought
>> was "Oh, this is comparing floats for equality", but after one single
>> operation, that should be not a problem.

>
> Does this help explain it?
>
>>>> print hex(int(math.pow(3,60))); print hex(3**60)

> 0x88f924eeceeda80000000000L
> 0x88f924eeceeda7fe92e1f5b1L
>


I understand how the inaccuracy works, but I'm expecting it to be as
consistent as Mr Grossmith's entertainments. It doesn't matter that
math.pow(3,60) != 3**60, but the number of collisions is different
when done with floats on the OP's Mac. Here's what I'm talking about:

>>> set((3**60,9**30,27**20))

{42391158275216203514294433201}
>>> set((math.pow(3,60),math.pow(9,30),math.pow(27,20) ))

{4.23911582752162e+28}

Note how, in each case, calculating three powers that have the same
real-number result gives a one-element set. Three to the sixtieth
power can't be perfectly rendered with a 53-bit mantissa, but it's
rendered the same way whichever route is used to calculate it.

ChrisA
 
Reply With Quote
 
Dave Angel
Guest
Posts: n/a
 
      02-21-2013
On 02/21/2013 05:11 PM, Chris Angelico wrote:
>
>> <snip>

>
> Note how, in each case, calculating three powers that have the same
> real-number result gives a one-element set. Three to the sixtieth
> power can't be perfectly rendered with a 53-bit mantissa, but it's
> rendered the same way whichever route is used to calculate it.
>


But you don't know how the floating point math library (note, it's the
machine's C-library, not Python's that used) actually calculates that.

For example, if they were to calculate 2**64 by squaring the number 6
times, that's likely to give a different answer than multiplying by 2 63
times. And you don't know how the library does it. For any integer
power up to 128, you can do a combination of square and multiply so that
the total operations are never more than 13, more or less. But if you
then figure a = a*a and b = b/2, and do the same optimization, you
might not do them exactly in the same order, and therefore might not get
exactly the same answer.

Even if it's being done in the coprocessor inside the Pentium, we don't
have a documented algorithm for it. Professor Kahn helped with the
8087, but I know they've tweaked their algorithms over the years (as
well as repairing bugs). So it might not be a difference between Python
versions, nor between OS's, but between processor chips.

--
DaveA
 
Reply With Quote
 
Schizoid Man
Guest
Posts: n/a
 
      02-21-2013


"Dave Angel" <(E-Mail Removed)> wrote in message
> On 02/21/2013 02:33 PM, Schizoid Man wrote:
> However, there is an important inaccuracy in math.pow, because it uses
> floats to do the work. If you have very large integers, that means some
> of them won't be correct. The following are some examples for 2.7.3 on
> Linux:
>
> a b math.pow(a,b) a**b
> 3 34 1.66771816997e+16 16677181699666569
> 3 35 5.0031545099e+16 50031545098999707
> ...
> 5 23 1.19209289551e+16 11920928955078125
>
> The built-in pow, on the other hand, seems to get identical answers for
> all these cases. So use pow() instead of math.pow()


I see. I thought using the ** was shorthand for math.pow() and didn't think
that one would be integer operations and the other floats. I'm performing
some large integer arithmetic operations. I would normally do this my
writing my own multiplication class and storing results as strings, but a
friend suggested that I look at Python.

I ran this one example and was quite surprised at the difference, since 9183
is the correct answer.

> One other test:
>
> diff = set(map(int, result1)).symmetric_difference(set(result2))
> if diff:
> print diff
> print len(diff)
>
> shows me a diff set of 15656 members. One such member:
>
> 13552527156068805425093160010874271392822265625000 00000000000000000000000000000000000000000000000000 0000000000000L
>
> Notice how using floats truncated lots of the digits in the value?


I'm running this test now, but the Mac's fan has kicked in (it's a slightly
older machine) so might it let run through the night.

I appreciate the help.

> --
> DaveA


 
Reply With Quote
 
Schizoid Man
Guest
Posts: n/a
 
      02-21-2013
"Chris Angelico" <(E-Mail Removed)> wrote in

<snip>

> First, are you aware that ** will return int (or sometimes long on
> 2.7.3), while math.pow() will return a float? That may tell you why
> you're seeing differences. That said, though, I wasn't able to
> replicate your result using 2.7.3 and 3.3.0 both on Windows - always
> 9183, indicating 618 of the powers are considered equal. But in
> theory, at least, what you're seeing is that 37 of them compare
> different in floating point on your Mac build. Something to consider:
>
> print(set(result1)-set(result2))


No, I was aware to be honest. I thought ** was just short hand for
math.pow(). Since ** is the integer operation, I suppose ^ doesn't work as
an exponent function in Python?

I compared the difference and got a large blob of numbers. To make a proper
comparison I'll need to compare the base and exponent for which the numbers
are different rather than the numbers themselves. I'm following Dave's
suggestion of determining the symmetric difference of the sets.

Thanks for the help.

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Math.random() and Math.round(Math.random()) and Math.floor(Math.random()*2) VK Javascript 15 05-02-2010 03:43 PM
Math.min and Math.max for byte / short Philipp Java 9 07-23-2008 12:37 AM
math.h trig functions questions (and some forgotten high school math) Mark Healey C Programming 7 05-22-2006 10:42 AM
Re: Is still math.h the C++ math library ? AciD_X C++ 4 04-01-2004 07:29 PM
Why can I not use: Math a=new Math(); chirs Java 18 03-02-2004 06:00 PM



Advertisments