Velocity Reviews - Computer Hardware Reviews

Performance of int/long in Python 3

 
 
Chris Angelico
03-25-2013
The Python 3 merge of int and long has effectively penalized
small-number arithmetic by removing an optimization. As we've seen
from PEP 393 strings (jmf aside), there can be huge benefits from
having a single type with multiple representations internally. Is
there value in making the int type have a machine-word optimization in
the same way?

The cost is clear. Compare these methods for calculating the sum of
all numbers up to 65535, which stays under 2^31:

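# Sum 0..n using the built-in sum() over a range.
def range_sum(n):
    return sum(range(n+1))

# Same sum with an explicit loop; the total stays under 2^31.
def forloop(n):
    tot=0
    for i in range(n+1):
        tot+=i
    return tot

# Same loop, but started from a large offset so the running total
# is well above 2^31 for the whole loop.
def forloop_offset(n):
    tot=1000000000000000
    for i in range(n+1):
        tot+=i
    return tot-1000000000000000

import timeit
import sys
print(sys.version)
print("inline: %d"%sum(range(65536)))
print(timeit.timeit("sum(range(65536))",number=1000))
# Print each function's result, then the time for 1000 calls.
for func in ['range_sum','forloop','forloop_offset']:
    print("%s: %r"%(func,(globals()[func](65535))))
    print(timeit.timeit(func+"(65535)","from __main__ import "+func,number=1000))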


Windows XP:
C:\>python26\python inttime.py
2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)]
inline: 2147450880
2.36770455463
range_sum: 2147450880
2.61778550067
forloop: 2147450880
7.91409131608
forloop_offset: 2147450880L
23.3116954809

C:\>python33\python inttime.py
3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)]
inline: 2147450880
5.25038713020789
range_sum: 2147450880
5.412975112758745
forloop: 2147450880
17.875799577879313
forloop_offset: 2147450880
19.31672544974291

Debian Wheezy:
rosuav@sikorsky:~$ python inttime.py
2.7.3 (default, Jan 2 2013, 13:56:14)
[GCC 4.7.2]
inline: 2147450880
1.92763710022
range_sum: 2147450880
1.93409109116
forloop: 2147450880
5.14633893967
forloop_offset: 2147450880
5.13459300995
rosuav@sikorsky:~$ python3 inttime.py
3.2.3 (default, Feb 20 2013, 14:44:27)
[GCC 4.7.2]
inline: 2147450880
2.884124994277954
range_sum: 2147450880
2.6586129665374756
forloop: 2147450880
7.660192012786865
forloop_offset: 2147450880
8.11817193031311


On 2.6/2.7, there's a massive penalty for switching to longs; on
3.2/3.3, the two for-loop versions are nearly identical in time.
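
To put a number on that, here is a rough sketch that divides each forloop_offset time by the matching forloop time, using the figures posted above rounded to four places. The dictionary layout and the name offset_penalty are made up for this illustration and are not part of inttime.py. A ratio near 1.0 means the offset loop cost about the same as the plain one.

```python
# Timings copied from the runs above (seconds per 1000 calls), rounded.
# Keys and layout are illustrative only, not part of inttime.py.
timings = {
    "WinXP 2.6.5":  {"forloop": 7.9141,  "forloop_offset": 23.3117},
    "WinXP 3.3.0":  {"forloop": 17.8758, "forloop_offset": 19.3167},
    "Wheezy 2.7.3": {"forloop": 5.1463,  "forloop_offset": 5.1346},
    "Wheezy 3.2.3": {"forloop": 7.6602,  "forloop_offset": 8.1182},
}

def offset_penalty(row):
    """Return the forloop_offset time divided by the forloop time."""
    return row["forloop_offset"] / row["forloop"]

for name in sorted(timings):
    print("%-13s %.2fx" % (name, offset_penalty(timings[name])))
```

On these numbers the jump is close to 3x on the 32-bit Windows 2.6 build, while the Wheezy 2.7.3 run shows essentially none. One possible reason, and this is only an assumption since the post does not say, is that the Wheezy box runs a 64-bit build where the 10^15 offset still fits in a native Python 2 int.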

(Side point: I'm often seeing that 3.2 on Linux is marginally faster
calling my range_sum function than doing the same thing inline. I do
not understand this. If anyone can explain what's going on there, I'm
all ears!)

Python 3's int is faster than Python 2's long, but slower than Python
2's int. So the question really is, would a two-form representation be
beneficial, and if so, is it worth the coding trouble?

ChrisA
 
Cousin Stanley
03-25-2013

Chris Angelico wrote:

> The Python 3 merge of int and long has effectively penalized
> small-number arithmetic by removing an optimization.
> ....
> The cost is clear.
> ....


The cost isn't quite as clear
under Debian Wheezy here ....

Stanley C. Kitching
Debian Wheezy

python   inline   range_sum   forloop    forloop_offset
2.7.3    3.1359   3.0725      9.0778     15.6475
3.2.3    2.8226   2.8074      13.47624   13.6430


# ---------------------------------------------------------

Chris Angelico
Debian Wheezy

python   inline   range_sum   forloop    forloop_offset
2.7.3    1.9276   1.9341      5.1463     5.1346
3.2.3    2.8841   2.6586      7.6602     8.1182


--
Stanley C. Kitching
Human Being
Phoenix, Arizona

 
Dan Stromberg
03-26-2013
On Mon, Mar 25, 2013 at 4:35 PM, Cousin Stanley <(E-Mail Removed)> wrote:

>
> Chris Angelico wrote:
>
> > The Python 3 merge of int and long has effectively penalized
> > small-number arithmetic by removing an optimization.
> > ....
> > The cost is clear.
> > ....

>


I thought I heard that Python 3.x will use machine words for small
integers, and automatically coerce internally to a 2.x long as needed.

Either way, it's better to pay a small performance cost to avoid problems
when computers move from 32-bit to 64-bit words, or from 64-bit to 128-bit
words. With 3.x ints, you don't have to worry about a new crop of CPUs
breaking your code.
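
As a small illustration of that last point, here is a sketch for Python 3 only. The names word_limit and beyond are made up for the example, and sys.maxsize is used as a stand-in for "the machine word limit", which is an approximation.

```python
import sys

# sys.maxsize is the largest value a Py_ssize_t can hold on this build:
# 2**31 - 1 on typical 32-bit builds, 2**63 - 1 on typical 64-bit ones.
word_limit = sys.maxsize

# Stepping past that limit simply yields a bigger int. Nothing wraps
# around and no OverflowError is raised.
beyond = word_limit + 1

print(word_limit, beyond)
print(type(beyond).__name__)
print(beyond - 1 == word_limit)
```

The same arithmetic gives the same answer whatever the word size of the machine, which is the portability being described.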

 
Chris Angelico
03-26-2013
On Tue, Mar 26, 2013 at 10:35 AM, Cousin Stanley
<(E-Mail Removed)> wrote:
>
> Chris Angelico wrote:
>
>> The Python 3 merge of int and long has effectively penalized
>> small-number arithmetic by removing an optimization.
>> ....
>> The cost is clear.
>> ....

>
> The cost isn't quite as clear
> under Debian Wheezy here ....
>
> Stanley C. Kitching
> Debian Wheezy
>
> python   inline   range_sum   forloop    forloop_offset
> 2.7.3    3.1359   3.0725      9.0778     15.6475
> 3.2.3    2.8226   2.8074      13.47624   13.6430


Interesting, so your 3.x sum() is optimizing something somewhere.
Strange. Are we both running the same Python? I got those from
apt-get, aiming for consistency (rather than building a 3.3 from
source).

The cost is still visible in the for-loop versions, though, and you're
still seeing the <2^31 and >2^31 for-loops behave the same way in 3.x
but perform quite differently in 2.x. So it's looking like things are
mostly the same.

ChrisA
 
Cousin Stanley
03-26-2013
Chris Angelico wrote:

> Interesting, so your 3.x sum() is optimizing something somewhere.
> Strange. Are we both running the same Python ?
>
> I got those from apt-get
> ....


I also installed python here under Debian Wheezy
via apt-get and our versions look to be the same ....

-sk-

2.7.3 (default, Jan 2 2013, 16:53:07) [GCC 4.7.2]

3.2.3 (default, Feb 20 2013, 17:02:41) [GCC 4.7.2]

CPU : Intel(R) Celeron(R) D CPU 3.33GHz


-ca-

2.7.3 (default, Jan 2 2013, 13:56:14) [GCC 4.7.2]

3.2.3 (default, Feb 20 2013, 14:44:27) [GCC 4.7.2]

CPU : ???


Could differences in underlying CPU architecture
lead to our differing python integer results ?



--
Stanley C. Kitching
Human Being
Phoenix, Arizona

 
Chris Angelico
03-26-2013
On Wed, Mar 27, 2013 at 12:38 AM, Cousin Stanley
<(E-Mail Removed)> wrote:
> Chris Angelico wrote:
>
>> Interesting, so your 3.x sum() is optimizing something somewhere.
>> Strange. Are we both running the same Python ?
>>
>> I got those from apt-get
>> ....

>
> I also installed python here under Debian Wheezy
> via apt-get and our versions look to be the same ....
>
> -sk-
>
> 2.7.3 (default, Jan 2 2013, 16:53:07) [GCC 4.7.2]
>
> 3.2.3 (default, Feb 20 2013, 17:02:41) [GCC 4.7.2]
>
> CPU : Intel(R) Celeron(R) D CPU 3.33GHz
>
>
> -ca-
>
> 2.7.3 (default, Jan 2 2013, 13:56:14) [GCC 4.7.2]
>
> 3.2.3 (default, Feb 20 2013, 14:44:27) [GCC 4.7.2]
>
> CPU : ???
>
>
> Could differences in underlying CPU architecture
> lead to our differing python integer results ?


Doubtful. I have Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz quad-core
with hyperthreading, but I'm only using one core for this job. I've
run the tests several times and each time, Py2 is a shade under two
seconds for inline/range_sum, and Py3 is about 2.5 seconds for each.
Fascinating.

Just for curiosity's sake, I spun up the tests on my reiplophobic
server, still running Ubuntu Karmic. Pentium(R) Dual-Core CPU
E6500 @ 2.93GHz.

gideon@gideon:~$ python inttime.py
2.6.4 (r264:75706, Dec 7 2009, 18:45:15)
[GCC 4.4.1]
inline: 2147450880
2.7050409317
range_sum: 2147450880
2.64918494225
forloop: 2147450880
6.58765792847
forloop_offset: 2147450880L
16.5167789459
gideon@gideon:~$ python3 inttime.py
3.1.1+ (r311:74480, Nov 2 2009, 14:49:22)
[GCC 4.4.1]
inline: 2147450880
4.44533085823
range_sum: 2147450880
4.37314105034
forloop: 2147450880
12.4834370613
forloop_offset: 2147450880
13.5000522137

Once again, Py3 is slower on small integers than Py2. So where's the
difference with your system? This is really weird! I assume you can
repeat the tests and get the same result every time?

ChrisA
 
Cousin Stanley
03-26-2013

Chris Angelico wrote:

> Once again, Py3 is slower on small integers than Py2.


Chris Angelico
Ubuntu Karmic.
Pentium(R) Dual-Core CPU E6500 @ 2.93GHz.

python   inline   range_sum   forloop    forloop_offset
2.6.4    2.7050   2.6492      6.5877     16.5168
3.1.1    4.4453   4.3731      12.4834    13.5001

You do seem to have a slight py3 improvement
under ubuntu for the forloop_offset case ....


> So where's the difference with your system ?


CPU ????


> This is really weird !


Yep ...


> I assume you can repeat the tests
> and get the same result every time ?


Yes ....

First lines of numbers below are from yesterday
while second lines are from today ....

Stanley C. Kitching
Debian Wheezy
Intel(R) Celeron(R) D CPU 3.33GHz Single Core

python   inline   range_sum   forloop    forloop_offset

2.7.3    3.1359   3.0725      9.0778     15.6475
2.7.3    3.0382   3.1452      9.8799     16.8579

3.2.3    2.8226   2.8074      13.47624   13.6430
3.2.3    2.8331   2.8228      13.54151   13.8716


--
Stanley C. Kitching
Human Being
Phoenix, Arizona

 
Chris Angelico
03-26-2013
On Wed, Mar 27, 2013 at 3:41 AM, Cousin Stanley <(E-Mail Removed)> wrote:
>
> Chris Angelico wrote:
>
>> Once again, Py3 is slower on small integers than Py2.

>
> Chris Angelico
> Ubuntu Karmic.
> Pentium(R) Dual-Core CPU E6500 @ 2.93GHz.
>
> python   inline   range_sum   forloop    forloop_offset
> 2.6.4    2.7050   2.6492      6.5877     16.5168
> 3.1.1    4.4453   4.3731      12.4834    13.5001
>
> You do seem to have a slight py3 improvement
> under ubuntu for the forloop_offset case ....


Yes, that's correct. The forloop_offset one is using long integers in
all cases. (Well, on Py2 it's adding a series of ints to a long, but
the arithmetic always has to be done with longs.) Python 3 has had
some improvements done, but the main thing is that there's a massive
spike in the Py2 time, while Py3 has _already paid_ that cost - as
evidenced by the closeness of the forloop and forloop_offset times on
Py3.
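
A quick way to see which of the test's values cross a machine-word boundary is to check them against the 32-bit and 64-bit signed ranges. This is only a sketch: fits_signed is a helper invented for the illustration, and which width a given Python 2 build uses for its int depends on the platform's C long, so treat the two widths as assumptions about the machines in this thread.

```python
def fits_signed(value, bits):
    """True if value fits in a two's-complement signed integer of that width."""
    limit = 1 << (bits - 1)
    return -limit <= value <= limit - 1

total = sum(range(65536))     # the sum every test function returns
offset = 1000000000000000     # starting value used by forloop_offset

for label, value in [("total", total),
                     ("offset", offset),
                     ("offset+total", offset + total)]:
    print("%-13s 32-bit: %-5s 64-bit: %s"
          % (label, fits_signed(value, 32), fits_signed(value, 64)))
```

The plain total fits in 32 bits, while the offset does not, though it does fit in 64. That would explain a Py2 spike on builds with a 32-bit int and no spike on builds with a 64-bit one.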

ChrisA
 
Terry Reedy
03-26-2013
On 3/26/2013 12:41 PM, Cousin Stanley wrote:

>> So where's the difference with your system ?

>
> CPU ????


Compilers and compiler settings can also make a difference.
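
For anyone comparing numbers across machines, a sketch along these lines prints the build details that tend to matter. The function name build_report and the choice of fields are just suggestions; the optimization flags used for the build are not covered here.

```python
import platform
import struct
import sys

def build_report():
    """Collect interpreter build details worth posting next to timings."""
    return {
        "version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "compiler": platform.python_compiler(),
        "machine": platform.machine(),
        # Native sizes as seen by this interpreter build.
        "pointer_bits": struct.calcsize("P") * 8,
        "c_long_bits": struct.calcsize("l") * 8,
        "maxsize": sys.maxsize,
    }

for key, value in sorted(build_report().items()):
    print("%-15s %s" % (key, value))
```

Posting this output alongside the timings makes it easier to tell a 32-bit build from a 64-bit one, or one compiler from another, before blaming the CPU.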

--
Terry Jan Reedy

 
jmfauth
03-26-2013
On 25 mar, 22:51, Chris Angelico <(E-Mail Removed)> wrote:
> The Python 3 merge of int and long has effectively penalized
> small-number arithmetic by removing an optimization. As we've seen
> from PEP 393 strings (jmf aside), there can be huge benefits from
> having a single type with multiple representations internally ...


------

A character is not an integer (short form).

jmf
 