Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > Python is faster than C

Reply
Thread Tools

Python is faster than C

 
 
Armin Rigo
Guest
Posts: n/a
 
      04-03-2004
Hi!

This is a rant against the optimization trend of the Python interpreter.

Sorting a list of 100000 integers in random order takes:

* 0.75 seconds in Python 2.1
* 0.51 seconds in Python 2.2
* 0.46 seconds in Python 2.3

Tim Peters did a great job optimizing list.sort(). If I try with a
simple, non-stable pure Python quicksort implementation, in Python 2.3:

* 4.83 seconds
* 0.21 seconds with Psyco

First step towards world domination of high-level languages

The reason that Psyco manages to outperform the C implementation is not
that gcc is a bad compiler (it is about 10 times better than Psyco's).
The reason is that the C implementation must use a generic '<' operator
to compare elements, while the Psyco version quickly figures out that it
can expect to find ints in the list; it still has to check this
assumption, but this is cheap and then the comparison is done with a
single machine instruction.

Similarily, here are some results about the heapq module, which is
rewritten in C in the CVS tree for Python 2.4:

l = [random.random() for x in range(200000)]
heapq.heapify(l)

This code executes on my laptop in:

* 1.96 seconds on Python 2.3 (pure Python)
* 0.18 seconds on Python 2.4cvs (rewritten in C)
* 0.16 seconds on Python 2.3 with Psyco

So this is not so much a plug for Psyco as a rant against the current
trend of rewriting standard modules in C. Premature optimization and
all that.

Worse, and more importantly, the optimization starts to become visible
to the programmer. Iterators, for example, are great in limited cases
but I consider their introduction a significant complication in the
language; before, you could expect that some function from which you
would expect a sequence returned a list. Python was all lists and
dicts, with dicts used as namespaces here and there. Nowadays you have
to be careful. Moreover, it is harder to explain:

>>> zip([1,2,3], [4,5,6]) # easy to understand and explain

[(1, 4), (2, 5), (3, 6)]

>>> enumerate([6,7,8,9]) # uh ?

<enumerate object at 0x401a102c>

I know you can always do list(_). My point is that this is a
user-visible optimization. enumerate() should return a normal list, and
it should be someone else's job to ensure that it is correctly optimized
away if possible (and I'm not even talking about Psyco, it could be done
in the current Python implementation with a reasonable amount of
effort).


Protesting-ly yours,

Armin
 
Reply With Quote
 
 
 
 
Michel Claveau/Hamster
Guest
Posts: n/a
 
      04-03-2004
Hi !

For the 1st April, it's finish.





 
Reply With Quote
 
 
 
 
Josiah Carlson
Guest
Posts: n/a
 
      04-03-2004
> For the 1st April, it's finish.

That wasn't a joke, psyco does in fact work /really/ well for some things.

- Josiah
 
Reply With Quote
 
Armin Rigo
Guest
Posts: n/a
 
      04-03-2004
Michel Claveau/Hamster wrote:
> For the 1st April, it's finish.


Check it for yourself. Find yourself an Intel machine and grab Psyco
from http://psyco.sf.net. Here is the source of my test:

# Python Quicksort Written by Magnus Lie Hetland
# http://www.hetland.org/python/quicksort.html

def _partition(list, start, end):
pivot = list[end]
bottom = start-1
top = end

done = 0
while not done:

while not done:
bottom = bottom+1

if bottom == top:
done = 1
break

if pivot < list[bottom]:
list[top] = list[bottom]
break

while not done:
top = top-1

if top == bottom:
done = 1
break

if list[top] < pivot:
list[bottom] = list[top]
break

list[top] = pivot
return top


def _quicksort(list, start, end):
if start < end:
split = _partition(list, start, end)
_quicksort(list, start, split-1)
_quicksort(list, split+1, end)

def quicksort(list):
if len(list) > 1:
_quicksort(list, 0, len(list)-1)

# __________________________________________________ __________

import random, time, psyco
l = range(100000)
random.shuffle(l)

#TIMEIT = "l.sort()"
#TIMEIT = "quicksort(l)"
TIMEIT = "psyco.proxy(quicksort)(l)"


print TIMEIT, ':',
t = time.time()
exec TIMEIT
print time.time() - t

assert l == range(100000)
 
Reply With Quote
 
Paul Rubin
Guest
Posts: n/a
 
      04-03-2004
Armin Rigo <(E-Mail Removed)> writes:
> >>> enumerate([6,7,8,9]) # uh ?

> <enumerate object at 0x401a102c>
>
> I know you can always do list(_). My point is that this is a
> user-visible optimization. enumerate() should return a normal list, and
> it should be someone else's job to ensure that it is correctly optimized
> away if possible (and I'm not even talking about Psyco, it could be done
> in the current Python implementation with a reasonable amount of
> effort).


I think enumerate(xrange(1000000000)) returning a normal list would
exhaust memory before some later optimizer had a chance to do anything
with it.
 
Reply With Quote
 
Joe Mason
Guest
Posts: n/a
 
      04-03-2004
In article <(E-Mail Removed)>, Armin Rigo wrote:
> The reason that Psyco manages to outperform the C implementation is not
> that gcc is a bad compiler (it is about 10 times better than Psyco's).
> The reason is that the C implementation must use a generic '<' operator
> to compare elements, while the Psyco version quickly figures out that it
> can expect to find ints in the list; it still has to check this
> assumption, but this is cheap and then the comparison is done with a
> single machine instruction.


Why can't the C implementation do the same thing?

Joe
 
Reply With Quote
 
Armin Rigo
Guest
Posts: n/a
 
      04-03-2004
Hello Paul,

Paul Rubin wrote:
> I think enumerate(xrange(1000000000)) returning a normal list would
> exhaust memory before some later optimizer had a chance to do anything
> with it.


There are two levels: the language specification and its implementation.
My point is that there is no reason why everything that is (at the
language level) a list, should always be implemented as a plain array of
objects. The list returned by range(1000000) doesn't have to use 4MB of
memory when the same information can be encoded in a couple of ints.
The presence of xrange() is the oldest of these user-visible
optimization hack that should have been optimized in the implementation
instead. Similarily, if you do 'a'*999999999 I can think of a better
way to encode the resulting string than with 100MB of RAM.

On your specific example, we could also argue that a slightly involved
but reasonable amount of effort would allow C functions to internally
receive a context hint, which would tell enumerate() that its result
will only ever be used for iteration when this is the case, so that it
can internally return an iterable instead of a list -- but this would
only be a hack, a workaround for the limits of CPython's internal object
representations.


Armin
 
Reply With Quote
 
Paul Rubin
Guest
Posts: n/a
 
      04-03-2004
Armin Rigo <(E-Mail Removed)> writes:
> There are two levels: the language specification and its implementation.
> My point is that there is no reason why everything that is (at the
> language level) a list, should always be implemented as a plain array of
> objects. The list returned by range(1000000) doesn't have to use 4MB of
> memory when the same information can be encoded in a couple of ints.
> The presence of xrange() is the oldest of these user-visible
> optimization hack that should have been optimized in the implementation
> instead.


I think you're saying that instead of having xrange return a special
object, range should return that special object instead. I'm not too
clear on the distinction. Also, how can the compiler or interpreter
know whether the program will only access the object sequentially?
It has to predict the behavior of the program, instead of being told
explicitly.
 
Reply With Quote
 
Andrew MacIntyre
Guest
Posts: n/a
 
      04-04-2004
On Sat, 3 Apr 2004, Armin Rigo wrote:

> This is a rant against the optimization trend of the Python interpreter.
>
> Sorting a list of 100000 integers in random order takes:
>
> * 0.75 seconds in Python 2.1
> * 0.51 seconds in Python 2.2
> * 0.46 seconds in Python 2.3
>
> Tim Peters did a great job optimizing list.sort(). If I try with a
> simple, non-stable pure Python quicksort implementation, in Python 2.3:
>
> * 4.83 seconds
> * 0.21 seconds with Psyco
>
> First step towards world domination of high-level languages


{...}

> So this is not so much a plug for Psyco as a rant against the current
> trend of rewriting standard modules in C. Premature optimization and
> all that.


{...}

> Protesting-ly yours,


While I certainly appreciate pysco and what it can do, I believe your
protest to be unreasonable as it denies better performance on platforms
that psyco doesn't yet (& probably never will) support.

Moreover, your protest about iterators is also unreasonable as people are
benefitting from the reduced memory consumption iterators and their ilk
bring (quite often accompanied by performance gains from not having to
thrash large amounts of RAM through pitiful caches). As such, iterators
are a _different_ optimisation, and I hope that you can come to terms with
this and psycoise them too!

--
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: http://www.velocityreviews.com/forums/(E-Mail Removed) (pref) | Snail: PO Box 370
(E-Mail Removed) (alt) | Belconnen ACT 2616
Web: http://www.andymac.org/ | Australia

 
Reply With Quote
 
Armin Rigo
Guest
Posts: n/a
 
      04-04-2004
Paul Rubin wrote:
> I think you're saying that instead of having xrange return a special
> object, range should return that special object instead. I'm not too
> clear on the distinction.


No, range should return an object that is a list, as far as you can tell
from Python, but which is represented more efficiently than an array of
objects internally. The distinction is between the language level (it
would be a list, with all operations, etc.) and the implementation
(there is no reason why all lists should be arrays of PyObjects
internally).

Another example would be 'a'*999999999: the result is a string, but
there is no reason that it takes 100MB of memory. Instead, store it
into a C structure that contains a pointer to the original string object
'a' and the repetition counter, but still give this C structure the
Python type str, so that the difference doesn't show up and the Python
language remains simple. (This is a bit difficult to implement
currently in CPython, but not impossible.)

> Also, how can the compiler or interpreter
> know whether the program will only access the object sequentially?
> It has to predict the behavior of the program, instead of being told
> explicitly.


Ideally: If you do x=range(100); x[50]='hi' then the interpreter first
builds this optimized range representation and assigns it to x; and when
in the next statement you modify this list x it says 'oops! i cannot do
that with this representation', so it reverts to an array-like
representation (i.e. it creates all 100 elements) and then changes the
50th. No gain here. If on the other hand you only ever do 'easy'
things with your list, like iterate over it or read elements, then it
can all be done with the range representation, without falling back to
the array representation.

Again I'm not saying it is trivial to implement it, but that not having
to expose for optimization purposes 'xrange' and the whole 'iterator'
part of the language would be worth it, in my opinion.


Armin
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
RE: Wow, Python much faster than MatLab Doran, Harold Python 10 01-01-2007 06:56 PM
Wow, Python much faster than MatLab Stef Mientki Python 11 01-01-2007 02:05 AM
Lisp development with macros faster than Python development?.. seberino@spawar.navy.mil Python 37 07-11-2005 02:10 PM
Re: Python is faster than C Armin Rigo Python 6 04-05-2004 04:33 PM
RE: Python is faster than C Robert Brewer Python 2 04-05-2004 04:16 AM



Advertisments