Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > Which is faster?

Reply
Thread Tools

Which is faster?

 
 
cnb
Guest
Posts: n/a
 
      08-30-2008
For a big nbr of it might matter?
Is av_grade O(n*2) and the first O(n) when it comes to adding or is
"sum x for x in y" just traversing the list ones, accumulating the
values, it doesnt first build the list and then travese it for sum?

def averageGrade(self):
tot = 0
for review in self.reviews:
tot += review.grade
return tot / len(self.reviews)

def av_grade(self):
return sum(review.grade for review in self.reviews) / \
len(self.reviews)
 
Reply With Quote
 
 
 
 
Fredrik Lundh
Guest
Posts: n/a
 
      08-30-2008
cnb wrote:

> For a big nbr of it might matter?
> Is av_grade O(n*2) and the first O(n) when it comes to adding or is
> "sum x for x in y" just traversing the list ones, accumulating the
> values, it doesnt first build the list and then travese it for sum?


sum() doesn't build a list, but even if it would do that, it'd still be
O(n) -- looping over an array twice doesn't change the time complexity.

to find out which one's actually faster, benchmark them.

</F>

 
Reply With Quote
 
 
 
 
Gabriel Genellina
Guest
Posts: n/a
 
      08-30-2008
En Sat, 30 Aug 2008 01:26:35 -0300, cnb <(E-Mail Removed)> escribi�:

> For a big nbr of it might matter?
> Is av_grade O(n*2) and the first O(n) when it comes to adding or is
> "sum x for x in y" just traversing the list ones, accumulating the
> values, it doesnt first build the list and then travese it for sum?


AFAIK both versions are O(n).
sum(x for x in y) contains a "generator expression" [1], it does not
create an intermediate list.
sum([x for x in y]) contains a "list comprehension" [2] and it does create
an intermediate list.

I'd say the version using sum() should be faster, because the iteration is
implemented inside the interpreter, not in Python code as the other one.
But you should actually measure their relative performance; use the timeit
module for that.

> def averageGrade(self):
> tot = 0
> for review in self.reviews:
> tot += review.grade
> return tot / len(self.reviews)
>
> def av_grade(self):
> return sum(review.grade for review in self.reviews) / \
> len(self.reviews)


[1] http://docs.python.org/tut/node11.ht...00000000000000
[2] http://docs.python.org/tut/node7.html
--
Gabriel Genellina

 
Reply With Quote
 
Steven D'Aprano
Guest
Posts: n/a
 
      08-30-2008
On Fri, 29 Aug 2008 21:26:35 -0700, cnb wrote:

> def averageGrade(self):
> tot = 0
> for review in self.reviews:
> tot += review.grade
> return tot / len(self.reviews)
>
> def av_grade(self):
> return sum(review.grade for review in self.reviews) / \
> len(self.reviews)


Re-writing the functions so they can be tested alone:

def averageGrade(alist):
tot = 0.0
for x in alist:
tot += x
return tot/len(alist)


def av_grade(alist):
return sum(alist)/len(alist)


>>> from timeit import Timer
>>> # small amount of items

.... alist = range(100)
>>> Timer('averageGrade(alist)',

.... 'from __main__ import alist, averageGrade').repeat(number=100000)
[3.9559240341186523, 3.4910569190979004, 3.4856188297271729]
>>>
>>> Timer('av_grade(alist)',

.... 'from __main__ import alist, av_grade').repeat(number=100000)
[2.0255107879638672, 1.0968310832977295, 1.0733180046081543]


The version with sum() is much faster. How about with lots of data?

>>> alist = xrange(1000000)
>>> Timer('averageGrade(alist)',

.... 'from __main__ import alist, averageGrade').repeat(number=50)
[17.699107885360718, 18.182793140411377, 18.651514053344727]
>>>
>>> Timer('av_grade(alist)',

.... 'from __main__ import alist, av_grade').repeat(number=50)
[17.125216007232666, 15.72636890411377, 16.309713840484619]

sum() is still a little faster.



--
Steven
 
Reply With Quote
 
Gabriel Genellina
Guest
Posts: n/a
 
      08-30-2008
En Sat, 30 Aug 2008 03:15:30 -0300, Steven D'Aprano
<(E-Mail Removed)> escribi�:

> On Fri, 29 Aug 2008 21:26:35 -0700, cnb wrote:
>
>> def averageGrade(self):
>> tot = 0
>> for review in self.reviews:
>> tot += review.grade
>> return tot / len(self.reviews)
>>
>> def av_grade(self):
>> return sum(review.grade for review in self.reviews) / \
>> len(self.reviews)

>
> Re-writing the functions so they can be tested alone:
>
> def averageGrade(alist):
> tot = 0.0
> for x in alist:
> tot += x
> return tot/len(alist)
>
>
> def av_grade(alist):
> return sum(alist)/len(alist)
>
>
>>>> from timeit import Timer
>>>> # small amount of items

> ... alist = range(100)
>>>> Timer('averageGrade(alist)',

> ... 'from __main__ import alist, averageGrade').repeat(number=100000)
> [3.9559240341186523, 3.4910569190979004, 3.4856188297271729]
>>>>
>>>> Timer('av_grade(alist)',

> ... 'from __main__ import alist, av_grade').repeat(number=100000)
> [2.0255107879638672, 1.0968310832977295, 1.0733180046081543]
>
>
> The version with sum() is much faster. How about with lots of data?
>
>>>> alist = xrange(1000000)
>>>> Timer('averageGrade(alist)',

> ... 'from __main__ import alist, averageGrade').repeat(number=50)
> [17.699107885360718, 18.182793140411377, 18.651514053344727]
>>>>
>>>> Timer('av_grade(alist)',

> ... 'from __main__ import alist, av_grade').repeat(number=50)
> [17.125216007232666, 15.72636890411377, 16.309713840484619]
>
> sum() is still a little faster.


Mmm, in this last test you're measuring the long integer operations
performance (because the sum exceeds largely what can be represented in a
plain integer). Long integers are so slow that the difference between both
loops becomes negligible.

I've tried again using float values:
alist = [float(x) for x in xrange(1000000)]
and got consistent results for any input size (the version using sum() is
about twice as fast as the for loop)

--
Gabriel Genellina

 
Reply With Quote
 
cnb
Guest
Posts: n/a
 
      08-30-2008
how does doing something twice not change complexity? yes it maybe
belongs to the same complexity-class but is still twice as slow no?
 
Reply With Quote
 
Fredrik Lundh
Guest
Posts: n/a
 
      08-30-2008
cnb wrote:

> how does doing something twice not change complexity? yes it maybe
> belongs to the same complexity-class but is still twice as slow no?


doing two things quickly can still be faster than doing one thing slowly.

</F>

 
Reply With Quote
 
Diez B. Roggisch
Guest
Posts: n/a
 
      08-30-2008
cnb schrieb:
> how does doing something twice not change complexity? yes it maybe
> belongs to the same complexity-class but is still twice as slow no?


Because big O notation is not about constant factors. Or even subterms
with lower powers.

http://en.wikipedia.org/wiki/Big_O_notation

Diez
 
Reply With Quote
 
Lie
Guest
Posts: n/a
 
      08-30-2008
On Aug 30, 5:30*pm, cnb <(E-Mail Removed)> wrote:
> how does doing something twice not change complexity? yes it maybe
> belongs to the same complexity-class but is still twice as slow no?


Who is doing something twice? Definitely not sum().

sum() does not create intermediate list, and if you pass generator
expressions in it you wouldn't make any intermediate list at all, thus
simple looping but in interpreter code. sum() that is passed a list
comprehension should be faster for extremely small numbers of values
to sum, but we don't care about small things, do we? But using
intermediate list should be fast enough for even large numbers of
values, the time when it can't cope anymore would be when the
intermediate list takes half of your memory (how often is that for
regular applications?).
 
Reply With Quote
 
Fredrik Lundh
Guest
Posts: n/a
 
      08-30-2008
Lie wrote:

>> how does doing something twice not change complexity? yes it maybe
>> belongs to the same complexity-class but is still twice as slow no?

>
> Who is doing something twice? Definitely not sum().


nobody's claiming that -- but as mentioned above, even if sum() had done
two passes over the source sequence, it'd still be O(n).

</F>

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Microcontrollers: which one ? which language ? which compiler ? The Jesus of Suburbia NZ Computing 2 02-11-2006 06:53 PM
Web Stats? Which to use? Which is best? Familyman HTML 3 02-09-2006 11:05 PM
ADSL WIC support - which NM's, and which IOS versions? Kralizec Craig Cisco 5 12-08-2005 02:20 AM
which XMI version compatible to which UML version? Kenny XML 0 06-02-2004 10:20 PM
Keeping track of which user controls need to be loaded and which not John ASP .Net 0 07-08-2003 09:26 AM



Advertisments