Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > Re: Why is my code faster with append() in a loop than with a largelist?

Reply
Thread Tools

Re: Why is my code faster with append() in a loop than with a largelist?

 
 
Dave Angel
Guest
Posts: n/a
 
      07-06-2009
Xavier Ho wrote:
> (Here's a short version of the long version below if you don't want to
> read
>
> Why is version B of the code faster than version A? (Only three lines
> different)
>
> Version A: http://pastebin.com/f14561243
> Version B: http://pastebin.com/f1f657afc
>
> ------------------------------------------------
>
> I was doing the problems on Project Euler for practice with Python last
> night. Problem 12 was to find the value of the first triangular number that
> has over 500 divisors.
> ================================================== =======================================
>
> The sequence of triangle numbers is generated by adding the natural numbers.
> So the 7[image: ^(]th[image: )] triangle number would be 1 + 2 + 3 + 4 + 5 +
> 6 + 7 = 28. The first ten terms would be:
>
> 1, 3, 6, 10, 15, 21, 28, 36, 45, 55, ...
>
> Let us list the factors of the first seven triangle numbers:
>
> * 1*: 1
> * 3*: 1,3
> * 6*: 1,2,3,6
> *10*: 1,2,5,10
> *15*: 1,3,5,15
> *21*: 1,3,7,21
> *28*: 1,2,4,7,14,28
>
> We can see that 28 is the first triangle number to have over five divisors.
>
> What is the value of the first triangle number to have over five hundred
> divisors?
> ================================================== =======================================
>
> My initial code was to loop through from 1 to half the number and see which
> were divisors, and as I find them I store them in a list. That would have
> taken days.
>
> My second try was factorising the number each time, and count the divisors
> using the powers of each factor, plus 1, and multiply together.
> The code is here (Version A): http://pastebin.com/f14561243
>
> This worked, but it took overnight to compute. Before I went to bed a friend
> of mine caught me online, and apparently left me a working version under 8
> seconds with only 3 line difference.
> The code is here (Version B): http://pastebin.com/f1f657afc
>
> That was amazing. But I have no idea why his edit makes it so much faster. I
> did a test to see whether if append() was faster (which I doubted) than
> defining a list with a large size to begin with, and I was right:
> http://pastebin.com/f4b46d0db
> Which shows that appending is 40x slower, and was expected. But I still
> can't puzzle out why his use of appending in Version B was so much faster
> than mine.
>
> Any insights would be welcome. I'm going on a family trip, though, so my
> replies may delay.
>
> Best regards,
>
> Ching-Yun "Xavier" Ho, Technical Artist
>
> Contact Information
> Mobile: (+61) 04 3335 4748
> Skype ID: SpaXe85
> Email: http://www.velocityreviews.com/forums/(E-Mail Removed)
> Website: http://xavierho.com/
>
>

Just by inspection, it would seem the bottleneck in your first version
is that you return a huge list of nearly all zeroes, from factorize().
This slows down countDivisors() a great deal.

It would probably save some time to not bother storing the zeroes in the
list at all. And it should help if you were to step through a list of
primes, rather than trying every possible int. Or at least constrain
yourself to odd numbers (after the initial case of 2).

DaveA
 
Reply With Quote
 
 
 
 
Piet van Oostrum
Guest
Posts: n/a
 
      07-06-2009
>>>>> Dave Angel <(E-Mail Removed)> (DA) wrote:

>DA> It would probably save some time to not bother storing the zeroes in the
>DA> list at all. And it should help if you were to step through a list of
>DA> primes, rather than trying every possible int. Or at least constrain
>DA> yourself to odd numbers (after the initial case of 2).


The first and the last save a constant factor (slightly less than 2):

def factorise(num):
"""Returns a list of prime factor powers. For example:
factorise(6) will return
[2, 2] (the powers are returned one higher than the actual value)
as in, 2^1 * 3^1 = 6."""
powers = []
factor = 2
while num > 1:
power = 0
while num % factor == 0:
power += 1
num /= factor
if power > 0:
powers.append(power+1)
factor += 1
return powers

....
return reduce(mul, powers)

or to skip the odd factors:

def factorise(num):
"""Returns a list of prime factor powers. For example:
factorise(6) will return
[2, 2] (the powers are returned one higher than the actual value)
as in, 2^1 * 3^1 = 6."""
powers = []
factor = 2
while num > 1:
power = 0
while num % factor == 0:
power += 1
num /= factor
if power > 0:
powers.append(power+1)
factor = 3 if factor == 2 else factor + 2
return powers

This can be slightly optimised by taking factor 2 out of the loop.

def factorise(num):
"""Returns a list of prime factor powers. For example:
factorise(6) will return
[2, 2] (the powers are returned one higher than the actual value)
as in, 2^1 * 3^1 = 6."""
powers = []
power = 0
while num % 2 == 0:
power += 1
num /= 2
if power > 0:
powers.append(power+1)
factor = 3
while num > 1:
power = 0
while num % factor == 0:
power += 1
num /= factor
if power > 0:
powers.append(power+1)
factor += 2
return powers

To restrict the search to primes you would have to use a
sieve of Eratosthenes or something similar.
My first attempt (with a sieve from
http://code.activestate.com/recipes/117119/) only gave a speed decrease!!
But this had the sieve recreated for every triangle number. A global
sieve that is reused at each triangle number is better. But the speed
increase relative to the odd factors only is not dramatical.


# Based upon http://code.activestate.com/recipes/117119/

D = {9: 6} # contains composite numbers
Dlist = [2, 3] # list of already generated primes

def sieve():
'''generator that yields all prime numbers'''
global D
global Dlist
for p in Dlist:
yield p
q = Dlist[-1]+2
while True:
if q in D:
p = D[q]
x = q + p
while x in D: x += p
D[x] = p
else:
Dlist.append(q)
yield q
D[q*q] = 2*q
q += 2

def factorise(num):
"""Returns a list of prime factor powers. For example:
factorise(6) will return
[2, 2] (the powers are returned one higher than the actual value)
as in, 2^1 * 3^1 = 6."""
powers = []
power = 0
for factor in sieve():
power = 0
while num % factor == 0:
power += 1
num /= factor
if power > 0:
# if you really want the factors then append((factor, power))
powers.append(power+1)
if num == 1:
break
return powers

--
Piet van Oostrum <(E-Mail Removed)>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: (E-Mail Removed)
 
Reply With Quote
 
 
 
 
Piet van Oostrum
Guest
Posts: n/a
 
      07-06-2009
Sorry, there was an error in the sieve in my last example. Here is a
corrected version:

D = {9: 6} # contains composite numbers
Dlist = [2, 3] # list of already generated primes
def sieve():
'''generator that yields all prime numbers'''
global D
global Dlist
for q in Dlist:
yield q
while True:
q += 2
p = D.pop(q, 0)
if p:
x = q + p
while x in D: x += p
D[x] = p
else:
Dlist.append(q)
D[q*q] = 2*q
yield q

--
Piet van Oostrum <(E-Mail Removed)>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: (E-Mail Removed)
 
Reply With Quote
 
Dave Angel
Guest
Posts: n/a
 
      07-06-2009
Scott David Daniels wrote:
> <div class="moz-text-flowed" style="font-family: -moz-fixed">Piet van
> Oostrum wrote:
>>>>>>> Dave Angel <(E-Mail Removed)> (DA) wrote:

>>
>>> DA> It would probably save some time to not bother storing the
>>> zeroes in the
>>> DA> list at all. And it should help if you were to step through a
>>> list of
>>> DA> primes, rather than trying every possible int. Or at least
>>> constrain
>>> DA> yourself to odd numbers (after the initial case of 2).

>>
>> ...
>> # Based upon http://code.activestate.com/recipes/117119/
>>
>> D = {9: 6} # contains composite numbers

> XXX Dlist = [2, 3] # list of already generated primes
> Elist = [(2, 4), (3, 9)] # list of primes and their squares
>>

> XXX def sieve():
> XXX '''generator that yields all prime numbers'''
> XXX global D
> XXX global Dlist
> def sieve2():
> '''generator that yields all primes and their squares'''
> # No need for global declarations, we alter, not replace
> XXX for p in Dlist:
> XXX yield p
> XXX q = Dlist[-1]+2
>
> for pair in Elist:
> yield pair
> q = pair[0] + 2
>
>> while True:
>> if q in D:
>> p = D[q]
>> x = q + p
>> while x in D: x += p
>> D[x] = p
>> else:

> XXX Dlist.append(q)
> XXX yield q
> XXX D[q*q] = 2*q
> square = q * q
> pair = q, square
> Elist.append(pair)
> yield pair
> D[square] = 2 * q
>> q += 2
>>
>> def factorise(num):
>> """Returns a list of prime factor powers. For example:
>> factorise(6) will return
>> [2, 2] (the powers are returned one higher than the actual
>> value)
>> as in, 2^1 * 3^1 = 6."""
>> powers = []
>> power = 0

> XXX for factor in sieve():
> for factor, limit in sieve2():
>> power = 0
>> while num % factor == 0:
>> power += 1
>> num /= factor

> XXX if power > 0:
> if power: # good enough here, and faster
>> # if you really want the factors then append((factor,
>> power))
>> powers.append(power+1)

> XXX if num == 1:
> XXX break
> XXX return powers
> if num < limit:
> if num > 1:
> # if you really want the factors then append((num, 1))
> powers.append(2)
> return powers
>
> OK, that's a straightforward speedup, _but_:
> factorize(6) == [2, 2] == factorize(10) == factorize(15)
> So I am not sure exactly what you are calculating.
>
>
> --Scott David Daniels
> (E-Mail Removed)
>
> </div>
>

The OP only needed the number of factors, not the actual factors. So
the zeroes in the list are unneeded. 6, 10, and 15 each have 4 factors.


 
Reply With Quote
 
Piet van Oostrum
Guest
Posts: n/a
 
      07-06-2009
>>>>> Scott David Daniels <(E-Mail Removed)> (SDD) wrote:

>SDD> # No need for global declarations, we alter, not replace


Yes, I know, but I find it neater to do the global declarations, if only
for documentation. And they don't affect the run time, only the compile
time.
--
Piet van Oostrum <(E-Mail Removed)>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: (E-Mail Removed)
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Memset is faster than simple loop? AndersWang@gmail.com C Programming 24 03-12-2013 09:42 PM
Triple nested loop python (While loop insde of for loop inside ofwhile loop) Isaac Won Python 9 03-04-2013 10:08 AM
findcontrol("PlaceHolderPrice") why why why why why why why why why why why Mr. SweatyFinger ASP .Net 2 12-02-2006 03:46 PM
method calls faster than a loop? tom fredriksen Java 42 03-20-2006 08:06 PM
Why .NET is faster than Java Cindi Jenkins Java 1 12-06-2004 03:28 AM



Advertisments