Velocity Reviews > Python 3.3 vs. MSDOS Basic

# Python 3.3 vs. MSDOS Basic

John Immarino
Guest
Posts: n/a

 02-18-2013
I coded a Python solution for Problem #14 on the Project Euler website. I was very surprised to find that it took 107 sec. to run even though it's a pretty simple program. I also coded an equivalent solution for the problem in the old MSDOS basic. (That's the 16 bit app of 1980s vintage.) It ran in 56 sec. Is there a flaw in my coding, or is Python really this slow in this particular application. MSDOS Basic usually runs at a snails pace compared to Python.

Below is the problem and the code:

The following iterative sequence is defined for the set of positive integers:

n → n/2 (n is even)
n → 3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:
13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1

It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.

max=0
m=0
while m<=1000000:
m+=1
count=0
n=m
while n!=1:
count+=1
if n%2==0:
n=n//2
else:
n=3*n+1
if count>max:
max=count
num=m
print(num,max)

Ian Kelly
Guest
Posts: n/a

 02-18-2013
On Mon, Feb 18, 2013 at 12:13 PM, John Immarino <(E-Mail Removed)> wrote:
> I coded a Python solution for Problem #14 on the Project Euler website. Iwas very surprised to find that it took 107 sec. to run even though it's apretty simple program. I also coded an equivalent solution for the problem in the old MSDOS basic. (That's the 16 bit app of 1980s vintage.) It ranin 56 sec. Is there a flaw in my coding, or is Python really this slow in this particular application. MSDOS Basic usually runs at a snails pace compared to Python.

Well, I don't see anything that looks especially slow in that code,
but the algorithm that you're using is not very efficient. I rewrote
it using dynamic programming (details left as an exercise), which got
the runtime down to about 4 seconds.

Chris Angelico
Guest
Posts: n/a

 02-18-2013
On Tue, Feb 19, 2013 at 6:13 AM, John Immarino <(E-Mail Removed)> wrote:
> I coded a Python solution for Problem #14 on the Project Euler website. Iwas very surprised to find that it took 107 sec. to run even though it's apretty simple program. I also coded an equivalent solution for the problem in the old MSDOS basic. (That's the 16 bit app of 1980s vintage.) It ranin 56 sec. Is there a flaw in my coding, or is Python really this slow in this particular application. MSDOS Basic usually runs at a snails pace compared to Python.

BASIC does a lot less. If you wrote an 8086 assembly language
interpreter in Python, it'd run fairly slowly too Python isn't
really the world's best language for number crunching inside a machine
word; though if this were a major project, I would recommend looking
into Cython, as it lets you translate a few critical portions of your
code to C while leaving the rest in Python.

In order to get some useful stats, I added a little timing code to
your original; on my Windows XP laptop, running Python 3.3, your
version took 212.64 seconds to get to a result (namely, 837799 with a
count of 524).

Here's how I'd code it:

import time
start=time.time()
max=0
for m in range(1,1000001):
n=m
count=0
while n>1:
if n%2: n=3*n+1
else: n//=2
count+=1
if count>max: max,num=count,m
if not m&16383: print("->",m,count)
print(num,max)
print(time.time()-start)

(You'll see the same timing information that I added to yours. It adds
immeasurably to the run-time, and gives some early idea of how it's
going.)

Running under Python 2.6, both your version and mine take about 90
seconds to run. But under Python 3.3, where (among other things)
range() yields values lazily, my version is significantly faster than
yours. BUT! Both versions, under 3.3, are significantly *slower* than
under 2.6. My first thought is that it's because Py2 has different
types for 'int' and 'long', and Py3 doesn't (effectively, everything's
a long), so I added an L suffix to every number and ran each of them
under 2.6 again. Seems that was the bulk of the difference, though not
all.

Pythonistas, does this count as a regression, or is Python
sufficiently "not a number crunching language" that we don't care?

(range = my code, as above; while = original version with a C-style
loop counter)
range py3: 171.07846403121948
while py3: 212.64104509353638
range py2: 87.859000206
while py2: 86.4059998989
range py2 longs: 190.530999899
while py2 longs: 176.125999928

For comparison purposes, I also coded up the equivalent in Pike.
Pike's a very similar language to Python, but with a C-like syntax,
and certain optimizations - including, significantly to this exercise,
an integer type that sits within a machine word if it can (though
it'll happily go arbitrary precision when it's needed to). It pretends
to the programmer that it's a Py3-style "everything's an int", but
underneath, functions more like Py2 with separate short and long
types. The result: 22.649 seconds to reach the same conclusion.

How long did your BASIC version take, and how long did the Python
version on the same hardware?

This sort of pure number crunching isn't really where a modern high
level language shines. You'll come to *really* appreciate Python as
soon as you start working with huge arrays, dictionaries, etc. This is
a job for C, really.

ChrisA

Chris Angelico
Guest
Posts: n/a

 02-18-2013
On Tue, Feb 19, 2013 at 8:55 AM, Chris Angelico <(E-Mail Removed)> wrote:
> How long did your BASIC version take, and how long did the Python
> version on the same hardware?

Which Python version didyou use?

ChrisA

Chris Angelico
Guest
Posts: n/a

 02-18-2013
On Tue, Feb 19, 2013 at 8:56 AM, Chris Angelico <(E-Mail Removed)> wrote:
> On Tue, Feb 19, 2013 at 8:55 AM, Chris Angelico <(E-Mail Removed)> wrote:
>> How long did your BASIC version take, and how long did the Python
>> version on the same hardware?

>
> Which Python version didyou use?
>
> ChrisA

Doh. I'm having a great day of not reading properly, today. (I blame
checking mail on the bus, it took me over an hour to read this one
message and I'd forgotten the subject line by the time I got to the
end.) Python 3.3, right there in the header. Disregard me!

ChrisA

Chris Angelico
Guest
Posts: n/a

 02-18-2013
On Tue, Feb 19, 2013 at 8:54 AM, Ian Kelly <(E-Mail Removed)> wrote:
> Well, I don't see anything that looks especially slow in that code,
> but the algorithm that you're using is not very efficient. I rewrote
> it using dynamic programming (details left as an exercise), which got
> the runtime down to about 4 seconds.

Did it involve a dictionary, mapping a value to its count, so that any
time you hit a value you've seen, you can short-cut it? That was my
first optimization consideration, though I didn't implement it in any
version, so as to keep the timings comparable.

ChrisA

Ian Kelly
Guest
Posts: n/a

 02-18-2013
On Mon, Feb 18, 2013 at 3:01 PM, Chris Angelico <(E-Mail Removed)> wrote:
> On Tue, Feb 19, 2013 at 8:54 AM, Ian Kelly <(E-Mail Removed)> wrote:
>> Well, I don't see anything that looks especially slow in that code,
>> but the algorithm that you're using is not very efficient. I rewrote
>> it using dynamic programming (details left as an exercise), which got
>> the runtime down to about 4 seconds.

>
> Did it involve a dictionary, mapping a value to its count, so that any
> time you hit a value you've seen, you can short-cut it? That was my
> first optimization consideration, though I didn't implement it in any
> version, so as to keep the timings comparable.

Ayup.

Alexander Blinne
Guest
Posts: n/a

 02-19-2013
Am 18.02.2013 20:13, schrieb John Immarino:
> I coded a Python solution for Problem #14 on the Project Euler website. I was very surprised to find that it took 107 sec. to run even though it's a pretty simple program. I also coded an equivalent solution for the problem in the old MSDOS basic. (That's the 16 bit app of 1980s vintage.) It ran in 56 sec. Is there a flaw in my coding, or is Python really this slow in this particular application. MSDOS Basic usually runs at a snails pace compared to Python.

> max=0
> m=0
> while m<=1000000:
> m+=1
> count=0
> n=m
> while n!=1:
> count+=1
> if n%2==0:
> n=n//2
> else:
> n=3*n+1
> if count>max:
> max=count
> num=m
> print(num,max)

I cannot compare my timings with basic but python 2.7.3 and python 3.2.3
are both equally slow hier (~50 sec).
pypy is a lot faster (only some old version 1.7.0, current versions
should be faster still) with about 5 sec.

The following C-Program:

#include <stdio.h>

int main(void) {

int max = 0;
int m = 0;
long int n;
int count;
int num;

while(m<=1000000) {
m++;
n = m;
count = 0;

while(n != 1) {
count++;
if(n % 2 == 0) {
n = n / 2;
}
else {
n = n*3 + 1;
}
}

if(count > max) {
max = count;
num = m;
}
}

printf("%d, %d\n", num, max);
}

Does the job in just under 1 sec.

Greetings
Alexander

Dennis Lee Bieber
Guest
Posts: n/a

 02-19-2013
On Mon, 18 Feb 2013 11:13:04 -0800 (PST), John Immarino
<(E-Mail Removed)> declaimed the following in gmane.comp.python.general:

> I coded a Python solution for Problem #14 on the Project Euler website. I was very surprised to find that it took 107 sec. to run even though it's a pretty simple program. I also coded an equivalent solution for the problem in the old MSDOS basic. (That's the 16 bit app of 1980s vintage.) It ran in 56 sec. Is there a flaw in my coding, or is Python really this slow in this particular application. MSDOS Basic usually runs at a snails pace compared to Python.
>

<snip>
> max=0

"max" is a bad name -- it masks the built-in max() function

> m=0
> while m<=1000000:
> m+=1

Since "m" is only modified here and has a value of 1 for the first
pass through, you can replace those three lines with

for m in xrange(1, 1000001): #python 2.x, just use range() for 3.x

> count=0
> n=m

> while n!=1:
> count+=1
> if n%2==0:
> n=n//2
> else:
> n=3*n+1

Avoid the comparison to 0 by reversing the then/else actions... Any
0 result is false.

-=-=-=-=-
import time

mx = 0

start = time.time()
for m in xrange(1, 1000001):
count = 0
n = m
while n > 1:
count += 1
if n % 2: # 0 means false
n = 3 * n + 1
else:
n = n // 2

if count > mx:
mx, num = count, m

end = time.time()

print num, mx
print end-start
-=-=-=-=-
Microsoft Windows XP [Version 5.1.2600]

E:\UserData\Wulfraed\My Documents>cd "Python Progs"

E:\UserData\Wulfraed\My Documents\Python Progs>Script1.py
837799 524
83.2030000687

E:\UserData\Wulfraed\My Documents\Python Progs>

--
Wulfraed Dennis Lee Bieber AF6VN
http://www.velocityreviews.com/forums/(E-Mail Removed) HTTP://wlfraed.home.netcom.com/

Terry Reedy
Guest
Posts: n/a

 02-19-2013
On 2/18/2013 2:13 PM, John Immarino wrote:
> I coded a Python solution for Problem #14 on the Project Euler
> website. I was very surprised to find that it took 107 sec. to run
> even though it's a pretty simple program. I also coded an equivalent
> solution for the problem in the old MSDOS basic. (That's the 16 bit
> app of 1980s vintage.) It ran in 56 sec. Is there a flaw in my
> coding, or is Python really this slow in this particular application.
> MSDOS Basic usually runs at a snails pace compared to Python.

I find this surprising too. I am also surprised that it even works,
given that the highest intermediate value is about 57 billion and I do
not remember that Basic had infinite precision ints.

> The following iterative sequence is defined for the set of positive
> integers:
>
> n → n/2 (n is even) n → 3n + 1 (n is odd)

Note that if n is odd, 3n + 1 is even (and not 1!), so one may take two
steps with (3n + 1)/2.

> Using the rule above and starting with 13, we generate the following
> sequence: 13 → 40 → 20 → 10 → 5 →16 → 8 → 4 → 2 → 1
>
> It can be seen that this sequence (starting at 13 and finishing at 1)
> contains 10 terms. Although it has not been proved yet (Collatz
> Problem), it is thought that all starting numbers finish at 1.

https://en.wikipedia.org/wiki/Collatz_conjecture

> Which starting number, under one million, produces the longest
> chain?

I suppose 'print(837799)' would not count as a proper solution.

> NOTE: Once the chain starts the terms are allowed to go above one
> million.

Here is my slightly revised code with timings on a good, 20 month old
win 7 machine.

from time import time
start = time()

num, max = 0, 0
for m in range(1, 1000001):
n = m
count=0
while n !=1:
if n & 1: #n % 2:
n = (3*n + 1) // 2
count += 2
else:
n = n//2
count += 1
if count > max:
num = m
max = count

print(num, max , time()-start)

# original: 837799, 524 steps, 53.9 secs
# for ... range: 52.3
# reverse inner if 49.0
# double step 39.1
# n & 1 instead of n % 2 for test: 36.0, 36.0, 35.9
# n>>1 instead of n//2: 34.7, 36.1, 36.2;
# this may be internally optimized, so skip

I do not see any fluff left to remove, unless one takes the major step
of saving already calculated values in an array.

Since the highest intermediate value of n is 56991483520 (445245965
*2**7, from adding "if n > maxn: maxn = n" to the odd branch, before
dividing by 2), the array would have to be limited to a much lower
value, say a few million.

--
Terry Jan Reedy