I finally came to the conclusion that the exceedingly good performance
of Psyco was due to the fact that the function was called a million
times with the *same* argument. Evidently Psyco is smart enough to
notice that. Changing the argument at each call
(erfc(0.456) -> erfc(i/1000000.0)) slows Python+Psyco down to about
1/4 of C speed. Psyco improves Python performance by roughly an order
of magnitude, but that is still not enough.

I was too optimistic!
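
The effect described above (a constant argument flattering the timing) can be checked with a small harness that runs the same loop once with a fixed argument and once with a changing one. This is only a sketch: `sum_constant`, `sum_varying` and `timed` are names made up for illustration, and the code targets a current Python 3 (`time.perf_counter`, `range`) rather than Python 2.3. Without a specializing compiler such as Psyco the two timings may well come out nearly equal.

```python
import math
import time


def erfc(x):
    # Same polynomial approximation as in the erf.py listing of this post.
    p = 0.3275911
    a1 = 0.254829592
    a2 = -0.284496736
    a3 = 1.421413741
    a4 = -1.453152027
    a5 = 1.061405429
    t = 1.0 / (1.0 + p * x)
    return ((a1 + (a2 + (a3 + (a4 + a5 * t) * t) * t) * t) * t) * math.exp(-x * x)


def sum_constant(n):
    # Hypothetical helper: every call sees the same argument.
    total = 0.0
    for _ in range(n):
        total += erfc(0.456)
    return total


def sum_varying(n):
    # Hypothetical helper: the argument changes on every call.
    total = 0.0
    for i in range(n):
        total += erfc(i / float(n))
    return total


def timed(func, n):
    # Return (elapsed seconds, result) for one run of func(n).
    start = time.perf_counter()
    result = func(n)
    return time.perf_counter() - start, result


if __name__ == "__main__":
    n = 1000000
    for func in (sum_constant, sum_varying):
        elapsed, result = timed(func, n)
        # Printing the sum keeps the result in use, so the work is not pointless.
        print("%-12s %8.3f s  (sum = %.6f)" % (func.__name__, elapsed, result))
```

Comparing the two lines of output under each interpreter shows whether a good result depends on the argument staying fixed.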

Here are my numbers for Python 2.3, Psyco 1.0, Red Hat Linux 7.3,
Pentium II 366 MHz:

$ time p23 erf.py
real    0m3.245s
user    0m3.164s
sys     0m0.037s

This is more than four times slower than optimized C:

$ gcc erf.c -lm -O3
$ time ./a.out
real    0m0.742s
user    0m0.725s
sys     0m0.002s

Here is the situation for pure Python:

$ time p23 erf.jy
real    0m27.470s
user    0m27.162s
sys     0m0.023s

and, just for fun, here is Jython performance:

$ time jython erf.jy
real    0m44.395s
user    0m42.602s
sys     0m0.389s
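
To make the comparison explicit, the ratios can be worked out from the "real" times quoted above. The snippet below is just that arithmetic; the dictionary keys and the `ratio` helper are labels chosen here for illustration. It gives Python+Psyco at about 4.4 times the C time and pure Python at about 8.5 times the Psyco time.

```python
# Wall-clock ("real") times in seconds, copied from the runs quoted above.
real = {
    "C (gcc -O3)": 0.742,
    "Python+Psyco": 3.245,
    "pure Python": 27.470,
    "Jython": 44.395,
}


def ratio(slow, fast):
    # How many times longer `slow` took than `fast`.
    return real[slow] / real[fast]


if __name__ == "__main__":
    # List every run relative to the C baseline, fastest first.
    for name, seconds in sorted(real.items(), key=lambda kv: kv[1]):
        print("%-14s %7.3f s  %5.1f x C" % (name, seconds, ratio(name, "C (gcc -O3)")))
    print("Psyco speedup over pure Python: %.1f x" % ratio("pure Python", "Python+Psyco"))
```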

----------------------------------------------------------------------

$ cat erf.py

import math
import psyco
psyco.full()

def erfc(x):
    exp = math.exp

    p  =  0.3275911
    a1 =  0.254829592
    a2 = -0.284496736
    a3 =  1.421413741
    a4 = -1.453152027
    a5 =  1.061405429

    t = 1.0 / (1.0 + p*x)
    erfcx = ( (a1 + (a2 + (a3 +
              (a4 + a5*t)*t)*t)*t)*t ) * exp(-x*x)
    return erfcx

def main():
    erg = 0.0

    for i in xrange(1000000):
        erg += erfc(i/1000000.0)

if __name__ == '__main__':
    main()
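
Since the benchmark never prints its result, it may be worth checking separately that the polynomial really approximates erfc. The constants appear to match the Abramowitz and Stegun formula 7.1.26 (an identification made here, not stated in the post), whose quoted absolute error is about 1.5e-7. A minimal check against `math.erfc`, which exists only in much later Pythons (2.7 and 3.2 onward), could look like this; `erfc_approx` and `max_abs_error` are names invented for the sketch.

```python
import math


def erfc_approx(x):
    # Copy of erfc() from the erf.py listing, renamed to avoid confusion
    # with the library function math.erfc.
    p = 0.3275911
    a1 = 0.254829592
    a2 = -0.284496736
    a3 = 1.421413741
    a4 = -1.453152027
    a5 = 1.061405429
    t = 1.0 / (1.0 + p * x)
    return ((a1 + (a2 + (a3 + (a4 + a5 * t) * t) * t) * t) * t) * math.exp(-x * x)


def max_abs_error(n=1000):
    # Largest deviation from math.erfc over the n sample points i/n in [0, 1),
    # the same range the benchmark loop covers.
    return max(abs(erfc_approx(i / float(n)) - math.erfc(i / float(n)))
               for i in range(n))


if __name__ == "__main__":
    print("max abs error on [0, 1): %.2e" % max_abs_error())
```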

--------------------------------------------------------------------------

# python/jython version = same without "import psyco; psyco.full()"
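
Instead of keeping two copies of the file, one common idiom is to make the Psyco import optional, so the same script runs under CPython with Psyco, plain CPython, and Jython. The following is a sketch of that idiom, not the script that produced the timings above; the `HAVE_PSYCO` flag, the `n` parameter and the returned sum are additions for illustration.

```python
import math

try:
    import psyco            # present only where Psyco is installed
    psyco.full()
    HAVE_PSYCO = True
except ImportError:
    HAVE_PSYCO = False      # plain Python or Jython: run uncompiled

try:
    xrange                  # Python 2 / Jython
except NameError:
    xrange = range          # Python 3 has no xrange


def erfc(x):
    p = 0.3275911
    a1 = 0.254829592
    a2 = -0.284496736
    a3 = 1.421413741
    a4 = -1.453152027
    a5 = 1.061405429
    t = 1.0 / (1.0 + p * x)
    return ((a1 + (a2 + (a3 + (a4 + a5 * t) * t) * t) * t) * t) * math.exp(-x * x)


def main(n=1000000):
    # Same loop as the benchmark; returning the sum is an addition
    # so that the result can be inspected.
    erg = 0.0
    for i in xrange(n):
        erg += erfc(i / float(n))
    return erg


if __name__ == '__main__':
    print("psyco active: %s, sum = %.6f" % (HAVE_PSYCO, main()))
```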

--------------------------------------------------------------------------

$ cat erf.c

#include <stdio.h>
#include <math.h>

double erfc( double x )
{
    double p, a1, a2, a3, a4, a5;
    double t, erfcx;

    p  =  0.3275911;
    a1 =  0.254829592;
    a2 = -0.284496736;
    a3 =  1.421413741;
    a4 = -1.453152027;
    a5 =  1.061405429;

    t = 1.0 / (1.0 + p*x);
    erfcx = ( (a1 + (a2 + (a3 +
              (a4 + a5*t)*t)*t)*t)*t ) * exp(-x*x);

    return erfcx;
}

int main()
{
    double erg=0.0;
    int i;

    for(i=0; i<1000000; i++)
    {
        erg = erg + erfc(i/1000000.0);
    }

    return 0;
}

