At some point, Mike <(E-Mail Removed)> wrote:

> Here's the chunk of code that I'm spending most of my time executing:

>

> # Rational approximation for erfc(x) (Abramowitz & Stegun, Sec. 7.1.26)

> # Fifth order approximation. |error| <= 1.5e-7 for all x

> #

> def erfc( x ):

> p = 0.3275911

> a1 = 0.254829592

> a2 = -0.284496736

> a3 = 1.421413741

> a4 = -1.453152027

> a5 = 1.061405429

>

> t = 1.0 / (1.0 + p*float(x))

> erfcx = ( (a1 + (a2 + (a3 +

> (a4 + a5*t)*t)*t)*t)*t ) * math.exp(-(x**2))

> return erfcx
With python2.2, a million iterations takes 9.93 s on my machine.

With python2.3, 8.13 s.

Replacing float(x) with just x (multiplying by p (a float) is enough to

coerce x to a float), with python 2.3 I'm down to 6.98 s. Assigning

math.exp to a global variable exp cuts this down to 6.44 s:

exp = math.exp

def erfc(x):

[...]

t = 1.0 / (1.0 + p*x)

erfcx = ... * exp(-x**2)

return erfcx

You can go a little further and look up ways of optimising the

polynomial evaluation -- Horner's method is a good, general-purpose

technique, but you can get faster if you're willing to put some effort

into it. I believe Ralston and Rabonwitz's numerical analysis book has

a discussion on this (sorry, I don't have the complete reference; the

book's at home).

That's about the best you can do in pure python.

Now look at how you're calling this routine. Can you call it on a

bunch of numbers at once? Using Numeric arrays (

http://numpy.org/),

calling the above routine on an array of a million numbers takes

1.6 s. Calling it 1000 times on an array of 1000 numbers takes 0.61 s.

Using erfc from scipy.special, this takes 0.65 s on a array of a

million numbers. Scipy (from

http://scipy.org/) is a library of

scientific tools, supplementing Numeric. I believe the routine it uses

for erfc comes from the Cephes library (

http://netlib.org/), which has

a error < 3.4e-14 for x in (-13,0). Calling it 1000 times on an array

of 1000 numbers takes 0.43 s.

(Ok, I don't know why 1000 times on arrays of 1000 numbers is twice as

fast as once on an array of 1000*1000 numbers.)

Extrapolating from your numbers, your C program is 28x faster. Using

scipy.special.erfc on moderate-sized arrays is 18.9x faster. That's

pretty good

For numerical codes in Python, I *strongly* suggest that you use

Numeric (or numarray, it's successor). It's _much_ better to pass 1000

numbers around as a aggregate, instead of 1 number 1000 times.

--

|>|\/|<

/--------------------------------------------------------------------------\

|David M. Cooke

|cookedm(at)physics(dot)mcmaster(dot)ca