Velocity Reviews > Re: Python speed vs csharp

# Re: Python speed vs csharp

David M. Cooke
Guest
Posts: n/a

 07-31-2003
At some point, Mike <(E-Mail Removed)> wrote:
> Here's the chunk of code that I'm spending most of my time executing:
>
> # Rational approximation for erfc(x) (Abramowitz & Stegun, Sec. 7.1.26)
> # Fifth order approximation. |error| <= 1.5e-7 for all x
> #
> def erfc( x ):
> p = 0.3275911
> a1 = 0.254829592
> a2 = -0.284496736
> a3 = 1.421413741
> a4 = -1.453152027
> a5 = 1.061405429
>
> t = 1.0 / (1.0 + p*float(x))
> erfcx = ( (a1 + (a2 + (a3 +
> (a4 + a5*t)*t)*t)*t)*t ) * math.exp(-(x**2))
> return erfcx

With python2.2, a million iterations takes 9.93 s on my machine.
With python2.3, 8.13 s.

Replacing float(x) with just x (multiplying by p (a float) is enough to
coerce x to a float), with python 2.3 I'm down to 6.98 s. Assigning
math.exp to a global variable exp cuts this down to 6.44 s:

exp = math.exp
def erfc(x):
[...]
t = 1.0 / (1.0 + p*x)
erfcx = ... * exp(-x**2)
return erfcx

You can go a little further and look up ways of optimising the
polynomial evaluation -- Horner's method is a good, general-purpose
technique, but you can get faster if you're willing to put some effort
into it. I believe Ralston and Rabonwitz's numerical analysis book has
a discussion on this (sorry, I don't have the complete reference; the
book's at home).

That's about the best you can do in pure python.

Now look at how you're calling this routine. Can you call it on a
bunch of numbers at once? Using Numeric arrays (http://numpy.org/),
calling the above routine on an array of a million numbers takes
1.6 s. Calling it 1000 times on an array of 1000 numbers takes 0.61 s.

Using erfc from scipy.special, this takes 0.65 s on a array of a
million numbers. Scipy (from http://scipy.org/) is a library of
scientific tools, supplementing Numeric. I believe the routine it uses
for erfc comes from the Cephes library (http://netlib.org/), which has
a error < 3.4e-14 for x in (-13,0). Calling it 1000 times on an array
of 1000 numbers takes 0.43 s.

(Ok, I don't know why 1000 times on arrays of 1000 numbers is twice as
fast as once on an array of 1000*1000 numbers.)

Extrapolating from your numbers, your C program is 28x faster. Using
scipy.special.erfc on moderate-sized arrays is 18.9x faster. That's
pretty good

For numerical codes in Python, I *strongly* suggest that you use
Numeric (or numarray, it's successor). It's _much_ better to pass 1000
numbers around as a aggregate, instead of 1 number 1000 times.

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca

Terry Reedy
Guest
Posts: n/a

 08-01-2003

"David M. Cooke" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed).. .
> At some point, Mike <(E-Mail Removed)> wrote:
> > Here's the chunk of code that I'm spending most of my time

executing:
> >
> > # Rational approximation for erfc(x) (Abramowitz & Stegun, Sec.

7.1.26)
> > # Fifth order approximation. |error| <= 1.5e-7 for all x
> > #
> > def erfc( x ):
> > p = 0.3275911
> > a1 = 0.254829592
> > a2 = -0.284496736
> > a3 = 1.421413741
> > a4 = -1.453152027
> > a5 = 1.061405429
> >
> > t = 1.0 / (1.0 + p*float(x))
> > erfcx = ( (a1 + (a2 + (a3 +
> > (a4 + a5*t)*t)*t)*t)*t ) * math.exp(-(x**2))
> > return erfcx

Does inlining the constants give any speedup? or is local lookup so
fast that it does not matter?

TJR

David M. Cooke
Guest
Posts: n/a

 08-01-2003
At some point, "Terry Reedy" <(E-Mail Removed)> wrote:

> "David M. Cooke" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed).. .
>> At some point, Mike <(E-Mail Removed)> wrote:
>> > Here's the chunk of code that I'm spending most of my time

> executing:
>> >
>> > # Rational approximation for erfc(x) (Abramowitz & Stegun, Sec.

> 7.1.26)
>> > # Fifth order approximation. |error| <= 1.5e-7 for all x
>> > #
>> > def erfc( x ):
>> > p = 0.3275911
>> > a1 = 0.254829592
>> > a2 = -0.284496736
>> > a3 = 1.421413741
>> > a4 = -1.453152027
>> > a5 = 1.061405429
>> >
>> > t = 1.0 / (1.0 + p*float(x))
>> > erfcx = ( (a1 + (a2 + (a3 +
>> > (a4 + a5*t)*t)*t)*t)*t ) * math.exp(-(x**2))
>> > return erfcx

>
> Does inlining the constants give any speedup? or is local lookup so
> fast that it does not matter?

Should've thought of that too . It runs in 6.07 s now (my previous
version was 6.45 s, and the unmodified version above was 8.23 s).

Here's my best version:

exp = math.exp
def erfc( x ):
t = 1.0 + 0.3275911*x
return ( (0.254829592 + (
(1.421413741 + (-1.453152027 + 1.061405429/t)/t)/t - 0.284496736)
/t)/t ) * exp(-(x**2))

The reversal of the order of the a2 evaluation is because looking at
the bytecodes (using the dis module) it turns out that -0.284 was
being stored as 0.284, then negated. However, -1.45 is stored as that.
This runs in 5.77 s. You could speed that up a little if you're
willing to manipulate the bytecode directly

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca

Brendan Hahn
Guest
Posts: n/a

 08-01-2003
(E-Mail Removed) (David M. Cooke) wrote:
>(Ok, I don't know why 1000 times on arrays of 1000 numbers is twice as
>fast as once on an array of 1000*1000 numbers.)

Probably cache effects.

--
brendan DOT hahn AT hp DOT com

Andrew Dalke
Guest
Posts: n/a

 08-01-2003
David M. Cooke:
> Here's my best version:
>
> exp = math.exp
> def erfc( x ):

Change that to

def erfc(x, exp = math.exp)

so there's a local namespace lookup instead of a global one. This
is considered a hack.

Andrew
http://www.velocityreviews.com/forums/(E-Mail Removed)

 Thread Tools

 Posting Rules You may not post new threads You may not post replies You may not post attachments You may not edit your posts BB code is On Smilies are On [IMG] code is On HTML code is OffTrackbacks are On Pingbacks are On Refbacks are Off Forum Rules

 Similar Threads Thread Thread Starter Forum Replies Last Post Greg Brunet Python 13 08-04-2003 07:06 PM Mike Python 2 08-03-2003 03:09 PM Bengt Richter Python 1 08-02-2003 02:08 AM Richie Hindle Python 3 08-01-2003 12:20 PM Martin v. =?iso-8859-15?q?L=F6wis?= Python 1 07-31-2003 10:37 PM

Advertisments