Velocity Reviews - Computer Hardware Reviews

speed

 
 
Peter Kleiweg
 
      08-19-2004

I implemented a lexer in Pylly and compared it to the version I
had written in Flex. Processing 219062 lines took 0.9 seconds in
C (from Flex), and 5 minutes 54 seconds in Python (from Pylly), a
ratio of 393 to 1.
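
As a quick check on the arithmetic, the figures quoted above can be worked through as follows. This is only a small sketch; the variable names are mine and not from the post.

```python
# Figures quoted in the post.
lines = 219062
c_seconds = 0.9
python_seconds = 5 * 60 + 54  # 5 minutes 54 seconds = 354 s

ratio = python_seconds / c_seconds      # slowdown factor
c_rate = lines / c_seconds              # lines per second, Flex/C
python_rate = lines / python_seconds    # lines per second, Pylly/Python

print(f"slowdown: {ratio:.0f}x")
print(f"C (Flex): {c_rate:,.0f} lines/s")
print(f"Python (Pylly): {python_rate:,.0f} lines/s")
```

The ratio comes out at about 393, matching the post, and the Python version is handling roughly 619 lines per second.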

Is this normal for Python, or does Flex produce better parsers
than Pylly? I have been looking at the code produced by Flex to
see if I could translate it to Python automatically. But it has a
lot of goto statements, and I haven't figured out how to
translate those to Python efficiently.

What are the average times used for text processing of Python
compared to C?

--
Peter Kleiweg L:NL,af,da,de,en,ia,nds,no,sv,(fr,it) S:NL,de,en,(da,ia)
info: http://www.let.rug.nl/~kleiweg/ls.html

 
John Lenton
 
      08-19-2004
On Thu, Aug 19, 2004 at 03:37:26PM +0200, Peter Kleiweg wrote:
>
> I implemented a lexer in Pylly and compared it to the version I
> had written in Flex. Processing 219062 lines took 0.9 seconds in
> C (from Flex), and 5 minutes 54 seconds in Python (from Pylly), a
> ratio of 393 to 1.
>
> Is this normal for Python, or does Flex produce better parsers
> than Pylly? I have been looking at the code produced by Flex to
> see if I could translate it to Python automatically. But it has a
> lot of goto statements, and I haven't figured out how to
> translate those to Python efficiently.


flex has an option to generate code without the gotos...

--
John Lenton ((E-Mail Removed)) -- Random fortune:
Don't read everything you believe.


 
Peter Kleiweg
 
      08-19-2004
John Lenton wrote:


> flex has an option to generate code without the gotos...


I have the latest version. I can't find it, either as a run-time
option or as a build option.



--
Peter Kleiweg L:NL,af,da,de,en,ia,nds,no,sv,(fr,it) S:NL,de,en,(da,ia)
info: http://www.let.rug.nl/~kleiweg/ls.html

 
John Lenton
 
      08-19-2004
On Thu, Aug 19, 2004 at 04:16:24PM +0200, Peter Kleiweg wrote:
> John Lenton wrote:
>
>
> > flex has an option to generate code without the gotos...

>
> I have the latest version. I can't find it, either as a run-time
> option or as a build option.


hmm! you're right... I wonder what lexer it was, then? I definitely
have a weak ref to the option in my head, but the owner has been gc'ed


--
John Lenton ((E-Mail Removed)) -- Random fortune:
There was a phone call for you.


 
Oliver Fromme
 
      08-19-2004
Peter Kleiweg <(E-Mail Removed)> wrote:
> I implemented a lexer in Pylly and compared it to the version I
> had written in Flex. Processing 219062 lines took 0.9 seconds in
> C (from Flex), and 5 minutes 54 seconds in Python (from Pylly), a
> ratio of 393 to 1.
>
> Is this normal for Python, or does Flex produce better parsers
> than Pylly? I have been looking at the code produced by Flex to
> see if I could translate it to Python automatically. But it has a
> lot of goto statements, and I haven't figured out how to
> translate those to Python efficiently.
>
> What are the average times used for text processing of Python
> compared to C?


I don't know Pylly, but I guess it generates a parser using
a finite automaton -- just like lex/flex, except that it handles
every single character in Python, whereas lex/flex generates
C code that is then compiled. That would explain the speed difference.
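
To illustrate the guess above, here is a minimal sketch of a table-driven scanner written in pure Python. It is not Pylly's actual generated code; the state names, the `TRANSITIONS` table, and the `char_class` and `scan_dfa` functions are all made up for illustration. The point is only that each input character costs a function call, a dictionary lookup and a comparison in the interpreter, while the equivalent step in compiled C is a few array lookups.

```python
# Hypothetical two-token language: words ([a-z]+) and numbers ([0-9]+).
# Anything else (whitespace, punctuation) is skipped. Illustration only.

def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if "a" <= ch <= "z":
        return "alpha"
    if "0" <= ch <= "9":
        return "digit"
    return "other"

# (state, input class) -> next state. A missing entry means "no transition".
TRANSITIONS = {
    ("start", "alpha"): "word",
    ("start", "digit"): "number",
    ("word", "alpha"): "word",
    ("number", "digit"): "number",
}

def scan_dfa(text):
    """Yield (kind, lexeme) pairs, advancing one character per loop step."""
    pos = 0
    n = len(text)
    while pos < n:
        state = "start"
        begin = pos
        # Longest match: follow transitions until none applies.
        while pos < n:
            nxt = TRANSITIONS.get((state, char_class(text[pos])))
            if nxt is None:
                break
            state = nxt
            pos += 1
        if state == "start":
            pos += 1  # skip a character that starts no token
        else:
            yield state, text[begin:pos]

print(list(scan_dfa("abc 123 x9")))
```

The inner `while` loop runs once per input character, so the interpreter overhead is paid for every character of every token.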

When I have to parse something in Python, I try to do that
using things like string.split(), string.find(), the "re"
module etc. Those things are written in C, therefore they
are fast enough for most applications. There are also some
modules for specialized cases, such as "ConfigParser" and
"shlex". See the Python Library Reference.
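
As a rough sketch of that approach, a tokenizer can be built on the "re" module by combining all token patterns into one regular expression with named groups and letting `finditer` do the scanning in C. The token set below (`number`, `word`, `space`, `other`) and the `tokenize` helper are assumptions chosen for illustration, not something taken from the original lexer.

```python
import re

# Hypothetical token set: each named group is one token kind.
# Alternatives are tried left to right; "other" catches any single
# character that nothing else matched, so no input is silently dropped.
TOKEN_RE = re.compile(r"""
    (?P<number>[0-9]+)
  | (?P<word>[A-Za-z_]+)
  | (?P<space>\s+)
  | (?P<other>.)
""", re.VERBOSE | re.DOTALL)

def tokenize(text):
    """Yield (kind, lexeme) pairs, skipping whitespace."""
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup  # name of the group that matched
        if kind != "space":
            yield kind, match.group()

print(list(tokenize("foo = 42\n")))
```

Here the Python-level loop runs once per token rather than once per character; the character-by-character matching happens inside the regular expression engine.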

Best regards
Oliver

--
Oliver Fromme, Konrad-Celtis-Str. 72, 81369 Munich, Germany

``All that we see or seem is just a dream within a dream.''
(E. A. Poe)
 
Ayose
 
      08-20-2004
Hi,

On Thu, Aug 19, 2004 at 03:37:26PM +0200, Peter Kleiweg wrote:
>
> I implemented a lexer in Pylly and compared it to the version I
> had written in Flex. Processing 219062 lines took 0.9 seconds in
> C (from Flex), and 5 minutes 54 seconds in Python (from Pylly), a
> ratio of 393 to 1.
>
> Is this normal for Python, or does Flex produce better parsers
> than Pylly? I have been looking at the code produced by Flex to
> see if I could translate it to Python automatically. But it has a
> lot of goto statements, and I haven't figured out how to
> translate those to Python efficiently.


Don't try to translate the generated code to Python. Python code is
(almost) always slower than C code, because C is compiled to machine
code, whereas Python has to be interpreted by the VM. Besides, Python
does a lot of checks at run time.

Try PLY, <http://systems.cs.uchicago.edu/ply/>. If you have
experience with flex/yacc in C, this module should be easy to use.

You can also play with Psyco (a JIT compiler for x86) or even with
Pyrex.

But, IMHO, if you have to process very big files, don't do it with
Python. Instead, write a simple C module that uses your Flex parser
and creates Python objects with that information. It should be trivial
if you have experience with the C API.

>
> What are the average times used for text processing of Python
> compared to C?
>


IMO, Python is a powerful language for almost everything, but in some
cases it is a bad fit. One of these cases is intensive computing (like
parsing a big file). Use the correct tool =)

--
Ayose Cazorla León
Debian GNU/Linux - setepo
 
Jean Brouwers
 
      08-20-2004

Another Python parser generator to look into is SimpleParse/mxTextTools

<http://simpleparse.sourceforge.net/>

We use it to parse and process large log files. In our case, a typical
grammar contains over 250 productions, and parsing a log file of 180
Klines (100 MB) takes approx 3 min. Processing the result from the
parse step requires an additional 3 min. This is on a 2.4 GHz Xeon
machine running RedHat 8.

Obviously these figures are very grammar- and application-specific. Your
mileage may vary.

/Jean Brouwers

PS) A good reference is David Mertz' book "Text Processing in Python"

<http://www.informit.com/title/0321112547>

or several articles on (t)his web page

<http://gnosis.cx/publish/tech_index_cp.html>




In article <(E-Mail Removed)>, Ayose
<(E-Mail Removed)> wrote:

> <http://systems.cs.uchicago.edu/ply/>.

 
David M. Cooke
 
      08-22-2004
At some point, Ayose <(E-Mail Removed)> wrote:
> On Thu, Aug 19, 2004 at 03:37:26PM +0200, Peter Kleiweg wrote:
>>
>> I implemented a lexer in Pylly and compared it to the version I
>> had written in Flex. Processing 219062 lines took 0.9 seconds in
>> C (from Flex), and 5 minutes 54 seconds in Python (from Pylly), a
>> ratio of 393 to 1.
>>
>> Is this normal for Python, or does Flex produce better parsers
>> than Pylly? I have been looking at the code produced by Flex to
>> see if I could translate it to Python automatically. But it has a
>> lot of goto statements, and I haven't figured out how to
>> translate those to Python efficiently.

>...
> But, IMHO, if you have to process very big files, don't do it with
> python. Instead, write a simple C-module, which uses your Flex parser
> and creates python objects with that information. It should be trivial
> if you have experience with the C API.


Or have a look at FlexModule at
http://www.cs.utexas.edu/users/mcgui...ware/fbmodule/
which makes it really simple without experience with the C API.

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
 