Perl / python regex / performance comparison

 
 
Ivan
 
      03-03-2009
Hello everyone,

I know this is not a direct python question, forgive me for that, but
maybe some of you will still be able to help me. I've been told that
for my application it would be best to learn a scripting language, so
I looked around and found perl and python to be the nicest. Their syntax
and "way" are not similar, though.
So, I was wondering, could any of you please elaborate on the
following, as to ease my dilemma:

1. Although it is all relatively similar, there are differences
between regexes of these two. Which do you believe is the more
powerful variant (maybe an example) ?

2. They are both interpreted languages, and I can't really be sure how
they measure in speed. In your opinion, for handling large files,
which is better ?
(I'm processing files of numerical data of several hundred mb - let's
say 200mb - how would python handle file of such size ? As compared to
perl ?)

3. This last one is somewhat subjective, but what do you think, in the
future, which will be more useful. Which, in your (humble) opinion
"has a future" ?

Thank you for all the info you can spare; I am especially grateful for
the time you take in doing so.
-- Ivan
 
 
 
 
 
Ciprian Dorin, Craciun
 
      03-03-2009


I can answer your second question (whether Python will handle large
files). In my case I use Python to create statistics from some trace
files from a genetic algorithm, and my current size is up to 20MB for
about 40 files. I do the following:
* use regular expressions to identify each line type and extract the
information (as numbers);
* either create statistics on the fly, or load the dumped data
into an SQLite3 database (which has grown to a couple of hundred MB);
* everything has worked fine so far.
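
A minimal sketch of the steps above, not the poster's actual code. The trace-line formats (`GEN ...`, `MUT ...`), the table names, and the function name `load_trace` are all assumptions made for illustration; the thread does not show the real trace files.

```python
import re
import sqlite3

# Hypothetical line formats: the real trace files are not shown in the thread.
GENERATION_RE = re.compile(r"^GEN\s+(\d+)\s+best=([-+]?\d+(?:\.\d+)?)$")
MUTATION_RE = re.compile(r"^MUT\s+(\d+)\s+rate=([-+]?\d+(?:\.\d+)?)$")


def load_trace(lines, conn):
    """Classify each line by regex and store its numbers in SQLite.

    Returns the number of lines that matched no known line type.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS generation (gen INTEGER, best REAL)")
    conn.execute("CREATE TABLE IF NOT EXISTS mutation (gen INTEGER, rate REAL)")
    skipped = 0
    for line in lines:
        line = line.strip()
        m = GENERATION_RE.match(line)
        if m:
            conn.execute(
                "INSERT INTO generation VALUES (?, ?)",
                (int(m.group(1)), float(m.group(2))),
            )
            continue
        m = MUTATION_RE.match(line)
        if m:
            conn.execute(
                "INSERT INTO mutation VALUES (?, ?)",
                (int(m.group(1)), float(m.group(2))),
            )
            continue
        skipped += 1  # unknown line type: count it instead of failing
    conn.commit()
    return skipped
```

Because `lines` can be an open file object, the file is read one line at a time and only the database grows with the input.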

I've also used Python (more precisely, an application built in Python
with cElementTree?) that took the Wikipedia XML dumps (7GB? I'm not
sure, but a couple of GB) and created a custom-format file, from
which I tried to create SQL inserts... And everything worked well.
(Of course, it took some time to do all the processing.)
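
For illustration only, one common way to read a multi-gigabyte XML file without holding it all in memory is incremental parsing with `iterparse` from the standard library's `xml.etree.ElementTree` (the module that `cElementTree` accelerated). The element names `page` and `title` and the function name `iter_titles` are assumptions; real Wikipedia dumps also use an XML namespace, which this sketch ignores.

```python
import io
import xml.etree.ElementTree as ET


def iter_titles(source):
    """Yield the <title> text of each <page>, releasing page content as we go.

    `source` is a filename or a binary file object. Tag names are
    hypothetical and assume a document without XML namespaces.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "page":
            title = elem.findtext("title")
            if title is not None:
                yield title
            # Drop the page's children and text; the emptied element
            # itself stays attached to its parent.
            elem.clear()
```

A caller would loop over `iter_titles(...)` and write each record to the output file as it arrives, rather than collecting everything in a list first.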

So my conclusion is that if you keep your in-memory data
small and use the right solution for the problem, you can
use Python without (big) overhead.
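
As a rough sketch of "keeping in-memory data small" for a file of numbers like the one in the question, the values can be folded into running totals while the file is read line by line. The whitespace-separated format and the function name `running_stats` are assumptions; the thread does not describe the actual file layout.

```python
def running_stats(lines):
    """Return (count, mean, minimum, maximum) over all numbers in `lines`.

    Assumes whitespace-separated numeric tokens on each line. Only a few
    scalars are kept in memory, however large the input is.
    """
    count = 0
    total = 0.0
    lowest = None
    highest = None
    for line in lines:
        for token in line.split():
            value = float(token)
            count += 1
            total += value
            if lowest is None or value < lowest:
                lowest = value
            if highest is None or value > highest:
                highest = value
    mean = total / count if count else None
    return count, mean, lowest, highest


# Typical use with a (hypothetical) file name; iterating over the file
# object reads one line at a time:
#
#     with open("data.txt") as handle:
#         print(running_stats(handle))
```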

As another side note, I've also used Python (with NumPy) to implement
neural networks (in fact clustering with ART), where I had about 20
thousand training elements (arrays of thousands of elements), and it
worked remarkably well (I would say better than in Java, and comparable
with C/C++).

I hope I've helped you,
Ciprian Craciun.

P.S. If you just need a regular expression substitution, or you
need regular expression searching, then just use sed or grep, as
you would not get anything better than them.
 
 
 
 
 
Terry Reedy
 
      03-03-2009
Ivan wrote:
> Hello everyone,
>
> I know this is not a direct python question, forgive me for that, but
> maybe some of you will still be able to help me. I've been told that
> for my application it would be best to learn a scripting language, so
> I looked around and found perl and python to be the nicest. Their syntax
> and "way" are not similar, though.
> So, I was wondering, could any of you please elaborate on the
> following, as to ease my dilemma:


Which way are *you* more comfortable with? There are people who
regularly use both, and many who do not.

>
> 1. Although it is all relatively similar, there are differences
> between regexes of these two. Which do you believe is the more
> powerful variant (maybe an example) ?


This is not relevant to your application below. In any case, the
differences are in rather esoteric details.
>
> 2. They are both interpreted languages, and I can't really be sure how
> they measure in speed. In your opinion, for handling large files,
> which is better ?
> (I'm processing files of numerical data of several hundred mb - let's
> say 200mb - how would python handle file of such size ? As compared to
> perl ?)


For one file and simple processing, the time difference should be less
than the time you spent asking the question. For complex processing or
multiple files, a Python user might use numpy, scipy, or other
pre-written analysis extensions.

> 3. This last one is somewhat subjective, but what do you think, in the
> future, which will be more useful. Which, in your (humble) opinion
> "has a future" ?


Python, at least for me.

Terry Jan Reedy

 