Basic Regular Expressions question...

 
 
Tassilo v. Parseval
      04-08-2005
Also sprach Anno Siegel:

> John W. Krahn <(E-Mail Removed)> wrote in comp.lang.perl.misc:
>> Anno Siegel wrote:
>> > Gunnar Hjalmarsson <(E-Mail Removed)> wrote in comp.lang.perl.misc:

>
> [using index() instead of s///]
>
>> > my $i = length $thisPage;
>> > substr( $thisPage, $i, length $find) = $replace while
>> > ( $i = rindex $thisPage, $find, $i) >= 0;
>> >
>> > That way only the unchanged part of the string is ever searched.

>>
>> Also, using the four argument substr() should be faster.

>
> How so? I never heard of that.


John is right according to a benchmark:

#!/usr/bin/perl -w

use strict;
use Benchmark qw/cmpthese/;

my $string = "0" x 100;

cmpthese(-2, {
    arg4 => sub {
        substr $string, rand length $string, 1, "0";
    },
    arg3 => sub {
        substr($string, rand length $string, 1) = "0";
    },
});
__END__
         Rate arg3 arg4
arg3 512250/s   -- -43%
arg4 903253/s  76%   --

arg4 is faster because perl has to attach magic to the return value of
the 3-argument substr() so that assigning to it modifies the original
string. The 4-argument form needs no such magic.
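
For illustration, here is a minimal sketch of the quoted rindex() loop rewritten with the 4-argument form. The variable names follow the quoted code; the sample string and replacement values are made up for this example and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data; only the variable names come from the quoted loop.
my $thisPage = 'one fish, two fish, red fish';
my ($find, $replace) = ('fish', 'bird');

my $i = length $thisPage;

# 4-argument substr() replaces the span in place and returns the old
# substring, so no magical lvalue has to be created for the assignment.
substr($thisPage, $i, length($find), $replace)
    while ($i = rindex $thisPage, $find, $i) >= 0;

print "$thisPage\n";    # one bird, two bird, red bird
```

Like the quoted loop, this sketch assumes that $replace does not itself begin with $find; otherwise rindex() would find the same position again.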

> I use "=" with substr() assignments because it reads better. Four argument
> substr is for when I need the old value of the substring too.


In most cases, though, readability outweighs speed, so you're right
about that.

Tassilo
--
use bigint;
$n=71423350343770280161397026330337371139054411854220053437565440;
$m=-8,;;$_=$n&(0xff)<<$m,,$_>>=$m,,print+chr,,while(($m+=8)<=200);
 
Anno Siegel
      04-08-2005
Tassilo v. Parseval <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> Also sprach Anno Siegel:
> > John W. Krahn <(E-Mail Removed)> wrote in comp.lang.perl.misc:


> >> Also, using the four argument substr() should be faster.

> >
> > How so? I never heard of that.

>
> John is right according to a benchmark:


Yup. I ran one too. The difference is significant.

> arg4 is faster because perl has to attach magic to the return value of
> the 3-argument substr() so that assigning to it modifies the original
> string. The 4-argument form needs no such magic.


Ah... the return value from 4-arg substr is indeed not magic:

my $x = 'aaaaabbbbbcccc';
my $ref = \ substr( $x, 5, 5, 'ZZZZZ');

$$ref = 'XXXXX'; # has no effect on $x
print "$x\n"; # aaaaaZZZZZcccc

After removing the fourth argument from substr(), $x changes to
"aaaaaXXXXXcccc".
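
Both cases can be put side by side in one script. This is only a sketch of the experiment described above; the second string $y and the reference names $plain and $magic are introduced here for illustration.

```perl
use strict;
use warnings;

my $x = 'aaaaabbbbbcccc';
my $y = 'aaaaabbbbbcccc';    # second copy, added for the 3-argument case

# 4-argument form: replaces in $x and returns a plain copy of the old text.
my $plain = \ substr($x, 5, 5, 'ZZZZZ');

# 3-argument form: returns a magical lvalue that still refers into $y.
my $magic = \ substr($y, 5, 5);

$$plain = 'XXXXX';    # changes only the detached copy
$$magic = 'XXXXX';    # writes through to $y

print "$x\n";    # aaaaaZZZZZcccc
print "$y\n";    # aaaaaXXXXXcccc
```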

Live and learn.

Anno
 
Bart Lateur
      04-11-2005
Anno Siegel wrote:

>You're right, though I wouldn't bother with storing the length of
>anything. Working backwards runs smoother:
>
> my $i = length $thisPage;
> substr( $thisPage, $i, length $find) = $replace while
> ( $i = rindex $thisPage, $find, $i) >= 0;
>
>That way only the unchanged part of the string is ever searched.


I'm sure you're aware that the behaviour of this version is not the same
as the original one for overlapping matches. Try replacing "papa" in
"papapaya", for example.
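
A small sketch of that difference, assuming the original was a left-to-right replacement such as s///g (the original code is not shown in this part of the thread) and using "X" as a made-up replacement string:

```perl
use strict;
use warnings;

my ($find, $replace) = ('papa', 'X');    # 'X' is an arbitrary replacement

# Left-to-right replacement: the first "papa" starts at offset 0.
(my $forward = 'papapaya') =~ s/\Q$find\E/$replace/g;

# Right-to-left replacement with rindex(), as in the quoted loop:
# the last "papa" starts at offset 2.
my $backward = 'papapaya';
my $i = length $backward;
substr($backward, $i, length $find) = $replace
    while ($i = rindex $backward, $find, $i) >= 0;

print "forward:  $forward\n";     # forward:  Xpaya
print "backward: $backward\n";    # backward: paXya
```

The two matches overlap in "papapaya", so whichever direction is searched first consumes the shared "pa" and the other match disappears.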

--
Bart.
 
 
Anno Siegel
      04-11-2005
Bart Lateur <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> Anno Siegel wrote:
>
> >You're right, though I wouldn't bother with storing the length of
> >anything. Working backwards runs smoother:
> >
> > my $i = length $thisPage;
> > substr( $thisPage, $i, length $find) = $replace while
> > ( $i = rindex $thisPage, $find, $i) >= 0;
> >
> >That way only the unchanged part of the string is ever searched.

>
> I'm sure you're aware that the behaviour of this version is not the same
> as the original one for overlapping matches. Try replacing "papa" in
> "papapaya", for example.


Frankly, no, I didn't notice. Thanks for pointing it out.

Anno
 