Regexp: look ahead and match

 
 
jm
      03-01-2004
I was using the following script to find two identical letters in a row,
and now would like to find two letters in a row that are in alphabetic
order, e.g. ab, or mn, or yz, etc... Haven't had any luck changing the
script to do that. I'm thinking that I should be able to look ahead to the
next letter.
Is there a way to increment a backreference to do that?
Thanks in advance.


use strict;
use warnings;

while (my $str = <> ){
    chomp($str);
    while($str =~ /([a-z])(?=\1)/cgi) {
        print "Match\n";
    }
}


 
Matt Garrish
      03-02-2004

"jm" <(E-Mail Removed)> wrote in message
news:4043c48b$0$3095$(E-Mail Removed)...
> I was using the following script to find two identical letters in a row,
> and now would like to find two letters in a row that are in alphabetic
> order,
> e.g. ab, or mn, or yz, etc... Haven't had any luck changing the
> script
> to do that. I'm thinking that I should be able to look ahead to the next
> letter.
> Is there a way to increment a backreference to do that?
> Thanks in advance.
>
>
> use strict;
> use warnings;
>
> while (my $str = <> ){
>     chomp($str);
>     while($str =~ /([a-z])(?=\1)/cgi) {
>         print "Match\n";
>     }
> }
>


I don't think it can be done from within a regex (at least I can't think of
a way that won't result in an eval error). Something simple like the
following should work, though:

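while (my $str = <>) {

    # Character code of the previous character if it was a letter, else 0.
    my $lval = 0;

    foreach my $char ($str =~ /(.)/g) {

        my $ordval = ord($char);
        my $is_letter = $char =~ /[A-Za-z]/;

        # Print the pair when this letter's code is one more than the
        # previous letter's code.
        if ($is_letter && $ordval == $lval + 1) {
            print chr($lval) . "$char\n";
        }

        # Remember only letters, so that "@A" or "`a" is not reported.
        $lval = $is_letter ? $ordval : 0;

    }

}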

Matt


 
Jay Tilton
      03-02-2004
"jm" <(E-Mail Removed)> wrote:

: I was using the following script to find two identical letters in a row,
: and now would like to find two letters in a row that are in alphabetic
: order,
: e.g. ab, or mn, or yz, etc... Haven't had any luck changing the
: script
: to do that. I'm thinking that I should be able to look ahead to the next
: letter.
: Is there a way to increment a backreference to do that?

/([[:alpha:]])(??{ quotemeta chr(ord($1)+1) })/

That assumes that alphabetical order and character code order are the same
thing, which isn't necessarily true.
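
A minimal sketch of how that pattern might be used in a full script, under the same assumption that character code order stands in for alphabetical order. The sub name successive_pairs and the sample string are made up for illustration and are not from the thread. Two choices here are mine: the class [a-yA-Y] leaves out z and Z because their successors are not letters, and the deferred subpattern sits inside a lookahead so that overlapping pairs are found. quotemeta is not strictly needed with this narrower class, but it keeps the returned string from ever being read as regex syntax.

```perl
use strict;
use warnings;

# Hypothetical helper: return every pair of adjacent letters in $str where
# the second letter's character code is one more than the first's.
sub successive_pairs {
    my ($str) = @_;
    my @pairs;
    # (??{ ... }) runs the code at match time and uses the returned string
    # as a pattern. The lookahead leaves the second letter unconsumed, so
    # it can also start the next pair ("ab" and "bc" in "abc").
    while ($str =~ /([a-yA-Y])(?=(??{ quotemeta(chr(ord($1) + 1)) }))/g) {
        push @pairs, $1 . chr(ord($1) + 1);
    }
    return @pairs;
}

print join(' ', successive_pairs('abc xyz mno')), "\n";
# prints: ab bc xy yz mn no
```

The match is case-sensitive as written, so "aB" is not reported.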

 
Matt Garrish
      03-02-2004

"Jay Tilton" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "jm" <(E-Mail Removed)> wrote:
>
> : I was using the following script to find two identical letters in a row,
> : and now would like to find two letters in a row that are in alphabetic
> : order,
> : e.g. ab, or mn, or yz, etc... Haven't had any luck changing the
> : script
> : to do that. I'm thinking that I should be able to look ahead to the next
> : letter.
> : Is there a way to increment a backreference to do that?
>
> /([[:alpha:]])(??{ chr(ord($1)+1) })/
>


Brain not function good today. I was using (?{ }) and it just wouldn't work.
I should've gone back to perlre...

Is anyone aware of just how "experimental" these extended regexes are? I
know code can always be rewritten, but I don't like the thought of my
scripts breaking just because they're being run under a newer version of
perl (hence I would generally only use something like the above in a
throw-away script).

Matt


 
Jay Tilton
      03-02-2004
"Matt Garrish" <(E-Mail Removed)> wrote:

: while (my $str = <>) {
: my $lval = 0;
: foreach my $char ($str =~ /(.)/g) {
: my $ordval = ord($char);
: if ($char =~ /[A-Za-z]/) {
: if ($ordval == ($lval + 1)) {
: print chr($lval) . "$char\n";
: }
: }
: $lval = $ordval;
: }
: }

I'm hip on being cautious around any regex patterns labelled as
"experimental" (see the other branch of this thread). That technique is
comparatively bulletproof. The \G regex meta can tighten it up without
corrupting the spirit of the algorithm.

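while( $str =~ /([[:alpha:]])/g ) {
    my $m = $1;
    my $n = chr( ord($m)+1 );
    # Skip letters such as "Z" whose successor ("[") is not a letter;
    # interpolating "[" into the pattern below would also be a syntax error.
    next unless $n =~ /^[[:alpha:]]$/;
    # \G anchors at pos($str), i.e. right after the letter just matched.
    print "$m$n\n" if $str =~ /\G$n/;
}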

 
Matt Garrish
      03-02-2004

"Jay Tilton" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "Matt Garrish" <(E-Mail Removed)> wrote:
>
> : while (my $str = <>) {
> : my $lval = 0;
> : foreach my $char ($str =~ /(.)/g) {
> : my $ordval = ord($char);
> : if ($char =~ /[A-Za-z]/) {
> : if ($ordval == ($lval + 1)) {
> : print chr($lval) . "$char\n";
> : }
> : }
> : $lval = $ordval;
> : }
> : }
>
> I'm hip on being cautious around any regex patterns labelled as
> "experimental" (see the other branch of this thread). That technique is
> comparatively bulletproof. The \G regex meta can tighten it up without
> corrupting the spirit of the algorithm.
>
> while( $str =~ /([[:alpha:]])/g ) {
> my $m = $1;
> my $n = chr( ord($m)+1 );
> print "$m$n\n" if $str =~ /\G$n/;
> }
>


I would only comment that you would pay a price in speed by running the
second regex every time through the loop. Otherwise, as you say, it is much
more compact.

Matt
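
One possible way to avoid the second regex, offered as a sketch and not as something anyone in this thread posted: capture the next letter inside a lookahead and do the successor test with ord(). The sub name adjacent_in_order and the sample string are hypothetical. Whether this is actually faster has not been measured here.

```perl
use strict;
use warnings;

# Hypothetical helper: a single regex per string; the "is it the next
# letter?" test is plain arithmetic on character codes.
sub adjacent_in_order {
    my ($str) = @_;
    my @pairs;
    # Capture a letter, then peek at the following letter without
    # consuming it, so that letter can start the next pair.
    while ($str =~ /([A-Za-z])(?=([A-Za-z]))/g) {
        push @pairs, "$1$2" if ord($2) == ord($1) + 1;
    }
    return @pairs;
}

print join(' ', adjacent_in_order('hijack stuff, abcde')), "\n";
# prints: hi ij st tu ab bc cd de
```

Because both characters must be letters, pairs such as "Z[" or "z{" are never reported, and no experimental regex feature is involved.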


 