(?{ ... }) puzzlement

 
 
J Krugman
 
      05-31-2004



In an attempt to find a single regexp that would succeed if three
different sub-regexps matched in any order (see why in the thread
called '"Commutative" regexps'), I started playing with (?{...})-type
regexps. As warm-up, I tried this:

1 use strict;
2 use re 'eval';
3
4 my @re0 = qw(abc pqr xyz);
5 my @seen = (undef) x @re0;
6 my @re = map sprintf('%s(?{ $seen[%d] ||= "@-" })',
7                      $re0[$_], $_),
8              0..$#re0;
9 my $re = eval "qr/@{[join('|', @re)]}/";
10
11 #0         1         2
12 #01234567890123456789012345
13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*/;
14
15 print "\$seen[$_] = $seen[$_]\n" for (0..$#seen);
16
17 __END__

$seen[0] =
$seen[1] =
$seen[2] =

If I change line 13 to

13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*(?!)/;

the output I get changes to

$seen[0] = 2 14 14
$seen[1] = 2
$seen[2] = 2 2 2


I find both results completely puzzling. I realize that (?{ ... })
is a highly experimental feature, but if anyone can explain to me
what's going on I'd very much appreciate it.

TIA,

jill

P.S. Unrelated regexp question: if I have a string or regexp in
a variable $x, and I want to use this variable to write a regexp
corresponding to 5 repeats of the contents of $x, how do I write
it? If I wrote /$x{5}/, it would be interpreted by perl as attempting
to access the value corresponding to key '5' in the hash %x.

--
To s&e^n]d me m~a}i]l r%e*m?o\v[e bit from my a|d)d:r{e:s]s.

 
 
 
 
 
Ben Morrow
 
      05-31-2004

Quoth J Krugman <(E-Mail Removed)>:
>
> In an attempt to find a single regexp that would succeed if three
> different sub-regexps matched in any order (see why in the thread
> called '"Commutative" regexps'), I started playing with (?{...})-type
> regexps. As warm-up, I tried this:
>
> 1 use strict;
> 2 use re 'eval';
> 3
> 4 my @re0 = qw(abc pqr xyz);
> 5 my @seen = (undef) x @re0;
> 6 my @re = map sprintf('%s(?{ $seen[%d] ||= "@-" })',
> 7 $re0[$_], $_),
> 8 0..$#re0;


@- only contains entries for () sub-expressions. You have none here
(that have matched yet), so it won't work. (My guess is that the first
entry isn't filled in until after the match has finished, but I don't
rightly know...)

> 9 my $re = eval "qr/@{[join('|', @re)]}/";
> 10
> 11 #0         1         2
> 12 #01234567890123456789012345
> 13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*/;
> 14
> 15 print "\$seen[$_] = $seen[$_]\n" for (0..$#seen);
> 16
> 17 __END__
>
> $seen[0] =
> $seen[1] =
> $seen[2] =
>
> If I change line 13 to
>
> 13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*(?!)/;
>
> The output I get changes to
>
> $seen[0] = 2 14 14
> $seen[1] = 2
> $seen[2] = 2 2 2


Presumably the (?!) is causing it to carry on trying, so it re-runs
those bits of code after @- has some entries.
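The empty output from the first version can be checked without any code blocks. A minimal sketch, assuming the alternation is reduced to the three plain literals (the name $alt is made up here): because the outer group is starred it may match zero times, so the match can succeed straight away at offset 0 with an empty string, before abc, pqr or xyz has matched anywhere. Adding (?!) takes away that easy success, so the engine has to go on and try the later start positions.

```perl
use strict;
use warnings;

my $str = '__pqr____xyz__pqr___abc___';
my $alt = qr/abc|pqr|xyz/;    # the three literals, without code blocks

# The starred group is allowed to match zero times, so the overall
# match can succeed at the very first position without consuming anything.
my $ok     = $str =~ /(($alt).*?)*/;
my $start  = $-[0];            # offset where the match began
my $length = $+[0] - $-[0];    # number of characters matched

print "matched=", ($ok ? 1 : 0), " start=$start length=$length\n";
```

If this prints start=0 and length=0, none of the alternatives was ever reached, which would explain why the (?{ ... }) blocks attached to them never ran.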

Try something more like (completely untested):

my @pats = qw/abc pqr xyz/;
my %seen;
my $re = join '|', map {
    qr/(\Q$_\E) (?{ $seen{$_} = 1 }) (?!)/x
} @pats;

'__pqr__xyz__pqr__abc__' =~ /$re/;

$, = $\ = "\n";
print keys %seen;
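One thing to watch in the snippet above: according to perlre, inside a (?{ ... }) block $_ refers to the string being matched, not to the map variable, so $seen{$_} ends up keyed by the whole target string. A sketch of a variant that records the text of the group that has just closed, via $^N, instead. It assumes a perl where $^N is available (5.8 or later) and uses a package variable %seen to stay clear of closure problems with lexicals in older perls. The name $alt is made up here, and this is an illustration of the idea, not a confirmed fix for the original problem.

```perl
use strict;
use warnings;
use re 'eval';    # the pattern mixes interpolation with a code block

my @pats = qw(abc pqr xyz);
our %seen;        # package variable, visible inside the code block

# Quote each literal and join them into one alternation.
my $alt = join '|', map { quotemeta($_) } @pats;

# After the capturing group closes, $^N holds the text it just matched.
# The trailing (?!) always fails, so the engine backtracks and retries
# at every start position, running the block for each literal it finds.
'__pqr__xyz__pqr__abc__' =~ /($alt)(?{ $seen{$^N} = 1 })(?!)/;

print "$_\n" for sort keys %seen;
```

The match as a whole still fails because of (?!), so only the side effect on %seen is of interest.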

> P.S. Unrelated regexp question: if I have a string or regexp in
> a variable $x, and I want to use this variable to write a regexp
> corresponding to 5 repeats of the contents of $x, how do I write
> it? If I wrote /$x{5}/, it would be interpreted by perl as attempting
> to access the value corresponding to key '5' in the hash %x.


Try /(?:$x){5}/.
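A short sketch of why the grouping matters, with a made-up value for $x: the non-capturing group makes {5} apply to the whole interpolated pattern instead of only its last character, and it also keeps perl from reading $x{5} as a hash element.

```perl
use strict;
use warnings;

my $x = 'ab';    # example contents; a qr// object interpolates the same way

# (?:...) groups the interpolated pattern so the quantifier covers all of it.
my $five = qr/\A(?:$x){5}\z/;

my $yes = ('ab' x 5) =~ $five ? 1 : 0;    # exactly five repeats
my $no  = ('ab' x 4) =~ $five ? 1 : 0;    # only four repeats

print "five repeats: $yes, four repeats: $no\n";
```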

Ben

--
For the last month, a large number of PSNs in the Arpa[Inter-]net have been
reporting symptoms of congestion ... These reports have been accompanied by an
increasing number of user complaints ... As of June,... the Arpanet contained
47 nodes and 63 links. [ftp://rtfm.mit.edu/pub/arpaprob.txt]
 
 
 
 
 
J Krugman
 
      05-31-2004
In <c9fabj$8bg$(E-Mail Removed)> Ben Morrow <(E-Mail Removed)> writes:


>Quoth J Krugman <(E-Mail Removed)>:
>>
>> In an attempt to find a single regexp that would succeed if three
>> different sub-regexps matched in any order (see why in the thread
>> called '"Commutative" regexps'), I started playing with (?{...})-type
>> regexps. As warm-up, I tried this:
>>
>> 1 use strict;
>> 2 use re 'eval';
>> 3
>> 4 my @re0 = qw(abc pqr xyz);
>> 5 my @seen = (undef) x @re0;
>> 6 my @re = map sprintf('%s(?{ $seen[%d] ||= "@-" })',
>> 7 $re0[$_], $_),
>> 8 0..$#re0;


>@- only contains entries for () sub-expressions. You have none here
>(that have matched yet), so it won't work. (My guess is that the first
>entry isn't filled in until after the match has finished, but I don't
>rightly know...)


I get the same results (i.e. @seen never gets set) if I
change line 6 to

6 my @re = map sprintf('(%s)(?{ $seen[%d] ||= "@-" })',



>Try something more like (completely untested):


>my @pats = qw/abc pqr xyz/;
>my %seen;
>my $re = join '|', map {
> qr/(\Q$_\E) (?{ $seen{$_} = 1 }) (?!)/x
>} @pats;


>'__pqr__xyz__pqr__abc__' =~ /$re/;


>$, = $\ = "\n";
>print keys %seen;


OK, tried it (and many variants); no luck. I now think that (?{ ... })
is not the way to go; too complicated (and/or buggy) for me.
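For the original goal (succeed when all three sub-patterns occur, in any order), one approach that needs no code blocks is a chain of lookaheads anchored at the start of the string. A minimal sketch, assuming the sub-patterns are allowed to overlap and that only a yes/no answer is needed; the helper name all_present is made up here.

```perl
use strict;
use warnings;

# Returns 1 if every pattern in @pats matches somewhere in $str, else 0.
# Each (?=.*PAT) looks ahead from the start of the string without
# consuming anything, so the order of the occurrences does not matter.
sub all_present {
    my ($str, @pats) = @_;
    my $re = join '', map { "(?=.*(?:$_))" } @pats;
    return $str =~ /\A$re/s ? 1 : 0;
}

my @pats = qw(abc pqr xyz);
print all_present('__pqr____xyz__pqr___abc___', @pats), "\n";    # all three present
print all_present('__pqr____xyz__pqr_________', @pats), "\n";    # abc missing
```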



>> P.S. Unrelated regexp question: if I have a string or regexp in
>> a variable $x, and I want to use this variable to write a regexp
>> corresponding to 5 repeats of the contents of $x, how do I write
>> it? If I wrote /$x{5}/, it would be interpreted by perl as attempting
>> to access the value corresponding to key '5' in the hash %x.


>Try /(?:$x){5}/.


Thanks!

jill



--
To s&e^n]d me m~a}i]l r%e*m?o\v[e bit from my a|d)d:r{e:s]s.

 
 
Matt Garrish
 
      05-31-2004

"J Krugman" <(E-Mail Removed)> wrote in message
news:c9f8on$no6$(E-Mail Removed)...
>
>
>
> In an attempt to find a single regexp that would succeed if three
> different sub-regexps matched in any order (see why in the thread
> called '"Commutative" regexps'), I started playing with (?{...})-type
> regexps. As warm-up, I tried this:
>
> 1 use strict;
> 2 use re 'eval';
> 3
> 4 my @re0 = qw(abc pqr xyz);
> 5 my @seen = (undef) x @re0;
> 6 my @re = map sprintf('%s(?{ $seen[%d] ||= "@-" })',


I think you want ?? not ?

my @re = map sprintf('%s(??{ $seen[%d] ||= "@-" })',


> 7 $re0[$_], $_),
> 8 0..$#re0;
> 9 my $re = eval "qr/@{[join('|', @re)]}/";
> 10
> 11 #0         1         2
> 12 #01234567890123456789012345
> 13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*/;
> 14
> 15 print "\$seen[$_] = $seen[$_]\n" for (0..$#seen);
> 16
> 17 __END__
>


The output then becomes

$seen[0] = 20
$seen[1] = 2
$seen[2] = 9
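These values are the offsets at which each literal first occurs in the string, which can be cross-checked without (?{ ... }) or (??{ ... }). A minimal sketch using a plain /g loop; the names %first and $alt are made up here.

```perl
use strict;
use warnings;

my @re0 = qw(abc pqr xyz);
my $str = '__pqr____xyz__pqr___abc___';
my $alt = join '|', map { quotemeta($_) } @re0;

# Walk over every occurrence and remember the first offset of each literal.
my %first;
while ($str =~ /($alt)/g) {
    $first{$1} = $-[1] unless exists $first{$1};
}

print "$_ first seen at $first{$_}\n" for @re0;
```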

Matt


 
 
Matt Garrish
 
      05-31-2004

"Matt Garrish" <(E-Mail Removed)> wrote in message
news:9OOuc.84348$(E-Mail Removed) ...
>
> "J Krugman" <(E-Mail Removed)> wrote in message
> news:c9f8on$no6$(E-Mail Removed)...
> >
> >
> >
> > In an attempt to find a single regexp that would succeed if three
> > different sub-regexps matched in any order (see why in the thread
> > called '"Commutative" regexps'), I started playing with (?{...})-type
> > regexps. As warm-up, I tried this:
> >
> > 1 use strict;
> > 2 use re 'eval';
> > 3
> > 4 my @re0 = qw(abc pqr xyz);
> > 5 my @seen = (undef) x @re0;
> > 6 my @re = map sprintf('%s(?{ $seen[%d] ||= "@-" })',

>
> I think you want ?? not ?
>
> my @re = map sprintf('%s(??{ $seen[%d] ||= "@-" })',
>
>
> > 7 $re0[$_], $_),
> > 8 0..$#re0;
> > 9 my $re = eval "qr/@{[join('|', @re)]}/";
> > 10
> > 11 #0         1         2
> > 12 #01234567890123456789012345
> > 13 '__pqr____xyz__pqr___abc___' =~ /(($re).*?)*/;
> > 14
> > 15 print "\$seen[$_] = $seen[$_]\n" for (0..$#seen);
> > 16
> > 17 __END__
> >

>
> The output then becomes
>
> $seen[0] = 20
> $seen[1] = 2
> $seen[2] = 9
>


Forgot to mention that I also changed line 13 to:

'__pqr____xyz__pqr___abc___' =~ /(($re).*?)*(?!)/;

as per your original post.

Matt


 