On Sun, 24 Jan 2010 07:42:40 +0100, Martijn Lievaart wrote:
>On Fri, 22 Jan 2010 21:29:30 -0800, sln wrote:
>
>> It's not too hard to analyse the string returned by qr// to get the start
>> (and thereby the count) of capture groups. To get the actual group text
>> requires some recursion and thought.
>>
>> use strict;
>> use warnings;
>>
>> my $tmp = qr/\(\$th (i(s))(i(s))(i(s))(?:(i\(s)\)(i(s))(i(s))\))/x;
>> my @capt;
>>
>> while ($tmp =~ /( (?<!\\)\((?!\?) )/xg ) {
>> push @capt, pos($tmp);
>> }
>> print "$tmp\n";
>> my ($i,$last) = (1,1);
>>
>> for my $p (@capt) {
>> print (' 'x ($p - $last), $i++ % 10); $last = $p+1;
>> }
>> print "\nFound ",scalar @capt, " capture groups\n";
>>
>> __END__
>>
>> (?x-ism:\(\$th (i(s))(i(s))(i(s))(?:(i\(s)\)(i(s))(i(s))\)))
>>                1 2   3 4   5 6      7       8 9   0 1
>> Found 11 capture groups
>
>I think this will fail on the regexp /\\(.)/.
>
>M4
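Correct. Inserting (?:\\.)* between the lookbehind and the \( should fix it:
it consumes backslash escapes in pairs, so a paren that follows an escaped
backslash is still seen as unescaped.
See if this fails on anything.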
-sln
use strict;
use warnings;
my $tmp = qr/\(\$th(\\(?:.) \\(.\) \\\(.)(i(s))(i(s))(?:(i\(s)\)(i(s))(i(s))\)))/x;
my @capt;
# /(?<!\\)(?:\\.)*\((?!\?)/
# -------------------------
my $grprx = qr/
    (?<!\\)     # Not an escape behind us
    (?:\\.)*    # 0 or more escape + any char
    \(          # (
    (?!\?)      # Not a ? in front of us
/x;
while ($tmp =~ /($grprx)/g ) {
    # print "'$1'\n";
    push @capt, pos($tmp);   # offset just past the opening paren
}
print "$tmp\n";
my ($i,$last) = (1,1);
for my $p (@capt) {
    # pad so that each digit lands under its opening paren
    print ' ' x ($p - $last), $i++ % 10;
    $last = $p+1;
}
print "\nFound ",scalar @capt, " capture groups\n";
__END__
(?x-ism:\(\$th(\\(?:.) \\(.\) \\\(.)(i(s))(i(s))(?:(i\(s)\)(i(s))(i(s))\))))
              1          2          3 4   5 6      7       8 9   0 1
Found 11 capture groups
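
One way to look for failures is to compare the scan against Perl's own
count. This is a rough sketch, not a fix: after a successful match, $#+
holds the number of groups in the pattern, and matching the empty string
against /|$re/ always succeeds through the empty alternative. The sub
names count_by_scan and count_by_engine are made up for this example, and
the named-capture test pattern assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Same scanner as above: an unescaped "(" that is not followed by "?".
my $grprx = qr/
    (?<!\\)     # not an escape behind us
    (?:\\.)*    # 0 or more escape + any char
    \(          # (
    (?!\?)      # not a ? in front of us
/x;

# Hypothetical helper: count groups by scanning the stringified pattern.
sub count_by_scan {
    my ($re) = @_;
    my $str = "$re";
    my $n = () = $str =~ /$grprx/g;   # number of matches
    return $n;
}

# Hypothetical helper: let the regex engine do the counting.
# The empty alternative always matches "", and $#+ is the number
# of groups in the last successfully matched pattern.
sub count_by_engine {
    my ($re) = @_;
    "" =~ /|$re/ or die "empty alternative failed to match";
    return $#+;
}

my @tests = (
    qr/\\(.)/,        # escaped backslash before a group
    qr/(a)(?:b)(c)/,  # mix of capturing and non-capturing
    qr/[(]x/,         # paren inside a character class
    qr/(?<n>x)/,      # named capture
);

for my $re (@tests) {
    printf "%-20s scan=%d engine=%d\n",
        "$re", count_by_scan($re), count_by_engine($re);
}
```

On these samples the two counts should agree for the first two patterns
and disagree for the last two: the scan treats the ( inside a character
class as a group, and it skips a named capture because of the ?.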