How to dissect a Regexp object?

 
 
kj
Guest
Posts: n/a
 
      01-22-2010



Is there a way to tell if a given Regexp object, generated at
runtime, includes at least one pair of capture parentheses?

More generally, is there any documentation for the Regexp class?
(I'm referring to the class alluded to by the output of, e.g., ref
qr//). Running perldoc Regexp fails ("no docs found"), and perldoc
perlre does not say much at all about this class as such.

TIA!

Kynn
 
 
 
 
 
C.DeRykus
Guest
Posts: n/a
 
      01-22-2010
On Jan 22, 9:54 am, kj <no.em...@please.post> wrote:
> Is there a way to tell if a given Regexp object, generated at
> runtime, includes at least one pair of capture parentheses?
>
> More generally, is there any documentation for the Regexp class?
> (I'm referring to the class alluded to by the output of, e.g., ref
> qr//). Running perldoc Regexp fails ("no docs found"), and perldoc
> perlre does not say much at all about this class as such.
>


See perldoc perlop, section "Regexp Quote-Like Operators", which covers qr//.

The regex object is viewable as a string:

$ perl -le '$regex = qr/ab(\d+)/; print $regex'
(?-xism:ab(\d+))
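
A minimal sketch of what the stringification gives you, assuming nothing beyond core Perl. The exact flag prefix depends on the Perl version (the (?-xism:...) form above comes from an older perl), so the example avoids relying on it.

```perl
use strict;
use warnings;

my $regex = qr/ab(\d+)/;

# qr// returns a reference blessed into the "Regexp" package.
print ref($regex), "\n";

# Interpolating the object yields the pattern wrapped in a flag group.
my $as_string = "$regex";
print "$as_string\n";

# The string form can be compiled again and behaves like the original.
my $copy = qr/$as_string/;
print "round trip ok\n" if "ab42" =~ $copy && $1 eq '42';
```

The string form is handy for inspection, but it is only text: anything you learn from it comes from parsing it yourself.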

--
Charles DeRykus

 
 
 
 
 
Ilya Zakharevich
Guest
Posts: n/a
 
      01-23-2010
On 2010-01-22, Ben Morrow <> wrote:
> The Regexp 'class' doesn't have any methods[1], and isn't really useable
> as a class at all. In Perl 5.12 it will be mostly replaced by a new
> REGEXP svtype, which was what it should have been from the beginning.
> (qr// will still return a ref blessed into Regexp, for compatibility.)


Is there any way to find that an object is of a REGEXP svtype (without
using overload::StrVal)? Without this, serialization is not possible;
witness failure of FreezeThaw...

Or, at least, get hints that it "might be REGEXP" (with no false
negatives), so that a call to overload::StrVal() is needed...
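
One possible answer to the detection question, sketched under the assumption that the core re module's is_regexp function (available from perl 5.10) is acceptable. The class name My::Pattern and the variable names are made up for the illustration. This only detects a compiled regex; it does not by itself make a serializer handle one.

```perl
use strict;
use warnings;
use re 'is_regexp';    # core module; is_regexp is exported on request

my $plain     = qr/ab(\d+)/;
my $reblessed = bless qr/x/, 'My::Pattern';    # hypothetical class name
my $not_re    = \"just a string";

# is_regexp looks at the underlying value, so reblessing does not fool it.
my @cases = (
    [ 'plain qr//'       => $plain ],
    [ 'reblessed qr//'   => $reblessed ],
    [ 'scalar reference' => $not_re ],
);
for my $case (@cases) {
    my ($label, $thing) = @$case;
    printf "%-18s %s\n", $label, is_regexp($thing) ? 'regex' : 'not a regex';
}
```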

Yours,
Ilya
 
 
C.DeRykus
Guest
Posts: n/a
 
      01-23-2010
On Jan 22, 9:54 am, kj <no.em...@please.post> wrote:
> Is there a way to tell if a given Regexp object, generated at
> runtime, includes at least one pair of capture parentheses?
> ...


If you have a reasonably recent Perl (5.10 or later), you can call
re::regexp_pattern in list context to get the bare pattern and check
it for parens.

On a Win32 5.10.1 strawberry distro for instance:

c:\strawberry\perl\bin\perl.exe -le "
use re 'regexp_pattern';
$r = qr/ab(\d+)/;
($pat) = regexp_pattern($r);
print $pat"
ab(\d+)

So you could parse $pat for capturing parens. You'd need to
exclude non-capturing constructs such as (?:...) and escaped
parens, but that's left as an exercise for the reader.
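
As a script rather than a one-liner, the same call might look like the sketch below. It assumes perl 5.10 or later; in list context regexp_pattern returns the pattern and its modifier letters as two separate strings, which saves stripping the flag group by hand.

```perl
use strict;
use warnings;
use re 'regexp_pattern';

my $r = qr/ab(\d+)/ix;

# List context: the bare pattern first, then the modifier letters.
my ($pat, $mods) = regexp_pattern($r);

print "pattern: $pat\n";
print "flags:   $mods\n";
```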

--
Charles DeRykus
 
 
sln@netherlands.com
Guest
Posts: n/a
 
      01-23-2010
On Fri, 22 Jan 2010 17:54:49 +0000 (UTC), kj <> wrote:

>Is there a way to tell if a given Regexp object, generated at
>runtime, includes at least one pair of capture parentheses?
>
>More generally, is there any documentation for the Regexp class?
>(I'm referring to the class alluded to by the output of, e.g., ref
>qr//). Running perldoc Regexp fails ("no docs found"), and perldoc
>perlre does not say much at all about this class as such.
>
>TIA!
>
>Kynn


It's not too hard to analyse the string returned by qr//
to get the start (and thereby the count) of capture groups.
Getting the actual group text requires some recursion and thought.

use strict;
use warnings;

my $tmp = qr/\(\$th (i(s))(i(s))(i(s))(?i\(s)\)(i(s))(i(s))\))/x;
my @capt;

while ($tmp =~ /( (?<!\\)\((?!\?) )/xg ) {
    push @capt, pos($tmp);
}
print "$tmp\n";
my ($i,$last) = (1,1);

for my $p (@capt) {
    print (' 'x ($p - $last), $i++ % 10);
    $last = $p+1;
}
print "\nFound ",scalar @capt, " capture groups\n";

__END__

(?x-ism:\(\$th (i(s))(i(s))(i(s))(?i\(s)\)(i(s))(i(s))\)))
1 2 3 4 5 6 7 8 9 0 1
Found 11 capture groups

 
 
kj
Guest
Posts: n/a
 
      01-23-2010
In <> writes:

>On Fri, 22 Jan 2010 17:54:49 +0000 (UTC), kj <> wrote:


>>Is there a way to tell if a given Regexp object, generated at
>>runtime, includes at least one pair of capture parentheses?
>>
>>More generally, is there any documentation for the Regexp class?
>>(I'm referring to the class alluded to by the output of, e.g., ref
>>qr//). Running perldoc Regexp fails ("no docs found"), and perldoc
>>perlre does not say much at all about this class as such.
>>
>>TIA!
>>
>>Kynn


>Its not too hard to analyse the string returned by qr//
>to get the start (and thereby the count) of capture groups.
>To get the actual group text requires some recursion and thought.


> use strict;
> use warnings;


> my $tmp = qr/\(\$th (i(s))(i(s))(i(s))(?i\(s)\)(i(s))(i(s))\))/x;
> my @capt;


> while ($tmp =~ /( (?<!\\)\((?!\?) )/xg ) {
> push @capt, pos($tmp);
> }
> print "$tmp\n";
> my ($i,$last) = (1,1);


> for my $p (@capt) {
> print (' 'x ($p - $last), $i++ % 10);
> $last = $p+1;
> }
> print "\nFound ",scalar @capt, " capture groups\n";
>
>__END__


>(?x-ism:\(\$th (i(s))(i(s))(i(s))(?i\(s)\)(i(s))(i(s))\)))
> 1 2 3 4 5 6 7 8 9 0 1
>Found 11 capture groups



Thanks for this code! Now I must study it.

~K

 
 
Martijn Lievaart
Guest
Posts: n/a
 
      01-24-2010
On Fri, 22 Jan 2010 21:29:30 -0800, sln wrote:

> Its not too hard to analyse the string returned by qr// to get the start
> (and thereby the count) of capture groups. To get the actual group text
> requires some recursion and thought.
>
> use strict;
> use warnings;
>
> my $tmp = qr/\(\$th (i(s))(i(s))(i(s))(?i\(s)\)(i(s))(i(s))\))/x; my
> @capt;
>
> while ($tmp =~ /( (?<!\\)\((?!\?) )/xg ) {
> push @capt, pos($tmp);
> }
> print "$tmp\n";
> my ($i,$last) = (1,1);
>
> for my $p (@capt) {
> print (' 'x ($p - $last), $i++ % 10); $last = $p+1;
> }
> print "\nFound ",scalar @capt, " capture groups\n";
>
> __END__
>
> (?x-ism:\(\$th (i(s))(i(s))(i(s))(?i\(s)\)(i(s))(i(s))\)))
> 1 2 3 4 5 6 7 8 9 0 1
> Found 11 capture groups


I think this will fail on the regexp /\\(.)/.

M4
 
 
sln@netherlands.com
Guest
Posts: n/a
 
      01-24-2010
On Sun, 24 Jan 2010 07:42:40 +0100, Martijn Lievaart <> wrote:

>On Fri, 22 Jan 2010 21:29:30 -0800, sln wrote:
>
>> Its not too hard to analyse the string returned by qr// to get the start
>> (and thereby the count) of capture groups. To get the actual group text
>> requires some recursion and thought.
>>
>> use strict;
>> use warnings;
>>
>> my $tmp = qr/\(\$th (i(s))(i(s))(i(s))(?i\(s)\)(i(s))(i(s))\))/x; my
>> @capt;
>>
>> while ($tmp =~ /( (?<!\\)\((?!\?) )/xg ) {
>> push @capt, pos($tmp);
>> }
>> print "$tmp\n";
>> my ($i,$last) = (1,1);
>>
>> for my $p (@capt) {
>> print (' 'x ($p - $last), $i++ % 10); $last = $p+1;
>> }
>> print "\nFound ",scalar @capt, " capture groups\n";
>>
>> __END__
>>
>> (?x-ism:\(\$th (i(s))(i(s))(i(s))(?i\(s)\)(i(s))(i(s))\)))
>> 1 2 3 4 5 6 7 8 9 0 1
>> Found 11 capture groups

>
>I think this will fail on the regexp /\\(.)/.
>
>M4


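Correct. Inserting (?:\\.)* in front of the paren should fix it: it
consumes complete escape pairs such as \\ first, so a paren that
follows an escaped backslash is counted.
See if this will fail on anything.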

-sln

use strict;
use warnings;

my $tmp = qr/\(\$th(\\(?:.) \\(.\) \\\(.)(i(s))(i(s))(?i\(s)\)(i(s))(i(s))\)))/x;
my @capt;

# /(?<!\\)(?:\\.)*\((?!\?)/
# -------------------------
my $grprx = qr/
    (?<!\\)    # Not an escape behind us
    (?:\\.)*   # 0 or more escape + any char
    \(         # (
    (?!\?)     # Not a ? in front of us
/x;

while ($tmp =~ /($grprx)/g ) {
    # print "'$1'\n";
    push @capt, pos($tmp);
}
print "$tmp\n";
my ($i,$last) = (1,1);

for my $p (@capt) {
    print (' 'x ($p - $last), $i++ % 10);
    $last = $p+1;
}
print "\nFound ",scalar @capt, " capture groups\n";

__END__

(?x-ism:\(\$th(\\(?:.) \\(.\) \\\(.)(i(s))(i(s))(?i\(s)\)(i(s))(i(s))\))))
1 2 3 4 5 6 7 8 9 0 1
Found 11 capture groups
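
For checking a scanner like this against the regex engine itself, here is a hedged sketch of a different technique that was not proposed in the thread: force a successful match with an empty alternative and read $#+, which perlvar documents as the number of groups in the last successful match. The names count_captures and scan_captures are invented for the illustration; scan_captures simply reuses the pattern from the post above on the stringified regex. Named captures need perl 5.10 or later.

```perl
use strict;
use warnings;

# Ask the engine: the empty alternative always matches the empty string,
# and $#+ then holds the number of capture groups in the whole pattern.
sub count_captures {
    my ($re) = @_;
    "" =~ /|$re/;
    return $#+;
}

# The textual scanner from the post above, applied to the stringified regex.
my $grprx = qr/(?<!\\)(?:\\.)*\((?!\?)/;
sub scan_captures {
    my ($re) = @_;
    my $n = () = "$re" =~ /$grprx/g;    # count all matches
    return $n;
}

for my $re (qr/ab(\d+)/, qr/\\(.)/, qr/[(]x/, qr/(?<name>x)/) {
    printf "%-20s engine=%d scanner=%d\n",
        $re, count_captures($re), scan_captures($re);
}
```

The two counts agree on the first two patterns. They differ on a paren inside a character class, which the scanner counts although it is a literal, and on a named capture, which the scanner skips because it starts with (?.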

 