Help with nested pattern.

 
 
somedeveloper@gmail.com
      04-14-2007
Hi,

Would appreciate some hints on a 'smart' / 'nifty' solution to this
problem.

The problem:
I need to extract a block of text lying between -- let's say -- a
pair of brackets.
There can be an arbitrary # of such [] blocks nested one inside the
other.
I know how to mark my first '[' to start the matching process.

Example:
abc [ def .*
[ .* ]
[ .*
[ .* ]
]
uvw ] xyz

Desired output: [ def .* uvw ]


1. Now, I don't know if this is something Perl regexps can handle. I
read somewhere (possibly incorrectly) that nested patterns are in
general constructs that are handled via grammars (flex/bison combo)
and not regexps.

2. But since Perl provides features like match-time-code-evaluation in
regexps, I thought incrementing a count variable on each '[',
decrementing it on each ']', and printing the current pattern when the
count goes to zero would do the job... but I'm not so sure how.

3. If there's really no solution via regexps and grammars, I would
have to use the brute-force approach of processing each character in a
loop looking for ['s and ]'s. (yuck!)
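
For reference, the counting idea from point 2 can be sketched with a plain character loop. This is only a minimal sketch: the helper name first_bracket_block is hypothetical, it ignores escaping and quoting, and it returns the whole outer block with the nested blocks still inside it.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first balanced [...] block in $text,
# or undef if there is none (no brackets, or never closed).
sub first_bracket_block {
    my ($text) = @_;
    my ($depth, $start) = (0, undef);
    for my $i (0 .. length($text) - 1) {
        my $ch = substr($text, $i, 1);
        if ($ch eq '[') {
            $start = $i if $depth == 0;   # remember the outermost '['
            $depth++;
        }
        elsif ($ch eq ']' && $depth > 0) {
            $depth--;
            # Depth back to zero: the outermost block has just closed
            return substr($text, $start, $i - $start + 1) if $depth == 0;
        }
    }
    return;
}

my $block = first_bracket_block("abc [ def [ x ] [ y [ z ] ] uvw ] xyz");
print "$block\n";
```

Getting from that block to the desired output would still need a second pass that drops the nested blocks.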

Regards...

 
Brian McCauley
      04-14-2007
On 14 Apr, 11:31, (E-Mail Removed) wrote:
> Hi,
>
> Would appreciate some hints on a 'smart' / 'nifty' solution to this
> problem.
>
> The problem:
> I need to extract a block of text lying between -- let's say --
> a pair of brackets.
> There can be an arbitrary # of such [] blocks nested one inside
> the other.


This is a FAQ: "How do I find matching/nesting anything?"
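
One module often suggested for this kind of problem is Text::Balanced, which ships with Perl. A minimal sketch, assuming its extract_bracketed function and a simplified one-line input rather than the poster's exact text:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

my $text = "abc [ def [ x ] [ y [ z ] ] uvw ] xyz";

# Arguments: the text, the bracket characters to balance, and a prefix
# pattern to skip before the opening bracket (here: anything up to '[').
my ($block, $rest) = extract_bracketed($text, '[]', '[^\[]*');

print "$block\n" if defined $block;
```

In list context the first return value is the bracketed block, delimiters included, and the second is whatever follows it.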

 
Brian McCauley
      04-14-2007
On 14 Apr, 11:42, "Brian McCauley" <(E-Mail Removed)> wrote:

> This is FAQ: "How do I find matching/nesting anything?"


Applying the suggestions given there

use strict;
use warnings;

my $in = ' abc [ def .*
[ .* ]
[ .*
[ .* ]
]
uvw ] xyz';

local our $re;

# Taken from "perldoc perlre" section dealing with (??{ })
$re = qr{
    \[
    (?:
        (?> [^\[\]]+ )    # run of non-bracket characters, no backtracking
      |
        (??{ $re })       # a nested bracketed group, matched recursively
    )*
    \]
}x;

# Find first top-level bracketed section
my ($out) = $in =~ /($re)/;

# Remove sub-brackets
$out =~ s/(?<!\A)$re//g;

# Normalize whitespace
$out =~ s/\s+/ /g;

print "$out\n";

__END__
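
On Perl 5.10 or later the same idea can be written with the built-in recursion construct (?1) instead of (??{ }). A minimal sketch under that version assumption, using the same input as above:

```perl
use strict;
use warnings;
use 5.010;   # (?1) recursion and possessive ++ need Perl 5.10 or later

my $in = "abc [ def .*\n[ .* ]\n[ .*\n[ .* ]\n]\nuvw ] xyz";

# Group 1 matches '[', then any mix of non-bracket runs or a recursive
# match of group 1 itself, then ']'.
my ($out) = $in =~ / ( \[ (?: [^\[\]]++ | (?1) )* \] ) /x;

# Remove nested blocks (every block except the one at the very start)
$out =~ s/ (?!^) ( \[ (?: [^\[\]]++ | (?1) )* \] ) //gx;

# Normalize whitespace
$out =~ s/\s+/ /g;

print "$out\n";
```

Note that (?1) refers to capture group 1 by absolute number, so the pattern is written inline here rather than interpolated into a larger regex with other groups.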

 
 
somedeveloper@gmail.com
      04-14-2007
On Apr 14, 4:28 pm, "Brian McCauley" <(E-Mail Removed)> wrote:
> On 14 Apr, 11:42, "Brian McCauley" <(E-Mail Removed)> wrote:
>
> > This is FAQ: "How do I find matching/nesting anything?"

>
> Applying the suggestions given there
>


Can't thank you enough! It was (really){2,}\.\.\. dumb on my part not to
check the FAQ first!

 
 
Mirco Wahab
      04-14-2007
(E-Mail Removed) wrote:
> The problem:
> I need to extract a block of text lying between -- let's say -- a
> pair of brackets.
> There can be an arbitrary # of such [] blocks nested one inside the
> other.
> I know how to mark my first '[' to start the matching process.
> Example:
> abc [ def .*
> [ .* ]
> [ .*
> [ .* ]
> ]
> uvw ] xyz
>
> Desired output: [ def .* uvw ]


If the problem stays as simple as your example,
meaning you know in advance that you only need
the outermost part, you could simply model that
outer part as a regexp and ignore the inner
structure (if you don't need it).

Example (you know you need only the "outer pair"):

use strict;
use warnings;

my $text = '
abc [ def .*
[ .* ]
[ .*
[ .* ]
]
uvw ] xyz ';

my $reg = qr/ \A                  # start of string
    .+? (\[ \s+ \w+) \s+ (\S+)    # re-model abc [ def ~~~
    .*                            # be greedy
    \b(\w+ \s+ \]) \s+ \w+ \s+    # re-model backwards
    \z
/xs;


if ( $text =~ /$reg/ ) {
    print "$1 $2 $3\n";
}


If your real problem is more complicated,
then you should go with Brian's solution, IMHO.

Regards

Mirco
 
 
Brian McCauley
      04-19-2007
On Apr 14, 12:28 pm, "Brian McCauley" <(E-Mail Removed)> wrote:

> # Remove sub-brackets
> $out =~ s/(?<!\A)$re//g;


\A is zero-width, so a look-behind on it is the same as a look-ahead,
and without the /m modifier ^ is equivalent to \A, so the above is more
neatly written as:

$out =~ s/(?!^)$re//g;
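
A quick way to check that the two forms behave the same is to run both on one string. This is a minimal sketch with a made-up one-line input, reusing the $re from the earlier post:

```perl
use strict;
use warnings;

our $re;
$re = qr{ \[ (?: (?> [^\[\]]+ ) | (??{ $re }) )* \] }x;

my $block = "[ def [ x ] [ y [ z ] ] uvw ]";

(my $behind = $block) =~ s/(?<!\A)$re//g;   # original form
(my $ahead  = $block) =~ s/(?!^)$re//g;     # simplified form

print "same result\n" if $behind eq $ahead;
print "$ahead\n";
```

Both substitutions skip the match that starts at position 0, so only the nested blocks are removed.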

 