FAQ 4.23 How do I find matching/nesting anything?

 
 
PerlFAQ Server
      04-02-2011
This is an excerpt from the latest version of perlfaq4.pod, which
comes with the standard Perl distribution. These postings aim to
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

4.23: How do I find matching/nesting anything?

A plain regular expression cannot match arbitrarily nested constructs,
no matter how complicated it is. To find something between two
occurrences of a single character, a pattern like "/x([^x]*)x/" will
capture the intervening text in $1. For multi-character delimiters,
something more like "/alpha(.*?)omega/" is needed. Neither of these
handles nested patterns, however. For balanced expressions using "(",
"{", "[" or "<" as delimiters, use the CPAN module Regexp::Common, or
see "(??{ code })" in perlre. For other cases, you'll have to write a
parser.
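
As a minimal sketch of the two simple patterns above (the sample string
and variable names are made up for illustration), note how the
non-greedy match pairs the first "alpha" with the first "omega" and so
ignores nesting:

```perl
use strict;
use warnings;

# Hypothetical sample input, chosen only to illustrate the patterns.
my $text = 'name="some value" alpha one alpha two omega three omega';

# Single-character delimiter: capture everything between two double quotes.
my ($quoted) = $text =~ /"([^"]*)"/;

# Multi-character delimiters: non-greedy match between the two markers.
# This stops at the FIRST "omega", so the inner "alpha" is not balanced.
my ($between) = $text =~ /alpha(.*?)omega/;

print "quoted:  [$quoted]\n";     # quoted:  [some value]
print "between: [$between]\n";    # between: [ one alpha two ]
```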

If you are serious about writing a parser, there are a number of modules
and tools that will make your life a lot easier. There are the CPAN
modules "Parse::RecDescent", "Parse::Yapp", and "Text::Balanced", as
well as the "byacc" program. Starting with Perl 5.8, "Text::Balanced" is
part of the standard distribution.
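
For the common case of bracket-style delimiters, a short sketch using
"extract_bracketed" from "Text::Balanced" might look like this. The
input string is invented for the example, and the text to extract is
assumed to start at the beginning of the string:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical input: a nested parenthesized group followed by other text.
my $input = '(a (b c) d) and the rest';

# In list context the first two return values are the balanced chunk
# (delimiters included) and whatever follows it.
my ($extracted, $remainder) = extract_bracketed($input, '()');

print "extracted: $extracted\n";
print "remainder: $remainder\n";
```

In list context the input string itself is left unmodified, so the
remainder can be fed back in to pull out further groups.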

One simple destructive, inside-out approach that you might try is to
pull out the smallest nesting parts one at a time:

while (s/BEGIN((?:(?!BEGIN)(?!END).)*)END//gs) {
    # do something with $1
}
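
A runnable sketch of the inside-out idea follows. It is a variation, not
the exact loop above: it drops the /g flag (an assumption made here so
that each pass removes exactly one innermost pair and $1 can be
collected every time), and the sample string and variable names are
hypothetical:

```perl
use strict;
use warnings;

# Hypothetical input with one level of nesting.
my $nested = 'aBEGINbBEGINcENDdENDe';
my @parts;

# Each pass removes the leftmost BEGIN...END pair that contains no other
# BEGIN or END, i.e. an innermost pair, and records what was inside it.
while ($nested =~ s/BEGIN((?:(?!BEGIN)(?!END).)*)END//s) {
    push @parts, $1;
}

print "parts: @parts\n";        # parts: c bd
print "left:  $nested\n";       # left:  ae
```

Because the inner pair is deleted before the outer one is examined, the
outer content is seen as "bd" rather than with the nested text in place,
which is the destructive side of this approach.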

A more complicated and sneaky approach is to make Perl's regular
expression engine do it for you. This is courtesy of Dean Inada, and rather
has the nature of an Obfuscated Perl Contest entry, but it really does
work:

# $_ contains the string to parse
# BEGIN and END are the opening and closing markers for the
# nested text.

@( = ('(','');
@) = (')','');
($re=$_)=~s/((BEGIN)|(END)|.)/$)[!$3]\Q$1\E$([!$2]/gs;
@$ = (eval{/$re/},$@!~/unmatched/i);
print join("\n",@$[0..$#$]) if( $$[-1] );
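
The snippet above is hard to read, so here is a hedged, more verbose
sketch of what appears to be the same idea: turn the string itself into
a pattern in which every BEGIN opens a capture group and every END
closes one, then match the string against that pattern. This is not
Dean Inada's code; the variable names and sample input are made up, and
the error handling is simplified to "an empty result means the markers
did not balance":

```perl
use strict;
use warnings;

# Hypothetical input with nested BEGIN/END markers.
my $source = 'xBEGINaBEGINbENDcENDy';

# Build a pattern from the text: BEGIN becomes "BEGIN(", END becomes
# ")END", and every other character is quoted so it matches literally.
(my $pattern = $source) =~ s{(BEGIN)|(END)|(.)}{
    defined $1 ? 'BEGIN(' : defined $2 ? ')END' : quotemeta($3)
}gse;

# Unbalanced markers yield unbalanced parentheses, which makes the
# pattern fail to compile; eval traps that and returns an empty list.
my @captures = eval { $source =~ /$pattern/ };

print "pattern: $pattern\n";
print "capture: $_\n" for @captures;
```

Groups are numbered by their opening parenthesis, so the outermost
nested text comes first in the result.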



--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much information as possible and relevant in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in
perlfaq.pod.
 