Velocity Reviews


LiHui 04-20-2004 01:58 AM

pattern matching
 
Can someone tell me what this line does?

$line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o

I know that it checks whether the line begins with "|" followed by
whitespace, a word, and nonwhitespace, and then I'm lost. What does
s*(?:\|.+?) mean?

Any help will be greatly appreciated. Thanks LH

Brad Baxter 04-20-2004 02:58 AM

Re: pattern matching
 
On Mon, 19 Apr 2004, LiHui wrote:

> Can someone tell me what this line does?
>
> $line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o
>
> I know that it checks whether the line begins with "|" followed by
> whitespace, a word, and nonwhitespace, and then I'm lost. What does
> s*(?:\|.+?) mean?


Make that "|" followed by optional whitespace, a word character, optional
nonwhitespace, optional whitespace, ten or more of "these": |x..., and
ending in "|".

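Also, it's not s*(?:\|.+?), it's \s*(?:\|.+?). (?:...) are grouping (not
capturing) parentheses. They're followed by {10,} so you want 10 or more
of those groups. Each group is "|" followed by at least one, possibly
more, character(s) that match(es) /./. Because +? is non-greedy, each
group takes as few characters as it can, so it normally stops before the
next "|"; the dot can still match a "|" if that is the only way for the
whole pattern to succeed.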
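
To illustrate how the non-greedy quantifier behaves around bars, here is a
small sketch. The shortened pattern (two groups instead of ten), the added
capture, and the sample string are all made up for the demonstration. It
suggests that .+? prefers the shortest match but can still swallow a "|"
when nothing else lets the pattern succeed, and that [^|]+ is one way to
rule that out.

```perl
use strict;
use warnings;

# Hypothetical sample: the second field is empty, so two bars are adjacent.
my $line = '|a||b|';

# Same shape as the original pattern, shortened to {2}, with a capture
# added so we can see what the LAST repetition of .+? matched.
my ($last_lazy) = $line =~ /^(?:\|(.+?)){2}\|$/;

# A stricter variant: [^|]+ can never cross a bar, so this line is rejected.
my $strict_ok = $line =~ /^(?:\|[^|]+){2}\|$/ ? 1 : 0;

print "lazy dot matched: ",
    (defined $last_lazy ? $last_lazy : '(no match)'), "\n";   # |b
print "strict variant matched: $strict_ok\n";                 # 0
```

Whether the stricter character class is what you want depends on whether
empty fields should be accepted.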

Below is an expanded (using /x) version:

# example line that will match
my $line = '| abc |0|1|2|3|4|5|6|7|8|9|x|y|z|';

$line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o and print "yes\n";

$line =~ m/
    ^        # begins with
    \|       # 'or' bar
    \s*      # optional whitespace
    \w       # ONE word character
    \S*      # optional nonwhitespace
    \s*      # optional whitespace
    (?:      # begin the group
        \|   # 'or' bar
        .+   # at least one character
        ?    # make the '+' non-greedy
    )        # end the group
    {10,}    # give me 10 or more GROUPS
    \|       # 'or' bar
    $        # at the end
/ox and print "yes\n";
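
As a quick check of the expanded pattern, the sketch below wraps it in a
helper and tries it on lines with different numbers of fields. The helper
name is_wide_row and all sample lines except the first are invented for
illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line starts with a bar and a word,
# then has at least ten "|field" groups, and ends with a bar.
sub is_wide_row {
    my ($line) = @_;
    return $line =~ m/
        ^ \|                # leading bar
        \s* \w \S* \s*      # first field: a word, optionally padded
        (?: \| .+? ){10,}   # ten or more "|field" groups
        \| $                # closing bar at the end
    /x ? 1 : 0;
}

my @samples = (
    '| abc |0|1|2|3|4|5|6|7|8|9|x|y|z|',   # thirteen groups
    '| abc |0|1|2|3|4|5|6|7|8|9|',         # exactly ten groups
    '| abc |0|1|2|3|4|5|6|7|8|',           # only nine groups
    'abc |0|1|2|3|4|5|6|7|8|9|',           # no leading bar
);

printf "%-36s %s\n", $_, is_wide_row($_) ? 'yes' : 'no' for @samples;
```

The first two samples should print "yes" and the last two "no".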


I assume you've looked at perldoc perlre.

Regards,

Brad

John W. Krahn 04-20-2004 06:07 AM

Re: pattern matching
 

Also, the /o option is not required as there are no variables in the
regular expression.

perldoc perlop
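
For context, /o only has an effect when the pattern interpolates a
variable. The sketch below is an illustration under that assumption, not
something from the original script: match_once, $min_groups and $row_re are
hypothetical names. It shows the pattern staying frozen under /o, and qr//
as a common way to compile an interpolated pattern once on purpose.

```perl
use strict;
use warnings;

# With /o the pattern is compiled on first use and not refreshed,
# even if the interpolated variable changes on a later call.
sub match_once {
    my ($text, $pat) = @_;
    return $text =~ /$pat/o ? 1 : 0;
}

my $first  = match_once('abc', 'a');   # compiles the pattern with 'a'
my $second = match_once('abc', 'z');   # still uses the 'a' pattern

# qr// compiles once, explicitly, and the result can be reused.
my $min_groups = 10;   # hypothetical parameter
my $row_re = qr/^\|\s*\w\S*\s*(?:\|.+?){${min_groups},}\|$/;

my $row_ok = '| abc |0|1|2|3|4|5|6|7|8|9|' =~ $row_re ? 1 : 0;

print "first=$first second=$second row_ok=$row_ok\n";
```

In the original one-liner nothing is interpolated, so dropping /o changes
nothing there.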


John
--
use Perl;
program
fulfillment

LiHui 04-21-2004 08:48 AM

Re: pattern matching
 
Thanks Brad & John. Got it now.

LiHui

Scott J 07-07-2004 07:16 PM

Re: pattern matching
 
On 19 Apr 2004 18:58:30 -0700, tanlh_listing@hotmail.com (LiHui)
wrote:

>Can someone tell me what this line does?
>
>$line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o
>
>I know that it checks whether the line begins with "|" followed by
>whitespace, a word, and nonwhitespace, and then I'm lost. What does
>s*(?:\|.+?) mean?
>
>Any help will be greatly appreciated. Thanks LH


I'll give it a go :) My regex abilities are a bit rusty.

^\|             line starts with |
\s*             followed by 0 or more whitespace characters
\w              followed by a word character (alphanumeric or underscore)
\S*             followed by 0 or more non-whitespace characters
\s*             followed by 0 or more whitespace characters
(?:\|.+?){10,}  is a quantified extended regex sequence (see below)
\|$             line ends with |

The o switch tells Perl to compile the pattern only once.

Quantified regex sequence:
(?:...) is a cluster-only parenthesis, no capturing (thanks, Camel
book), which I think means that the group takes part in the match but
does not store the matched string in a variable. The remainder of this
sequence is a regular regex:
\|   matches |
.+?  matches any character except newline, 1 or more times (minimally)

{10,} tells the pattern inside the () to match at least 10 times
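
To make the capturing versus cluster-only difference concrete, here is a
small sketch with a made-up sample string and a shortened pattern (three
groups of one word character each). It compares what a match returns in
list context with each kind of parenthesis.

```perl
use strict;
use warnings;

my $text = '|a|b|c|';   # made-up sample

# Capturing parentheses: in list context the match returns the captures.
# A quantified capture group keeps only its last repetition.
my @captured = $text =~ /^(\|\w){3}\|$/;

# Non-capturing parentheses: same grouping, nothing stored, so a
# successful match in list context just returns the list (1).
my @grouped = $text =~ /^(?:\|\w){3}\|$/;

print "captured: @captured\n";   # captured: |c
print "grouped:  @grouped\n";    # grouped:  1
```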

Does that help or hinder?

Scott

