pattern matching
Can someone tell me what this line does?

    $line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o

I know that it checks whether the line begins with "|" followed by whitespace, a word character, and non-whitespace, and then I'm lost. What is s*(?:\|.+?)

Any help will be greatly appreciated. Thanks,
LH
Re: pattern matching
On Mon, 19 Apr 2004, LiHui wrote:
> Can someone tell me what does this line do ?
>
> $line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o
>
> I know that it check to see if the line begin with "|" follow by
> whitespace, word, nonwhitespace and than I'm lost. What is s*(?:\|.+?)

Make that "|" followed by optional whitespace, a word character, optional nonwhitespace, optional whitespace, ten or more of "these": |x..., and ending in "|".

Also, it's not s*(?:\|.+?), it's \s*(?:\|.+?). (?:...) are grouping (not capturing) parentheses. They're followed by {10,}, so you want 10 or more of those groups. Each group is "|" followed by at least one, possibly more, character(s) that match(es) /./. Because +? is non-greedy, each group takes as few characters as it can, so it normally stops before the next "|". It is still allowed to consume a "|" if that is the only way the whole pattern can match.

Below is an expanded (using /x) version:

    # example line that will match
    my $line = '| abc |0|1|2|3|4|5|6|7|8|9|x|y|z|';

    $line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o and print "yes\n";

    $line =~ m/
        ^          # begins with
        \|         # 'or' bar
        \s*        # optional whitespace
        \w         # ONE word character
        \S*        # optional nonwhitespace
        \s*        # optional whitespace
        (?:        # begin the group
            \|     # 'or' bar
            .+     # at least one character
            ?      # make the '+' non-greedy
        )          # end the group
        {10,}      # give me 10 or more GROUPS
        \|         # 'or' bar
        $          # at the end
    /ox and print "yes\n";

I assume you've looked at perldoc perlre.

Regards,

Brad
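One detail that is easy to check by running it: a non-greedy .+? prefers the shortest match, but it can still swallow a "|" when nothing else lets the pattern succeed. The sketch below compares the regex from the question with a stricter variant that uses [^|]+ instead. The stricter variant and the sample lines are made up for illustration and are not from the original code.

```perl
use strict;
use warnings;

# The pattern from the question (without /o, which is not needed here).
my $lazy = qr/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/;

# Hypothetical stricter variant: a field may not contain "|" at all.
my $strict = qr/^\|\s*\w\S*\s*(?:\|[^|]+){10,}\|$/;

my @samples = (
    '| abc |0|1|2|3|4|5|6|7|8|9|',   # ten ordinary fields
    '| abc |0|1|2|3|4|5|6|7|8|',     # only nine fields
    '| abc |0|1|2|3|4|5|6|7|8|||',   # tenth "field" is itself a "|"
);

for my $line (@samples) {
    printf "%-30s lazy=%-3s strict=%s\n",
        $line,
        ($line =~ $lazy   ? 'yes' : 'no'),
        ($line =~ $strict ? 'yes' : 'no');
}
```

The first line matches both patterns and the second matches neither. The third matches only the original pattern, because its tenth group is "||", with .+? consuming a bar.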
Re: pattern matching
Brad Baxter wrote:
> On Mon, 19 Apr 2004, LiHui wrote:
>
> > Can someone tell me what does this line do ?
> >
> > $line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o
>
> [snip explanation and expanded /x version]
>
> I assume you've looked at perldoc perlre.

Also, the /o option is not required, as there are no variables in the regular expression. See perldoc perlop.

John
-- 
use Perl; program fulfillment
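For anyone wondering when /o does make a difference: it only matters when the pattern interpolates a variable. A minimal sketch, using a made-up loop variable $word, shows that with /o the pattern is built from the first value and later changes to the variable are ignored:

```perl
use strict;
use warnings;

my (@with_o, @without_o);

for my $word (qw(foo bar)) {
    # With /o the pattern is compiled once, while $word is 'foo',
    # and that compiled pattern is reused on the second pass.
    push @with_o,    ('bar' =~ /^$word$/o ? 'yes' : 'no');

    # Without /o the current value of $word is used each time.
    push @without_o, ('bar' =~ /^$word$/  ? 'yes' : 'no');
}

print "with /o:    @with_o\n";      # no no
print "without /o: @without_o\n";   # no yes
```

In the regex from the question nothing is interpolated, so /o changes nothing there.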
Re: pattern matching
Thanks Brad & John. Got it now.
LiHui
Re: pattern matching
On 19 Apr 2004 18:58:30 -0700, tanlh_listing@hotmail.com (LiHui) wrote:

>Can someone tell me what does this line do ?
>
>$line =~ m/^\|\s*\w\S*\s*(?:\|.+?){10,}\|$/o
>
>I know that it check to see if the line begin with "|" follow by
>whitespace, word, nonwhitespace and than I'm lost. What is s*(?:\|.+?)
>
>Any help will be greatly appreciate. Thanks LH

I'll give it a go :) My regex abilities are a bit rusty.

    ^\|              line starts with |
    \s*              followed by 0 or more whitespace characters
    \w               followed by one word character (alphanumeric or underscore)
    \S*              followed by 0 or more non-whitespace characters
    \s*              followed by 0 or more whitespace characters
    (?:\|.+?){10,}   is a quantified extended regex sequence (see below)
    \|$              line ends with |
    o                switch tells the pattern to compile only once

Quantified regex sequence:

(?:...) is a cluster-only parenthesis, no capturing (thanks, Camel book), which I think means that the pattern matches but does not store the matched string in a variable such as $1.

The remainder of this sequence is a regular regex:

    \|       matches |
    .+?      matches any character, 1 or more times (minimally)
    {10,}    tells the pattern inside the () to match at least 10 times

Does that help or hinder?

Scott
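To make the capturing versus cluster-only distinction concrete, here is a small sketch. The test string and variable names are invented for illustration. In list context a match returns one value per capturing group, and (?:...) contributes none:

```perl
use strict;
use warnings;

my $text = '|a|b';

# Two capturing groups: both pieces are returned (and set $1 and $2).
my @capturing = $text =~ /(\|.)(\|.)/;

# The first group only clusters, so just the second piece is returned.
my @clustered = $text =~ /(?:\|.)(\|.)/;

print scalar(@capturing), " captures: @capturing\n";   # 2 captures: |a |b
print scalar(@clustered), " capture: @clustered\n";    # 1 capture: |b
```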