Re: Capture only first match in regular expression

Peter Tuente
Hi Zapanaz,

Regular expression quantifiers are "greedy" by default: they match as much as
they can. To make a quantifier non-greedy, put a single question mark "?"
right after it, e.g. ".*?" instead of ".*". That may sound a bit complicated,
but I hope you get the idea.

In your case the following should be sufficient:

# old:
# if ($content =~ /.*(<a.*<\/a>).*/i) {
#     $anchorContent = $1;
# }

# new: both ".*" quantifiers made non-greedy with "?"
if ($content =~ /.*?(<a.*?<\/a>).*/i) {
    $anchorContent = $1;
}

The effect is that the first ".*" no longer eats as many characters as
possible (which included every earlier "<a", so the capture began at the last
"<a" in the line). Instead it stops at the first "<a". The same goes for the
second ".*": it now stops at the first "</a>" rather than the last.
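To see the difference side by side, here is a small sketch. The sample string in $content is made up for illustration (it is not from your page); the two patterns are the old and new ones from above.

```perl
use strict;
use warnings;

# Hypothetical sample input: two anchors on a single line.
my $content = 'x <a href="/one">first</a> y <a href="/two">second</a> z';

# Greedy: the leading .* runs to the end of the line and backtracks only
# as far as needed, so the capture starts at the LAST "<a".
my ($greedy) = $content =~ /.*(<a.*<\/a>).*/i;

# Non-greedy: .*? grows one character at a time, so the capture starts
# at the FIRST "<a" and ends at the first "</a>" after it.
my ($lazy) = $content =~ /.*?(<a.*?<\/a>).*/i;

print "greedy: $greedy\n";   # greedy: <a href="/two">second</a>
print "lazy:   $lazy\n";     # lazy:   <a href="/one">first</a>
```

Since an unanchored match already finds the leftmost possible match, the leading ".*?" and trailing ".*" are not strictly needed; /(<a.*?<\/a>)/i captures the same thing. Also keep in mind that "." does not match a newline unless you add the /s modifier, so an anchor that spans several lines would need that.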

Hope this helps



"Zapanaz" <> wrote in message
news:(E-Mail Removed)...
> Excuse the cross-post, my server doesn't carry comp.lang.perl.misc but
> it looks like there is more activity there.
> The answer to this is probably staring me in the face ...
> I am parsing/page scraping some HTML. I know the first anchor tag <a>
> contains information I want.
> So I do this:
> if($content =~ /.*(<a.*<\/a>).*/i){
> $anchorContent = $1;
> This basically works the way I want, it matches an anchor tag and
> captures the content of it.
> But there are multiple anchor tags in the HTML. What I want is the
> first one, but what I get is the last one.
> I think I should be using one of these
> * Match 0 or more times
> + Match 1 or more times
> ? Match 1 or 0 times
> {n} Match exactly n times
> {n,} Match at least n times
> {n,m} Match at least n but not more than m times
> To be honest, I really don't know how (n) is actually supposed to
> look. Would I actually use /a(1)/ to match "a" only one time?
> --
> Zapanaz
> International Satanic Conspiracy
> Customer Support Specialist
> Despite the strange appearance of the scooters, the Chinese ant-terror
> police are lethal in action.
> :: Currently listening to No 21 in C major K467 Allegro maestoso, 1785, by
> Mozart, from "Piano Concertos - Vladimir Ashkenazy"
