PERL - Re: Capture only first match in regular expression
Zapanaz <http://joecosby.com/code/mail.pl> wrote:
>
>The answer to this is probably staring me in the face ...
>
>I am parsing/page scraping some HTML. I know the first anchor tag <a>
>contains information I want.
>
>So I do this:
>
>    if($content =~ /.*(<a.*<\/a>).*/i){
>        $anchorContent = $1;
>
>This basically works the way I want, it matches an anchor tag and
>captures the content of it.
>
>But there are multiple anchor tags in the HTML. What I want is the
>first one, but what I get is the last one.

Drop that .* at the beginning of your RE. It doesn't do you any good; it just eats up everything as far as it can, provided the rest of the RE still matches (in short: it is greedy). The trailing .* is equally unnecessary.

Having said that, unless your HTML is in some fixed format you really, really should be using an HTML parser to parse HTML. HTML is not a regular language and therefore cannot be parsed using pure regular expressions.

>I think I should be using one of these
>
>*      Match 0 or more times
>+      Match 1 or more times
>?      Match 1 or 0 times
>{n}    Match exactly n times
>{n,}   Match at least n times
>{n,m}  Match at least n but not more than m times

If at all, you could use ? to turn the * into a non-greedy quantifier, as in .*?. At the beginning of the RE that's just stupid, because it would match the empty string anywhere and the match is unanchored anyway. Where it does help is inside the parentheses: the .* there is greedy too, so with several anchors in the input <a.*<\/a> runs from the first <a to the last </a>, while <a.*?<\/a> stops at the first </a>.

jue
Jürgen Exner
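To make the greedy versus non-greedy behaviour concrete, here is a minimal sketch. The sample string in `$content` and the helper `first_capture` are made up for illustration and are not from the original post; the `\b` and the `/s` flag in the last pattern are extra precautions (so that tags such as `<abbr>` do not match and so that `.` can cross newlines), not something the thread asked for. It is still a regex hack and no substitute for a real HTML parser.

```perl
use strict;
use warnings;

# Hypothetical sample input with two anchors; not taken from the thread.
my $content = '<p>x <a href="/one">first</a> y <a href="/two">second</a></p>';

# Hypothetical helper: return capture group 1 of $re matched against $html,
# or undef if the pattern does not match.
sub first_capture {
    my ($re, $html) = @_;
    return $html =~ $re ? $1 : undef;
}

# Pattern from the question: the leading greedy .* swallows as much as it can,
# so the capture starts at the LAST <a in the input.
my $original = first_capture(qr/.*(<a.*<\/a>).*/i, $content);

# Leading and trailing .* dropped: the match now starts at the first <a,
# but the inner greedy .* still runs on to the last </a>.
my $dropped = first_capture(qr/(<a.*<\/a>)/i, $content);

# Inner quantifier made non-greedy: the capture stops at the first </a>.
my $first = first_capture(qr/(<a\b.*?<\/a>)/is, $content);

print "original: $original\n";
print "dropped:  $dropped\n";
print "first:    $first\n";
```

With this sample input, only the last pattern captures the first anchor on its own. The pattern with just the leading `.*` removed would also do so if the input happened to contain a single anchor, which is why the problem is easy to miss when testing.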