PERL - Re: Capture only first match in regular expression |
#1 |
|
Zapanaz <http://joecosby.com/code/mail.pl> wrote:

> I am parsing/page scraping some HTML. I know the first anchor tag <a>
> contains information I want.
>
> So I do this:
>
> if($content =~ /.*(<a.*<\/a>).*/i){
>     $anchorContent = $1;

Another poster suggested that regular expressions aren't sufficient for this. But you may be able to do it anyway if you can confidently predict features of the incoming HTML. That is, if you know that "the first anchor tag <a> contains information" you want, you may also know other things about the HTML you're trying to parse.

Given an anchor of the general form:

    <a href=foo possible-other-attributes=bar> Anchor-text </a>

If you know in advance that the "Anchor-text" is *not* an <IMG src=...> tag and that the "Anchor-text" does not itself contain any other tags (such as, say, "<i>Anchor-text</i>"), then you could use:

    if($content =~ /(<a\s[^>]+>[^<]*<\/a>)/i) {
        $anchorContent = $1;
    }

Piece by piece:

    <a       Match the <a literally
    \s       Require a whitespace character after the 'a'
    [^>]+    Match anything that can occur within an opening <A...> tag
    >        Match the closing '>' of the opening <a tag
    [^<]*    Match any text except the '<' that will signal the closing </a> tag
    <\/a>    Match the closing </a> tag

This won't work if the incoming HTML is arbitrary, because you might have:

    <a href=foo><img src=bar></a>

or

    <a href=foo> <i> Yow<b>!</b></i> </a>

I'm no expert, but I suspect that to reliably match what you want from any arbitrary HTML, you'll have to write a more general parser.

-- 
Mike Spencer
Nova Scotia, Canada
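To make the difference between the two patterns concrete, here is a small sketch that runs both against made-up input. The sample strings and the variable names `$greedy`, `$first` and `$skipped` are assumptions for illustration only; the real HTML from the original question is not shown in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample input (not from the original post).
my $content = '<p><a href="one.html">First</a> and <a href="two.html">Second</a></p>';

# Pattern from the quoted question. The leading greedy .* swallows as much
# as it can, so the capture lands on the LAST anchor, not the first.
my ($greedy) = $content =~ /.*(<a.*<\/a>).*/i;

# Pattern suggested in the reply. With no leading .*, the regex engine
# reports the leftmost match, which here is the first anchor.
my ($first) = $content =~ /(<a\s[^>]+>[^<]*<\/a>)/i;

print "greedy: $greedy\n";
print "first:  $first\n";

# Caveat: if the first anchor contains another tag, the suggested pattern
# does not fail outright. It skips ahead to the next anchor that fits.
my $nested = '<a href=foo><img src=bar></a> <a href=baz>text</a>';
my ($skipped) = $nested =~ /(<a\s[^>]+>[^<]*<\/a>)/i;

print "skipped to: $skipped\n";
```

The last case is worth noting: when the assumptions about the incoming HTML do not hold, the pattern can quietly return a later anchor rather than nothing, so it may be worth checking the captured text before trusting it.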