Pattern matching [newbie]

 
 
vivek_12315
      02-13-2013
I'm working on my Perl regex code, where I have to parse an HTML line like:

<a href="/question?id=15422849"><p>MY text here 1</p><p>MY text here 2</p><p>MY text here 3</p></a>

I am doing something like:
$string =~ m/(.*)href(.*)/;

But this is not getting me what I want. I want something closer to the following text:

"MY text here 1 MY text here 2 MY text here 3"

Can someone give me some ideas?
 
Jürgen Exner
      02-13-2013
vivek_12315 wrote:
>I'm working on my Perl regex code, where I have to parse an HTML line like:
>
> <a href="/question?id=15422849"><p>MY text here 1</p><p>MY text here 2</p><p>MY text here 3</p></a>
>
>I am doing something like:
>$string =~ m/(.*)href(.*)/;
>
>But this is not getting me what I want. I want something closer to the following text:
>"MY text here 1 MY text here 2 MY text here 3"
>
>Can someone give me some ideas?


Your Question used to be Asked Frequently. Please see

perldoc -q "remove html"

jue
 
brian d foy
      02-13-2013
In article <678ed33b-3479-46bb-b6d3->,
vivek_12315 wrote:

> I'm working on my Perl regex code, where I have to parse an HTML line like:
>
> <a href="/question?id=15422849"><p>MY text here 1</p><p>MY text here
> 2</p><p>MY text here 3</p></a>
>
> I am doing something like:
> $string =~ m/(.*)href(.*)/;
>
> But this is not getting me what I want. I want something closer to
> the following text:
>
> "MY text here 1 MY text here 2 MY text here 3"



http://search.cpan.org/dist/HTML-Strip/Strip.pm
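A minimal sketch of how the module linked above might be applied to the line from the question. It assumes HTML::Strip has been installed from CPAN (it is not part of core Perl); the helper name strip_with_module and the whitespace tidying after the parse are illustrative additions, not something the module requires.

```perl
use strict;
use warnings;
use HTML::Strip;   # CPAN module; assumed to be installed

# Hypothetical helper: strip the markup, then tidy the whitespace.
sub strip_with_module {
    my ($html) = @_;
    my $hs   = HTML::Strip->new();
    my $text = $hs->parse($html);   # returns the text with tags removed
    $hs->eof;                       # reset parser state before any reuse
    $text =~ s/\s+/ /g;             # collapse runs of whitespace
    $text =~ s/^ | $//g;            # trim a leading or trailing space
    return $text;
}

my $string = '<a href="/question?id=15422849"><p>MY text here 1</p>'
           . '<p>MY text here 2</p><p>MY text here 3</p></a>';
print strip_with_module($string), "\n";
```

The whitespace cleanup is there only so the result does not depend on exactly where the module chooses to emit spaces in place of tags.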
 
 
Jürgen Exner
      02-13-2013
Henry Law wrote:
>On 13/02/13 00:16, vivek_12315 wrote:
>> I'm working on my Perl regex code, where I have to parse an HTML line like:
>>
>> <a href="/question?id=15422849"><p>MY text here 1</p><p>MY text here 2</p><p>MY text here 3</p></a>

>
>I appreciate that you call yourself a newbie, and to you what I'm about
>to suggest may seem complicated and difficult; but that's the way we all
>learn ...
>
>Have you thought of parsing the HTML properly, using a module like
>HTML::Tree or HTML::TreeBuilder? The hardest part is choosing the
>module; after that you should find it moderately easy to use it to do what
>you want, since it's pretty simple. And once you've done it it will
>probably be a lot better than hand-cranked parsing code.
>
>Note to all concerned: I'm not joining in the "you can't parse HTML with
>regexes" thread. In this case, at least, I'm sure that's perfectly
>possible.


Actually for this particular example it is almost trivial(*):
s/<.*?>//g;
Of course this is going to fail as soon as the HTML code becomes a tiny
bit more complex.

*: almost because it doesn't add the space characters between the
individual paragraph elements.

jue
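
The footnote above points out that a plain tag-stripping substitution loses the spaces between the paragraph elements. A minimal sketch of one way around that, for input as simple as the line in this thread: replace each tag with a space instead of nothing, then collapse the whitespace. The helper name strip_tags is hypothetical, and the pattern <[^>]*> is swapped in for <.*?> so that a tag containing a line break is still matched. Like the original substitution, it will break on comments, scripts, or a ">" inside an attribute value.

```perl
use strict;
use warnings;

# Hypothetical helper: turn every tag into a space, then tidy up.
sub strip_tags {
    my ($html) = @_;
    (my $text = $html) =~ s/<[^>]*>/ /g;   # each tag becomes one space
    $text =~ s/\s+/ /g;                    # collapse runs of whitespace
    $text =~ s/^ | $//g;                   # trim a leading or trailing space
    return $text;
}

my $string = '<a href="/question?id=15422849"><p>MY text here 1</p>'
           . '<p>MY text here 2</p><p>MY text here 3</p></a>';
print strip_tags($string), "\n";
# prints: MY text here 1 MY text here 2 MY text here 3
```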
 