Pattern matching

 
 
Deepan Perl XML Parser
 
      03-25-2008
Hi all,
I have a file like the one below:

<?xml version="1.0" encoding="UTF-8"?>
<log xmlns="http://www.httpwatch.com/xml/log/5.1">
<entry method="GET" URL="http://www.google.com/sa/frame_main.cgi">
..
..
..
..
-------some text-------------
..
..
..
</entry>
<entry method="GET" URL="http://www.toogle.com/framer/main.cgi">
..
..
..
..
-------some text-------------
..
..
..
</entry>
<entry method="GET" URL="http://www.google.com/sa/frame_main.html">
..
..
..
..
-------some text-------------
..
..
..
</entry>
<page id="page_0" title="Sustaining Portal" dynamic="true"
unknown="false">
<started>00:00:00.000</started>
<startedDateTime>2008-03-25T09:52:12.791</startedDateTime>
</page>
<page id="page_1" title="Sustaining Portal" dynamic="true"
unknown="false">
<started>00:00:08.455</started>
<startedDateTime>2008-03-25T09:52:21.246</startedDateTime>
</page>
<page id="page_2" title="Sustaining Portal" dynamic="true"
unknown="false">
<started>00:00:20.296</started>
<startedDateTime>2008-03-25T09:52:33.087</startedDateTime>
</page>
<page id="page_3" title="Sustaining Portal" dynamic="true"
unknown="false">
<started>00:00:29.848</started>
<startedDateTime>2008-03-25T09:52:42.639</startedDateTime>
</page>
</log>

----------------------------------------------------------------------------------

Now, how do I get all those <entry ....> tags into an array? I mean
getting

<entry method="GET" URL="http://www.google.com/sa/frame_main.cgi">
<entry method="GET" URL="http://www.toogle.com/framer/main.cgi">
<entry method="GET" URL="http://www.google.com/sa/frame_main.html">

into some array.

Thanks,
Deepan
 
 
 
 
 
Deepan Perl XML Parser
 
      03-25-2008
On Mar 25, 9:50 am, Deepan Perl XML Parser <(E-Mail Removed)>
wrote:
> Hi all,
> I am having a file like below:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <log xmlns="http://www.httpwatch.com/xml/log/5.1">
> <entry method="GET" URL="http://www.google.com/sa/frame_main.cgi">
> .
> .
> .
> .
> -------some text-------------
> .
> .
> .
> </entry>
> <entry method="GET" URL="http://www.toogle.com/framer/main.cgi">
> .
> .
> .
> .
> -------some text-------------
> .
> .
> .
> </entry>
> <entry method="GET" URL="http://www.google.com/sa/frame_main.html">
> .
> .
> .
> .
> -------some text-------------
> .
> .
> .
> </entry>
> <page id="page_0" title="Sustaining Portal" dynamic="true"
> unknown="false">
> <started>00:00:00.000</started>
> <startedDateTime>2008-03-25T09:52:12.791</startedDateTime>
> </page>
> <page id="page_1" title="Sustaining Portal" dynamic="true"
> unknown="false">
> <started>00:00:08.455</started>
> <startedDateTime>2008-03-25T09:52:21.246</startedDateTime>
> </page>
> <page id="page_2" title="Sustaining Portal" dynamic="true"
> unknown="false">
> <started>00:00:20.296</started>
> <startedDateTime>2008-03-25T09:52:33.087</startedDateTime>
> </page>
> <page id="page_3" title="Sustaining Portal" dynamic="true"
> unknown="false">
> <started>00:00:29.848</started>
> <startedDateTime>2008-03-25T09:52:42.639</startedDateTime>
> </page>
> </log>
>
> ----------------------------------------------------------------------------------
>
> Now how to get all those <entry ....> tags into an array? I mean
> getting
>
> <entry method="GET" URL="http://www.google.com/sa/
> frame_main.cgi">
> <entry method="GET" URL="http://www.toogle.com/framer/
> main.cgi">
> <entry method="GET" URL="http://www.google.com/sa/
> frame_main.html">
>
> into some array.
>
> Thanks,
> Deepan


while($string =~ m#<entry method=.* URL="http://(.*)">#g)
{
..................
..................
}

I am able to do this using the expression above. Is there a better way
of doing it?
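
If the regex route is kept, one possible tightening is sketched below. It relies on m//g returning all matches in list context, and uses [^>]* instead of .* so a match cannot run past the end of the tag. The sample string is a shortened copy of the log above, and @entries and @urls are names made up for this sketch. It assumes no attribute value contains a '>' character, which a real XML parser would not need to assume.

```perl
use strict;
use warnings;

# Shortened sample modelled on the log in the original post.
my $string = <<'XML';
<log xmlns="http://www.httpwatch.com/xml/log/5.1">
<entry method="GET" URL="http://www.google.com/sa/frame_main.cgi">
some text
</entry>
<entry method="GET" URL="http://www.toogle.com/framer/main.cgi">
some text
</entry>
<page id="page_0" title="Sustaining Portal" dynamic="true"
unknown="false">
</page>
</log>
XML

# In list context, m//g returns every captured match at once.
# [^>]* stops at the closing '>' of the start tag.
my @entries = $string =~ m{(<entry\b[^>]*>)}g;

# Same idea, capturing only the value of the URL attribute.
my @urls = $string =~ m{<entry\b[^>]*\bURL="([^"]*)"}g;

print "$_\n" for @entries;
print "$_\n" for @urls;
```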
 
 
 
 
 
Jürgen Exner
 
      03-25-2008
Deepan Perl XML Parser <(E-Mail Removed)> wrote:
> I am having a file like below:
><?xml version="1.0" encoding="UTF-8"?>

[...]
>Now how to get all those <entry ....> tags into an array? I mean


You would use an XML parser to parse XML.
Has nothing to do with pattern matching at all.

jue
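
As a rough illustration of the parser approach, here is a minimal sketch using XML::LibXML (assumed to be installed from CPAN). The sample document is a shortened copy of the posted log, and the "hw" prefix is an arbitrary name chosen for this example. It is needed because the log declares a default namespace, which XPath cannot address without a registered prefix.

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened sample modelled on the log in the original post.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<log xmlns="http://www.httpwatch.com/xml/log/5.1">
<entry method="GET" URL="http://www.google.com/sa/frame_main.cgi">some text</entry>
<entry method="GET" URL="http://www.toogle.com/framer/main.cgi">some text</entry>
<page id="page_0" title="Sustaining Portal" dynamic="true" unknown="false"/>
</log>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Bind the document's default namespace to a prefix usable in XPath.
# "hw" is an arbitrary choice for this sketch.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(hw => 'http://www.httpwatch.com/xml/log/5.1');

# All <entry> elements directly under <log>, as node objects.
my @entries = $xpc->findnodes('/hw:log/hw:entry');

# Pull the URL attribute out of each entry.
my @urls = map { $_->getAttribute('URL') } @entries;

print "$_\n" for @urls;
```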
 
 
Deepan Perl XML Parser
 
      03-25-2008
On Mar 25, 10:40 am, Jürgen Exner <(E-Mail Removed)> wrote:
> Deepan Perl XML Parser <(E-Mail Removed)> wrote:
>
> > I am having a file like below:
> ><?xml version="1.0" encoding="UTF-8"?>

> [...]
> >Now how to get all those <entry ....> tags into an array? I mean

>
> You would use an XML parser to parse XML.
> Has nothing to do with pattern matching at all.
>
> jue


No, I am writing my own XML parser.
 
 
Deepan Perl XML Parser
 
      03-26-2008
On Mar 25, 7:13 pm, Lawrence Statton <(E-Mail Removed)> wrote:
> Deepan Perl XML Parser <(E-Mail Removed)> writes:
>
>
>
> > No i am writing my own XML parser.

>
> Don't. There are many good XML parsers out there, the world doesn't
> need another one.
>
> --
> Lawrence Statton - (E-Mail Removed) s/aba/c/g
> Computer software consists of only two components: ones and
> zeros, in roughly equal proportions. All that is required is to
> place them into the correct order.


Okay, then can you name any parsers that would get the CDATA section?
 
 
Peter J. Holzer
 
      03-26-2008
On 2008-03-26 04:00, Deepan Perl XML Parser <(E-Mail Removed)> wrote:
> On Mar 25, 7:13 pm, Lawrence Statton <(E-Mail Removed)> wrote:
>> Deepan Perl XML Parser <(E-Mail Removed)> writes:
>>
>> > No i am writing my own XML parser.

>>
>> Don't. There are many good XML parsers out there, the world doesn't
>> need another one.

>
> Okay then can you name any parsers that would get the CDATA section?


Which one doesn't?

LibXML certainly does (I just tested it). I think expat does, too.
I have my doubts about the pure perl XML parser, but that has a lot of
other problems too and shouldn't be used.

hp

 
 
Deepan Perl XML Parser
 
      03-27-2008
On Mar 26, 5:31 pm, "Peter J. Holzer" <(E-Mail Removed)> wrote:
> On 2008-03-26 04:00, Deepan Perl XML Parser <(E-Mail Removed)> wrote:
>
> > On Mar 25, 7:13 pm, Lawrence Statton <(E-Mail Removed)> wrote:
> >> Deepan Perl XML Parser <(E-Mail Removed)> writes:

>
> >> > No i am writing my own XML parser.

>
> >> Don't. There are many good XML parsers out there, the world doesn't
> >> need another one.

>
> > Okay then can you name any parsers that would get the CDATA section?

>
> Which one doesn't?
>
> LibXML certainly does (I just tested it). I think expat does, too.
> I have my doubts about the pure perl XML parser, but that has a lot of
> other problems too and shouldn't be used.
>
> hp


This one, XML::Parser, doesn't. It only signals where the CDATA section
starts and ends. It is not possible to get the data inside it using
this.
 
 
Peter J. Holzer
 
      03-27-2008
On 2008-03-27 04:32, Deepan Perl XML Parser <(E-Mail Removed)> wrote:
> On Mar 26, 5:31 pm, "Peter J. Holzer" <(E-Mail Removed)> wrote:
>> On 2008-03-26 04:00, Deepan Perl XML Parser <(E-Mail Removed)> wrote:
>> > On Mar 25, 7:13 pm, Lawrence Statton <(E-Mail Removed)> wrote:
>> >> Deepan Perl XML Parser <(E-Mail Removed)> writes:

>>
>> >> > No i am writing my own XML parser.

>>
>> >> Don't. There are many good XML parsers out there, the world doesn't
>> >> need another one.

>>
>> > Okay then can you name any parsers that would get the CDATA section?

>>
>> Which one doesn't?
>>
>> LibXML certainly does (I just tested it). I think expat does, too.
>> I have my doubts about the pure perl XML parser, but that has a lot of
>> other problems too and shouldn't be used.

>
> This one XML::Parser doesn't. It just signals you as CDATA starts and
> ends here. It is not possible to get the data which is present using
> this.


Works for me:

chronos:/wsrdb/users/hjp/tmp 20:47 112% cat foo.xml
<script>
<![CDATA[
function matchwo(a,b)
{
if (a < b && a < 0) then
{
return 1;
}
else
{
return 0;
}
}
]]>
</script>
chronos:/wsrdb/users/hjp/tmp 20:47 113% cat foo
#!/usr/bin/perl
use XML::Simple;
use Data::Dumper;

$x = XMLin($ARGV[0]);
print Dumper $x;
chronos:/wsrdb/users/hjp/tmp 20:47 114% export XML_SIMPLE_PREFERRED_PARSER=XML::Parser
chronos:/wsrdb/users/hjp/tmp 20:47 115% ./foo foo.xml
$VAR1 = '

function matchwo(a,b)
{
if (a < b && a < 0) then
{
return 1;
}
else
{
return 0;
}
}

';

hp
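
When XML::Parser is used directly rather than through XML::Simple, the text inside a CDATA section is delivered to the ordinary Char handler; CdataStart and CdataEnd only mark the boundaries. A minimal sketch of collecting it, assuming XML::Parser is installed; the one-line sample document and the variable names are made up for illustration:

```perl
use strict;
use warnings;
use XML::Parser;

# Tiny sample document with one CDATA section.
my $xml = '<script><![CDATA[if (a < b && a < 0) { return 1; }]]></script>';

my $in_cdata = 0;     # true between the CdataStart and CdataEnd events
my $cdata    = '';    # text collected from inside the CDATA section

my $parser = XML::Parser->new(
    Handlers => {
        CdataStart => sub { $in_cdata = 1 },
        CdataEnd   => sub { $in_cdata = 0 },
        # Char may fire several times for one run of text, so append.
        Char => sub {
            my ($expat, $text) = @_;
            $cdata .= $text if $in_cdata;
        },
    },
);

$parser->parse($xml);

print "$cdata\n";
```

Text outside the CDATA section goes through the same Char handler, which is why the flag is needed to separate the two.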

 