Velocity Reviews - Computer Hardware Reviews

XML-type parser in the standard Perl installation?

Abhinav
05-27-2004
Hi,

I have a script where some chunks of text are marked between XML-type
tags.

I say "XML-type" instead of XML because the tags are preceded by a comment
character ("# ") so that the script does not fail.

I need to be able to extract the data between the tags (which can be
nested) and store it in a hash, with each key being the tag itself and
the value being the data in between (which is multiline).

I initially tried using Text::Balanced, but gave up because it was too
demanding for this kind of work, with data spanning multiple lines.

I am thinking of stripping the # from all tagged lines so that it
becomes an XML file, adding a root element (which was not present
before), and then using an XML parser.

My questions:
1. Is the approach feasible, or is there some other, simpler way to do it?
(After all, TIMTOWTDI.)
2. If the above is the optimal solution, is there any parser/module
shipped with the standard Perl 5 distribution?

Many thanks,
Abhinav
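
For reference, the hash described in this post can also be built without any XML parser. The following is only a minimal sketch in core Perl, assuming each tag sits on its own line as "# <tag>" or "# </tag>" and that tag names consist of word characters. The helper name extract_tagged and the sample script lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the text between "# <tag>" and "# </tag>"
# marker lines into a hash keyed by tag name. Nested tags are allowed;
# an ordinary line is appended to every tag that is currently open.
sub extract_tagged {
    my (@lines) = @_;
    my (%data, @open);
    for my $line (@lines) {
        if ($line =~ m{^\s*#\s*<(\w+)>\s*$}) {        # opening marker
            push @open, $1;
            $data{$1} = '' unless defined $data{$1};
        }
        elsif ($line =~ m{^\s*#\s*</(\w+)>\s*$}) {    # closing marker
            my $top = pop @open;
            die "Mismatched closing tag </$1>\n"
                unless defined $top && $top eq $1;
        }
        else {
            $data{$_} .= $line for @open;             # ordinary line
        }
    }
    die "Unclosed tag <$open[-1]>\n" if @open;
    return \%data;
}

# Made-up sample lines standing in for the real script.
my @script = (
    "# <setup>\n",
    "load_gui();\n",
    "# <login>\n",
    "type_user();\n",
    "# </login>\n",
    "# </setup>\n",
    "cleanup();\n",
);

my $data = extract_tagged(@script);
print "$_ =>\n$data->{$_}" for sort keys %$data;
```

In this sketch the value stored for an outer tag includes the lines of any tags nested inside it, and lines outside all tags are ignored. Whether that is the desired behaviour depends on how the tags are meant to be used.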

 
John Bokma
05-27-2004
Abhinav wrote:

> Hi,
>
> I have a script where some chunks of text are marked between XML-type
> tags.
>
> I say "XML-type" instead of XML because the tags are preceded by a comment
> character ("# ") so that the script does not fail.


Why not put the XML at the end, after __END__ and read it using <DATA>?

> I need to be able to extract the data between the tags (which can be
> nested) and store it in a hash, with each key being the tag itself and
> the value being the data in between (which is multiline).


Or open your script as a file, read the lines that start with #, throw
away the real comments (you can use ## for real ones, for example), and
parse the result. But I recommend __END__.
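
A rough sketch of the "read the # lines" idea, using only core Perl. It assumes the embedded XML lines start with a single "#", that real comments start with "##", and that everything else in the script can be ignored. The helper name comments_to_xml, the root element name and the sample lines are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only lines that start with a single "#",
# strip that marker, and wrap the result in a root element.
# Lines starting with "##" are treated as real comments and dropped,
# and so are lines with no leading "#" at all.
sub comments_to_xml {
    my ($root, @lines) = @_;
    my $xml = "<$root>\n";
    for my $line (@lines) {
        next if $line =~ /^\s*##/;               # real comment
        next unless $line =~ s/^\s*#[ \t]?//;    # not an embedded XML line
        $xml .= $line;
    }
    return $xml . "</$root>\n";
}

# Made-up sample lines standing in for the script file.
my @script = (
    "## a real comment\n",
    "# <name>login test</name>\n",
    "run_test();\n",
    "# <owner>AB</owner>\n",
);

print comments_to_xml('root', @script);
```

In practice the lines would be read from the script file with open and the readline operator, and the returned string could then be handed to whichever XML parser is chosen.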

> I initially tried using Text::Balanced, but gave up because it was too
> demanding for this kind of work, with data spanning multiple lines.
>
> I am thinking of stripping the # from all tagged lines so that it
> becomes an XML file, adding a root element (which was not present
> before), and then using an XML parser.


Yup, good idea.

> My questions:
> 1. Is the approach feasible, or is there some other, simpler way to do it?
> (After all, TIMTOWTDI.)


Use __END__.

> 2. If the above is the optimal solution, is there any parser/module
> shipped with the standard Perl 5 distribution?


Yes, but I like XML::Twig a lot. Have a look at it.

http://xmltwig.com/xmltwig/

Other pointers:

http://www.xml.com/pub/a/2000/04/05/feature/index.html
http://perl-xml.sourceforge.net/faq/
--

John
MexIT: http://johnbokma.com/mexit/
personal page: http://johnbokma.com/
Experienced Perl programmer available: http://castleamber.com/
 
chanio
05-28-2004
John Bokma (comp.lang.perl.misc) wrote:

> Yes, but I like XML::Twig a lot. Have a look at it.
>
> http://xmltwig.com/xmltwig/
>
> Other pointers:
>
> http://www.xml.com/pub/a/2000/04/05/feature/index.html
> http://perl-xml.sourceforge.net/faq/

Read each line, strip the leading #, and collect everything in a scalar.
Then pass it to XMLin (that function comes from XML::Simple rather than
XML::Twig), which returns the parsed data; XMLout goes the other way and
turns the structure back into XML. Read the documentation carefully, since
there are many options that control how the input is interpreted, but it is
just a matter of trying and adjusting until you get the best result.
You then have everything inside a hash reference (with array references inside).
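
As an illustration of the hash reference mentioned here, the following sketch calls XMLin, which is exported by XML::Simple. That module comes from CPAN and is not part of core Perl, so it has to be installed first. The sample XML and its element names are assumptions standing in for the text left after the # markers are stripped and a root element is added.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);   # CPAN module, not part of core Perl

# Sample input, assumed to be the text left after the "#" markers
# were stripped and a root element was added.
my $xml = <<'XML';
<root>
  <name>login test</name>
  <step>open window</step>
  <step>type password</step>
</root>
XML

# ForceArray makes <step> an array reference even if it occurs once;
# KeyAttr => [] turns off folding of arrays into hashes by attribute.
my $data = XMLin($xml, ForceArray => ['step'], KeyAttr => []);

print "name: $data->{name}\n";
print "step: $_\n" for @{ $data->{step} };
```

By default XMLin discards the root element, so the keys of the returned hash are the child tags. The options shown are only one reasonable starting point, and the XML::Simple documentation lists the others.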

--
Alberto Adrian Schiano - Argentina - 2004
http://perlmonks.org/index.pl?node_id=245320
 
Abhinav
05-28-2004


John Bokma wrote:
> Abhinav wrote:
>
>> Hi,
>>
>> I have a script where some chunks of text are marked between XML-type
>> tags.
>>
>> I say "XML-type" instead of XML because the tags are preceded by a
>> comment character ("# ") so that the script does not fail.

>
>
> Why not put the XML at the end, after __END__ and read it using <DATA>?
>

Hi John,
Thanks! I was not clear when I said "I have a script". I actually
meant that I have a WinRunner script, not a Perl script, in which I wanted
to put these tags (so as to extract information from the WinRunner script
using a Perl script).

> Yes, but I like XML::Twig a lot. Have a look at it.
>
> http://xmltwig.com/xmltwig/
>
> Other pointers:
>
> http://www.xml.com/pub/a/2000/04/05/feature/index.html
> http://perl-xml.sourceforge.net/faq/


Thanks, that gives me enough to do for now. Anyway, it is good to know
that the approach I want to use finds acceptance.

Regards
AB

 