John Bokma (comp.lang.perl.misc) wrote...
> Abhinav wrote:
>
>> Hi
>>
>> I have a script where some chunks of text are marked between xml-type
>> tags.
>>
>> I say 'xml-type' instead of xml as the tags are preceded with a comment
>> character "# " so that the script does not fail.
>
> Why not put the XML at the end, after __END__ and read it using <DATA>?
>
>> I need to be able to extract the data between tags (which can be
>> nested), and store it in a hash with each key being the tag itself and
>> the value, the data in between (it is multiline).
>
> Or open your script as a file, and read the #'s and throw away real
> comments (you can use ## for real ones for example), and parse the
> result. But I recommend __END__
>
>> The problem is that I initially tried using Text::Balanced, but gave up
>> since it was too demanding for this kind of work .. spanning across
>> multiple lines ..
>>
>> I am thinking of stripping the # from all tagged lines so that it
>> becomes an xml file, adding a root element (which was not present
>> before) , and then using an xml parser.
>
> Yup, good idea.
>
>> My questions :
>> 1. Is the approach feasible, or is there some other simpler way to do it
>> .. (after all, TIMTOWTDI)
>
> use __END__
>
>> 2. If the above is the optimal solution, is there any parser/module
>> shipped along with the standard perl (5.
>> distro .. ?
>
> Yes, but I like XML::Twig a lot. Have a look at it.
>
> http://xmltwig.com/xmltwig/
>
> Other pointers:
>
> http://www.xml.com/pub/a/2000/04/05/feature/index.html
> http://perl-xml.sourceforge.net/faq/

Read each line, strip the leading #, and collect everything in a scalar.
Then pass that scalar to XMLin (which belongs to XML::Simple, not XML::Twig).
XMLin returns the parsed XML as a hash reference (with array references
inside); XMLout goes the other way and turns such a structure back into XML.
Read the documentation carefully, since XMLin has many options that control
how the XML is interpreted; it is mostly a matter of trying and adjusting
until you get the result you want.
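
A minimal sketch of those steps, assuming tagged lines begin with "# " and
real comments with "##" (as John suggested). The sample lines, the root
element name "root" and the helper extract_xml are made up for illustration.
XMLin comes from XML::Simple, which is not a core module, so the script only
calls it when the module can be loaded.

```perl
use strict;
use warnings;

# Hypothetical sample: tagged lines start with "# ", real comments with "##".
my @script_lines = (
    '## a real comment, ignored',
    '# <config>',
    '#   <name>demo</name>',
    '#   <note>first line',
    '# second line</note>',
    '# </config>',
    'print "ordinary code";',
);

# Keep only the tagged lines, drop the leading "#" plus one space, and wrap
# the result in a root element (the name "root" is an assumption).
sub extract_xml {
    my @lines = @_;
    my $body  = '';
    for my $line (@lines) {
        next if $line =~ /^##/;            # real comment, skip it
        next unless $line =~ s/^#\s?//;    # not a tagged line, skip it
        $body .= "$line\n";
    }
    return "<root>\n$body</root>\n";
}

my $xml = extract_xml(@script_lines);
print $xml;

# Parse only if XML::Simple (and a parser backend) is available.
if (eval { require XML::Simple; 1 }) {
    my $data = eval { XML::Simple::XMLin($xml) };
    print "name: $data->{config}{name}\n" if ref $data eq 'HASH';
}
```

With the default options XMLin discards the root element, so the keys of the
returned hash are the top-level tags. Options such as ForceArray and KeyAttr
change the shape of the result, so it is worth experimenting with them.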
--
.------------------. -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.
| ___ _ _ _ _ | ALBERTO ADRIAN SCHIANO - ARGENTINA - 2004
|/ __/ | \ || | | | <> # 34-34S 058-25W(z-3)
|||_< \| || ' | | +------------+------------------------------
|`____/|_\_|`___' | LINUX COUNTER: 240 133 ~ machine : 119 401
| _ _ _ __ _ | +------------+----------+-------------------
|| | | \ |\ \/ | AMD Athlon 6 |RAM 512Mb.|krnl.: 2.6.3-10mdk
|| |_ | | \ \ | i586-mandrake-linux-gnu |MDK 9.2 - KDE 3.13
||___||_\_|_/\_\ | +-----------------------+-------------------
| __ __ ___ _ _ | Maxtor #4D040H2 32Gb. |DISPLAY_VGA SiS 630
|| \ \| . \| / | ------------------------+--+----------------
|| || | || \ | PCI Audio snd-trident 7018 | ViewSonic E771
||_|_|_||___/|_\_ | ---------------------------+----------------
| |
http://perlmonks.org/index.pl?node_id=245320
'------------------' -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.