Velocity Reviews - Computer Hardware Reviews

Converting XML to Perl structures FAST

 
 
Ignoramus17503
06-12-2006
Aside from a suggestion to look at RXParse, which I will do, I have
not yet seen what I was looking for, so here's a rephrase of my
question.

I need to convert XML documents to Perl structures, very efficiently,
CPU wise.

I am currently using XML::Simple, which does what I want, but is
slow.

I have looked at various Perl XML FAQs, the manual for XML::LibXML, etc.;
it looks like they parse XML into all kinds of strange (to me) things.

So, here's my question: what Perl module converts XML to a Perl
structure (hashes of hashes of arrays, etc.) and does it very
efficiently?

I am not looking for suggestions to "use google"; I need suggestions
from people who have a real-life answer.

Thanks.

i

 
robic0
06-12-2006
On Mon, 12 Jun 2006 23:30:22 GMT, Ignoramus17503 <> wrote:

>So, here's my question: what perl module converts XML to perl
>structure (hashes of hashes of arrays etc), and does it very
>efficiently.


There are many here that have used Simple. The general procedure is:
use Expat or Parse, set your handlers, set flags in the handlers, and
grab the data when it comes around. Stop the grab when it's gone.
When you hit the tag you need, store the "original" content data
that is passed (appended to a string, with tags), then pass the
entire "original" xml/xhtml (tags and all) to Simple to glean the hash data.
This avoids unnecessary duality.
Does that about cover it?

robic0
(god of porn)
 
robic0
06-12-2006
On Mon, 12 Jun 2006 16:38:03 -0700, robic0 wrote:

>Does that about cover it?

RXParse is just a Create/Filter/Search & Replace (modify) parser.
It won't internalize xml data into a hash, although I did do one of those,
posted here a long time ago (a Simple replacement).

You need to understand that for what you (think) are trying to do you will have
to lead off with parser handlers to "drill down" to the start of the extraction
data, capture it (raw), wait for the finish, then pass the "raw" string to Simple.

That is how it's done buddy.....

robic0
(god of porn)
 
robic0
06-12-2006
On Mon, 12 Jun 2006 16:44:48 -0700, robic0 wrote:

>You need to understand that for what you (think) are trying to do you will have
>to lead off with parser handlers to "drill down" to the start of the extraction
>data, capture it (raw), wait for the finish, then pass the "raw" string to Simple.

postscript:

Usually when you capture sub xml/xhtml in this fashion, you will want to encapsulate
the raw data with a tag before you send it to Simple. Simple invokes a user-selected
parser (Expat is the default, I think), so if the string is non-compliant (for example,
it has no single root element) it will croak/carp on you.
Expat is better than Parse though.

Like:

<root>

captured xml/xhtml

</root>
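
A minimal sketch of the capture-and-wrap idea described above, assuming XML::Parser and XML::Simple are both installed. The sample document, the <record> element name, and the variables $depth, $raw and $grab are made up for illustration. It relies on the original_string method of the expat object, which returns the verbatim source text for the current event. Note that the captured part ends up being parsed twice (once by the handlers, once by Simple), so any CPU saving depends on how much of the document gets skipped.

```perl
use strict;
use warnings;
use XML::Parser;
use XML::Simple;

# Hypothetical input; only the <record> elements are of interest.
my $xml = <<'XML';
<feed>
  <meta>ignored</meta>
  <record><name>foo</name></record>
  <record><name>bar</name></record>
</feed>
XML

my $depth = 0;     # greater than 0 while inside a <record>
my $raw   = '';    # verbatim text of every captured <record>

# Append the verbatim source text of the current event while capturing.
my $grab = sub {
    my ($expat) = @_;
    $raw .= $expat->original_string if $depth;
};

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $tag) = @_;
            $depth++ if $depth || $tag eq 'record';
            $grab->($expat);
        },
        Char => $grab,
        End  => sub {
            my ($expat, $tag) = @_;
            $grab->($expat);
            $depth-- if $depth;
        },
    },
);
$parser->parse($xml);

# Wrap the fragments in one root element so the string is well-formed,
# then let XML::Simple turn it into nested hashes and arrays.
# KeyAttr => [] switches off folding on "name"/"key"/"id".
my $data = XMLin("<root>$raw</root>", ForceArray => ['record'], KeyAttr => []);

print $data->{record}[1]{name}, "\n";
```

The KeyAttr and ForceArray settings are a choice made for this sketch so that every <record> stays an array element; other options may suit other documents better.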

robic0
(god of porn)
 
robic0
06-12-2006
On Mon, 12 Jun 2006 16:50:35 -0700, robic0 wrote:

>Usually when you capture sub xml/xhtml in this fashion, you will want to encapsulate
>the raw data with a tag before you send it to Simple. Simple invokes a user-selected
>parser (Expat is the default, I think). So if it's non-compliant it will croak/carp on you.
>Expat is better than Parse though.

Btw, Simple doesn't know of RXParse, so it won't invoke it. RXParse is faster than Expat
and Parse, which each use a C dll interface. RXParse is a very fast (faster than them) Perl-only
parser. So they will not support it until it's formalized on CPAN or something. I won't take it
to CPAN. I reject the Perl establishment, period. I am going to force the magpies to come to me!!!

robic0
(god of porn)
 
Ignoramus17503
06-13-2006
On Mon, 12 Jun 2006 17:13:26 -0700, Jim Gibson <> wrote:
> In article <iOmjg.14484$>, Ignoramus17503
><> wrote:
>
>> Aside from a suggestion to look at RXParse, which I will do, I have
>> not yet seen what I was looking for, so here's a rephrase of my
>> question.

>
> Why have you started a new thread?
>
> Please ignore any post from robic0 and don't consider trying to use
> RXParse. If you do, you will get no help from anyone here (including
> robic0).
>
> I have used XML::Parser and the expat library
> (<http://expat.sourceforge.net/>) to parse XML. I started developing a
> program that uses XML by using XML::SAX::PurePerl. It worked on small
> test files. When I was ready to test on larger files, I installed the
> expat library and used XML::Parser. The speed-up was about a factor of
> 40.


Do you have any code sample that you could share?

>>
>> I need to convert XML documents to Perl structures, very efficiently,
>> CPU wise.
>>
>> I am currently using XML::Simple, which does what I want, but is
>> slow.
>>
>> I have looked at various Perl XML FAQs, manual for XML::LibXML, etc,
>> looks like they parse XML into all kinds of strange (to me) things.
>>
>> So, here's my question: what perl module converts XML to perl
>> structure (hashes of hashes of arrays etc), and does it very
>> efficiently.

>
> SAX parsers do not produce Perl data structures. They call your
> routines on each element. You then store the data in your own
> structures. It is very efficient, but I do not have any experience with
> XML::Simple or XML::Twig, so cannot give you a comparison.
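
For anyone reading along, the handler pattern Jim describes could look roughly like the sketch below. It assumes XML::Parser is installed; the sample document and the names @items, $current and $text are made up for illustration and are not taken from anyone's code in this thread.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample document; the element names are invented.
my $xml = <<'XML';
<items>
  <item id="1"><name>foo</name><price>10</price></item>
  <item id="2"><name>bar</name><price>20</price></item>
</items>
XML

my @items;        # the structure we build ourselves
my $current;      # hashref for the <item> being read, if any
my $text = '';    # character data collected for the current leaf element

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $tag, %attr) = @_;
            $current = { id => $attr{id} } if $tag eq 'item';
            $text = '';
        },
        Char => sub {
            my ($expat, $string) = @_;
            $text .= $string;    # text may arrive in several chunks
        },
        End => sub {
            my ($expat, $tag) = @_;
            if ($tag eq 'item') {
                push @items, $current;    # item complete, store it
                undef $current;
            }
            elsif ($current) {
                $current->{$tag} = $text; # leaf child of <item>
            }
            $text = '';
        },
    },
);

$parser->parse($xml);

printf "%s costs %s\n", $_->{name}, $_->{price} for @items;
```

Nothing is kept except what the handlers store, which is where the efficiency comes from, at the price of writing the storage logic by hand.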


Thank you for the tips.

I installed XML::Twig, and things seem, so far, to be a lot faster and
CPU use is way down. It is not quite as easy to use, but I can live
with it.
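
Roughly what the XML::Twig version looks like, as a sketch rather than my actual code: the sample document and field names are invented, and it assumes the interesting data sits in repeated <item> elements. The handler runs once per completed <item>, copies out what is needed, and purges what has been parsed so far to keep memory down. XML::Twig elements also have a simplify method that returns an XML::Simple-like structure, which may be closer to a drop-in replacement, but it is not used here.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document; the element names are invented.
my $xml = <<'XML';
<items>
  <item id="1"><name>foo</name><price>10</price></item>
  <item id="2"><name>bar</name><price>20</price></item>
</items>
XML

my @items;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once each <item> element has been completely parsed.
        item => sub {
            my ($t, $elt) = @_;
            push @items, {
                id    => $elt->att('id'),
                name  => $elt->first_child_text('name'),
                price => $elt->first_child_text('price'),
            };
            $t->purge;    # free everything parsed so far
        },
    },
);

$twig->parse($xml);

printf "%s costs %s\n", $_->{name}, $_->{price} for @items;
```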

I am now running my process, which repeatedly parses large XML
structures; it will run for the rest of the evening. Time will tell if it
slows down with more parsed documents, maybe due to memory leaks or
who knows what.

i

 
robic0
06-13-2006
On Mon, 12 Jun 2006 17:13:26 -0700, Jim Gibson <> wrote:

>Please ignore any post from robic0 and don't consider trying to use
>RXParse. If you do, you will get no help from anyone here (including
>robic0).
>
>SAX parsers do not produce Perl data structures. They call your
>routines on each element. You then store the data in your own
>structures. It is very efficient, but I do not have any experience with
>XML::Simple or XML::Twig, so cannot give you a comparison.


I'm gonna let this slight go Jim, consider yourself lucky!!!

Since you do not have experience with Simple or Twig, I consider
your post a light easy breeze that fades with the tides. I would not
have followed your comments but for the "indirect" reference to ignore
robic0 entirely! I have a long memory and will not forget this.

I've been purposely absent and choose posts now based on my expertise.
I have one now. A really big, complicated one.

He never mentioned SAX (simple api xml), why did you? You don't know
what xml is and you will never. I don't take kindly to personal attacks!
The next one and I will rip you a new asshole!!!!!!

robic0
(god of porn)
 
robic0
06-13-2006
On Tue, 13 Jun 2006 00:25:17 GMT, Ignoramus17503 <> wrote:

>I installed XML::Twig, and things seem, so far, to be a lot faster and
>CPU use is way down. It is not quite as easy to use, but I can live
>with it.

Oh, I thought your intention was to create data structures? Or was it to
parse xml? Or create data structures from parsed xml?
It's a really, really hard thing to get the simplest of
simple answers from you.

Plenty of folks here (that I know of) know these answers intimately,
me among them.

You are indeed an Ebiotch asshole !!!!!!!!!!!
(this line above won't get you anymore answers)

robic0
(god of porn)
 
Ignoramus17503
06-13-2006
On Mon, 12 Jun 2006 17:39:41 -0700, robic0 <robic0> wrote:

> Oh I thought your intention was to create data structures? Or was it to
> parse xml? Or create data structures from parsed xml?


My goal was to parse XML into usable data structures.

i

 
robic0
06-13-2006
On Tue, 13 Jun 2006 00:42:54 GMT, Ignoramus17503 <> wrote:

>My goal was to parse XML into usable data structures.


It's been said over and over again in the last hour....

This is a little info for you. It is extremely *HARD*
to divine xml into data structures!!!!!!!!

Given such a flat requirement, pointing you to a module that does
so will prove useless to the requester!

DO YOU NOT FLUCKIN UNDERSTAND THAT??????????????????????

robic0
(god of porn)
 