memory management

 
 
Ted Byers
      12-22-2008
ActiveState's Perl 5.10.0 on Windows XP.

I have recently found a couple of my scripts failing with out-of-memory
error messages, notably with XML::Twig.

This makes no sense, since the files being processed are only on the
order of a few dozen megabytes to a maximum of 100 MB, and the system
has 4 GB of RAM. The machine is not especially heavily loaded (e.g.,
most of the time, when these scripts fail, they have been running
overnight with nothing else running except, of course, the OS).

Curiously, I have yet to find anything useful in the ActiveState
documentation for ActivePerl 5.10.0 regarding memory management. Is
there anything, or any package, that I can use to tell me what is
going awry and how to fix it? I didn't see any likely candidates
using PPM and CPAN. It would be nice if I could have my script tell
me how much memory it is using, and for which data structures. Or
must I remain effectively blind and just split the task into smaller
tasks until it runs to completion on each?

Thanks

Ted
 
A. Sinan Unur
      12-22-2008
Ted Byers <(E-Mail Removed)> wrote:

> Activestate's perl 5.10.0 on WXP.
>
> I have recently found a couple of my scripts failing with out of
> memory error messages, notably with XML::Twig.
>
> This makes no sense since the files being processed are only of the
> order of a few dozen megabytes to a maximum of 100MB, and the system
> has 4 GB RAM. The machine is not especially heavily loaded (e.g.,
> most of the time, when these scripts fail, they have executed over
> night with nothing else running except, of course, the OS - WXP).


This seems to be a FAQ:

http://xmltwig.com/xmltwig/XML-Twig-FAQ.html#Q12

http://xmltwig.com/xmltwig/XML-Twig-FAQ.html#Q21

http://tomacorp.com/perl/xml/saxvstwig.html

That last page reports memory usage of 12 MB for a 614 KB input file,
roughly 20 times the size of the file.

Sinan

--
A. Sinan Unur <(E-Mail Removed)>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/
 
sln@netherlands.com
      12-22-2008
On Mon, 22 Dec 2008 10:05:01 -0800 (PST), Ted Byers <(E-Mail Removed)> wrote:

>It would be nice if I could have my script tell
>me how much memory it is using, and for which data structures.

You can check data structure sizes with some Devil:: packages.

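use Devel::Size qw( total_size );
# build an array or create objects, then pass a reference to it:
my @records = map { +{ id => $_ } } 1 .. 1000;
print total_size(\@records), "\n";   # size in bytes, nested data included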

Twig does its own special memory management. Mostly it builds
node trees in memory, but it might have hybrid qualities as well.
This adds tremendous memory overhead, probably on the order of 10-50 to
1, depending on what you're doing.

Another consideration is what you're doing in the code. Are you making
temporaries all over the place?

By and large, 100 MB of raw data can translate into a gigabyte or
more with all the overhead.
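
To put numbers on that estimate, here is a small sketch of the arithmetic. The 10:1 and 50:1 ratios are the guesses given above, and 20:1 is roughly what the 12M-for-614K figure cited earlier in the thread works out to. The function name estimate_mb is made up for illustration, and none of these ratios is a measured property of XML::Twig.

```perl
use strict;
use warnings;

# Estimate the in-memory footprint (in MB) of a fully loaded document
# from its size on disk and an ASSUMED expansion ratio.
sub estimate_mb {
    my ($file_mb, $ratio) = @_;
    return $file_mb * $ratio;
}

# Ratios are guesses from the thread, not measurements.
for my $ratio (10, 20, 50) {
    printf "100 MB file at %2d:1 -> about %d MB in memory\n",
        $ratio, estimate_mb(100, $ratio);
}
```

The real ratio depends on the document's shape (many small elements cost more than a few large text nodes), so measuring a representative small file with Devel::Size is more reliable than any rule of thumb.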

sln

 
Tad J McClellan
      12-22-2008
(E-Mail Removed) <(E-Mail Removed)> wrote:


> You can check data structure sizes with some Devil:: packages.



But those only work on October 31st...


--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
 
sln@netherlands.com
      12-22-2008
On Mon, 22 Dec 2008 13:24:39 -0600, Tad J McClellan <(E-Mail Removed)> wrote:

>(E-Mail Removed) <(E-Mail Removed)> wrote:
>
>
>> You can check data structure sizes with some Devil:: packages.

>
>
>But those only work on October 31st...


Oh, maybe just 1 then. I'm not a Devil fan so dunno.

sln
 
Ted Byers
      12-22-2008
On Dec 22, 1:42 pm, "A. Sinan Unur" <(E-Mail Removed)> wrote:
> This seems to be a FAQ:
>
> http://xmltwig.com/xmltwig/XML-Twig-FAQ.html#Q12
>
> http://xmltwig.com/xmltwig/XML-Twig-FAQ.html#Q21
>
> http://tomacorp.com/perl/xml/saxvstwig.html
>
> Reports memory usage of 12M for a 614K input file.

Ah, OK. I hadn't thought it was specific to Twig, since I had seen memory
issues in other scripts using LWP. I thought maybe Perl, or
ActiveState's distribution of it, might have some issues, because
each of the scripts that ran into trouble was handling only a few
MB, and ran perfectly when working with contrived data of only a few
hundred KB.

Thanks, I'll take a look there too.
 
Ted Byers
      12-22-2008
On Dec 22, 1:53 pm, (E-Mail Removed) wrote:
> This adds tremendous memory overhead, probably on the order of 10-50 to
> 1, depending on what you're doing.
>
> By and large, 100 MB of raw data can translate into a gigabyte or
> more with all the overhead.

Thanks.

Actually, the script giving the most trouble just uses Twig to
parse an XML file and write the data to flat, tab-delimited files that
are used to bulk load the data into our DB (the load itself is done by a
SQL script passed to a command-line client in a separate process).
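
For a job shaped like this, the XML::Twig documentation describes handling one element at a time with twig_handlers and then calling purge, so the whole tree never sits in memory at once. A minimal sketch follows, assuming a made-up layout in which each row is a <record> with <id> and <name> children. The element names and the xml_to_tsv helper are hypothetical, not taken from the actual feed.

```perl
use strict;
use warnings;
use XML::Twig;

# ASSUMED input layout (hypothetical):
#   <feed><record><id>..</id><name>..</name></record>...</feed>
# Writes one tab-delimited line per <record> to the given filehandle.
sub xml_to_tsv {
    my ($xml, $out) = @_;
    my $twig = XML::Twig->new(
        twig_handlers => {
            # Called each time a complete <record> has been parsed.
            record => sub {
                my ($t, $rec) = @_;
                print {$out} join("\t",
                    $rec->first_child_text('id'),
                    $rec->first_child_text('name')), "\n";
                $t->purge;    # discard everything parsed so far
            },
        },
    );
    $twig->parse($xml);    # for a file on disk: $twig->parsefile($path)
    return;
}

my $sample = '<feed><record><id>1</id><name>alpha</name></record>'
           . '<record><id>2</id><name>beta</name></record></feed>';
xml_to_tsv($sample, \*STDOUT);
```

Without the purge call, the handler still runs per record but the tree keeps growing until the end of the document, which is where the large multiples of the file size come from.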

Usually, when this script is executed, about half of the 4 GB
of physical memory is free, so even with the numbers you give, we ought
to have plenty of memory available. In fact, I have yet to see
anything less than 1.5 GB of free memory even when I am working my system
hard (the bottleneck is usually HDD I/O, regardless of the language
I'm using).

Thanks again,

Ted
 
sln@netherlands.com
      12-22-2008
On Mon, 22 Dec 2008 12:39:01 -0800 (PST), Ted Byers <(E-Mail Removed)> wrote:

>Actually, the script giving the most trouble is just using Twig to
>parse an XML file and write the data to flat, tab delimited files to
>be used to bulk load the data into our DB (but that is done using a
>SQL script passed to a command line client in a separate process).

Be careful when you say Twig and Parse in the same sentence.
Although I think Twig does its own parsing on some level, it can
use other parsers if directed. The unique thing about Twig is its
ability to do its own parsing. How it does that I don't know.
What it means is that it can introduce tools outside of
mainstream SAX parsers. How it does that is unknown to me; I'm not
really interested. This results in the ability to do stream as well as
buffered processing, culminating in a node tree, possibly an illusory
object in the hybrid sense. But the node tree is the result. There are
performance issues. It can also search, like XPath, and replace, then
rewrite XML. This is no small feat.

I am in the process of writing similar tools, but mine captures, does
SAX, does search and replace with regular expressions, and some other stuff.
I can tell you it's fairly complicated. The reward, though, is just phenomenal.
I manage memory differently. And I do other things than Twig.

Perhaps you could post a skeleton structure of what it is you're doing
and I could run it through my routines.

You could, however, do this all yourself with a fast SAX parser.
The fastest parser on the planet is Expat: not the Perl interface to it,
which is 6 times slower, but Expat used directly from C/C++.
Unfortunately, all it does is parse; it's really a tremendously impaired work,
lacking any tools whatsoever.
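
For comparison, here is roughly what the event-driven route looks like from Perl, using XML::Parser, the Perl interface to Expat (XML::Twig is itself layered on top of XML::Parser). This is only a sketch: the <name> element and the names_from helper are invented for illustration. Nothing is kept in memory except what the handlers choose to keep.

```perl
use strict;
use warnings;
use XML::Parser;

# Collect the text of every <name> element (a hypothetical element)
# using Start/Char/End callbacks, without building a tree.
sub names_from {
    my ($xml) = @_;
    my @names;
    my $in_name = 0;
    my $buffer  = '';
    my $parser  = XML::Parser->new(
        Handlers => {
            Start => sub {
                my ($expat, $element) = @_;
                if ($element eq 'name') { $in_name = 1; $buffer = ''; }
            },
            Char => sub {
                my ($expat, $text) = @_;
                # Text may arrive in several chunks, so accumulate it.
                $buffer .= $text if $in_name;
            },
            End => sub {
                my ($expat, $element) = @_;
                if ($element eq 'name') { push @names, $buffer; $in_name = 0; }
            },
        },
    );
    $parser->parse($xml);    # $parser->parsefile($path) reads from disk
    return @names;
}

print "$_\n" for names_from(
    '<feed><record><name>alpha</name></record>'
  . '<record><name>beta</name></record></feed>');
```

The price is the bookkeeping: the handlers have to track where they are in the document themselves, which is the work a tree-building module otherwise does for you.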

sln

 
Ted Byers
      12-22-2008
On Dec 22, 4:12 pm, (E-Mail Removed) wrote:
> Perhaps you could post a skeleton structure of what it is you're doing
> and I could run it through my routines.

OK, I'll work up a skeleton after dinner (once I'm not on the clock).
Basically, I get a data feed, in well-formed XML, and I need to get
that data into our DB. This feed consists of over 100 XML files,
ranging from less than 1 KB to several dozen MB. Since I have no
direct connection between the feed and the DB (which lacks the ability
to import XML data), I resorted to reading the XML and writing
tab-delimited files, which the DB can bulk load in a flash (it is PDQ
with this bulk load).
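
One detail worth a sketch in a pipeline like this is the tab-delimited line itself, since a stray tab or newline inside a field will shift columns at load time. The helper below, tsv_line, is hypothetical, and it assumes the bulk loader understands backslash escapes for tabs, newlines and backslashes. That is an assumption to check against the loader's documentation.

```perl
use strict;
use warnings;

# Build one tab-delimited line from a list of field values.
# ASSUMPTION: the loader accepts \t, \n and \\ as escapes.
sub tsv_line {
    my @fields = @_;
    for (@fields) {
        $_ = '' unless defined $_;   # undefined values become empty fields
        s/\\/\\\\/g;                 # escape backslashes first
        s/\t/\\t/g;                  # then embedded tabs
        s/\r?\n/\\n/g;               # then embedded line breaks
    }
    return join("\t", @fields) . "\n";
}

print tsv_line(42, "two\twords", "line one\nline two");
```

Escaping backslashes before the other two substitutions matters; in the other order the backslashes introduced for tabs and newlines would themselves be doubled.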

Maybe it is blasphemy here, but C++ is one of my favourite programming
languages.

I respect guys like you and your efforts with XML. You're strong in
an area where I am challenged. One of the things I always hated
doing was writing code to parse and validate input. My forte is in
making numeric algorithms fast (hence my preference for Fortran and
C++). I believe you when you say it is complicated, and would be very
interested in hearing about the rewards you describe as phenomenal.
Maybe I'll develop a taste for it?

Anyway, this relates to one of the things I find frustrating in modern
application development. I can define a suite of interrelated
data structures (picture a properly normalized database with dozens
of tables). The frustration is that I have to waste time repeating this:
in SQL to set up the tables, in classes in (pick one of C++, Java,
Perl, your favourite OO language) for use in business logic, and then
again in the user interface. And of course, XML can be added to the
mix, for communicating between layers (back end, business layer, GUI,
etc.). The data and the relationships in it remain the same, and it is
quite tedious to duplicate them in so many languages used in the
different layers.

Thanks

Ted
 
A. Sinan Unur
      12-22-2008
Ted Byers <(E-Mail Removed)> wrote:

> The frustration is that I have to waste time repeating this,
> in SQL to set up the tables, in classes in (pick one of C++, Java,
> Perl, your favourite OO language) for use in business logic, and then
> again in the user interface. And of course, XML can be added to the
> mix, for communicating between layers


http://www.google.com/search?&q=site...ilywtf.com+xml

--
A. Sinan Unur <(E-Mail Removed)>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/
 