Velocity Reviews - Computer Hardware Reviews

Script "terminates" when processing large numbers of files

 
 
Scott Stark
      08-02-2003
Hi, I'm running a script that reads through large numbers of HTML
files (1500-2000 or so) in each of about 20 directories, searching for
strings in the files.

For some reason the script quits midway through, and I get a
"Terminated" message. It quits while checking a batch of files at a
different point in the file system every time, so I know it's not a
code error. In fact if I limit the total number of files processed to
a couple of hundred, the script runs fine.

Is this some kind of memory problem or other resource problem? I've
tried breaking up each directory pass into separate subroutine calls,
and even broken up the individual directory lists so that they process
in smaller batches of 300 each, thinking that might free up resources.
Something like this:

foreach $d (@dirs) {
    my @files = glob("$basedir/$d/*.html $basedir/$d/*.htm");
    if (scalar(@files) > 300) {
        ... # make smaller lists called my(@shortList) of 300 each
        search_files(@shortList);
    }
}

sub search_files {
    my @files = @_;
    ... # search through each file
}

I've tried running the script under perl -d and with #!/usr/bin/perl -w.
Neither reports any errors, and I get the same result, but at a
different point in the file system each time.

Any thoughts? If it's a memory problem, is there some way to free up
memory?
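
One way to narrow this down: a bare "Terminated" message is typically what the shell prints when a process is ended by SIGTERM, which would suggest something outside the script is stopping it. The following is a minimal sketch, not a fix, that records where the script was when the signal arrived. The names $current_file, $files_done and report_signal are hypothetical, and @ARGV stands in for the real file list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Updated by the main loop so the handler knows where the script was.
my $current_file = '(none)';
my $files_done   = 0;

# Build and print a one-line report for the signal that was caught.
sub report_signal {
    my ($sig) = @_;
    my $msg = "Caught SIG$sig after $files_done files, while on $current_file";
    print STDERR "$msg\n";
    return $msg;
}

# Log the position, then exit, when SIGTERM is delivered.
$SIG{TERM} = sub { report_signal('TERM'); exit 1 };

# Hypothetical main loop: @ARGV stands in for the list of files to search.
for my $file (@ARGV) {
    $current_file = $file;
    # ... search the file here ...
    $files_done++;
}
```

If the handler fires, the termination is coming from outside Perl (for example a resource limit or a watchdog on the server), and the logged position shows how far the run got.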

thanks,
Scott
 
Scott Stark
      08-03-2003
Tim Heaney wrote:
> Perhaps the glob is hitting the expansion limit. Try reading the
> directory yourself...something like
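
Tim's code is not reproduced in this thread. As an illustration only, reading a directory by hand instead of calling glob() might look like the sketch below. The function name list_html_files is hypothetical, and the pattern is an assumption chosen to match the same *.html and *.htm names as the original glob.

```perl
use strict;
use warnings;

# Return the full paths of the .html/.htm files in one directory,
# read with opendir/readdir instead of glob().
sub list_html_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!";
    my @names = grep { /\.html?\z/ && -f "$dir/$_" } readdir($dh);
    closedir($dh);
    my @files = sort { $a cmp $b } map { "$dir/$_" } @names;
    return @files;
}
```

A call such as list_html_files("$basedir/$d") would then take the place of the glob() line in the loop over @dirs.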


Hi Tim, well that didn't work either. I've done some further testing
and discovered that the "termination" is happening not in the glob (or
read) but in the search_files() subroutine, always (as far as I can
gather) after it's closed one file in the @files list and before it
opens the next.

Here's an abbreviated version of the search_files subroutine that's
called for each directory:

sub search_files {
    my(@files) = @_;
    my(@searchStrings) = split(/\s+/, param('terms'));
    foreach $f (@files) {
        open(F, "$f") || on_error("Can't open file $f for reading");
        while ($line = <F>) {
            for ($s = 0; $s < scalar(@searchStrings); $s++) {
                # Highlight the term and remember the line it was found in.
                $line =~ s/($searchStrings[$s])/<font color=\"blue\"><b>$1<\/b><\/font>/gi
                    and $found{$searchStrings[$s]} = $line
                    and next if ($line =~ /$searchStrings[$s]/i);
            }
        }
        close(F);
    }
}
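
For comparison, here is a hedged sketch of the same loop written with lexical filehandles and precompiled patterns. It is not claimed to cure the termination. It makes several assumptions that differ from the posted code: the terms are passed in as an array reference rather than read from param('terms'), an unreadable file is skipped with a warning instead of calling on_error(), the terms are matched literally (via \Q...\E) rather than as regular expressions, and the results are returned instead of stored in a global %found.

```perl
use strict;
use warnings;

# Highlight each search term in the lines of the given files.
# Returns a hash ref mapping each term to the last highlighted line
# that contained it.
sub search_files {
    my ($terms, @files) = @_;

    # Compile each term once; \Q...\E makes metacharacters literal.
    my %pattern = map { $_ => qr/(\Q$_\E)/i } @$terms;

    my %found;
    for my $file (@files) {
        my $fh;
        unless (open($fh, '<', $file)) {
            warn "Can't open file $file for reading: $!";
            next;
        }
        while (my $line = <$fh>) {
            for my $term (@$terms) {
                my $re = $pattern{$term};
                # s///g returns true only if at least one match was replaced.
                if ($line =~ s{$re}{<font color="blue"><b>$1</b></font>}g) {
                    $found{$term} = $line;
                }
            }
        }
        close($fh);
    }
    return \%found;
}
```

Because each handle is lexical and closed per file, nothing from one file is carried over to the next except the entries in %found.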

Not much unusual going on here -- perhaps Gregory is correct, there's
a time limit? The whole thing never takes more than a couple of
minutes though. And where it stops varies every time.
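
To test the time-limit idea, the script could log how much wall-clock and CPU time it has used as it goes. This is a minimal sketch under assumptions: progress_report is a hypothetical helper, and @ARGV stands in for the list of directories.

```perl
use strict;
use warnings;

my $start = time();    # wall-clock start, in seconds since the epoch

# Return a one-line report: elapsed seconds and CPU seconds used so far.
sub progress_report {
    my ($label) = @_;
    my ($user, $system) = times;    # CPU seconds consumed by this process
    return sprintf "%s: %d s elapsed, %.2f s CPU",
        $label, time() - $start, $user + $system;
}

# Hypothetical use: report after each directory has been searched.
for my $dir (@ARGV) {
    # ... search the files in $dir here ...
    print STDERR progress_report($dir), "\n";
}
```

If the last report before each termination shows roughly the same elapsed or CPU figure even though the position in the file system varies, that would point toward a limit imposed on the process rather than a problem with any particular file.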

thanks
Scott
 