optimize log parsing

 
 
it_says_BALLS_on_your forehead
      10-04-2005
i have roughly 400 logs to parse, with the size of each log ranging
from 200 bytes to 263,000,000 bytes.

i have 20 processes that i can allocate to log parsing.

i have a hash that has the log file name as the key, and its size in
bytes as the value. ( i can sort the entries by size, or use an array
if that's better suited to the solution. )

$file{'/var/log/log1'} => 200
$file{'/var/log/log2'} => 210
...
$file{'/var/log/log400'} => 262000000

i would like to feed each process roughly the same amount of data (i.e.
a certain queue of log files to parse, such that the sum of the bytes
of the log files for each process are roughly equivalent).

so let's just say there are 20 buckets, if that will make things more
comprehensible.

i calculate the sum of the sizes of all the logs, and it was about 7.6
GB. divided among 20 buckets, that's roughly 380 MB per bucket.

one caveat is that i can't break up files. these logs are gzipped, and
i can't split them while zipped. i tried unzipping the 262 MB one (that
took about 2 min 40 sec), and then splitting into files that were
600,000 lines each. that took over 10 min. after running (in Unix)
split -l 600000 file small_file, and waiting 10 min, i canceled, and
only 3 files were written so far! it probably would have taken 20
minutes or more to finish, and nothing was even being parsed yet.
unacceptable.

basically this problem can be reduced to figuring out a way to fit 400
discrete numbers into 20 buckets where the count of the numbers does
not matter, but the sum of the numbers should be roughly equal.

one way of accomplishing this is to sort descending, and loop through
the 400 files, putting the largest one in the first bucket, and going
down the line, putting the largest one that will fit into the current
bucket until you get to the end. granted, this is not perfect--perhaps
if you had waited, 2 smaller ones would have brought you closer to your
bucket-size goal. then you go to the next bucket, and do the same
thing. this is an O(n^2) solution though. i was hoping that there was a
more elegant, less brute-force method. perhaps the ideal solution
would be to try every way of grouping the 400 logs and find which
grouping results in the most equal 20 buckets, but that would take way
too much computation time.

using the method i outlined (not the combinations one, the one before
that), i would have to go through the list of 400 logs for the 1st
bucket, then let's say 395 logs for the 2nd bucket, etc. until the 20th
bucket. this really isn't that much, and it's unlikely that the number
of logs and processes will expand by orders of magnitude, so is it even
worth it to expend the effort finding this algorithm? any help would be
appreciated
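
For reference, the sort-descending, fill-one-bucket-at-a-time method described above might look roughly like the sketch below. The sub name `fill_buckets`, the sample sizes, and the rule for files that fit nowhere (they go to whichever bucket is lightest at that point) are assumptions made for illustration, not something settled in this post.

```perl
use strict;
use warnings;
use List::Util qw(sum0);

# Hypothetical helper: distribute files over $n buckets, filling one
# bucket at a time with the largest files that still fit under the target.
sub fill_buckets {
    my ($size_of, $n) = @_;
    my @todo = sort { $size_of->{$b} <=> $size_of->{$a} || $a cmp $b }
               keys %$size_of;
    my $target  = sum0(values %$size_of) / $n;   # e.g. 7.6 GB / 20 = 380 MB
    my @buckets = map { +{ total => 0, files => [] } } 1 .. $n;

    for my $bucket (@buckets) {
        my @left;
        for my $f (@todo) {                      # largest first
            if ($bucket->{total} + $size_of->{$f} <= $target) {
                push @{ $bucket->{files} }, $f;
                $bucket->{total} += $size_of->{$f};
            }
            else {
                push @left, $f;                  # try again in a later bucket
            }
        }
        @todo = @left;
    }

    # Assumption: anything that fit nowhere goes to the lightest bucket.
    for my $f (@todo) {
        my ($lightest) = sort { $a->{total} <=> $b->{total} } @buckets;
        push @{ $lightest->{files} }, $f;
        $lightest->{total} += $size_of->{$f};
    }
    return \@buckets;
}

# Made-up sizes, small enough to check by hand.
my %file = (a => 90, b => 60, c => 50, d => 40, e => 30, f => 30);
my $buckets = fill_buckets(\%file, 3);
printf "%-12s %d\n", join(',', @{ $_->{files} }), $_->{total} for @$buckets;
```

Each bucket rescans whatever is left of the list, which is where the roughly (buckets x files) iteration count comes from.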

 
xhoster@gmail.com
      10-04-2005
"it_says_BALLS_on_your forehead" <> wrote:
> i have roughly 400 logs to parse, with the size of each log ranging
> from 200 bytes to 263,000,000 bytes.
>
> i have 20 processes that i can allocate to log parsing.
>
> i have a hash that has the log file name as the key, and its size in
> bytes as the value. ( i can sort the entries by size, or use an array
> if that's better suited to the solution. )
>
> $file{'/var/log/log1'} => 200
> $file{'/var/log/log2'} => 210
> ...
> $file{'/var/log/log400'} => 262000000
>
> i would like to feed each process roughly the same amount of data (i.e.
> a certain queue of log files to parse, such that the sum of the bytes
> of the log files for each process are roughly equivalent).
>
> so let's just say there are 20 buckets, if that will make things more
> comprehensible.
>
> i calculate the sum of the sizes of all the logs, and it was about 7.6
> GB. divided among 20 buckets, that's roughly 380 MB per bucket.


I wouldn't bother with this 'bucket' stuff at all. Just do it on the fly.
By addressing the files in the appropriate order (from most work to least
work) you ensure nearly optimal processing. In fact, because there is no
guarantee that the actual time for a file to be processed is exactly
proportional to the file size, balancing on the fly is almost surely going
to be better than some precomputed balancing based on the assumption that
size = time.

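use Parallel::ForkManager;

# at most 20 children alive at any one time
my $pm = Parallel::ForkManager->new(20);

# largest files first
foreach my $file (sort { $files{$b} <=> $files{$a} } keys %files) {
    $pm->start and next;    # parent: move on to the next file
    ## Process the $file (this part runs in the child)
    $pm->finish;            # terminates the child process
}
$pm->wait_all_children;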

...

> basically this problem can be reduced to figuring out a way to fit 400
> discrete numbers into 20 buckets where the count of the numbers does
> not matter, but the sum of the numbers should be roughly equal.
>
> one way of accomplishing this is to sort descending, and loop through
> the 400 files, putting the largest one in the first bucket, and going
> down the line, putting the largest one that will fit into the current
> bucket until you get to the end. granted, this is not perfect--perhaps
> if you had waited, 2 smaller ones would have brought you closer to your
> bucket-size goal. then you go to the next bucket, and do the same
> thing. this is a O = n^2 solution though.


How so? Sorting the files is N ln N and only has to be done once. Finding
the biggest file that fits the current bucket is ln N using a binary
search into the sorted files. You have one N ln N operation, followed by N
operations that are each ln N. That comes out to N ln N overall.
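
To make the ln N step concrete, here is a rough sketch of a binary search for the largest file that still fits. It assumes the sizes are held in an array sorted ascending; `largest_fitting` and the sample sizes are hypothetical.

```perl
use strict;
use warnings;

# Return the index of the largest size <= $room in an ascending-sorted
# array, or -1 if nothing fits.
sub largest_fitting {
    my ($sizes, $room) = @_;
    my ($lo, $hi, $best) = (0, $#$sizes, -1);
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($sizes->[$mid] <= $room) {
            $best = $mid;       # fits; look for something bigger
            $lo   = $mid + 1;
        }
        else {
            $hi = $mid - 1;     # too big; look lower
        }
    }
    return $best;
}

my @sizes = (200, 210, 5_000, 120_000, 262_000_000);   # made-up sizes in bytes
my $i = largest_fitting(\@sizes, 150_000);
print "largest file that fits: $sizes[$i] bytes\n" if $i >= 0;
```

One caveat: removing the chosen file from a plain array with `splice` is itself linear in the array length, so keeping the whole thing at N ln N would need a structure with cheaper deletion.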

But anyway, why try to preordain the order in which the buckets become
empty or full? Start processing the 20 biggest files. When one of them
finishes (regardless of which one it is), start the next file.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
 
it_says_BALLS_on_your forehead
      10-05-2005
> I wouldn't bother with this 'bucket' stuff at all. Just do it on the fly.
> By addressing the files in the appropriate order (from most work to least
> work) you ensure nearly optimal processing. In fact, because there is no
> guarantee that the actual time for a file to be processed is exactly
> proportional to the file size,


you're right here, although i had the idea that i could weight certain
parse methods by multiplying each log size by a coefficient. the
coefficient would be derived by dividing the average speed of a parse
method by the average speed of the slowest parse method.
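
a rough sketch of that weighting idea follows. it reads "speed" as the average time a parse method needs per megabyte, so the slowest method gets a coefficient of 1 and faster methods get less. the method names, timings, and file list are all invented for illustration.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical average seconds per MB for each parse method.
my %sec_per_mb = (apache => 0.6, syslog => 0.3, custom => 1.2);
my $slowest    = max(values %sec_per_mb);

# Coefficient = a method's time per MB relative to the slowest method.
my %coef = map { ( $_ => $sec_per_mb{$_} / $slowest ) } keys %sec_per_mb;

# Hypothetical logs: [name, size in bytes, parse method].
my @logs = (
    [ '/var/log/log1', 200_000_000, 'syslog' ],
    [ '/var/log/log2', 100_000_000, 'custom' ],
    [ '/var/log/log3', 150_000_000, 'apache' ],
);

# The weighted size approximates work to be done rather than raw bytes.
my %weighted = map { ( $_->[0] => $_->[1] * $coef{ $_->[2] } ) } @logs;

printf "%-15s %12.0f\n", $_, $weighted{$_}
    for sort { $weighted{$b} <=> $weighted{$a} } keys %weighted;
```

with these made-up numbers the 100 MB log ranks as the most work, even though it is the smallest file, because its parse method is the slowest.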

> balancing on the fly is almost surely going
> to be better than some precomputed balancing based on the assumption that
> size = time.
>
> $pm = new Parallel::ForkManager(20);
>
> foreach $file (sort {$files{$b}<=>$files{$a}} keys %files) {
> my $pid = $pm->start and next;
> ##Process the $file
> $pm->finish; # Terminates the child process
> }
> $pm->wait_all_children;
>
> ...


i admit i'm not too familiar with Threads/Forks (the only fork i use is
the one called from system() ). also, i've read that Perl threading
isn't too stable. i've looked on the web a little, but have not found
anything that describes how to do all of the following:

1) instantiate N processes (or threads)
2) start each process parsing a log file
3) the first process that is done looks at a shared or global queue and
pulls the next log file from that and processes until the queue is
empty.


...the current architecture of my log processing is:
1) set a number of processes (e.g. 20)
2) in a loop for the number of processes:
    my @rc;
    for my $i (1..$num_processes) {
        # double quotes so that $i is interpolated into the command
        my $command = "parseLog.pl $i";
        $rc[$i] = system($command);
    }
   there is a conf file that has an entry for each log, along with a
   number in the next field--the number represents the process_id (can
   be 1 thru 20)
3) in each process, loop over all the logs and push a log onto an array
   if its process_id == the number that was passed along, so each
   process has an array of files to process/parse. each process parses
   each file in its array of files.

the problem with this is that each process may have a similar number of
logs to process (the process_id just increments for each line, then
wraps around once it reaches the max number of processes i defined), but
some could be huge while others are small, so it's not very optimal. one
process could have 20 files of 200 bytes each, while another could have
20 files of 230 MB each.
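
the imbalance is easy to reproduce on a toy list. the sketch below assigns made-up sizes to processes purely by position, the way the wrapping process_id does, and totals the bytes each process ends up with. the sizes and the process count of 2 are assumptions chosen to make the effect obvious.

```perl
use strict;
use warnings;

# Made-up sizes in conf-file order: small and large logs alternate.
my @sizes = (200, 230_000_000, 210, 220_000_000,
             250, 240_000_000, 300, 250_000_000);
my $num_processes = 2;

# Round-robin by position: entry 0 -> process 1, entry 1 -> process 2, ...
my @total = (0) x $num_processes;
$total[ $_ % $num_processes ] += $sizes[$_] for 0 .. $#sizes;

printf "process %d: %d bytes\n", $_ + 1, $total[$_] for 0 .. $#total;
```

both processes get four files, but one gets under a kilobyte in total and the other gets nearly a gigabyte.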

since using the system() approach is all i know, the only scenarios i
considered were those that dealt with providing each process with a
balanced amount of data.

if i can get the on-the-fly thing working, that would be preferable.
then sorting would not even be helpful, would it?

>
> > basically this problem can be reduced to figuring out a way to fit 400
> > discrete numbers into 20 buckets where the count of the numbers does
> > not matter, but the sum of the numbers should be roughly equal.
> >
> > one way of accomplishing this is to sort descending, and loop through
> > the 400 files, putting the largest one in the first bucket, and going
> > down the line, putting the largest one that will fit into the current
> > bucket until you get to the end. granted, this is not perfect--perhaps
> > if you had waited, 2 smaller ones would have brought you closer to your
> > bucket-size goal. then you go to the next bucket, and do the same
> > thing. this is a O = n^2 solution though.

>
> How so? N ln N to sort the files, only has to be done once. Finding the
> biggest file that fits the current bucket is ln N using a binary search
> into the sorted files. You have 1 N ln N operation, followed by N
> operations that are lnN. That comes out to N ln N overall.
>


you would be roughly doing m x n iterations, which i believe amounts to
the same thing as big O of n^2. (i AM a bit rusty at algorithms). i
didn't even count the sort, since that's only done once, and i'm not
familiar with the under-the-hood mechanics of Perl's hash value sort.


> But anyway, why try to preordain the order in which the buckets become
> empty for full?


that's the only way i know how for now.

> Start processing the 20 biggest files. When one of them
> finishes (regardless of which one it is), start the next file.
>


if i can do the fork thing, why start with the biggest?



 
 
Tassilo v. Parseval
      10-05-2005
Also sprach it_says_BALLS_on_your forehead:

>> I wouldn't bother with this 'bucket' stuff at all. Just do it on the fly.
>> By addressing the files in the appropriate order (from most work to least
>> work) you ensure nearly optimal processing. In fact, because there is no
>> guarantee that the actual time for a file to be processed is exactly
>> proportional to the file size,

>
> you're right here, although i had the idea that i could weight certain
> parse methods by multiplying each log size by a coefficient. the
> coefficient would be derived by dividing the average speed of a parse
> method by the average speed of the slowest parse method.
>
>> balancing on the fly is almost surely going
>> to be better than some precomputed balancing based on the assumption that
>> size = time.
>>
>> $pm = new Parallel::ForkManager(20);
>>
>> foreach $file (sort {$files{$b}<=>$files{$a}} keys %files) {
>> my $pid = $pm->start and next;
>> ##Process the $file
>> $pm->finish; # Terminates the child process
>> }
>> $pm->wait_all_children;
>>
>> ...

>
> i admit i'm not too familiar with Threads/Forks (the only fork i use is
> the one called from system() ). also, i've read that Perl threading
> isn't too stable.


Perl threads arguably still have their flaws, but for a task as simple as
yours they're perfectly usable.

> i've looked on the web a little, but have not found
> anything that describes how to do all of the following:
>
> 1) instantiate N processes (or threads)
> 2) start each process parsing a log file
> 3) the first process that is done looks at a shared or global queue and
> pulls the next log file from that and processes until the queue is
> empty.


Extremely easy with threads. Here's a complete example of a program that
spawns off a number of threads where each thread pulls data from a
global queue until it is empty:

#!/usr/bin/perl -w

use strict;

use threads;
use threads::shared;

use constant NUM_THREADS => 10;

# shared queue visible to every thread
my @queue : shared = 1 .. 30;

# create threads
my @threads;
push @threads, threads->new("run") for 1 .. NUM_THREADS;

# wait for all threads to finish
$_->join for @threads;

# code executed by each thread
sub run {
    while (defined(my $element = pop @queue)) {
        printf "thread %i: working with %i\n", threads->tid, $element;
        # make runtime vary a little for
        # demonstration purpose
        select undef, undef, undef, rand;
    }
}


> ...the current architecture of my log processing is:
> 1) set a number of processes (e.g. 20)
> 2) in a loop for the number of processes:
> my @rc;
> for my $i (1..$num_processes) {
> my $command = 'parseLog.pl $i';
> $rc[$i] = system($command);
> }
> # there is a conf file that has an entry for each log, along with a
> number in the next field--the number represents the process_id (can be
> 1 thru 20)
> 3) in a loop of all the logs, push logs into arrays if the process_id
>== the $num_process that was passed along, so each process has an array
> of files to process/parse. each process parses each file in its array
> of files. problem with this is that maybe each process has a similar
> number of logs to process (the process_id just increments for each
> line, then wraps around once it reaches the max number of processes i
> defined), but some could be huge while others are small, so not very
> optimal. one process could have 20 files of 200 bytes each, while the
> other could have 20 files of 230 MB each.


Don't think in terms of processes. If you're using processes for that
kind of thing you'll need to find a way for them to communicate
(possibly pipes, or maybe shared memory). Threads take this work off
your shoulders as they can share data in a simple and safe manner.

> since using the system() approach is all i know, the only scenarios i
> considered were those that dealt with providing each process with a
> balanced amount of data.


Bad idea. Each item may take a different amount of time. The real way
is to store the work in one central repository and have each thread
retrieve its next item from there. When it is done, it fetches the
next, unless the central pool is empty.

> if i can get the on-the-fly thing working, that would be preferable.
> then sorting would not even be helpful, would it?


It is not helpful.

>> Start processing the 20 biggest files. When one of them
>> finishes (regardless of which one it is), start the next file.
>>

>
> if i can do the fork thing, why start with the biggest?


Don't even worry about sorting. Use threads and have each thread do the
parsing of the files in any order. It makes no difference since it's
truly parallel and asynchronous.

Tassilo
--
use bigint;
$n=71423350343770280161397026330337371139054411854 220053437565440;
$m=-8,;;$_=$n&(0xff)<<$m,,$_>>=$m,,print+chr,,while(($ m+=<=200);
 
it_says_BALLS_on_your forehead
      10-05-2005
>
> Extremely easy with threads. Here's a complete example of a program that
> spawns off a number of threads where each thread pulls data from a
> global queue until it is empty:
>
> #!/usr/bin/perl -w
>
> use strict;
>
> use threads;
> use threads::shared;
>
> use constant NUM_THREADS => 10;
>
> # shared queue visible to every thread
> my @queue : shared = 1 .. 30;
>
> # create threads
> my @threads;
> push @threads, threads->new("run") for 1 .. NUM_THREADS;
>
> # wait for all threads to finish
> $_->join for @threads;
>
> # code executed by each thread
> sub run {
> while (defined(my $element = pop @queue)) {
> printf "thread %i: working with %i\n", threads->tid, $element;
> # make runtime vary a little for
> # demonstration purpose
> select undef, undef, undef, rand;
> }
> }
>
>
>
> Don't think in terms of processes. If you're using processes for that
> kind of thing you'll need to find a way for them to communicate
> (possibly pipes, or maybe shared memory). Threads takes this work off
> your shoulders as they can share data in a simple and secure manner.
>


Tassilo, thank you very much for your help. If i could trouble you once
more for your insight...

What benefit does the thread model have that the following does not?
What drawbacks?

use Parallel::ForkManager;
$pm = new Parallel::ForkManager($MAX_PROCESSES);
foreach $data (@all_data) {
    # Forks and returns the pid for the child:
    my $pid = $pm->start and next;
    # ... do some work with $data in the child process ...
    $pm->finish; # Terminates the child process
}





 
it_says_BALLS_on_your forehead
      10-05-2005
>
> Extremely easy with threads. Here's a complete example of a program that
> spawns off a number of threads where each thread pulls data from a
> global queue until it is empty:


i'm running Perl 5.6, are threads and threads::shared available/stable
in this version?


>
> #!/usr/bin/perl -w
>
> use strict;
>
> use threads;
> use threads::shared;
>
> use constant NUM_THREADS => 10;
>
> # shared queue visible to every thread
> my @queue : shared = 1 .. 30;
>
> # create threads
> my @threads;
> push @threads, threads->new("run") for 1 .. NUM_THREADS;
>
> # wait for all threads to finish
> $_->join for @threads;
>
> # code executed by each thread
> sub run {
> while (defined(my $element = pop @queue)) {
> printf "thread %i: working with %i\n", threads->tid, $element;


^ what is this?

> # make runtime vary a little for
> # demonstration purpose
> select undef, undef, undef, rand;
> }
> }
>
>


 
Paul Lalli
      10-05-2005
it_says_BALLS_on_your forehead wrote:
> > printf "thread %i: working with %i\n", threads->tid, $element;

> ^ what is this?


When you don't understand what a function is doing, the best way to
figure it out is to read the documentation. (And I admit, in this
case, the documentation will lead you on a trail, so let's follow it.)

perldoc -f printf
    printf FILEHANDLE FORMAT, LIST
    printf FORMAT, LIST
        Equivalent to "print FILEHANDLE sprintf(FORMAT, LIST)",

Okay, so we need to figure out what sprintf does instead:

perldoc -f sprintf
    sprintf FORMAT, LIST
        Returns a string formatted by the usual "printf"
        conventions of the C library function "sprintf".
        See below for more details
        . . .
        Finally, for backward (and we do mean "backward")
        compatibility, Perl permits these unnecessary but
        widely-supported conversions:

            %i    a synonym for %d

which, if we look back a bit, we see:

        Perl's "sprintf" permits the following universally-
        known conversions:
        . . .
            %d    a signed integer, in decimal



So, the original code took the two arguments, threads->tid and
$element, converted them to integers (if necessary), and placed them at
the corresponding %i markers.
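
A quick way to see this for yourself is the small sketch below. The values 3 and 7.9 are made up, with 7.9 chosen to show the conversion to an integer:

```perl
use strict;
use warnings;

# %i and %d are interchangeable; both print the integer part of a number.
my $with_i = sprintf "thread %i: working with %i", 3, 7.9;
my $with_d = sprintf "thread %d: working with %d", 3, 7.9;

print "$with_i\n";
print "same output\n" if $with_i eq $with_d;
```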

Paul Lalli

 
it_says_BALLS_on_your forehead
      10-05-2005
>
>
> So, the original code took the two arguments, threads->tid and
> $element, converted them to integers (if necessary), and placed them at
> the corresponding %i markers.
>
> Paul Lalli


Ahhh...thanks Paul

 
Tassilo v. Parseval
      10-05-2005
Also sprach it_says_BALLS_on_your forehead:

>> Extremely easy with threads. Here's a complete example of a program that
>> spawns off a number of threads where each thread pulls data from a
>> global queue until it is empty:
>>
>> #!/usr/bin/perl -w
>>
>> use strict;
>>
>> use threads;
>> use threads::shared;
>>
>> use constant NUM_THREADS => 10;
>>
>> # shared queue visible to every thread
>> my @queue : shared = 1 .. 30;
>>
>> # create threads
>> my @threads;
>> push @threads, threads->new("run") for 1 .. NUM_THREADS;
>>
>> # wait for all threads to finish
>> $_->join for @threads;
>>
>> # code executed by each thread
>> sub run {
>> while (defined(my $element = pop @queue)) {
>> printf "thread %i: working with %i\n", threads->tid, $element;
>> # make runtime vary a little for
>> # demonstration purpose
>> select undef, undef, undef, rand;
>> }
>> }
>>
>>
>>
>> Don't think in terms of processes. If you're using processes for that
>> kind of thing you'll need to find a way for them to communicate
>> (possibly pipes, or maybe shared memory). Threads takes this work off
>> your shoulders as they can share data in a simple and secure manner.
>>

>
> Tassilo, thank you very much for your help. If i could trouble you once
> more for your insight...
>
> What benefit does the thread model have that the following does not?
> What drawbacks?
>
> use Parallel::ForkManager;
> $pm = new Parallel::ForkManager($MAX_PROCESSES);
> foreach $data (@all_data) {
> # Forks and returns the pid for the child:
> my $pid = $pm->start and next;
> ... do some work with $data in the child process ...
> $pm->finish; # Terminates the child process
> }


Perl threads are said to be inefficient in that a Perl interpreter is
cloned for each thread. However, the Parallel::ForkManager approach
requires a new definition of efficiency.

Consider what happens in code like this:

foreach my $data (1 .. 100_000) {
    $pm->start and next;
    do_something($data);
    $pm->finish;
}

For each of the 100,000 items a new process is spawned off. So unlike
with threads, where a thread handles more than one item, here each
process is essentially a use-once-and-throw-away thing. This is because
Parallel::ForkManager provides no infrastructure for sharing data
between these processes.

Just try this little program yourself and see how poorly it performs:

use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(30);
my @data = 1 .. shift;    # item count comes from the command line
foreach (@data) {
    my $pid = $pm->start and next;
    print "$_\n";
    $pm->finish;
}
$pm->wait_all_children;

and compare it to an equivalent threads-implementation where the items
are stored in a shared array:

use threads;
use threads::shared;

use constant NUM_THREADS => 30;

my @queue : shared = 1 .. shift;
my @threads;

push @threads, threads->new("run") for 1 .. NUM_THREADS;
$_->join for @threads;

sub run {
    while (defined(my $element = shift @queue)) {
        print "$element\n";
    }
}

On my machine I get:

ethan@ethan:~$ time perl procs.pl 2000
[...]
real 0m7.002s
user 0m2.500s
sys 0m4.500s

ethan@ethan:~$ time perl thread.pl 2000
[...]
real 0m3.141s
user 0m1.540s
sys 0m1.590s

If you increase the number further to, say, 10000, it's already

[processes]
real 0m45.605s
user 0m24.320s
sys 0m21.130s

[threads]
real 0m8.671s
user 0m1.090s
sys 0m7.580s

So apparently threads scale much better than processes which is no
wonder because threads only require a one-time initialization for
creating the threads whereas Parallel::ForkManager constantly has to
create and terminate new processes.

Besides, I find the code with threads much easier to understand: Each
thread has a function that it executes. With fork it's more tricky.
You first have to find the call to fork() and then one of the code
branches after that is for the child, the other one for the parent.

Tassilo
 
xhoster@gmail.com
      10-05-2005
"it_says_BALLS_on_your forehead" <> wrote:


> > balancing on the fly is almost surely going
> > to be better than some precomputed balancing based on the assumption
> > that size = time.
> >
> > $pm = new Parallel::ForkManager(20);
> >
> > foreach $file (sort {$files{$b}<=>$files{$a}} keys %files) {
> > my $pid = $pm->start and next;
> > ##Process the $file
> > $pm->finish; # Terminates the child process
> > }
> > $pm->wait_all_children;
> >
> > ...

>
> i admit i'm not too familiar with Threads/Forks (the only fork i use is
> the one called from system() ).


One advantage of Parallel::ForkManager is that you don't need to be all
that familiar with fork. It handles most of it for you, as long as you
follow the example of having a "$pm->start and next" near the top of the
loop and a "$pm->finish" at the end of the loop. ($pm->finish actually
calls exit in the child process, so anything between the finish and the end
of the loop is not executed.)


> also, i've read that Perl threading
> isn't too stable.


Forking on linux is rock stable. Forking on Windows is emulated using
threads, but I think it is stable enough for what you are doing.

> i've looked on the web a little, but have not found
> anything that describes how to do all of the following:
>
> 1) instantiate N processes (or threads)


$pm = new Parallel::ForkManager($N);
(Doesn't actually instantiate them, but declares how many you want
instantiated, once you get around to instantiating them.)

> 2) start each process parsing a log file


That is what the "foreach...$pm->start and next" does. It starts a process
on the next log file, unless there are already 20 (or $N) outstanding
processes. In that case, it waits for one of those outstanding processes to
end, then starts a process on the next log file.

> 3) the first process that is done looks at a shared or global queue and
> pulls the next log file from that and processes until the queue is
> empty.


ForkManager uses inversion of control (or at least something like it). The
first slave process that is done finishes. As part of finishing, it
notifies the master process. The master process keeps the queue, and uses
it to start the next process, to replace the one that finished.
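
For anyone who wants to see the mechanism, the sketch below does by hand roughly what is described above, using only core `fork` and `wait`: the parent keeps the queue, caps the number of live children, and starts the next task whenever one exits. This is not Parallel::ForkManager's actual implementation. `run_pool` and the callback are hypothetical names, and error handling is minimal.

```perl
use strict;
use warnings;

# Run $work->($task) for each task in its own child, at most $max at a time.
# Returns a hash ref of task => child exit code.
sub run_pool {
    my ($max, $work, @queue) = @_;
    my (%task_of, %exit_of);

    while (@queue or %task_of) {
        # Top up: start children until the cap is hit or the queue is empty.
        while (@queue and keys(%task_of) < $max) {
            my $task = shift @queue;
            my $pid  = fork();
            die "fork failed: $!" unless defined $pid;
            if ($pid == 0) {            # child: do the work, then exit
                exit $work->($task);
            }
            $task_of{$pid} = $task;     # parent: remember who is doing what
        }
        # Block until any one child finishes, then loop to start the next.
        my $pid = wait();
        last if $pid == -1;
        $exit_of{ delete $task_of{$pid} } = $? >> 8;
    }
    return \%exit_of;
}

# Stand-in for parsing a log: sleep briefly, then report success (exit 0).
my $done = run_pool(3, sub { select undef, undef, undef, 0.05; 0 }, 1 .. 8);
print scalar(keys %$done), " tasks finished\n";
```

As with the real module, each task costs one fork; the children never touch the queue themselves.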

...
>
> if i can get the on-the-fly thing working, that would be preferable.
> then sorting would not even be helpful, would it?


I find that it is helpful, especially when the lengths of the various
tasks vary by orders of magnitude.

Let's say your largest task will take 20 minutes for 1 process/CPU to
process, and all the rest of your tasks combined will take 20 minutes for
the other 19 CPUs to process. If you start the largest task first, then in
20 minutes you are done. If you start the largest task last, then say it
takes 15 minutes before it gets started[1], and then 20 minutes for it to
run, so the time to completion is 35 minutes.

By starting the tasks in reverse order of run time (longest first), you let
the shorter tasks pack around the longer ones in an efficient way.

(I just did a test on a uniform distribution of run-lengths[2], and
"processing" from long to short took 8:25 while short to long took 9:10. I
think the difference can be larger if the dispersion in runtimes is
greater)
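
The same effect can be checked without waiting on real sleeps. The sketch below simulates workers that each take the next task as soon as they are free, and reports the total wall-clock time for a given task order. `makespan` is a hypothetical helper; the task lengths mirror the 1..100 second test in [2], and the simulation ignores process start-up overhead.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Simulate $workers processes pulling tasks in the given order; each task
# goes to whichever worker becomes free first. Returns the finish time.
sub makespan {
    my ($workers, @durations) = @_;
    my @free_at = (0) x $workers;
    for my $d (@durations) {
        # Find the worker that frees up earliest (lowest index on ties).
        my $w = 0;
        for my $i (1 .. $#free_at) {
            $w = $i if $free_at[$i] < $free_at[$w];
        }
        $free_at[$w] += $d;
    }
    return max(@free_at);
}

my @tasks = 1 .. 100;                        # task lengths in seconds
printf "long to short: %d s\n", makespan(10, reverse @tasks);
printf "short to long: %d s\n", makespan(10, @tasks);
```

For these inputs the simulation gives 505 seconds (8:25) long to short and 550 seconds (9:10) short to long, in line with the timings reported above.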

Xho



[1] Since all-but-the-longest take 20 minutes to finish on 19 CPUs, they
will take ~19 minutes to finish on 20 CPUs (since we haven't yet started
the longest task, the shorter ones will have 20 CPUs to use, not 19).
However, the longest one doesn't need to wait for all of the shorter ones
to finish before it starts, it only needs to wait for 381 out of the 400 of
the shorter ones to finish. So I just pulled 15 minutes out of my ass, as
a guess of how long it will take for 381 of them to finish.

[2]
use strict;

use Parallel::ForkManager;

my $pm = new Parallel::ForkManager(10);

## do it with and without the "reverse"
foreach my $file (reverse 1..100) {
    my $pid = $pm->start and next;
    sleep $file;
    $pm->finish; # Terminates the child process
}
$pm->wait_all_children;

 