efficient way to write multiple loops code

 
 
friend.05@gmail.com
 
      10-07-2008
Hi,

I am trying to analyze some data. I have big data files.

I have 3 different files in following format. ($file_1, $file_2,
$file_3)

ID | Time | IP | Code

The following is pseudocode for what I am writing. I would like to know
a more efficient way to do the same thing.

open(INFO_1,$file_1);
open(INFO_2,$file_2);
open(INFO_3,$file_3);

@file1_lines = <INFO_1>;
@file2_lines = <INFO_2>;
@file3_lines = <INFO_3>;

foreach $file1_line (@file1_lines)
{
    @file1 = split('\|',$file1_line);

    #some code

    foreach $file2_line (@file2_lines)
    {
        @file2 = split('\|',$file2_line);

        #some code

        #if condition between File1 data and File2 data
        {

            #some code

            foreach $file3_line (@file3_lines)
            {
                @file3 = split('\|',$file3_line);

                #some code

                #if condition

            }

        }

    }

}



So I am going through each record of file 1 and, depending on whether
the data is present in file 2 and on a further if condition, I look for
that data in file 3.

So each record of file 1 has to be compared against each record of
file 2, and each matching record of file 2 against each record of file 3.

This code takes a lot of time. I would like some suggestions for more
efficient code.

Can I use a hash (by reading a file into a hash)?



Thanks







 
xhoster@gmail.com
 
      10-07-2008
"(E-Mail Removed)" <(E-Mail Removed)> wrote:

> foreach $file1_line (@file1_lines)
> {
> @file1 = split('\|',$file1_line);
> foreach $file2_line (@file2_lines)
> {
> @file2 = split('\|',$file2_line);
> #if condition between File1 data and File2 data


....
>
> So this code is taking lot of time. I want some suggestion for
> efficient code.
>
> Can I use Hash Array (by reading file in hash array)


Whether you can use a hash to speed this up depends on whether
"If condition between File1 data and File2 data" can be reduced
to (or protected by) fast hash look ups. We can't answer this for you
without knowing what the nature of that condition is.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 
Tim Greer
 
      10-07-2008
(E-Mail Removed) wrote:

> So this code is taking lot of time. I want some suggestion for
> efficient code.
>
> Can I use Hash Array (by reading file in hash array)
>
>


The answer very much depends on what #some code is actually doing. Is
the data fixed in the files, and what specific checks are you doing?
Could the data be anywhere in a file or inside a line of data, are you
trying to match whole lines from ^ start to $ end of line per file, or
are you doing some other type of processing?
--
Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
Industry's most experienced staff! -- Web Hosting With Muscle!
 
friend.05@gmail.com
 
      10-07-2008
On Oct 7, 4:48 pm, Tim Greer <(E-Mail Removed)> wrote:
> The answer very much depends on what #some code is actually doing. Is
> the data fixed in the files, what specific checks are you doing? Could
> the data be anywhere in a file, inside of a line of data, or are you
> trying to match lines from ^ start to $ end of line per file, or are
> you doing some other type of processing?


I am checking data from within a line, not the whole line.

I want to check whether the IP and Code from file1 are present in file2
and, if they are, check again whether they are also in file3.

I am doing all this processing to analyze some data.

Let me know if it is still not clear.

Thanks.
 
Tim Greer
 
      10-07-2008
(E-Mail Removed) wrote:

> I am checking data from a line not whole line.
>
> I want to check if IP and Code of file1 is present in file2 and if it
> is present in file2 then check again if it is there in file3.
>
> I am doing all this processing analyze some data.
>
> Let me know if still it is not clear.
>
> Thanks.


If there is not a lot of data involved, I'd personally just create a
hash keyed on the values you care about, then open the next file and
check whether each key exists. You can do that check per line with a
while loop against file 2 and file 3 (if needed), instead of reading all
three files into arrays. If the files are potentially large, you'll want
to avoid slurping them, because that reads a lot of data into memory
unnecessarily. I'd open the first file, split each line in a while loop
and build the hash, close the file, then open file 2 and check in a
while loop whether the hash key/value exists. Then repeat for file 3 if
needed. There is probably a better way than that, but off the top of my
head it's a generally better idea than what you're attempting now.
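
A minimal sketch of that idea, under some assumptions: the match condition is taken to be "same IP and same Code" (as described earlier in the thread), the sample records are made up, and the name load_keys is hypothetical. It is a variation on the description above: it builds the lookup hashes from files 2 and 3 and streams file 1 line by line, so each file is read only once. In-memory filehandles stand in for the real files so the sketch runs on its own.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the three files
# (format: ID|Time|IP|Code).
my $data1 = "1|10:00|10.0.0.1|A\n2|10:01|10.0.0.2|B\n3|10:02|10.0.0.3|C\n";
my $data2 = "7|11:00|10.0.0.1|A\n8|11:01|10.0.0.3|C\n9|11:02|10.0.0.9|Z\n";
my $data3 = "5|12:00|10.0.0.3|C\n6|12:01|10.0.0.4|D\n";

# Read one source line by line and return a hash reference whose keys
# are "IP|Code". $source may be a file name or a reference to a string.
sub load_keys {
    my ($source) = @_;
    open my $fh, '<', $source or die "Cannot open input: $!";
    my %seen;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';
        my (undef, undef, $ip, $code) = split /\|/, $line;
        $seen{"$ip|$code"} = 1;
    }
    close $fh;
    return \%seen;
}

my $in_file2 = load_keys(\$data2);
my $in_file3 = load_keys(\$data3);

# Stream file 1 and replace the two inner loops with hash lookups.
my ($in_two, $in_three) = (0, 0);
open my $fh1, '<', \$data1 or die "Cannot open input: $!";
while (my $line = <$fh1>) {
    chomp $line;
    my (undef, undef, $ip, $code) = split /\|/, $line;
    my $key = "$ip|$code";
    next unless $in_file2->{$key};      # not in file 2: nothing more to do
    $in_two++;
    $in_three++ if $in_file3->{$key};   # in file 2 and also in file 3
}
close $fh1;

print "in file 2: $in_two, also in file 3: $in_three\n";
```

With the sample data this prints "in file 2: 2, also in file 3: 1". The work grows with the total number of lines rather than with the product of the three file sizes, at the cost of holding the keys of files 2 and 3 in memory.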
 
Grant
 
      10-07-2008
On Tue, 7 Oct 2008 13:31:50 -0700 (PDT), "(E-Mail Removed)" <(E-Mail Removed)> wrote:

>Hi,
>
>I am trying to analyze some data. I have big data files.
>
>I have 3 different files in following format. ($file_1, $file_2,
>$file_3)
>
>ID | Time | IP | Code
>
>Following is psuedo code which I am writing. I want to know another
>efficient way to do same thing.


It is hard to say without seeing your data and requirements, but here
is an optimised database table loader as an example that follows the
speedup hints from the Camel book. It loads a ~100k-record data table
followed by a ~250-record table on a slow 500 MHz Celeron box in about
3 seconds:
....
do_log("read: $indexfile");
open FILE, "< $indexfile" or do_die("$indexfile $!");
flock FILE, 1;
$ip2c_cn = 0;
while (<FILE>) {
    next if /^$/; next if /^#/; next if /^junkview/; chomp;
    ( $ip2c_lo[++$ip2c_cn],
      $ip2c_hi[$ip2c_cn],
      $ip2c_cc[$ip2c_cn]
    ) = split /\s+/, $_;
}
close FILE;

do_log("read: $namesfile");
open FILE, "< $namesfile" or do_die("$namesfile $!");
flock FILE, 1;
%cc_name = ();
while (<FILE>) {
    next if /^$/; next if /^#/; next if /^junkview/; chomp;
    my ($cc, $name) = split /:/, $_;
    $cc_name{$cc} = $name;
}
close FILE;
}

As you can see, the idea is to avoid useless processing of irrelevant
data as far as possible: plan how to skip over sections of your loop
code with 'next' rather than wrapping them in 'if ...' blocks, avoid
complex regexps, and don't chomp records that are about to be discarded.

From log file:
2008-10-07.21:28:17 - read: /etc/ip2cn-server.conf
2008-10-07.21:28:17 - read: /usr/local/share/ip2cn/ip2c-data
2008-10-07.21:28:20 - read: /usr/local/share/ip2cn/ip2c-names
2008-10-07.21:28:20 - listen: localhost:4743

Context: http://bugsplatter.id.au/ip2cn/ip2cn-server.txt

Grant.
--
http://bugsplatter.id.au/
 
sln@netherlands.com
 
      10-07-2008
On Tue, 7 Oct 2008 13:31:50 -0700 (PDT), "(E-Mail Removed)" <(E-Mail Removed)> wrote:

>So this code is taking lot of time. I want some suggestion for
>efficient code.
>
>Can I use Hash Array (by reading file in hash array)
>


Nobody can judge the performance of pseudocode without knowing what
data it processes. There is no general answer to be had.

The best you can do, through trial and error, is benchmark
it yourself:

use Benchmark ':hireswallclock';
my $t0 = Benchmark->new;

# ... the code block to be timed goes here ...

my $t1 = Benchmark->new;
my $tdif = timediff($t1, $t0);
print STDERR "the code took: ", timestr($tdif), "\n";
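
To compare two candidate approaches rather than time a single block, the same Benchmark module also provides cmpthese. The following is only a sketch with made-up data: the key lists and the names count_nested and count_hashed are hypothetical, and the timings will depend entirely on the machine and on the real data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical data: 300 "IP|Code" keys per list, overlapping in 150.
my @keys1 = map("10.0.0.$_|A", 1 .. 300);
my @keys2 = map("10.0.0.$_|A", 151 .. 450);

# Nested loops: compare every key of list 1 with every key of list 2.
sub count_nested {
    my $n = 0;
    for my $k1 (@keys1) {
        for my $k2 (@keys2) {
            if ($k1 eq $k2) { $n++; last }
        }
    }
    return $n;
}

# Hash lookup: build a set from list 2, then test each key of list 1.
sub count_hashed {
    my %in2 = map { $_ => 1 } @keys2;
    return scalar grep { $in2{$_} } @keys1;
}

print "nested: ", count_nested(), ", hashed: ", count_hashed(), "\n";

# Run each sub 100 times and print a comparison table.
cmpthese(100, { nested => \&count_nested, hashed => \&count_hashed });
```

Both subs should report the same count before their timings are compared; otherwise the benchmark is measuring two different computations.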

sln

 
Ilya Zakharevich
 
      10-07-2008
[A complimentary Cc of this posting was sent to
(E-Mail Removed)
<(E-Mail Removed)>], who wrote in article <(E-Mail Removed)>:

Nobody else commented on that yet:

> @file1_lines = <INFO_1>;
> @file2_lines = <INFO_2>;
> @file3_lines = <INFO_3>;
>
> foreach $file1_line (@file1_lines)
> {
> @file1 = split('\|',$file1_line);


> foreach $file2_line (@file2_lines)
> {
> @file2 = split('\|',$file2_line);


This split is done again and again, once for every line of INFO_1.
The result is going to be the same each time. Better to move it outside
the loop:

@file2_fields = map [split '\|', $_], @file2_lines;

if you have enough memory. Likewise for other stuff.
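
As a sketch of how the pre-split records might then be used (the sample lines are made up, and the comparison on IP and Code is an assumption taken from later in the thread), each element of @file2_fields is an array reference, so the inner loop only indexes into it:

```perl
use strict;
use warnings;

# Hypothetical sample lines in the ID|Time|IP|Code format.
my @file1_lines = ("1|10:00|10.0.0.1|A\n", "2|10:01|10.0.0.2|B\n");
my @file2_lines = ("7|11:00|10.0.0.1|A\n", "8|11:01|10.0.0.2|X\n");

chomp @file1_lines;
chomp @file2_lines;

# Split each line exactly once; every element is a reference to an
# array of that record's fields.
my @file1_fields = map { [ split /\|/, $_ ] } @file1_lines;
my @file2_fields = map { [ split /\|/, $_ ] } @file2_lines;

my $matches = 0;
for my $rec1 (@file1_fields) {
    for my $rec2 (@file2_fields) {
        # No split here: index 2 is the IP, index 3 is the Code.
        $matches++ if $rec1->[2] eq $rec2->[2] && $rec1->[3] eq $rec2->[3];
    }
}

print "matching pairs: $matches\n";
```

This keeps the nested loops, so the number of comparisons is unchanged; it only removes the repeated split calls from the inner loop.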

Hope this helps,
Ilya
 
sln@netherlands.com
 
      10-08-2008


Well, if it were my code, I would know exactly how to do it without benchmarks.
But you don't know yourself, it seems. Do you?
Instead, you post phony PSEUDO code as if you know something, which you don't.
Yet you put the burden on the sucker who is stupid enough to respond to you.

Outta here... ignant

sln

 
friend.05@gmail.com
 
      10-08-2008


Thanks to all for replying.

Can I use a hash even if I don't have a unique key? In my data I need
IP and Code, which are not necessarily unique.

Below is my code:

open(INFO_1, $file_1) or die "Cannot open $file_1: $!";
open(INFO_2, $file_2) or die "Cannot open $file_2: $!";
open(INFO_3, $file_3) or die "Cannot open $file_3: $!";

# Read all three files into memory
@file1_lines = <INFO_1>;
@file2_lines = <INFO_2>;
@file3_lines = <INFO_3>;

foreach $file1_line (@file1_lines)
{
    # Fields: ID | Time | IP | Code
    @file1 = split('\|', $file1_line);
    $file1_ip   = $file1[2];
    $file1_code = $file1[3];

    foreach $file2_line (@file2_lines)
    {
        @file2 = split('\|', $file2_line);
        $file2_ip   = $file2[2];
        $file2_code = $file2[3];

        if ($file1_ip eq $file2_ip)
        {
            $flag = 1;
            if ($file1_code eq $file2_code)
            {
                $r_flag = 0;

                # IP and Code matched in file 2, so look in file 3
                foreach $file3_line (@file3_lines)
                {
                    @file3 = split('\|', $file3_line);
                    $file3_ip   = $file3[2];
                    $file3_code = $file3[3];

                    if (($file1_ip eq $file3_ip) &&
                        ($file1_code eq $file3_code))
                    {
                        #some flag
                    }
                }
                #depending on flag I increment some counter
            }
        }
    }
    #depending on flag I increment some counter
}
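
On the question of non-unique keys, one possible approach is to join IP and Code into a single hash key and store a list of records under it, so that repeated pairs are kept rather than overwritten. This is only a sketch: the sample records are made up, and the name %by_key and the choice of what to store per record are assumptions.

```perl
use strict;
use warnings;

# Hypothetical records (ID|Time|IP|Code); note the repeated IP/Code pair.
my @file2_lines = (
    "7|11:00|10.0.0.1|A",
    "8|11:05|10.0.0.1|A",
    "9|11:09|10.0.0.3|C",
);

# Map each "IP|Code" key to a list of every record that carries it.
my %by_key;
for my $line (@file2_lines) {
    my ($id, $time, $ip, $code) = split /\|/, $line;
    push @{ $by_key{"$ip|$code"} }, { id => $id, time => $time };
}

# Look up one key: how often does it occur, and in which records?
my $key   = "10.0.0.1|A";
my $count = exists $by_key{$key} ? scalar @{ $by_key{$key} } : 0;
print "$key appears $count time(s)\n";
print "  id $_->{id} at $_->{time}\n" for @{ $by_key{$key} || [] };
```

With the sample data the key 10.0.0.1|A is reported twice, with ids 7 and 8. If only the number of occurrences matters, a plain counter such as $count{"$ip|$code"}++ would use less memory than a list of records.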


 