Velocity Reviews - Computer Hardware Reviews

Comparison of two files..

 
 
clearguy02@yahoo.com
 
      10-23-2008
Hi folks,

I have two files:
a.txt has 100 unique log_id's (one id per line);
all.txt has 5000 entries (each line has six tab-separated fields; the
first is the login ID, followed by full name, country, etc.).

Now I want to match both files and get the output with all 100 full
entries and ignore the rest.

Here is the code I am working on.. for some reason, I see more 160
entries instead of the exact 100 entries.

++++++++++++++++++++
my %myconfig = (
    input1       => 'a.txt',
    input2       => 'all.txt',
    matching     => 'required.txt',
    non_matching => 'ignore.txt',
);

my %fields2;
{
    open my $input, '<', $myconfig{input1}
        or die "Cannot open '$myconfig{input1}': $!";
    while ( <$input> )
    {
        if ( /^(\w+)/ )
        {
            $fields2{ $1 } = 1;
        }
    }
    close $input or die "Cannot close '$myconfig{input1}': $!";
}
open my $input, '<', $myconfig{input2}
    or die "Cannot open '$myconfig{input2}': $!";
open my $matching, '>', $myconfig{matching}
    or die "Cannot open '$myconfig{matching}': $!";
open my $non_matching, '>', $myconfig{non_matching}
    or die "Cannot open '$myconfig{non_matching}': $!";

while ( <$input> )
{
    if ( /^(\w+)/ )
    {
        if ( exists $fields2{ $1 } )
        {
            print $matching "$_\n";
        }
        else
        {
            print $non_matching "$_\n";
        }
    }
}

++++++++++++++++++++++++++++++++++++

What am I doing wrong here? Or is there an alternative way of doing
it?

Thanks,
J
 
 
 
 
 
Jim Gibson
 
      10-24-2008
In article
<(E-Mail Removed)>,
<(E-Mail Removed)> wrote:

> Hi folks,
>
> I have two files:
> a.txt has 100 unique log_id's (one id per line);
> all.txt has 5000 entries (each line has six entries separated by a
> tab and the first entry on each line is the login ID and then full
> name, country etc).
>
> Now I want to match both files and get the output with all 100 full
> entries and ignore the rest.
>
> Here is the code I am working on.. for some reason, I see more 160
> entries instead of the exact 100 entries.


What does "I see more 160 entries ..." mean? Do you mean you see more
than 160 lines output to required.txt when you only expected 100? What
constitutes the excess lines? Are there duplicates in required.txt? Are
there lines in required.txt that do not have corresponding entries in
a.txt?

>
> ++++++++++++++++++++
> my %myconfig = (
> input1 => 'a.txt',
> input2 => 'all.txt',
> matching => 'required.txt',
> non_matching => 'ignore.txt',
> );
>
> my %fields2;
> {
> open my $input, '<', $myconfig{input1} or die "Cannot open
> '$myconfig{input1}': $!";
> while ( <$input> )
> {
> if ( /^(\w+)/ )
> {
> $fields2{ $1 } = 1;
> }
> }
> close $input or die "Cannot close '$myconfig{input1}': $!";
> }
> open my $input, '<', $myconfig{input2} or die "Cannot open
> '$myconfig{input2}': $!";
> open my $matching, '>', $myconfig{matching} or die "Cannot open
> '$myconfig{matching}': $!";
> open my $non_matching, '>', $myconfig{non_matching} or die "Cannot
> open '$myconfig{non_matching}': $!";
>
> while ( <$input> )
> {
> if ( /^(\w+)/ )
> {
> if ( exists $fields2{ $1 } )
> {
> print $matching "$_\n";
> }
> else
> {
> print $non_matching "$_\n";
> }
> }
> }
>
> ++++++++++++++++++++++++++++++++++++
>
> What am I doing wrong here? Or is there an alternative way of doing
> it?


There doesn't appear to be anything wrong with your code (nothing
obvious anyway). While there are certainly alternate ways of doing
this, you seem to have stumbled upon a good solution that uses a hash.
Without seeing your exact input and output data, it is difficult to do
any further analysis of your problem.

If you can answer the questions above, it might help. If you can
isolate the problem to a few anomalous test cases, you can post those.
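
The questions above can be checked mechanically. What follows is only a minimal sketch, not part of the posted script: the helper name classify_output is made up, and it assumes required.txt and a.txt have already been read into arrays with newlines removed. Two details of the posted code make the check worthwhile, though whether either applies depends on the actual data: /^(\w+)/ captures only the leading word characters, so IDs such as j.smith and j.doe would both be keyed as j, and "$_\n" prints a second newline after a line that still carries its own.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the posted script.
# $ids    - array ref of login IDs from a.txt (newlines removed)
# $output - array ref of lines from required.txt (newlines removed)
sub classify_output {
    my ($ids, $output) = @_;
    my %wanted = map { $_ => 1 } @$ids;
    my %count  = (blank => 0, expected => 0, unexpected => 0);
    for my $line (@$output) {
        if ($line !~ /\S/) {               # empty or whitespace-only line
            $count{blank}++;
            next;
        }
        # Compare the whole first tab-separated field, not a \w+ prefix.
        my ($id) = split /\t/, $line, 2;
        if (exists $wanted{$id}) {
            $count{expected}++;
        }
        else {
            $count{unexpected}++;
        }
    }
    return \%count;
}

# Made-up sample data for illustration only.
my $count = classify_output(
    [ 'j.smith' ],
    [ "j.smith\tJohn Smith\tUS", '', "j.doe\tJane Doe\tUK", '' ],
);
print "$_: $count->{$_}\n" for sort keys %$count;
# prints:
# blank: 2
# expected: 1
# unexpected: 1
```

A non-zero "blank" or "unexpected" count would point at the output format or the ID pattern rather than at the hash lookup itself.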

--
Jim Gibson
 
 
 
 
 
Jürgen Exner
 
      10-24-2008
(E-Mail Removed) wrote:
>Now I want to match both files and get the output with all 100 full
>entries and ignore the rest.
>
>Here is the code I am working on.. for some reason, I see more 160
>entries instead of the exact 100 entries.

[...]
>What am I doing wrong here? Or is there an alternative way of doing
>it?


Your code logic looks all right to me; I can't spot any glaring issues
with it.
Did you consider that some IDs might appear more than once in the
second file? Duplicates there would explain the mismatch.
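
That check can be sketched as follows, assuming the records are tab-separated with the login ID first, as described in the original post. The helper name repeated_ids and the sample records are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: report login IDs that start more than one record.
# $lines - array ref of tab-separated records, as read from all.txt
sub repeated_ids {
    my ($lines) = @_;
    my %seen;
    for my $line (@$lines) {
        # Take the whole first field, up to the first tab or newline.
        next unless $line =~ /^([^\t\n]+)/;
        $seen{$1}++;
    }
    my %repeated;
    for my $id (keys %seen) {
        $repeated{$id} = $seen{$id} if $seen{$id} > 1;
    }
    return \%repeated;    # login ID => number of records
}

# Made-up sample records for illustration only.
my $dupes = repeated_ids([
    "alice\tAlice A\tUS\n",
    "bob\tBob B\tUK\n",
    "alice\tAlice A\tDE\n",
]);
print "$_ appears $dupes->{$_} times\n" for sort keys %$dupes;
# prints: alice appears 2 times
```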

jue
 
 
Eric Pozharski
 
      10-24-2008
On 2008-10-24, Jim Gibson <(E-Mail Removed)> wrote:
> In article
><(E-Mail Removed)>,
><(E-Mail Removed)> wrote:
>

*SKIP*
>> while ( <$input> )
>> {
>> if ( /^(\w+)/ )
>> {
>> if ( exists $fields2{ $1 } )
>> {
>> print $matching "$_\n";
>> }
>> else
>> {
>> print $non_matching "$_\n";
>> }
>> }
>> }
>>
>> ++++++++++++++++++++++++++++++++++++
>>
>> What am I doing wrong here? Or is there an alternative way of doing
>> it?

>
> There doesn't appear to be anything wrong with your code (nothing
> obvious anyway). While there are certainly alternate ways of doing


Looking at that --

perl -wle '
q|x| =~ m/(x)/; print $1;
q|y| =~ m/(x)/; print $1;'
x
x

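After a failed match, $1 still holds the capture from the last
successful one. The posted loop only reads $1 inside the if, so I
suppose the OP isn't showing us his actual code.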
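
A minimal sketch of one way to sidestep stale captures like the one in the one-liner above: take the capture from the match's return value in list context rather than from $1. The helper name leading_word and the sample lines are hypothetical and not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the leading word of a line, or undef if none.
sub leading_word {
    my ($line) = @_;
    # In list context a match returns its captures on success and an empty
    # list on failure, so $word never inherits a value from an earlier match.
    my ($word) = $line =~ /^(\w+)/;
    return $word;
}

# Made-up sample lines: the second one starts with a tab, so it has no ID.
for my $line ("alice\tAlice A", "\t(no id)") {
    my $word = leading_word($line);
    print defined($word) ? "id: $word\n" : "no id\n";
}
# prints:
# id: alice
# no id
```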

*CUT*

--
Torvalds' goal for Linux is very simple: World Domination
 
 
 
 