deleting duplicates in array using references

 
 
billb
 
      07-30-2007
i have a multidimensional array, but i want to delete duplicate
entries based on the first element of each 'row'.

my array is:

my @array = ( [UK9004411, A140, B, 0.040] , [UK0030239, H7140, H,
0.030] , [UK0030239, S1393, M1, 0.030] , [UK0012821, H4030, H,
0.010] , [UK0012821, H4060, H, 0.010] );

and I want to end up with
( [UK9004411, A140, B, 0.040] , [UK0030239, H7140, H, 0.030] ,
[UK0012821, H4030, H, 0.010] )

(no real preference in which row is dropped...just on a first come
first served basis.)

i.e. take out the duplicate codes based on the first element of each
row $array[$row] -> [0]

i looked into splice() function based on the index but not sure this
is the best way or the syntax for this?

splice (@array , $row, 1); ?

thanks.

 
 
 
 
 
Paul Lalli
 
      07-30-2007
On Jul 30, 2:37 pm, billb <(E-Mail Removed)> wrote:
> i have a multidimensional array, but i want to delete duplicate
> entries based on the first element of each 'row'.


$ perldoc -q duplicate
Found in /opt2/Perl5_8_4/lib/perl5/5.8.4/pod/perlfaq4.pod
How can I remove duplicate elements from a list or array?

> my array is:
>
> my @array = ( [UK9004411, A140, B, 0.040] , [UK0030239, H7140, H,
> 0.030] , [UK0030239, S1393, M1, 0.030] , [UK0012821, H4030, H,
> 0.010] , [UK0012821, H4060, H, 0.010] );
>
> and I want to end up with
> ( [UK9004411, A140, B, 0.040] , [UK0030239, H7140, H, 0.030] ,
> [UK0012821, H4030, H, 0.010] )
>
> (no real preference in which row is dropped...just on a first come
> first served basis.)
>
> i.e. take out the duplicate codes based on the first element of each
> row $array[$row] -> [0]


$ perl -MData::Dumper -e'
my @array = (
    [UK9004411, A140, B, 0.040] ,
    [UK0030239, H7140, H, 0.030] ,
    [UK0030239, S1393, M1, 0.030] ,
    [UK0012821, H4030, H, 0.010] ,
    [UK0012821, H4060, H, 0.010] ,
);
my %seen;
my @nodups = grep { !$seen{$_->[0]}++ } @array;
print Dumper(\@nodups);
'
$VAR1 = [
          [
            'UK9004411',
            'A140',
            'B',
            '0.04'
          ],
          [
            'UK0030239',
            'H7140',
            'H',
            '0.03'
          ],
          [
            'UK0012821',
            'H4030',
            'H',
            '0.01'
          ]
        ];

> i looked into splice() function based on the index but not sure this
> is the best way or the syntax for this?
>
> splice (@array , $row, 1); ?


splice() is fine for removing the elements once you know which ones
you want to remove, but it's useless for actually finding which
elements to remove.
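
To make that division of labour concrete, here is a minimal sketch (not part of the original post) that modifies @array in place: a %seen hash finds the indices of the duplicate rows, and splice() then removes them. The name @dup_idx is made up for this illustration, and the bareword strings from the question are quoted so that the code runs under strict.

```perl
use strict;
use warnings;

# Sample rows from the question, with the string fields quoted.
my @array = (
    [ 'UK9004411', 'A140',  'B',  0.040 ],
    [ 'UK0030239', 'H7140', 'H',  0.030 ],
    [ 'UK0030239', 'S1393', 'M1', 0.030 ],
    [ 'UK0012821', 'H4030', 'H',  0.010 ],
    [ 'UK0012821', 'H4060', 'H',  0.010 ],
);

# Step 1: the hash identifies the indices of rows whose first field
# has already been seen (the finding part, which splice cannot do).
my %seen;
my @dup_idx = grep { $seen{ $array[$_][0] }++ } 0 .. $#array;

# Step 2: splice removes those rows. Work from the highest index down
# so that earlier removals do not shift the rows still to be removed.
splice( @array, $_, 1 ) for reverse @dup_idx;

print join( ', ', @$_ ), "\n" for @array;
```

Removing from the highest index downwards matters: going upwards, each splice would shift the positions of the rows that are still waiting to be removed. If building a new array is acceptable, the grep version above is simpler.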

Paul Lalli

 
 
 
 
 
billb
 
      07-30-2007
On 30 Jul, 19:46, Paul Lalli <(E-Mail Removed)> wrote:
> On Jul 30, 2:37 pm, billb <(E-Mail Removed)> wrote:
>
> > i have a multidimensional array, but i want to delete duplicate
> > entries based on the first element of each 'row'.

>
> $ perldoc -q duplicate
> Found in /opt2/Perl5_8_4/lib/perl5/5.8.4/pod/perlfaq4.pod
> How can I remove duplicate elements from a list or array?
>
> > my array is:

>
> > my @array = ( [UK9004411, A140, B, 0.040] , [UK0030239, H7140, H,
> > 0.030] , [UK0030239, S1393, M1, 0.030] , [UK0012821, H4030, H,
> > 0.010] , [UK0012821, H4060, H, 0.010] );

>
> > and I want to end up with
> > ( [UK9004411, A140, B, 0.040] , [UK0030239, H7140, H, 0.030] ,
> > [UK0012821, H4030, H, 0.010] )

>
> > (no real preference in which row is dropped...just on a first come
> > first served basis.)

>
> > i.e. take out the duplicate codes based on the first element of each
> > row $array[$row] -> [0]

>
> $ perl -MData::Dumper -e'
> my @array = (
> [UK9004411, A140, B, 0.040] ,
> [UK0030239, H7140, H, 0.030] ,
> [UK0030239, S1393, M1, 0.030] ,
> [UK0012821, H4030, H, 0.010] ,
> [UK0012821, H4060, H, 0.010] ,
> );
> my %seen;
> my @nodups = grep { !$seen{$_->[0]}++ } @array;
> print Dumper(\@nodups);
> '
> $VAR1 = [
> [
> 'UK9004411',
> 'A140',
> 'B',
> '0.04'
> ],
> [
> 'UK0030239',
> 'H7140',
> 'H',
> '0.03'
> ],
> [
> 'UK0012821',
> 'H4030',
> 'H',
> '0.01'
> ]
> ];
>
> > i looked into splice() function based on the index but not sure this
> > is the best way or the syntax for this?

>
> > splice (@array , $row, 1); ?

>
> splice() is fine for removing the elements once you know which ones
> you want to remove, but it's useless for actually finding which
> elements to remove.
>
> Paul Lalli


ah, very simple and very fast as well! I'll have to understand how
this is working. It uses a hash I see. Many thanks.

 
 
Paul Lalli
 
      07-31-2007
On Jul 30, 5:18 pm, billb <(E-Mail Removed)> wrote:
> On 30 Jul, 19:46, Paul Lalli <(E-Mail Removed)> wrote:
>
> > On Jul 30, 2:37 pm, billb <(E-Mail Removed)> wrote:

>
> > > i have a multidimensional array, but i want to delete duplicate
> > > entries based on the first element of each 'row'.


> > > my @array = ( [UK9004411, A140, B, 0.040] , [UK0030239, H7140, H,
> > > 0.030] , [UK0030239, S1393, M1, 0.030] , [UK0012821, H4030, H,
> > > 0.010] , [UK0012821, H4060, H, 0.010] );

>
> > > and I want to end up with
> > > ( [UK9004411, A140, B, 0.040] , [UK0030239, H7140, H, 0.030] ,
> > > [UK0012821, H4030, H, 0.010] )


> > my %seen;
> > my @nodups = grep { !$seen{$_->[0]}++ } @array;


> ah, very simple and very fast as well! I'll have to understand how
> this is working. It uses a hash I see.


It helps if you expand it out to remove all the "shortcuts"

my %seen;
my @nodups;
foreach my $elem (@array) {
    if (! $seen{$elem->[0]}) {
        push @nodups, $elem;
    }
    $seen{$elem->[0]}++;
}

So we're looping through the 2d array, and we check to see if the
first element of the current array reference has been "seen" yet. If
not, we add this array reference to our list of no duplicates. Then
we increment the number of times we've "seen" this element, so that if
the same element is seen again, we won't add it next time.
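
The same logic can be wrapped in a small sub so that the key column becomes a parameter. This is only a sketch: dedupe_by_column is a hypothetical name, not something from this thread or from any module, and the sample rows are the ones from the question with the strings quoted so the code runs under strict.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first row seen for each distinct value
# in column $col of the rows referenced by $rows.
sub dedupe_by_column {
    my ( $rows, $col ) = @_;
    my %seen;
    return grep { !$seen{ $_->[$col] }++ } @$rows;
}

my @array = (
    [ 'UK9004411', 'A140',  'B',  0.040 ],
    [ 'UK0030239', 'H7140', 'H',  0.030 ],
    [ 'UK0030239', 'S1393', 'M1', 0.030 ],
    [ 'UK0012821', 'H4030', 'H',  0.010 ],
    [ 'UK0012821', 'H4060', 'H',  0.010 ],
);

my @by_code = dedupe_by_column( \@array, 0 );    # unique on the first field
my @by_type = dedupe_by_column( \@array, 2 );    # unique on the third field

print "by code: ", join( ' ', map { $_->[1] } @by_code ), "\n";
print "by type: ", join( ' ', map { $_->[1] } @by_type ), "\n";
```

Because %seen is declared inside the sub, each call starts with an empty hash, so the two calls do not interfere with each other.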

The shortcuts:
* a foreach-if-push combination is equivalent to grep(). grep selects
only those elements from a list for which the if condition holds.
* in the grep, $_ is used to represent the current element of the
array (rather than $elem as in the above expansion)
* The ++ operator is applied to the same expression as when we're
checking the current value of $seen{$_->[0]}, because a post-fix ++
increments the value *after* returning that value. That is:
    $x = $foo++;
is equivalent to:
    $x = $foo;
    $foo++;

In contrast,
    $x = ++$foo;
is equivalent to
    $foo++;
    $x = $foo;
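
A short runnable illustration of the point about post-fix ++, added here as a sketch rather than taken from the original post. The variables $y, $first and $second are made up for the example; the hash key is one of the codes from the question.

```perl
use strict;
use warnings;

my $foo = 5;
my $x   = $foo++;    # $x gets the old value 5, then $foo becomes 6
my $y   = ++$foo;    # $foo becomes 7 first, then $y gets 7

my %seen;
my $first  = !$seen{UK0030239}++;    # no entry yet (false), so this is true
my $second = !$seen{UK0030239}++;    # entry is now 1 (true), so this is false

print "x=$x y=$y foo=$foo\n";
print "first occurrence kept: ",  ( $first  ? 'yes' : 'no' ), "\n";
print "second occurrence kept: ", ( $second ? 'yes' : 'no' ), "\n";
```

This is why the test and the increment can share one expression inside the grep: the test sees the value from before the increment.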

> Many thanks.


You're welcome

Paul Lalli

 
 
anno4000@radom.zrz.tu-berlin.de
 
      07-31-2007
billb <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> On 30 Jul, 19:46, Paul Lalli <(E-Mail Removed)> wrote:
> > On Jul 30, 2:37 pm, billb <(E-Mail Removed)> wrote:
> >
> > > i have a multidimensional array, but i want to delete duplicate
                                                            ^^^^^^^^^
[Paul's solution snipped]

> ah, very simple and very fast as well! I'll have to understand how
> this is working. It uses a hash I see. Many thanks.


On hearing the word "duplicate", like a Pavlovian dog a Perl programmer
goes "Hash, hash, hash...". The word "unique" has the same effect.

Anno
 
 
 
 
 