counting number of uniques in a multidimensional array column

Jack

 07-25-2006
Hi I have data in a multidim array and DONT want to create another
array representing just 1 column from this multidim array.. I want to
determine the number of uniques, I did this easily with just a regular
array (code below), does anyone know how to do this over just 1 column
of a multidim array (in other words, number of uniques across 1 column
of the multi dim defined as: multidim[0][0],multidim[1][0],
multidim[2][0].... etc)

sort @\$columnarray;
@out = grep(\$_ ne \$prev && (\$prev = \$_, 1), @\$columnarray);
if (\$#out == -1) { \$#out = 0; }
\$out = \$#out +1; # makes \$#out of 0 = 1 so it gets counted !
push @distinctcounts, \$out;

Thanks!
Jack

Paul Lalli

 07-25-2006
Jack wrote:
> Hi I have data in a multidim array and DONT want to create another
> array representing just 1 column from this multidim array..

Why?

> I want to
> determine the number of uniques, I did this easily with just a regular
> array (code below), does anyone know how to do this over just 1 column
> of a multidim array (in other words, number of uniques across 1 column
> of the multi dim defined as: multidim[0][0],multidim[1][0],
> multidim[2][0].... etc)
>
> sort @\$columnarray;

This does nothing at all. You are clearly not enabling warnings in

> @out = grep(\$_ ne \$prev && (\$prev = \$_, 1), @\$columnarray);
> if (\$#out == -1) { \$#out = 0; }

"if @out is empty, create one undefined element in @out"

Why? Under what circumstances do you believe @out could ever be empty
from the above code (assuming you had sorted @\$columnarray correctly)?
Well, I suppose it could be if your array had nothing but undefined
values in it. Is that the circumstance you were going for?

> \$out = \$#out +1; # makes \$#out of 0 = 1 so it gets counted !

Now you're assigning \$out to be the size of @out. Why not just use the
size of @out?

> push @distinctcounts, \$out;

The above code looks remarkably like the first answer to
perldoc -q duplicate

Have you seen the other answers?

Have you considered using map to generate a list of the first "columns"
of each array, and using that as your list rather than @{\$columnarray}
?

map { \$_->[0] } @\$columnarray

will give you that.

Paul Lalli

Jack

 07-25-2006

Paul Lalli wrote:
> Jack wrote:
> > Hi I have data in a multidim array and DONT want to create another
> > array representing just 1 column from this multidim array..

>
> Why?
>
> > I want to
> > determine the number of uniques, I did this easily with just a regular
> > array (code below), does anyone know how to do this over just 1 column
> > of a multidim array (in other words, number of uniques across 1 column
> > of the multi dim defined as: multidim[0][0],multidim[1][0],
> > multidim[2][0].... etc)
> >
> > sort @\$columnarray;

>
> This does nothing at all. You are clearly not enabling warnings in
>
> > @out = grep(\$_ ne \$prev && (\$prev = \$_, 1), @\$columnarray);
> > if (\$#out == -1) { \$#out = 0; }

>
> "if @out is empty, create one undefined element in @out"
>
> Why? Under what circumstances do you believe @out could ever be empty
> from the above code (assuming you had sorted @\$columnarray correctly)?
> Well, I suppose it could be if your array had nothing but undefined
> values in it. Is that the circumstance you were going for?
>
> > \$out = \$#out +1; # makes \$#out of 0 = 1 so it gets counted !

>
> Now you're assigning \$out to be the size of @out. Why not just use the
> size of @out?
>
> > push @distinctcounts, \$out;

>
> The above code looks remarkably like the first answer to
> perldoc -q duplicate
>
> Have you seen the other answers?
>
> Have you considered using map to generate a list of the first "columns"
> of each array, and using that as your list rather than @{\$columnarray}
> ?
>
> map { \$_->[0] } @\$columnarray
>
> will give you that.
>
> Paul Lalli

Just ignore the @\$ (this represents a variable) - assume the code is
this:
sort @columnarray;
@out = grep(\$_ ne \$prev && (\$prev = \$_, 1), @columnarray);
if (\$#out == -1) { \$#out = 0; }
print \$out;

Are you saying the above doesnt work ?? It works great on a single
array. Do you have a better code, if so, what is it ? Also, can you
multidim column with an actual example. Appreciate your response.
Thanks, Jack

Paul Lalli

 07-25-2006
Jack wrote:

> Just ignore the @\$ (this represents a variable)

There was no @\$ in your original snippet, so ignoring it is a no-op.
There was, however, @\$columnarray, which is a perfectly valid array. I
have no idea why you're saying to get rid of it now.

> - assume the code is this:
> sort @columnarray;

Once again, THIS LINE DOES NOTHING. You still have not bothered to
turn warnings on? Why? You are asking for help, help is being given
to you, and you're ignoring it. That's really very annoying.

> @out = grep(\$_ ne \$prev && (\$prev = \$_, 1), @columnarray);
> if (\$#out == -1) { \$#out = 0; }
> print \$out;
>
> Are you saying the above doesnt work ??

I did not say that at all. What part of my post implies that the code
doesn't work? I said that the first line of it does nothing at all,
and the messing about with \$#out is pointless.

> It works great on a single
> array. Do you have a better code, if so, what is it ?

Once again, I point you to the other responses in the FAQ that you
apparently saw to get this code:
perldoc -q duplicate
(Or did you never see that FAQ, and are instead just copy/pasting some
other code you found lying around somewhere?)
Once again, why are you ignoring what I've already told you to do,
preferring instead to believe that I'm just not bothering to help?

> Also, can you
> multidim column with an actual example

I *did*! Why are you ignoring my entire response?! I told you
precisely how to change your example to use a list of the first
columns, rather than a single array. The fact that you ignored that

Really doesn't appear that way.

Paul Lalli

Jack

 07-25-2006

Paul Lalli wrote:
> Jack wrote:
>
> > Just ignore the @\$ (this represents a variable)

>
> There was no @\$ in your original snippet, so ignoring it is a no-op.
> There was, however, @\$columnarray, which is a perfectly valid array. I
> have no idea why you're saying to get rid of it now.
>
> > - assume the code is this:
> > sort @columnarray;

>
> Once again, THIS LINE DOES NOTHING. You still have not bothered to
> turn warnings on? Why? You are asking for help, help is being given
> to you, and you're ignoring it. That's really very annoying.
>
> > @out = grep(\$_ ne \$prev && (\$prev = \$_, 1), @columnarray);
> > if (\$#out == -1) { \$#out = 0; }
> > print \$out;
> >
> > Are you saying the above doesnt work ??

>
> I did not say that at all. What part of my post implies that the code
> doesn't work? I said that the first line of it does nothing at all,
> and the messing about with \$#out is pointless.
>
> > It works great on a single
> > array. Do you have a better code, if so, what is it ?

>
> Once again, I point you to the other responses in the FAQ that you
> apparently saw to get this code:
> perldoc -q duplicate
> (Or did you never see that FAQ, and are instead just copy/pasting some
> other code you found lying around somewhere?)
> Once again, why are you ignoring what I've already told you to do,
> preferring instead to believe that I'm just not bothering to help?
>
> > Also, can you
> > multidim column with an actual example

>
> I *did*! Why are you ignoring my entire response?! I told you
> precisely how to change your example to use a list of the first
> columns, rather than a single array. The fact that you ignored that
>

>
> Really doesn't appear that way.
>
> Paul Lalli

Forgive me if I am limited to some degree. I am just asking if someone
can provide some sample code that works takes \$multidimarray[1][0],
\$multidimarray[2][0], (a column) and produces a distinct count...

I dont know how to take your suggestion of
map { \$_->[0] } @columnarray
and convert that into a solution for that counts the distinct entires
for the first column in a multidimensional array ..

Would you consider elaborating, or perhaps someone who is willing to
help/share.

Thank you,
Jack

xhoster@gmail.com

 07-25-2006
"Jack" <(E-Mail Removed)> wrote:
> Hi I have data in a multidim array and DONT want to create another
> array representing just 1 column from this multidim array.. I want to
> determine the number of uniques, I did this easily with just a regular
> array (code below),

I don't know if the code below actually does work, but I will assume it
does.

> does anyone know how to do this over just 1 column
> of a multidim array (in other words, number of uniques across 1 column
> of the multi dim defined as: multidim[0][0],multidim[1][0],
> multidim[2][0].... etc)

my \$col_number=0; # or whatever column you want
my \$columnarray=[map \$_->[\$col_number], @multidim];

Now proceed as before with $columnarray.
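
As a rough sketch of what "as before" could look like end to end: the sample data in @multidim is made up for illustration, and a hash is swapped in for the sort/grep approach from the original post, so treat it as one possible way rather than the only one.

```perl
use strict;
use warnings;

# Hypothetical sample data; each inner array reference is one row.
my @multidim = ( [ 'a', 1 ], [ 'b', 2 ], [ 'a', 3 ], [ 'c', 4 ] );

my $col_number  = 0;    # or whatever column you want
my $columnarray = [ map { $_->[$col_number] } @multidim ];

# Use the column values as hash keys; duplicate values collapse into one key.
my %seen;
$seen{$_}++ for @$columnarray;

# keys in scalar context returns the number of keys.
my $distinct = keys %seen;
print "Distinct values in column $col_number: $distinct\n";
```

With the sample rows above, column 0 holds a, b, a, c, so the count printed is 3.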

Xho

Ted Zlatanov

 07-25-2006
On 25 Jul 2006, (E-Mail Removed) wrote:

> Forgive me if I am limited to some degree. I am just asking if someone
> can provide some sample code that works takes \$multidimarray[1][0],
> \$multidimarray[2][0], (a column) and produces a distinct count...
>
> I dont know how to take your suggestion of
> map { \$_->[0] } @columnarray
> and convert that into a solution for that counts the distinct entires
> for the first column in a multidimensional array ..

useful, I'm just restating it and elaborating. Don't feel bad about
missing things here and there, everyone has to start somewhere.

That map call will return the first (0) column of the array as a list.

Your original question was how to find unique elements in a column.

You posted:

> sort @\$columnarray;
> @out = grep(\$_ ne \$prev && (\$prev = \$_, 1), @\$columnarray);
> if (\$#out == -1) { \$#out = 0; }
> \$out = \$#out +1; # makes \$#out of 0 = 1 so it gets counted !
> push @distinctcounts, \$out;

The first line does nothing at all. Paul mentioned that too. Use
warnings and strict mode, if possible, to avoid such code. Sort
*returns* the sorted list, it doesn't modify in place.
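
A small sketch of the difference, using a made-up @values list (the names here are only for illustration):

```perl
use strict;
use warnings;

my @values = ( 'pear', 'apple', 'fig' );

# In void context the sorted list is simply thrown away, and with
# warnings enabled Perl complains about it, so it is left commented out:
# sort @values;

# Assigning the result keeps the sorted list; @values itself is unchanged.
my @sorted = sort @values;

print "original: @values\n";
print "sorted:   @sorted\n";
```

To sort "in place" you would assign back to the same array: @values = sort @values;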

In addition your 'uniques' code is not very good. It may work in some
cases, but really you should use a hash. Look at 'perldoc -q
duplicates' and 'perldoc perldata' to get started. Actually all of
the perldoc info is good

Here's a (very simple) function to give you the unique items from a
list you pass:

sub uniques
{
    my %unique = ();
    $unique{$_}++ foreach @_;
    return keys %unique;
}

Now use it like this:

my @columnarray = ( [1,2,3], [1,2,3], [4,5,6], [7,8,9], );

foreach my $column (1 .. scalar @{$columnarray[0]})
{
    print "Unique elements in column $column: ";
    print join ', ',
          uniques(map { $_->[$column-1] }
                  @columnarray
                 );
    print "\n";
}

I formatted this to be easy to understand, and I tested it with the
data above under

use warnings;
use strict;

and it worked correctly. Please learn from the code posted above - it
shows many useful techniques.
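
If only the number of uniques is needed rather than the values themselves, something along these lines might do. It is a sketch that repeats the uniques function and sample data so it runs on its own, and it borrows the name @distinctcounts from the original post:

```perl
use strict;
use warnings;

sub uniques {
    my %unique;
    $unique{$_}++ foreach @_;
    return keys %unique;
}

my @columnarray = ( [ 1, 2, 3 ], [ 1, 2, 3 ], [ 4, 5, 6 ], [ 7, 8, 9 ] );

my @distinctcounts;
foreach my $column ( 0 .. $#{ $columnarray[0] } ) {
    # Collect the unique values into an array, then take its size.
    my @unique = uniques( map { $_->[$column] } @columnarray );
    push @distinctcounts, scalar @unique;
}

print "Distinct counts per column: @distinctcounts\n";
```

For the data above every column has three distinct values, so this prints 3 3 3.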

Ted

Paul Lalli

 07-25-2006
Jack wrote:
> Paul Lalli wrote:
> > > @out = grep(\$_ ne \$prev && (\$prev = \$_, 1), @columnarray);

> > > Also, can you
> > > multidim column with an actual example

> >
> > I *did*! Why are you ignoring my entire response?! I told you
> > precisely how to change your example to use a list of the first
> > columns, rather than a single array. The fact that you ignored that
> >

> Forgive me if I am limited to some degree.

Being new to Perl is not something that requires forgiveness. Being
unwilling to put forth effort of your own, and only accepting solutions
that are spoonfed to you, is not worthy of forgiveness.

> I am just asking if someone can provide some sample code

I know exactly what you're asking. I have answered it 3 times now.
The answer is "No, I will not write code for you. I will, however,
give you all the information you need to do it yourself." If that's
not good enough for you, I strongly suggest you hire a consultant.

> that works takes \$multidimarray[1][0],
> \$multidimarray[2][0], (a column) and produces a distinct count...
>
> I dont know how to take your suggestion of
> map { \$_->[0] } @columnarray
> and convert that into a solution for that counts the distinct entires
> for the first column in a multidimensional array ..

I told you to take that expression, and operate on that, rather than on
@columnarray itself. What part of that is confusing to you?

Take that expression right there, and put that where you currently have
'@columnarray' in the first quoted line of this message.

> Would you consider elaborating, or perhaps someone who is willing to
> help/share.

Implying that I am *not* willing to help or share? You have a very
bizarre definition of "help".

*PLONK*

Paul Lalli

Mumia W.

 07-25-2006
On 07/25/2006 01:54 PM, Jack wrote:
> Paul Lalli wrote:
>> [ snipped ]

> [...]
> I dont know how to take your suggestion of
> map { \$_->[0] } @columnarray
> and convert that into a solution for that counts the distinct entires
> for the first column in a multidimensional array ..
>
> Would you consider elaborating, or perhaps someone who is willing to
> help/share.
>
> Thank you,
> Jack
>

Paul Lalli gave you half of the answer. You're supposed to
figure out the other half. The other half is storing the data
in a hash where the keys are the column data returned from the
map, and the values are incremented once for each entry in the
column.

Hashes have a "magical" quality that makes their keys unique.
Using a hash, you can count the number of unique items in an
array, because each key in a hash appears only once.

1: use Data::Dumper;
2: my @temps = (30, 38, 26, 38, 39);
3: my %hash;
4: for my \$tp (@temps) { \$hash{\$tp} += 1 }
5: print Dumper(\%hash);

Line 4 increments a hash value each time it's found[0] in the
array. Notice that 38 only appears once in the hash, despite
the fact that it appears twice in @temps.
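
To get from the hash to an actual count, one option (sketched here with the same @temps data; the variable name $distinct is just for illustration) is to ask for the number of keys:

```perl
use strict;
use warnings;

my @temps = ( 30, 38, 26, 38, 39 );
my %hash;
$hash{$_} += 1 for @temps;

# keys in scalar context gives the number of distinct keys.
my $distinct = keys %hash;

print "Distinct temperatures: $distinct\n";
print "38 was seen $hash{38} times\n";
```

Here there are 4 distinct temperatures, and the value stored under 38 is 2.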

:-O UNTESTED CODE :-O

--
[0] Simplified language. Untrue.

Jack

 07-25-2006

Ted Zlatanov wrote:
> On 25 Jul 2006, (E-Mail Removed) wrote:
>
> > Forgive me if I am limited to some degree. I am just asking if someone
> > can provide some sample code that works takes \$multidimarray[1][0],
> > \$multidimarray[2][0], (a column) and produces a distinct count...
> >
> > I dont know how to take your suggestion of
> > map { \$_->[0] } @columnarray
> > and convert that into a solution for that counts the distinct entires
> > for the first column in a multidimensional array ..

>
> useful, I'm just restating it and elaborating. Don't feel bad about
> missing things here and there, everyone has to start somewhere.
>
> That map call will return the first (0) column of the array as a list.
>
> Your original question was how to find unique elements in a column.
>
> You posted:
>
> > sort @\$columnarray;
> > @out = grep(\$_ ne \$prev && (\$prev = \$_, 1), @\$columnarray);
> > if (\$#out == -1) { \$#out = 0; }
> > \$out = \$#out +1; # makes \$#out of 0 = 1 so it gets counted !
> > push @distinctcounts, \$out;

>
> The first line does nothing at all. Paul mentioned that too. Use
> warnings and strict mode, if possible, to avoid such code. Sort
> *returns* the sorted list, it doesn't modify in place.
>
> In addition your 'uniques' code is not very good. It may work in some
> cases, but really you should use a hash. Look at 'perldoc -q
> duplicates' and 'perldoc perldata' to get started. Actually all of
> the perldoc info is good
>
> Here's a (very simple) function to give you the unique items from a
> list you pass:
>
> sub uniques
> {
> my %unique = ();
> \$unique{\$_}++ foreach @_;
> return keys %unique;
> }
>
> Now use it like this:
>
> my @columnarray = ( [1,2,3], [1,2,3], [4,5,6], [7,8,9], );
>
> foreach my \$column (1 .. scalar @{\$columnarray[0]})
> {
> print "Unique elements in column \$column: ";
> print join ', ',
> uniques(map { \$_->[\$column-1] }
> @columnarray
> );
> print "\n";
> }
>
> I formatted this to be easy to understand, and I tested it with the
> data above under
>
> use warnings;
> use strict;
>
> and it worked correctly. Please learn from the code posted above - it
> shows many useful techniques.
>
> Ted

Ted, great job that works killer... can you tell me, I want to exclude
from the counting any null values, I tried adding this without
success..any reply would be appreciated..thanks, Jack

sub uniques
{
my %unique = ();
if (@_ != /^\z/) { \$unique{\$_}++ foreach @_ } ;
return keys %unique;
}