How do you sort a 2D array with column headers?

 
 
Dennis@NoSpam.com
 
      06-28-2003
I have a numerical array consisting of 5000 rows and 30 columns. The first row
consists of 30 ascii column labels for example L1,L2.....L30. I would like to
sort the column with the header L5 in ascending order leaving the header labels
intact on the first row.

I'm familiar with code

@array =sort { $a->[1] <=> $b->[1]} @array;

and I have read perldoc -f sort.

But the code above doesn't allow me to sort the array by column labels.

How would I do that?


Any help would be appreciated.

Dennis
 
 
Greg Bacon
 
      06-28-2003
In article <(E-Mail Removed)>,
<(E-Mail Removed)> wrote:

: I have a numerical array consisting of 5000 rows and 30 columns. The
: first row consists of 30 ascii column labels for example
: L1,L2.....L30. I would like to sort the column with the header L5 in
: ascending order leaving the header labels intact on the first row.

Assuming @array is an array of rows, you could use something similar
to the code below.

[14:53] ant% cat try
#! /usr/local/bin/perl

use warnings;
use strict;

sub find_column_index {
    my $a = shift;
    my $col = shift;

    my $header = $a->[0];
    my $colidx = 0;
    for (@$header) {
        last if $_ eq $col;
        ++$colidx;
    }

    $colidx >= @$header ? () : $colidx;
}

sub sort_by_column {
    my $m = shift;
    my $col = shift;

    return unless ref($m) && @$m && $col;

    my $colidx = find_column_index $m, $col;
    return unless defined $colidx;

    @{$m}[1..$#$m] = sort { $a->[$colidx] <=> $b->[$colidx] }
                     @{$m}[1..$#$m];
}

my @array = (
    [qw/ L1 L2 L3 L4 L5 /],
    [9, 8, 7, 6, 5],
    [1, 2, 3, 4, 5],
    [0, 0, 0, 0, 0],
);

sort_by_column \@array, 'L3';

for (@array) {
    printf join(" ", ("%5s") x @$_) . "\n", @$_;
}
[14:53] ant% ./try
   L1    L2    L3    L4    L5
    0     0     0     0     0
    1     2     3     4     5
    9     8     7     6     5

Hope this helps,
Greg
--
Government, even in its best state, is but a necessary evil; in its worst
state, an intolerable one. Government, like dress, is the badge of lost
innocence; the palaces of kings are built upon the ruins of the bowers of
paradise. -- Thomas Paine, "Common Sense"
 
 
Dennis@NoSpam.com
 
      06-28-2003
(E-Mail Removed) (Greg Bacon) wrote:

>In article <(E-Mail Removed)>,
> <(E-Mail Removed)> wrote:
>
>: I have a numerical array consisting of 5000 rows and 30 columns. The
>: first row consists of 30 ascii column labels for example
>: L1,L2.....L30. I would like to sort the column with the header L5 in
>: ascending order leaving the header labels intact on the first row.
>
>Assuming @array is an array of rows, you could use something similar
>to the code below.


<snipped code for it's shown in above post>

>Hope this helps,
>Greg


Thank you Greg!

A lot of neat code. Some of the perl syntax is new to me but I'll get to work
with my Perl books and learn. Thanks again.

Dennis
 
 
Greg Bacon
 
      06-29-2003
In article <(E-Mail Removed)>,
<(E-Mail Removed)> wrote:

: [...]
:
: A lot of neat code. Some of the perl syntax is new to me but I'll get
: to work with my Perl books and learn. Thanks again.

Anything in particular that gave you trouble? This is a discussion
group, after all. If you'll permit a guess, reading the perlref,
perllol, and perldsc manpages will help your understanding.

Greg
--
What has transformed the limited war between royal armies into total war,
the clash between peoples, is not technicalities of military art, but the
substitution of the welfare state for the laissez-faire state.
-- Ludwig von Mises, *Human Action*
 
 
Dennis@NoSpam.com
 
      06-29-2003
(E-Mail Removed) (Greg Bacon) wrote:

>In article <(E-Mail Removed)>,
> <(E-Mail Removed)> wrote:
>
>: [...]
>:
>: A lot of neat code. Some of the perl syntax is new to me but I'll get
>: to work with my Perl books and learn. Thanks again.
>
>Anything in particular that gave you trouble? This is a discussion
>group, after all. If you'll permit a guess, reading the perlref,
>perllol, and perldsc manpages will help your understanding.


Greg,

Well I read your above perl manpages and the subroutine section of "Perl
Cookbook" by Tom Christiansen & N. Torkington.

Below is the code I don't understand:

First in the subroutine sort_by_column

sub sort_by_column {
    my $m = shift;
    my $col = shift;

    return unless ref($m) && @$m && $col;

    my $colidx = find_column_index $m, $col;
    return unless defined $colidx;

    @{$m}[1..$#$m] = sort { $a->[$colidx] <=> $b->[$colidx] }
                     @{$m}[1..$#$m];
}

sort_by_column \@array, 'L3';

I don't understand the shift operator and how it moves \@array (a reference to
an array) and 'L3' into $m and $col. I know the input to a subroutine are the
elements of @_ but what does shift mean?

The statement return unless ref($m) && @$m && $col; tests to see that the
reference $m and value $col exist but what's @$m mean? An array whose pointer
reference starts at $m?

Also I'm not sure what the expression @{$m}[1..$#$m] means. obviously a
pointer $m to an array but [1..$#$m]? .

Next I don't understand some of the code in the subroutine find_column_index:

sub find_column_index {
    my $a = shift;
    my $col = shift;

    my $header = $a->[0];
    my $colidx = 0;
    for (@$header) {
        last if $_ eq $col;
        ++$colidx;
    }

    $colidx >= @$header ? () : $colidx;
}

I take it that "my $header = $a->[0];" means store the pointer reference of the
0'th element into $header? "for (@$header)" means for each element of the input
array do the below? I didn't know "last" would end the loop after the last
statement if the "if" statement was true. Neat. I take it that when you say
"for(@$header)" each element of the array is stored into $_ one by one in the
for loop?

Last what does $colidx >= @$header ? () : $colidx; mean? If the array element
number of 'L3' is greater than or equal to ...then I get lost.

Thanks for your help, I'm learning a lot!

Dennis

 
 
Greg Bacon
 
      06-29-2003
In article <(E-Mail Removed)>,
<(E-Mail Removed)> wrote:

: [...]
:
: First in the subroutine sort_by_column
:
: sub sort_by_column {
:     my $m = shift;
:     my $col = shift;
:
:     return unless ref($m) && @$m && $col;
:
:     my $colidx = find_column_index $m, $col;
:     return unless defined $colidx;
:
:     @{$m}[1..$#$m] = sort { $a->[$colidx] <=> $b->[$colidx] }
:                      @{$m}[1..$#$m];
: }
:
: sort_by_column \@array, 'L3';
:
: I don't understand the shift operator and how it moves \@array (a
: reference to an array) and 'L3' into $m and $col. I know the input to
: a subroutine are the elements of @_ but what does shift mean?

From the perlfunc documentation on the shift operator:

Shifts the first value of the array off and returns it,
shortening the array by 1 and moving everything down. If
there are no elements in the array, returns the undefined
value. If ARRAY is omitted, shifts the "@_" array within
the lexical scope of subroutines . . .

The shifts are plucking off the subroutine's arguments. To see shift
in action, consider the following:

[16:15] ant% cat try
#! /usr/local/bin/perl

$" = "]["; # separator for interpolating arrays

@a = ('apples', 'oranges', 'bananas');
print "[@a]\n";

$first = shift @a;
print "\$first = [$first], \@a = [@a]\n";
[16:15] ant% ./try
[apples][oranges][bananas]
$first = [apples], @a = [oranges][bananas]
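
Inside a subroutine, a bare shift with no array named works on @_, which is how sort_by_column picks up its two arguments one at a time. A minimal sketch of that behaviour (show_args is a made-up name, not part of the code above):

```perl
use strict;
use warnings;

# show_args is a hypothetical sub used only for this illustration.
sub show_args {
    my $first  = shift;    # same as shift(@_): removes and returns $_[0]
    my $second = shift;    # @_ is one element shorter now, so this takes the next one
    my $left   = @_;       # number of arguments still sitting in @_
    return join ", ", ref($first), $second, $left;
}

my @array = ([qw/ L1 L2 /], [3, 4]);
print show_args(\@array, 'L3', 'extra'), "\n";    # ARRAY, L3, 1
```

The third argument is never shifted, so it is simply left behind in @_.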

: The statement return unless ref($m) && @$m && $col; tests to see that
: the reference $m and value $col exist but what's @$m mean? An array
: whose pointer reference starts at $m?

Yes, but your terminology could stand polishing. (If I seem picky, I'm
only trying to help you learn.) In Perl parlance, we'd say that we're
making sure -- albeit indirectly -- that $m is an array reference, that
$m's thingy (Perl's pedestrian way of saying 'referent', i.e., the array
to which $m refers) has at least one element, and that we have a column
label to look for. See the perlref manpage.

We might have written the following

return unless ref($m) && @$m && $col;

to be more chatty as

unless ($m && ref($m) eq 'ARRAY') {
    warn "'$m' is not an array reference";
    return;
}

unless (@$m > 0) {
    warn "no rows!";
    return;
}

if (!defined($col) || $col eq '') {
    warn "no column label!";
    return;
}

I wrote the check the way I did because sort_by_column operates
in-place, so, at worst, I'd just leave the data alone. One line was
also a little more appealing than twelve.
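
To see what each piece of the one-line check evaluates to, here is a small sketch. The variable names and sample data are assumptions chosen for illustration only:

```perl
use strict;
use warnings;

my $m     = [ [qw/ L1 L2 /], [3, 4], [1, 2] ];
my $empty = [];

my $kind  = ref($m);                # 'ARRAY': ref() names what $m refers to
my $plain = ref('plain string');    # '' (false): not a reference at all
my $rows  = @$m;                    # 3: an array in scalar context is its element count

print "$kind\n";
print "rows: $rows\n";
print "not a reference\n" if $plain eq '';
print "no rows\n" unless @$empty;   # an empty array is false in boolean context
```

Note that the short form only checks that $m is some kind of reference, while the chatty form insists on 'ARRAY'.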

There are also lots of hairy philosophical arguments surrounding this
issue such as "defensive programming is bad style because it hides
bugs", but let's not get into all that.

: Also I'm not sure what the expression @{$m}[1..$#$m] means.
: obviously a pointer $m to an array but [1..$#$m]? .

Remember that Perl doesn't have pointers but references.

Perl's .. operator can produce ranges, e.g.,

% perl -le 'print 0..9'
0123456789

Recall from the perldata manpage that $#ARRAY gives the index of the
last element of @ARRAY. For example

% perl -le '@a = (1..10); print $#a'
9

(I might be setting a bad example. mjd, rightly IMHO, says using
$#ARRAY is a red flag[*]. The usage is correct in this case, but
do what I say, not what I do.)
[*] http://groups.google.com/groups?selm...%40news.op.net

The perlref manpage shows how to dereference arrays, and $#$m yields the
index of the last element in $m's thingy. @{$m}[...] takes a slice of
$m's thingy, i.e., a sublist -- see the perldata manpage.
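
A short sketch of the last-index and slice syntax applied through a reference; @letters and $ref are assumed names for illustration:

```perl
use strict;
use warnings;

my @letters = qw/ a b c d e /;
my $ref     = \@letters;

my $last_index = $#$ref;                  # index of the last element, same as $#letters
my @tail       = @{$ref}[1 .. $#$ref];    # slice: elements 1 through the last
my @picked     = @$ref[0, 2];             # braces are optional around a simple scalar

print "$last_index\n";    # 4
print "@tail\n";          # b c d e
print "@picked\n";        # a c
```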

Don't get bogged down in the low-level details. Think about what we're
trying to do: we want to leave the first row alone (the header) and
sort everything else, i.e., all the rows from index 1 up to the last
index in $m's thingy. We're operating in-place, so we put the rows back
where we got them:

@{$m}[1..$#$m] = sort { $a->[$colidx] <=> $b->[$colidx] }
                 @{$m}[1..$#$m];
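
If the slice assignment is hard to read, the same "leave row 0 alone, sort the rest" idea can be spelled out with shift and unshift. This is only an alternative sketch, not the code from the post; sort_below_header is a hypothetical name, and it takes a numeric column index rather than a label:

```perl
use strict;
use warnings;

# Hypothetical alternative: take the header off, sort what is left,
# then put the header back on the front.
sub sort_below_header {
    my ($m, $colidx) = @_;
    my $header = shift @$m;                                # remove row 0
    @$m = sort { $a->[$colidx] <=> $b->[$colidx] } @$m;    # sort the data rows
    unshift @$m, $header;                                  # restore the header
    return;
}

my @array = (
    [qw/ L1 L2 L3 /],
    [9, 8, 7],
    [1, 2, 3],
    [0, 0, 0],
);
sort_below_header(\@array, 2);
print "@$_\n" for @array;
```

Both versions modify the caller's array in place, because $m refers to the original array rather than a copy.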

: Next I don't understand some of the code in the subroutine
: find_column_index:
:
: sub find_column_index {
:     my $a = shift;
:     my $col = shift;
:
:     my $header = $a->[0];
:     my $colidx = 0;
:     for (@$header) {
:         last if $_ eq $col;
:         ++$colidx;
:     }
:
:     $colidx >= @$header ? () : $colidx;
: }
:
: I take it that "my $header = $a->[0];" means store the pointer
: reference of the 0'th element into $header?

Yes, we're storing a copy of a reference to the array of column headers.
I used a separate variable to show the code's intent.

: "for (@$header)" means for
: each element of the input array do the below? I didn't know "last"
: would end the loop after the last statement if the "if" statement was
: true. Neat.

Yes. Perl's last operator is like break in C but cooler.

: I take it that when you say "for(@$header)" each element
: of the array is stored into $_ one by one in the for loop?

Yes. See the perlsyn manpage.

: Last what does $colidx >= @$header ? () : $colidx; mean? If the array
: element number of 'L3' is greater than or equal to ...then I get lost.

That's the ternary operator as in C, sometimes called an "inline if".
See the perlsyn manpage.

That code is checking whether we found a match. If the condition is
true (no match), then $colidx will be at least as large as the number of
elements in @$header, and we return () or nothing. Otherwise (what's
after the colon), we send back the desired header's index.
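
What "return () or nothing" means depends on how the sub is called, so here is a small sketch. index_or_nothing is a hypothetical stand-in that mirrors only the last line of find_column_index:

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the final expression of find_column_index.
sub index_or_nothing {
    my ($colidx, $count) = @_;
    return $colidx >= $count ? () : $colidx;
}

my $found   = index_or_nothing(0, 5);    # 0: a valid index that happens to be false
my $missing = index_or_nothing(5, 5);    # undef: () evaluated in scalar context
my @list    = index_or_nothing(5, 5);    # empty list in list context

print defined $found   ? "found at $found\n" : "not found\n";
print defined $missing ? "found\n"           : "not found\n";
print scalar(@list), "\n";
```

Because index 0 is a legitimate answer, the caller has to test with defined, as sort_by_column does, rather than with plain truth.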

Hope this helps,
Greg
--
When I was a boy of fourteen, my father was so ignorant that I could hardly
stand to have the old man around. But when I got to be twenty-one, I was
astonished at how much he'd learned in seven years.
-- Mark Twain
 
 
Mark Jason Dominus
 
      06-30-2003
In article <(E-Mail Removed)>,
Greg Bacon <(E-Mail Removed)> wrote:
>(I might be setting a bad example. mjd, rightly IMHO, says using
>$#ARRAY is a red flag[*]. The usage is correct in this case, but
>do what I say, not what I do.
>[*] http://groups.google.com/groups?selm...%40news.op.net


In that article, I said I thought I was going to add $#array as a red
flag. I did add it to the class, but I did not accord it 'red flag'
status. A 'red flag' is something that is almost always wrong. After
doing a study, I concluded that although $#array is often wrong, it is
not 'almost always wrong'.

The details of the study are at
http://perl.plover.com/yak/flags/dollar-pound/. Here is the short
version. $#array is commonly used for five things:

1. Generating a list of indices for an array. (Your example above is
one of these; it is @{$m}[1..$#$m].)

2. The upper bound of a C-style 'for' loop, as

for ($i=0; $i <= $#array; $i++) {
    do something with $array[$i];
}

3. As a boundary check to see if a value is in the proper index range
for an array. (2) could be considered a special case of this.
Here's an example:

if ($last_item >= $#list) {
    $Init_Disp_Limits->();
}

4. To pre-extend an array, as with

$#array = $EXPECTED_NUMBER_OF_ITEMS;

5. To access the last element of an array, as with $last = $array[$#array].

In my judgement, all of the class (2) and (5) uses, and many of the
class (3) uses, would have been better written some other way. For
example, I think the example in (5) is obviously better as $last = $array[-1].
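
As a rough sketch of what "some other way" might look like: the class (5) rewrite is the one named above, while the two loop forms for class (2) are only my guess at common alternatives, not something taken from the study:

```perl
use strict;
use warnings;

my @array = qw/ red green blue /;

# Class (5): a negative index counts from the end of the array.
my $last = $array[-1];

# Class (2), when only the elements are needed: iterate over them directly.
my @seen;
for my $item (@array) {
    push @seen, uc $item;
}

# Class (2), when the index is needed too: loop over a range of indices.
# This still mentions $#array, but as a class (1) index list.
my @numbered;
for my $i (0 .. $#array) {
    push @numbered, "$i=$array[$i]";
}

print "$last\n";        # blue
print "@seen\n";        # RED GREEN BLUE
print "@numbered\n";    # 0=red 1=green 2=blue
```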

Overall, about 20% of the uses of $#array would have been better off
some other way. Class (1) did not seem to be in this 20%. I don't
know any better way to write

%hash = map { $array[$_] => $_ } 0 .. $#array;

without the $#array, for example.
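
For readers who have not met that idiom, a small sketch of what it builds; the sample labels and the name %index_of are assumptions for illustration:

```perl
use strict;
use warnings;

my @array = qw/ L1 L2 L3 L4 L5 /;

# Map each element to its position so later lookups need no loop.
my %index_of = map { $array[$_] => $_ } 0 .. $#array;

print $index_of{L3}, "\n";                                # 2
print exists $index_of{L9} ? "present\n" : "absent\n";    # absent
```

With such a hash, finding a column label's index is a single lookup instead of a scan over the header row.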



 
 
Dennis@NoSpam.com
 
      06-30-2003
(E-Mail Removed) (Greg Bacon) wrote:

>In article <(E-Mail Removed)>,
> <(E-Mail Removed)> wrote:
>

<snip really great code and explanations for all my beginner questions>
>
>Hope this helps,
>Greg


Thanks Greg for the really great Perl code and explanations on how it works. I
really appreciate the time and effort you put in to teach me and all the others
who are reading these posts. I have really learned a lot...much more than the
Perl books I've been reading.

Thanks again.

Dennis

 
 
Greg Bacon
 
      06-30-2003
In article <20030630132321.238$(E-Mail Removed)>,
<(E-Mail Removed)> wrote:

: (E-Mail Removed) (Mark Jason Dominus) wrote:
:
: > The details of the study are at
: > http://perl.plover.com/yak/flags/dollar-pound/. Here is the short
: > version. $#array is commonly used for five things:
: >
: > 1. Generating a list of indices for an array. (Your example above is
: > one of these; it is @{$m}[1..$#$m].)
:
: I wish the ".." operator, when occurring in a slice, were sufficiently
: magical to allow @{$m}[1..-1] to replace the above.

Amen! Almost anything other than big ugly $#{...} dereferences would
be nice.
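
For anyone wondering why 1..-1 does not already work: the range operator only counts upward, so the range is empty before the slice ever sees it. A small sketch, with assumed sample data:

```perl
use strict;
use warnings;

my $m = [ 'header', 'row1', 'row2', 'row3' ];

my @range = (1 .. -1);           # empty: the left end is already past the right end
my @none  = @{$m}[1 .. -1];      # so this slice is empty too
my @rest  = @{$m}[1 .. $#$m];    # the form used in sort_by_column

# When only a copy of the tail is wanted, list assignment can skip row 0.
my (undef, @copy) = @$m;

print scalar(@range), "\n";    # 0
print scalar(@none), "\n";     # 0
print "@rest\n";               # row1 row2 row3
print "@copy\n";               # row1 row2 row3
```

The list-assignment form cannot be assigned back through, so it is no substitute for the slice on the left-hand side of the in-place sort.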

: > 2. The upper bound of a C-style 'for' loop, as
: >
: > for ($i=0; $i <= $#array; $i++) {
: > do something with $array[$i];
: > }
:
: I use this very frequently when I have parallel arrays. Of course,
: that might not exactly fit in your criteria for inclusion in this
: category. I also use this when I want to change the length of @array
: during the loop.

When I find myself constructing parallel arrays, I almost always merge
them into arrays of either hashes or arrays.
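
A minimal sketch of that kind of merge, assuming two made-up parallel arrays @name and @age:

```perl
use strict;
use warnings;

# Parallel arrays: element $i of each array describes the same record.
my @name = qw/ alice bob carol /;
my @age  = (31, 25, 40);

# Merged into one array of hashes, one hash per record.
# The leading + tells Perl the inner braces are a hash, not a block.
my @person = map { +{ name => $name[$_], age => $age[$_] } } 0 .. $#name;

# Sorting now moves whole records, so the fields cannot drift out of step.
my @sorted = sort { $a->{age} <=> $b->{age} } @person;
print "$_->{name} $_->{age}\n" for @sorted;
```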

Greg
--
Laws that forbid the carrying of arms ... make things worse for the assaulted
and better for the assailants; they serve rather to encourage than to prevent
homicides, for an unarmed man may be attacked with greater confidence than an
armed man. -- Thomas Jefferson
 
 
ctcgag@hotmail.com
 
      07-01-2003
Greg Bacon <(E-Mail Removed)> wrote:
>
> : > 2. The upper bound of a C-style 'for' loop, as
> : >
> : > for ($i=0; $i <= $#array; $i++) {
> : > do something with $array[$i];
> : > }
> :
> : I use this very frequently when I have parallel arrays. Of course,
> : that might not exactly fit in your criteria for inclusion in this
> : category. I also use this when I want to change the length of @array
> : during the loop.
>
> When I find myself constructing parallel arrays, I almost always merge
> them into arrays of either hashes or arrays.


I do that sometimes, but I tend to not do it as much as I could for three
reasons. $age[$id] and $sex[$id] take up much, much less room than
$person[$id]{age} and $person[$id]{sex} if there are a lot of entries. The
first way gives me compile time errors if I fat-finger "age" or "sex".
With the first I can easily pass one compartment to general functions
without using map: median(\@age) rather than
median([map $_->{age}, @person]). Of course, the converse is that the
parallel structure makes it harder to pass the whole structure around, but
if I have to do much of that I tend to encompass it into a class anyway.
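
To make the median comparison concrete, here is a sketch. The post names median without defining it, so this implementation is an assumption, as is the sample data:

```perl
use strict;
use warnings;

# median is a hypothetical general-purpose function taking an array reference.
sub median {
    my $values = shift;
    my @sorted = sort { $a <=> $b } @$values;
    my $n      = @sorted;
    return unless $n;
    my $mid = int($n / 2);
    return $n % 2 ? $sorted[$mid] : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

my @age    = (31, 25, 40);
my @person = map { +{ age => $_ } } @age;

print median(\@age), "\n";                            # parallel array: pass it directly
print median([ map { $_->{age} } @person ]), "\n";    # array of hashes: extract first

# Under "use strict", a misspelled array name such as @aeg fails at compile
# time, while a misspelled hash key such as $person[0]{aeg} just gives undef
# at run time.
```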

Xho

 
 
 
 