
# multi-field array sort using Sort::Fields method

Domenico Discepola

 04-27-2004
Hello all. My goal is to be able to perform a "multi-field sort on a
multidimensional array". Having read many posts in the newsgroups, I was
unable to find a "straight" answer to this problem. I therefore came up
with this method. My question is, is there a more efficient solution to
this problem or is my method acceptable? Is there another CPAN module that
can be used? I welcome all opinions.

#!perl
use strict;
use warnings;
use Sort::Fields;

my ( @arr01, @arr02, $r, $c, $string, $aref, $delim, @arr_temp,
     @arr_final );

@arr01 = ( [1, 'a', 'dom'],
           [3, 'd', 'jamie'],
           [7, 'd', 'abigail'] );

$delim = "\t";
$string = "";

#Step 1 - combine fields into 1 string so that Sort::Fields will work
for $r ( 0 .. $#arr01 ) {
    $aref = $arr01[$r];
    for $c ( 0 .. $#{$aref} ) {
        if ( $c == $#$aref ) {
            $string = $string . $arr01[$r][$c];
        } else {
            $string = $string . $arr01[$r][$c] . $delim;
        }
    }
    push @arr02, $string;
    $string = "";
}

#Step 2 - sort by field 2, then 3
my @sorted = fieldsort '\t', [2,3], @arr02;

#Step 3 - split sorted strings into multidim array
foreach my $el ( @sorted ) {
    @arr_temp = split $delim, $el;
    push @arr_final, [@arr_temp];
}

#Step 4 - final output
print join('|', @$_), "\n" for @arr_final;

Paul Lalli

 04-27-2004
On Tue, 27 Apr 2004, Domenico Discepola wrote:

> Hello all. My goal is to be able to perform a "multi-field sort on a
> multidimensional array". Having read many posts in the newsgroups, I was
> unable to find a "straight" answer to this problem. I therefore came up
> with this method. My question is, is there a more efficient solution to
> this problem or is my method acceptable? Is there another CPAN module that
> can be used? I welcome all opinions.

You're going backwards. Sort::Fields is what you use when you don't have
multi-dimensional arrays, but rather arrays of delimited strings. If you
have multi-dimensional arrays, you just use 'sort':

#!perl.exe
use strict;
use warnings;

my @arr01 = (
    [1, 'a', 'dom'],
    [3, 'd', 'jamie'],
    [7, 'd', 'abigail']
);

my @sorted = sort {
    $a->[1] cmp $b->[1]    #sort ASCIIbetically by 2nd field
        or
    $a->[2] cmp $b->[2]    #if same, sort ASCIIbetically by 3rd field
} @arr01;

foreach (@sorted){
    print "@$_\n";
}

__END__

perldoc -f sort

Hope this helps,
Paul Lalli
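
As a hedged aside on the comparison block above: the same pattern extends to mixed numeric and string keys, and a descending key is obtained by swapping $a and $b. This is a minimal sketch using the same three rows; the sub name by_letter_then_number_desc is made up for illustration and is not from the thread.

```perl
use strict;
use warnings;

my @rows = (
    [1, 'a', 'dom'],
    [3, 'd', 'jamie'],
    [7, 'd', 'abigail'],
);

# Compare the 2nd column as strings (ascending); on a tie,
# compare the 1st column as numbers (descending: $b comes before $a).
sub by_letter_then_number_desc {
    $a->[1] cmp $b->[1]
        or
    $b->[0] <=> $a->[0];
}

my @sorted = sort by_letter_then_number_desc @rows;

print "@$_\n" for @sorted;
```

The `or` only evaluates the second comparison when the first returns 0, which is what makes the second key a tie-breaker.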


Domenico Discepola

 04-27-2004

"Paul Lalli" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On Tue, 27 Apr 2004, Domenico Discepola wrote:
>
> > Hello all. My goal is to be able to perform a "multi-field sort on a
> > multidimensional array".
> >
> > Is there another CPAN module that
> > can be used? I welcome all opinions.
>
> perldoc -f sort
>
> Hope this helps,
> Paul Lalli

Thanks for your concrete example - it helps while trying to understand the
existing documentation. Although the example you provided works, I'm
wondering if there is a CPAN module which provides a more "elegant"
interface to a multi-field, multidimensional array sort. What I have in
mind is an interface similar to Sort::Fields:

@arr_order = ( [2, 'a'], [-1,'n'] );

@sorted = sort_module( $arr_input, \@arr_order );

This would translate as sort what's in array $arr_order alphabetically by
position 2, then numerically reversed by position 1.

Input parameter 1 is the array to be sorted, parameter 2 is an array
(@arr_order) containing the index position of the array columns we want to
sort on, along with an alphabetic (a) or numeric (n) sort type. A negative
sign indicates a reverse sort. The benefit of this is to be able to call 1
function that can be passed parameters dynamically (as opposed to
dynamically modifying code with $a and $b, cmp, <=>, etc.).

Paul Lalli

 04-27-2004
On Tue, 27 Apr 2004, Domenico Discepola wrote:

> Thanks for your concrete example - it helps while trying to understand the
> existing documentation. Although the example you provided works, I'm
> wondering if there is a CPAN module which provides a more "elegant"
> interface to a multi-field, multidimensional array sort. What I have in
> mind is an interface similar to Sort::Fields:
>
> @arr_order = ( [2, 'a'], [-1,'n'] );
>
> @sorted = sort_module( $arr_input, \@arr_order );
>
> This would translate as sort what's in array $arr_order alphabetically by
> position 2, then numerically reversed by position 1.
>
> Input parameter 1 is the array to be sorted, parameter 2 is an array
> (@arr_order) containing the index position of the array columns we want to
> sort on, along with an alphabetic (a) or numeric (n) sort type. A negative
> sign indicates a reverse sort. The benefit of this is to be able to call 1
> function that can be passed parameters dynamically (as opposed to
> dynamically modifying code with $a and $b, cmp, <=>, etc.).

I'm not especially sure I agree that your method would be more 'elegant'.
Really what you've suggested is to write a wrapper around sort() that
would restrict its functionality. I'm not convinced that's a good idea.
(I'm also not sure I understand what you mean by your last sentence - $a
and $b are two predefined variables common to every single sort
subroutine, and cmp vs <=> simply means asciibettical vs numerical)
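
A small sketch of that difference, added for illustration (the sample values are made up): the same list sorted once with cmp and once with <=>.

```perl
use strict;
use warnings;

my @values = (10, 9, 2, 33);

# cmp compares character by character, so "10" sorts before "2"
my @as_strings = sort { $a cmp $b } @values;

# <=> compares numeric values
my @as_numbers = sort { $a <=> $b } @values;

print "cmp: @as_strings\n";   # cmp: 10 2 33 9
print "<=>: @as_numbers\n";   # <=>: 2 9 10 33
```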

Disregarding all that, however, it probably wouldn't be too hard to write
such a wrapper:

use strict;
use warnings;

my ($array, $config); #'global', because will be used in two functions

# Prototype (\@\@) lets callers pass two arrays; each arrives as a reference
sub sort_module(\@\@){
    ($array, $config) = @_;
    #error checking here to make sure arrays are what you want
    sort with_elegance (@$array);
}

# Comparison routine: walks the [position, type] pairs until one differs
sub with_elegance{
    my $return;
    foreach my $dimension (@$config){
        my $pos = (abs $$dimension[0]) - 1; #positions are 1-based
        my $compare;
        if ($$dimension[1] eq 'a'){
            $compare = 'cmp';
        } elsif($$dimension[1] eq 'n') {
            $compare = '<=>';
        } else {
            die "Invalid comparison marker: $$dimension[1] ".
                "(only 'a' and 'n' allowed)\n";
        }
        if ($$dimension[0] >= 0){
            eval '$return = $$a[$pos] '.$compare.' $$b[$pos]';
        } else {
            #negative position: swap $a and $b to reverse the order
            eval '$return = $$b[$pos] '.$compare.' $$a[$pos]';
        }
        last unless $return == 0;
    }
    return $return;
}

my @arr_input = ( [1, 'a', 'dom'],
                  [3, 'd', 'jamie'],
                  [7, 'd', 'abigail']
                );

my @arr_order = ( [2, 'a'], [-1,'n'] );
my @sorted = sort_module( @arr_input, @arr_order );

foreach (@sorted) {
    print "@$_\n";
}

__END__

Give that a shot and see if it does what you want. Note that I whipped
this up in just a few moments, and it shows. Among the things that should
probably be fixed are the use of global variables, die() should become
carp or croak if this were put into an actual module, and the use of the
eval function. Not to mention there's probably more than a few ways to
optimize it....

Paul Lalli
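
For comparison, here is one possible way to address the caveats listed above (the global variables, the string eval, and die versus croak). It is only a sketch under stated assumptions, not a tested module: make_comparator is a hypothetical helper name, and this sort_module takes two explicit array references instead of using a prototype. Each [position, type] pair is turned into a closure once, before sorting starts.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: build one comparison closure per [position, type]
# pair. Positions are 1-based; a negative position means descending order.
sub make_comparator {
    my @order = @_;
    my @steps = map {
        my ($field, $type) = @$_;
        my $pos = abs($field) - 1;
        my $dir = $field < 0 ? -1 : 1;
        my $cmp = $type eq 'a' ? sub { $_[0] cmp $_[1] }
                : $type eq 'n' ? sub { $_[0] <=> $_[1] }
                : croak "Invalid comparison marker: $type (only 'a' and 'n' allowed)";
        sub { $dir * $cmp->( $_[0][$pos], $_[1][$pos] ) };
    } @order;

    # The combined comparator returns the first non-zero result.
    return sub {
        my ($x, $y) = @_;
        for my $step (@steps) {
            my $result = $step->($x, $y);
            return $result if $result;
        }
        return 0;
    };
}

# Takes two array references: the rows and the sort specification.
sub sort_module {
    my ($rows, $order) = @_;
    my $compare = make_comparator(@$order);
    return sort { $compare->($a, $b) } @$rows;
}

my @arr_input = ( [1, 'a', 'dom'],
                  [3, 'd', 'jamie'],
                  [7, 'd', 'abigail'] );
my @arr_order = ( [2, 'a'], [-1, 'n'] );

my @sorted = sort_module( \@arr_input, \@arr_order );
print "@$_\n" for @sorted;
```

Building the closures up front means an invalid marker is reported once, before any comparison runs, rather than in the middle of the sort.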

Paul Lalli

 04-27-2004
On Tue, 27 Apr 2004, Paul Lalli wrote:

> On Tue, 27 Apr 2004, Domenico Discepola wrote:
>
> > Thanks for your concrete example - it helps while trying to understand the
> > existing documentation. Although the example you provided works, I'm
> > wondering if there is a CPAN module which provides a more "elegant"
> > interface to a multi-field, multidimensional array sort. What I have in
> > mind is an interface similar to Sort::Fields:
> >
> > @arr_order = ( [2, 'a'], [-1,'n'] );
> >
> > @sorted = sort_module( $arr_input, \@arr_order );
> >
> > This would translate as sort what's in array $arr_order alphabetically by
> > position 2, then numerically reversed by position 1.
> >
> > Input parameter 1 is the array to be sorted, parameter 2 is an array
> > (@arr_order) containing the index position of the array columns we want to
> > sort on, along with an alphabetic (a) or numeric (n) sort type. A negative
> > sign indicates a reverse sort. The benefit of this is to be able to call 1
> > function that can be passed parameters dynamically (as opposed to
> > dynamically modifying code with $a and $b, cmp, <=>, etc.).

>
> I'm not especially sure I agree that your method would be more 'elegant'.
> Really what you've suggested is to write a wrapper around sort() that
> would restrict its functionality. I'm not convinced that's a good idea.
> (I'm also not sure I understand what you mean by your last sentence - $a
> and $b are two predefined variables common to every single sort
> subroutine, and cmp vs <=> simply means asciibettical vs numerical)
>
> Disregarding all that, however, it probably wouldn't be too hard to write
> such a wrapper:

And of course, I did things bass-ackwards and coded before checking CPAN.
I wonder if this would suffice for what you're looking for:
http://search.cpan.org/~evo/Data-Sorting-0.9/Sorting.pm

It doesn't have quite the same interface you wanted, but it's similar.

Paul Lalli

Anno Siegel

 04-27-2004
Domenico Discepola <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> Hello all. My goal is to be able to perform a "multi-field sort on a
> multidimensional array". Having read many posts in the newsgroups, I was
> unable to find a "straight" answer to this problem. I therefore came up
> with this method. My question is, is there a more efficient solution to
> this problem or is my method acceptable? Is there another CPAN module that
> can be used? I welcome all opinions.

Okay, the code is a bit pointless, because sorting an array of arrays via
Sort::Fields is backwards. Still, a few remarks:

> #!perl
> use strict;
> use warnings;

Good.

> use Sort::Fields;
>
> my ( @arr01, @arr02, $r, $c, $string, $aref, $delim, @arr_temp,
> @arr_final );

It is usually preferred to declare variables where they first appear.
Some languages don't support this and force you to lump all declarations
together. That doesn't mean it's good style.

> @arr01 = ( [1, 'a', 'dom'],
> [3, 'd', 'jamie'],
> [7, 'd', 'abigail']);
>
> $delim = "\t";
> $string = "";
>
> #Step 1 - combine fields into 1 string so that Sort::Fields will work
> for $r ( 0 .. $#arr01 ) {
> $aref = $arr01[$r];

You don't need the index ($r) in the loop, so you could have iterated
over @arr01 directly:

for my $aref ( @arr01 ) {

> for $c ( 0 .. $#{$aref} ) {
> if ( $c == $#$aref ) {
> $string = $string . $arr01[$r][$c];
> } else {
> $string = $string . $arr01[$r][$c]. $delim ;
> }
> }
> push @arr02, $string;

Uh, oh! Perl has a function for what that loop does, it's called join:

push @arr02, join $delim, @$aref;

...does exactly the same thing.

> $string = "";

If you had declared $string inside the loop body you wouldn't have to
worry about clearing it. "my" does that at run time.

Your outer loop does nothing but collect the results of a calculation
into a list. Again, Perl has a function for that, called "map":

my @arr02 = map join( $delim, @$_) => @arr01;

is a more idiomatic replacement for your code to this point.

> }
>
> #Step 2 - sort by field 2, then 3
> my @sorted = fieldsort '\t', [2,3], @arr02;
>
> #Step 3 - split sorted strings into mutidim array
> foreach my $el ( @sorted ) {

Ah, good. An index-free loop where no index is needed.

> @arr_temp = split $delim, $el;
> push @arr_final, [@arr_temp];
> }

Why the intermediate @arr_temp? "[ split $delim, $el]" works as well.
The loop could again be replaced by map:

@arr_final = map [ split $delim, $_], @sorted;

> #Step 4 - final output
> print join('|', @$_), "\n" for @arr_final;

Hey, you know about "join". Why did you re-invent it up there? In
general, the second half of your code looks smoother than the part
before "fieldsort ...".

Having reduced all of the loops to map, the whole thing can be written
as one statement (untested, as is the above):

print join( '|', @$_), "\n" for
map [ split $delim, $_] =>
fieldsort '\t', [2,3] =>
map join( $delim, @$_) => @arr01;

This is not everyone's idea of good style, but I think in this case it's

Anno
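
To illustrate the join and map remarks above without depending on Sort::Fields, here is a core-only sketch of the two conversions on their own: rows to delimited strings and back again. The variable names @lines, @rows and $pattern are made up for this example, and the use of quotemeta is an extra precaution that the thread does not discuss.

```perl
use strict;
use warnings;

my $delim = "\t";
my @arr01 = (
    [1, 'a', 'dom'],
    [3, 'd', 'jamie'],
    [7, 'd', 'abigail'],
);

# Rows -> delimited strings: join replaces the inner loop, map the outer one.
my @lines = map { join $delim, @$_ } @arr01;

# Delimited strings -> rows again. The first argument of split is a
# pattern, so quotemeta guards against delimiters that happen to be
# regex metacharacters (not needed for a tab, shown as a precaution).
my $pattern = quotemeta $delim;
my @rows = map { [ split /$pattern/, $_ ] } @lines;

print join('|', @$_), "\n" for @rows;
```

Any sorting step that works on delimited strings could be placed between the two map calls; if the data is already an array of arrays, sorting it directly with sort avoids the round trip altogether.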

Uri Guttman

 04-28-2004
>>>>> "PL" == Paul Lalli <(E-Mail Removed)> writes:

PL> On Tue, 27 Apr 2004, Domenico Discepola wrote:
>> Thanks for your concrete example - it helps while trying to understand the
>> existing documentation. Although the example you provided works, I'm
>> wondering if there is a CPAN module which provides a more "elegant"
>> interface to a multi-field, multidimensional array sort. What I have in
>> mind is an interface similar to Sort::Fields:

PL> I'm not especially sure I agree that your method would be more 'elegant'.
PL> Really what you've suggested is to write a wrapper around sort() that
PL> would restrict its functionality. I'm not convinced that's a good idea.
PL> (I'm also not sure I understand what you mean by your last sentence - $a
PL> and $b are two predefined variables common to every single sort
PL> subroutine, and cmp vs <=> simply means asciibettical vs numerical)

PL> Disregarding all that, however, it probably wouldn't be too hard to write
PL> such a wrapper:

PL> sub with_elegance{
PL> my \$return;
PL> foreach my $dimension (@$config){
PL> my $pos = (abs $$dimension[0]) - 1;
PL> my $compare;
PL> if ($$dimension[1] eq 'a'){
PL> $compare = 'cmp';
PL> } elsif($$dimension[1] eq 'n') {
PL> $compare = '<=>';
PL> } else {
PL> die "Invalid comparison marker: $$dimension[1] ".
PL> "(only 'a' and 'n' allowed\n";
PL> }
PL> if ($$dimension[0] >= 0){
PL> eval '$return = $$a[$pos] '.$compare.' $$b[$pos]';
PL> } else {
PL> eval '$return = $$b[$pos] '.$compare.' $$a[$pos]';
PL> }
PL> last unless $return == 0;
PL> }
PL> return $return;
PL> }

PL> Give that a shot and see if it does what you want. Note that I whipped
PL> this up in just a few moments, and it shows. Among the things that should
PL> probably be fixed are the use of global variables, die() should become
PL> carp or croak if this were put into an actual module, and the use of the
PL> eval function. Not to mention there's probably more than a few ways to
PL> optimize it....

wait for Sort::Maker which is mostly developed and will do all that and
more and much faster and with a better api. i expect to cpan the .01
version in mid june before yapc. i could let people have beta copies
earlier than that if anyone wants it.

more on this soon as i do some more pod editing. i will post that as soon
as it is presentable

uri

--
Uri Guttman ------ (E-Mail Removed) -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org