Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > Sorted Hash

Reply
Thread Tools

Sorted Hash

 
 
palexvs@gmail.com
Guest
Posts: n/a
 
      11-29-2007
I filled hash and then printed it (sorted by key):
my %hs = (
key10 => 5,
key5 => b,
aey9 => 7,
)
foreach my $k (sort keys %hs) { print "$k $hs{$k}\n"; }

key - string ([0-9A-F]{72}), 50K records.
How do it more effective?

 
Reply With Quote
 
 
 
 
Ben Morrow
Guest
Posts: n/a
 
      11-30-2007

Quoth "(E-Mail Removed)" <(E-Mail Removed)>:
> I filled hash and then printed it (sorted by key):
> my %hs = (
> key10 => 5,
> key5 => b,
> aey9 => 7,
> )
> foreach my $k (sort keys %hs) { print "$k $hs{$k}\n"; }
>
> key - string ([0-9A-F]{72}), 50K records.
> How do it more effective?


Tie::IxHash, or maintain an array of keys yourself. See perldoc -q
sorted.

Ben

 
Reply With Quote
 
 
 
 
A. Sinan Unur
Guest
Posts: n/a
 
      11-30-2007
"(E-Mail Removed)" <(E-Mail Removed)> wrote in news:a1ddaad2-89c6-4b01-
http://www.velocityreviews.com/forums/(E-Mail Removed):

> I filled hash and then printed it (sorted by key):
> my %hs = (
> key10 => 5,
> key5 => b,
> aey9 => 7,
> )
> foreach my $k (sort keys %hs) { print "$k $hs{$k}\n"; }
>
> key - string ([0-9A-F]{72}), 50K records.
> How do it more effective?


Well, depends on what you mean by 'more effective'.

If you are talking about speed, note that the only place where you can
really get any meaningful improvement is IO.

I am using Windows XP SP2 with ActiveState Perl 5.8.8.822 on an Intel Core
Duo 1.66 Mhz with 1 GB physical memory with a whole bunch of other apps
open.

Let's start with:

#!/usr/bin/perl

use strict;
use warnings;

my %hash;

for my $k ( 1 .. 50_000 ) {
$hash{ make_key() } = $k;
}

### printing code here

sub make_key {
join('', map { sprintf( '%.16f', $_ ) } (rand, rand, rand, rand) );
}

__END__
C:\DOCUME~1\asu1\LOCALS~1\Temp\t> timethis t

TimeThis : Command Line : t
TimeThis : Start Time : Thu Nov 29 19:32:35 2007
TimeThis : End Time : Thu Nov 29 19:32:36 2007
TimeThis : Elapsed Time : 00:00:01.109

So, let's add printing to this script:

my @keys = sort keys %hash;
print "$_ $hash{$_}\n" for @keys;

TimeThis : Command Line : t
TimeThis : Start Time : Thu Nov 29 19:37:11 2007
TimeThis : End Time : Thu Nov 29 19:37:14 2007
TimeThis : Elapsed Time : 00:00:03.156

So, we are talking about 2.047 seconds added by printing the hash.

Assuming 80 bytes per print statement and 50,000 print statements, that's
3906.25 K of output in about two seconds.

That's not bad and does not leave a lot of room for improvement.

Following the printing late strategy and trying to trade-off memory for
speed, let's accumulate all output before printing:

my $out;
for ( @keys ) {
$out .= "$_ $hash{$_}\n";
}

{
local $| = 0;
print $out;
}

TimeThis : Command Line : t
TimeThis : Start Time : Thu Nov 29 19:56:33 2007
TimeThis : End Time : Thu Nov 29 19:56:39 2007
TimeThis : Elapsed Time : 00:00:05.859

Well, that wasn't good (and I really did not expect it to be). But we don't
have to accumulate all output. We can just print after every 9K or so. (I
came up with that magic number on my system after some trial and error ...
Time program, go do something in Word, load cnn.com on Firefox, go back and
re-time this script ... poor man's cache clearing).

my $out;
for ( @keys ) {
$out .= "$_ $hash{$_}\n";
if ( length $out > 9_000 ) {
print $out;
$out = q{};
}
}

TimeThis : Command Line : t
TimeThis : Start Time : Thu Nov 29 20:04:57 2007
TimeThis : End Time : Thu Nov 29 20:05:00 2007
TimeThis : Elapsed Time : 00:00:03.062

So, after all that tinkering, I got an improvement of 0.094 seconds, that
is about 3%.

Now, my against my gut feeling, I did try:

print "$_ $hash{$_}\n" for sort keys %hash;

TimeThis : Command Line : t
TimeThis : Start Time : Thu Nov 29 20:11:01 2007
TimeThis : End Time : Thu Nov 29 20:11:04 2007
TimeThis : Elapsed Time : 00:00:02.953

Hmmm ... It does not look like I can beat the simplest alternative by
trying to be clever.

That took 1.844 seconds to output about 3906.25K.

of course, I probably wasted my time because I misunderstood your vaguely
phrased question.

Sinan

--
A. Sinan Unur <(E-Mail Removed)>
(remove .invalid and reverse each component for email address)
clpmisc guidelines: <URL:http://www.augustmail.com/~tadmc/clpmisc.shtml>

 
Reply With Quote
 
Jürgen Exner
Guest
Posts: n/a
 
      11-30-2007
(E-Mail Removed) wrote:
> I filled hash and then printed it (sorted by key):
> my %hs = (
> key10 => 5,
> key5 => b,
> aey9 => 7,
> )
> foreach my $k (sort keys %hs) { print "$k $hs{$k}\n"; }
>
> key - string ([0-9A-F]{72}), 50K records.
> How do it more effective?



 
Reply With Quote
 
Jürgen Exner
Guest
Posts: n/a
 
      11-30-2007
(E-Mail Removed) wrote:
> I filled hash and then printed it (sorted by key):

[...]
> foreach my $k (sort keys %hs) { print "$k $hs{$k}\n"; }
>
> key - string ([0-9A-F]{72}), 50K records.
> How do it more effective?


Your algorithm as posted should be 100% effective or are you observing any
unsorted keys?

jue


 
Reply With Quote
 
xhoster@gmail.com
Guest
Posts: n/a
 
      11-30-2007
"(E-Mail Removed)" <(E-Mail Removed)> wrote:
> I filled hash and then printed it (sorted by key):
> my %hs = (
> key10 => 5,
> key5 => b,
> aey9 => 7,
> )
> foreach my $k (sort keys %hs) { print "$k $hs{$k}\n"; }
>
> key - string ([0-9A-F]{72}), 50K records.
> How do it more effective?


What is ineffective about the current way?

You could keep the structure sorted throughout its lifetime by using
a tied hash or a non-hash structure, but the overhead of doing so
is almost certainly going to be greater than a one-time sort.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 
Reply With Quote
 
Salvador Fandino
Guest
Posts: n/a
 
      11-30-2007
(E-Mail Removed) wrote:
> I filled hash and then printed it (sorted by key):
> my %hs = (
> key10 => 5,
> key5 => b,
> aey9 => 7,
> )
> foreach my $k (sort keys %hs) { print "$k $hs{$k}\n"; }
>
> key - string ([0-9A-F]{72}), 50K records.
> How do it more effective?
>


yu can use the radix sort implementation available from Sort::Key::Radix
that is usually faster for this kind of data that the merge sort used
internally by perl:


#!/usr/bin/perl

use strict;
use warnings;

use Benchmark qw(cmpthese);
use Sort::Key::Radix qw(ssort);

my @l = ('a'..'z','A'..'Z','1'..'9');

sub genkey { join '', map $l[rand @l], 0..71 }

my @keys = map genkey, 0..50_000;


sub psort { my @sorted = sort @keys }
sub rsort { my @sorted = ssort @keys }

cmpthese (-1, { psort => \&psort,
rsort => \&rsort } );


-----------

psort 4.39/s -- -41%
rsort 7.41/s 69% --
 
Reply With Quote
 
A. Sinan Unur
Guest
Posts: n/a
 
      11-30-2007
Salvador Fandino <(E-Mail Removed)> wrote in news:fiorj9$gto$1
@hefestos.uned.es:

> (E-Mail Removed) wrote:
>> I filled hash and then printed it (sorted by key):
>> my %hs = (
>> key10 => 5,
>> key5 => b,
>> aey9 => 7,
>> )
>> foreach my $k (sort keys %hs) { print "$k $hs{$k}\n"; }
>>
>> key - string ([0-9A-F]{72}), 50K records.
>> How do it more effective?
>>

>
> yu can use the radix sort implementation available from
> Sort::Key::Radix
> that is usually faster for this kind of data that the merge sort used
> internally by perl:


Does sorting speed matter when most of the time is spent printing?

Sinan


--
A. Sinan Unur <(E-Mail Removed)>
(remove .invalid and reverse each component for email address)
clpmisc guidelines: <URL:http://www.augustmail.com/~tadmc/clpmisc.shtml>

 
Reply With Quote
 
Damian Lukowski
Guest
Posts: n/a
 
      11-30-2007
A. Sinan Unur schrieb:

> Does sorting speed matter when most of the time is spent printing?


You never know whether he's not redirecting stdout to a file on a
ramdisk.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
hash of hash of hash of hash in c++ rp C++ 1 11-10-2011 04:45 PM
Hash#select returns an array but Hash#reject returns a hash... Srijayanth Sridhar Ruby 19 07-02-2008 12:49 PM
Sorted Hash and or Array banker123 Perl Misc 4 11-05-2006 07:44 PM
ordered/sorted hash robertj Ruby 19 12-11-2005 04:10 PM
Accessing Hash elements in sorted order? Chris Ruby 12 09-18-2004 04:23 PM



Advertisments