Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > accessing perl's internal hashing algorithm

Reply
Thread Tools

accessing perl's internal hashing algorithm

 
 
rob
Guest
Posts: n/a
 
      09-27-2004
Hi, I have a performance intensive script that needs to hash a lot of
strings. I am guessing that perl's internal hashing algorithm is
implemented at a low level, is optimized, etc. and I would like to
access it though my perl scripts. Does anyone know how to do this?

Thanks,
Rob
 
Reply With Quote
 
 
 
 
Mark Clements
Guest
Posts: n/a
 
      09-27-2004
rob wrote:
> Hi, I have a performance intensive script that needs to hash a lot of
> strings. I am guessing that perl's internal hashing algorithm is
> implemented at a low level, is optimized, etc. and I would like to
> access it though my perl scripts. Does anyone know how to do this?

I wouldn't have thought that the algorithm would be suitable for general
purpose use, but I stand open to correction. The Digest::SHA1 and
Digest::MD5 modules, for example, are implemented via xs and not native
perl (though clearly this is not a guarantee of speed: the algorithm can
suck whatever the implementation), so if raw performance is your
requirement then you could do worse than starting with these. You could
always check out the perl source directly if you're stuck on using
perl's internal algorithm: hv.c?

Mark
 
Reply With Quote
 
 
 
 
Ala Qumsieh
Guest
Posts: n/a
 
      09-27-2004
rob wrote:

> Hi, I have a performance intensive script that needs to hash a lot of
> strings. I am guessing that perl's internal hashing algorithm is
> implemented at a low level, is optimized, etc. and I would like to
> access it though my perl scripts. Does anyone know how to do this?


A lot of the performance hit is probably due to Perl dynamically
enlarging hashes. If I remember correctly, when you declare a hash, Perl
allocates a certain size for it. As you starting populating this hash,
and once the number of keys hashed to the same integer exceeds a certain
threshold, Perl doubles the size of the hash (implemented as an array in
C), and re-indexes the keys. This can cause a lot of performace delay in
the case of a large number of keys to be hashed, especially if this
needs to be done multiple times.

The solution is to preallocate a large enough hash, if you think you can
guess how many keys you will need:

keys %hash = 5000;

Perl will round that number to the next power of two.

All of this is explained in Srinivasan's "Advanced Perl Programming"
(the Panther).

--Ala
 
Reply With Quote
 
ctcgag@hotmail.com
Guest
Posts: n/a
 
      09-27-2004
(rob) wrote:
> Hi, I have a performance intensive script that needs to hash a lot of
> strings. I am guessing that perl's internal hashing algorithm is
> implemented at a low level, is optimized, etc. and I would like to
> access it though my perl scripts. Does anyone know how to do this?


I do this by using Perl hash variables

It is not clear to me exactly what you want to do. You want to use
Perl's hashing algorithm from within a Perl script, but without the
rest of the Hash-table machinery? I suppose you could copy the hashing
algorithm out of the Perl source and use it to create your own XS or
Inline::C code.
perlguts:

The hash algorithm is defined in the "PERL_HASH(hash, key, klen)"
macro:

hash = 0;
while (klen--)
hash = (hash * 33) + *key++;
hash = hash + (hash >> 5); /* after 5.6 */

But I don't see what this will get you. Calling this many times from
within Perl will probably spend more time on function-call overhead than on
hashing.



Xho

--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
 
Reply With Quote
 
Michele Dondi
Guest
Posts: n/a
 
      09-29-2004
On Mon, 27 Sep 2004 18:00:04 GMT, Ala Qumsieh <>
wrote:

>The solution is to preallocate a large enough hash, if you think you can
> guess how many keys you will need:
>
> keys %hash = 5000;


Cool! A good example of how precious bits of information can leak in
clpmisc... actually it's all there in the docs, as I have promptly
checked, but, well: would I have done that had I not read this?!?


Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Market Basket Algorithm - Direct Hashing and Pruning Arthur Chan Ruby 0 04-27-2009 10:56 AM
Fastest hashing algorithm? howa Java 8 06-15-2007 09:51 AM
Internal Client Accessing Internal Server Via Public IP Address GeekMarine1972 Cisco 1 01-15-2005 02:49 AM
Searching: Extendible Hashing Algorithm (OOP) Michael Pernkopf C++ 0 05-27-2004 05:18 PM
Key generation algorithm and Cipher algorithm Ahmed Moustafa Java 0 11-15-2003 06:35 AM



Advertisments
 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57