Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Java > how to define hashcode for this class?

Reply
Thread Tools

how to define hashcode for this class?

 
 
cbongior@stny.rr.com
Guest
Posts: n/a
 
      08-05-2005
That's what the equals() method is for. The hascode built into object
is not well suited to uniform random probabilty. "Effective Java" by
Bloch gives a simple test to prove this out. You are right though about
symetry -- which is why the EJ book even gives you and improved hashing
algorithm (I believe it involves some simple primes). And now that you
mention it, I suppose commutivity is a problem with hashing (increase
chance of collision). I will look at the bloch method again tonight

Clearly though, I stirred up a hornets nest with the above post.
Avoiding overflow is merely precautionary wisdom. If someone can show
me that overflow in a hasing algorithm is NEVER a problem (or give a
very convincing argument) I will let my overflow worries ebb (get it:
Overflow, EBB ?).

 
Reply With Quote
 
 
 
 
Thomas Weidenfeller
Guest
Posts: n/a
 
      08-05-2005
http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:
> Clearly though, I stirred up a hornets nest with the above post.
> Avoiding overflow is merely precautionary wisdom. If someone can show
> me that overflow in a hasing algorithm is NEVER a problem (or give a
> very convincing argument) I will let my overflow worries ebb (get it:
> Overflow, EBB ?).


Let's start with the behavior of the data type: Overflow behavior of
integers is well defined in the JLS. You don't get any exception, and
the result resembles the lower bits as if the operation had been
executed with a sufficiently large data type. So that shouldn't create a
new problem.

But, overflow in a hashing algorithm can of course be a problem, if the
algorithm is ill chosen, depending on the value range. E.g.

// brain-dead hash function
hash = 1;
for(int i = 0; i < value; i++) {
hash *= 2;
}

The above will give a hash code of 0 for almost all values. This is a
property of the particular algorithm (and the input). Other algorithms
have other properties, and might or might not degenerate when overflow
happens.

/Thomas
--
The comp.lang.java.gui FAQ:
ftp://ftp.cs.uu.nl/pub/NEWS.ANSWERS/...g/java/gui/faq
http://www.uni-giessen.de/faq/archiv....java.gui.faq/
 
Reply With Quote
 
 
 
 
Patricia Shanahan
Guest
Posts: n/a
 
      08-05-2005
(E-Mail Removed) wrote:
> That's what the equals() method is for. The hascode built into object
> is not well suited to uniform random probabilty...


The sole objective of hashing is to reduce the number of key equality
tests for a table look-up, so I don't understand the first quoted sentence.

The relevant parts of the hashCode contract are "However, the programmer
should be aware that producing distinct integer results for unequal
objects may improve the performance of hashtables." and "As much as is
reasonably practical, the hashCode method defined by class Object does
return distinct integers for distinct objects."

A hash table implementation can do its own manipulations to distribute
entries with different codes among its buckets, but there is nothing it
can do to avoid excessive equals() calls given a hash function that
assigns the same code to clusters of keys that are likely to exist in
the same table.

> Clearly though, I stirred up a hornets nest with the above post.
> Avoiding overflow is merely precautionary wisdom. If someone can show
> me that overflow in a hasing algorithm is NEVER a problem (or give a
> very convincing argument) I will let my overflow worries ebb (get it:
> Overflow, EBB ?).


Integer overflow is not a problem, because it never causes abrupt
completion of an operation and it has no attractors, no values that
results tend to concentrate on. Any int can the the result of an
overflowed int calculation.

On the other hand, I would not like to see double overflow in a
hash code function, because the double infinities function as
attractors. Any overflow results in one of them, and an infinite
intermediate result in a chained calculation tends to lead to an
infinite final result.

[Note that I nobly resisted any puns involving "floating", "overflow",
and "ebb".]

Patricia
 
Reply With Quote
 
Dale King
Guest
Posts: n/a
 
      08-17-2005
Kevin wrote:
> Thanks a lot.
> I also find this link very useful:
> http://www.javaworld.com/javaqa/2002...hashtable.html
> I think I will just use something like a*ValueA+b*ValueB+c*ValueC,
> where a/b/c are primes.


Just make sure you use some large primes and not just primes like 3,5,7.

--
Dale King
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
About typedef -- define the function pointer or define function model? robin liu C Programming 3 04-21-2006 03:26 PM
#define _ and #define __ Brian Takita Ruby 0 01-23-2006 04:34 AM
How to define a define that defines some defines ? theotyflos C Programming 3 02-19-2004 05:07 PM
Improving hashCode() to match equals() Marco Java 10 01-17-2004 09:55 PM
hashCode for byte[] Roedy Green Java 1 08-22-2003 02:08 AM



Advertisments