Velocity Reviews - Computer Hardware Reviews

hashCode

 
 
Jan Burse
      08-11-2012
To: Roedy Green
From: Jan Burse <(E-Mail Removed)>

Roedy Green wrote:
> If you had a
> large number of fields, the multiply effect could fall off the left
> hand end.


Actually this does not happen, since you multiply with 31, which is 1+2+4+8+16.
So that:

a*31+b = a*16+a*8+a*4+a*2+a+b

So for a HashMap that uses an index = hash & (2^n - 1) (which is the same as
hash mod 2^n), the impact of a will be still seen, even when it occurs at the
very left hand side.
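
A quick way to check this: because 31 is odd, so is every power of 31, and multiplying by an odd number never shifts the low bits of a out of the word. The sketch below is only an illustration under stated assumptions: `combine` and `indexFor` are made-up names (not JDK methods), and the plain mask stands in for the bucket selection described above. It compares 31 with the even multiplier 32 over 40 fields.

```java
public class OddMultiplierDemo {
    // Combine field hashes the usual way: h = multiplier * h + fieldHash.
    static int combine(int multiplier, int... fieldHashes) {
        int h = 0;
        for (int f : fieldHashes) {
            h = multiplier * h + f;   // int arithmetic wraps modulo 2^32
        }
        return h;
    }

    // Bucket index for a table of size 2^n: hash & (2^n - 1).
    static int indexFor(int hash, int n) {
        return hash & ((1 << n) - 1);
    }

    public static void main(String[] args) {
        int[] x = new int[40];
        int[] y = new int[40];
        x[0] = 1;   // the two inputs differ only in their first field
        y[0] = 2;

        // Odd multiplier: the first field still changes the 4-bit index.
        System.out.println(indexFor(combine(31, x), 4) != indexFor(combine(31, y), 4));
        // Even multiplier (a shift by 5 bits): the first field is pushed out entirely.
        System.out.println(combine(32, x) == combine(32, y));
    }
}
```

Both lines print true. Only the low n bits of a reach the index, of course; the point is just that they are not lost, however many fields follow.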

There is some Microsoft C# HashMap implementation which does not use mod 2^n,
but instead some primes. In case the implementation chooses 31 as the designated
prime, all information except that of the last field (b above) will be lost,
because (a*31+b) mod 31 = b mod 31. But since mod 2^32 is also applied, this
might not be completely true.
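
To make the arithmetic concrete, here is a small sketch, assuming a hypothetical table with exactly 31 buckets (`indexMod31` is an illustrative helper, not part of any real HashMap). Without overflow only b survives the reduction; once the 32-bit wrap-around kicks in, a leaks back in, because 2^32 mod 31 is 4 rather than 0.

```java
public class PrimeModulusDemo {
    // Bucket index for a (hypothetical) table with 31 buckets.
    static int indexMod31(long hash) {
        return (int) Math.floorMod(hash, 31L);
    }

    public static void main(String[] args) {
        int b = 7;

        // Exact (long) arithmetic: whatever a is, the index depends on b alone.
        for (int a : new int[] {1, 2, 1000}) {
            System.out.println(indexMod31(31L * a + b));
        }

        // 32-bit int arithmetic: the product wraps modulo 2^32 first,
        // so different values of a can land in different buckets again.
        int a1 = 100_000_000;
        int a2 = 300_000_000;
        System.out.println(indexMod31(31 * a1 + b));
        System.out.println(indexMod31(31 * a2 + b));
    }
}
```

So the loss of information is total only while the intermediate hash stays within 32 bits; with wrap-around, a influences the index through the number of times the value wrapped.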

For 2^n I don't know exactly how the impact could be described. I guess in a
HashMap with index = hash mod 2^1 the hash amounts to a parity bit, since the
sum in a+b acts like an xor on the rightmost bit. But for 2^n with n>1 the
effect of the multiplication by 31 is a little more crude.
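
The n=1 case can be checked directly. A minimal sketch (the name `lowBit` is made up for illustration): since 31 is odd, 31*a has the same lowest bit as a, and adding b then behaves like xor on that bit, because the carry leaves bit 0.

```java
public class ParityDemo {
    // Lowest bit of 31*a + b, i.e. the bucket index for a two-bucket table.
    static int lowBit(int a, int b) {
        return (31 * a + b) & 1;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 2, 3, -1, 12345, Integer.MAX_VALUE, Integer.MIN_VALUE};
        boolean allMatch = true;
        for (int a : samples) {
            for (int b : samples) {
                // Compare against the xor of the two lowest bits.
                allMatch &= lowBit(a, b) == ((a ^ b) & 1);
            }
        }
        System.out.println(allMatch);
    }
}
```

This prints true, including for the samples that overflow, since wrap-around never touches the lowest bit.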

--- BBBS/Li6 v4.10 Dada-1
* Origin: Prism bbs (1:261/3
--- Synchronet 3.16a-Win32 NewsLink 1.98
Time Warp of the Future BBS - telnet://time.synchro.net:24
 
Mike Winter
      08-11-2012
To: Joerg Meier
From: Mike Winter <(E-Mail Removed)>

On 11/08/2012 17:25, Joerg Meier wrote:
> On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote:
>[...]
>> In my essay I recommend XOR which is an inherently faster operation
>> than multiply.
>
> Hasn't that been wrong since about the invention of the 80386 processor
> family?


Not that far back: the Pentium required 9-11 cycles to complete a MUL
instruction compared to 1-3 for XOR (and the like), depending on operand
locations and widths.

> Pretty sure by now MUL and XOR both take one cycle and that's it.


More-or-less, but the former is still slower for wider operands. However, your
point is well-taken: it needn't be as much a concern in most cases.

--
Mike Winter
Replace ".invalid" with ".uk" to reply by e-mail.

rossum
      08-11-2012
To: Lew
From: rossum <(E-Mail Removed)>

On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <(E-Mail Removed)> wrote:

>public static int calculateHash(Foo arg) {
> int h = 0;
>
> for ( each attribute of Foo that contributes to 'equals()' )
> {
> h = 31 * h + attribute.hashCode();
> }
> return h;
>}

Bloch starts with:

int h = 17;

He says that works better in cases where the first one or more
attribute.hashCode() values are zero, and hence will not register.

He suggests any constant non-zero value.
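
A small sketch of the effect, assuming the same 31 * h + attribute scheme as in the quoted code (`hash` is an illustrative helper, and the attribute hash values are made up). The difference is easiest to see when sequences of different length are compared: with a zero seed, leading zeros leave no trace at all.

```java
public class SeedDemo {
    // Combine attribute hashes starting from the given seed.
    static int hash(int seed, int[] attributeHashes) {
        int h = seed;
        for (int a : attributeHashes) {
            h = 31 * h + a;
        }
        return h;
    }

    public static void main(String[] args) {
        int[] withLeadingZeros = {0, 0, 5};
        int[] withoutThem = {5};

        // Seed 0: 31 * 0 + 0 stays 0, so the leading zeros do not register.
        System.out.println(hash(0, withLeadingZeros) == hash(0, withoutThem));   // true
        // Seed 17: every attribute, zero or not, moves the running value.
        System.out.println(hash(17, withLeadingZeros) == hash(17, withoutThem)); // false
    }
}
```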

rossum

Lew
      08-12-2012
To: Arne Vajhøj
From: Lew <(E-Mail Removed)>

On 08/10/2012 04:30 PM, Arne Vajhøj wrote:
> On 8/10/2012 6:32 PM, Lew wrote:
>> bob smith wrote:
>>> Now, there are cases where you HAVE to override it, or your code is very
>>> broken.
>>
>> No.
>>
>> As long as 'hashCode()' fulfills the contract, your code will work -
>> functionally. But a bad 'hashCode()' could and likely will noticeably
>> affect performance. There is more to correctness than mere functional
>> conformance.
>
> If the code per specs is guaranteed to work then it is correct.
>
> Good (or just decent) performance is not necessary for code to
> be correct.
>
> At least not in the traditional programming terminology.
>
> In plain English maybe.


I see your point, but that is not to say that the specs exclude performance
considerations.

In the case of 'hashCode()', the Javadocs do say, "This method is supported for
the benefit of hash tables such as those provided by HashMap."
<http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode()>

The key question here is how you define "benefit". I argue that a hash code
that is constant does not benefit, say, a 'HashMap' because one of our desired
uses is constant-order retrieval.

"This implementation provides constant-time performance for the basic
operations (get and put), assuming the hash function disperses the elements
properly among the buckets."
<http://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html>

Each specification refers to the other. Ergo they are meant to be considered
together. Taken together, the documentation clearly specifies that "correct" or
"proper" includes performance considerations. Therefore, by what you say, the
simple "return 1;" is not correct.

It certainly would not be correct for the 'Object' implementation. "As much as
is reasonably practical, the hashCode method defined by class Object does
return distinct integers for distinct objects." [op. cit.]

As you say, Arne, "correct" means it follows the spec. The OP's suggested
implementation violates the spec on two fronts.

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedi.../c/cf/Friz.jpg

Lew
      08-12-2012
To: Eric Sosman
From: Lew <(E-Mail Removed)>

Eric Sosman wrote:
> Okay: Then returning a constant 1 (or 42 or 0 or whatever)
> would in fact satisfy the letter of the law regarding hashCode():


Not if you consider all aspects of what the Javadocs promise.

See my post upthread.

> Whenever x.equals(y) is true, x.hashCode() == y.hashCode(). In
> your example this would be trivially true because x,y,z,... all
> have the same hashCode() value, whether they're equal or not --
> You have lived up to the letter of the law.


No, because the law requires that the method support 'HashMap', which in turn
calls for "properly" hashed objects.

> Of course, such a hashCode() would make all those hash-based
> containers pretty much useless: They would work in the sense that
> they would get the Right Answer, but they'd be abominably slow,


Indeed.

> with expected performance of O(N) instead of O(1). See
> <http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/>
> for a survey of some denial-of-service attacks that work by driving
> hash tables from O(1) to O(N), resulting in catastrophic failure
> of the attacked system.
>
> In other words, the letter of the law on hashCode() is a bare
> minimum that guarantees correct functioning, but it is not enough
> to guarantee usability. Why isn't the law more specific? Because


Actually, if you consider all that the Javadocs tell you, this "letter of the
law" to which you refer is like saying the sequence "ABC" constitutes all of
"the ABCs".

> nobody knows how to write "hashCode() must be correct *and* usable"
> in terms that would cover all the classes all the Java programmers
> have dreamed up and will dream up. Your hashCode() meets the bare
> minimum requirement, but is not "usable." The actual hashCode()
> provided by Object also meets the bare minimum requirement, and *is*
> usable as it stands, until (and unless; you don't HAVE to) you
> choose to implement other equals() semantics, and a hashCode() to
> match them.


As Arne states, "correct" means "fulfills the specification". The specification
for Java API methods is the standard Javadocs, which do impose performance
considerations on 'hashCode()'.

One understands that the spec isn't always fully enforceable by the compiler.
[1] It is correct that the compiler will allow 'return 1;'. It is not correct
that that fulfills the specification.

[1] Doesn't one?

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedi.../c/cf/Friz.jpg

Lew
      08-12-2012
To: Jan Burse
From: Lew <(E-Mail Removed)>

Jan Burse wrote:
> Maybe it would make sense to spell out what the contract
> for hashCode() is. Well the contract is simply, the
> following invariant should hold:
>
> /* invariant that should hold */
> if a.equals(b) then a.hashCode()==b.hashCode()


True, but if you read the specification for 'hashCode()' fully, that is not the
entire contract, only the compiler-enforceable part of it.

The entire specification requires that as much as feasible, the 'Object'
implementation distinguish distinct instances, and that the method generally
support 'HashMap', which promises O(1) 'get()' and 'put()' with a "proper"
(i.e., compliant) 'hashCode()'.

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedi.../c/cf/Friz.jpg

Lew
      08-12-2012
To: Lew
From: Lew <(E-Mail Removed)>

Lew wrote:
> Jan Burse wrote:
>> Maybe it would make sense to spell out what the contract
>> for hashCode() is. Well the contract is simply, the
>> following invariant should hold:
>>
>> /* invariant that should hold */
>> if a.equals(b) then a.hashCode()==b.hashCode()

>
> True, but if you read the specification for 'hashCode()' fully, that is not
> the entire contract, only the compiler-enforceable part of it.


Oooops!

I made a mistake.

Not even that is compiler-enforceable.
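
To illustrate the correction: the following class compiles without complaint even though it breaks the invariant. This is a made-up example (`Tag` and `invariantHolds` are hypothetical names), with equality that ignores case and a hash that does not.

```java
public class UncheckedContractDemo {
    static final class Tag {
        private final String name;

        Tag(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            // Case-insensitive equality ...
            return o instanceof Tag && name.equalsIgnoreCase(((Tag) o).name);
        }

        @Override
        public int hashCode() {
            // ... but a case-sensitive hash: legal Java, broken contract.
            return name.hashCode();
        }
    }

    // The invariant under discussion: a.equals(b) implies equal hash codes.
    static boolean invariantHolds(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        Tag a = new Tag("abc");
        Tag b = new Tag("ABC");
        System.out.println(a.equals(b));          // true
        System.out.println(invariantHolds(a, b)); // false
    }
}
```

Nothing in the compiler looks at the relationship between the two methods; a check like `invariantHolds` has to live in tests or code review.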

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedi.../c/cf/Friz.jpg

Arne Vajhøj
      08-12-2012
To: Lew
From: Arne Vajhøj <(E-Mail Removed)>

On 8/11/2012 7:24 PM, Lew wrote:
> On 08/10/2012 04:30 PM, Arne Vajhøj wrote:
>> On 8/10/2012 6:32 PM, Lew wrote:
>>> bob smith wrote:
>>>> Now, there are cases where you HAVE to override it, or your code is
>>>> very broken.
>>>
>>> No.
>>>
>>> As long as 'hashCode()' fulfills the contract, your code will work -
>>> functionally. But a bad 'hashCode()' could and likely will noticeably
>>> affect performance. There is more to correctness than mere functional
>>> conformance.
>>
>> If the code per specs is guaranteed to work then it is correct.
>>
>> Good (or just decent) performance is not necessary for code to
>> be correct.
>>
>> At least not in the traditional programming terminology.
>>
>> In plain English maybe.

>
> I see your point, but that is not to say that the specs exclude
> performance considerations.
>
> In the case of 'hashCode()', the Javadocs do say, "This method is
> supported for the benefit of hash tables such as those provided by
> HashMap."
> <http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode()>
>
> The key question here is how you define "benefit". I argue that a hash
> code that is constant does not benefit, say, a 'HashMap' because one of
> our desired uses is constant-order retrieval.


Object having the method defined to support effective hashing does not imply
that it has to; it just means that the potential is there.

> "This implementation provides constant-time performance for the basic
> operations (get and put), assuming the hash function disperses the
> elements properly among the buckets."


Yes. And here it makes an assumption. Not that hashCode is implemented
correctly, but that it is implemented in a certain way.

> Each specification refers to the other. Ergo they are meant to be
> considered together. Taken together, the documentation clearly specifies
> that "correct" or "proper" includes performance considerations.
> Therefore, by what you say, the simple "return 1;" is not correct.


> As you say, Arne, "correct" means it follows the spec. The OP's
> suggested implementation violates the spec on two fronts.


No it does not.

It follows exactly the explicit stated contract in the Java doc:

<quote>
The general contract of hashCode is:

- Whenever it is invoked on the same object more than once during an
  execution of a Java application, the hashCode method must consistently
  return the same integer, provided no information used in equals comparisons
  on the object is modified. This integer need not remain consistent from one
  execution of an application to another execution of the same application.
- If two objects are equal according to the equals(Object) method, then
  calling the hashCode method on each of the two objects must produce the
  same integer result.
- It is not required that if two objects are unequal according to the
  equals(java.lang.Object) method, then calling the hashCode method on each
  of the two objects must produce distinct integer results. However, the
  programmer should be aware that producing distinct integer results for
  unequal objects may improve the performance of hashtables.
</quote>

The ability to support something does not make it part of the contract.

This is a classic test question in basic Java SE. And that returning a constant
is correct but not smart should be in most Java SE text books.
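
For what it's worth, the "correct but not smart" case is easy to try out. A minimal sketch with a made-up `Key` class whose hashCode() always returns 1: every lookup still gives the right answer, all the entries just share one bucket.

```java
import java.util.HashMap;
import java.util.Map;

public class ConstantHashDemo {
    static final class Key {
        final int id;

        Key(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }

        @Override
        public int hashCode() {
            return 1;   // satisfies the quoted contract; every key collides
        }
    }

    // Fill a map with n distinct keys.
    static Map<Key, String> build(int n) {
        Map<Key, String> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i), "value-" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Key, String> map = build(1000);
        System.out.println(map.size());                      // 1000
        System.out.println(map.get(new Key(42)));            // value-42
        System.out.println(map.containsKey(new Key(1000)));  // false
    }
}
```

The results are right; what suffers is only the time each operation takes as the map grows.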

Arne

Arne Vajhøj
      08-12-2012
To: Lew
From: Arne Vajhøj <(E-Mail Removed)>

On 8/11/2012 7:29 PM, Lew wrote:
> Eric Sosman wrote:
>> Okay: Then returning a constant 1 (or 42 or 0 or whatever)
>> would in fact satisfy the letter of the law regarding hashCode():

>
> Not if you consider all aspects of what the Javadocs promise.
>
> See my post upthread.
>
>> Whenever x.equals(y) is true, x.hashCode() == y.hashCode(). In
>> your example this would be trivially true because x,y,z,... all
>> have the same hashCode() value, whether they're equal or not --
>> You have lived up to the letter of the law.

>
> No, because the law requires that the method support 'HashMap', which in
> turn calls for "properly" hashed objects.
>
>> Of course, such a hashCode() would make all those hash-based
>> containers pretty much useless: They would work in the sense that
>> they would get the Right Answer, but they'd be abominably slow,

>
> Indeed.
>
>> with expected performance of O(N) instead of O(1). See
>> <http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/>
>> for a survey of some denial-of-service attacks that work by driving
>> hash tables from O(1) to O(N), resulting in catastrophic failure
>> of the attacked system.
>>
>> In other words, the letter of the law on hashCode() is a bare
>> minimum that guarantees correct functioning, but it is not enough
>> to guarantee usability. Why isn't the law more specific? Because

>
> Actually, if you consider all that the Javadocs tell you, this "letter
> of the law" to which you refer is like saying the sequence "ABC"
> constitutes all of "the ABCs".
>
>> nobody knows how to write "hashCode() must be correct *and* usable"
>> in terms that would cover all the classes all the Java programmers
>> have dreamed up and will dream up. Your hashCode() meets the bare
>> minimum requirement, but is not "usable." The actual hashCode()
>> provided by Object also meets the bare minimum requirement, and *is*
>> usable as it stands, until (and unless; you don't HAVE to) you
>> choose to implement other equals() semantics, and a hashCode() to
>> match them.

>
> As Arne states, "correct" means "fulfills the specification". The
> specification for Java API methods is the standard Javadocs, which do
> impose performance considerations on 'hashCode()'.
>
> One understands that the spec isn't always fully enforceable by the
> compiler. [1] It is correct that the compiler will allow 'return 1;'. It
> is not correct that that fulfills the specification.


It fulfills the spec.

It does not fulfill your bizarre interpretation of "support".

Arne

Arne Vajhøj
      08-12-2012
To: Lew
From: Arne Vajhøj <(E-Mail Removed)>

On 8/11/2012 7:34 PM, Lew wrote:
> Jan Burse wrote:
>> Maybe it would make sense to spell out what the contract
>> for hashCode() is. Well the contract is simply, the
>> following invariant should hold:
>>
>> /* invariant that should hold */
>> if a.equals(b) then a.hashCode()==b.hashCode()

>
> True, but if you read the specification for 'hashCode()' fully, that is
> not the entire contract, only the compiler-enforceable part of it.
>
> The entire specification requires that as much as feasible, the 'Object'
> implementation distinguish distinct instances, and that the method
> generally support 'HashMap', which promises O(1) 'get()' and 'put()'
> with a "proper" (i.e., compliant) 'hashCode()'.


Two wrong statements.

It says that the method is defined to support HashMap.

And HashMap does not guarantee O(1) with a correct hashCode - it guarantees that
for one that returns well-distributed values.
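
The distinction can be made visible by counting how many buckets a hash function actually uses. This is a simplified sketch, not HashMap's real internals: `bucketsUsed` is a made-up helper, and the plain mask hash & (2^n - 1) is an assumption standing in for the real bucket selection.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntUnaryOperator;

public class DispersionDemo {
    // Number of distinct buckets hit by keys 0..keys-1 in a table of 2^n buckets.
    static int bucketsUsed(IntUnaryOperator hashFn, int keys, int n) {
        Set<Integer> used = new HashSet<>();
        for (int k = 0; k < keys; k++) {
            used.add(hashFn.applyAsInt(k) & ((1 << n) - 1));
        }
        return used.size();
    }

    public static void main(String[] args) {
        // Constant hash: contract-compliant, but all 1000 keys share one bucket.
        System.out.println(bucketsUsed(k -> 1, 1000, 6));           // 1
        // A hash that varies with the key spreads over all 64 buckets.
        System.out.println(bucketsUsed(k -> 31 * 17 + k, 1000, 6)); // 64
    }
}
```

Both functions satisfy the equals/hashCode invariant; only the second one meets the "disperses the elements properly" assumption that the constant-time claim rests on.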

Arne
