hashCode

 
 
Mike Winter
 
      08-11-2012
On 11/08/2012 17:25, Joerg Meier wrote:
> On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote:
>[...]
>> In my essay I recommend XOR which is an inherently faster operation
>> than multiply.

>
> Hasn't that been wrong since about the invention of the 80386 processor
> family ?


Not that far back: the Pentium required 9-11 cycles to complete a MUL
instruction compared to 1-3 for XOR (and the like), depending on operand
locations and widths.

> Pretty sure by now MUL and XOR both take one cycle and that's it.


More-or-less, but the former is still slower for wider operands.
However, your point is well-taken: it needn't be as much a concern in
most cases.
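
As a side note on why the cost gap matters little for the usual multiplier: a multiply by 31 can also be written as a shift and a subtract. The sketch below only checks that the two forms agree under Java's wrapping int arithmetic; the class and method names are made up for illustration, and whether a given JIT actually performs this rewrite is not something it demonstrates.

```java
public class MultiplyBy31 {
    // Plain multiply, as in the usual "h = 31 * h + x" hash step.
    static int viaMultiply(int h) {
        return 31 * h;
    }

    // Same value computed as 32*h - h, using a shift and a subtract.
    static int viaShift(int h) {
        return (h << 5) - h;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 42, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int h : samples) {
            // Prints "true" for every sample, including the overflowing ones.
            System.out.println(h + ": " + (viaMultiply(h) == viaShift(h)));
        }
    }
}
```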

--
Mike Winter
Replace ".invalid" with ".uk" to reply by e-mail.
 
 
 
 
 
Lew
 
      08-11-2012
On 08/10/2012 04:30 PM, Arne Vajhøj wrote:
> On 8/10/2012 6:32 PM, Lew wrote:
>> bob smith wrote:
>>> Now, there are cases where you HAVE to override it, or your code is very
>>> broken.

>>
>> No.

>
>> As long as 'hashCode()' fulfills the contract, your code will work -
>> functionally. But a bad
>> 'hashCode()' could and likely will noticeably affect performance. There is
>> more to correctness
>> than mere functional conformance.

>
> If the code per specs is guaranteed to work then it is correct.
>
> Good (or just decent) performance is not necessary for code to
> be correct.
>
> At least not in the traditional programming terminology.
>
> In plain English maybe.


I see your point, but that is not to say that the specs exclude performance
considerations.

In the case of 'hashCode()', the Javadocs do say, "This method is supported
for the benefit of hash tables such as those provided by HashMap."
<http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode()>

The key question here is how you define "benefit". I argue that a hash code
that is constant does not benefit, say, a 'HashMap' because one of our desired
uses is constant-order retrieval.

"This implementation provides constant-time performance for the basic
operations (get and put), assuming the hash function disperses the elements
properly among the buckets."
<http://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html>

Each specification refers to the other. Ergo they are meant to be considered
together. Taken together, the documentation clearly specifies that "correct"
or "proper" includes performance considerations. Therefore, by what you say,
the simple "return 1;" is not correct.

It certainly would not be correct for the 'Object' implementation.
"As much as is reasonably practical, the hashCode method defined by class
Object does return distinct integers for distinct objects." [op. cit.]

As you say, Arne, "correct" means it follows the spec. The OP's suggested
implementation violates the spec on two fronts.

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedi.../c/cf/Friz.jpg

 
 
 
 
 
Lew
 
      08-11-2012
Eric Sosman wrote:
> Okay: Then returning a constant 1 (or 42 or 0 or whatever)
> would in fact satisfy the letter of the law regarding hashCode():


Not if you consider all aspects of what the Javadocs promise.

See my post upthread.

> Whenever x.equals(y) is true, x.hashCode() == y.hashCode(). In
> your example this would be trivially true because x,y,z,... all
> have the same hashCode() value, whether they're equal or not --
> You have lived up to the letter of the law.


No, because the law requires that the method support 'HashMap', which in turn
calls for "properly" hashed objects.

> Of course, such a hashCode() would make all those hash-based
> containers pretty much useless: They would work in the sense that
> they would get the Right Answer, but they'd be abominably slow,


Indeed.

> with expected performance of O(N) instead of O(1). See
> <http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/>
> for a survey of some denial-of-service attacks that work by driving
> hash tables from O(1) to O(N), resulting in catastrophic failure
> of the attacked system.
>
> In other words, the letter of the law on hashCode() is a bare
> minimum that guarantees correct functioning, but it is not enough
> to guarantee usability. Why isn't the law more specific? Because


Actually, if you consider all that the Javadocs tell you, this "letter of the
law" to which you refer is like saying the sequence "ABC" constitutes all of
"the ABCs".

> nobody knows how to write "hashCode() must be correct *and* usable"
> in terms that would cover all the classes all the Java programmers
> have dreamed up and will dream up. Your hashCode() meets the bare
> minimum requirement, but is not "usable." The actual hashCode()
> provided by Object also meets the bare minimum requirement, and *is*
> usable as it stands, until (and unless; you don't HAVE to) you
> choose to implement other equals() semantics, and a hashCode() to
> match them.


As Arne states, "correct" means "fulfills the specification". The
specification for Java API methods is the standard Javadocs, which do impose
performance considerations on 'hashCode()'.

One understands that the spec isn't always fully enforceable by the compiler.
[1] It is correct that the compiler will allow 'return 1;'. It is not correct
that that fulfills the specification.

[1] Doesn't one?

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedi.../c/cf/Friz.jpg

 
 
Lew
 
      08-11-2012
Jan Burse wrote:
> Maybe it would make sense to spell out what the contract
> for hashCode() is. Well the contract is simply, the
> following invariant should hold:
>
> /* invariant that should hold */
> if a.equals(b) then a.hashCode()==b.hashCode()


True, but if you read the specification for 'hashCode()' fully, that is not
the entire contract, only the compiler-enforceable part of it.

The entire specification requires that as much as feasible, the 'Object'
implementation distinguish distinct instances, and that the method generally
support 'HashMap', which promises O(1) 'get()' and 'put()' with a "proper"
(i.e., compliant) 'hashCode()'.
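
For reference, the quoted invariant can at least be checked at run time for a given pair of objects. This is a minimal sketch: the 'Point' class and the 'invariantHolds' helper are hypothetical names used only for illustration, not anything from the Javadocs.

```java
public class ContractCheck {
    // Hypothetical value class, used only for illustration.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) {
                return false;
            }
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        // Built from the same fields as equals(), so equal points hash alike.
        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    // The invariant: a.equals(b) implies a.hashCode() == b.hashCode().
    static boolean invariantHolds(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // Equal objects: the hash codes must match.
        System.out.println(invariantHolds(new Point(1, 2), new Point(1, 2)));
        // Unequal objects: the invariant places no demand on the hash codes.
        System.out.println(invariantHolds(new Point(1, 2), new Point(2, 1)));
    }
}
```

Both calls print true. The second one would print true even if the two hash codes collided, which is exactly the part of the Javadoc wording under dispute in this thread.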

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedi.../c/cf/Friz.jpg
 
 
Lew
 
      08-11-2012
Lew wrote:
> Jan Burse wrote:
>> Maybe it would make sense to spell out what the contract
>> for hashCode() is. Well the contract is simply, the
>> following invariant should hold:
>>
>> /* invariant that should hold */
>> if a.equals(b) then a.hashCode()==b.hashCode()

>
> True, but if you read the specification for 'hashCode()' fully, that is not
> the entire contract, only the compiler-enforceable part of it.


Oooops!

I made a mistake.

Not even that is compiler-enforceable.

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedi.../c/cf/Friz.jpg
 
 
Arne Vajhøj
 
      08-12-2012
On 8/11/2012 7:24 PM, Lew wrote:
> On 08/10/2012 04:30 PM, Arne Vajhøj wrote:
>> On 8/10/2012 6:32 PM, Lew wrote:
>>> bob smith wrote:
>>>> Now, there are cases where you HAVE to override it, or your code is
>>>> very
>>>> broken.
>>>
>>> No.

>>
>>> As long as 'hashCode()' fulfills the contract, your code will work -
>>> functionally. But a bad
>>> 'hashCode()' could and likely will noticeably affect performance.
>>> There is
>>> more to correctness
>>> than mere functional conformance.

>>
>> If the code per specs is guaranteed to work then it is correct.
>>
>> Good (or just decent) performance is not necessary for code to
>> be correct.
>>
>> At least not in the traditional programming terminology.
>>
>> In plain English maybe.

>
> I see your point, but that is not to say that the specs exclude
> performance considerations.
>
> In the case of 'hashCode()', the Javadocs do say, "This method is
> supported for the benefit of hash tables such as those provided by
> HashMap."
> <http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode()>
>
> The key question here is how you define "benefit". I argue that a hash
> code that is constant does not benefit, say, a 'HashMap' because one of
> our desired uses is constant-order retrieval.


Object having the method defined to support effective hashing
does not imply that it has to; it just means that the potential
is there.

> "This implementation provides constant-time performance for the basic
> operations (get and put), assuming the hash function disperses the
> elements properly among the buckets."


Yes. And here it makes an assumption. Not that hashCode is implemented
correctly, but that it is implemented in a certain way.

> Each specification refers to the other. Ergo they are meant to be
> considered together. Taken together, the documentation clearly specifies
> that "correct" or "proper" includes performance considerations.
> Therefore, by what you say, the simple "return 1;" is not correct.


> As you say, Arne, "correct" means it follows the spec. The OP's
> suggested implementation violates the spec on two fronts.


No it does not.

It follows exactly the explicitly stated contract in the Javadoc:

<quote>
The general contract of hashCode is:

- Whenever it is invoked on the same object more than once during an
  execution of a Java application, the hashCode method must consistently
  return the same integer, provided no information used in equals
  comparisons on the object is modified. This integer need not remain
  consistent from one execution of an application to another execution of
  the same application.
- If two objects are equal according to the equals(Object) method,
  then calling the hashCode method on each of the two objects must produce
  the same integer result.
- It is not required that if two objects are unequal according to the
  equals(java.lang.Object) method, then calling the hashCode method on
  each of the two objects must produce distinct integer results. However,
  the programmer should be aware that producing distinct integer results
  for unequal objects may improve the performance of hashtables.
</quote>

The ability to support something does not make it part of the contract.

This is a classic test question in basic Java SE. And that returning
a constant is correct but not smart should be in most Java SE
text books.

Arne





 
 
Arne Vajhøj
 
      08-12-2012
On 8/11/2012 7:29 PM, Lew wrote:
> Eric Sosman wrote:
>> Okay: Then returning a constant 1 (or 42 or 0 or whatever)
>> would in fact satisfy the letter of the law regarding hashCode():

>
> Not if you consider all aspects of what the Javadocs promise.
>
> See my post upthread.
>
>> Whenever x.equals(y) is true, x.hashCode() == y.hashCode(). In
>> your example this would be trivially true because x,y,z,... all
>> have the same hashCode() value, whether they're equal or not --
>> You have lived up to the letter of the law.

>
> No, because the law requires that the method support 'HashMap', which in
> turn calls for "properly" hashed objects.
>
>> Of course, such a hashCode() would make all those hash-based
>> containers pretty much useless: They would work in the sense that
>> they would get the Right Answer, but they'd be abominably slow,

>
> Indeed.
>
>> with expected performance of O(N) instead of O(1). See
>> <http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/>
>> for a survey of some denial-of-service attacks that work by driving
>> hash tables from O(1) to O(N), resulting in catastrophic failure
>> of the attacked system.
>>
>> In other words, the letter of the law on hashCode() is a bare
>> minimum that guarantees correct functioning, but it is not enough
>> to guarantee usability. Why isn't the law more specific? Because

>
> Actually, if you consider all that the Javadocs tell you, this "letter
> of the law" to which you refer is like saying the sequence "ABC"
> constitutes all of "the ABCs".
>
>> nobody knows how to write "hashCode() must be correct *and* usable"
>> in terms that would cover all the classes all the Java programmers
>> have dreamed up and will dream up. Your hashCode() meets the bare
>> minimum requirement, but is not "usable." The actual hashCode()
>> provided by Object also meets the bare minimum requirement, and *is*
>> usable as it stands, until (and unless; you don't HAVE to) you
>> choose to implement other equals() semantics, and a hashCode() to
>> match them.

>
> As Arne states, "correct" means "fulfills the specification". The
> specification for Java API methods is the standard Javadocs, which do
> impose performance considerations on 'hashCode()'.
>
> One understands that the spec isn't always fully enforceable by the
> compiler. [1] It is correct that the compiler will allow 'return 1;'. It
> is not correct that that fulfills the specification.


It fulfills the spec.

It does not fulfill your bizarre interpretation of "support".

Arne


 
 
Arne Vajhøj
 
      08-12-2012
On 8/11/2012 7:34 PM, Lew wrote:
> Jan Burse wrote:
>> Maybe it would make sense to spell out what the contract
>> for hashCode() is. Well the contract is simply, the
>> following invariant should hold:
>>
>> /* invariant that should hold */
>> if a.equals(b) then a.hashCode()==b.hashCode()

>
> True, but if you read the specification for 'hashCode()' fully, that is
> not the entire contract, only the compiler-enforceable part of it.
>
> The entire specification requires that as much as feasible, the 'Object'
> implementation distinguish distinct instances, and that the method
> generally support 'HashMap', which promises O(1) 'get()' and 'put()'
> with a "proper" (i.e., compliant) 'hashCode()'.


Two wrong statements.

It says that the method is defined to support HashMap.

And HashMap does not guarantee O(1) with a correct
hashCode - it guarantees that for one that returns
well-distributed values.

Arne


 
 
Arne Vajhøj
 
      08-12-2012
On 8/11/2012 10:15 PM, Arne Vajhøj wrote:
> This is a classic test question in basic Java SE. And that returning
> a constant is correct but not smart should be in most Java SE
> text books.


Effective Java / Joshua Bloch:

<quote>
// The worst possible legal hash function - never use!
public int hashCode() { return 42; }

It is legal because it ensures that equal objects have the
same hash code. It's atrocious because ...
</quote>

Java 2 SUN Certified Programmer & Developer / Kathy Sierra & Bert Bates:

<quote>
A hashCode() that returns the same value for all instances whether
they're equal or not is still a legal - even appropriate - hashCode()
method! For example,
    public int hashCode() {
        return 1492;
    }
would not violate the contract
....
This hashCode() method is horribly inefficient, ...
....
Nonetheless, this one-hash-fits-all method would be
considered appropriate and even correct because it
doesn't violate the contract. Once more, correct does
not necessarily mean good.
</quote>
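
What both quotes describe can be tried directly. The following is a minimal sketch, assuming a hypothetical 'Key' class (not from either book) whose hashCode() is constant: a HashMap built on it still gives the right answers, because equals() decides the final match. All keys share one bucket, which is where the slowness comes from; the sketch does not time anything.

```java
import java.util.HashMap;
import java.util.Map;

public class ConstantHashDemo {
    // Hypothetical key class whose hashCode() is constant, as in the quotes.
    static final class Key {
        final String name;

        Key(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }

        // Legal: equal keys trivially share a hash code.
        // Every key collides with every other key.
        @Override
        public int hashCode() {
            return 42;
        }
    }

    // Fill a map with n distinct keys "k0" .. "k(n-1)".
    static Map<Key, Integer> build(int n) {
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key("k" + i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Key, Integer> map = build(1000);
        System.out.println(map.size());                  // 1000
        System.out.println(map.get(new Key("k500")));    // 500
        System.out.println(map.get(new Key("missing"))); // null
    }
}
```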

Arne




 
 
Arne Vajhøj
 
      08-12-2012
On 8/11/2012 7:54 AM, Roedy Green wrote:
> On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew
> wrote, quoted or indirectly quoted someone who said:
>
>> h = 31 * h + attribute.hashCode();
>> }

> In my essay I recommend XOR which is an inherently faster operation
> than multiply. I wonder which actually works out better.


Multiply.

XOR has several problems:
- many small values give small result
- same values in different fields give same result
- two identical values give result zero
+ all those I did not think of.
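
A small sketch of the three listed problems, for two int fields. The helper names are made up for illustration, and the starting value 17 in the multiply variant is an arbitrary assumption, not something fixed by the quoted snippet.

```java
public class XorVersusMultiply {
    // Combine two field hashes with XOR.
    static int xorHash(int a, int b) {
        return a ^ b;
    }

    // Combine two field hashes with the 31-multiplier scheme.
    // The seed 17 is an arbitrary choice made for this sketch.
    static int mulHash(int a, int b) {
        int h = 17;
        h = 31 * h + a;
        h = 31 * h + b;
        return h;
    }

    public static void main(String[] args) {
        // Small values give a small result: 3 ^ 5 is 6.
        System.out.println(xorHash(3, 5));
        // Same values in different fields: XOR cannot tell them apart.
        System.out.println(xorHash(3, 5) == xorHash(5, 3));   // true
        System.out.println(mulHash(3, 5) == mulHash(5, 3));   // false
        // Two identical values always XOR to zero.
        System.out.println(xorHash(7, 7));                    // 0
        System.out.println(xorHash(1234, 1234));              // 0
        System.out.println(mulHash(7, 7) == mulHash(1234, 1234)); // false
    }
}
```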

> If you had a
> large number of fields, the multiply effect could fall off the left
> hand end. It is the algorithm used for String which could have very
> long strings, so Sun must have thought of that.


The multiply effect does not fall off the left with a value like 31
(it would with 32).
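
To illustrate: with 32, each step shifts earlier contributions left by 5 bits, so after 7 further steps the first value has left a 32-bit int entirely. With 31, an odd number, multiplication modulo 2^32 is reversible, so nothing is lost that way. A minimal sketch; the 'hash' helper and the test arrays are assumptions made for the example.

```java
public class MultiplierOverflow {
    // h = multiplier * h + v for each value; int arithmetic wraps modulo 2^32.
    static int hash(int multiplier, int[] values) {
        int h = 0;
        for (int v : values) {
            h = multiplier * h + v;
        }
        return h;
    }

    public static void main(String[] args) {
        // Two arrays that differ only in their first element.
        int[] a = {1, 0, 0, 0, 0, 0, 0, 0};
        int[] b = {2, 0, 0, 0, 0, 0, 0, 0};

        // With 32 the first element is multiplied by 32^7 = 2^35 and vanishes.
        System.out.println(hash(32, a) == hash(32, b)); // true
        // With 31 the first element still influences the result.
        System.out.println(hash(31, a) == hash(31, b)); // false
    }
}
```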

Arne




 