Velocity Reviews > Re: Any built-in ishashable method ?

# Re: Any built-in ishashable method ?

Steven D'Aprano
Guest
Posts: n/a

 01-18-2013
On Fri, 18 Jan 2013 12:56:09 +0100, Jean-Michel Pichavant wrote:

>> So I'm guessing you had a key where
>>
>> key1 == key2 did not imply hash(key1) == hash(key2)
>>
>> I don't see a way to avoid that problem in a look-before-you-leap test.
>>
>> --
>> http://mail.python.org/mailman/listinfo/python-list
>>
>>

> You guessed right. But it took me a lot of time before jumping to that
> conclusion, mostly because the code worked in the first place (it can
> with a little bit of luck). Now I'm extra careful about what I use as
> dict key, I was just wondering if there was a reliable quick way to
> verify an object can be used as key.

No. You can verify that an object is hashable, by trying to hash it, but
there is no quick way to verify that the object's hash method isn't buggy.

> That brings me to another question, is there any valid test case where
> key1 != key2 and hash(key1) == hash(key2) ? Or is it some kind of design
> flaw ?

Neither. It is a consequence of the nature of the universe: the
Pigeonhole Principle. You can't put N+1 pigeons into N pigeonholes
without at least one pigeonhole containing two pigeons.

Take, for example, ints. There are an infinite number of possible ints in
Python, or at least limited only by memory. But there are only a finite
number of values they can be hashed to: hash values are limited to the
values -2147483648 through 2147483647 in Python 2.7. You can't have an
infinite set of values hash to a finite number of hashes without there
being some duplicates. And so we have:

py> hash(1)
1
py> hash(2**32)
1
py> hash(2**64)
1
py> hash(2**12
1
py> hash(2**204
1

etc. Likewise, there are an infinite number of potential strings, which
must hash into a finite number of hash values. Although it is hard for me
to find hash collisions, I know that there must be some, because there
are more strings than hash values. This is not a design flaw in the hash
function, and can't be fixed by finding a more clever hash function. Nor
is it a deliberate feature. It's just the way hashing works.

--
Steven