Velocity Reviews > Retrieving an object from a set

# Retrieving an object from a set

Arnaud Delobelle
Guest
Posts: n/a

 01-25-2013
Dear Pythoneers,

I've got a seemingly simple problem, but for which I cannot find a
simple solution.

I have a set of objects (say S) containing an object which is equal to
a given object (say x). So

x in S

is true. So there is an object y in S which is equal to x. My
problem is how to retrieve y, without going through the whole set.
Here is a simple illustration with tuples (my actual scenario is not
with tuples but with a custom class):

>>> y = (1, 2, 3) # This is the 'hidden object'
>>> S = set([y] + range(10000))
>>> x = (1, 2, 3)
>>> x in S

True
>>> x is y

False

I haven't found y. It's a very simple problem, and this is the
simplest solution I can think of:

class FindEqual(object):
def __init__(self, obj):
self.obj = obj
def __hash__(self):
return hash(self.obj)
def __eq__(self, other):
equal = self.obj == other
if equal:
self.lastequal = other
return equal

>>> yfinder = FindEqual(x)
>>> yfinder in S

True
>>> yfinder.lastequal is y

True

I've found y! I'm not happy with this as it really is a trick. Is
there a cleaner solution?

--
Arnaud

Steven D'Aprano
Guest
Posts: n/a

 01-26-2013
Arnaud Delobelle wrote:

> Dear Pythoneers,
>
> I've got a seemingly simple problem, but for which I cannot find a
> simple solution.
>
> I have a set of objects (say S) containing an object which is equal to
> a given object (say x). So
>
> x in S
>
> is true. So there is an object y in S which is equal to x. My
> problem is how to retrieve y, without going through the whole set.

Why do you care? Since x == y, what benefit do you get from extracting the
actual object y?

I'm not necessarily saying that you *should not* care, only that it is a
very rare occurrence. The only thing I can think of is interning objects,
which is best done with a dict, not a set:

CACHE = {}

def intern(obj):
return CACHE.setdefault(obj, obj)

which you could then use like this:

py> s = "hello world"
py> intern(s)
'hello world'
py> t = 'hello world'
py> t is s
False
py> intern(t) is s
True

However, there aren't very many cases where doing this is actually helpful.
Under normal circumstances, object equality is much more important than
identity, and if you find that identity is important to you, then you

So... now that I've told you why you shouldn't do it, here's how you can do
it anyway:

def retrieve(S, x):
"""Returns the object in set S which is equal to x."""
s = set(S) # make a copy of S
t = S.difference(s)
if t:
return t.pop()

S = set(range(10))
y = (1, 2, "hello world")
x = (1, 2, "hello world")
assert x is not y and x == y
z = retrieve(S, x)
assert z is y

By the way, since this makes a copy of the set, it is O(n). The
straight-forward approach:

for element in S:
if x == element:
x = element
break

is also O(n), but with less overhead. On the other hand, the retrieve
function above does most of its work in C, while the straight-forward loop
is pure Python, so it's difficult to say which will be faster. I suggest
you time them and see.

--
Steven