Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > Re: Ordering python sets

Reply
Thread Tools

Re: Ordering python sets

 
 
Tim Chase
Guest
Posts: n/a
 
      10-22-2008
> I think that using Python sets would be the best choice, but I also
> need integers to be ordered inside the set and I've just found out
> that, instead, Python sets are unordered collections.


Sets (both in Python, and their mathematical definition) are
unordered. However, some simple testing in my current python
seems to indicate that both their repr() and their __iter__()
seem to traverse in sorted order (though that may be a niceness
that Python performs in 2.3 and above). You can just order them
when you go to operate on them:

import random as r

# to test in <2.4
from sets import Set as set
s = set([r.randint(1,1000) for _ in range(1000)])
# for 2.4+
s = set(r.randint(1,1000) for _ in range(1000))

print s
for i in sorted(list(s)):
do_something(i)

Though for each test, in 2.3, 2.4, and 2.5 that I've got
installed on my local machine, they each printed "s" in-order,
and the iteration occurred in-order as well, even without the
added "sorted(list(s))" code.

-tkc



 
Reply With Quote
 
 
 
 
Peter Otten
Guest
Posts: n/a
 
      10-22-2008
Tim Chase wrote:

> Though for each test, in 2.3, 2.4, and 2.5 that I've got
> installed on my local machine, they each printed "s" in-order,
> and the iteration occurred in-order as well, even without the
> added "sorted(list(s))" code.


You need more tests then

>>> list(set([1,1000]))

[1000, 1]

By the way, sorted(s) is sufficient; sorted() accepts arbitrary iterables.

Peter
 
Reply With Quote
 
 
 
 
Mr.SpOOn
Guest
Posts: n/a
 
      10-22-2008
On Wed, Oct 22, 2008 at 3:37 PM, Peter Otten <(E-Mail Removed)> wrote:
> Tim Chase wrote:
>
>> Though for each test, in 2.3, 2.4, and 2.5 that I've got
>> installed on my local machine, they each printed "s" in-order,
>> and the iteration occurred in-order as well, even without the
>> added "sorted(list(s))" code.

>
> You need more tests then
>
>>>> list(set([1,1000]))

> [1000, 1]


It seems to me that it orders elements when you add using the add()
method, but if you create a set starting from a list, it may result
unordered.

Anyway, the add() fails to order if you try to add a float in a set of integers.

> By the way, sorted(s) is sufficient; sorted() accepts arbitrary iterables.


I'll try with this.
 
Reply With Quote
 
Roy Smith
Guest
Posts: n/a
 
      10-22-2008
In article <(E-Mail Removed)>,
Mr.SpOOn <(E-Mail Removed)> wrote:

> It seems to me that it orders elements when you add using the add()
> method, but if you create a set starting from a list, it may result
> unordered.


Arrrggghhh! None of these behaviors are guaranteed. The docs say, "A set
object is an unordered collection". If you write code which depends on a
set preserving order, are going to get burned.

If you want something that preserves order, use a list. If the O(n) lookup
time of a list is too slow, you can get O(log n) with heapq.

If you truly need O(1) lookup time AND need to preserve order, you might
consider writing a class which does something like stores the values in a
list, but use a dict to map value -> list_index, and maintains both data
structures in parallel. Or, roll your own tree structure. Or wait for
true STL-style trees to be added to Python
 
Reply With Quote
 
Mr.SpOOn
Guest
Posts: n/a
 
      10-22-2008
On Wed, Oct 22, 2008 at 4:30 PM, Roy Smith <(E-Mail Removed)> wrote:
> In article <(E-Mail Removed)>,
> Mr.SpOOn <(E-Mail Removed)> wrote:
>
>> It seems to me that it orders elements when you add using the add()
>> method, but if you create a set starting from a list, it may result
>> unordered.

>
> Arrrggghhh! None of these behaviors are guaranteed. The docs say, "A set
> object is an unordered collection". If you write code which depends on a
> set preserving order, are going to get burned.


Yes, of course
I wasn't going to count on that.

> If you want something that preserves order, use a list. If the O(n) lookup
> time of a list is too slow, you can get O(log n) with heapq.


Well, maybe I can just do it using sets and the sorted() method. If
this doesn't satisfy me, I think I'll just use lists.
 
Reply With Quote
 
Peter Pearson
Guest
Posts: n/a
 
      10-22-2008
On Wed, 22 Oct 2008 15:37:03 +0200, Peter Otten <(E-Mail Removed)> wrote:
> Tim Chase wrote:
>
>> Though for each test, in 2.3, 2.4, and 2.5 that I've got
>> installed on my local machine, they each printed "s" in-order,
>> and the iteration occurred in-order as well, even without the
>> added "sorted(list(s))" code.

>
> You need more tests then
>
>>>> list(set([1,1000]))

> [1000, 1]


So one wonders, of course, why Mr. Otten chose 1000; and one explores:

$ python
Python 2.4.3 (#2, Jul 31 2008, 21:56:52)
[snip]
>>> for x in 2, 100, 199, 200, 300, 400, 500, 600:

.... list( set( [ 1, x ] ) )
....
[1, 2]
[1, 100]
[1, 199]
[200, 1]
[1, 300]
[400, 1]
[1, 500]
[600, 1]
>>>



--
To email me, substitute nowhere->spamcop, invalid->net.
 
Reply With Quote
 
Glenn Linderman
Guest
Posts: n/a
 
      10-22-2008
On approximately 10/22/2008 7:48 AM, came the following characters from
the keyboard of Mr.SpOOn:
> On Wed, Oct 22, 2008 at 4:30 PM, Roy Smith <(E-Mail Removed)> wrote:
>
>> In article <(E-Mail Removed)>,
>> Mr.SpOOn <(E-Mail Removed)> wrote:
>>
>>
>>> It seems to me that it orders elements when you add using the add()
>>> method, but if you create a set starting from a list, it may result
>>> unordered.
>>>

>> Arrrggghhh! None of these behaviors are guaranteed. The docs say, "A set
>> object is an unordered collection". If you write code which depends on a
>> set preserving order, are going to get burned.
>>

>
> Yes, of course
> I wasn't going to count on that.
>
>
>> If you want something that preserves order, use a list. If the O(n) lookup
>> time of a list is too slow, you can get O(log n) with heapq.
>>

>
> Well, maybe I can just do it using sets and the sorted() method. If
> this doesn't satisfy me, I think I'll just use lists.
>

Be aware of the bisect module if you decide to go to (sorted) lists.

--
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

 
Reply With Quote
 
Peter Otten
Guest
Posts: n/a
 
      10-22-2008
Peter Pearson wrote:

> On Wed, 22 Oct 2008 15:37:03 +0200, Peter Otten <(E-Mail Removed)> wrote:
>> Tim Chase wrote:
>>
>>> Though for each test, in 2.3, 2.4, and 2.5 that I've got
>>> installed on my local machine, they each printed "s" in-order,
>>> and the iteration occurred in-order as well, even without the
>>> added "sorted(list(s))" code.

>>
>> You need more tests then
>>
>>>>> list(set([1,1000]))

>> [1000, 1]

>
> So one wonders, of course, why Mr. Otten chose 1000; and one explores:
>
> $ python
> Python 2.4.3 (#2, Jul 31 2008, 21:56:52)
> [snip]
>>>> for x in 2, 100, 199, 200, 300, 400, 500, 600:

> ... list( set( [ 1, x ] ) )
> ...
> [1, 2]
> [1, 100]
> [1, 199]
> [200, 1]
> [1, 300]
> [400, 1]
> [1, 500]
> [600, 1]
>>>>


Here's another one:

>>> set([1,9])

set([1, 9])
>>> set([9,1])

set([9, 1])

This time I did indeed search systematically...

Peter

 
Reply With Quote
 
Peter Otten
Guest
Posts: n/a
 
      10-22-2008
Duncan Booth wrote:

> Peter Otten <(E-Mail Removed)> wrote:
>
>> Here's another one:
>>
>>>>> set([1,9])

>> set([1, 9])
>>>>> set([9,1])

>> set([9, 1])
>>
>> This time I did indeed search systematically...
>>

> You missed one with smaller values:
>>>> set([8,0])

> set([8, 0])
>>>> set([0,8])

> set([0, 8])


I searched the minimal combination with one...

> You can work some of it out quite easily instead of searching. The hash
> value for an integer is itself, the initial space allocated for the set is
> 8 slots so each value is reduced modulo 8. If the values you insert don't
> clash then the order of insertion doesn't matter. If there are values
> which coincide on the same slot then the second one inserted will go into
> a different slot so the order may vary.


I guess I have to move the goal posts to beat you:

>>> set([-1,-2]), set([-2,-1])

(set([-2, -1]), set([-1, -2]))

For that one the number of slots doesn't matter because

>>> hash(-1), hash(-2)

(-2, -2)

Peter
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Re: Wanted: Python solution for ordering dependencies Eduardo Schettino Python 3 04-25-2010 05:00 PM
Z-Ordering (Morton ordering) question nbigaouette C Programming 2 11-06-2009 05:26 AM
Ordering python sets Mr.SpOOn Python 31 11-06-2008 01:28 PM
Sets in Python sapsi Python 33 09-22-2007 02:23 PM
Python sets. Grzegorz Dostatni Python 4 05-05-2004 02:07 AM



Advertisments