Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > Re: sorting on keys in a list of dicts

Reply
Thread Tools

Re: sorting on keys in a list of dicts

 
 
Jp Calderone
Guest
Posts: n/a
 
      01-06-2005
On Thu, 06 Jan 2005 15:31:22 +0100, J Berends <j.p.t.j.berends@[n0sp4m].nl> wrote:
>Suppose I have a list of dictionaries and each dict has a common keyname
> with a (sortable) value in it.
>
> How can I shuffle their position in the list in such way that they
> become sorted.
>


In Python 2.4,

import operator
L.sort(key=operator.itemgetter(key))

In Python 2.3,

L2 = [(d[key], i, d) for (i, d) in enumerate(L)]
L2.sort()
L = [d for (v, i, d) in L2]

Jp
 
Reply With Quote
 
 
 
 
J Berends
Guest
Posts: n/a
 
      01-06-2005
Jp Calderone wrote:
> On Thu, 06 Jan 2005 15:31:22 +0100, J Berends <j.p.t.j.berends@[n0sp4m].nl> wrote:
>
>>Suppose I have a list of dictionaries and each dict has a common keyname
>>with a (sortable) value in it.
>>
>>How can I shuffle their position in the list in such way that they
>>become sorted.
>>

>
>
> In Python 2.4,
>
> import operator
> L.sort(key=operator.itemgetter(key))
>
> In Python 2.3,
>
> L2 = [(d[key], i, d) for (i, d) in enumerate(L)]
> L2.sort()
> L = [d for (v, i, d) in L2]
>
> Jp

the 2.3 version seems to work quite nicely! thanks. Using the 2.3 for
version compatibility. Thanks!
 
Reply With Quote
 
 
 
 
Jeff Shannon
Guest
Posts: n/a
 
      01-06-2005
Jp Calderone wrote:

> L2 = [(d[key], i, d) for (i, d) in enumerate(L)]
> L2.sort()
> L = [d for (v, i, d) in L2]


Out of curiosity, any reason that you're including the index? I'd
have expected to just do

L2 = [(d[key], d) for d in L]
L2.sort()
L = [d for (v, d) in L2]

I suppose that your version has the virtue that, if the sortkey value
is equal, items retain the order that they were in the original list,
whereas my version will sort them into an essentially arbitrary order.
Is there anything else that I'm missing here?

Jeff Shannon
Technician/Programmer
Credit International

 
Reply With Quote
 
Nick Coghlan
Guest
Posts: n/a
 
      01-07-2005
Jeff Shannon wrote:
> I suppose that your version has the virtue that, if the sortkey value is
> equal, items retain the order that they were in the original list,
> whereas my version will sort them into an essentially arbitrary order.
> Is there anything else that I'm missing here?


Stability in sorting is a property not to be sneezed at - it means switching to
sorting by a second key gives the effect of "sort by key 1, then by key 2",
whereas that doesn't hold with an unstable sort algorithm. If you've ever used
an application with an unstable sorting process and that only allows sorting a
table on one column at a time, you'll appreciate the frustration that can cause

Also, it's required to match the behaviour of the Python 2.4 version (which gets
to take advantage of the stability of the builtin sort).

Cheers,
Nick.

--
Nick Coghlan | http://www.velocityreviews.com/forums/(E-Mail Removed) | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net
 
Reply With Quote
 
It's me
Guest
Posts: n/a
 
      01-07-2005
What does it mean by "stability in sorting"?

Can somebody please give a sample for using the code posted? I am a little
lost here and I like to know more about the use of keys....

Thanks,

"Nick Coghlan" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> Jeff Shannon wrote:
> > I suppose that your version has the virtue that, if the sortkey value is
> > equal, items retain the order that they were in the original list,
> > whereas my version will sort them into an essentially arbitrary order.
> > Is there anything else that I'm missing here?

>
> Stability in sorting is a property not to be sneezed at - it means

switching to
> sorting by a second key gives the effect of "sort by key 1, then by key

2",
> whereas that doesn't hold with an unstable sort algorithm. If you've ever

used
> an application with an unstable sorting process and that only allows

sorting a
> table on one column at a time, you'll appreciate the frustration that can

cause
>
> Also, it's required to match the behaviour of the Python 2.4 version

(which gets
> to take advantage of the stability of the builtin sort).
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | (E-Mail Removed) | Brisbane, Australia
> ---------------------------------------------------------------
> http://boredomandlaziness.skystorm.net



 
Reply With Quote
 
Craig Ringer
Guest
Posts: n/a
 
      01-07-2005
On Sat, 2005-01-08 at 00:26, It's me wrote:
> What does it mean by "stability in sorting"?


If I understand correctly, it means that when two sorts are performed in
sequence, the keys that are equal to the second sort end up ordered the
way they were left by the first sort.

I'm far from certain of this, but at least I'm presenting an opportunity
for someone to yell "no, you're wrong!" and in the process definitively
answer the question.

For example, given the list:

..>>> l = [(1,2), (8,2), (2,2), (3,2), (4,3), (5,3), (8,9)]

if we sort by the first element of each tuple then the second (the
default), we get:

..>>> l.sort()
..>>> l
[(1, 2), (2, 2), (3, 2), (4, 3), (5, 3), (8, 2), (8, 9)]

Now, if we sort based on the second element we get:

..>>> def seconditem(x):
..... return x[1]
.....
..>>> l.sort(key=seconditem)
..>>> l
[(1, 2), (2, 2), (3, 2), (8, 2), (4, 3), (5, 3), (8, 9)]

You'll note that there are several correct answers to the request "sort
the list 'l' by the second element of each item", including:

[(1, 2), (2, 2), (3, 2), (8, 2), (4, 3), (5, 3), (8, 9)]
[(2, 2), (1, 2), (8, 2), (3, 2), (4, 3), (5, 3), (8, 9)]
[(1, 2), (2, 2), (3, 2), (8, 2), (5, 3), (4, 3), (8, 9)]

and many others. Because we didn't specify that the first item in the
value tuples should be used in the sort key, so long as the second key
is equal for a group of items it doesn't matter what order items in that
group appear in.

Python (at least 2.4), however, returns those groups where the order
isn't defined in the same order they were before the sort. Look at this,
for example:

..>>> l.sort()
..>>> l.reverse()
..>>> l
[(8, 9), (8, 2), (5, 3), (4, 3), (3, 2), (2, 2), (1, 2)]
..>>> l.sort(key=seconditem)
..>>> l
[(8, 2), (3, 2), (2, 2), (1, 2), (5, 3), (4, 3), (8, 9)]

See how the exact same sort command was used this time around, but
because the list was reverse-sorted first, the elements are in reverse
order by first item when the second item is equal?

In the first case we used the same result as the stable sort could be
obtained with:

..>>> def revitem(x):
..... return (x[1], x[0])
>>> l.sort(key=revitem)
>>> l

[(1, 2), (2, 2), (3, 2), (8, 2), (4, 3), (5, 3), (8, 9)]

(in other words, saying "use the value tuple as the sort key, but sort
by the second element before the first")

That doesn't extend to more complex cases very well though. Imagine you
had 3-tuples not 2-tuples, and wanted to maintain the previous sort
order of equal groupings when re-sorting by a different key... but you
didn't know what key was last used for sorting. A stable sort algorithm
means you don't need to care, because the order will be maintained for
you not randomized.

Well, that's several hundred more words than were probably required, but
I hope I made sense.

--
Craig Ringer

 
Reply With Quote
 
Nick Coghlan
Guest
Posts: n/a
 
      01-07-2005
It's me wrote:
> What does it mean by "stability in sorting"?
>
> Can somebody please give a sample for using the code posted? I am a little
> lost here and I like to know more about the use of keys....


It's the jargon for what Jeff said - if you are sorting by some value calculated
from each list entry, and different list entries may give the same value, then a
stable sort guarantees that the order of entries giving the same value will be
preserved from the original list.

Consider:

Py> from operator import itemgetter as item
Py> seq = [(1, 1), (2, 1), (2, 3), (1, 5)]
Py> seq.sort(key=item(1))
Py> seq #1
[(1, 1), (2, 1), (2, 3), (1, 5)]
Py> seq.sort(reverse=True)
Py> seq #2
[(2, 3), (2, 1), (1, 5), (1, 1)]
Py> seq.sort(key=item(1))
Py> seq #3
[(2, 1), (1, 1), (2, 3), (1, 5)]

This snippet sorts the tuples according to the second item, then sorts them in
reverse order of the whole tuple, then resorts them according to the second item.

Notice that the order of the first two items is different between point #1 and
point #3. This is because sorting by the second item makes no distinction
between these two tuples, and they retain whatever order they had before the
sort began.

This is what it means to have a stable sort, and it makes expressing complex
sorting quite easy by chaining different sort operations together.

Python 2.3 has a stable sort, and Python 2.4 brought the guarantee that it shall
remain that way. I'm not sure about Python 2.2 and earlier.

Cheers,
Nick.

--
Nick Coghlan | (E-Mail Removed) | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net
 
Reply With Quote
 
Tim Peters
Guest
Posts: n/a
 
      01-07-2005
[Nick Coghlan]
....
> Python 2.3 has a stable sort, and Python 2.4 brought the guarantee that it shall
> remain that way. I'm not sure about Python 2.2 and earlier.


No list.sort() implementation before 2.3 was stable. It was
confusing, though, because the samplesort/binary_insertion_sort hybrid
Python used for the 4 years preceding 2.3 *was* stable for all "small
enough" lists. More than one person got fooled by guessing that
stability observed in small test cases meant it was always stable.
 
Reply With Quote
 
Nick Coghlan
Guest
Posts: n/a
 
      01-07-2005
Craig Ringer wrote:
> Well, that's several hundred more words than were probably required, but
> I hope I made sense.


Remarkably similar to what I just posted. . . I guess numeric 2-tuples are just
too good to pass up when discussing sorting

Cheers,
Nick.

--
Nick Coghlan | (E-Mail Removed) | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net
 
Reply With Quote
 
Scott David Daniels
Guest
Posts: n/a
 
      01-07-2005
Jeff Shannon wrote:
> Jp Calderone wrote:
>> L2 = [(d[key], i, d) for (i, d) in enumerate(L)]
>> L2.sort()
>> L = [d for (v, i, d) in L2]

> Out of curiosity, any reason that you're including the index?


Others have already remarked that this preserves sort stability
(which is, in fact a lovely property). There is another property
which hasn't been mentioned:
As written, only the key and the index are compared.

Try sorting:
tricky = [dict(a=5j, b=1), dict(a=4j, b=1)]
or:
similar = [(5j, 1), (4j, 1)]

without the index.

--Scott David Daniels
(E-Mail Removed)



 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
generating unique set of dicts from a list of dicts bruce Python 0 01-10-2012 08:24 PM
Sorting list vs sorting vector boltar2003@boltar.world C++ 2 07-06-2010 09:40 AM
sorting on keys in a list of dicts J Berends Python 1 01-06-2005 02:40 PM
unicode keys in dicts Jiba Python 2 01-08-2004 03:20 PM
Dos Keys, Window Keys, is there a list of what I must memorize Sean Cleary A+ Certification 0 08-04-2003 02:03 AM



Advertisments