Go Back   Velocity Reviews > Newsgroups > Python
User Name
Password
Register FAQ Members List Calendar Search Today's Posts Mark Forums Read

Reply

Python - Proposed implementation for an Ordered Dictionary

 
Thread Tools Search this Thread
Old 02-27-2009, 08:49 AM   #11
Default Re: Proposed implementation for an Ordered Dictionary


Raymond Hettinger:
>Paul Rubin:
>>another (messy) approach would be to write a C
>>extension that uses a doubly linked list some day.

>
> That seems like an ideal implementation to me.


This was my Python implementation, where the delete too is O(1), but
it's slow:
http://code.activestate.com/recipes/498195/

I think the C extension with the doubly linked list is the best
approach.
Note that in modern CPUs caches are able to change the performance a
lot. So reducing the memory used is very important. So using the XOR
(or subtraction) trick to use only 1 pointer for each key-value may be
useful to keep performance not too much far from the normal python
dicts:
http://en.wikipedia.org/wiki/XOR_linked_list

Bye,
bearophile


bearophileHUGS@lycos.com
  Reply With Quote
Old 02-27-2009, 09:00 AM   #12
Paul Rubin
 
Posts: n/a
Default Re: Proposed implementation for an Ordered Dictionary
writes:
> So using the XOR (or subtraction) trick to use only 1 pointer for
> each key-value may be useful to keep performance not too much far
> from the normal python dicts: http://en.wikipedia.org/wiki/XOR_linked_list


I don't see how to delete a randomly chosen node if you use that
trick, since the hash lookup doesn't give you two consecutive nodes
in the linked list to xor together. Maybe you could do something
intricate with skip lists and end up with O(log n) deletion.


Paul Rubin
  Reply With Quote
Old 02-27-2009, 10:15 AM   #13
bearophileHUGS@lycos.com
 
Posts: n/a
Default Re: Proposed implementation for an Ordered Dictionary
Paul Rubin:
>I don't see how to delete a randomly chosen node if you use that trick, since the hash lookup doesn't give you two consecutive nodes in the linked list to xor together.<


Thank you, I think you are right, I am sorry.
So on 32-bit CPUs you need to add 8 bytes to each value.

On 64-bit CPUs you may use a double indexed list, instead of a double
linked list. And each index can be stored in 6 bytes instead of 8
(281_474_976_710_656 buckets may be enough for most people), so you
need only 12 bytes for two indexes instead of 16 bytes, this helps the
L1 cache (and the small L2 cache too on Core i7) a bit. But this may
slow down too much the ordered iteration.

Bye,
bearophile


bearophileHUGS@lycos.com
  Reply With Quote
Old 02-27-2009, 01:08 PM   #14
Steve Holden
 
Posts: n/a
Default Re: Proposed implementation for an Ordered Dictionary
wrote:
> Paul Rubin:
>> I don't see how to delete a randomly chosen node if you use that trick, since the hash lookup doesn't give you two consecutive nodes in the linked list to xor together.<

>
> Thank you, I think you are right, I am sorry.
> So on 32-bit CPUs you need to add 8 bytes to each value.
>
> On 64-bit CPUs you may use a double indexed list, instead of a double
> linked list. And each index can be stored in 6 bytes instead of 8
> (281_474_976_710_656 buckets may be enough for most people), so you
> need only 12 bytes for two indexes instead of 16 bytes, this helps the
> L1 cache (and the small L2 cache too on Core i7) a bit. But this may
> slow down too much the ordered iteration.
>

A sort of premature pessimization, then.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/



Steve Holden
  Reply With Quote
Old 02-27-2009, 03:16 PM   #15
bearophileHUGS@lycos.com
 
Posts: n/a
Default Re: Proposed implementation for an Ordered Dictionary
Steve Holden:
> A sort of premature pessimization, then.


Maybe not, the save in memory may lead to higher speed anyway. So you
need to test it to know the overall balance. And in data structures
with general purpose you want all the speed you can get.

Bye,
bearophile


bearophileHUGS@lycos.com
  Reply With Quote
Old 02-28-2009, 05:26 AM   #16
Raymond Hettinger
 
Posts: n/a
Default Re: Proposed implementation for an Ordered Dictionary
[Steve Holden]
> A sort of premature pessimization, then.


QOTW!

_ ~
@ @
\_/


Raymond



Raymond Hettinger
  Reply With Quote
Old 02-28-2009, 01:12 PM   #17
python@bdurham.com
 
Posts: n/a
Default Re: Proposed implementation for an Ordered Dictionary

>> A sort of premature pessimization, then.


QOTW!

+1

Malcolm


python@bdurham.com
  Reply With Quote
Old 02-28-2009, 03:58 PM   #18
Colin J. Williams
 
Posts: n/a
Default Re: Proposed implementation for an Ordered Dictionary
Raymond Hettinger wrote:
> [Paul Rubin]
>> Ehh, I guess I'm not surprised at the slowdown and extra complexity
>> from the second dict. Oh well. If the module really turns out to be
>> really used a lot, another (messy) approach would be to write a C
>> extension that uses a doubly linked list some day.

>
> That seems like an ideal implementation to me.
>
> O(1): appending, popping, searching, and deletion
> O(n): traversal
>
> Raymond


Sometimes, it's useful to be able to
obtain the data in the sorted sequence.

You might consider adding functionality
like:

def seqItems(self):
'''To return the items, sorted
by key. '''
return [self[k] for k in
self.seqKeys()]

def seqKeys(self):
''' To return the keys sorted. '''
k= self.keys()
k.sort()
return k

def seqValues(self):
''' To return the values, with
their keys, sorted by value. '''
v= [(it[1], it[0]) for it in
self.items()]
v.sort()
return v

Colin W.


Colin J. Williams
  Reply With Quote
Old 02-28-2009, 04:27 PM   #19
Steven D'Aprano
 
Posts: n/a
Default Re: Proposed implementation for an Ordered Dictionary
Colin J. Williams wrote:

> Sometimes, it's useful to be able to
> obtain the data in the sorted sequence.
>
> You might consider adding functionality
> like:
>
> def seqItems(self):
> '''To return the items, sorted
> by key. '''
> return [self[k] for k in
> self.seqKeys()]


Amazingly, the Python time-machine provides such functionality! Using just a
handful of pre-existing pieces, you can do this:

sorted(mydict.items())
sorted(mydict.keys())
[mydict[x] for x in sorted(mydict.keys)]

and any other sequence you might need. That's the beauty of a general
purpose programming language. You don't need everything to be a built-in
*wink*



--
Steven



Steven D'Aprano
  Reply With Quote
Old 03-01-2009, 12:21 AM   #20
Colin J. Williams
 
Posts: n/a
Default Re: Proposed implementation for an Ordered Dictionary
Steven D'Aprano wrote:
> Colin J. Williams wrote:
>
>> Sometimes, it's useful to be able to
>> obtain the data in the sorted sequence.
>>
>> You might consider adding functionality
>> like:
>>
>> def seqItems(self):
>> '''To return the items, sorted
>> by key. '''
>> return [self[k] for k in
>> self.seqKeys()]

>
> Amazingly, the Python time-machine provides such functionality! Using just a
> handful of pre-existing pieces, you can do this:
>
> sorted(mydict.items())
> sorted(mydict.keys())
> [mydict[x] for x in sorted(mydict.keys)]
>
> and any other sequence you might need. That's the beauty of a general
> purpose programming language. You don't need everything to be a built-in
> *wink*
>
>

Thanks, you are right, you have a neat
way of handling the first two, they
work with an instance of Raymond
Hettinger's. OrderedDict.

The third suggestion has a couple of
problems, which are fixed below.

if __name__ == '__main__':
d=
OrderedDict.fromkeys('abracadabra',
value= 'zzz')
print(d)
print(d.seqKeys())
print(d.seqItems())
print(d.seqValues())
# Steven D'Aprano's alternative
mydict= d.copy()
print sorted(mydict.items())
print sorted(mydict.keys())
# print [mydict[x] for x in
sorted(mydict.keys)] Instance object is
not iterable
print(sorted(iter([(x[1], x[0]) for
x in mydict.iteritems()]))) # This works

Colin W.


Colin J. Williams
  Reply With Quote
Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are Off
Pingbacks are Off
Refbacks are Off




SEO by vBSEO 3.3.2 ©2009, Crawlability, Inc.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46