Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > Builtin classes list, set, dict reimplemented via B-trees

Reply
Thread Tools

Builtin classes list, set, dict reimplemented via B-trees

 
 
barnesc@engr.orst.edu
Guest
Posts: n/a
 
      09-14-2005

Hi again,

Since my linear algebra library appears not to serve any practical
need (I found cgkit, and that works better for me), I've gotten bored
and went back to one of my other projects: reimplementing the Python
builtin classes list(), set(), dict(), and frozenset() with balanced
trees (specifically, counted B-trees stored in memory).

In short, this allows list lookup, insertion, deletion in O(log(N))
time. It allows the set and dictionary types to be maintained in
order, and insert/lookup/remove operations here take O(log(N)) time
as well. Getting the k least or k greatest elements takes
O(log(N)+k) time.

I'm about 50% done with the implementation of this module.

I thought I'd use this for Dijkstra's algorithm and some schedulers,
but then I noticed that a binary heap can be retrofitted to have the
DECREASE-KEY operation (see e.g. David Eppstein's priorityQueue).
Thus my B-tree based classes aren't really necessary, since there are
simpler heap implementations that do the same thing.

So my question is: are there any other *practical* applications of a
B-tree based list/set/dict ? In other words, is this module totally
worth coding, or is it just academic wankery and theoretical flim
flam ?

Thanks for your comments/opinions/etc...

- Connelly Barnes
'Y29ubmVsbHliYXJuZXNAeWFob28uY29t'.decode('base64' )


---------------------------------------------------------------------
Detailed overview of the bcollections module (from the docstring)
---------------------------------------------------------------------

"""
....

The bset() and bdict() classes improve upon the builtin set and
dictionary types by maintaining the elements in sorted order.

Generally, the b* classes are a factor of O(log(N)) slower[2] than the
corresponding builtin classes, except for certain operations:

- For a bset(), it takes O(log(N)) time to get the minimum, maximum,
or ith element. Also, it takes O(log(N)+k) time to get the
k first, k last, or any slice of k elements from the sorted set.
- For a bdict(), dictionary keys can be retrieved in sorted order
with the above time bounds.
- For a blist(), items can be inserted and removed in O(log(N)) time.
It takes O(log(N)+k) time to get, set, or delete a slice of k
elements.

....

blist - List implemented via B-tree.
bset - Set implemented via B-tree.
Additional methods:
- s.min() - Minimum element
- s.max() - Maximum element
- s.index(x) - Index of x in sorted list of
elements.
- s.next(x) - Least element which is > x
(x need not be in the set).
- s.previous(x) - Greatest element which is < x
(x need not be in the set).
- s[i] - Get element i from sorted list.
- s[i:j] - Get slice from sorted list.
bfrozenset - Immutable set implemented via B-tree.
bdict - Dict implemented via B-tree.
Additional methods:
- a.min() - Minimum key
- a.max() - Maximum key
- a.index(k) - Index of k in sorted list of
keys.
- s.next(x) - Least key which is > x
(x need not be an existing key).
- s.previous(x) - Greatest key which is < x
(x need not be an existing key).
- a.get_key_at(i) - Get key at index i from sorted
list of keys.
- a.get_key_at(i, j) - Get slice from sorted key list.
"""
 
Reply With Quote
 
 
 
 
Bryan Olson
Guest
Posts: n/a
 
      09-14-2005
http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:
> Hi again,
>
> Since my linear algebra library appears not to serve any practical
> need (I found cgkit, and that works better for me), I've gotten bored
> and went back to one of my other projects: reimplementing the Python
> builtin classes list(), set(), dict(), and frozenset() with balanced
> trees (specifically, counted B-trees stored in memory).

[...]
> So my question is: are there any other *practical* applications of a
> B-tree based list/set/dict ? In other words, is this module totally
> worth coding, or is it just academic wankery and theoretical flim
> flam ?


B-trees are specifically designed for disk storage. Seeking to a
page takes much longer than reading a page of several kilobytes.
B-trees read the keys for a many-way comparison in one seek.

For in-memory comparison-based dictionaries, Red-Black trees,
AVL trees, 2-3 trees, or skip-lists, are a better choice.


--
--Bryan
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
subtle error slows code by 10x (builtin sum()) - replace builtin sumwithout using import? bdb112 Python 2 07-02-2011 03:13 AM
dict!ident as equivalent of dict["ident"] Alexander Kozlovsky Python 5 05-22-2006 08:06 AM
Builtin classes list, set, dict reimplemented via B-trees barnesc@engr.orst.edu Python 0 09-15-2005 06:16 AM
Builtin classes list, set, dict reimplemented via B-trees barnesc@engr.orst.edu Python 2 09-14-2005 07:43 PM
Re: dict->XML->dict? Or, passing small hashes through text? Skip Montanaro Python 0 08-15-2003 03:46 PM



Advertisments