Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > Binary search tree

Reply
Thread Tools

Binary search tree

 
 
maxim.novak@gmail.com
Guest
Posts: n/a
 
      11-09-2007
Hi,

I have to get list of URLs one by one and to find the URLs that I have
more than one time(can't be more than twice).

I thought to put them into binary search tree, this way they'll be
sorted and I'll be able to check if the URL already exist.

Couldn't find any python library that implements trees.
Is there some library of this kind in python? Or can I find it
somewhere else?

 
Reply With Quote
 
 
 
 
Larry Bates
Guest
Posts: n/a
 
      11-09-2007
wrote:
> Hi,
>
> I have to get list of URLs one by one and to find the URLs that I have
> more than one time(can't be more than twice).
>
> I thought to put them into binary search tree, this way they'll be
> sorted and I'll be able to check if the URL already exist.
>
> Couldn't find any python library that implements trees.
> Is there some library of this kind in python? Or can I find it
> somewhere else?
>

Put them into a set. You can check for existence (very fast) and at the end it
is easy to sort.

-Larry
 
Reply With Quote
 
 
 
 
Neil Cerutti
Guest
Posts: n/a
 
      11-09-2007
On 2007-11-09, Larry Bates <> wrote:
> wrote:
>> I have to get list of URLs one by one and to find the URLs
>> that I have more than one time(can't be more than twice).
>>
>> I thought to put them into binary search tree, this way
>> they'll be sorted and I'll be able to check if the URL already
>> exist.
>>
>> Couldn't find any python library that implements trees. Is
>> there some library of this kind in python? Or can I find it
>> somewhere else?

>
> Put them into a set. You can check for existence (very fast)
> and at the end it is easy to sort.


The set is likely the way to go.

Python's library support for binary search trees consists of the
bisect module.

--
Neil Cerutti
Ask about our plans for owning your home --sign at mortgage company
 
Reply With Quote
 
D.Hering
Guest
Posts: n/a
 
      11-09-2007
On Nov 9, 4:06 pm, maxim.no...@gmail.com wrote:
> Hi,
>
> I have to get list of URLs one by one and to find the URLs that I have
> more than one time(can't be more than twice).
>
> I thought to put them into binary search tree, this way they'll be
> sorted and I'll be able to check if the URL already exist.
>
> Couldn't find any python library that implements trees.
> Is there some library of this kind in python? Or can I find it
> somewhere else?


Can you use set() or set.difference()?

 
Reply With Quote
 
Bruno Desthuilliers
Guest
Posts: n/a
 
      11-09-2007
a écrit :
> Hi,
>
> I have to get list of URLs one by one and to find the URLs that I have
> more than one time(can't be more than twice).
>
> I thought to put them into binary search tree, this way they'll be
> sorted and I'll be able to check if the URL already exist.


What about a set ?

s = set()
for url in urls:
if url in s:
print "already have ", url
else:
set.add(url)

 
Reply With Quote
 
Michel Albert
Guest
Posts: n/a
 
      11-12-2007
On Nov 9, 11:45 pm, Bruno Desthuilliers
<bdesth.quelquech...@free.quelquepart.fr> wrote:
> maxim.no...@gmail.com a écrit :
>
> > Hi,

>
> > I have to get list of URLs one by one and to find the URLs that I have
> > more than one time(can't be more than twice).

>
> > I thought to put them into binary search tree, this way they'll be
> > sorted and I'll be able to check if the URL already exist.

>
> What about a set ?
>
> s = set()
> for url in urls:
> if url in s:
> print "already have ", url
> else:
> set.add(url)


Interesting. For this case I usually used dicts. As in:

d = {}
for url in urls:
if url in d:
print "already have ", url
else:
d[url] = 1

Now, I can see that this method has some superfluous data (the `1`
that is assigned to the dict). So I suppose this is less memory
efficient. But is this slower then? As both implementations use hashes
of the URL to access the data. Just asking out of curiosity

 
Reply With Quote
 
=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=
Guest
Posts: n/a
 
      11-12-2007
> Now, I can see that this method has some superfluous data (the `1`
> that is assigned to the dict). So I suppose this is less memory
> efficient. But is this slower then? As both implementations use hashes
> of the URL to access the data. Just asking out of curiosity


Performance-wise, there is no difference in the implementations.
What matters when comparing programs one-by-one is how many method
calls you need. In this example, the dictionary is slightly faster
in my measurements, since for the set, you need to perform a lookup
of .add, whereas the access to __setitem__ for the dict need no
additional dictionary lookup.

Regards,
Martin
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Binary tree search vs Binary search Bogdan C Programming 22 10-21-2010 09:46 PM
Hi, I want to implement a General Tree Data structure (Not Binary Tree ) which have more than 2 sub nodes? sharan C Programming 2 10-31-2007 02:58 AM
Hi, I want to implement a General Tree Data structure (Not Binary Tree ) which have more than 2 sub nodes? sharan C Programming 1 10-30-2007 11:01 PM
Hi, I want to implement a General Tree Data structure (Not Binary Tree ) which have more than 2 sub nodes? sharan C Programming 4 10-30-2007 08:21 PM
B tree, B+ tree and B* tree Stub C Programming 3 11-12-2003 01:51 PM



Advertisments
 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57