Binary tree search vs Binary search

 
 
James Dow Allen
      10-19-2010
James Dow Allen <(E-Mail Removed)> might have writ, in
news:Xns9E16D98DFA353jamesdowallen@78.46.73.112:
>> You really don't understand big-Oh notation do you? ...

>
> . We don't mind when ignorant newbies post here.
> Even when they act like insolent jerks, they're amusing.


Since I tend to be *more* patient than others
with newbies in Usenet, readers may wonder why I went out of
my way to criticise this pompous imbecile. It's in part because
of another pompous and imbecilic post he made recently:

On Sep 7, 3:31 am, Tom St Denis <(E-Mail Removed)> wrote:
> > In article
> > <(E-Mail Removed)>,
> > James Dow Allen <(E-Mail Removed)> wrote:
> > >What I AM asking is: Why aren't the rest of you doing this?

>
> usually when a newb comes into a group [any group] and says things
> like "why not we do everything while standing on our heads?" it's
> because they don't know what they're talking about.
>
> This doesn't mean we don't listen to the newbs, it just means we don't
> follow them. It's not our fault you can't tell the difference.
>
> As to the specifics here, the solution to the OPs problem is called a
> "processor cache" and "stack opcode optimizations" [something modern
> high performance processors have] making the need to inline your
> compare function rather moot.


What was especially amusing here is that his final paragraph, perhaps
intended to be helpful, was a complete non sequitur regardless of
how the OP was misconstrued. With a little more Googling, St Denis may
soon be up to speed on the simpler facts about big-Oh notation,
but I think he needs to take up Remedial Reading for Comprehension.

Hope this helps.
James Dow Allen
 
Tom St Denis
      10-19-2010
On Oct 19, 10:23 am, James Dow Allen <(E-Mail Removed)>
wrote:

<snip>

This is my last reply to you since you're intent on being openly
hostile.

> First note that OP presents a speed-doubling and Tom implies
> that it will go away for large n.
> Tom: Do you still believe that, or did Googling "big-Oh" help?


Do you know what asymptotic convergence is? Maybe a 2x speedup that
is present at n=10 disappears by n=2^40?

> Tom, did your newsreader remove the attribution on the above
> line? It's by Tom St Denis. If you now understand it was a foolish
> remark, good for you. But don't remove the attribution, please.


I removed the excess quotation to trim the reply down. People can see
the attribution in the headers.

> No. It's true that you don't say it's "optimal". But it's
> false that you said "it's no big deal." What you wrote is
> that it's "acceptable" and my question remains:
> How do you know that???


It does depend on the requirements whether it's "acceptable" or not.
But premature optimization is the root of all stupid. If the existing
code uses a tree, porting it all over to a linear array might not be
worth the 2x improvement if the current performance isn't a problem.

You don't know that. I don't know that.

> It's not unusual for such a search to dominate speed performance.
> Speedup by a factor of two is, as a matter of fact, not unlikely to be
> important! True, it's "often (or even usually) unimportant" but that's
> not what you wrote, is it? BTW, since you're wrong on *every single
> point* in your message here, I doubt you'll respond again, but if you
> do, kindly have the courtesy to quote and correctly attribute your own
> incorrect remarks which are under review.


You're being pedantic to basically have something to argue over. I
maintain I don't care one way or the other.

> Yes it is. There is no "noise" in big-Oh notation for
> it to "fall below." If this isn't clear, please present
> an example which falls *above* your alleged "noise" in
> big-Oh notation.


Um ... you clearly don't understand big-Oh notation. I'm sorry but if
you write something like this you just don't get what big-Oh
accomplishes.

> > O(c*log(n)) is equiv to O(log(n))

>
> So you *did* know this much after all!
> (Or did you Google for it just now?)


I didn't google it. I went to college.

> > You really don't understand big-Oh notation do you? ...

>
> . We don't mind when ignorant newbies post here.
> Even when they act like insolent jerks, they're amusing.


How am I acting like an insolent jerk? Just because you disagree with
what I wrote doesn't mean I'm a jerk.

Anyways, please don't reply to this post, I won't reply to you in
kind. You're clearly trying to pick a fight, and truth be told I
don't really care about the outcome of this thread.

Tom
 
James Dow Allen
      10-19-2010
On Oct 19, 9:58 pm, Tom St Denis <(E-Mail Removed)> wrote:
> Do you know what asymptotic convergence is? Maybe a 2x speedup that
> is present at n=10 disappears by n=2^40?


It's comments like this that make us wonder whether you
indeed understand big-Oh or not. You claim to have Googled
to learn O(c log N) = O(log N), but the above comment
implies you think c = 1??? I'll admit to having been
somewhat ironic in my replies -- I suspect you've understood
the rudiments of big-Oh all along -- but why this strange insistence
that c should be 1?
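
To spell the standard fact out once (textbook material, nothing
more): for any constant c > 0,

    T_A(n) = c*log(n)  and  T_B(n) = log(n)  are both in O(log n),

and yet

    lim_{n -> oo} T_A(n)/T_B(n) = c .

The O() class absorbs the constant; the ratio of running times does
not. Nothing about big-Oh forces c = 1.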

> > Tom, did your newsreader remove the attribution on the above
> > line? It's by Tom St Denis. If you now understand it was a foolish
> > remark, good for you. But don't remove the attribution, please.

>
> I removed the excess quotation to trim the reply down. People can see
> the attribution in the headers.


False. The References list does not show clearly who wrote
what. You unnecessarily snipped a single line; the most logical
reason for that was to obfuscate the attribution of your own
incorrect comment.

> > There is no "noise" in big-Oh notation for
> > it to "fall below." If this isn't clear, please present
> > an example which falls *above* your alleged "noise" in
> > big-Oh notation.

>
> Um ... you clearly don't understand big-Oh notation. I'm sorry but if
> you write something like this you just don't get what big-Oh
> accomplishes.


Despite my sarcasm, I actually suspect you are more-or-less
familiar with big-Oh. I'm mildly curious whether you're just being
nasty, or honestly think I'm not familiar with it.

And, if you insist that your nonsense about "noise" was
meaningful after all, why didn't you do what you were told and
"present an example which falls *above* your alleged noise"?


> Anyways, please don't reply to this post,


*You* pick a fight, post more insulting and useless drivel
and then ask *me* not to respond!!

I don't give a **** if you respond to this one or not.

James
 
Sebastian
      10-20-2010
On 18 Oct., 19:43, Tom St Denis wrote:
>
> Binary search (e.g. ordered array) is O(log(n)) search but O(n)
> insert. Binary tree is O(log(n)) search *and* O(log(n)) insert.
>
> The overhead you're finding by a constant is acceptable (and falls
> below the noise in big-Oh notation). They should in theory take
> roughly the same time as n => oo, but for small values they'll skew.


Tom, I know you are a smart guy. But this time you are wrong.

Here's a simple counterexample:

  Algorithm A requires log2(n)+8 "costly operations".
  Algorithm B requires 2*log2(n) "costly operations".
  Both are O(log n).
  For small n, B is faster than A.
  But if you let n approach infinity, A will be faster than B.
  The ratio between the two will approach 2, not 1.
  So, they won't require "roughly the same time".
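
A quick numeric check in C (counting the hypothetical "costly
operations" above, not measuring anything real; compile with -lm):

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        /* Cost models for the two hypothetical algorithms above. */
        double n;
        for (n = 16; n <= 1.0e12; n *= 1000.0) {
            double a = log2(n) + 8.0;  /* Algorithm A */
            double b = 2.0 * log2(n);  /* Algorithm B */
            printf("n=%8.0e  A=%6.1f  B=%6.1f  B/A=%.3f\n",
                   n, a, b, b / a);
        }
        return 0;
    }

At n=16, B/A is about 0.67 (B wins); by the last row it is above 1.6
and still creeping toward 2.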

Cheers!
SG
 
Tom St Denis
      10-20-2010
On Oct 20, 6:02 am, Sebastian <(E-Mail Removed)> wrote:
> Tom, I know you are a smart guy. But this time you are wrong.
>
> Here's a simple counterexample:
>
>   Algorithm A requires log2(n)+8 "costly operations".
>   Algorithm B requires 2*log2(n) "costly operations".
>   Both are O(log n).
>   For small n, B is faster than A.
>   But if you let n approach infinity, A will be faster than B.
>   The ratio between the two will approach 2, not 1.
>   So, they won't require "roughly the same time".


Except for all we [likely] know, the 2:1 ratio appears only at small
sample set sizes. But that's a comment about cache performance, not
big-Oh. For example, when the sample set gets so big it doesn't fit in
memory, disk access will essentially destroy any advantage and they'll
both be, for all intents and purposes, the same speed.

In the big-Oh there is no such thing as O(2log n) as a "final
answer." So in both cases I stand by my guess [which is all we're
doing here] that the 2x difference is probably nothing to write home
about.

I suspect a binary search to be faster overall [with the advantage
going down as size goes up], but the more important things I said in my
post were that a) it's in the noise and b) insert into an array is
slower than into a tree. Granted, I should have said "it's *probably*
in the noise", but the general essence of the thought remains mostly
valid.

All we know from the OP is he searched a dictionary and one was
"almost" twice as fast as the other. Does the table have 50
elements? 50,000 elements? 50M elements? What sort of architecture
is it? etc...

Now why this turned into a flamewar between another poster and me, I
don't know. I never said anything derogatory to them. Alas, this is
USENET ...

Tom
 
Eric Sosman
      10-20-2010
On 10/20/2010 6:48 AM, Tom St Denis wrote:
> On Oct 20, 6:02 am, Sebastian<(E-Mail Removed)> wrote:
>> Tom, I know you are a smart guy. But this time you are wrong.
>>
>> Here's a simple counter example:
>>
>> Algorithm A requires log2(n)+8 "costly operations".
>> Algorithm B requires 2*log2(n) "costly operations".
>> Both are O(log n).
>> For small n B is faster than A.
>> But if you let n approach infinity, A will be faster than B.
>> The ratio between the two will approach 2, not 1.
>> So, they won't require "roughly the same time".

>
> Except for all we [likely] know the ratio 2:1 appears at small sample
> set sizes. But that a comment of cache performance and not big-Oh.
> For example, when the sample set size gets so big it doesn't fit in
> memory, disk access will essentially destroy any advantage and they'll
> both be for all intents and purposes the same speed.


All you're saying is that real computers are finite, meaning
that we'll always have N <= Nmax. That, in turn, implies that all
real implementations of all algorithms, even bogosort, have O(1)
performance because f(N) <= f(Nmax) == constant == O(1).

So, yes: You can, sort of, defend your misunderstanding of O()
notation, at the cost of making O() notation useless.
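
For the record, here is the definition in play (any algorithms text
has it; this is not controversial):

    f(N) is O(g(N))  iff  there exist constants c > 0 and N0
    such that f(N) <= c*g(N) for all N >= N0.

Cap N at a finite Nmax and the condition holds trivially even with
g(N) = 1, which is precisely the collapse to O(1) sketched above.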

--
Eric Sosman
(E-Mail Removed)lid
 
Tom St Denis
      10-20-2010
On Oct 20, 7:32 am, Eric Sosman <(E-Mail Removed)> wrote:
> On 10/20/2010 6:48 AM, Tom St Denis wrote:
> > On Oct 20, 6:02 am, Sebastian<(E-Mail Removed)> wrote:
> >> Tom, I know you are a smart guy. But this time you are wrong.
> >>
> >> Here's a simple counterexample:
> >>
> >>   Algorithm A requires log2(n)+8 "costly operations".
> >>   Algorithm B requires 2*log2(n) "costly operations".
> >>   Both are O(log n).
> >>   For small n, B is faster than A.
> >>   But if you let n approach infinity, A will be faster than B.
> >>   The ratio between the two will approach 2, not 1.
> >>   So, they won't require "roughly the same time".

> > Except for all we [likely] know, the 2:1 ratio appears only at small
> > sample set sizes. But that's a comment about cache performance, not
> > big-Oh. For example, when the sample set gets so big it doesn't fit in
> > memory, disk access will essentially destroy any advantage and they'll
> > both be, for all intents and purposes, the same speed.
>
> All you're saying is that real computers are finite, meaning
> that we'll always have N <= Nmax. That, in turn, implies that all
> real implementations of all algorithms, even bogosort, have O(1)
> performance because f(N) <= f(Nmax) == constant == O(1).
>
> So, yes: You can, sort of, defend your misunderstanding of O()
> notation, at the cost of making O() notation useless.


It's not useless though...

Compare Comba vs. Karatsuba multiplication...

Comba is basically n^2 * (work of MULADD) + n * (work of carry
prop) ... or O(n^2)

Where Karatsuba is

3 * (multiply of n/2 size) + various 2n-length additions [with carry
prop], etc. ... or O(n^log_2(3))

So even though Comba is FASTER at small values of n, we still say
"Karatsuba is more efficient" [even though at small sizes we'd use
Comba instead]. Now if there were another technique that was
2x the Karatsuba work, you'd still write it as O(n^log_2(3)) even though
you'd know that at most reasonable sizes it'd be slower. I didn't
invent big-Oh notation, so stop harping on me. Nor am I saying "all
algorithms of equal big-Oh work are equivalent."
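
To make the Comba/Karatsuba point concrete, here's a toy cost model
(operation counts only; the constant 3.0 is a stand-in I picked for
Karatsuba's bookkeeping overhead, and real crossover points sit much
higher once carries and caches get involved; compile with -lm):

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        /* Toy cost models: Comba ~ n^2, Karatsuba ~ 3*n^(log2 3). */
        int n;
        for (n = 4; n <= 4096; n *= 4) {
            double comba = (double)n * n;
            double karat = 3.0 * pow((double)n, log2(3.0));
            printf("n=%5d  comba=%10.0f  karatsuba=%10.0f  -> %s\n",
                   n, comba, karat,
                   comba <= karat ? "comba" : "karatsuba");
        }
        return 0;
    }

The asymptotic winner is fixed by the exponents; where the crossover
actually lands is fixed by the constants -- which is the whole
argument here in one table.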

But we don't know for a fact that the 2:1 ratio carries through to larger
[possibly practical] sizes. All I was saying is that a small constant
factor isn't an eye opener. If the OP had written that it was 10x
faster then yes, clearly something is afoot, but 2x faster on, say, a
500 word dictionary doesn't exactly scream FUNDAMENTAL FLAW to me. It
could just be that there are more cache hits with a dataset that
small. Which would mean if your application solely deals with
searches [and no inserts] of 500 word datasets, then a binary search is
ideal. Nobody is arguing that [certainly I'm not]. But as to the
concept of broad generalizations, I'd have a hard time extrapolating
whether a 2x speed difference FROM ===ONE=== SAMPLE SET is significant
or not, since it IS below the asymptotic work level threshold.

IOW, stop putting words into my mouth and stop assuming that you know
more about the OP's problem than anyone else does (since the OP hasn't
explained their problem in enough detail).

Tom
 
Sebastian
      10-20-2010
On 20 Oct., 13:46, Tom St Denis wrote:
> On Oct 20, 7:32 am, Eric Sosman wrote:
> > On 10/20/2010 6:48 AM, Tom St Denis wrote:
> > [...]
> > All you're saying is that real computers are finite, meaning
> > that we'll always have N <= Nmax. That, in turn, implies that all
> > real implementations of all algorithms, even bogosort, have O(1)
> > performance because f(N) <= f(Nmax) == constant == O(1).

>
> > So, yes: You can, sort of, defend your misunderstanding of O()
> > notation, at the cost of making O() notation useless.

>
> It's not useless though...
>
> Compare Comba vs. Karatsuba multiplication...
>
> Comba is basically n^2 * (work of MULADD) + n * (work of carry
> prop) ... or O(n^2)
>
> Where Karatsuba is
>
> 3 * (multiply of n/2 size) + various 2n-length additions [with carry
> prop], etc. ... or O(n^log_2(3))
>
> So even though Comba is FASTER at small values of n we still say
> "Karatsuba is more efficient."


...in terms of the asymptotic runtime behaviour, yes.

> Now if there were another technique that was
> 2x the Karatsuba work, you'd still write it as O(n^log_2(3)) even though
> you'd know that at most reasonable sizes it'd be slower. I didn't
> invent big-Oh notation, so stop harping on me. Nor am I saying "all
> algorithms of equal big-Oh work are equivalent."


No, you just suggested that algorithms of the same time complexity
class "should in theory take roughly the same time" as n approaches
infinity. For some definition of "take roughly the same time" this
might be the case. But IMHO the most sensible interpretation of this
claim was -- as Ben pointed out -- that the runtime ratio between both
algorithms will approach 1 as n approaches infinity. As pointed out,
this is not necessarily the case. If you meant to say that the
algorithms have the same asymptotic runtime behaviour, you should have
said just that.

> But we don't know for a fact that the 2:1 ratio carries through to larger
> [possibly practical] sizes.


Right. But we also don't know that it doesn't.

Cheers!
SG
 
Tom St Denis
      10-20-2010
On Oct 20, 10:27 am, Sebastian <(E-Mail Removed)> wrote:
> On 20 Oct., 13:46, Tom St Denis wrote:
> > On Oct 20, 7:32 am, Eric Sosman wrote:
> > > On 10/20/2010 6:48 AM, Tom St Denis wrote:
> > > [...]
> > > All you're saying is that real computers are finite, meaning
> > > that we'll always have N <= Nmax. That, in turn, implies that all
> > > real implementations of all algorithms, even bogosort, have O(1)
> > > performance because f(N) <= f(Nmax) == constant == O(1).
>
> > > So, yes: You can, sort of, defend your misunderstanding of O()
> > > notation, at the cost of making O() notation useless.

>
> > It's not useless though...

>
> > Compare Comba vs. Karatsuba multiplication...

>
> > Comba is basically n^2 * (work of MULADD) + n * (work of carry
> > prop) ... or O(n^2)

>
> > Where Karatsuba is

>
> > 3 * (multiply of n/2 size) + various 2n-length additions [with carry
> > prop], etc. ... or O(n^log_2(3))

>
> > So even though Comba is FASTER at small values of n we still say
> > "Karatsuba is more efficient."

>
> ...in terms of the asymptotic runtime behaviour, yes.
>
> > Now if there were another technique that was
> > 2x the Karatsuba work, you'd still write it as O(n^log_2(3)) even though
> > you'd know that at most reasonable sizes it'd be slower. I didn't
> > invent big-Oh notation, so stop harping on me. Nor am I saying "all
> > algorithms of equal big-Oh work are equivalent."

>
> No, you just suggested that algorithms of the same time complexity
> class "should in theory take roughly the same time" as n approaches
> infinity. For some definition of "take roughly the same time" this
> might be the case. But IMHO the most sensible interpretation of this
> claim was -- as Ben pointed out -- that the runtime ratio between both
> algorithms will approach 1 as n approaches infinity. As pointed out,
> this is not necessarily the case. If you meant to say that the
> algorithms have the same asymptotic runtime behaviour, you should have
> said just that.


I'm saying it's lost in the noise because, for a single sample set, a
difference below the asymptotic rate is exactly that: noise.

It's like saying

strlen("hello") and strlen_mmx_super_optimized("hello") take 15 and 20
cycles respectively, therefore the latter is always slower than the
former. In reality, the setup time for short strings might eat into
the performance gains. So we can say "it's lost in the noise."

You don't know that array search is ALWAYS 2x faster than a tree
search. You know that for a SINGLE test case it was 2x faster. In a
single test case I can show that qsort is no better than bubble sort.
What does that prove?
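
If anyone cared to actually test it, it would take something like this
across a range of sizes (a rough sketch only: a sorted array versus a
plain unbalanced BST fed shuffled keys; nothing is tuned, and its
timings prove nothing about the OP's setup):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    struct node { int key; struct node *left, *right; };

    /* Iterative insert into a plain, unbalanced binary search tree. */
    static void bst_insert(struct node **t, int key)
    {
        while (*t != NULL)
            t = (key < (*t)->key) ? &(*t)->left : &(*t)->right;
        *t = malloc(sizeof **t);
        (*t)->key = key;
        (*t)->left = (*t)->right = NULL;
    }

    static int bst_search(const struct node *t, int key)
    {
        while (t != NULL && t->key != key)
            t = (key < t->key) ? t->left : t->right;
        return t != NULL;
    }

    /* Classic binary search over a sorted array. */
    static int array_search(const int *a, size_t n, int key)
    {
        size_t lo = 0, hi = n;
        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;
            if (a[mid] == key) return 1;
            if (a[mid] < key) lo = mid + 1; else hi = mid;
        }
        return 0;
    }

    int main(void)
    {
        static const size_t sizes[] = { 500, 50000, 5000000 };
        size_t s, i;
        srand(12345);
        for (s = 0; s < sizeof sizes / sizeof sizes[0]; s++) {
            size_t n = sizes[s];
            int *a = malloc(n * sizeof *a);
            int *ord = malloc(n * sizeof *ord);
            struct node *t = NULL;
            volatile int hits = 0;
            clock_t c0, c1, c2;

            for (i = 0; i < n; i++)
                a[i] = ord[i] = (int)(2 * i);  /* sorted, distinct, even */

            /* Shuffle the insertion order so the unbalanced tree stays
               shallow on average; rand() is crude but fine for a sketch. */
            for (i = n - 1; i > 0; i--) {
                size_t j = (size_t)rand() % (i + 1);
                int tmp = ord[i]; ord[i] = ord[j]; ord[j] = tmp;
            }
            for (i = 0; i < n; i++)
                bst_insert(&t, ord[i]);

            /* Search odd keys: every probe misses and runs full depth. */
            c0 = clock();
            for (i = 0; i < n; i++)
                hits += array_search(a, n, (int)(2 * i + 1));
            c1 = clock();
            for (i = 0; i < n; i++)
                hits += bst_search(t, (int)(2 * i + 1));
            c2 = clock();

            printf("n=%8lu  array=%.3fs  tree=%.3fs\n", (unsigned long)n,
                   (double)(c1 - c0) / CLOCKS_PER_SEC,
                   (double)(c2 - c1) / CLOCKS_PER_SEC);
            free(a);
            free(ord);
            /* tree nodes deliberately leaked -- it's only a sketch */
        }
        return 0;
    }

One run at one size tells you nothing, which is the point; sweep n and
watch the gap move.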

> > But we don't know for a fact that the 2:1 ratio carries through to larger
> > [possibly practical] sizes.

>
> Right. But we also don't know that it doesn't.


Hence my comment, thrice now, that I should have said "probably lost in
the noise." But since you diehard pedants won't let it be, just stop
replying. I acknowledged the mistake you're pointing out. There
is no further correction needed.

Tom
 
Eric Sosman
      10-21-2010
On 10/20/2010 7:46 AM, Tom St Denis wrote:
> [...]
> IOW, stop putting words into my mouth and stop assuming that you know
> more about the OPs problem than anyone else does (since the OP hasn't
> explained their problem in enough detail).


Nobody knows enough about the O.P.'s problem to say anything
useful about it; I don't think anyone disputes that. We won't know
enough until (and unless) he reveals a whole lot more than he has
thus far.

As for the words I have put into your mouth, they are

> The overhead you're finding by a constant is acceptable (and falls
> below the noise in big-Oh notation). They should in theory take
> roughly the same time as n => oo, but for small values they'll skew.


... and if they are not your words, you are being impersonated by
someone who does not understand big-Oh.

--
Eric Sosman
(E-Mail Removed)lid
 