Velocity Reviews > Algorithms in C - Robert Sedgewick

# Algorithms in C - Robert Sedgewick

arnuld
Guest
Posts: n/a

 09-08-2009
This code is from chapter 2 of Algorithms in C:

Sequential Search:

int search( int a[], int v, int l, int r)
{
int i;
for(i = 1; i <= r; i++)
if(v == a[i]) return i;

return -1;
}

Binary Search:

int search( int a[], int v, int l, int r )
{
while( r >= 1 )
{
int m = (l + r)/2;
if(v == a[m]) return m;
if(v < a[m]) r = m - 1;
else l = m + 1;
}

return -1;
}

Now author explains the average cost of S-Search as:

(1 + 2 + 3 + .... + N)/N = (N + 1)/2

and the worst-case cost of B-Search as:

T(Base-N) <= T(Base-N/2) + 1 for N >=2 and T(Base-1) = 1

I am going to ask a very simple question. BY just looking at the code of
both searches, any man with brains can simply conclude that B-Search, on
a sorted input, is always cheaper as compared to S-Search. If it is so
clear form the code, why do I h ave to go through the incomprehensible
junk of N, T, C N(BaseN) etc etc. The book is about Algorithms in C but
when I started to read, the author is teaching me whole recurrence series
(which of course is an obtuse thing to understand):

C(Base-N) = C(Base-(N-1)) + N for N >= 2 with C(Base-1) = 1
C(Base-N) = C(Base-(N-2)) + (N - 1) + N
= C(Base-(N-3)) + (N - 2) + (N - 1) + N
= ..............
= ..............
= C(Base-1) + 2 + ... (N - 2) + (N - 1) + N
= 1 + 2 + 3 + .. + (N - 2) + (N - 1) + N
= N(N+1)/2

Is it necessary to understand whole exponential series and recurrences
and permutations and combinations and logarithmic series before you even
start to comprehend a search or sorting or any algorithm at all ? If that
is so why will I use Sedgwick, I can read Hall & Knight's "Higher
Algebra".

My experience says its enough to know that N is less than N^2 which is
less than 2^N and logN is least of them all and if B-Search takes logN on
a sorted input and S-Search takes N, then we can go with B-Search based
algorithm, provided you have estimated the cost of sorting first.

So should I go with reading all the maths or just go to some random
chapter and start working on an algorithm in C ?

--

www.lispmachine.wordpress.com
my email is @ the above blog.

Eric Sosman
Guest
Posts: n/a

 09-08-2009
arnuld wrote:
> This code is from chapter 2 of Algorithms in C:
>
> Sequential Search:
>
> int search( int a[], int v, int l, int r)
> {
> int i;
> for(i = 1; i <= r; i++)
> if(v == a[i]) return i;
>
> return -1;
> }

I don't have the book, but this looks very un-C-like.
Ordinarily, one would expect element [0] of an array to
hold something useful, and might not expect element [r] to
exist at all. Is this offered as real C code, or just as
the first stage in a transliteration from a language with
[1]-based arrays, possibly Fortran?

> Binary Search:
>
> int search( int a[], int v, int l, int r )
> {
> while( r >= 1 )
> {
> int m = (l + r)/2;
> if(v == a[m]) return m;
> if(v < a[m]) r = m - 1;
> else l = m + 1;
> }
>
> return -1;
> }

The termination criterion looks completely wrong. Are
you sure you've transcribed the code correctly?

> Now author explains the average cost of S-Search as:
>
> (1 + 2 + 3 + .... + N)/N = (N + 1)/2
>
> and the worst-case cost of B-Search as:
>
> T(Base-N) <= T(Base-N/2) + 1 for N >=2 and T(Base-1) = 1

> I am going to ask a very simple question. BY just looking at the code of
> both searches, any man with brains can simply conclude that B-Search, on
> a sorted input, is always cheaper as compared to S-Search.

Yes, any man with brains could conclude that, if his brains
didn't function very well. The sequential search will take more
steps than the binary search, it's true -- but the "step" of one
is simpler (cheaper) than the "step" of the other. Look at the
sequential "step:" two comparisons and an addition. Look at
the binary "step:" three comparisons, two additions, and a
division.

> Is it necessary to understand whole exponential series and recurrences
> and permutations and combinations and logarithmic series before you even
> start to comprehend a search or sorting or any algorithm at all ?

No: You can choose to learn things by rote and not bother
with understanding them. Carry your grimoire everywhere, and
hope you never encounter a situation that's just a little bit
different, something that isn't written down for you in exactly
the form you encounter it.

Or you can arm yourself with the ability to do a little
analysis unaided, so you won't be thrown for a loop if things
are just a little different from those you saw in the book. For
example, consider applying the binary search to a sorted array
containing some equal elements, the task being to find the low
and high indices of the span of elements equal to `v', or to
determine that `v' is not present. There are at least three
obvious ways to approach the task; how will you choose one?

> My experience says its enough to know that N is less than N^2 which is
> less than 2^N and logN is least of them all and if B-Search takes logN on
> a sorted input and S-Search takes N, then we can go with B-Search based
> algorithm, provided you have estimated the cost of sorting first.

Have you ever noticed that practical Quicksort implementations
often switch to insertion sort or something of the kind when the
sub-files get short enough? Why would they abandon the O(N*logN)
partition-exchange method in favor of an O(N*N) method? Are they
just being perverse -- or is there something you've failed to
notice?

> So should I go with reading all the maths or just go to some random
> chapter and start working on an algorithm in C ?

It's your choice: memorize, or comprehend. Each approach has

--
Eric Sosman
http://www.velocityreviews.com/forums/(E-Mail Removed)lid

Charles Richmond
Guest
Posts: n/a

 09-08-2009
arnuld wrote:
>
> [snip...] [snip...] [snip...]
>
> I am going to ask a very simple question. BY just looking at the code of
> both searches, any man with brains can simply conclude that B-Search, on
> a sorted input, is always cheaper as compared to S-Search.

But a binary search is *not* always cheaper than a sequential
search. For just a few items, sequential search can be cheaper.
There is a "breakpoint" where the binary search becomes cheaper.
This may be somewhere between five and nine items.

--
+----------------------------------------+
| Charles and Francis Richmond |
| |
| plano dot net at aquaporin4 dot com |
+----------------------------------------+

Default User
Guest
Posts: n/a

 09-08-2009
Eric Sosman wrote:

> arnuld wrote:
> > This code is from chapter 2 of Algorithms in C:
> >
> > Sequential Search:
> >
> > int search( int a[], int v, int l, int r)
> > {
> > int i;
> > for(i = 1; i <= r; i++)
> > if(v == a[i]) return i;
> >
> > return -1;
> > }

>
> I don't have the book, but this looks very un-C-like.
> Ordinarily, one would expect element [0] of an array to
> hold something useful, and might not expect element [r] to
> exist at all. Is this offered as real C code, or just as
> the first stage in a transliteration from a language with
> [1]-based arrays, possibly Fortran?

As I recall, the original book was written in Pascal, and Sedgewick
used 1-based arrays for the C version as well.

The code is on his web site:

<http://www.cs.princeton.edu/~rs/Algs3.c1-4/code.txt>

Brian

Julienne Walker
Guest
Posts: n/a

 09-08-2009
On Sep 8, 8:43*am, Eric Sosman <(E-Mail Removed)> wrote:
> arnuld wrote:
>
> > Binary Search:

>
> > int search( int a[], int v, int l, int r )
> > {
> > * *while( r >= 1 )
> > * * *{
> > * * * * int m = (l + r)/2;
> > * * * * if(v == a[m]) return m;
> > * * * * if(v < a[m]) r = m - 1;
> > * * * * else l = m + 1;
> > * * * }

>
> > * *return -1;
> > }

>
> * * *The termination criterion looks completely wrong. *Are
> you sure you've transcribed the code correctly?

While I don't have that book on hand at the moment to verify the
transcription, I can safely say that the code in it is generally quite
poor, in terms of readability, style, and overall correctness. I spent
more time fixing the code than learning from it, though otherwise the
book is excellent and I still recommend it with a warning not to rely
on the code.

James Kuyper
Guest
Posts: n/a

 09-08-2009
arnuld wrote:
> This code is from chapter 2 of Algorithms in C:
>
> Sequential Search:
>
> int search( int a[], int v, int l, int r)
> {
> int i;
> for(i = 1; i <= r; i++)
> if(v == a[i]) return i;
>
> return -1;
> }

That looks like Fortran-inspired C code. I'm not personally familiar
with this book; it might have some very useful information about
algorithms. However, I recommend not paying too much attention to the C
code; you're better off learning a more idiomatic C programming style.

....
> Now author explains the average cost of S-Search as:
>
> (1 + 2 + 3 + .... + N)/N = (N + 1)/2
>
> and the worst-case cost of B-Search as:
>
> T(Base-N) <= T(Base-N/2) + 1 for N >=2 and T(Base-1) = 1
>
>
>
>
> I am going to ask a very simple question. BY just looking at the code of
> both searches, any man with brains can simply conclude that B-Search, on
> a sorted input, is always cheaper as compared to S-Search.

It is not always cheaper, and in some cases it can be more expensive.

The first issue is the frequency with which items on the list are
searched for. If some items in a list are searched for a lot more
frequently than others, and the list is sorted in decreasing order by
frequency of use, it's quite possible for the linear search to be as
fast or even faster than a binary search. In order for this to be true,
the chance that a given item will be searched for has to be very high
for the first few items in the list, and that chance has to drop very
rapidly with each of the first several items on the list. Lists with
those characteristics are not commonplace, but neither are they as rare
as you might think.

The second issue is the length of the list - binary search is a little
more complicated to set up than a linear search, and each binary search
step is a bit more complicated than the corresponding linear search
step. It isn't much of a difference, but For sufficiently short lists,
it's a difference that can make a linear search as fast as, if not
faster than, a binary search. How short is "sufficiently short"? That's
highly implementation-dependent. However, I'd never even dream of using
a binary search on a list with as few as 5 elements, the break-even
point is usually quite a bit higher than that.

These first case is really just an extension of the second. A list
where, most of the time, only the first few items have to be searched
through, is effectively very similar to a list that only has a few items.

Default User
Guest
Posts: n/a

 09-08-2009
Eric Sosman wrote:

> Well, possibly -- I wouldn't know. But in the case at hand, it
> seems more than likely that the O.P. has mis-transcribed "l" as "1",
> don't you think? And I sort of wonder what other transcription
> errors may have been introduced along with that one ...

That's not the case.

Brian

arnuld
Guest
Posts: n/a

 09-09-2009
> On Tue, 08 Sep 2009 08:43:12 -0400, Eric Sosman wrote:
>> arnuld wrote:

> I don't have the book, but this looks very un-C-like.
> Ordinarily, one would expect element [0] of an array to hold something
> useful, and might not expect element [r] to exist at all. Is this
> offered as real C code, or just as the first stage in a transliteration
> from a language with [1]-based arrays, possibly Fortran?

I guess Pascal. I got that from accu.org .

>> Binary Search:
>>
>> int search( int a[], int v, int l, int r ) {
>> while( r >= 1 )
>> {
>> int m = (l + r)/2;
>> if(v == a[m]) return m;
>> if(v < a[m]) r = m - 1;
>> else l = m + 1;
>> }
>>
>> return -1;
>> }

> The termination criterion looks completely wrong. Are
> you sure you've transcribed the code correctly?

Well, the condition inside while loop was typed incorrectly its (r >= l),
its l (Ell) not 1 (one). They are so confusing to me all the time. If you
look at the code I have posted in CLC and CLC++ in last 2 years, you will
never see me using l as a variable name.

> No: You can choose to learn things by rote and not bother
> with understanding them. Carry your grimoire everywhere, and hope you
> never encounter a situation that's just a little bit different,
> something that isn't written down for you in exactly the form you
> encounter it.

> ...SNIP....

> Have you ever noticed that practical Quicksort implementations
> often switch to insertion sort or something of the kind when the
> sub-files get short enough? Why would they abandon the O(N*logN)
> partition-exchange method in favor of an O(N*N) method? Are they just
> being perverse -- or is there something you've failed to notice?

Yes, I failed to notice that. Its a new insight you told me, I can never
deduce this.

>> So should I go with reading all the maths or just go to some random
>> chapter and start working on an algorithm in C ?

> It's your choice: memorize, or comprehend. Each approach has

Well, I am willing to buy a very basic book ("Higher Algebra" by Hall &
Knight) and learn Maths. Trouble is I still don't get how math is
connected with finding what kind of loop in C will take more CPU time or
if algebra fits in finding which piece of C code will make efficient use
of CPU, then from where I need to start to understand ? C^2, C(Base 2^n +
1) or log N.

Math always seemed incomprehensible to me. It looks like Newton's Law of
Gravitation to me. Before discovered the law of gravitation, apple still
fell on the ground, gravitation existed before Newton saw the apple
falling. Same I feel for math, I mean the fundamentals that math explains
exist without maths. I am not arguing, I am just trying to understand.

--
www.lispmachine.wordpress.com
my email is @ the above blog.

arnuld
Guest
Posts: n/a

 09-09-2009
> On Wed, 09 Sep 2009 04:49:25 +0000, arnuld wrote:
> Math always seemed incomprehensible to me. It looks like Newton's Law of
> Gravitation to me. Before discovered the law of gravitation, apple still
> fell on the ground, gravitation existed before Newton saw the apple
> falling. Same I feel for math, I mean the fundamentals that math
> explains exist without maths. I am not arguing, I am just trying to
> understand.

What I exactly meant why I shouldn't understand those fundamentals which
exist before Math. We are writing code, C or Lisp or whatever, we are
doing programming which is a different skill from Maths.

--
www.lispmachine.wordpress.com
my email is @ the above blog.

Keith Thompson
Guest
Posts: n/a

 09-09-2009
Richard Heathfield <(E-Mail Removed)> writes:
[...]
> 1, 2, 3, 4, 5, 6, 7, 8, 9, ..... 1048573, 1048574, 1048575.
>
> To find the key 1 via binary search will take 20 comparisons, whereas
> it will only take 1 comparison via sequential search.
>
> Nevertheless, *on average* binary search is much, much quicker. The
> average sequential search will take half a million comparisons for
> the above list, whereas a binary search on the above list (assuming
> it is a straightforward sorted linear list, that is) will never take
> more than 20 comparisons.

The average *successful* search will take half a million comparisons.
An unsuccessful search will take a million. The number of comparisons
required for an average search depends on how many of them are
successful (and on the distribution of the values successfully
searched for).

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"