Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Strange bug

Reply
Thread Tools

Strange bug

 
 
BartC
Guest
Posts: n/a
 
      11-23-2010
"Francois Grieu" <(E-Mail Removed)> wrote in message
news:4ceb5dec$0$677$(E-Mail Removed)...
> On 22/11/2010 13:55, BartC wrote:
>> (I had a go at implementing a fast version of the OP's Trim function; it
>> was
>> about a magnitude faster, but the recursive content was so obviously
>> contrived, that it wasn't worth bothering with, causing a 3x slowdown
>> over a
>> version using iteration. Not a good example.)

>
> The OP's algorithm has a worst case run time O(n^2) where n is the
> length of the string, when the straight algorithm runs O(n).
> The slowdown due to recursion in this case is by more than a
> constant factor.


My version of it called strlen() and memmove() just once. Recursion was used
just for counting spaces, and that was linear time I think, the same as
iteration.

--
Bartc

 
Reply With Quote
 
 
 
 
lawrence.jones@siemens.com
Guest
Posts: n/a
 
      11-24-2010
Ian Collins <(E-Mail Removed)> wrote:
>
> Yes, it took a while for the penny to drop for me as well. Maybe
>
> "Provided that a library function can be declared without reference to
> any type defined in the standard header where the function is declared"
>
> Would be better wording.


I think it should be more like "...defined only in the standard
header...". If the type is also defined in a different header, then
including that header should suffice. In short, I'm pretty sure the
intent was that declaring a function by hand should work as long as you
can do it "correctly". But it's hard to say "correctly" in standardese.
--
Larry Jones

Hello, I'm wondering if you sell kegs of dynamite. -- Calvin
 
Reply With Quote
 
 
 
 
Tim Rentsch
Guest
Posts: n/a
 
      11-24-2010
Francois Grieu <(E-Mail Removed)> writes:

> On 23/11/2010 01:09, Tim Rentsch wrote:
>> Francois Grieu <(E-Mail Removed)> writes:
>>
>>> [snip]
>>>
>>> I note that the archetypal learning use case for recursion are
>>> factorials and the "Tower of Hanoi", although both happen to have
>>> a non-recursive solution which is much simpler:
>>> http://en.wikipedia.org/wiki/Tower_o...rsive_solution
>>> This should be a giveaway that recursion is seldom useful, since
>>> instructors fail to find a good use case for it.

>>
>> I think the reasoning here is bad. These examples are chosen
>> because they usually are already familiar to students outside the
>> realm of computing. More specifically, they are chosen to *explain*
>> recursion, not to *motivate* recursion. I think they do a good job
>> in this respect -- in both cases the recursive solutions are simple
>> enough and easy to understand. Calling the referenced non-recursive
>> tower-of-hanoi algorithm simpler, let alone much simpler, is highly
>> misleading; the *steps* may be simpler, but seeing how and why the
>> algorithm works certainly is not. Conversely, for a recursive
>> version it's easy to see at a glance both how it works and why, and
>> that's the point of the exercise - to present a simple example where
>> recursion can be understood. Deeper understanding, about when it
>> might be good to use recursion, and when not, comes only later.

>
> I do acknowledge that
> - the most natural solution to Tower of Hanoi is recursive,
> simple and correct.
> - Tower of Hanoi is thus a good introductory problem for
> recursion in a computer class.


Okay, that's all good.

> I stand by my points, as clarified below:
> - Tower of Hanoi is *not* a good use case for recursion,
> unless when learning/teaching is the purpose.


I think that depends on what one is trying to optimize. It's
probably faster to write a recursive version that is clearly
correct, and if time-to-market is a key concern then a recursive
solution is probably the right choice here (assuming of course
that ToH is a necessary part of whatever software is being
delivered).

> - Recursion is seldom useful in practice, where non-recursive
> solutions often exist and lead to simpler code.


I'll just flat out disagree here. Recursion is often useful in
practice, and even when non-recursive solutions exist there
frequently are recursive solutions that are easier to understand,
or result in faster code, or both.

> - There are so few counterexamples to the above that recursion
> is taught with problems like Tower of Hanoi that are both
> artificial and with a shorter and more efficient
> non-recursive algorithm.


That's a bogus conclusion. Tower of Hanoi and other simple
example problems are used to illustrate recursion because the
problems are already familiar to students, not because of any
shortage of examples where a recursive solution is viable.

> Similarly, Mergesort, which someone in this thread sees as a
> good use case of recursion, is indeed a good problem for
> teaching recursion.


Okay.

> It can also be written simply and elegantly
> with a single size_t variable rather than reentrant call or
> stack array;


Certainly mergesort can be written iteratively using O(1) storage.
The idea that this can be done "simply and elegantly" is (a)
nebulously vague, (b) completely subjective, (c) an unsupported
opinion, (d) at best debatable, or (e) several of the above.

> once this is grasped, the code gets shorter and
> more efficient,


That's not consistent with the timing results pete reported.
Do you have any kind of supporting evidence to offer?

> and arguably as easy to derive.


If you offer up both a program and an informal correctness proof
I might agree, but as an unsupported statement I don't buy it.
I have programmed myself both a recursive mergesort and an
iterative mergesort, and understood both well, and the iterative
version was definitely harder to get right than the recursive
version. If you want to convince people otherwise you should
be prepared to provide a counterexample.

> In my view Mergesort is thus not a good use case for recursion
> in the sense that implies a reentrant function.


In my view this conclusion rests on some faulty premises,
as explained above.

> Quicksort is an excellent problem for teaching on recursion,
> because
> - Applying to a simple Quicksort just what was learned from
> recursion in Tower of Hanoi leads to code that superficially
> works, but on many architectures will overflow the execution
> stack for specially constructed input. Such failures are part
> of the process of learning.
> - There is a portable way out of that (using recursion for
> the smallest problem only, solving that one first, and
> replacing tail recursion for the rest by iteration).


What you're saying is that quicksort is good mostly for teaching
some of the *problems* associated with recursion, and one way of
getting around such problems. Forgive my cynicism, but this sounds
like you're subtly promoting an anti-recursion bias in the students.
Of course, I agree that it's important to understand the problem of
overly deep recursion, but this can be illustrated more easily with
a recursive function for measuring the length of a list, which in
turn can be used to show how to turn a non-tail-recursive function
into a function that is tail recursive (and thus easily iterizable
if one chooses to do that).

> Several qsort() C library source that I looked at do not use
> recursion in the sense that implies a reentrant function.
> In fact, having no reentrant function is a design objective
> of some C libraries, for portability, testability, and
> efficiency reasons.


As a counterpoint, the paper "Engineering a Sort Function", one of
the most widely cited papers about how to do a good job implementing
qsort, embraces recursion down *both* branches of sorting subarrays.
Quoting from the paper:

We have not adopted many customary improvements. By
managing a private stack we could cut stack space from
nearly 20 variables to 2 per level. By sorting the larger
side of the partition last and eliminating tail recursion,
we could reduce worst-case stack growth from linear in /n/
to logarithmic. Neither trick is worth the effort.

Personally I'm more likely to be convinced by Jon Bentley and Doug
McIlroy than by the (presumed) statements of anonymous developers
working on some unknown C libraries.
 
Reply With Quote
 
Tim Rentsch
Guest
Posts: n/a
 
      11-24-2010
http://www.velocityreviews.com/forums/(E-Mail Removed) writes:

> Ian Collins <(E-Mail Removed)> wrote:
>>
>> Yes, it took a while for the penny to drop for me as well. Maybe
>>
>> "Provided that a library function can be declared without reference to
>> any type defined in the standard header where the function is declared"
>>
>> Would be better wording.

>
> I think it should be more like "...defined only in the standard
> header...". If the type is also defined in a different header, then
> including that header should suffice. In short, I'm pretty sure the
> intent was that declaring a function by hand should work as long as you
> can do it "correctly". But it's hard to say "correctly" in standardese.


I've read this opinion twice now, and it surprised me both times.
The plain reading is clear enough, and seems to make no effort in
the "correctly" direction. It seems just as likely, or perhaps
even more likely, that the intention was to allow declarations of
functions that existed before the Standard, but any function that
needed a "new" type (ie, a type defined in some system header)
must be declared by #include'ing the appropriate header. This
would allow old code to work K&R style, but new code must do the
right thing. Doesn't that seem like a more plausible explanation?
 
Reply With Quote
 
Keith Thompson
Guest
Posts: n/a
 
      11-24-2010
Tim Rentsch <(E-Mail Removed)> writes:
> (E-Mail Removed) writes:
>> Ian Collins <(E-Mail Removed)> wrote:
>>> Yes, it took a while for the penny to drop for me as well. Maybe
>>>
>>> "Provided that a library function can be declared without reference to
>>> any type defined in the standard header where the function is declared"
>>>
>>> Would be better wording.

>>
>> I think it should be more like "...defined only in the standard
>> header...". If the type is also defined in a different header, then
>> including that header should suffice. In short, I'm pretty sure the
>> intent was that declaring a function by hand should work as long as you
>> can do it "correctly". But it's hard to say "correctly" in standardese.

>
> I've read this opinion twice now, and it surprised me both times.
> The plain reading is clear enough, and seems to make no effort in
> the "correctly" direction. It seems just as likely, or perhaps
> even more likely, that the intention was to allow declarations of
> functions that existed before the Standard, but any function that
> needed a "new" type (ie, a type defined in some system header)
> must be declared by #include'ing the appropriate header. This
> would allow old code to work K&R style, but new code must do the
> right thing. Doesn't that seem like a more plausible explanation?


Not to me.

I think the point is that the function declarations provided by headers
are just C function declarations. A declaration needs to be visible in
order to call a library function, and it doesn't matter to the compiler
whether that declaration comes from a #included header or is part of the
source file being compiled.

In what plausible way could this program:

#include <stddef.h>
int main(void)
{
size_t strlen(const char *s);
return strlen("");
}

fail in a real implementation? The standard, as currently worded,
fails to define its behavior, but IMHO a conceptually *simpler*
rule would cover this program.

Again, that's not to say that writing such code is a good idea,
but the rule in the standard does reinforce the idea that standard
functions like strlen() can be treated like ordinary functions.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Reply With Quote
 
Nick
Guest
Posts: n/a
 
      11-24-2010
Francois Grieu <(E-Mail Removed)> writes:

> Similarly, Mergesort, which someone in this thread sees as a
> good use case of recursion, is indeed a good problem for
> teaching recursion. It can also be written simply and elegantly
> with a single size_t variable rather than reentrant call or
> stack array; once this is grasped, the code gets shorter and
> more efficient, and arguably as easy to derive.
> In my view Mergesort is thus not a good use case for recursion
> in the sense that implies a reentrant function.


To help me follow this point, could you re-write this without the
recursion -
http://groups.google.com/group/comp....c56651e3?hl=en

It's not obvious to me how to do it while keeping it simple and elegant.
--
Online waterways route planner | http://canalplan.eu
Plan trips, see photos, check facilities | http://canalplan.org.uk
 
Reply With Quote
 
lawrence.jones@siemens.com
Guest
Posts: n/a
 
      11-24-2010
Keith Thompson <(E-Mail Removed)> wrote:
>
> I think the point is that the function declarations provided by headers
> are just C function declarations. A declaration needs to be visible in
> order to call a library function, and it doesn't matter to the compiler
> whether that declaration comes from a #included header or is part of the
> source file being compiled.


Exactly.

> In what plausible way could this program:
>
> #include <stddef.h>
> int main(void)
> {
> size_t strlen(const char *s);
> return strlen("");
> }
>
> fail in a real implementation? The standard, as currently worded,
> fails to define its behavior, but IMHO a conceptually *simpler*
> rule would cover this program.


The difficulty is that declaring strlen as:

unsigned strlen(const char *s);

is perfectly correct on a platform where size_t is a synonym for
unsigned int, but is not correct in general. I'm struggling to come
up with words that unambiguously allow your declaration but not mine.
--
Larry Jones

I hate being good. -- Calvin
 
Reply With Quote
 
Keith Thompson
Guest
Posts: n/a
 
      11-25-2010
(E-Mail Removed) writes:
> Keith Thompson <(E-Mail Removed)> wrote:
>> I think the point is that the function declarations provided by headers
>> are just C function declarations. A declaration needs to be visible in
>> order to call a library function, and it doesn't matter to the compiler
>> whether that declaration comes from a #included header or is part of the
>> source file being compiled.

>
> Exactly.
>
>> In what plausible way could this program:
>>
>> #include <stddef.h>
>> int main(void)
>> {
>> size_t strlen(const char *s);
>> return strlen("");
>> }
>>
>> fail in a real implementation? The standard, as currently worded,
>> fails to define its behavior, but IMHO a conceptually *simpler*
>> rule would cover this program.

>
> The difficulty is that declaring strlen as:
>
> unsigned strlen(const char *s);
>
> is perfectly correct on a platform where size_t is a synonym for
> unsigned int, but is not correct in general. I'm struggling to come
> up with words that unambiguously allow your declaration but not mine.


I think that "unsigned strlen(const char *s)" *should* be valid
if size_t and unsigned are the same type. The declaration is ugly
and non-portable, but the compiler shouldn't be able to reject it
if it's compatible with the standard form. If we have "typedef
unsigned size_t;", the compiler shouldn't be able to distinguish
between unsigned and size_t.

The current wording is a little weak on that point, even without
covering functions that are declared in one header and depend on
types "defined" in another. It says:

Provided that a library function can be declared without
reference to any type defined in a header, it is also permissible
to declare the function and use it without including its
associated header.

It doesn't say you have to declare it *correctly*. One possible
reading is that:

double strlen(int s);
double x = strlen(42);

is permissible.

Loosely, if you declare a standard library function yourself, the
declaration has to specify the same type as the actual function.
The phrase "same type" probably needs tweaking ("compatible type",
maybe?). If it doesn't, the behavior is undefined. (Does the
declaration itself introduce UB, or just a call?)

Basically, if a user declaration of strlen would be valid in
the presence of #include <string.h>, it's valid in its absence;
if it would be invalid, then it's an incompatible declaration
and the behavior of any call that refers to the user declaration
is undefined.

Yet another reason that it's a bad idea to depend on this is that
standard function declarations can change from one version of
the standard to another. A program that gets the declaration for
strcpy() by #include'ing <string.h> can easily be portable across
C90, C99, and future standards; a program that tries to declare
it itself can get hung up on the "restrict" qualifiers that were
added in C99.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Reply With Quote
 
Keith Thompson
Guest
Posts: n/a
 
      11-25-2010
Keith Thompson <(E-Mail Removed)> writes:
[...]
> Provided that a library function can be declared without
> reference to any type defined in a header, it is also permissible
> to declare the function and use it without including its
> associated header.
>
> It doesn't say you have to declare it *correctly*. One possible
> reading is that:
>
> double strlen(int s);
> double x = strlen(42);
>
> is permissible.

[...]

On second thought, I think that's covered by C99 6.5.2.2p9:

If the function is defined with a type that is not compatible
with the type (of the expression) pointed to by the expression
that denotes the called function, the behavior is undefined.

I'm assuming here that strlen() has a "definition" somewhere.
On the other hand, it might be written in a language other than
C, or implemented via pure compiler magic. (Note that I'm not
referring to the permission to define strlen() as a macro; in that
case, any declaration is irrelevant unless the caller bypasses the
macro definition.)

So perhaps having 7.1.4p2 refer to 6.5.2.2, and state that all
this stuff works as if the standard library functions have actual
C definitions (is that already stated somewhere?), would cover all
the bases.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Reply With Quote
 
Mark Wooding
Guest
Posts: n/a
 
      11-25-2010
Nick <(E-Mail Removed)> writes:

> To help me follow this point, could you re-write this without the
> recursion -
> http://groups.google.com/group/comp....c56651e3?hl=en


This is Chris Torek's recursive linked-list mergesort.

> It's not obvious to me how to do it while keeping it simple and
> elegant.


Chris's algorithm is a `top-down' mergesort. It's not especially easy
to make such a beast nonrecursive without an explicit stack. But
there's a very close cousin, a `bottom-up' mergesort, which is easy to
make nonrecursive.

The difference is as follows.

* A top-down mergesort splits its input into two sublists of similar
length, recursively sorts the two halves, and merges them together.
So if you start with N items, then you split them into two lists of
N/2 items at the top level, four lists of N/4 items at the next, and
so on until you have only trivial sublists.

* A bottom-up mergesort builds up islands of sortedness. It starts by
cutting the input into pairs of sublists of (say) length 1 which are
obviously sorted, merging the pairs, and concatenating the merged
results. Then every pair of items is sorted; so we cut the list
into pairs of lists of length 2, merging and concatenating.

This latter is quite easy to implement nonrecursively: all you need to
remember is how long the sublists you're meant to work with are. I
think the resulting code is fairly elegant, but I'll let you be the
final judge. Note the triple indirection in `mergelists': I think this
is one of the more obvious natural applications for a triply-indirected
pointer.

/* Merge together two sorted lists A and B into a single sorted list.
* The sort order is determined by the CMP function; if CMP decides that two
* elements are equivalent then prefer the element from list A. On exit,
* **TAILP is set to point to the start of the merged list, and *TAILP is the
* address of the `next' link of the last node of the merged list, so that
* further stuff can be appended later: several merged lists can be
* accumulated simply by passing the same TAILP argument repeatedly to
* `mergelists'; to terminate the list finally, set **TAILP = 0.
*/
static void mergelists(struct list ***tailp,
struct list *a, struct list *b,
int (*cmp)(const struct list *,
const struct list *))
{
struct list **tail = *tailp;

/* Choose the lesser item from the two list heads and advance. Continue
* until we run out of one of the lists; when done; set A to the (possibly)
* nonempty list.
*/
for (; {
if (!b) break;
else if (!a) { a = b; break; }
else if (cmp(a, b) <= 0) { *tail = a; tail = &a->next; a = a->next; }
else { *tail = b; tail = &b->next; b = b->next; }
}

/* Append the (possibly) nonempty list onto the accumulated output, and
* find the end of it.
*/
*tail = a;
if (a) {
while (a->next) a = a->next;
tail = &a->next;
}

/* Tell the caller where the end of the list is. */
*tailp = tail;
}

/* Sort a list L using in-place list mergesort.
* The CMP function is a preorder, outputting <0, 0, or >0 according to
* whether the payload of its left argument is less than, equivalent to, or
* greater than, its right argument; `equivalence' must indeed be an
* equivalence relation, and CMP must be transitive. The sort is stable: two
* nodes with equivalent payloads remain in the same relative order.
*/
static struct list *mergesort(struct list *l,
int (*cmp)(const struct list *,
const struct list *))
{
size_t i, n = 1;
struct list *head, **tail;
struct list *a, *b, **p;
int done = 0;

/* An empty input is a nasty edge case. Dispose of it in advance. */
if (!l) return (0);

/* Main loop. We cut the list into pairs of sublists of length N, and
* merge each pair together, concatenating the results to form a new list
* in which each consecutive run of 2 N elements is sorted. Then we double
* N. When N is longer than the list, we're done.
*/
do {

/* Collect the merged sublist pairs. */
tail = &head;

/* While there is more input list to consume, cut out two sublists of
* length N.
*/
while (l) {

/* Find the start and end of the first sublist. Note that the extra
* level of indiration in P allows it to lag one node behind, so that
* we can terminate the sublist when we've counted enough.
*/
p = &l; a = *p;
for (i = n; i && *p; i--) p = &(*p)->next;

/* Find the start and end of the second sublist. */
b = *p; *p = 0; p = &b;
for (i = n; i && *p; i--) p = &(*p)->next;
l = *p; *p = 0;

/* If we've hit the end of the input, and haven't produced any output,
* then our two sublists must contain all of the elements. Therefore,
* when we've merged them, we'll have sorted the entire input, and
* we'll be finished.
*/
if (!l && tail == &head) done = 1;

/* Merge the sublists, and append to our output. The A list consists
* of elements earlier than the B list, and `mergelists' will preserve
* the relative order of equivalent nodes, so the sort is stable.
*/
mergelists(&tail, a, b, cmp);
}

/* Terminate the output list, and make it be the new input. */
*tail = 0;
l = head;
n <<= 1;
} while (!done);

/* Finished. */
return (l);
}

-- [mdw]
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
*bug* *bug* *bug* David Raleigh Arnold Firefox 12 04-02-2007 03:13 AM
AVG Email Scanner activating at strange times with strange IP addresses dennispublic@hotmail.com Computer Support 1 08-26-2006 04:27 AM
Strange, Illogical ASP.NET Bug: "File or assembly name, or one of its dependencies not found" Philipp Schumann ASP .Net 2 10-25-2003 05:05 PM
Question About Strange 'C' Code Syntax ( Well strange to me anyway ) Harvey Twyman C Programming 8 10-25-2003 05:54 AM
Re: VERY STRANGE BUG? Adding a textbox control causes other textbox control to fail??? S. Justin Gengo ASP .Net 0 07-16-2003 06:51 PM



Advertisments