Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Strange bug

Reply
Thread Tools

Strange bug

 
 
Super Ted
Guest
Posts: n/a
 
      11-19-2010
Hi I am trying to write a function to trim white-space from the beginning
and end of a string. I have decided to capitalize on the recursive nature
of this task and also to subdivide it into two simpler functions (trim
left, trim right).

Here is my code with a test-harness:

#include <stdio.h>

static int TrimWSpaceL(char *s)
{
if(isspace(*s)) {
strcpy(s,s+1);
return 1+TrimWSpaceL(s);
}
else
return 0;
}

static int TrimWSpaceR(char *s)
{
if(isspace(*(s+strlen(s)-1))) {
*(s+strlen(s)-1)='\0';
return 1+TrimWSpaceR(s);
}
else
return 0;
}

/* trims white-space, returning #spaces trimmed */
int TrimWSpace(char *s)
{
return TrimWSpaceL(s) + TrimWSpaceR(s);
}

int main()
{
char s[] = " \t *Hello world* \n ";
int i = TrimWSpace(s);
printf("%s\n(deleted %d spaces)\n", s,i);
return 0;
}

However I get an unexpected output:

*Hello world
(deleted 11 spaces)

Both lines are wrong! In the first line, a * has disappeared. In the 2nd
line, 11 spaces are reported, not 10!

Can anyone help?

Cheers.
 
Reply With Quote
 
 
 
 
Ian Collins
Guest
Posts: n/a
 
      11-19-2010
On 11/20/10 09:10 AM, Super Ted wrote:
> #include<stdio.h>


You haven't included <ctype.h> or <string.h>, so you don't have
prototypes for strcpy, strlen or isspace.

--
Ian Collins
 
Reply With Quote
 
 
 
 
Super Ted
Guest
Posts: n/a
 
      11-19-2010
Ian Collins writes:

> On 11/20/10 09:10 AM, Super Ted wrote:
>> #include<stdio.h>

>
> You haven't included <ctype.h> or <string.h>, so you don't have
> prototypes for strcpy, strlen or isspace.


Sorry, but adding those includes does not fix the problem!
 
Reply With Quote
 
Ian Collins
Guest
Posts: n/a
 
      11-19-2010
On 11/20/10 10:31 AM, Super Ted wrote:
> Ian Collins writes:
>
>> On 11/20/10 09:10 AM, Super Ted wrote:
>>> #include<stdio.h>

>>
>> You haven't included<ctype.h> or<string.h>, so you don't have
>> prototypes for strcpy, strlen or isspace.

>
> Sorry, but adding those includes does not fix the problem!


It should, the code appears fine (and works for me).

../a.out
*Hello world*
(deleted 10 spaces)

--
Ian Collins
 
Reply With Quote
 
Keith Thompson
Guest
Posts: n/a
 
      11-19-2010
Ian Collins <ian-> writes:
> On 11/20/10 10:31 AM, Super Ted wrote:
>> Ian Collins writes:
>>
>>> On 11/20/10 09:10 AM, Super Ted wrote:
>>>> #include<stdio.h>
>>>
>>> You haven't included<ctype.h> or<string.h>, so you don't have
>>> prototypes for strcpy, strlen or isspace.

>>
>> Sorry, but adding those includes does not fix the problem!

>
> It should, the code appears fine (and works for me).
>
> ./a.out
> *Hello world*
> (deleted 10 spaces)


strcpy(s,s+1);

C99 7.21.2.3:

The strcpy function copies the string pointed to by s2 (including
the terminating null character) into the array pointed to by
s1. If copying takes place between objects that overlap, the
behavior is undefined.

If you really wanted to do it this way, you could use memmove(),
which works for overlapping arguments. But the approach
is quite inefficient; each character is copied multiple times.

--
Keith Thompson (The_Other_Keith) kst- <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Reply With Quote
 
Ian Collins
Guest
Posts: n/a
 
      11-19-2010
On 11/20/10 10:47 AM, Keith Thompson wrote:
> Ian Collins<ian-> writes:
>> On 11/20/10 10:31 AM, Super Ted wrote:
>>> Ian Collins writes:
>>>
>>>> On 11/20/10 09:10 AM, Super Ted wrote:
>>>>> #include<stdio.h>
>>>>
>>>> You haven't included<ctype.h> or<string.h>, so you don't have
>>>> prototypes for strcpy, strlen or isspace.
>>>
>>> Sorry, but adding those includes does not fix the problem!

>>
>> It should, the code appears fine (and works for me).
>>
>> ./a.out
>> *Hello world*
>> (deleted 10 spaces)

>
> strcpy(s,s+1);
>
> C99 7.21.2.3:
>
> The strcpy function copies the string pointed to by s2 (including
> the terminating null character) into the array pointed to by
> s1. If copying takes place between objects that overlap, the
> behavior is undefined.
>
> If you really wanted to do it this way, you could use memmove(),
> which works for overlapping arguments. But the approach
> is quite inefficient; each character is copied multiple times.


Silly me - the prototype should have given that away:

char *strcpy(char *restrict s1, const char *restrict s2);

--
Ian Collins
 
Reply With Quote
 
Mark Wooding
Guest
Posts: n/a
 
      11-20-2010
Super Ted <> writes:

> #include <stdio.h>


The `strcpy' and `strlen' functions you use are declared in <string.h>.
The `isspace' function is declared in <ctype.h>. You should include
these headers if you use the functions. I believe that the implicit
declaration of `isspace' if you omit <ctype.h> is sufficient to be able
to call it in a well-defined way; but this is not the case for `strcpy'
or `strlen', neither of which return `int' -- the former returns a
`char *' and the latter a `size_t'.

My compiler gives me warnings about the above. You should probably turn
on your compiler's warnings.

#include <ctype.h>
#include <string.h>

> static int TrimWSpaceL(char *s)
> {
> if(isspace(*s)) {


The ismumble functions from <ctype.h> expect to receive their argument
in a rather peculiar form: an `unsigned char' converted to an `int'. If
the `char' type is signed on your system (as it is in many), and the
first character of the string is negative (and not EOF, a special
exemption) then the behaviour is undefined.

> strcpy(s,s+1);


The source and destination arguments for `strcpy' mustn't overlap. This
has undefined behaviour. Also, you do this copy for every leading
whitespace character you find: if the string consists entirely of
n whitespace then you move a total of n (n + 1)/2 characters, taking
quadratic time to solve a linear-time problem.

> return 1+TrimWSpaceL(s);


This is not a tail position. You have potentially unbounded non-tail
recursion, and therefore run the risk of exhausting the stack. There is
no reason to use more than a constant amount of memory to solve this
problem.

Also, the type `int' may be too small to store the number of whitespace
characters removed from the string. I'd use `size_t' instead.

> }
> else
> return 0;
> }


static size_t trimspace_l(char *p)
{
const char *q;

for (q = p; *q && isspace((unsigned char)*q); q++);
if (q > p) memmove(p, q, strlen(q) + 1);
return (q - p);
}

> static int TrimWSpaceR(char *s)
> {
> if(isspace(*(s+strlen(s)-1))) {


Same remarks about isspace again. Many people would find
`s[strlen(s) - 1]' clearer, though the two are entirely equivalent.

More seriously, this has a bug if the string is empty: strlen(s) is
zero, so this is s[-1], probably outside the bounds of the string, and
undefined behaviour.

> *(s+strlen(s)-1)='\0';


The `strlen' function has to scan the entire string trying to find a
zero byte. For long strings, this takes a fair amount of time. You
call it twice for each trailing whitespace character. Again, this leads
to quadratic running time for ought to be a linear-time operation.

> return 1+TrimWSpaceR(s);


And again the unbounded recursion, using linear space for what ought to
be a constant-space operation. And again, `int' may not be large
enough.

> }
> else
> return 0;
> }


static size_t trimspace_r(char *p)
{
char *q = p + strlen(p);
const char *r = q;

while (q > p && isspace((unsigned char)q[-1])) *--q = 0;
return (r - q);
}

> /* trims white-space, returning #spaces trimmed */
> int TrimWSpace(char *s)
> {
> return TrimWSpaceL(s) + TrimWSpaceR(s);


This performs the two trimmings in an arbitrary order. Either order is
correct, but it's slightly (negligibly!) more efficient to trim the
right hand side first (why?).

> }


static size_t trimspace(char *p)
{
size_t n = 0;

n += trimspace_r(p);
n += trimspace_l(p);
return (n);
}

> int main()


It's more usual nowadays to write `int main(void)' in C. The above
doesn't provide a `prototype', which probably doesn't matter for `main'
because calling `main' is usually only done in IOCCC submissions.

> {
> char s[] = " \t *Hello world* \n ";
> int i = TrimWSpace(s);
> printf("%s\n(deleted %d spaces)\n", s,i);
> return 0;
> }


This seems OK.

> However I get an unexpected output:
>
> *Hello world
> (deleted 11 spaces)
>
> Both lines are wrong! In the first line, a * has disappeared. In the 2nd
> line, 11 spaces are reported, not 10!


I don't know which of the mistakes I've pointed above causes this. (On
my system, I get the expected output for your program. That doesn't
mean that the mistakes aren't serious -- only that they don't cause
observable failures on one particular machine with one particular
compiler version using one particular collection of compiler options.

-- [mdw]
 
Reply With Quote
 
Super Ted
Guest
Posts: n/a
 
      11-20-2010
Keith Thompson writes:

> Ian Collins <ian-> writes:
>> On 11/20/10 10:31 AM, Super Ted wrote:
>>> Ian Collins writes:
>>>
>>>> On 11/20/10 09:10 AM, Super Ted wrote:
>>>>> #include<stdio.h>
>>>>
>>>> You haven't included<ctype.h> or<string.h>, so you don't have
>>>> prototypes for strcpy, strlen or isspace.
>>>
>>> Sorry, but adding those includes does not fix the problem!

>>
>> It should, the code appears fine (and works for me).
>>
>> ./a.out
>> *Hello world*
>> (deleted 10 spaces)

>
> strcpy(s,s+1);
>
> C99 7.21.2.3:
>
> The strcpy function copies the string pointed to by s2 (including
> the terminating null character) into the array pointed to by s1. If
> copying takes place between objects that overlap, the behavior is
> undefined.
>
> If you really wanted to do it this way, you could use memmove(), which
> works for overlapping arguments. But the approach is quite inefficient;
> each character is copied multiple times.


This is a truely stupid restriction. I have now written a strcpy_safe
function by hand, where all string copies are valid, and from now on I
will use only that in all my future code.
 
Reply With Quote
 
Super Ted
Guest
Posts: n/a
 
      11-20-2010

> More seriously, this has a bug if the string is empty: strlen(s) is
> zero, so this is s[-1], probably outside the bounds of the string, and
> undefined behaviour.


I do not check argument validity for efficiency reasons. Like with
standard functions eg strlen, the caller of my function is responsible
for ensuring it doesn't pass a null string.
 
Reply With Quote
 
Seebs
Guest
Posts: n/a
 
      11-20-2010
On 2010-11-20, Super Ted <> wrote:
> Keith Thompson writes:
>> The strcpy function copies the string pointed to by s2 (including
>> the terminating null character) into the array pointed to by s1. If
>> copying takes place between objects that overlap, the behavior is
>> undefined.


> This is a truely stupid restriction.


Is it?

Because it seems to me that if it were all that stupid, surely someone
would have pointed this out and it wouldn't be there.

Is it possible that there is a reason for it which you don't understand?

> I have now written a strcpy_safe
> function by hand, where all string copies are valid, and from now on I
> will use only that in all my future code.


All string copies? That's an accomplishment.

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet-
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
*bug* *bug* *bug* David Raleigh Arnold Firefox 12 04-02-2007 03:13 AM
AVG Email Scanner activating at strange times with strange IP addresses dennispublic@hotmail.com Computer Support 1 08-26-2006 04:27 AM
Strange, Illogical ASP.NET Bug: "File or assembly name, or one of its dependencies not found" Philipp Schumann ASP .Net 2 10-25-2003 05:05 PM
Question About Strange 'C' Code Syntax ( Well strange to me anyway ) Harvey Twyman C Programming 8 10-25-2003 05:54 AM
Re: VERY STRANGE BUG? Adding a textbox control causes other textbox control to fail??? S. Justin Gengo ASP .Net 0 07-16-2003 06:51 PM



Advertisments
 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57