Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Comment on trim string function please

Reply
Thread Tools

Comment on trim string function please

 
 
swengineer001@gmail.com
Guest
Posts: n/a
 
      07-10-2008
Just looking for a few eyes on this code other than my own.

void TrimCString(char *str)
{
// Trim whitespace from beginning:
size_t i = 0;
size_t j;

while(isspace(str[i]))
{
i++;
}
if(i > 0)
{
for(j = 0; i < strlen(str); j++, i++)
{
str[j] = str[i];
}
str[j] = '\0';
}

// Trim whitespace from end:
i = strlen(str) - 1;

while(isspace(str[i]))
{
i--;
}
if(i < (strlen(str) - 1))
{
str[i + 1] = '\0';
}
}
 
Reply With Quote
 
 
 
 
Jens Thoms Toerring
Guest
Posts: n/a
 
      07-10-2008
http://www.velocityreviews.com/forums/(E-Mail Removed) <(E-Mail Removed)> wrote:
> Just looking for a few eyes on this code other than my own.


> void TrimCString(char *str)
> {
> // Trim whitespace from beginning:


I guess I would trim from the end first since then you have less
copying to do afterwards.

> size_t i = 0;
> size_t j;


> while(isspace(str[i]))


isspace() expects an int and not a char as it's argument.

> {
> i++;
> }
> if(i > 0)
> {
> for(j = 0; i < strlen(str); j++, i++)
> {
> str[j] = str[i];
> }
> str[j] = '\0';
> }


An alternative would be to use memmove() here, so you don't
have to do it byte by byte. Also callling strlen() each time
through the loop is a bit of a waste of time - it doesn't
change and can be replaced by a check if str[i] is '\0'.

> // Trim whitespace from end:
> i = strlen(str) - 1;


Careful: This could set 'i' to -1 (if the string consistet of white
space only) and then the rest won't work anymore.

> while(isspace(str[i]))
> {
> i--;
> }


> if(i < (strlen(str) - 1))
> {
> str[i + 1] = '\0';
> }
> }


Here's an alternative version using pointers (and trying to
minimize the number of calls of strlen()):

void
TrimCString( char *str )
{
char *p,
*q;

/* Check that we've got something that looks like a string */

if ( ! str || ! * str )
return;

/* Trim from end */

for ( p = str + strlen( str ) - 1; p != str && isspace( ( int ) *p ); p-- )
/* empty */ ;

if ( p == str ) /* only white space in string */
{
*str = '\0';
return;
}

*++p = '\0';

/* Trim from start */

for ( q = str; isspace( ( int ) *q ); q++ )
/* empty */ ;

if ( q != str )
memmove( str, q, p - q + 1 );
}
Regards, Jens
--
\ Jens Thoms Toerring ___ (E-Mail Removed)
\__________________________ http://toerring.de
 
Reply With Quote
 
 
 
 
badc0de4@gmail.com
Guest
Posts: n/a
 
      07-10-2008
(E-Mail Removed) wrote:
> Just looking for a few eyes on this code other than my own.
>
> void TrimCString(char *str)


Why not return a char *, like most other string functions?

char *TrimCString(char *str)

using char * enables you to piggyback your function in the
middle of other functions, eg
printf("%s\n", TrimCString(someString));

> {
> // Trim whitespace from beginning:
> size_t i = 0;
> size_t j;
>
> while(isspace(str[i]))
> {
> i++;
> }
> if(i > 0)
> {
> for(j = 0; i < strlen(str); j++, i++)


move the strlen() outside the loop.
Maybe use memmove() instead of the loop.

> {
> str[j] = str[i];
> }
> str[j] = '\0';
> }
>
> // Trim whitespace from end:
> i = strlen(str) - 1;


Use the strlen you computed before, when
you moved it out of the for loop above

>
> while(isspace(str[i]))
> {
> i--;
> }
> if(i < (strlen(str) - 1))
> {
> str[i + 1] = '\0';


No need for the test.
when i >= (strlen(str) - 1) -- it can only be equal, anyway -- the
assignment overwrites a '\0' with a brand new '\0'.
Anyway, if you want to keep the test, use the computed strlen.

> }
> }



A couple what-if's

* what if a pass NULL to the function?
TrimCString(NULL);

* what if I pass a constant string literal to the function?
TrimCString(" 4 spaces at both ends ");
 
Reply With Quote
 
swengineer001@gmail.com
Guest
Posts: n/a
 
      07-10-2008
On Jul 10, 1:32*pm, (E-Mail Removed) (Jens Thoms Toerring) wrote:
> (E-Mail Removed) <(E-Mail Removed)> wrote:
> > Just looking for a few eyes on this code other than my own.
> > void TrimCString(char *str)
> > {
> > * * * * // Trim whitespace from beginning:

>
> I guess I would trim from the end first since then you have less
> copying to do afterwards.
>
> > * * * * size_t i = 0;
> > * * * * size_t j;
> > * * * * while(isspace(str[i]))

>
> isspace() expects an int and not a char as it's argument.

Doesn't this promotion happen implicitly and without loss of
information since I am actually dealing with characters and not using
the char as a holder for small integers?
>
> > * * * * {
> > * * * * * * * * i++;
> > * * * * }
> > * * * * if(i > 0)
> > * * * * {
> > * * * * * * * * for(j = 0; i < strlen(str); j++, i++)
> > * * * * * * {
> > * * * * * * * * * * * * str[j] = str[i];
> > * * * * * * * * }
> > * * * * * * * * str[j] = '\0';
> > * * * * }

>
> An alternative would be to use memmove() here, so you don't
> have to do it byte by byte. Also callling strlen() each time
> through the loop is a bit of a waste of time - it doesn't
> change and can be replaced by a check if str[i] is '\0'.
>
> > * * * * // Trim whitespace from end:
> > * * * * i = strlen(str) - 1;

>
> Careful: This could set 'i' to -1 (if the string consistet of white
> space only) and then the rest won't work anymore.

i is of type size_t which I believe can't be negative?
>
> > * * * * while(isspace(str[i]))
> > * * * * {
> > * * * * * * * * i--;
> > * * * * }
> > * * * * if(i < (strlen(str) - 1))
> > * * * * {
> > * * * * * * * * str[i + 1] = '\0';
> > * * * * }
> > }

>
> Here's an alternative version using pointers (and trying to
> minimize the number of calls of strlen()):
>
> void
> TrimCString( char *str )
> {
> * * char *p,
> * * * * * * **q;
>
> * * /* Check that we've got something that looks like a string */
>
> * * * * if ( ! str || ! * str )
> * * * * * * * * return;
>
> * * * * /* Trim from end */
>
> * * for ( p = str + strlen( str ) - 1; p != str && isspace( ( int ) *p ); p-- )
> * * * * * * /* empty */ ;
>
> * * * * if ( p == str ) * * */* only white space in string */
> * * * * {
> * * * * * * *str = '\0';
> * * * * * * * * return;
> * * }
>
> * * * * *++p = '\0';
>
> * * /* Trim from start */
>
> * * * * for ( q = str; isspace( ( int ) *q ); q++ )
> * * * * /* empty */ ;
>
> * * if ( q != str )
> * * * * memmove( str, q, p - q + 1 );}

Thanks for the code and the comments. This is useful.
 
Reply With Quote
 
swengineer001@gmail.com
Guest
Posts: n/a
 
      07-10-2008
On Jul 10, 1:39*pm, (E-Mail Removed) wrote:
> (E-Mail Removed) wrote:
> > Just looking for a few eyes on this code other than my own.

>
> > void TrimCString(char *str)

>
> Why not return a char *, like most other string functions?
>
> char *TrimCString(char *str)
>
> * * * * using char * enables you to piggyback your function in the
> * * * * middle of other functions, eg
> * * * * printf("%s\n", TrimCString(someString));

Good point, I had not thought of it.
>
> > {
> > * *// Trim whitespace from beginning:
> > * *size_t i = 0;
> > * *size_t j;

>
> > * *while(isspace(str[i]))
> > * *{
> > * * * * * *i++;
> > * *}
> > * *if(i > 0)
> > * *{
> > * * * * * *for(j = 0; i < strlen(str); j++, i++)

>
> move the strlen() outside the loop.
> Maybe use memmove() instead of the loop.
>
> > * * * *{
> > * * * * * * * * * *str[j] = str[i];
> > * * * * * *}
> > * * * * * *str[j] = '\0';
> > * *}

>
> > * *// Trim whitespace from end:
> > * *i = strlen(str) - 1;

>
> Use the strlen you computed before, when
> you moved it out of the for loop above

the length of the string has changed since then, possibly.
>
>
>
> > * *while(isspace(str[i]))
> > * *{
> > * * * * * *i--;
> > * *}
> > * *if(i < (strlen(str) - 1))
> > * *{
> > * * * * * *str[i + 1] = '\0';

>
> No need for the test.
> when i >= (strlen(str) - 1) -- it can only be equal, anyway -- the
> assignment overwrites a '\0' with a brand new '\0'.
> Anyway, if you want to keep the test, use the computed strlen.
>
> > * *}
> > }

>
> A couple what-if's
>
> * what if a pass NULL to the function?
> * TrimCString(NULL);

I added a check for this.
>
> * what if I pass a constant string literal to the function?
> * TrimCString(" * *4 spaces at both ends * *");

How can I account for this?

 
Reply With Quote
 
swengineer001@gmail.com
Guest
Posts: n/a
 
      07-10-2008
On Jul 10, 1:41*pm, Eric Sosman <(E-Mail Removed)> wrote:
> (E-Mail Removed) wrote:
> > Just looking for a few eyes on this code other than my own.

>
> * * *This pair of eyes sees three bugs, two occurring twice each
> and the other perhaps due to too much snippage. *There are also
> some opportunities to improve speed and style. *So, here we go:
>
> * * *Bug: Since you're using isspace() and strlen(), you need to
> #include <ctype.h> and <string.h> to get their declarations.
> Without the declarations, a compiler operating under C99 rules
> must generate a diagnostic. *Under C89 rules the code will work,
> but might not be as fast as if the vendor's "magic" declarations
> were present.
>
> > void TrimCString(char *str)
> > {
> > * *// Trim whitespace from beginning:
> > * *size_t i = 0;

>
> * * *Style: Instead of initializing `i' at its declaration and then
> relying on the initial value later on, consider initializing it
> closer to its use. *The `for' statement is convenient for such
> purposes.
>
> > * *size_t j;

>
> > * *while(isspace(str[i]))

>
> * * *Bug: If `str' contains negative-valued characters, this use
> may violate the "contract" of isspace() by passing an out-of-range
> argument. *Use `while (isspace( (unsigned char)str[i] ))'. *(This
> is one of those occasions where a cast is almost always required
> and almost always omitted, as opposed to the more usual situation
> where a cast is almost always inserted and almost always wrong.)
>
> > * *{
> > * * * * * *i++;
> > * *}
> > * *if(i > 0)
> > * *{
> > * * * * * *for(j = 0; i < strlen(str); j++, i++)

>
> * * *Speed: This loop calculates strlen(str) on every iteration.
> Since it will return the same value each time, all calls after the
> first are wasted effort. *Call strlen() once before the loop and
> store the result in a variable, and use that variable to control
> the loop.
>
> > * * * *{
> > * * * * * * * * * *str[j] = str[i];
> > * * * * * *}
> > * * * * * *str[j] = '\0';
> > * *}

>
> > * *// Trim whitespace from end:
> > * *i = strlen(str) - 1;

>
> * * *Bug: If `str' is the empty string (either because it was
> empty to begin with or because it contained only white space),
> this calculation will produce a very large `i' that is almost
> assuredly wrong, R-O-N-G.
>
> > * *while(isspace(str[i]))

>
> * * *Bug: Same missing cast as above.
>
> > * *{
> > * * * * * *i--;
> > * *}
> > * *if(i < (strlen(str) - 1))

>
> * * *Bug: Same mishandling of the empty string as above.
>
> * * *Speed: strlen(str) is still the same as it was a few lines
> ago, so there's no point in computing it again.
>
> > * *{
> > * * * * * *str[i + 1] = '\0';

>
> * * *Speed: It's probably quicker -- and certainly less verbose --
> to do this assignment unconditionally than to test whether it's
> needed. *If it isn't, you'll just store a zero on top of an
> existing zero, which is harmless.
>
> > * *}
> > }

>
> * * *Summary: Not quite ready for prime time, but not as bad as
> some attempts I've seen.
>
> * * *Challenge: See if you can think of a way to do the job in
> just one pass over the string (calling strlen() counts as a
> "pass"). *Hint: During the copy-to-front operation, can you
> somehow figure out where the final '\0' should land without
> going back and inspecting the moved characters a second time?
>
> --
> (E-Mail Removed)


Thanks for the input. Someone else had pointed out one of the issues
you mentioned but I didn't understsnd what he was saying until I read
your description.
 
Reply With Quote
 
swengineer001@gmail.com
Guest
Posts: n/a
 
      07-10-2008
Here is my second go at it after reading the comments provided.

//This was in my original file but just not immediately before this
function
#include <ctype.h>
#include <string.h>

//Added char* return as suggested
char* TrimCString(char *str)
{
// Trim whitespace from beginning:
size_t i = 0;
size_t j;

//Added validation of the argument for NULL
if(str == NULL)
{
return str;
}

//Added cast as suggested. I have always avoided casts for various
reasons
//discussed in the FAQ and elsewhere but it is good to know this
case
while(isspace((unsigned char)str[i]))
{
i++;
}
if(i > 0)
{
//Calculate length once
size_t length = strlen(str);
//Decided to leave the bytewise copy I just think I can
understand it a little better
for(j = 0; i < length; j++, i++)
{
str[j] = str[i];
}
str[j] = '\0';
}

// Trim whitespace from end:
//Added check to catch the empty string
i = (strlen(str) ? (strlen(str) - 1) : 0);

while(isspace((unsigned char)str[i]))
{
i--;
}
//Removed check that was not needed
str[i + 1] = '\0';
}
 
Reply With Quote
 
badc0de4@gmail.com
Guest
Posts: n/a
 
      07-10-2008
(E-Mail Removed) wrote:
> On Jul 10, 1:39 pm, (E-Mail Removed) wrote:
> > (E-Mail Removed) wrote:
> > > Just looking for a few eyes on this code other than my own.



> > * what if I pass a constant string literal to the function?
> > TrimCString(" 4 spaces at both ends ");


> How can I account for this?


Well ... you can't!
But you can do (at least) one of two things:

a) add a comment forbidding the caller to pass a string literal.
The comment goes near the function in the file with the code,
it also goes in the header file and in the documentation
provided for TrimCString()

b) Rewrite TrimCString to write the trimmed string to a
different memory area (variable, array, whatever), a bit
like strcpy() works

char *TrimCString(char *dest, const char *source)
/* caller must ensure `dest` has enough space for
the trimmed string */
/* if source is NULL, trims `dest` in-place. `dest`
*MUST* be writable */
 
Reply With Quote
 
Ben Bacarisse
Guest
Posts: n/a
 
      07-10-2008
Eric Sosman <(E-Mail Removed)> writes:

> (E-Mail Removed) wrote:
>> Just looking for a few eyes on this code other than my own.

<snip>
>> size_t i = 0;

>
> Style: Instead of initializing `i' at its declaration and then
> relying on the initial value later on, consider initializing it
> closer to its use. The `for' statement is convenient for such
> purposes.


If you are suggesting replacing the while (isspace(str[i]) with a
for (int i = 0; isspace((unsigned char)str[i]; i++); then this will not
allow i to tested outside the for:

>> while(isspace(str[i]))
>> {
>> i++;
>> }
>> if(i > 0)


here.

--
Ben.
 
Reply With Quote
 
badc0de4@gmail.com
Guest
Posts: n/a
 
      07-10-2008
(E-Mail Removed) wrote:
> Here is my second go at it after reading the comments provided.


> //Added char* return as suggested
> char* TrimCString(char *str)
> {


[...]

> //Calculate length once
> size_t length = strlen(str);
> //Decided to leave the bytewise copy [...]
> for(j = 0; i < length; j++, i++)
> {
> str[j] = str[i];
> }
> str[j] = '\0';
> }
>
> // Trim whitespace from end:
> //Added check to catch the empty string
> i = (strlen(str) ? (strlen(str) - 1) : 0);


Here, strlen(str) is the same as j.
You can replace the call to strlen() with j

i = j ? j - 1 : 0;

> while(isspace((unsigned char)str[i]))
> {
> i--;
> }
> //Removed check that was not needed
> str[i + 1] = '\0';


And, since you changed the function to return a char *,
do return a char * to the caller.

return str;

> }

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
FAQ Topic - How do I trim whitespace - LTRIM/RTRIM/TRIM? FAQ server Javascript 2 04-24-2007 01:59 AM
FAQ Topic - How do I trim whitespace - LTRIM/RTRIM/TRIM? FAQ server Javascript 26 02-26-2007 05:06 PM
FAQ Topic - How do I trim whitespace - LTRIM/RTRIM/TRIM? FAQ server Javascript 6 12-25-2006 08:47 PM
FAQ Topic - How do I trim whitespace - LTRIM/RTRIM/TRIM? FAQ server Javascript 0 10-25-2006 11:00 PM
FAQ Topic - How do I trim whitespace - LTRIM/RTRIM/TRIM? FAQ server Javascript 0 08-28-2006 11:00 PM



Advertisments