Right trim string function in C

 
 
Shao Miller
 
      02-11-2012
On 2/11/2012 11:47, Shao Miller wrote:
> On 2/11/2012 08:50, Ben Bacarisse wrote:
>> but your use of a little-used statement in a position where it is both
>> suggestive and pointless is a stroke of genius!
>>

> Are you referring to the keyword (not statement) 'register'? If so, it's
> a storage-class specifier. While it suggests that access to so-specified
> objects "be as fast as possible," it also ensures that the address of a
> so-specified object cannot be taken. In this fashion, it is used in my
> code to prevent accidentally taking the address of the corresponding
> objects. I believe that this is similar to how 'const' can be removed
> from a well-behaved program without changing the behaviour. Am I mistaken?
>
>
> Why didn't you point out the redundancy of my 'continue's? Doesn't it
> seem like another "extra" like 'register' and 'const'?
>


It just occurred to me that you were talking about the jump statement
'continue', rather than the storage-class specifier 'register'. It's in
there as a habit because of the ability to hook it with a macro for
debugging purposes.

#if DEBUG_ITERATIONS
#define continue \
    if (1) { \
        puts("Continuing iteration"); \
        continue; \
    } else do ; while (0)
#endif
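
The macro uses the 'if (1) { ... } else do ... while (0)' shape rather than the more familiar 'do { ... } while (0)' wrapper, because a 'continue' inside a do loop would apply to that do loop instead of the enclosing one. The sketch below shows the same shape under a separate, made-up macro name (TRACE_CONTINUE) instead of redefining the keyword, since the C standard does not permit a macro named like a keyword to be defined when a standard header is included. The counter continue_count and the function count_odd are also assumptions added for illustration, and the empty do body is written with braces only to keep compilers quiet.

```c
#include <assert.h>
#include <stddef.h>

int continue_count = 0;   /* how many times the hook fired */

/* Hypothetical hook with the same shape as the macro above. The trailing
   'else do { } while (0)' absorbs the semicolon written after the macro. */
#define TRACE_CONTINUE \
    if (1) { \
        ++continue_count; \
        continue; \
    } else do { } while (0)

/* Counts the odd values, skipping even ones through the hook. */
int count_odd(const int *values, size_t n)
{
    int odd = 0;
    size_t i;

    assert(values != NULL);

    for (i = 0; i < n; i++) {
        if (values[i] % 2 == 0) {
            TRACE_CONTINUE;   /* continues the for loop, not a wrapper loop */
        }
        odd++;
    }
    return odd;
}
```

Swapping the counter increment for a puts call gives the debugging output the original macro was after.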
 
August Karlstrom
 
      02-11-2012
On 2012-02-10 19:17, peter wrote:
> Hello C programmers,
> I was wondering does anybody knows how or is there
> a right trim string function available in C?


No, not in the standard library, but it is not hard to implement.
Remember that since string literals must not be modified, you first need
to copy the string to a character array.

#include <ctype.h>
#include <string.h>

void trim_right(char *s)
{
    int i;

    i = strlen(s) - 1;
    while ((i >= 0) && isspace(s[i])) {
        i--;
    }
    s[i + 1] = '\0';
}


August
 
Willem
 
      02-11-2012
Willem wrote:
) peter wrote:
) ) Hello C programmers,
) ) I was wondering does anybody knows how or is there
) ) a right trim string function available in C?
) )
) ) Eg. Suppose I have the following:
) )
) ) char *str = "Hello Dolly \0";
) )
) ) Is there a right trim function that will remove the
) ) trailing spaces and make *str look like "Hello Dolly\0"?
) )
) ) In PYTHON there is a useful function for this: str.rstrip();
)
) There's been a lot of code posted, all of which seemed to be using
) pointers, so just for shits I'll post one that uses indexes:
)
) i = strlen(s);
) while (i > 0 && s[i-1] == ' ') i--;
) s2 = malloc(i+1);
) memcpy(s2, s, i);
) s2[i] = 0;

And if you really want to avoid the forwards/backwards seeking, use the
version below. Note, though, that strlen is probably a lot faster at
seeking forward than a manual loop, so for long strings this could very
well be slower:

i = 0;
for (j = 0; s[j]; j++)
    if (s[j] != ' ')
        i = j + 1;
s2 = malloc(i + 1);
if (s2 != NULL) {
    memcpy(s2, s, i);
    s2[i] = '\0';
}

But like I said, that will probably be slower in most cases.
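
For reference, the index-based fragment can be wrapped into a complete function. This is only a sketch of the strlen variant: the name rtrim_copy_spaces, the NULL check on the result of malloc, and the assert are assumptions added for illustration. Like the fragments, it treats only ' ' as trimmable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of s with trailing
   ' ' characters removed, or NULL if allocation fails. Caller frees. */
char *rtrim_copy_spaces(const char *s)
{
    size_t i;
    char *s2;

    assert(s != NULL);

    /* Walk back from the end while the last kept character is a space. */
    i = strlen(s);
    while (i > 0 && s[i - 1] == ' ')
        i--;

    s2 = malloc(i + 1);
    if (s2 == NULL)
        return NULL;

    memcpy(s2, s, i);
    s2[i] = '\0';
    return s2;
}
```

Because the result is a fresh copy, this works on string literals too, at the cost of one allocation that the caller has to free.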


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
Stefan Ram
 
      02-11-2012
Ben Bacarisse writes:
>In addition to being careful about the pointers, you need to finesse the
>mess that is isspace (and friends) when char might be signed.


Ok, I admit that I do not understand this. After

#include <ctype.h>

, and in the scope of something like

char c;

, the expression

isspace( c )

will be nonzero, when c is a standard white-space character
(or is one of a locale-specific set of characters).

Where is the problem Ben refers to?

 
Ben Bacarisse
 
      02-11-2012
August Karlstrom writes:

> On 2012-02-10 19:17, peter wrote:
>> Hello C programmers,
>> I was wondering does anybody knows how or is there
>> a right trim string function available in C?

>
> No, not in the standard library, but it is not hard to
> implement. Remember that since strings are constant objects you first
> need to copy the string to a character array.
>
> #include <ctype.h>
> #include <string.h>
>
> void trim_right(char *s)
> {
> int i;
>
> i = strlen(s) - 1;
> while ((i >= 0) && isspace(s[i])) {
> i--;
> }
> s[i + 1] = '\0';
> }


Odd things can happen when strlen(s) == 0 because strlen(s) - 1 is a
large positive number which may or may not fit into an int (it's worse
if it does, though that is very unlikely). Of course, since the
conversion to int is implementation defined, you may get -1 exactly as
you expect, but that just means the problem might not show up in
testing. You can avoid the problem by sticking to unsigned arithmetic:

size_t i = strlen(s);
while (i-- > 0 && isspace(s[i]))
    /* do nothing */;
s[i + 1] = '\0';

Using unsigned wrap-around like this is a little unusual but it does
work (unless I've messed-up of course).
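
Dropped into a complete function, the loop might look like the sketch below. The assert and the cast to unsigned char (which keeps the isspace argument valid when plain char is signed) are additions for illustration, not part of the fragment as posted.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Sketch: trims trailing whitespace from s in place. */
void trim_right(char *s)
{
    size_t i;

    assert(s != NULL);

    i = strlen(s);
    /* 'i-- > 0' tests the old value of i and then decrements it. When the
       old value is 0 the test fails and i wraps around to SIZE_MAX. */
    while (i-- > 0 && isspace((unsigned char)s[i]))
        /* do nothing */;
    /* If the loop ran off the front of the string, i + 1 wraps back to 0. */
    s[i + 1] = '\0';
}
```

The argument has to be a modifiable array, for example char buf[] = "Hello Dolly   "; passing a string literal would attempt to write into it.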

--
Ben.
 
 
Keith Thompson
 
      02-11-2012
Stefan Ram writes:
> Ben Bacarisse writes:
>>In addition to being careful about the pointers, you need to finesse the
>>mess that is isspace (and friends) when char might be signed.

>
> Ok, I admit that I do not understand this. After
>
> #include <ctype.h>
>
> , and in the scope of something like
>
> char c;
>
> , the expression
>
> isspace( c )
>
> will be nonzero, when c is a standard white-space character
> (or is one of a locale-specific set of characters).
>
> Where is the problem Ben refers to?


N1570 7.4p1:

... the argument is an int, the value of which shall be
representable as an unsigned char or shall equal the value of
the macro EOF. If the argument has any other value, the behavior
is undefined.

If plain char is signed and the value of c is negative (and not
equal to EOF), then the behavior of isspace(c) is undefined.

This can be avoided by writing:

isspace((unsigned char)c)
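
A small wrapper keeps the cast in one place. The following is a sketch only; the names ctype_arg and is_space_char are made up for illustration, and it assumes the usual case where every unsigned char value fits in an int.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Hypothetical helper: converts a char to a value that is valid to pass
   to isspace and the other <ctype.h> functions. */
int ctype_arg(char c)
{
    int arg = (unsigned char)c;
    /* Holds after the conversion, even if c was negative. */
    assert(arg >= 0 && arg <= UCHAR_MAX);
    return arg;
}

/* Hypothetical wrapper: returns 1 for whitespace, 0 otherwise, and is
   safe to call with any char value. */
int is_space_char(char c)
{
    return isspace(ctype_arg(c)) != 0;
}
```

On an implementation where plain char is signed, a char holding -1 is passed to isspace as UCHAR_MAX rather than as a negative int.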

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
 
Ben Bacarisse
 
      02-11-2012
Stefan Ram writes:

> Ben Bacarisse writes:
>>In addition to being careful about the pointers, you need to finesse the
>>mess that is isspace (and friends) when char might be signed.

>
> Ok, I admit that I do not understand this. After
>
> #include <ctype.h>
>
> , and in the scope of something like
>
> char c;
>
> , the expression
>
> isspace( c )
>
> will be nonzero, when c is a standard white-space character
> (or is one of a locale-specific set of characters).
>
> Where is the problem Ben refers to?


7.4 p1:

The header <ctype.h> declares several functions useful for classifying
and mapping characters. In all cases the argument is an int, the value
of which shall be representable as an unsigned char or shall equal the
value of the macro EOF. If the argument has any other value, the
behavior is undefined.

If char is signed, not all values of (int)c (the effect of the argument
promotion) are representable as an unsigned char. Also, there may be
a character for which (int)c == EOF.

If you know that char is unsigned or that all values of c are positive,
then there's no problem, but in portable code you need to take some
evasive action. The "usual" solution is to write

isspace((unsigned char)c);

but there was a recent thread where it was suggested that is not always
100% portable and correct depending on where the characters come from
and what representation the machine uses.

--
Ben.
 
 
Ben Bacarisse
 
      02-11-2012
Shao Miller writes:

> On 2/11/2012 08:50, Ben Bacarisse wrote:
>> Shao Miller writes:
>>
>>> On 2/10/2012 13:17, peter wrote:
>>>> Hello C programmers,
>>>> I was wondering does anybody knows how or is there
>>>> a right trim string function available in C?
>>>>
>>>> Eg. Suppose I have the following:
>>>>
>>>> char *str = "Hello Dolly \0";
>>>>
>>>> Is there a right trim function that will remove the
>>>> trailing spaces and make *str look like "Hello Dolly\0"?
>>>>
>>>> In PYTHON there is a useful function for this: str.rstrip();
>>>
>>> Maybe you would like this?:
>>>

>>
>> This is a very good effort, but, I am sorry to say, I was able to
>> understand it eventually.

>
> No need to apologize. I'm glad that you eventually understood it
> without any comments to follow along with. If it had been a more
> instructive example, it would have included comments.


My post was an attempt at humour that seems to have failed. Sorry about
that. Because style discussions are generally unproductive (people who
use non-standard styles always know that the reasons they do it are
worth it) I wanted to say "Personally, I don't like your choices and BTW
you forgot to include ctype.h" in a more interesting way than that.

I'm sorry if it was not funny.

>> You might want to study the code posted by
>> Stefan Ram.

<snip>
>> He is undoubtedly ahead of you in matters of pure layout,

>
> What do you mean? Stylistically? In regards to program flow? In
> regards to object usage? His program and mine do different things,
> so...


Yes, stylistically. I find his highly non-standard layout very hard to
read.

>> but your use of a little-used statement in a position where it is both
>> suggestive and pointless is a stroke of genius!

>
> Are you referring to the keyword (not statement) 'register'?


No, as you later posted it was the continue at the bottom of every loop.

> If so,
> it's a storage-class specifier. While it suggests that access to
> so-specified objects "be as fast as possible," it also ensures that
> the address of a so-specified object cannot be taken. In this
> fashion, it is used in my code to prevent accidentally taking the
> address of the corresponding objects.


You see? I was sure you'd have a reason for that. Just chuckle at my
Luddite failure to perceive the value of it.

<snip>
>> As for the semantics, I thought the idea of keeping "one_past_end" one
>> greater than it needs to be (thereby toying, teasingly, with UB all the
>> time) was a very nice touch -- it's always nice to see a '+ 2' in a
>> string-walking loop.

>
> Why is it one greater than it needs to be? If 'c' is not the null
> terminator, then one past the end of the string is at least two
> characters away. That distance seems useful for the 'malloc'.


It's swings and roundabouts. You need to --diff to set the null and do
the memcpy and if you keep the pointer "nearer" you need to +1 when you
malloc. The latter is, at least, a common idiom. I.e.:

char c;
const char * cur_pos = string;
const char * one_past_end = string;
ptrdiff_t diff;
char * copy;

if (!chars) {
    while ((c = *cur_pos++))
        if (!isspace((unsigned char)c))
            one_past_end = cur_pos;
}
else {
    while ((c = *cur_pos++))
        if (!in_set(c, chars))
            one_past_end = cur_pos;
}
diff = one_past_end - string;
copy = malloc(diff + 1);
if (!copy)
    return NULL;

copy[diff] = '\0';
return memcpy(copy, string, diff);

Personally, I'd then change "diff" to "length". I've also moved the ++
of cur_pos to the loop, but if you don't like that you can go back to

while ((c = *cur_pos)) {
    if (!isspace((unsigned char)c))
        one_past_end = cur_pos + 1;
    ++cur_pos;
}

but it's then clear that the ++cur_pos could go above the "if" and thus
into the loop condition.
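
Since the surrounding function was snipped, here is a self-contained sketch of how the fragment could fit together. The function name rtrim_copy, its signature, and the strchr-based in_set are guesses made for illustration, not the snipped code. The two loops are folded into one, "diff" is renamed "length" as suggested, and the isspace argument is cast to unsigned char.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for in_set: nonzero if c occurs in set. */
static int in_set(char c, const char *set)
{
    return c != '\0' && strchr(set, c) != NULL;
}

/* Sketch: returns a malloc'd copy of string without its trailing
   trimmable characters. If chars is NULL, whitespace is trimmed;
   otherwise any character found in chars is trimmed. Returns NULL if
   allocation fails. Caller frees. */
char *rtrim_copy(const char *string, const char *chars)
{
    char c;
    const char *cur_pos = string;
    const char *one_past_end = string;
    size_t length;
    char *copy;

    assert(string != NULL);

    while ((c = *cur_pos++)) {
        int trimmable = chars ? in_set(c, chars)
                              : isspace((unsigned char)c);
        /* cur_pos already points just past c here. */
        if (!trimmable)
            one_past_end = cur_pos;
    }

    length = (size_t)(one_past_end - string);
    copy = malloc(length + 1);
    if (!copy)
        return NULL;

    copy[length] = '\0';
    return memcpy(copy, string, length);
}
```

With this layout, rtrim_copy("path///", "/") would give "path", and passing NULL for chars falls back to whitespace.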

<snip>

--
Ben.
 
 
Malcolm McLean
 
      02-11-2012
On Feb 11, 7:58 pm, Ben Bacarisse wrote:
> August Karlstrom writes:
> > On 2012-02-10 19:17, peter wrote:
> >> Hello C programmers,
> >> I was wondering does anybody knows how or is there
> >> a right trim string function available in C?

>
> > No, not in the standard library, but it is not hard to
> > implement. Remember that since strings are constant objects you first
> > need to copy the string to a character array.

>
> > #include <ctype.h>
> > #include <string.h>

>
> > void trim_right(char *s)
> > {
> >     int i;
> >
> >     i = strlen(s) - 1;
> >     while ((i >= 0) && isspace(s[i])) {
> >         i--;
> >     }
> >     s[i + 1] = '\0';
> > }

>
> Odd things can happen when strlen(s) == 0 because strlen(s) - 1 is a
> large positive number which may or may not fit into an int (it's worse
> if it does, though that is very unlikely). Of course, since the
> conversion to int is implementation defined, you may get -1 exactly as
> you expect, but that just means the problem might not show up in
> testing. You can avoid the problem by sticking to unsigned arithmetic:
>
> size_t i = strlen(s);
> while (i-- > 0 && isspace(s[i]))
>     /* do nothing */;
> s[i + 1] = '\0';
>
> Using unsigned wrap-around like this is a little unusual but it does
> work (unless I've messed-up of course).
>
> --
> Ben.


 
 
August Karlstrom
 
      02-11-2012
On 2012-02-11 20:58, Ben Bacarisse wrote:
> August Karlstrom writes:
>> #include <ctype.h>
>> #include <string.h>
>>
>> void trim_right(char *s)
>> {
>>     int i;
>>
>>     i = strlen(s) - 1;
>>     while ((i >= 0) && isspace(s[i])) {
>>         i--;
>>     }
>>     s[i + 1] = '\0';
>> }

>
> Odd things can happen when strlen(s) == 0 because strlen(s) - 1 is a
> large positive number which may or may not fit into an int (it's worse
> if it does, though that is very unlikely). Of course, since the
> conversion to int is implementation defined, you may get -1 exactly as
> you expect, but that just means the problem might not show up in
> testing.


Thanks for pointing that out. I somewhat carelessly assumed that strlen
returns an int.

> You can avoid the problem by sticking to unsigned arithmetic:
>
> size_t i = strlen(s);
> while (i-- > 0 && isspace(s[i]))
>     /* do nothing */;
> s[i + 1] = '\0';
>
> Using unsigned wrap-around like this is a little unusual but it does
> work (unless I've messed-up of course).


For empty strings and strings containing only whitespace you rely on two
unsigned integer wrap-arounds. Nothing wrong with that, though I find
the side-effectful loop guard hard to understand. Could you maybe read
its meaning aloud?
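
One way to read the guard is to rewrite it so that the condition has no side effects. The sketch below is meant to give the same result as the quoted loop without relying on wrap-around; the function name trim_right_explicit, the assert, and the unsigned char cast are additions for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Sketch: trims trailing whitespace in place, tracking the number of
   characters kept instead of the index of the last one. */
void trim_right_explicit(char *s)
{
    size_t len;

    assert(s != NULL);

    len = strlen(s);
    /* "While something is left and the last kept character is
       whitespace, drop that character." */
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        len--;
    s[len] = '\0';
}
```

Here len never goes below zero, so empty and all-whitespace strings need no special handling.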

Sometimes unsigned types create more problems than they solve so I would
probably just make sure that the string is not insanely long:

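#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <string.h>

void trim_right(char *s)
{
    size_t len;
    int i;

    len = strlen(s);
    assert(len <= INT_MAX);
    /* Convert before subtracting so that an empty string gives -1
       instead of wrapping around as a size_t. */
    i = (int)len - 1;
    while ((i >= 0) && isspace((unsigned char)s[i])) {
        i--;
    }
    s[i + 1] = '\0';
}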


August
 