Replacing NULLS with space (C strings)

 
 
peter
Guest
Posts: n/a
 
      02-09-2012
In fact, I want to remove all NULLS and EOFs (0x1a)
from a string then replace them all with spaces. The way I do it
now is by using a for() loop:

for(temp=0;temp<=strlen(buffer);temp++)
{
if(buffer[temp]== '\0' || buffer[temp]==0x1A)
{buffer[temp]=' ';}
}

Is there a faster / more efficient way of doing this?
 
 
 
 
 
James Kuyper
Guest
Posts: n/a
 
      02-09-2012
On 02/09/2012 03:19 PM, peter wrote:
> In fact, I want to remove all NULLS and EOFs (0x1a)


EOF is a macro defined in <stdio.h>. It's required to have a negative
value, which 0x1A does not, so they can't be the same. EOF very
commonly, though not universally, has a value of -1.

There have been systems where 0x1A was used to indicate the end of a
file. However, such systems are far from universal. I'd recommend making
sure that this value is indeed being used that way in all of the
contexts in which you want to use this code.
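
A minimal sketch of that distinction, assuming a hosted C implementation. The helper names is_ctrl_z and count_ctrl_z are hypothetical, made up for illustration. The point is that EOF is a negative int reported by functions such as getc(), while 0x1A is an ordinary character value that can appear in the data:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: true if c is the Ctrl-Z character (0x1A). */
int is_ctrl_z(int c)
{
    return c == 0x1A;
}

/* Hypothetical helper: count the 0x1A bytes in a stream.
   Only EOF (a negative value) ends the loop; a 0x1A byte is just data. */
long count_ctrl_z(FILE *fp)
{
    long n = 0;
    int c;                      /* int, not char, so that EOF is representable */

    assert(fp != NULL);
    while ((c = getc(fp)) != EOF)
        if (is_ctrl_z(c))
            n++;
    return n;
}
```

On systems that treat Ctrl-Z as an end-of-file marker for text streams, the library typically stops reading at that byte and then reports EOF; the byte value itself still never compares equal to EOF.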

> from a string then replace them all with spaces. The way I do it
> now is by using a for() loop:
>
> for(temp=0;temp<=strlen(buffer);temp++)
> {
> if(buffer[temp]== '\0' || buffer[temp]==0x1A)
> {buffer[temp]=' ';}
> }
>
> Is there a faster / more efficient way of doing this?


By definition, strlen(buffer) gives you the offset of the very first
null character in buffer (or, if there is none, it keeps searching past
the end of buffer until it finds one; this often results in memory
access violations - make VERY sure that your buffer is in fact null
terminated before calling strlen). Therefore, there's no point in
checking for null characters before you reach the end of the loop; and
you're guaranteed to find one once you reach that end.

I suspect that you have some kind of misunderstanding, that led you to
think that your code could find a null character in some other locations
as well. However, for the rest of this message I'll assume you intended
it to handle null characters exactly the way it actually does.

You have strlen() scanning sequentially through buffer looking for the
first null character, and then you have your for loop scanning
sequentially through buffer looking for null characters and 0x1A. Why
not do it in a single pass?

Your code sets the final terminating null character to blank. This
guarantees that strlen(buffer) can no longer be used to tell you where
that character used to be. If you're planning to do anything further
with that portion of buffer, you'd better do something to keep track of
where it ends.

You don't say what the element type of buffer is; I'll assume it's char;
make appropriate adjustments below if it's something else.

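char *p;    /* declared outside the loop so it is still in scope afterwards */
for(p = buffer; *p; p++)
    if(*p == 0x1A)
        *p = ' ';

*p++ = ' ';    /* overwrite the terminating null, as the original code does */
ptrdiff_t length = p - buffer;    /* ptrdiff_t is declared in <stddef.h> */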
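
The same single pass can be wrapped in a function so that the length is handed back to the caller. This is only a sketch, assuming buffer is a char array that really is null terminated; the function name is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Replace every 0x1A in the string with a space, and also overwrite the
   terminating null character with a space (as the original loop did).
   Returns the number of characters processed, including that final one,
   because strlen() can no longer find the old end afterwards. */
ptrdiff_t blank_ctrl_z_and_terminator(char *buffer)
{
    char *p;

    assert(buffer != NULL);
    for (p = buffer; *p; p++)
        if (*p == 0x1A)
            *p = ' ';

    *p++ = ' ';                 /* the old terminator becomes a space */
    return p - buffer;
}
```

The caller has to keep the returned length, since the buffer is no longer guaranteed to contain a terminator.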
 
 
 
 
 
Anders Wegge Keller
Guest
Posts: n/a
 
      02-09-2012
peter <(E-Mail Removed)> writes:

> In fact, I want to remove all NULLS and EOFs (0x1a)
> from a string then replace them all with spaces. The way I do it
> now is by using a for() loop:
>
> for(temp=0;temp<=strlen(buffer);temp++)
> {
> if(buffer[temp]== '\0' || buffer[temp]==0x1A)
> {buffer[temp]=' ';}
> }


> Is there a faster / more efficient way of doing this?


strlen(buffer) will return the offset of the first '\0' encountered,
so the code above doesn't make that much sense. Also, it is not very
efficient to call strlen for each iteration of the loop. Especially
with pathological code like this, the compiler will be unable to
optimize the repeated calls away, as you are modifying the object you
are giving as argument.

Either call strlen once and use that result in the entire loop:

len = strlen (buffer);
for (temp = 0 ; temp < len ; temp++) {
    if (buffer[temp] == 0x1a) { buffer[temp] = ' '; }
}

Or skip the strlen call entirely, and check for end of string at the
same time as check for modification:

temp = 0;

while (buffer[temp]) {
    if (buffer[temp] == 0x1a) { buffer[temp] = ' '; }
    temp++;
}

--
/Wegge

Leder efter redundant peering af dk.*,linux.debian.*
 
 
James Kuyper
Guest
Posts: n/a
 
      02-09-2012
On 02/09/2012 04:04 PM, Anders Wegge Keller wrote:
> peter <(E-Mail Removed)> writes:
>
>> In fact, I want to remove all NULLS and EOFs (0x1a)
>> from a string then replace them all with spaces. The way I do it
>> now is by using a for() loop:
>>
>> for(temp=0;temp<=strlen(buffer);temp++)
>> {
>> if(buffer[temp]== '\0' || buffer[temp]==0x1A)
>> {buffer[temp]=' ';}
>> }

>
>> Is there a faster / more efficient way of doing this?

>
> strlen(buffer) will return the offset of the first '\0' encountered,
> so the code above doesn't make that much sense. Also, it is not very
> efficient to call strlen for each iteration of the loop.


I didn't notice that - that's embarrassing (not as embarrassing as
having written such code, but close). It's worse than merely being
horrendously inefficient; with the terminating null character being
replaced with ' ' inside the loop, followed by immediate recalculation
of the length of the supposedly null-terminated string, the loop will
never terminate until something goes very badly wrong (and possibly not
even then).
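
The runaway can be observed safely by adding a bounds guard that the original loop does not have. A sketch only, assuming the full capacity of the array is known; the function name and the cap parameter are hypothetical. Without the guard, the loop reads and writes past the end of the array, which is undefined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Runs the original loop, but stops before index cap - 1 so that the last
   byte of the array is never overwritten and strlen() stays in bounds.
   Returns how many iterations ran. */
size_t run_original_loop_guarded(char *buffer, size_t cap)
{
    size_t temp;

    assert(buffer != NULL && cap > 0);
    buffer[cap - 1] = '\0';     /* guard terminator; never replaced below */

    for (temp = 0; temp <= strlen(buffer) && temp < cap - 1; temp++) {
        if (buffer[temp] == '\0' || buffer[temp] == 0x1A)
            buffer[temp] = ' ';
    }
    return temp;
}
```

Called on a 16-byte array holding "hi", the loop does not stop after the two characters: each time it blanks a terminator, strlen() reports a longer string, so it only stops when the added guard cuts it off.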

 
 
John Gordon
Guest
Posts: n/a
 
      02-09-2012
In <jh19o6$qnl$(E-Mail Removed)> peter <(E-Mail Removed)> writes:

> In fact, I want to remove all NULLS and EOFs (0x1a)
> from a string then replace them all with spaces. The way I do it
> now is by using a for() loop:


> for(temp=0;temp<=strlen(buffer);temp++)
> {
> if(buffer[temp]== '\0' || buffer[temp]==0x1A)
> {buffer[temp]=' ';}
> }


C strings are terminated by a NULL character. Therefore, by definition,
you won't find any NULLs in the string itself.

--
John Gordon                   A is for Amy, who fell down the stairs
(E-Mail Removed)              B is for Basil, assaulted by bears
-- Edward Gorey, "The Gashlycrumb Tinies"

 
 
Keith Thompson
Guest
Posts: n/a
 
      02-09-2012
peter <(E-Mail Removed)> writes:
> In fact, I want to remove all NULLS and EOFs (0x1a)
> from a string then replace them all with spaces. The way I do it
> now is by using a for() loop:
>
> for(temp=0;temp<=strlen(buffer);temp++)
> {
> if(buffer[temp]== '\0' || buffer[temp]==0x1A)
> {buffer[temp]=' ';}
> }
>
> Is there a faster / more efficient way of doing this?


There's probably no faster way than a for loop, but yours can be
improved considerably by not calling strlen() on each iteration.
strlen() has to scan the entire string, and you're doing that once for
each character.

Also, the correct condition is "<", not "<=". For example if the
string's value is "hello", then strlen() returns 5, but you want to
check positions 0 through 4.

const size_t len = strlen(buffer);
for (i = 0; i < len; i ++) {
...
}

And some terminology issues. NULL is (a macro that expands to)
a null *pointer* constant; the null character is better referred
to as NUL, or just '\0'. (Yes, some character set standards do
call it NULL, but using that name can be confusing.)

And EOF is a macro that expands to a negative integer constant
expression, typically (-1). 0x1A is the control-Z character,
which is used on some systems to indicate an end-of-file condition.

Finally, strlen() searches for the '\0' character that marks the end
of a string. If your buffer might have multiple '\0' characters in
it, then it isn't a string, and you should use some other technique
to determine how long it is (or how long the relevant portion of
it is).
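
One such technique is to carry the length alongside the buffer instead of asking strlen() for it. A minimal sketch, assuming the caller already knows how many bytes are valid (for example from the return value of fread()); the function name and the len parameter are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Replace every null character and every 0x1A in the first len bytes of
   buf with a space. len is supplied by the caller; strlen() is not used
   because the data may contain embedded null characters.
   Returns the number of bytes replaced. */
size_t blank_nul_and_ctrl_z(char *buf, size_t len)
{
    size_t i, replaced = 0;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == '\0' || buf[i] == 0x1A) {
            buf[i] = ' ';
            replaced++;
        }
    }
    return replaced;
}
```

Passing a len that stops one short of a final terminator leaves that terminator in place, so the result can still be used as a string afterwards.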

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
 
Keith Thompson
Guest
Posts: n/a
 
      02-09-2012
John Gordon <(E-Mail Removed)> writes:
> In <jh19o6$qnl$(E-Mail Removed)> peter <(E-Mail Removed)> writes:
>> In fact, I want to remove all NULLS and EOFs (0x1a)
>> from a string then replace them all with spaces. The way I do it
>> now is by using a for() loop:

>
>> for(temp=0;temp<=strlen(buffer);temp++)
>> {
>> if(buffer[temp]== '\0' || buffer[temp]==0x1A)
>> {buffer[temp]=' ';}
>> }

>
> C strings are terminated by a NULL character. Therefore, by definition,
> you won't find any NULLs in the string itself.


Null is (a macro that expands to) a null *pointer* constant.

 
 
Keith Thompson
Guest
Posts: n/a
 
      02-09-2012
Keith Thompson <(E-Mail Removed)> writes:
> John Gordon <(E-Mail Removed)> writes:
>> In <jh19o6$qnl$(E-Mail Removed)> peter <(E-Mail Removed)> writes:
>>> In fact, I want to remove all NULLS and EOFs (0x1a)
>>> from a string then replace them all with spaces. The way I do it
>>> now is by using a for() loop:

>>
>>> for(temp=0;temp<=strlen(buffer);temp++)
>>> {
>>> if(buffer[temp]== '\0' || buffer[temp]==0x1A)
>>> {buffer[temp]=' ';}
>>> }

>>
>> C strings are terminated by a NULL character. Therefore, by definition,
>> you won't find any NULLs in the string itself.

>
> Null is (a macro that expands to) a null *pointer* constant.


I meant to type NULL, of course.

 
 
Malcolm McLean
Guest
Posts: n/a
 
      02-09-2012
On Feb 9, 8:19 pm, peter <(E-Mail Removed)> wrote:
> In fact, I want to remove all NULLS and EOFs (0x1a)
> from a string then replace them all with spaces. The way I do it
> now is by using a for() loop:
>
>  for(temp=0;temp<=strlen(buffer);temp++)
>    {
>     if(buffer[temp]== '\0' || buffer[temp]==0x1A)
>       {buffer[temp]=' ';}
>    }
>
> Is there a faster / more efficient way of doing this?
>

Yes.

Get the length of the data in the buffer. Only you can do that.
Probably you want to exclude the last terminating nul from the
replacement, but maybe not, depending on how you're going to use the
data. You might even need to add a nul.


Then just do this.

len = data_length_got_somehow;
for(i=0;i<len;i++)
    if(buffer[i] == 0 || buffer[i] == 0x1a)
        buffer[i] = ' ';
/* possibly you need to do this, but make sure that buffer is one
   bigger than len */
buffer[i] = 0;

If you call strlen() in the for control statement, the length of the
string will be recalculated on each iteration, which is slow. Also,
since you want to replace nuls, it's a bug.
--
Visit my website
http://www.malcolmmclean.site11.com/www
 
 
Malcolm McLean
Guest
Posts: n/a
 
      02-10-2012
On Feb 10, 12:22 am, pete <(E-Mail Removed)> wrote:
>
> By definition, a string includes a null character.
>
> ISO/IEC 9899:201x Committee Draft — April 12, 2011 N1570
> 7. Library
> 7.1 Introduction
> 7.1.1 Definitions of terms
> 1     A string is a contiguous sequence of characters
>       terminated by and including the first null character.
>

In ANSI C terminology. That's so that they can use the term "string"
in describing library functions without constantly having to specify
that it must be nul-terminated.
However the strings in your C program may not be nul-terminated.
 
 
 
 