Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Concatenating strings

Reply
Thread Tools

Concatenating strings

 
 
Big Brother
Guest
Posts: n/a
 
      09-02-2007
I've been thinking about the seemingly simple task of writing a
va_arg-type function to concatenate arbitarily many strings.

My first thoughts violated the following principles:
1) never calculate the length of a string more than once;
2) never invoke realloc() multiple times if you can make do with a
single malloc().

As a result, I came up with the code below. This avoids the two pitfalls
above, but at the expense of needing three passes through the argument
list, which seems a bit clumsy.

Does anyone have any thoughts on the rights and wrongs of this?


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>

/* Returns the concatenation of the strings supplied as arguments. The
* final argument must be NULL.
* Caller should free() the returned pointer when done.
* In case of allocation failure, NULL is returned.
*/
char *concat(char *s, ...)
{
va_list ap;
unsigned n=1;
size_t *len, total=0;
char *t, *r;

/* pass 1: count arguments */
for(va_start(ap, s); va_arg(ap,char *); n++)
;

/* pass 2: get lengths */
if(!(len=malloc(n * sizeof *len)))
return NULL;
va_start(ap, s);
len[n=0]=strlen(s);
while(t=va_arg(ap, char *))
total+=(len[++n]=strlen(t));
va_end(ap);

/* pass 3: copy strings */
if(r=malloc(total+1)) {
memcpy(r, s, total = *len);
n=1;
for(va_start(ap, s) ; t=va_arg(ap, char *) ; total+=len[n++])
memcpy(r+total, t, len[n]);
va_end(ap);
}
free(len);
return r;
}

int main()
{
char *s;
s=concat("hello ", "world", " and ", "everyone", " in it!", NULL);
puts(s);
free(s);
}

 
Reply With Quote
 
 
 
 
Big Brother
Guest
Posts: n/a
 
      09-02-2007
On 2 Sep 2007 at 15:22, Big Brother wrote:
....

> /* pass 1: count arguments */
> for(va_start(ap, s); va_arg(ap,char *); n++)
> ;


Oops,should be a va_end(ap); here, of course.

>
> /* pass 2: get lengths */


....

 
Reply With Quote
 
 
 
 
Eric Sosman
Guest
Posts: n/a
 
      09-02-2007
Big Brother wrote:
> I've been thinking about the seemingly simple task of writing a
> va_arg-type function to concatenate arbitarily many strings.
>
> My first thoughts violated the following principles:
> 1) never calculate the length of a string more than once;
> 2) never invoke realloc() multiple times if you can make do with a
> single malloc().
>
> As a result, I came up with the code below. This avoids the two pitfalls
> above, but at the expense of needing three passes through the argument
> list, which seems a bit clumsy.
>
> Does anyone have any thoughts on the rights and wrongs of this?


My preference would be to give up on #1, saving one pass
over the arguments and eliminating the extra malloc()/free().
(You could do it in two passes without the extra space by
building the output string with repeated strcat(), but that
obeys #1 in name only.)

Other comments in-line.

> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <stdarg.h>
>
> /* Returns the concatenation of the strings supplied as arguments. The
> * final argument must be NULL.


Since you retrieve that argument as a char*, it must be
supplied as (char*)NULL -- an unadorned NULL is incorrect.

> * Caller should free() the returned pointer when done.
> * In case of allocation failure, NULL is returned.
> */
> char *concat(char *s, ...)


It would be nice to const-qualify the named argument.

> {
> va_list ap;
> unsigned n=1;
> size_t *len, total=0;


Stylistic observation: The code will be easier to read
(and harder to botch) if you keep all the components of a
loop close together. Don't initialize these variables at
the very start of the function; initialize them just before
you use them.

> char *t, *r;
>
> /* pass 1: count arguments */
> for(va_start(ap, s); va_arg(ap,char *); n++)
> ;


You need a va_end(ap) here. Also, you've not taken
care of the boundary case where s == NULL.

> /* pass 2: get lengths */
> if(!(len=malloc(n * sizeof *len)))
> return NULL;
> va_start(ap, s);
> len[n=0]=strlen(s);
> while(t=va_arg(ap, char *))
> total+=(len[++n]=strlen(t));
> va_end(ap);


Stylistic observation: Brevity is the soul of wit, but
parsimony is the enemy of clarity. There are no medals given
for packing the greatest amount of activity into the fewest
number of characters. Relax, spread yourself out a little,
let your code do one thing at a time and do it clearly --
you might even <gasp!> go so far as to introduce a couple
of additional variables.

> /* pass 3: copy strings */
> if(r=malloc(total+1)) {
> memcpy(r, s, total = *len);
> n=1;
> for(va_start(ap, s) ; t=va_arg(ap, char *) ; total+=len[n++])
> memcpy(r+total, t, len[n]);
> va_end(ap);
> }
> free(len);
> return r;
> }
>
> int main()
> {
> char *s;
> s=concat("hello ", "world", " and ", "everyone", " in it!", NULL);


... (char*)NULL); as mentioned above.

> puts(s);


Undefined behavior if concat() returns NULL, which its
documentation says it might do.

> free(s);
> }


Falling off the end of main() without returning a value
is an abomination. The fact that the abomination has received
the blessing of C99 doesn't make it less abominable.

--
Eric Sosman
http://www.velocityreviews.com/forums/(E-Mail Removed)lid
 
Reply With Quote
 
Francine.Neary@googlemail.com
Guest
Posts: n/a
 
      09-02-2007
On Sep 2, 4:57 pm, Eric Sosman <(E-Mail Removed)> wrote:
> Other comments in-line.


Couple more bugs...

>
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <string.h>
> > #include <stdarg.h>

>
> > /* Returns the concatenation of the strings supplied as arguments. The
> > * final argument must be NULL.

>
> Since you retrieve that argument as a char*, it must be
> supplied as (char*)NULL -- an unadorned NULL is incorrect.
>
> > * Caller should free() the returned pointer when done.
> > * In case of allocation failure, NULL is returned.
> > */
> > char *concat(char *s, ...)

>
> It would be nice to const-qualify the named argument.
>
> > {
> > va_list ap;
> > unsigned n=1;
> > size_t *len, total=0;

>
> Stylistic observation: The code will be easier to read
> (and harder to botch) if you keep all the components of a
> loop close together. Don't initialize these variables at
> the very start of the function; initialize them just before
> you use them.


And in fact, there's a bug exactly at this point - the initialization
of total is wrong: ...

>
> > char *t, *r;

>
> > /* pass 1: count arguments */
> > for(va_start(ap, s); va_arg(ap,char *); n++)
> > ;

>
> You need a va_end(ap) here. Also, you've not taken
> care of the boundary case where s == NULL.
>
> > /* pass 2: get lengths */
> > if(!(len=malloc(n * sizeof *len)))
> > return NULL;
> > va_start(ap, s);
> > len[n=0]=strlen(s);


....replace this with total=len[n=0]=strlen(s);

> > while(t=va_arg(ap, char *))
> > total+=(len[++n]=strlen(t));
> > va_end(ap);

>
> Stylistic observation: Brevity is the soul of wit, but
> parsimony is the enemy of clarity. There are no medals given
> for packing the greatest amount of activity into the fewest
> number of characters. Relax, spread yourself out a little,
> let your code do one thing at a time and do it clearly --
> you might even <gasp!> go so far as to introduce a couple
> of additional variables.
>
>
>
> > /* pass 3: copy strings */
> > if(r=malloc(total+1)) {
> > memcpy(r, s, total = *len);
> > n=1;
> > for(va_start(ap, s) ; t=va_arg(ap, char *) ; total+=len[n++])
> > memcpy(r+total, t, len[n]);
> > va_end(ap);
> > }
> > free(len);
> > return r;


Oops! You forgot to null-terminate r.

> > }

>
> > int main()
> > {
> > char *s;
> > s=concat("hello ", "world", " and ", "everyone", " in it!", NULL);

>
> ... (char*)NULL); as mentioned above.
>
> > puts(s);

>
> Undefined behavior if concat() returns NULL, which its
> documentation says it might do.
>
> > free(s);
> > }

>
> Falling off the end of main() without returning a value
> is an abomination. The fact that the abomination has received
> the blessing of C99 doesn't make it less abominable.
>
> --
> Eric Sosman
> (E-Mail Removed)


 
Reply With Quote
 
Malcolm McLean
Guest
Posts: n/a
 
      09-02-2007

"Big Brother" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> I've been thinking about the seemingly simple task of writing a
> va_arg-type function to concatenate arbitarily many strings.
>
> My first thoughts violated the following principles:
> 1) never calculate the length of a string more than once;
> 2) never invoke realloc() multiple times if you can make do with a
> single malloc().
>
> As a result, I came up with the code below. This avoids the two pitfalls
> above, but at the expense of needing three passes through the argument
> list, which seems a bit clumsy.
>
> Does anyone have any thoughts on the rights and wrongs of this?
>
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <stdarg.h>
>
> /* Returns the concatenation of the strings supplied as arguments. The
> * final argument must be NULL.
> * Caller should free() the returned pointer when done.
> * In case of allocation failure, NULL is returned.
> */
> char *concat(char *s, ...)
> {
> va_list ap;
> unsigned n=1;
> size_t *len, total=0;
> char *t, *r;
>
> /* pass 1: count arguments */
> for(va_start(ap, s); va_arg(ap,char *); n++)
> ;
>
> /* pass 2: get lengths */
> if(!(len=malloc(n * sizeof *len)))
> return NULL;
> va_start(ap, s);
> len[n=0]=strlen(s);
> while(t=va_arg(ap, char *))
> total+=(len[++n]=strlen(t));
> va_end(ap);
>

Here you seem to have forgotten the first string in calculating total.
This was hard to spot, because of all the fancy assignments in expressions.

> /* pass 3: copy strings */
> if(r=malloc(total+1)) {
> memcpy(r, s, total = *len);
> n=1;
> for(va_start(ap, s) ; t=va_arg(ap, char *) ; total+=len[n++])
> memcpy(r+total, t, len[n]);
> va_end(ap);
> }
>

No terminating nul. Also, aren't you copying string t on the first pass?
That's set to NULL at this point.
>
> free(len);
> return r;
> }
>
> int main()
> {
> char *s;
> s=concat("hello ", "world", " and ", "everyone", " in it!", NULL);
> puts(s);
> free(s);
> }
>


 
Reply With Quote
 
SM Ryan
Guest
Posts: n/a
 
      09-03-2007
Big Brother <(E-Mail Removed)> wrote:
# I've been thinking about the seemingly simple task of writing a
# va_arg-type function to concatenate arbitarily many strings.
#
# My first thoughts violated the following principles:
# 1) never calculate the length of a string more than once;
# 2) never invoke realloc() multiple times if you can make do with a
# single malloc().

You're free to do whatever you want, but as far as practical code,
if you realloc by a constant factor (say double the string length
every realloc) instead of by a constant amount, the running time
is linear in the the length of the strings for any malloc you're
likely to encounter.

--
SM Ryan http://www.rawbw.com/~wyrmwif/
Death is the worry of the living. The dead, like myself,
only worry about decay and necrophiliacs.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
concatenating strings EHC Python 3 12-16-2006 12:07 AM
Concatenating strings John Henry Python 1 07-01-2006 03:38 AM
Concatenating strings John Henry Python 1 07-01-2006 03:36 AM
adding strings, not concatenating them Stan Horwitz Perl 2 02-15-2006 08:38 PM
Problems concatenating strings stroker_ace@hotmail.com C++ 11 04-08-2005 03:32 PM



Advertisments