Velocity Reviews

dj3vande@csclub.uwaterloo.ca.invalid 12-06-2007 11:24 PM

strlcpy and strlcat
 
I wrote these for a hobby project (I wanted to use them, but needed to
be able to build it on systems that don't have them), and it's probably
worth letting CLC rip them to shreds before I call them done.

They implement, modulo bugs, the behavior of the BSD strlcat and
strlcpy functions. Interestingly, the system I'm actually posting from
(SunOS 5.8) fails one of the tests at the bottom when it uses the
versions in the system library.

They're intended to be in the common subset of C90 and C99. Comments
on correctness and clarity are welcome; comments on style will be
tolerated.


dave


strl.h:
--------
#ifndef H_STRL
#define H_STRL

/*Implementation of BSD strlcat and strlcpy, for systems that don't have them.
Written by Dave Vandervies, December 2007.
Placed in the public domain; attribution is appreciated.
*/

#include <stddef.h> /*for size_t, so the header stands on its own*/

#ifdef __cplusplus
extern "C" { /*make C++ compilers play nicely with the linker*/
#endif

#ifndef HAS_STRLFUNCS

/*strlcpy copies a string from src to dest, creating a string at most
maxlen bytes long (including the '\0' terminator).
Returns the length of the string that would be created without
truncation, excluding the '\0' terminator. (So if the return value
is >= maxlen, the result was truncated.)
*/
size_t my_strlcpy(char *dest,const char *src,size_t maxlen);

/*strlcat appends the contents of src to dest, creating a string at
most maxlen bytes long (including the '\0' terminator).
If no '\0' is found within the first maxlen bytes of dest, the
contents of dest are not changed.
Returns the length of the string that would be created without
truncation, excluding the '\0' terminator, or maxlen+strlen(src)
if no '\0' is found within maxlen bytes of *dest. (So if the
return value is >= maxlen, the result was truncated.)
*/
size_t my_strlcat(char *dest,const char *src,size_t maxlen);

#ifndef CLC_PEDANTIC
#undef strlcpy
#define strlcpy my_strlcpy
#undef strlcat
#define strlcat my_strlcat
#endif /*CLC_PEDANTIC*/

#else /*HAS_STRLFUNCS*/
#include <string.h>
#endif /*HAS_STRLFUNCS*/

#ifdef __cplusplus
} /*close extern "C"*/
#endif

#endif /*H_STRL #include guard*/
--------
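
A minimal sketch of how a caller might use the return-value convention described in the header comments to detect truncation. Both names here are assumptions made for the example: `demo_strlcpy` is a hypothetical stand-in with strlcpy-style semantics (not the code from this post and not any system library's version), and `copy_checked` is an illustrative wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy-style semantics, included only so
   the example compiles on systems that lack strlcpy. */
static size_t demo_strlcpy(char *dest, const char *src, size_t maxlen)
{
    size_t src_len;

    assert(dest != NULL && src != NULL);
    src_len = strlen(src);
    if (maxlen > 0) {
        /* Copy as much as fits, always leaving room for the '\0'. */
        size_t n = (src_len < maxlen) ? src_len : maxlen - 1;
        memcpy(dest, src, n);
        dest[n] = '\0';
    }
    return src_len; /* length the untruncated result would have */
}

/* Returns 1 if src fit into dest completely, 0 if it was truncated. */
static int copy_checked(char *dest, const char *src, size_t maxlen)
{
    return demo_strlcpy(dest, src, maxlen) < maxlen;
}
```

The comparison is strictly less-than because the return value excludes the terminator: a return value equal to maxlen means the last character had to be dropped to make room for the '\0'.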


strl.c:
--------
#include <assert.h>
#include <string.h>

#include "strl.h"

/*Implementation of BSD strlcat and strlcpy, for systems that don't have them.
Written by Dave Vandervies, December 2007.
Placed in the public domain; attribution is appreciated.
*/

#ifndef HAS_STRLFUNCS

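size_t my_strlcpy(char *dest,const char *src,size_t maxlen)
{
    size_t len,needed;

#ifdef PARANOID
    assert(dest!=NULL);
    assert(src!=NULL);
#endif

    /*needed is the untruncated length, excluding the '\0'*/
    len=needed=strlen(src);
    if(maxlen == 0)
        return needed; /*no room even for the '\0'; write nothing*/
    if(len >= maxlen)
        len=maxlen-1;

    memcpy(dest,src,len);
    dest[len]='\0';

    return needed;
}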

size_t my_strlcat(char *dest,const char *src,size_t maxlen)
{
    size_t src_len,dst_len;
    size_t len,needed;

#ifdef PARANOID
    assert(dest!=NULL);
    assert(src!=NULL);
#endif

    src_len=strlen(src);
    /*Be paranoid about dest being a properly terminated string*/
    {
        char *end=memchr(dest,'\0',maxlen);
        if(!end)
            return maxlen+src_len;
        dst_len=end-dest;
    }

    len=needed=dst_len+src_len+1;
    if(len >= maxlen)
        len=maxlen-1;

    memcpy(dest+dst_len,src,len-dst_len);
    dest[len]='\0';

    return needed-1;
}

#endif /*!HAS_STRLFUNCS*/

#ifdef UNIT_TEST

#include <stdio.h>

/*
dj3vande@goofy:~/clc (0) $ gcc -W -Wall -ansi -pedantic -O -DUNIT_TEST -ostrl strl.c
dj3vande@goofy:~/clc (0) $ ./strl
strlcpy with truncation: Expect `hel'/5: `hel'/5
strlcat with truncation: Expect `help!'/9: `help!'/9
strlcpy without truncation: Expect `help!'/5: `help!'/5
strlcat without truncation: Expect `help!help!'/10: `help!help!'/10
strlcat with maxlen<strlen(dest): Expect `help!help!'/9: `help!help!'/9
dj3vande@goofy:~/clc (0) $
*/

int main(void)
{
    char buf1[256],buf2[256];
    unsigned long ret;

#ifdef HAS_STRLFUNCS
#define my_strlcpy strlcpy
#define my_strlcat strlcat
    printf("Using system library versions\n");
#endif

    ret=my_strlcpy(buf1,"hello",4);
    printf("strlcpy with truncation: Expect `hel'/5: `%s'/%lu\n",buf1,ret);

    ret=my_strlcat(buf1,"p!!!!!",6);
    printf("strlcat with truncation: Expect `help!'/9: `%s'/%lu\n",buf1,ret);

    ret=my_strlcpy(buf2,buf1,sizeof buf2);
    printf("strlcpy without truncation: Expect `help!'/5: `%s'/%lu\n",buf2,ret);

    ret=my_strlcat(buf2,buf1,sizeof buf2);
    printf("strlcat without truncation: Expect `help!help!'/10: `%s'/%lu\n",buf2,ret);

    ret=my_strlcat(buf2,buf1,4);
    printf("strlcat with maxlen<strlen(dest): Expect `help!help!'/9: `%s'/%lu\n",buf2,ret);

    return 0;
}

#endif /*UNIT_TEST*/
--------

CBFalconer 12-07-2007 01:40 AM

Re: strlcpy and strlcat
 
dj3vande@csclub.uwaterloo.ca.invalid wrote:
>
> I wrote these for a hobby project (I wanted to use them, but
> needed to be able to build it on systems that don't have them),
> and it's probably worth letting CLC rip them to shreds before I
> call them done.
>
> They implement, modulo bugs, the behavior of the BSD strlcat and
> strlcpy functions. Interestingly, the system I'm actually
> posting from (SunOS 5.8) fails one of the tests at the bottom
> when it uses the versions in the system library.
>
> They're intended to be in the common subset of C90 and C99.
> Comments on correctness and clarity are welcome; comments on
> style will be tolerated.


Take a look at:

<http://cbfalconer.home.att.net/download/strlcpy.zip>

They are written to be compact and avoid any further use of the
standard library. This improves their usefulness where memory is
tight. I notice yours uses calls to strlen.

--
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>
Try the download section.



--
Posted via a free Usenet account from http://www.teranews.com


dj3vande@csclub.uwaterloo.ca.invalid 12-08-2007 12:53 AM

Re: strlcpy and strlcat
 
In article <4758A495.A1D4DF9A@yahoo.com>,
CBFalconer <cbfalconer@maineline.net> wrote:

>Take a look at:
>
> <http://cbfalconer.home.att.net/download/strlcpy.zip>
>
>They are written to be compact and avoid any further use of the
>standard library. This improves their usefulness where memory is
>tight.


That's an entirely different environment than I was writing for; I was
targeting an environment where optimizers are aggressive, resources are
relatively cheap, and programmers' cognitive energy is the most
important thing to optimize.
(I would expect a good optimizer to generate code for mine that will be
slower but not by enough to show up outside of performance-critical
areas; a dumber or nonexistent optimizer will almost certainly generate
function calls that are at least as expensive as the code that they
replace on reasonable-length strings even if the library calls run
faster than the inline code.)

Interestingly, the OpenBSD implementation[1] also avoids calling the
rest of the standard library (both are completely self-contained), and
your implementation has the same difference from the BSD implementation
as the SunOS library (strlcat handles strlen(dest) > maxlen differently
- the BSD implementation is (and documents being) more paranoid about
dest not being properly terminated).


> I notice yours uses calls to strlen.


Also memchr and memcpy, which don't get inlined by GCC on x86 (strlen
does).



(Over 24 hours and only one response? I need to get my sigmonster set
up on this account, then at least Richard H will read my posts.)


dave

[1] <http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/string/strlcat.c?rev=1.13&content-type=text/x-cvsweb-markup>
<http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/string/strlcpy.c?rev=1.11&content-type=text/x-cvsweb-markup>


Tor Rustad 12-09-2007 03:30 AM

Re: strlcpy and strlcat
 
dj3vande@csclub.uwaterloo.ca.invalid wrote:

[...]

> (Over 24 hours and only one response? I need to get my sigmonster set
> up on this account, then at least Richard H will read my posts.)


I have seen your request. The main reason I haven't checked this more
deeply is that those strl* interfaces have, IMO, a design weakness,
which I eliminated in my own implementation.

I think you should put more effort into your test function, perhaps even
provide a self-test function with external linkage, or at least use
EXIT_FAILURE if a test case fails. Also, watching the output from
successful tests can be tiresome in a big project.
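
As a rough sketch of that suggestion (not code from any poster in this thread), a test driver could stay silent on success and turn a failure count into the exit status. The names `check_result` and `exit_status` are hypothetical helpers invented for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compare one strl* result against expectations.
   Prints only on mismatch and returns the number of failures (0 or 1). */
static int check_result(const char *label,
                        const char *got_str, size_t got_len,
                        const char *want_str, size_t want_len)
{
    assert(label != NULL && got_str != NULL && want_str != NULL);
    if (strcmp(got_str, want_str) == 0 && got_len == want_len)
        return 0;
    fprintf(stderr, "FAIL %s: expected `%s'/%lu, got `%s'/%lu\n",
            label, want_str, (unsigned long)want_len,
            got_str, (unsigned long)got_len);
    return 1;
}

/* Map a failure count to a process exit status. */
static int exit_status(int failures)
{
    return failures == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A driver would store each call's return value, add `check_result(...)` to a running failure count, and finish with `return exit_status(failures);` so that a build script can detect a failing test without reading the output.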

I would remove PARANOID; using assert() isn't paranoid. :) The
CLC_PEDANTIC is not needed: we do know these functions invade the
reserved C name space, but the C committee wouldn't use these names for
something different.

The usage of #ifdef's should be minimized in source files and kept
primarily in header files instead. Because of all these macros, the code
became harder to read than it should have been.

I will post another followup, if I get time to write a test function
tomorrow.

--
Tor <bwzcab@wvtqvm.vw | tr i-za-h a-z>

CBFalconer 12-09-2007 04:50 AM

Re: strlcpy and strlcat
 
dj3vande@csclub.uwaterloo.ca.invalid wrote:
> CBFalconer <cbfalconer@maineline.net> wrote:
>
>> Take a look at:
>>
>> <http://cbfalconer.home.att.net/download/strlcpy.zip>
>>
>> They are written to be compact and avoid any further use of the
>> standard library. This improves their usefulness where memory
>> is tight.

>
> That's an entirely different environment than I was writing for;
> I was targeting an environment where optimizers are aggressive,
> resources are relatively cheap, and programmers' cognitive energy
> is the most important thing to optimize.


The environmental capability is simply an added feature. It also
avoids non-productive time spent executing calls and returns. Note
that the code is pure standard C.

.... snip ...
>
> Interestingly, the OpenBSD implementation[1] also avoids calling
> the rest of the standard library (both are completely
> self-contained), and your implementation has the same difference
> from the BSD implementation as the SunOS library (strlcat handles
> strlen(dest) > maxlen differently - the BSD implementation is
> (and documents being) more paranoid about dest not being properly
> terminated).


Please explain more fully. I don't believe my coding can ever
leave an improperly terminated string. Please tell me what you
find objectionable (or missing) in the test results (copy
following).

Testing lgh = stringop(dest, source, sz)

dest          source           opn  sz  lgh  result
====          ======           ===  ==  ===  ======
""            "string1"        cpy  10    7  "string1"
""            "string1"        cpy   5    7  "stri"
""            "string1"        cpy   1    7  ""
"string1"     "string1"        cat  10   14  "string1st"
"string1st"   "x "             cpy  10    2  "x "
"x "          "string1"        cat  10    9  "x string1"
"x string1"   "x "             cpy  10    2  "x "
"x "          "string1"        cat   0    9  "x "
"x "          "string1"        cpy   0    7  "x "
"x "          "longer string"  cat   0   15  "x "
"x "          "(NULL)"         cpy  10    0  ""
""            "x "             cpy  10    2  "x "
"x "          "(NULL)"         cat  10    2  "x "

--
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>
Try the download section.





dj3vande@csclub.uwaterloo.ca.invalid 12-09-2007 06:32 AM

Re: strlcpy and strlcat
 
In article <475B7415.4D8A13A4@yahoo.com>,
CBFalconer <cbfalconer@maineline.net> wrote:
>dj3vande@csclub.uwaterloo.ca.invalid wrote:


>> Interestingly, the OpenBSD implementation[1] also avoids calling
>> the rest of the standard library (both are completely
>> self-contained), and your implementation has the same difference
>> from the BSD implementation as the SunOS library (strlcat handles
>> strlen(dest) > maxlen differently - the BSD implementation is
>> (and documents being) more paranoid about dest not being properly
>> terminated).

>
>Please explain more fully.


If the dest argument to strlcat does not in fact point to a correctly
terminated string, the BSD implementation will stop looking for a '\0'
after maxlen bytes. This avoids walking through large amounts of
memory (only to read - it wouldn't be written in any case) when it's
given bad input.
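
A small sketch of the bounded scan described here, assuming nothing beyond standard C. `bounded_len` is a hypothetical helper name; it is similar in spirit to POSIX strnlen and uses the same memchr approach as my_strlcat in the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the string at s, but never examining
   more than maxlen bytes. Returns maxlen if no '\0' is found in range,
   which the caller can treat as "dest is not a terminated string that
   fits in the buffer". */
static size_t bounded_len(const char *s, size_t maxlen)
{
    const char *end;

    assert(s != NULL);
    end = memchr(s, '\0', maxlen);
    return end ? (size_t)(end - s) : maxlen;
}
```

An unbounded strlen on the same input would keep reading past the buffer until it happened to hit a zero byte or faulted, which is the difference in behavior under discussion.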

> I don't believe my coding can ever
>leave an improperly terminated string.


If the inputs are well-formed neither implementation will ever create
an improperly terminated string.


dave


dj3vande@csclub.uwaterloo.ca.invalid 12-09-2007 06:36 AM

Re: strlcpy and strlcat
 
In article <fjg25n$r0u$1@rumours.uwaterloo.ca>,
<dj3vande@csclub.uwaterloo.ca.invalid> wrote:

>If the inputs are well-formed neither implementation will ever create
>an improperly terminated string.


On second thought, the conditional is irrelevant there; neither
implementation will ever change the contents of memory UNLESS the
inputs are well-formed, in which case the new contents of the
destination buffer will be a properly terminated string.


dave
(needs coffee, or sleep, or both)


Richard Heathfield 12-09-2007 07:39 AM

Re: strlcpy and strlcat
 
dj3vande@csclub.uwaterloo.ca.invalid said:

<snip>

> (Over 24 hours and only one response? I need to get my sigmonster set
> up on this account, then at least Richard H will read my posts.)


You know me too well, Dave.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

CBFalconer 12-09-2007 03:17 PM

Re: strlcpy and strlcat
 
dj3vande@csclub.uwaterloo.ca.invalid wrote:
> CBFalconer <cbfalconer@maineline.net> wrote:
>> dj3vande@csclub.uwaterloo.ca.invalid wrote:

>
>>> Interestingly, the OpenBSD implementation[1] also avoids calling
>>> the rest of the standard library (both are completely
>>> self-contained), and your implementation has the same difference
>>> from the BSD implementation as the SunOS library (strlcat handles
>>> strlen(dest) > maxlen differently - the BSD implementation is
>>> (and documents being) more paranoid about dest not being properly
>>> terminated).

>>
>> Please explain more fully.

>
> If the dest argument to strlcat does not in fact point to a correctly
> terminated string, the BSD implementation will stop looking for a '\0'
> after maxlen bytes. This avoids walking through large amounts of
> memory (only to read - it wouldn't be written in any case) when it's
> given bad input.


I would argue that my technique is better. It will normally cause
an immediate fault during the call, which should leave traces as to
the cause, and be repairable. IIRC I did this deliberately. Note
that a NULL description of src is considered an empty string.

--
Chuck F (cbfalconer at maineline dot net)
<http://cbfalconer.home.att.net>
Try the download section.





dj3vande@csclub.uwaterloo.ca.invalid 12-10-2007 08:58 PM

Re: strlcpy and strlcat
 
In article <uIednRoDJdfc_MbaRVnzvQA@telenor.com>,
Tor Rustad <tor_rustad@hotmail.com> wrote:
>dj3vande@csclub.uwaterloo.ca.invalid wrote:
>
>[...]
>
>> (Over 24 hours and only one response? I need to get my sigmonster set
>> up on this account, then at least Richard H will read my posts.)

>
>I have seen your request. The main reason I haven't checked this more
>deeply is that those strl* interfaces have, IMO, a design weakness,
>which I eliminated in my own implementation.


Out of curiosity, what is that design weakness, and how did you fix
it?


>I think you should put more effort into your test function,


Probably. The one I have was intended as a quick sanity check to make
sure nothing was obviously wrong, not an exhaustive test of all the
boundary cases.
(I tend to rely, sometimes too much, on careful design and desk-checks,
and use code tests mostly to make sure I haven't missed something
obvious rather than to try everything that could go wrong.)


>I would remove PARANOID, using assert() isn't paranoid. :)


I'm used to using PARANOID to control consistency checks that can get
expensive. (I try to remember to build with them turned on when I
write them to make sure they're correct, but otherwise they don't get
activated unless I'm trying to debug something.)
In this case using it to turn off the asserts is probably overdoing it;
an assert isn't nearly as expensive as, say, walking through a binary
tree to make sure it's ordered the way I expect it to be.

> The
>CLC_PEDANTIC is not needed, we do know these functions invade the
>reserved C name space, but the C committee wouldn't use these names for
>something different.


I actually put that in when I first wrote it, and was kind of surprised
to see it when I looked over the code after I decided to post it. :)


dave



All times are GMT.