Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Partial string loss with sprintf/strcat

Reply
Thread Tools

Partial string loss with sprintf/strcat

 
 
Neal Barney
Guest
Posts: n/a
 
      07-22-2006
I have a C program which runs on a device using a Zilog Z180
microprocessor. While it can address 1MB of RAM, it can only address
64KB at any given time. And of that only 16KB can be used for "stack
and heap space". So I'm running in a very memory constricted
environment.

The program "speaks" a proprietary protocol which sends ASCII strings
back and forth from the device to the server. Within the past couple of
months we've been noticing errors in our server logs and people have
been complaing of failures. Upon viewing server logs we've noticed the
string the device is sending is missing it's header.

The string gets built like so:

/* Global variable */
char Message[512];

/* Local variable */
char tmpMsg[512];

/* var1 - var4 (local variables) get populated with values... */

/* Build the header */
sprintf( tmpMsg, "|XX|%s|0|00.00|XXXX:XXXX|XXXX:%s|XXXX:%s|XXXX:%s" ,
var1, var2, var3, var4 );

/* X's are alphas and zeros are numbers */

if ( conditionTrue )
strcat( tmpMsg, ",S" );
else
strcat( tmpMsg, ",H" );

/* Continue building the rest of the string... */

/* Put msg length at beginning of string*/
len = strlen( tmpMsg );
sprintf( Message, "%04d%s", len, tmpMsg );

return;


Now when the device transmits the string to the server, the server logs
something similar to:

0089,S|XXXX:XXXX|XXXX:XXXX|... (and so on)
or sometimes something like
0093 ,S|XXXX:XXXX|XXXX:XXXX|... (and so on)

The entire header appears to be getting lost. The calculated message
length matches the size of the string without the header, which may
suggest:

a) The sprintf that places the header in tmpMsg is failing (but why?).
b) A bug in strcat for the C library I'm using is wiping out the
string/starting at the beginning of the string. (It doesn't do it later
in the same function or elsewhere in the program).
c) "Something else" is walking on that part of memory.
d) ...?

This is not 100% reproduceable (in fact, in the past 9 months I have
only personally seen it happen once on my test device, and I can't
reproduce it there). But we are seeing this happen in production
multiple times a day from different devices. Unplugging the device and
plugging it back in will often "fix" the problem (though it may occur
again in the future).

What are the most likely causes for this problem? Solutions?
 
Reply With Quote
 
 
 
 
Andrew Poelstra
Guest
Posts: n/a
 
      07-22-2006
On 2006-07-22, Neal Barney <(E-Mail Removed)> wrote:
> The program "speaks" a proprietary protocol which sends ASCII strings
> back and forth from the device to the server. Within the past couple of
> months we've been noticing errors in our server logs and people have
> been complaing of failures. Upon viewing server logs we've noticed the
> string the device is sending is missing it's header.
>
> The string gets built like so:
>
> /* Global variable */
> char Message[512];
>
> /* Local variable */
> char tmpMsg[512];
>
> /* var1 - var4 (local variables) get populated with values... */
>
> /* Build the header */
> sprintf( tmpMsg, "|XX|%s|0|00.00|XXXX:XXXX|XXXX:%s|XXXX:%s|XXXX:%s" ,
> var1, var2, var3, var4 );
>


Are you sure that var1 - var4 take up no more than 465 bytes?

> /* X's are alphas and zeros are numbers */
>
> if ( conditionTrue )
> strcat( tmpMsg, ",S" );
> else
> strcat( tmpMsg, ",H" );
>


Looks like I meant 463 bytes...

> /* Continue building the rest of the string... */


....or less. Are you sure that a 512-byte buffer is enough? If not, you
need to have some way of dealing with running out of space. Perhaps you
could eliminate 4-digit dates (they aren't too important in human-read
logs), or detailed timestamp information. It really depends on what you
need from the log files, and what your design is.

If you overflowed your buffer in the above code, you've left the realm
of C, and have entered Honah Lee.

> Now when the device transmits the string to the server, the server logs
> something similar to:
>
> 0089,S|XXXX:XXXX|XXXX:XXXX|... (and so on)
> or sometimes something like
> 0093 ,S|XXXX:XXXX|XXXX:XXXX|... (and so on)
>
> The entire header appears to be getting lost.


Which part is the header? What exactly should be displayed that isn't?

> This is not 100% reproduceable (in fact, in the past 9 months I have
> only personally seen it happen once on my test device, and I can't
> reproduce it there). But we are seeing this happen in production
> multiple times a day from different devices. Unplugging the device and
> plugging it back in will often "fix" the problem (though it may occur
> again in the future).
>
> What are the most likely causes for this problem? Solutions?


Sure sounds like a buffer overflow, or some other undefined memory issue.

--
Andrew Poelstra <website down>
My server is down; you can't mail
me, nor can I post convieniently.
 
Reply With Quote
 
 
 
 
Eric Sosman
Guest
Posts: n/a
 
      07-22-2006
Neal Barney wrote:
> I have a C program which runs on a device using a Zilog Z180
> microprocessor. While it can address 1MB of RAM, it can only address
> 64KB at any given time. And of that only 16KB can be used for "stack
> and heap space". So I'm running in a very memory constricted
> environment.
>
> The program "speaks" a proprietary protocol which sends ASCII strings
> back and forth from the device to the server. Within the past couple of
> months we've been noticing errors in our server logs and people have
> been complaing of failures. Upon viewing server logs we've noticed the
> string the device is sending is missing it's header.
>
> The string gets built like so:
>
> /* Global variable */
> char Message[512];
>
> /* Local variable */
> char tmpMsg[512];
>
> /* var1 - var4 (local variables) get populated with values... */
>
> /* Build the header */
> sprintf( tmpMsg, "|XX|%s|0|00.00|XXXX:XXXX|XXXX:%s|XXXX:%s|XXXX:%s" ,
> var1, var2, var3, var4 );
>
> /* X's are alphas and zeros are numbers */
>
> if ( conditionTrue )
> strcat( tmpMsg, ",S" );
> else
> strcat( tmpMsg, ",H" );
>
> /* Continue building the rest of the string... */
>
> /* Put msg length at beginning of string*/
> len = strlen( tmpMsg );
> sprintf( Message, "%04d%s", len, tmpMsg );
>
> return;
>
>
> Now when the device transmits the string to the server, the server logs
> something similar to:
>
> 0089,S|XXXX:XXXX|XXXX:XXXX|... (and so on)
> or sometimes something like
> 0093 ,S|XXXX:XXXX|XXXX:XXXX|... (and so on)


Like Andrew Poelstra, I was at first suspicious of a buffer
overflow. But the lengths shown in these fragments don't support
that hypothesis, not directly at any rate.

I think you need to show us actual code, not paraphrases.
The bug might lie in "Continue building the rest of the string"
or even in the way Message and tmpMsg are declared (what you've
shown is obviously not the full story; it doesn't, for example,
show how one of them gets to be "global" and the other "local").

If you don't want to show your (proprietary) code and would
prefer to debug on your own, I'd suggest adding a few printf()
calls to output interesting values. For example, the value
returned by sprintf() might be interesting, especially if it
is markedly different from the value returned by strlen(). I'm
thinking of something along the lines of

tmpLen = sprintf(tmpMsg, ...);
if (tmplen != strlen(tmpMsg))
print_everything_you_can_and_then_die();

if (conditionTrue)
strcat(tmpMsg, ",S");
else
strcat(tmpMsg, ",H");
tmpLen += 2;
if (tmplen != strlen(tmpMsg))
print_everything_you_can_and_then_die();

/* Continue building the rest of the string... */

len = strlen( tmpMsg );
sprintf( Message, "%04d%s", len, tmpMsg );
if (len + 4 != strlen(Message))
print_everything_you_can_and_die();

--
Eric Sosman
http://www.velocityreviews.com/forums/(E-Mail Removed)lid
 
Reply With Quote
 
jacob navia
Guest
Posts: n/a
 
      07-22-2006
Eric Sosman a crit :
> If you don't want to show your (proprietary) code and would
> prefer to debug on your own, I'd suggest adding a few printf()
> calls to output interesting values. For example, the value
> returned by sprintf() might be interesting, especially if it
> is markedly different from the value returned by strlen(). I'm
> thinking of something along the lines of
>
> tmpLen = sprintf(tmpMsg, ...);
> if (tmplen != strlen(tmpMsg))
> print_everything_you_can_and_then_die();
>


This is a very good proposition. It means to try to verify
what each part of the software is doing. The result of the
sprintf call is a good starting point to start adding redundant
verifications, (also called sanity checks!!!) into the software.

When something like this happens, MANY things can be the reason
for the observed symptoms: run time data corruption, buffer
overflows, etc.

Starting with the result of the sprintf call
you can verify also that it is less than the length of the
buffer!!!
 
Reply With Quote
 
Neal Barney
Guest
Posts: n/a
 
      07-23-2006
Eric Sosman wrote:
> Like Andrew Poelstra, I was at first suspicious of a buffer
> overflow. But the lengths shown in these fragments don't support
> that hypothesis, not directly at any rate.
>
> I think you need to show us actual code, not paraphrases.
> The bug might lie in "Continue building the rest of the string"
> or even in the way Message and tmpMsg are declared (what you've
> shown is obviously not the full story; it doesn't, for example,
> show how one of them gets to be "global" and the other "local").
>
> If you don't want to show your (proprietary) code and would
> prefer to debug on your own, I'd suggest adding a few printf()
> calls to output interesting values. For example, the value
> returned by sprintf() might be interesting, especially if it
> is markedly different from the value returned by strlen(). I'm
> thinking of something along the lines of
>
> tmpLen = sprintf(tmpMsg, ...);
> if (tmplen != strlen(tmpMsg))
> print_everything_you_can_and_then_die();
>
> if (conditionTrue)
> strcat(tmpMsg, ",S");
> else
> strcat(tmpMsg, ",H");
> tmpLen += 2;
> if (tmplen != strlen(tmpMsg))
> print_everything_you_can_and_then_die();
>
> /* Continue building the rest of the string... */
>
> len = strlen( tmpMsg );
> sprintf( Message, "%04d%s", len, tmpMsg );
> if (len + 4 != strlen(Message))
> print_everything_you_can_and_die();
>


My first suspicion was also a buffer overflow. However, the header
should always be between 78 and 85 characters. The entire message
rarely exceeds 200 bytes so a 512 byte buffer -should- be safe, but an
overflow is still possible. After trolling through the code for nearly
a week, I have yet to find where it is overflowing if that is what is
happening.

I understand that it's hard to know what advice to give with the
mocked-up code I presented. I would post the actual code, but it's not
-my- code, it's that of my employer and I'm sure they wouldn't
appreciate it.

I do like your suggestion of testing the string length at every step of
the code and dying if it doesn't meet the lengths that I expect. The
only issue that still leaves for me is that I am unable to reproduce it
myself on my development box. This code works about 95-98% of the time
for those using it production (though some people seem to get it more
than their fair share). I guess if it did die on them in production,
I'd have a better idea of where the problem was.

Thank you for everyone who has responded so far. If anyone else has any
ideas or suggestions, I'm still open to them.
 
Reply With Quote
 
Flash Gordon
Guest
Posts: n/a
 
      07-23-2006
Neal Barney wrote:

<snip suspicion of buffer overflow>

> I do like your suggestion of testing the string length at every step of
> the code and dying if it doesn't meet the lengths that I expect. The
> only issue that still leaves for me is that I am unable to reproduce it
> myself on my development box. This code works about 95-98% of the time
> for those using it production (though some people seem to get it more
> than their fair share). I guess if it did die on them in production,
> I'd have a better idea of where the problem was.
>
> Thank you for everyone who has responded so far. If anyone else has any
> ideas or suggestions, I'm still open to them.


Could you get it to log data that would overflow the buffer somewhere
you (or a customer) could retrieve the log then take some appropriate
recovery action?
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidelines and intro:
http://clc-wiki.net/wiki/Intro_to_clc
 
Reply With Quote
 
Eric Sosman
Guest
Posts: n/a
 
      07-23-2006
Neal Barney wrote:
> [...]
> I do like your suggestion of testing the string length at every step of
> the code and dying if it doesn't meet the lengths that I expect. The
> only issue that still leaves for me is that I am unable to reproduce it
> myself on my development box. This code works about 95-98% of the time
> for those using it production (though some people seem to get it more
> than their fair share). I guess if it did die on them in production,
> I'd have a better idea of where the problem was.


The important thing isn't really that the program die,
but that it die noisily, having logged everything that might
conceivably be of value in diagnosing the problem. When a
bug is hard to reproduce, it is particularly important to
capture as much information as possible.

One way to do this, of course, is to output everything
you can every time you go through the suspect code. That has
an unfortunate tendency to drown you with too much data, so a
technique that's fairly frequently employed is to log everything
to a circular buffer, going merrily 'round and 'round as long
as everything looks all right. Then when something screwy is
detected, you dump the buffer containing (by now) the information
collected from the last dozen or hundred or whatever log points.

Having captured the bad trace, it isn't even essential that
the program actually die -- but from your description, it seems
that the bogus transmission may have disrupted communication
with the other end of the wire, and the program might have some
difficulty continuing "normal operations" anyhow.

--
Eric Sosman
(E-Mail Removed)lid
 
Reply With Quote
 
Ivanna Pee
Guest
Posts: n/a
 
      07-24-2006

Neal Barney wrote:
> I have a C program which runs on a device using a Zilog Z180
> microprocessor. While it can address 1MB of RAM, it can only address
> 64KB at any given time. And of that only 16KB can be used for "stack
> and heap space". So I'm running in a very memory constricted
> environment.
>
> The program "speaks" a proprietary protocol which sends ASCII strings
> back and forth from the device to the server. Within the past couple of
> months we've been noticing errors in our server logs and people have
> been complaing of failures. Upon viewing server logs we've noticed the
> string the device is sending is missing it's header.
>
> The string gets built like so:
>
> /* Global variable */
> char Message[512];
>
> /* Local variable */
> char tmpMsg[512];
>
> /* var1 - var4 (local variables) get populated with values... */
>
> /* Build the header */
> sprintf( tmpMsg, "|XX|%s|0|00.00|XXXX:XXXX|XXXX:%s|XXXX:%s|XXXX:%s" ,
> var1, var2, var3, var4 );
>
> /* X's are alphas and zeros are numbers */
>
> if ( conditionTrue )
> strcat( tmpMsg, ",S" );
> else
> strcat( tmpMsg, ",H" );
>
> /* Continue building the rest of the string... */
>
> /* Put msg length at beginning of string*/
> len = strlen( tmpMsg );
> sprintf( Message, "%04d%s", len, tmpMsg );
>
> return;
>
>
> Now when the device transmits the string to the server, the server logs
> something similar to:
>
> 0089,S|XXXX:XXXX|XXXX:XXXX|... (and so on)
> or sometimes something like
> 0093 ,S|XXXX:XXXX|XXXX:XXXX|... (and so on)
>
> The entire header appears to be getting lost. The calculated message
> length matches the size of the string without the header, which may
> suggest:
>
> a) The sprintf that places the header in tmpMsg is failing (but why?).
> b) A bug in strcat for the C library I'm using is wiping out the
> string/starting at the beginning of the string. (It doesn't do it later
> in the same function or elsewhere in the program).
> c) "Something else" is walking on that part of memory.
> d) ...?
>
> This is not 100% reproduceable (in fact, in the past 9 months I have
> only personally seen it happen once on my test device, and I can't
> reproduce it there). But we are seeing this happen in production
> multiple times a day from different devices. Unplugging the device and
> plugging it back in will often "fix" the problem (though it may occur
> again in the future).
>
> What are the most likely causes for this problem? Solutions?


Does the transmitting program crash when this occurs? It doesn't sound
like it if the problem data packet is recieved at the other end.

I would guess that during "Continue building the rest of the string"
you are stomping on tmpMsg[0], perhaps overrunning the local data on
the stack just below it.

How often does this thing transmit? Speed up the xmission interval on
the test setup to hopefully increase the frequency at which it fails,
and take a look at the stack when it does.

 
Reply With Quote
 
Dave Thompson
Guest
Posts: n/a
 
      07-31-2006
On Sat, 22 Jul 2006 20:12:51 +0000 (UTC), Neal Barney
<(E-Mail Removed)> wrote:

> I have a C program which runs on a device using a Zilog Z180
> microprocessor. While it can address 1MB of RAM, it can only address
> 64KB at any given time. And of that only 16KB can be used for "stack
> and heap space". So I'm running in a very memory constricted
> environment.
>
> The program "speaks" a proprietary protocol which sends ASCII strings
> back and forth from the device to the server. Within the past couple of
> months we've been noticing errors in our server logs and people have
> been complaing of failures. Upon viewing server logs we've noticed the
> string the device is sending is missing it's header.
>
> The string gets built like so:
>
> /* Global variable */
> char Message[512];
>
> /* Local variable */
> char tmpMsg[512];


> /* Build the header */
> sprintf( tmpMsg, <snip>
> /* Continue building the rest of the string... */
>
> /* Put msg length at beginning of string*/
> len = strlen( tmpMsg );
> sprintf( Message, "%04d%s", len, tmpMsg );
>
> return;
>
>
> Now when the device transmits the string to the server, <snip>
> The entire header appears to be getting lost. The calculated message
> length matches the size of the string without the header, which may
> suggest:
>
> a) The sprintf that places the header in tmpMsg is failing (but why?).
> b) A bug in strcat for the C library I'm using is wiping out the
> string/starting at the beginning of the string. (It doesn't do it later
> in the same function or elsewhere in the program).
> c) "Something else" is walking on that part of memory.
> d) ...?
>
> This is not [very] reproduceable <snip>


Like other responders I don't see any likely culprits in the partial
code you posted. As already suggested collecting information about the
state and sequence(s) leading up to a fault is likely to be helpful in
narrowing down the suspects, and/or helping you construct cases that
will reproduce it in test. I realize general advice like this is not
much more helpful than telling you that candy tastes good.

It may be obvious, but if you haven't done so already I would inspect
the generated (and I assume shipped) assembler/machine code for your
routine and if reasonably possible (and not prohibited) for the
library routines your code calls (sprintf, strcat, etc.) just to make
sure the compiler/implementor isn't doing something dumb somewhere.
(But that part would be offtopic for c.l.c, except maybe for questions
about exactly what standard-library routines are required to do,
especially in corner cases, which I don't see you approaching.)

I do however see something I would do differently in a memory
constrained environment (and maybe even a normal one) which might or
might not be of help to you. Instead of building the message body in
one buffer (tmpMsg) and then copying it plus a small fixed-length
prefix into another (Message), do something like:
size_t used = sprintf (&Message[4], "format", headerargs);
strcpy (&Message[4+used], nextitem); used += nextitemlen;
etc., bumping used in the process as long as not too difficult
char tmpLen [4+1];
sprintf (tmpLen, "%04d", used /* or strlen (Message+4) */ );
memcpy (&Message[0], tmpLen, 4);
/* you can simplify &Message[n] to Message+n and particularly
&Message[0] to Message if your compiler is so badly lame it doesn't */

If the code generated, by your compiler for your target*, to access
(within) the global/static buffer is less efficient, use a pointer
into it instead:
/* register? */ char * putMsg = &Message[4];
putMsg += sprintf (putMsg, "format", headerargs); etc.
(* For 8080 and IIRC Z80, linker-resolved globals or statics are
actually better, but I don't know if Z180 is different here.)

Also, if you have snprintf (or _snprintf or somesuch) available in
your library, you might use that instead. (It is standard in C99 for
hosted environments, but I'd guess the implementation you're using
isn't even trying to conform at that level.) For the code and data you
posted it doesn't look likely that you're overflowing, but it could be
arbitrarily bad if you do, so if you can be certain at reasonable cost
you don't it's probably worth it just to rule that out.

- David.Thompson1 at worldnet.att.net
 
Reply With Quote
 
Neal Barney
Guest
Posts: n/a
 
      08-03-2006
On Mon, 31 Jul 2006 05:11:18 +0000, Dave Thompson wrote:
> Like other responders I don't see any likely culprits in the partial
> code you posted. As already suggested collecting information about the
> state and sequence(s) leading up to a fault is likely to be helpful in
> narrowing down the suspects, and/or helping you construct cases that
> will reproduce it in test. I realize general advice like this is not
> much more helpful than telling you that candy tastes good.


Well, I suppose it is a small comfort to me that I'm not crazy and didn't
overlook something patently obvious. After further research I do believe
that I have a better idea of what is going on. The stack and heap size
must be pre-declared and it appears as if the declared stack size is too
small. The stack space is being exceeded (or so it appears). I have
given the stack an additional 1000 bytes and will do further testing to
see if this resolves the issue. If it doesn't I'm in trouble. With
that change to the stack size that puts my memory usage at 16302 bytes out
of 16384. Talk about pushing the limits... :-O

> It may be obvious, but if you haven't done so already I would inspect
> the generated (and I assume shipped) assembler/machine code for your
> routine and if reasonably possible (and not prohibited) for the library
> routines your code calls (sprintf, strcat, etc.) just to make sure the
> compiler/implementor isn't doing something dumb somewhere. (But that
> part would be offtopic for c.l.c, except maybe for questions about
> exactly what standard-library routines are required to do, especially in
> corner cases, which I don't see you approaching.)


A good suggestion but it's been years since I inspected assembler level
code. And this machine uses a modified variant of the "Amsterdam
compiler" that compiles to EM byte-code. The machine runs an interpreter
that translates the EM code to native machine calls. I suppose there may
be enough information out there to learn more about the EM byte code and
figure things out, but that will probably be a last resort.

> I do however see something I would do differently in a memory
> constrained environment (and maybe even a normal one) which might or
> might not be of help to you. Instead of building the message body in one
> buffer (tmpMsg) and then copying it plus a small fixed-length prefix
> into another (Message), do something like:
> size_t used = sprintf (&Message[4], "format", headerargs); strcpy
> (&Message[4+used], nextitem); used += nextitemlen; etc., bumping used
> in the process as long as not too difficult char tmpLen [4+1]; sprintf
> (tmpLen, "%04d", used /* or strlen (Message+4) */ ); memcpy
> (&Message[0], tmpLen, 4);
> /* you can simplify &Message[n] to Message+n and particularly
> &Message[0] to Message if your compiler is so badly lame it doesn't */


I already rewrote the function to do exactly what you are suggesting
(though implemented a little differently). It should now be more
efficient as well as likely being easier to see which call is causing me
the issue. We'll see if this makes a difference.

> If the code generated, by your compiler for your target*, to access
> (within) the global/static buffer is less efficient, use a pointer into
> it instead:
> /* register? */ char * putMsg = &Message[4]; putMsg += sprintf
> (putMsg, "format", headerargs); etc.
> (* For 8080 and IIRC Z80, linker-resolved globals or statics are
> actually better, but I don't know if Z180 is different here.)
>
> Also, if you have snprintf (or _snprintf or somesuch) available in your
> library, you might use that instead. (It is standard in C99 for hosted
> environments, but I'd guess the implementation you're using isn't even
> trying to conform at that level.) For the code and data you posted it
> doesn't look likely that you're overflowing, but it could be arbitrarily
> bad if you do, so if you can be certain at reasonable cost you don't
> it's probably worth it just to rule that out.


I do not have a snprintf of any sort available to me. The compiler I'm
using is ANSI C89 conforming, but that's as far as it goes (The compiler
was written in 1993). I thought about using a third party library (such
as trio), but in the end I decided that it wasn't worth linking against
any additional code. Size is too important when the program gets
downloaded at 2400/1200baud.

I appreciate all of the replies I have received. I apologize for asking
the question and then more-or-less disappearing. I was out of town on
training and didn't have Usenet access. But all of the suggestions have
been much appreciated.

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
map<string, vector<string> > Question about partial initialization Mr. K.V.B.L. C++ 6 09-16-2008 11:06 AM
Re: Partial 1.0 - Partial classes for Python Thomas Heller Python 13 02-08-2007 08:12 PM
Re: Partial 1.0 - Partial classes for Python J. Clifford Dyer Python 0 02-08-2007 05:29 PM
Why? Partial Class within a Partial class Billy ASP .Net 2 02-01-2006 09:10 AM
partial / wildcard string match in 'in' and 'list.index()' Jon Perez Python 1 05-28-2004 06:47 AM



Advertisments