Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Problem with printf-like logging framework and inline hexdumps.

Reply
Thread Tools

Problem with printf-like logging framework and inline hexdumps.

 
 
Jef Driesen
Guest
Posts: n/a
 
      09-14-2012
On 13-09-12 18:11, Jef Driesen wrote:
> Hi,
>
> For one of my projects, I created a simple logging framework. I have an
> (opaque) context object which contains a buffer:
>
> typedef struct dc_context_t {
> char msg[4096];
> /* More stuff here. */
> } dc_context_t;
>
> and a printf-like function to log a message. The logging function uses
> the vsnprintf() function to write a string representation of the
> arguments to the context buffer. The resulting string is then passed to
> a user-defined callback function.
>
> int
> dc_context_log (dc_context_t *context, const char *format, ...)
> {
> va_list ap;
>
> va_start (ap, format);
> vsnprintf (context->msg, sizeof (context->msg), format, ap);
> va_end (ap);
>
> callback (context->msg);
> }
>
> This works perfect for all printf-style logging. For example:
>
> dc_context_log (context, "%s (%d)", "An error message", 3);
>
> However, I would like to log some binary data too. Let's say I want to
> log a message with an inline hexdump, something like this:
>
> "DEBUG: data=<DATA>, size=<SIZE>"
>
> Since there is no printf format string available for hexdumps, I could
> write a little dc_context_hexdump() helper function to convert the
> binary data into a hex string, and then produce the log message in 3
> steps:
>
> dc_context_log (context, "%s: data=", "DEBUG");
> dc_context_hexdump (context, hexdata, size);
> dc_context_log (context, "", size);
>
> This would work fine, except that I need the output in only one piece.
> There are several reasons for that. I simplified the code a bit for this
> post, but in reality the dc_context_log() function also receives extra
> info, like the __FILE__ and __LINE__ macros. When printing the output, I
> only want to see the file and line info appear only once, and not three
> times. Also in a multithreaded environment, another thread may call the
> logging function between each of the three calls, and thus interleaving
> the output.
>
> How would you workaround this?
>
> A possible approach might be to convert the binary data to string
> first, and then pass that to the dc_context_log call. Something like
> this:
>
> void
> somefunction (dc_context_t *context, unsigned char *data, unsigned int
> size)
> {
> /* Do something useful here. */
>
> /* Log the binary data. */
> hexstr = hexdump (data, size);
> dc_context_log (context, "%s: data=%s, size=%u",
> "DEBUG", hexstr, size);
> }
>
> But this hexdump function isn't thread-safe (because it would have to
> return a pointer to a static buffer). And even if I would make a
> re-entrant hexdump_r() version, it's not ideal because then I would have
> to allocate a buffer inside the somefunction(). I don't want to do that
> because the dc_context_log is actually called through a macro, which can
> be configured at compile time to omit the actual call (e.g. in
> production builds). But then the memory allocation would still be there,
> and I don't want to pollute application code with #ifdef's. An temporary
> buffer on the stack won't work either, because the size isn't fixed.
>
> Suggestions are welcome!


An alternative solution that I have been thinking about, is to
introduce a number of helper functions:

dc_context_printf
dc_context_hexdump
dc_context_flush

The first two functions would just write to the msg buffer as before,
but now with an extra variable to track the current writing position,
such that successive calls will append their data. The third function is
there to generating the output and reset the writing position.

Thread-safety could be achieved by providing two more locking functions
(dc_context_lock and dc_context_unlock) called before and after the
printing functions.

The only downside, is that I would need several function calls to
generate a single log message. For my example in the original post,
there would be at least 6 calls:

dc_context_lock (context);
dc_context_printf (context, "%s: data=", "DEBUG");
dc_context_hexdump (context, data, size);
dc_context_printf (context, "size=%d", size);
dc_context_flush (context);
dc_context_unlock (context);

Although it works, it's also getting pretty verbose, and all these function
calls will have some runtime overhead too. (The original dc_context_log function
does only call the vsnprintf function if the loglevel is high enough.)

Jef
 
Reply With Quote
 
 
 
 
Ben Bacarisse
Guest
Posts: n/a
 
      09-14-2012
Jef Driesen <(E-Mail Removed)> writes:

> On 14-09-12 00:54, Ben Bacarisse wrote:
>> Jef Driesen <(E-Mail Removed)> writes:
>>
>>> On 13-09-12 22:24, Ben Bacarisse wrote:

>> <snip>
>>>> Rolling your own format processing is not that hard. I'd process the
>>>> format in chunks that bracket your own format codes. Here's an
>>>> example:
>>>>
>>>> void logit(const char *fmt, ...)
>>>> {
>>>> va_list al;
>>>> va_start(al, fmt);
>>>> const char *fp;
>>>> while ((fp = strstr(fmt, "%H")) != NULL) {
>>>> char tmp[fp - fmt + 1];
>>>> memcpy(tmp, fmt, sizeof tmp - 1);
>>>> tmp[sizeof tmp - 1] = 0;
>>>> vprintf(tmp, al);
>>>>
>>>> printf("<%d>", va_arg(al, int));
>>>>
>>>> fmt = fp + 2;
>>>> }
>>>> vprintf(fmt, al);
>>>> va_end(al);
>>>> }
>>>>
>>>> Here, a format of %H is treated as if it meant "<%d>" but you can do
>>>> whatever you like at that point (in fact, my use-case called a function
>>>> keyed from the the format letter).
>>>>
>>>> Obviously you'll need a cleverer function to match a custom format. %H
>>>> should not match after a %, for example, and your formats are likely to
>>>> have counts in them.
>>>>
>>>> I don't think it even breaks any of the rules about handling va_lists.
>>>
>>> I also thought of something like this, but the manpages for the vprintf
>>> functions says this on my linux system:
>>>
>>> These functions do not call the va_end macro. Because they invoke the va_arg
>>> macro, the value of ap is undefined after the call. See stdarg(3).

>>
>> Ah, yes. The above is from some old code when things were a little
>> wilder...
>>
>> I don't think all is lost. I think there is a fiddly dance you can do
>> with va_end and va_copy to get the job done:
>>
>> void logit(const char *fmt, ...)
>> {
>> va_list al;
>> va_start(al, fmt);
>> const char *fp;
>> while ((fp = strstr(fmt, "%H")) != NULL) {
>> char tmp[fp - fmt + 1];
>> memcpy(tmp, fmt, sizeof tmp - 1);
>> tmp[sizeof tmp - 1] = 0;
>>
>> va_list al_copy;
>> va_copy(al_copy, al);
>>
>> vprintf(tmp, al_copy);
>>
>> va_end(al_copy);
>> va_end(al);
>> va_copy(al, al_copy);
>>
>> printf("<%d>", va_arg(al, int));
>> fmt = fp + 2;
>> }
>> vprintf(fmt, al);
>> va_end(al);
>> }
>>
>> This passes a cursory reading of the va_* macro restrictions in the
>> standard but I'd welcome some more input...

>
> I think this version has the same problem as the original. After
> calling the va_end macro, the va_list variable becomes undefined
> according to the manpages on my linux system:
>
> "After the call va_end(ap) the variable ap is undefined."


The linux man pages are probably not a good source for details like
this. Both quotes you've given conflict with the C standard and I
suspect that is just a result of summarising the situation rather
bluntly -- not a reflection of the fact the glibc's implementation is
deliberately not conforming.

The standard says:

"The va_end macro may modify ap so that it is no longer usable
(without being reinitialized by the va_start or va_copy macro). If
there is no corresponding invocation of the va_start or va_copy macro,
or if the va_end macro is not invoked before the return, the behavior
is undefined."

That's not exactly the same thing. I had attempted to use va_copy get
the va_list correctly reinitialised, but I don't think I've got it
right. So, yes, there is still a problem I think.

There may be no correct va_start/va_copy/va_end pattern that works. If
so, that's a great shame, but it was presumably unavoidable. I can't
believe this old trick was ruled un-portable without it having to be so
ruled. I'll give it some more thought...

--
Ben.
 
Reply With Quote
 
 
 
 
Ben Bacarisse
Guest
Posts: n/a
 
      09-14-2012
Ben Bacarisse <(E-Mail Removed)> writes:
<snip>
> There may be no correct va_start/va_copy/va_end pattern that works. If
> so, that's a great shame, but it was presumably unavoidable. I can't
> believe this old trick was ruled un-portable without it having to be so
> ruled. I'll give it some more thought...


I have thunk and I think there is no suitable pattern of
va_start/va_copy/va_end calls that will work.

This means that the only way to add your own formats (portably) is to
keep a synchronised copy of the va_list: you duplicate it at the start,
and you duplicate the current state when you call vprintf to process one
or more of its formats. When vprintf returns, you must scan the format
and execute the same va_args calls the vprintf will have used.

That makes it much more fiddly because of all the types that might be
involved. And to do this requires actually parsing the format
specifiers which is not trivial.

Seems a pity...

--
Ben.
 
Reply With Quote
 
Jef Driesen
Guest
Posts: n/a
 
      09-17-2012
On 14-09-12 23:49, Ben Bacarisse wrote:
> Ben Bacarisse <(E-Mail Removed)> writes:
> <snip>
>> There may be no correct va_start/va_copy/va_end pattern that works. If
>> so, that's a great shame, but it was presumably unavoidable. I can't
>> believe this old trick was ruled un-portable without it having to be so
>> ruled. I'll give it some more thought...

>
> I have thunk and I think there is no suitable pattern of
> va_start/va_copy/va_end calls that will work.
>
> This means that the only way to add your own formats (portably) is to
> keep a synchronised copy of the va_list: you duplicate it at the start,
> and you duplicate the current state when you call vprintf to process one
> or more of its formats. When vprintf returns, you must scan the format
> and execute the same va_args calls the vprintf will have used.
>
> That makes it much more fiddly because of all the types that might be
> involved. And to do this requires actually parsing the format
> specifiers which is not trivial.
>
> Seems a pity...


A pity indeed. Implementing my own printf function isn't trivial and a bit
overkill for my needs.

Jef

 
Reply With Quote
 
Kaz Kylheku
Guest
Posts: n/a
 
      09-17-2012
On 2012-09-14, Ben Bacarisse <(E-Mail Removed)> wrote:
> Ben Bacarisse <(E-Mail Removed)> writes:
><snip>
>> There may be no correct va_start/va_copy/va_end pattern that works. If
>> so, that's a great shame, but it was presumably unavoidable. I can't
>> believe this old trick was ruled un-portable without it having to be so
>> ruled. I'll give it some more thought...

>
> I have thunk and I think there is no suitable pattern of
> va_start/va_copy/va_end calls that will work.
>
> This means that the only way to add your own formats (portably) is to
> keep a synchronised copy of the va_list: you duplicate it at the start,
> and you duplicate the current state when you call vprintf to process one
> or more of its formats. When vprintf returns, you must scan the format
> and execute the same va_args calls the vprintf will have used.
>
> That makes it much more fiddly because of all the types that might be
> involved. And to do this requires actually parsing the format
> specifiers which is not trivial.


It's a piece of cake. And anyway, you don't have to support everything
in printf. If you require yourself to do a complete, perfect job,
that's a good way to get out of getting anything done.

I've used this in projects past, just for quick and dirty debugging,
not part of any permanent framework.

It's not a complete solution but it could be massaged into one.

For instance, note how it assumes that only one argument is consumed
by printf, and it just eats a void *. This worked on the target platforms
I was working on, where I was using mostly %d and %s arguments.

It also doesn't handle variable widths and precisions (* character).

These things would not be very hard to add. You need to add some
sophistication to the parser, which has to produce some info in a bunch of
local variables: number of stars which occured (to consume that many ints)
and the type of the argument (requiring a switch statement to various
va_arg calls).

int hexprintf(const char *fmt, ...)
{
int ch, count = 0;
va_list vl;

va_start (vl, fmt);

for (ch = *fmt++; ch != 0; ch = *fmt++) {
switch (ch) {
case '%': /* printf conversion */
{
/* regular printf conversion */
char conv[64] = "%";
const char *end = fmt;
va_list vl_copy;

va_copy (vl_copy, vl);

end += strspn(end, "#0-+ "); /* skip flags */
end += strspn(end, "0123456789*"); /* skip precision */
if (*end == '.') /* skip precision */
end += 1 + strspn(end + 1, "0123456789*");

/* skip length */
if (strncmp(end, "hh", 2) == 0)
end += 2;
else if (strncmp(end, "ll", 2) == 0)
end += 2;
else if (strspn(end, "hlLqjzt") > 0)
end += 1;

if (*end == '@') {
/* our hex conversion */
const unsigned char *data = va_arg (vl, unsigned char *);
int i, size = va_arg (vl, int);

for (i = 0; i < size; i++) {
count += printf("%02x", (unsigned int) data[i]);
if (i < size - 1) {
putchar(' ');
count++;
}
}
} else {
/* copy and call printf */
memcpy(conv + 1, fmt, end - fmt + 1);
count += vprintf(conv, vl_copy);
va_end (vl_copy);
va_arg (vl, void *);
}
fmt = end + 1;
}
break;
default:
putchar(ch);
count++;
break;
}
}

va_end (vl);
return count;
}
 
Reply With Quote
 
Ben Bacarisse
Guest
Posts: n/a
 
      09-17-2012
Kaz Kylheku <(E-Mail Removed)> writes:

> On 2012-09-14, Ben Bacarisse <(E-Mail Removed)> wrote:
>> Ben Bacarisse <(E-Mail Removed)> writes:
>><snip>
>>> There may be no correct va_start/va_copy/va_end pattern that works. If
>>> so, that's a great shame, but it was presumably unavoidable. I can't
>>> believe this old trick was ruled un-portable without it having to be so
>>> ruled. I'll give it some more thought...

>>
>> I have thunk and I think there is no suitable pattern of
>> va_start/va_copy/va_end calls that will work.
>>
>> This means that the only way to add your own formats (portably) is to
>> keep a synchronised copy of the va_list: you duplicate it at the start,
>> and you duplicate the current state when you call vprintf to process one
>> or more of its formats. When vprintf returns, you must scan the format
>> and execute the same va_args calls the vprintf will have used.
>>
>> That makes it much more fiddly because of all the types that might be
>> involved. And to do this requires actually parsing the format
>> specifiers which is not trivial.

>
> It's a piece of cake.


Something is a piece of cake, yes, but is it what the OP was talking
about? After all, I've had a bit code that does this for decades, but
that was (quite reasonably) rejected because the OP wanted stronger
guarantees that it will work.

Your code makes different assumptions which make well suit the OP
better, but it's still a bit of a punt.

> And anyway, you don't have to support everything
> in printf. If you require yourself to do a complete, perfect job,
> that's a good way to get out of getting anything done.


I agree, though I think printf-like logging things need a bit more care
that some other APIs because users of the logging can easily forget what
restrictions you've placed on allowable formats.

> I've used this in projects past, just for quick and dirty debugging,
> not part of any permanent framework.
>
> It's not a complete solution but it could be massaged into one.


Ditto with my "not guaranteed" solution.

> For instance, note how it assumes that only one argument is consumed
> by printf, and it just eats a void *. This worked on the target platforms
> I was working on, where I was using mostly %d and %s arguments.
>
> It also doesn't handle variable widths and precisions (* character).
>
> These things would not be very hard to add. You need to add some
> sophistication to the parser, which has to produce some info in a bunch of
> local variables: number of stars which occured (to consume that many ints)
> and the type of the argument (requiring a switch statement to various
> va_arg calls).


Yes, not hard. You could also add code to make %n work (it's about two
lines) and I'd want fix it to treat %% correctly.

But I don't want a pointless argument over my "not trivial" and your
"not hard". My point was that it seems a shame that there is no way to
avoid scanning the format. A standard solution could have been
provided. The v*printf functions could take, for example, extra pointer
argument which would collect a value to be passed to a new va_ macro
(va_skip?) so that the caller can synchronise its own argument list.
That might be a bit OTT, but maybe the restriction on using an
indeterminate argument list could be relaxed so that it can copied? On
what systems, I wonder, would that would cause a problem?

<snip code>
--
Ben.
 
Reply With Quote
 
Jorgen Grahn
Guest
Posts: n/a
 
      09-20-2012
On Thu, 2012-09-13, Heinrich Wolf wrote:
>
> "Jef Driesen" <(E-Mail Removed)> schrieb im Newsbeitrag
> news:k2t0k8$cjf$(E-Mail Removed)...
> ...
>> But this hexdump function isn't thread-safe (because it would have to
>> return a pointer to a static buffer). And even if I would make a
>> re-entrant hexdump_r() version, it's not ideal because then I would have
>> to allocate a buffer inside the somefunction(). I don't want to do that
>> because the dc_context_log is actually called through a macro, which can
>> be configured at compile time to omit the actual call (e.g. in
>> production builds). But then the memory allocation would still be there,
>> and I don't want to pollute application code with #ifdef's. An temporary
>> buffer on the stack won't work either, because the size isn't fixed.
>>
>> Suggestions are welcome!
>>
>> Jef

>
> Maybe you supply the buffer for the hex dump locally from a the caller of
> the logging.


Yes, like inet_ntop() does. And such functions should also
return a pointer to the buffer, so you can:

char buf[SOMETHING];
printf("Hello, I'm %s\n",
inet_pton(AF_INET, &some_ip_address, buf, sizeof buf));

Not that this solves all his problems, though. Printf works to well
because it appends to an infinite stream. You can hack the text up
into pieces (call printf() and friends many times) and it doesn't
matter. But with logging, each write is an entry, a distinct event;
printing each bit separately and printing one big chunk don't give
the same result.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Multiple functions (one version being inline and other beingnon-inline) Rahul C++ 3 02-28-2008 03:28 PM
about extern inline and static inline Sean C++ 4 04-30-2006 03:18 PM
Tool which expands implicitly inline inline functions tthunder@gmx.de C++ 3 06-16-2005 12:54 AM
To inline or not to inline? Alvin C++ 7 05-06-2005 03:04 PM
inline or not to inline in C++ Abhi C++ 2 07-03-2003 12:07 AM



Advertisments