Velocity Reviews

Johannes Bauer 05-25-2012 11:33 AM

Merging of string literals guaranteed by C std?
 
Hi group,

I have a question about string literals and the addresses they point
to. Does the standard *guarantee* that two identical string literals
actually point to the same address? I.e., can we safely assert:

assert("foo" == "foo");

Or can it maybe only be asserted if the literal occurs in one
compilation unit (i.e. not across compilation units)?

My gut feeling tells me that I cannot rely on the addresses being
identical, but I cannot find it in N1124. It would make things much
easier/cooler if the standard would assert that in my situation, but I
don't want to rely on compiler behavior alone (gcc merges the string
literals into one address even with -O0).

Best regards,
Johannes


--
>> Where exactly did you predict the quake again?

> At least not publicly!

Ah, the latest and to this day most brilliant stunt by our great
cosmologist: the secret prediction.
- Karl Kaos on Rüdiger Thomas in dsa <hidbv3$om2$1@speranza.aioe.org>

Noob 05-25-2012 12:03 PM

Re: Merging of string literals guaranteed by C std?
 
Johannes Bauer wrote:

> I have a question about string literals and the address that they point
> to. Does the standard *guarantee* that two identical string literals
> actually point to the same address. I.e. can we safely assert:
>
> assert("foo" == "foo");


No, this cannot be asserted. AFAIU, it is a quality-of-implementation (QoI) issue.
A "dumb" implementation is allowed to store every string
literal in a separate location.
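
Since the address comparison can go either way, the portable check is a comparison of contents. A minimal sketch, assuming two hypothetical helper names (`same_text` and `same_object`) that are not from the thread:

```c
#include <stdbool.h>
#include <string.h>

/* Content comparison: well defined no matter where the literals are stored. */
bool same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Identity comparison: for two separate string literals the result is
   unspecified, because the implementation may merge them or keep them apart. */
bool same_object(const char *a, const char *b)
{
    return a == b;
}
```

`same_text("foo", "foo")` is always true, while `same_object("foo", "foo")` may be true on one compiler and false on another.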

C89 states: (3.1.4 String literals)
"Identical string literals of either form [wide or regular]
need not be distinct."

"need not be distinct" thus they may be distinct.

Regards.

James Kuyper 05-25-2012 12:24 PM

Re: Merging of string literals guaranteed by C std?
 
On 05/25/2012 07:33 AM, Johannes Bauer wrote:
> Hi group,
>
> I have a question about string literals and the address that they point
> to. Does the standard *guarantee* that two identical string literals
> actually point to the same address. I.e. can we safely assert:
>
> assert("foo" == "foo");


No, the standard neither mandates nor forbids that. Note: the same is
true of

"watergate" + 5 == "gate"
--
James Kuyper

Eric Sosman 05-25-2012 12:38 PM

Re: Merging of string literals guaranteed by C std?
 
On 5/25/2012 7:33 AM, Johannes Bauer wrote:
> Hi group,
>
> I have a question about string literals and the address that they point
> to. Does the standard *guarantee* that two identical string literals
> actually point to the same address. I.e. can we safely assert:
>
> assert("foo" == "foo");


No.

> Or can it maybe only be asserted if the literal occurs in one
> compilation unit (i.e. not across compilation units)?


No.

> My gut feeling tells me that I cannot rely on the addresses being
> identical, but I cannot find it in N1124. It would make things much
> easier/cooler if the standard would assert that in my situation, but I
> don't want to rely on compiler behavior alone (gcc merges the string
> literals into one address even with -O0).


Your gut is right: The two "foo" may resolve to a single
nameless array, or to two. One or both or neither of them
may also share storage with the tail end of "barfoo". It's
the compiler's choice, and I don't even think the compiler is
required to document it (except in the sense that you can
compare the pointers at run time).
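
The run-time comparison mentioned above can be sketched as a small probe. The function names (`literals_merged`, `tail_shared`, `report`) are made up for illustration, and whatever the probe prints only describes one particular build; both answers are conforming.

```c
#include <stdio.h>
#include <string.h>

/* 1 if this build stored the two identical literals in one array, else 0. */
int literals_merged(void)
{
    const char *a = "foo";
    const char *b = "foo";
    return a == b;
}

/* 1 if this build let "foo" share storage with the tail of "barfoo". */
int tail_shared(void)
{
    const char *whole = "barfoo";
    const char *tail = "foo";
    return whole + 3 == tail;
}

/* Prints the observations; they may differ between compilers and options. */
void report(void)
{
    printf("identical literals merged: %s\n", literals_merged() ? "yes" : "no");
    printf("tail of \"barfoo\" shared:   %s\n", tail_shared() ? "yes" : "no");
}
```

Only the contents are reliable: `strcmp("barfoo" + 3, "foo") == 0` holds everywhere, whatever the probe reports.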

Some compilers have a mode in which each appearance of a
literal is guaranteed *not* to overlap others, usually to allow
the program to change the contents of the literal's nameless
array. In old gcc versions the "-fwritable-strings" flag did
this; I think the option has been discontinued.

--
Eric Sosman
esosman@ieee-dot-org.invalid

Johannes Bauer 05-26-2012 07:00 AM

Re: Merging of string literals guaranteed by C std?
 
On 25.05.2012 14:38, Eric Sosman wrote:

>> assert("foo" == "foo");

>
> No.
>
>> Or can it maybe only be asserted if the literal occurs in one
>> compilation unit (i.e. not across compilation units)?

>
> No.


Thank you and the other two posters for your clarification. Going to
think of something else then :-)

Best regards,
Joe

Johannes Bauer 05-26-2012 11:47 AM

Re: Merging of string literals guaranteed by C std?
 
On 26.05.2012 13:37, pete wrote:

> /*
> ** What are you trying to do?
> */
> char *foo = "foo";
>
> assert(foo == foo);


I'm writing a large application with a debugging facility. I want to
enable or disable certain (debugging) outputs at *compile* time (since
some of them are in inner loops), so that if they're disabled there's no
residue in the code anywhere that there even was an output.

Moreover, I'd like to avoid defines in these loops (they really hinder
the ability to read the code IMO). The usual approach to something
like this (which would fulfill almost all requirements) is:

#define FACILITY_FOO (1 << 0)
#define FACILITY_BAR (1 << 1)
#define FACILITY_KOO (1 << 2)
...

#define ENABLED_FACILITIES (FACILITY_FOO | FACILITY_KOO)

and then in the code

#define debug(fcl, msg, ...) if (fcl & ENABLED_FACILITIES) dump(msg);

This is then resolved by the compiler and optimized out completely (i.e.
(FACILITY_BAR & (FACILITY_FOO | FACILITY_KOO)) == 0).
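
A minimal, self-contained sketch of this bitmask scheme follows. The `dump` function here is a hypothetical stand-in that counts its calls (the real one is not shown in the thread), the macro is wrapped in `do { } while (0)`, and `run_demo` is only there to exercise it.

```c
#include <stdio.h>

#define FACILITY_FOO (1u << 0)
#define FACILITY_BAR (1u << 1)
#define FACILITY_KOO (1u << 2)

#define ENABLED_FACILITIES (FACILITY_FOO | FACILITY_KOO)

/* Hypothetical sink: counts messages so the effect is observable. */
int dump_count = 0;

void dump(const char *msg)
{
    ++dump_count;
    fprintf(stderr, "%s\n", msg);
}

/* The condition is a constant expression, so an optimizing compiler can
   typically drop the whole statement for disabled facilities. */
#define debug(fcl, msg) \
    do { if ((fcl) & ENABLED_FACILITIES) dump(msg); } while (0)

void run_demo(void)
{
    debug(FACILITY_FOO, "foo is enabled");
    debug(FACILITY_BAR, "bar is disabled, never printed");
    debug(FACILITY_KOO, "koo is enabled");
}
```

With one bit per facility, the number of facilities is capped by the width of the integer type, which is the limitation raised next.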

Now the problem is: I have a very fine granularity of "facilities". More
than 32 for sure (hundreds, to be exact). I'd like to have a solution
with an arbitrary number of facilities.

Therefore I was thinking of some check like

#define FACILITY_FOO "foo"
#define FACILITY_BAR "bar"
#define FACILITY_KOO "koo"

and a debug implementation like this

#define debug(fcl, msg, ...) \
if ((fcl == FACILITY_FOO) || (fcl == FACILITY_BAR)) dump(msg);

Seems like this is not the way to go, though. If there was something
like "constexpr" in C, this could easily be done. Now I'm a bit puzzled
but will figure something out (and if nothing else works, I'll have
Python generate some C code which does the right switching on/off of
debugging instructions).

Best regards,
Joe


Eric Sosman 05-26-2012 12:11 PM

Re: Merging of string literals guaranteed by C std?
 
On 5/26/2012 7:47 AM, Johannes Bauer wrote:
> On 26.05.2012 13:37, pete wrote:
>
>> /*
>> ** What are you trying to do?
>> */
>> char *foo = "foo";
>>
>> assert(foo == foo);

>
> I'm writing a large application with a debugging facility. I want to
> enable or disable certain (debugging) outputs at *compile* time (since
> some of them are in inner loops), so that if they're disabled there's no
> residue in the code anywhere that there even was a output.
>
> Moreover, I'd like to avoid defines in these loops (they really hinder
> the ability to read the code IMO). So the usual approach to something
> like this (and which would fulfill almost all requirements):
>
> #define FACILITY_FOO (1<< 0)
> #define FACILITY_BAR (1<< 1)
> #define FACILITY_KOO (1<< 2)
> ...
>
> #define ENABLED_FACILITIES (FACILITY_FOO | FACILITY_KOO)
>
> and then in the code
>
> #define debug(fcl, msg, ...) if (fcl& ENABLED_FACILITIES) dump(msg);
>
> This is then resolved by the compiler and optimized out completely (i.e.
> FACILITY_BAR& (FACILITY_FOO | FACILITY_KOO) == 0).
>
> Now the problem is: I have very fine granularity of "facilities". More
> than 32 to be sure (hundreds to be exact). I'd like to have a solution
> with an arbitrary amount of facilities.
>
> Therefore I was thinking of some check like
>
> #define FACILITY_FOO "foo"
> #define FACILITY_BAR "bar"
> #define FACILITY_KOO "koo"
>
> and a debug implementation like this
>
> #define debug(fcl, msg, ...)
> if ((fcl == FACILITY_FOO) || (fcl == FACILITY_BAR)) dump(msg);
>
> Seems like this is not the way to go, though. If there was something
> like "constexpr" in C, this could easily be done. Now I'm a bit puzzled
> but will figure something out (and if nothing else works, I'll have
> Python generate some C code which does the right switching on/off of
> debugging instructions).


Why not use numeric constants instead of strings?

#define FACILITY_FOO 1
#define FACILITY_BAR 2
#define FACILITY_KOO 42
// ... or use enum constants

#define debug(fcl, msg) \
    if ((fcl) == FACILITY_FOO || (fcl) == FACILITY_BAR) \
        dump(msg)
// see also "the do-while hack" for a better

Alternatively,

#define FACILITY_FOO 1 // enable FOO debugging
#define FACILITY_BAR 0 // suppress BAR debugging
#define FACILITY_KOO 1 // enable KOO debugging

#define debug(fcl, msg) if (fcl) dump(msg)

... leading to a much briefer macro that you needn't change when
changing the state of "hundreds" of facilities.
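
A rough sketch of how the 0/1 variant behaves, assuming hypothetical helpers (`dump`, `format_state`, `inner_loop`) that count their calls so the effect can be observed. For a disabled facility the message argument is never even evaluated.

```c
#include <stdio.h>

#define FACILITY_FOO 1   /* enable FOO debugging */
#define FACILITY_BAR 0   /* suppress BAR debugging */

int format_calls = 0;
int dump_calls = 0;

/* Hypothetical, possibly expensive, message builder. */
const char *format_state(void)
{
    ++format_calls;
    return "state";
}

/* Hypothetical sink standing in for the real dump(). */
void dump(const char *msg)
{
    ++dump_calls;
    fprintf(stderr, "%s\n", msg);
}

#define debug(fcl, msg) do { if (fcl) dump(msg); } while (0)

void inner_loop(int iterations)
{
    for (int i = 0; i < iterations; ++i) {
        debug(FACILITY_FOO, format_state());  /* evaluated and dumped */
        debug(FACILITY_BAR, format_state());  /* guarded by if (0): skipped */
    }
}
```

The standard does not require the dead branch to be removed from the object code, but compilers commonly do so when optimizing.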

--
Eric Sosman
esosman@ieee-dot-org.invalid

Johannes Bauer 05-26-2012 12:24 PM

Re: Merging of string literals guaranteed by C std?
 
On 26.05.2012 14:11, Eric Sosman wrote:

> Why not use numeric constants instead of strings?
>
> #define FACILITY_FOO 1
> #define FACILITY_BAR 2
> #define FACILITY_KOO 42
> // ... or use enum constants
>
> #define debug(fcl, msg) \
> if ((fcl) == FACILITY_FOO || (fcl) == FACILITY_BAR) \
> dump(msg)
> // see also "the do-while hack" for a better


You mean do { } while(0)? That's in the original definition, I just
posted the shortcut from memory :-)
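
For readers who have not met the idiom, here is a short sketch of what the wrapper buys. The names `dump` and `classify` are hypothetical. With a bare `if (fcl) dump(msg)` macro, the `else` below would pair with the macro's hidden `if` rather than with `if (cond)`.

```c
int dump_calls = 0;

/* Hypothetical sink; it only counts calls. */
void dump(const char *msg)
{
    (void)msg;
    ++dump_calls;
}

#define FACILITY_BAR 0

/* do { ... } while (0) turns the macro into a single statement that
   needs a trailing semicolon, like an ordinary function call. */
#define debug(fcl, msg) do { if (fcl) dump(msg); } while (0)

/* Returns 1 when the else branch ran, 0 otherwise. */
int classify(int cond)
{
    int else_ran = 0;
    if (cond)
        debug(FACILITY_BAR, "cond is true");
    else
        else_ran = 1;   /* belongs to if (cond), as the layout suggests */
    return else_ran;
}
```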

> Alternatively,
>
> #define FACILITY_FOO 1 // enable FOO debugging
> #define FACILITY_BAR 0 // suppress BAR debugging
> #define FACILITY_KOO 1 // enable KOO debugging
>
> #define debug(fcl, msg) if (fcl) dump(msg)
>
> ... leading to a much briefer macro that you needn't change when
> changing the state of "hundreds" of facilities.


Yes, I think I'll take that approach, which is much more sensible. The
reason I tried to use strings is because the facility (unlike in the
abbreviated example) is also passed to the debugging command for proper
redirection of logging (i.e. separate things in separate files). For
display, having the name is nice.

I tried to kill two birds with one stone: essentially making the
facility's name its variable's value.

Thanks for the hints!
Best regards,
Joe

Eric Sosman 05-26-2012 12:41 PM

Re: Merging of string literals guaranteed by C std?
 
On 5/26/2012 8:24 AM, Johannes Bauer wrote:
> [...]
> Yes, I think I'll take that approach, which is much more sensible. The
> reason I tried to use strings is beacuse the facility (unlike in the
> abbreviated example) is also passed to the debugging command for proper
> redirection of logging (i.e. separate things in separate files). For
> display, having the name is nice.


Stringize the FACILITY_xxx piece in the debug() macro, and
pass it to the message-writing function, along with __FILE__ and
__LINE__ and whatever else suits your fancy.
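
One way this could look, as a sketch only: `log_message` and `last_line` are invented names, not part of the original code. The `#` operator does not macro-expand its operand, so the facility shows up by name (`FACILITY_FOO`) even though the macro's value is just `1`.

```c
#include <stdio.h>
#include <string.h>

#define FACILITY_FOO 1
#define FACILITY_BAR 0

/* Keeps the most recent line so the result can be inspected. */
char last_line[256];

/* Hypothetical message writer; a real one might pick a file per facility. */
void log_message(const char *facility, const char *file, int line,
                 const char *msg)
{
    snprintf(last_line, sizeof last_line, "[%s] %s:%d: %s",
             facility, file, line, msg);
    fprintf(stderr, "%s\n", last_line);
}

/* #fcl yields the spelling of the argument; the plain fcl in the
   condition is expanded to 0 or 1 as usual. */
#define debug(fcl, msg) \
    do { if (fcl) log_message(#fcl, __FILE__, __LINE__, (msg)); } while (0)

void emit_examples(void)
{
    debug(FACILITY_BAR, "suppressed");
    debug(FACILITY_FOO, "hello");
}
```

After `emit_examples()`, `last_line` starts with `[FACILITY_FOO] ` followed by the file name and line number of the second `debug` call.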

--
Eric Sosman
esosman@ieee-dot-org.invalid

Johannes Bauer 05-26-2012 01:00 PM

Re: Merging of string literals guaranteed by C std?
 
On 26.05.2012 14:41, Eric Sosman wrote:
> On 5/26/2012 8:24 AM, Johannes Bauer wrote:
>> [...]
>> Yes, I think I'll take that approach, which is much more sensible. The
>> reason I tried to use strings is beacuse the facility (unlike in the
>> abbreviated example) is also passed to the debugging command for proper
>> redirection of logging (i.e. separate things in separate files). For
>> display, having the name is nice.

>
> Stringize the FACILITY_xxx piece in the debug() macro, and
> pass it to the message-writing function, along with __FILE__ and
> __LINE__ and whatever else suits your fancy.


Ah, stringifying that piece is a smart idea. Thanks for the pointer!

Best regards,
Joe
