Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Yet another binary search tree library

Reply
Thread Tools

Yet another binary search tree library

 
 
Francis Moreau
Guest
Posts: n/a
 
      07-28-2010
On Jul 27, 5:39*pm, Tim Rentsch <(E-Mail Removed)> wrote:
> Francis Moreau <(E-Mail Removed)> writes:
> > How about this new example, which should make you happier regarding
> > the '->' operator:

>
> > * struct A { int a; };
> > * struct B { int b; };

>
> > * struct A *p;
> > * struct B b_object;

>
> > * p = (struct A *)&b_object;
> > * p->a = 1;

>
> > As you can see, there're no more alignment issues since (in my
> > understanding) all pointers to structure types have the same
> > representation and alignment requirements as each other.

>
> Pointers to structures have the same alignment requirements as
> each other, but the structures themselves may have different
> alignments, and it's the alignments of the structures that is
> relevant here.


OK.

>
> In principle it's possible for these two structs to have
> different alignment requirements, but practically speaking it's
> pretty unlikely, so for the sake of discussion let's ignore that
> aspect.


OK.

>
> > And (still in my understanding), there's no more undefined behaviour
> > since the '->' operator requirement is satisfied now: p now points to
> > a structure object.

>
> > What do you think ?

>
> This question is very good.


Glad to hear it

> *It seems like a natural thing to
> want to do, and finding an answer isn't especially easy or
> obvious.
>
> In fact, I believe the Standard means for this to be undefined
> behavior. *Unfortunately the Standard does not (IMO) address this
> question as directly or as clearly as it should. *The general
> principle is that it's illegal to access members of one kind of
> struct by means of member access (ie, using '.' or '->') using
> a designator that is a different kind of struct. *This principle
> does have an exception, and that exception is spelled out
> specifically in 6.5.2.3p5:
>
> * * One special guarantee is made in order to simplify the use
> * * of unions: *if a union contains several structures that
> * * share a common initial sequence (see below), and if the
> * * union object currently contains one of these structures, it
> * * is permitted to inspect the common initial part of any of
> * * them anywhere that a declaration of the complete type of the
> * * union is visible. *Two structures share a common initial
> * * sequence if corresponding members have compatible types
> * * (and, for bit-fields, the same widths) for a sequence of one
> * * or more initial members.
>
> The argument is that, since the Standard goes to the trouble of
> pointing out a case where accessing one struct through a different
> type of struct is allowed, in other cases doing that is not allowed
> (ie, is undefined behavior). *This question may be worth debating
> over in comp.std.c, because at the very least it would be nice if
> the language were more clear. *However, as far as a practical
> reading of the Standard goes (ie, for comp.lang.c) I would say this
> point settles the question. *And, luckily, there is an example that
> is pretty much right on point to your question, namely 6.5.2.3p8:
>
> * * EXAMPLE 3 The following is a valid fragment:
>
> * * * * union {
> * * * * * * struct {
> * * * * * * * * int alltypes;
> * * * * * * } n;
> * * * * * * struct {
> * * * * * * * * int type;
> * * * * * * * * int intnode;
> * * * * * * } ni;
> * * * * * * struct {
> * * * * * * * * int type;
> * * * * * * * * double doublenode;
> * * * * * * } nf;
> * * * * } u;
> * * * * u.nf.type = 1;
> * * * * u.nf.doublenode = 3.14;
> * * * * /* ... */
> * * * * if (u.n.alltypes == 1)
> * * * * if (sin(u.nf.doublenode) == 0.0)
> * * * * /* ... */
>
> * * The following is not a valid fragment (because the union type is
> * * not visible within function f):
>
> * * * * struct t1 { int m; };
> * * * * struct t2 { int m; };
> * * * * int f(struct t1 *p1, struct t2 *p2)
> * * * * {
> * * * * * * if (p1->m < 0)
> * * * * * * p2->m = -p2->m;
> * * * * * * return p1->m;
> * * * * }
> * * * * int g()
> * * * * {
> * * * * * * union {
> * * * * * * * * struct t1 s1;
> * * * * * * * * struct t2 s2;
> * * * * * * } u;
> * * * * * * /* ... */
> * * * * * * return f(&u.s1, &u.s2);
> * * * * }
>
> The second code fragment is very similar to the question you ask,
> and is clearly identified as not a valid fragment. *Because of
> the similarity it seems safe to conclude that the example code
> you gave (and other similar sorts of code) is also not valid.
> Make sense?


Yes it does.

What it doesn't anymore is the description of:

- type punning described by wikipedia
- the 'strict-aliasing' option in GCC man page

The Berkeley sockets interface, which is given as example of type
punning by: http://en.wikipedia.org/wiki/Type_punning, relies on
undefined behaviours:

struct sockaddr_in sa = {0};
[...]
bind(sockfd, (struct sockaddr *)&sa, sizeof sa);

Regarding the 'strict-aliasing' rule, as you said before (and you
proved me), 'strict aliasing rules' and 'standard aliasing rules' (the
one define by the C standards) are the same.

Passing '-fno-strict-aliasing' to GCC allows to generate no conformant
programs because it seems from the man page that GCC allows more
aliasing cases than the ones defined by the standards. Which ones
exactly, I don't know. Futhermore 'strict-aliasing' option is either
ON or OFF depending on the optimisation level. So passing '-O0' to GCC
can generate a program which behaves differently when compiling with '-
O2'.

 
Reply With Quote
 
 
 
 
Tim Rentsch
Guest
Posts: n/a
 
      07-29-2010
Francis Moreau <(E-Mail Removed)> writes:

> On Jul 27, 5:39 pm, Tim Rentsch <(E-Mail Removed)> wrote:
>> Francis Moreau <(E-Mail Removed)> writes:
>> > How about this new example, which should make you happier regarding
>> > the '->' operator:

>>
>> > struct A { int a; };
>> > struct B { int b; };

>>
>> > struct A *p;
>> > struct B b_object;

>>
>> > p = (struct A *)&b_object;
>> > p->a = 1;

>>
>> > As you can see, there're no more alignment issues since (in my
>> > understanding) all pointers to structure types have the same
>> > representation and alignment requirements as each other.

>>
>> Pointers to structures have the same alignment requirements as
>> each other, but the structures themselves may have different
>> alignments, and it's the alignments of the structures that is
>> relevant here.

>
> OK.
>
>>
>> In principle it's possible for these two structs to have
>> different alignment requirements, but practically speaking it's
>> pretty unlikely, so for the sake of discussion let's ignore that
>> aspect.

>
> OK.
>
>>
>> > And (still in my understanding), there's no more undefined behaviour
>> > since the '->' operator requirement is satisfied now: p now points to
>> > a structure object.

>>
>> > What do you think ?

>>
>> This question is very good.

>
> Glad to hear it
>
>> It seems like a natural thing to
>> want to do, and finding an answer isn't especially easy or
>> obvious.
>>
>> In fact, I believe the Standard means for this to be undefined
>> behavior. Unfortunately the Standard does not (IMO) address this
>> question as directly or as clearly as it should. The general
>> principle is that it's illegal to access members of one kind of
>> struct by means of member access (ie, using '.' or '->') using
>> a designator that is a different kind of struct. This principle
>> does have an exception, and that exception is spelled out
>> specifically in 6.5.2.3p5:
>>
>> One special guarantee is made in order to simplify the use
>> of unions: if a union contains several structures that
>> share a common initial sequence (see below), and if the
>> union object currently contains one of these structures, it
>> is permitted to inspect the common initial part of any of
>> them anywhere that a declaration of the complete type of the
>> union is visible. Two structures share a common initial
>> sequence if corresponding members have compatible types
>> (and, for bit-fields, the same widths) for a sequence of one
>> or more initial members.
>>
>> The argument is that, since the Standard goes to the trouble of
>> pointing out a case where accessing one struct through a different
>> type of struct is allowed, in other cases doing that is not allowed
>> (ie, is undefined behavior). This question may be worth debating
>> over in comp.std.c, because at the very least it would be nice if
>> the language were more clear. However, as far as a practical
>> reading of the Standard goes (ie, for comp.lang.c) I would say this
>> point settles the question. And, luckily, there is an example that
>> is pretty much right on point to your question, namely 6.5.2.3p8:
>>
>> EXAMPLE 3 The following is a valid fragment:
>>
>> union {
>> struct {
>> int alltypes;
>> } n;
>> struct {
>> int type;
>> int intnode;
>> } ni;
>> struct {
>> int type;
>> double doublenode;
>> } nf;
>> } u;
>> u.nf.type = 1;
>> u.nf.doublenode = 3.14;
>> /* ... */
>> if (u.n.alltypes == 1)
>> if (sin(u.nf.doublenode) == 0.0)
>> /* ... */
>>
>> The following is not a valid fragment (because the union type is
>> not visible within function f):
>>
>> struct t1 { int m; };
>> struct t2 { int m; };
>> int f(struct t1 *p1, struct t2 *p2)
>> {
>> if (p1->m < 0)
>> p2->m = -p2->m;
>> return p1->m;
>> }
>> int g()
>> {
>> union {
>> struct t1 s1;
>> struct t2 s2;
>> } u;
>> /* ... */
>> return f(&u.s1, &u.s2);
>> }
>>
>> The second code fragment is very similar to the question you ask,
>> and is clearly identified as not a valid fragment. Because of
>> the similarity it seems safe to conclude that the example code
>> you gave (and other similar sorts of code) is also not valid.
>> Make sense?

>
> Yes it does.
>
> What it doesn't anymore is the description of:
>
> - type punning described by wikipedia


Some information in this wikipedia page is wrong. More
specifically, the reference about using union types uses a
non-normative statement from Annex J, which is misleading about
what exactly when unspecified values come into play. The actual
normative text is in 6.2.6.1p6:

When a value is stored in an object of structure or union
type, including in a member object, the bytes of the object
representation that correspond to any padding bytes take
unspecified values.

So it's only reading another union member _larger_ than the last
one written that unspecified values enter the picture. Reading a
union member that is smaller or the same size as the last one
written must re-interpret the previously written bytes under the
read member's type. Near the bottom of the Wikipedia page there is
a link for DR 257 -- if someone had followed up reading this DR and
the other DR's it mentions, they would see the text (now present in
a footnote in n1256) saying that this re-interpretation (and also
explaining it as "type punning") is what will happen. Maybe I
should put a note on the Wikipedia page about that and see if
anyone follows up on it... (No promises!)

> - the 'strict-aliasing' option in GCC man page


I'll have more to say about that in a moment.

> The Berkeley sockets interface, which is given as example of type
> punning by: http://en.wikipedia.org/wiki/Type_punning, relies on
> undefined behaviours:
>
> struct sockaddr_in sa = {0};
> [...]
> bind(sockfd, (struct sockaddr *)&sa, sizeof sa);


The discussion of this on the Wikipedia page is somewhat shallow.
In fact it's impossible to tell, based on information in the
Wikipedia page, if there is undefined behavior or not (not counting
the question of alignment of the two structs). Certainly the
'bind()' function could be written in a way so that there is
no undefined behavior. The reason is, casting one pointer type
to another (assuming no alignment problems) isn't by itself
undefined behavior; the question is what happens with that other
pointer, /and that isn't shown in the example/. It's quite
straightforward to write 'bind()' so that no undefined behavior
will occur. So the example is, IMO, misleadingly incomplete. Or
maybe both misleading and incomplete.


> Regarding the 'strict-aliasing' rule, as you said before (and you
> proved me), 'strict aliasing rules' and 'standard aliasing rules' (the
> one define by the C standards) are the same.


I have since revised my opinion on this. I don't know whether
gcc's -fstrict-aliasing is meant actually to invalidate some cases
of effective type rules, but I believe it does mean at least to
push into some gray areas of the effective type rules. In other
words there are some plausible interpretations of what effective
type rules require that using -fstrict-aliasing will fail to meet
in some cases. This makes it all the more important that exactly
what rules are followed by -fstrict-aliasing be documented precisely
so people know just what to expect.


> Passing '-fno-strict-aliasing' to GCC allows to generate no conformant
> programs because it seems from the man page that GCC allows more
> aliasing cases than the ones defined by the standards. Which ones
> exactly, I don't know. Futhermore 'strict-aliasing' option is either
> ON or OFF depending on the optimisation level. So passing '-O0' to GCC
> can generate a program which behaves differently when compiling with '-
> O2'.


Actually it's the other way around. Using -fno-strict-aliasing
means _more_ cases alias, which would bring the compiler more
into conformance, not less. Using -fstrict-aliasing means some
cases that would satisfy the more tolerant aliasing rules now do
not, so those cases would be assumed to be independent, which means
more code motion, and a greater chance for non-conformance.

The point about strict-aliasing depending on optimization level
is a good one. On hearing about it I'm not really surprised,
but it's exactly the sort of thing that might bite someone if
they weren't tuned in to the possibility. Yow!
 
Reply With Quote
 
 
 
 
Francis Moreau
Guest
Posts: n/a
 
      07-29-2010
On Jul 29, 7:01*am, Tim Rentsch <(E-Mail Removed)> wrote:
> Francis Moreau <(E-Mail Removed)> writes:


[snip]

>
> > What it doesn't anymore is the description of:

>
> > * *- type punning described by wikipedia

>
> Some information in this wikipedia page is wrong. *More
> specifically, the reference about using union types uses a
> non-normative statement from Annex J, which is misleading about
> what exactly when unspecified values come into play. *The actual
> normative text is in 6.2.6.1p6:
>
> * * When a value is stored in an object of structure or union
> * * type, including in a member object, the bytes of the object
> * * representation that correspond to any padding bytes take
> * * unspecified values.
>
> So it's only reading another union member _larger_ than the last
> one written that unspecified values enter the picture. *Reading a
> union member that is smaller or the same size as the last one
> written must re-interpret the previously written bytes under the
> read member's type. *Near the bottom of the Wikipedia page there is
> a link for DR 257 -- if someone had followed up reading this DR and
> the other DR's it mentions, they would see the text (now present in
> a footnote in n1256) saying that this re-interpretation (and also
> explaining it as "type punning") is what will happen. *Maybe I
> should put a note on the Wikipedia page about that and see if
> anyone follows up on it... *(No promises!)
>
> > * *- the 'strict-aliasing' option in GCC man page

>
> I'll have more to say about that in a moment.
>
> > The Berkeley sockets interface, which is given as example of type
> > punning by:http://en.wikipedia.org/wiki/Type_punning, relies on
> > undefined behaviours:

>
> > * *struct sockaddr_in sa = {0};
> > * *[...]
> > * *bind(sockfd, (struct sockaddr *)&sa, sizeof sa);

>
> The discussion of this on the Wikipedia page is somewhat shallow.
> In fact it's impossible to tell, based on information in the
> Wikipedia page, if there is undefined behavior or not (not counting
> the question of alignment of the two structs).


BTW, one question I'm wondering is: does a structure have the same
alignment requirement as its first member (since there's no padding at
the beginning of the structure) ?

> *Certainly the
> 'bind()' function could be written in a way so that there is
> no undefined behavior.


Well, actually looking at it more closely, there's no alias issue at
all as long as 'bind()' is not implemented as a macro.

> *The reason is, casting one pointer type
> to another (assuming no alignment problems) isn't by itself
> undefined behavior; *the question is what happens with that other
> pointer, /and that isn't shown in the example/. *It's quite
> straightforward to write 'bind()' so that no undefined behavior
> will occur.


Let's assume that 'bind()' will dereference the passed pointer...

> *So the example is, IMO, misleadingly incomplete. *Or
> maybe both misleading and incomplete.


Yes but you told me:

"""
The general principle is that it's illegal to access members of one
kind of struct by means of member access (ie, using '.' or '->') using
a designator that is a different kind of struct. This principle does
have an exception, and that exception is spelled out specifically in
6.5.2.3p5:
"""

and in the other hand, wikipedia claims:

"""
The Berkeley sockets library fundamentally relies on the fact that in
C, a pointer to struct sockaddr_in is freely convertible to a pointer
to struct sockaddr; and, in addition, that the two structure types
share the same memory layout. Therefore, a reference to the structure
field my_addr->sin_family (where my_addr is of type struct sockaddr*)
will actually refer to the field sa.sin_family (where sa is of type
struct sockaddr_in).
"""

and as I assumed before, it's most likely that 'bind()' will
dereference 'my_addr', hence whatever the implementation of 'bind()',
it will invoke undefined behavior.

>
> > Regarding the 'strict-aliasing' rule, as you said before (and you
> > proved me), 'strict aliasing rules' and 'standard aliasing rules' (the
> > one define by the C standards) are the same.

>
> I have since revised my opinion on this.


Interesting,

> *I don't know whether
> gcc's -fstrict-aliasing is meant actually to invalidate some cases
> of effective type rules, but I believe it does mean at least to
> push into some gray areas of the effective type rules. *In other
> words there are some plausible interpretations of what effective
> type rules require that using -fstrict-aliasing will fail to meet
> in some cases. *This makes it all the more important that exactly
> what rules are followed by -fstrict-aliasing be documented precisely
> so people know just what to expect.


Which newsgroup/mailing-list is best for asking clarifications, in
your opinion ?

>
> > Passing '-fno-strict-aliasing' to GCC allows to generate no conformant
> > programs because it seems from the man page that GCC allows more
> > aliasing cases than the ones defined by the standards. Which ones
> > exactly, I don't know. Futhermore 'strict-aliasing' option is either
> > ON or OFF depending on the optimisation level. So passing '-O0' to GCC
> > can generate a program which behaves differently when compiling with '-
> > O2'.

>
> Actually it's the other way around. *Using -fno-strict-aliasing
> means _more_ cases alias, which would bring the compiler more
> into conformance, not less.


So you mean that these more cases alias have a defined behavior
according to the standard, right ?

> *Using -fstrict-aliasing means some
> cases that would satisfy the more tolerant aliasing rules now do
> not, so those cases would be assumed to be independent, which means
> more code motion, and a greater chance for non-conformance.
>


Well, that's what I thought before we started our discussion.

I tried to find alias cases which would be only allowed by '-fno-
strict-aliasing' but you proved me that all of them have undefined
behavior.
 
Reply With Quote
 
Tim Rentsch
Guest
Posts: n/a
 
      07-29-2010
Francis Moreau <(E-Mail Removed)> writes:

> On Jul 29, 7:01 am, Tim Rentsch <(E-Mail Removed)> wrote:
>> Francis Moreau <(E-Mail Removed)> writes:

>
> [snip]
>
>>
>> > What it doesn't anymore is the description of:

>>
>> > - type punning described by wikipedia

>>
>> Some information in this wikipedia page is wrong. More
>> specifically, the reference about using union types uses a
>> non-normative statement from Annex J, which is misleading about
>> what exactly when unspecified values come into play. The actual
>> normative text is in 6.2.6.1p6:
>>
>> When a value is stored in an object of structure or union
>> type, including in a member object, the bytes of the object
>> representation that correspond to any padding bytes take
>> unspecified values.
>>
>> So it's only reading another union member _larger_ than the last
>> one written that unspecified values enter the picture. Reading a
>> union member that is smaller or the same size as the last one
>> written must re-interpret the previously written bytes under the
>> read member's type. Near the bottom of the Wikipedia page there is
>> a link for DR 257 -- if someone had followed up reading this DR and
>> the other DR's it mentions, they would see the text (now present in
>> a footnote in n1256) saying that this re-interpretation (and also
>> explaining it as "type punning") is what will happen. Maybe I
>> should put a note on the Wikipedia page about that and see if
>> anyone follows up on it... (No promises!)
>>
>> > - the 'strict-aliasing' option in GCC man page

>>
>> I'll have more to say about that in a moment.
>>
>> > The Berkeley sockets interface, which is given as example of type
>> > punning by:http://en.wikipedia.org/wiki/Type_punning, relies on
>> > undefined behaviours:

>>
>> > struct sockaddr_in sa = {0};
>> > [...]
>> > bind(sockfd, (struct sockaddr *)&sa, sizeof sa);

>>
>> The discussion of this on the Wikipedia page is somewhat shallow.
>> In fact it's impossible to tell, based on information in the
>> Wikipedia page, if there is undefined behavior or not (not counting
>> the question of alignment of the two structs).

>
> BTW, one question I'm wondering is: does a structure have the same
> alignment requirement as its first member (since there's no padding at
> the beginning of the structure) ?


The alignment requirement of a structure must be at least as
restrictive as the alignment requirements of all of its members,
including the first. So the alignment of a structure will be
some integer multiple (probably 1, but it can be larger than 1)
of the GCD of the alignments of its members.


>> Certainly the
>> 'bind()' function could be written in a way so that there is
>> no undefined behavior.

>
> Well, actually looking at it more closely, there's no alias issue at
> all as long as 'bind()' is not implemented as a macro.


As far as the Standard is concerned, it doesn't matter whether
'bind()' is called or expanded as a macro; in either case if it
uses an undefined construct the result is undefined behavior.


>> The reason is, casting one pointer type
>> to another (assuming no alignment problems) isn't by itself
>> undefined behavior; the question is what happens with that other
>> pointer, /and that isn't shown in the example/. It's quite
>> straightforward to write 'bind()' so that no undefined behavior
>> will occur.

>
> Let's assume that 'bind()' will dereference the passed pointer...


Okay but there is more to the question... see below.

>> So the example is, IMO, misleadingly incomplete. Or
>> maybe both misleading and incomplete.

>
> Yes but you told me:
>
> """
> The general principle is that it's illegal to access members of one
> kind of struct by means of member access (ie, using '.' or '->') using
> a designator that is a different kind of struct. This principle does
> have an exception, and that exception is spelled out specifically in
> 6.5.2.3p5:
> """
>
> and in the other hand, wikipedia claims:
>
> """
> The Berkeley sockets library fundamentally relies on the fact that in
> C, a pointer to struct sockaddr_in is freely convertible to a pointer
> to struct sockaddr; and, in addition, that the two structure types
> share the same memory layout. Therefore, a reference to the structure
> field my_addr->sin_family (where my_addr is of type struct sockaddr*)
> will actually refer to the field sa.sin_family (where sa is of type
> struct sockaddr_in).
> """
>
> and as I assumed before, it's most likely that 'bind()' will
> dereference 'my_addr', hence whatever the implementation of 'bind()',
> it will invoke undefined behavior.


Consider two possible ways this could be done (both in bind):

my_addr->sin_family

or

(unsigned short *)my_addr /* or fancier, using offsetof, etc */

The first of these is (technically) undefined behavior -- one
kind of struct is being accessed as another. The second is not
undefined behavior, because the access is done directly which is
allowed under effective type rules (this assumes that the struct
alignment requirements are satisfied, and that the respective
offsets match up, but these conditions are implementation-defined
at worst; in fact the sin_family field is the first member in
these two structs so the offsets are guaranteed to match).

This is why I say that whether there is undefined behavior
depends on the definition of bind().


>> > Regarding the 'strict-aliasing' rule, as you said before (and you
>> > proved me), 'strict aliasing rules' and 'standard aliasing rules' (the
>> > one define by the C standards) are the same.

>>
>> I have since revised my opinion on this.

>
> Interesting,
>
>> I don't know whether
>> gcc's -fstrict-aliasing is meant actually to invalidate some cases
>> of effective type rules, but I believe it does mean at least to
>> push into some gray areas of the effective type rules. In other
>> words there are some plausible interpretations of what effective
>> type rules require that using -fstrict-aliasing will fail to meet
>> in some cases. This makes it all the more important that exactly
>> what rules are followed by -fstrict-aliasing be documented precisely
>> so people know just what to expect.

>
> Which newsgroup/mailing-list is best for asking clarifications, in
> your opinion ?


I don't really know. A good place to start is probably going
to google groups and searching for 'gcc' or 'gnu gcc' under
the 'search for a group' box. There may be a developers or
development-related mailing list for gcc, maybe that could
be found by a 'gcc mail list' query on google proper? I see
there is a 'gnu.gcc.help', that's probably a good group to
ask this question again. These are all just guesses, of course.


>> > Passing '-fno-strict-aliasing' to GCC allows to generate no conformant
>> > programs because it seems from the man page that GCC allows more
>> > aliasing cases than the ones defined by the standards. Which ones
>> > exactly, I don't know. Futhermore 'strict-aliasing' option is either
>> > ON or OFF depending on the optimisation level. So passing '-O0' to GCC
>> > can generate a program which behaves differently when compiling with '-
>> > O2'.

>>
>> Actually it's the other way around. Using -fno-strict-aliasing
>> means _more_ cases alias, which would bring the compiler more
>> into conformance, not less.

>
> So you mean that these more cases alias have a defined behavior
> according to the standard, right ?


Not exactly. The Standard requires some set of cases to alias
each other in a well-defined way. If the set of cases that
an implementation interprets as allowed aliasing is a superset
of the Standard-specified set, the implementation is conforming
as far as aliasing goes. But if the implementation-allowed set is
only a subset of the Standard-specified set, then there will be
some cases where behavior is defined as far as the Standard is
concerned, but the implementation might rearrange code in a way
that invalidates the Standard's specified semantics.

Getting back to your question, it isn't that more cases have
defined behavior, it's that more cases will behave in the
same way that a simple implementation would. The same behaviors
are defined in either cases; what changes is what behaviors
that the Standard considers non-defined will become predictable
under the more tolerant set of aliasing cases.


>> Using -fstrict-aliasing means some
>> cases that would satisfy the more tolerant aliasing rules now do
>> not, so those cases would be assumed to be independent, which means
>> more code motion, and a greater chance for non-conformance.
>>

>
> Well, that's what I thought before we started our discussion.
>
> I tried to find alias cases which would be only allowed by '-fno-
> strict-aliasing' but you proved me that all of them have undefined
> behavior.


The gcc man page gives an example involving unions. I'm not
sure if the Standard means for this case to be undefined
behavior or not, but it is at least a gray area. So that may
be a starting point for you.

I have a question for you. Is your interest in strict
aliasing just academic, or do you expect it to make a
difference in some actual development you're involved
with? There may be a better way to solve the underlying
problem if there is a more specific problem to solve.
 
Reply With Quote
 
Keith Thompson
Guest
Posts: n/a
 
      07-29-2010
Tim Rentsch <(E-Mail Removed)> writes:
[...]
> The alignment requirement of a structure must be at least as
> restrictive as the alignment requirements of all of its members,
> including the first. So the alignment of a structure will be
> some integer multiple (probably 1, but it can be larger than 1)
> of the GCD of the alignments of its members.

[...]

I think you meant LCM (least common multiple), not GCD (greatest
common divisor). For example, if two members of a struct have
alignments 4 and 6, the alignment of the struct is at least 12.

In practice, on most systems, all alignments are powers of two,
so the alignment of a struct is merely at least the largest of any
of its members' alignments.

--
Keith Thompson (The_Other_Keith) http://www.velocityreviews.com/forums/(E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Reply With Quote
 
Tim Rentsch
Guest
Posts: n/a
 
      07-29-2010
Keith Thompson <(E-Mail Removed)> writes:

> Tim Rentsch <(E-Mail Removed)> writes:
> [...]
>> The alignment requirement of a structure must be at least as
>> restrictive as the alignment requirements of all of its members,
>> including the first. So the alignment of a structure will be
>> some integer multiple (probably 1, but it can be larger than 1)
>> of the GCD of the alignments of its members.

> [...]
>
> I think you meant LCM (least common multiple), not GCD (greatest
> common divisor). For example, if two members of a struct have
> alignments 4 and 6, the alignment of the struct is at least 12.


Sorry, you are right of course. Thank you.
 
Reply With Quote
 
Francis Moreau
Guest
Posts: n/a
 
      08-04-2010
[ sorry for the late answer, I was off during these last days ]

On Jul 29, 6:08*pm, Tim Rentsch <(E-Mail Removed)> wrote:
> Francis Moreau <(E-Mail Removed)> writes:


[ snip]

> > BTW, one question I'm wondering is: does a structure have the same
> > alignment requirement as its first member (since there's no padding at
> > the beginning of the structure) ?

>
> The alignment requirement of a structure must be at least as
> restrictive as the alignment requirements of all of its members,
> including the first.


Could you point out the revelant section in the standard ?

> >> *Certainly the
> >> 'bind()' function could be written in a way so that there is
> >> no undefined behavior.

>
> > Well, actually looking at it more closely, there's no alias issue at
> > all as long as 'bind()' is not implemented as a macro.

>
> As far as the Standard is concerned, it doesn't matter whether
> 'bind()' is called or expanded as a macro; *in either case if it
> uses an undefined construct the result is undefined behavior.
>


I was thinking about the alias thing only, but you're right.

> >> *The reason is, casting one pointer type
> >> to another (assuming no alignment problems) isn't by itself
> >> undefined behavior; *the question is what happens with that other
> >> pointer, /and that isn't shown in the example/. *It's quite
> >> straightforward to write 'bind()' so that no undefined behavior
> >> will occur.

>
> > Let's assume that 'bind()' will dereference the passed pointer...

>
> Okay but there is more to the question... *see below.
>
>
>
>
>
> >> *So the example is, IMO, misleadingly incomplete. *Or
> >> maybe both misleading and incomplete.

>
> > Yes but you told me:

>
> > """
> > The general principle is that it's illegal to access members of one
> > kind of struct by means of member access (ie, using '.' or '->') using
> > a designator that is a different kind of struct. This principle does
> > have an exception, and that exception is spelled out specifically in
> > 6.5.2.3p5:
> > """

>
> > and in the other hand, wikipedia claims:

>
> > """
> > The Berkeley sockets library fundamentally relies on the fact that in
> > C, a pointer to struct sockaddr_in is freely convertible to a pointer
> > to struct sockaddr; and, in addition, that the two structure types
> > share the same memory layout. Therefore, a reference to the structure
> > field my_addr->sin_family (where my_addr is of type struct sockaddr*)
> > will actually refer to the field sa.sin_family (where sa is of type
> > struct sockaddr_in).
> > """

>
> > and as I assumed before, it's most likely that 'bind()' will
> > dereference 'my_addr', hence whatever the implementation of 'bind()',
> > it will invoke undefined behavior.

>
> Consider two possible ways this could be done (both in bind):
>
> * * my_addr->sin_family
>
> or
>
> * * (unsigned short *)my_addr */* or fancier, using offsetof, etc */
>
> The first of these is (technically) undefined behavior -- one
> kind of struct is being accessed as another. *The second is not
> undefined behavior, because the access is done directly which is
> allowed under effective type rules (this assumes that the struct
> alignment requirements are satisfied, and that the respective
> offsets match up, but these conditions are implementation-defined
> at worst; *in fact the sin_family field is the first member in
> these two structs so the offsets are guaranteed to match).


okay.

> >> > Passing '-fno-strict-aliasing' to GCC allows to generate no conformant
> >> > programs because it seems from the man page that GCC allows more
> >> > aliasing cases than the ones defined by the standards. Which ones
> >> > exactly, I don't know. Futhermore 'strict-aliasing' option is either
> >> > ON or OFF depending on the optimisation level. So passing '-O0' to GCC
> >> > can generate a program which behaves differently when compiling with '-
> >> > O2'.

>
> >> Actually it's the other way around. *Using -fno-strict-aliasing
> >> means _more_ cases alias, which would bring the compiler more
> >> into conformance, not less.

>
> > So you mean that these more cases alias have a defined behavior
> > according to the standard, right ?

>
> Not exactly. *The Standard requires some set of cases to alias
> each other in a well-defined way. *If the set of cases that
> an implementation interprets as allowed aliasing is a superset
> of the Standard-specified set, the implementation is conforming
> as far as aliasing goes. *But if the implementation-allowed set is
> only a subset of the Standard-specified set, then there will be
> some cases where behavior is defined as far as the Standard is
> concerned, but the implementation might rearrange code in a way
> that invalidates the Standard's specified semantics.
>
> Getting back to your question, it isn't that more cases have
> defined behavior, it's that more cases will behave in the
> same way that a simple implementation would. *The same behaviors
> are defined in either cases; *what changes is what behaviors
> that the Standard considers non-defined will become predictable
> under the more tolerant set of aliasing cases.


But does that bring the compiler more into conformance ?

>
> >> *Using -fstrict-aliasing means some
> >> cases that would satisfy the more tolerant aliasing rules now do
> >> not, so those cases would be assumed to be independent, which means
> >> more code motion, and a greater chance for non-conformance.

>
> > Well, that's what I thought before we started our discussion.

>
> > I tried to find alias cases which would be only allowed by '-fno-
> > strict-aliasing' but you proved me that all of them have undefined
> > behavior.

>
> The gcc man page gives an example involving unions. *I'm not
> sure if the Standard means for this case to be undefined
> behavior or not, but it is at least a gray area. *So that may
> be a starting point for you.


Why are you not sure ?

I used structures in my examples but I think the arguments you gave me
still hold for unions.

Regarding this example given by the man page:

int f() {
double d = 3.0;
return ((union a_union *) &d)->i;
}

is not defined by the standard since '&d' doesn't point to a union
object.

Regarding this one:

int f() {
union a_union t;
int *ip;
t.d = 3.0;
ip = &t.i;
return *ip;
}

well I don't know, but I'm pretty sure you can come up with arguments
that make it undefined behavior

>
> I have a question for you. *Is your interest in strict
> aliasing just academic, or do you expect it to make a
> difference in some actual development you're involved
> with? *There may be a better way to solve the underlying
> problem if there is a more specific problem to solve.


No my interest is just academic, I try to avoid using aliasing stuffs
as far I'm concerned.
 
Reply With Quote
 
Tim Rentsch
Guest
Posts: n/a
 
      08-04-2010
Francis Moreau <(E-Mail Removed)> writes:

> [ sorry for the late answer, I was off during these last days ]
>
> On Jul 29, 6:08 pm, Tim Rentsch <(E-Mail Removed)> wrote:
>> Francis Moreau <(E-Mail Removed)> writes:

>
> [ snip]
>>
>> The alignment requirement of a structure must be at least as
>> restrictive as the alignment requirements of all of its members,
>> including the first.

>
> Could you point out the revelant section in the standard ?


There is nothing in the Standard that says this directly. It's a
logical consequence of other requirements; more specifically,
since we can take the address of any non-bitfield member (of
either a struct or a union), the alignment requirements for
each member must translate into alignment requirements that
are at least as restrictive for the struct/union as a whole.
All the members' requirements must be met simultaneously,
hence my statement.


>> >> > Passing '-fno-strict-aliasing' to GCC allows to generate no conformant
>> >> > programs because it seems from the man page that GCC allows more
>> >> > aliasing cases than the ones defined by the standards. Which ones
>> >> > exactly, I don't know. Futhermore 'strict-aliasing' option is either
>> >> > ON or OFF depending on the optimisation level. So passing '-O0' to GCC
>> >> > can generate a program which behaves differently when compiling with '-
>> >> > O2'.

>>
>> >> Actually it's the other way around. Using -fno-strict-aliasing
>> >> means _more_ cases alias, which would bring the compiler more
>> >> into conformance, not less.

>>
>> > So you mean that these more cases alias have a defined behavior
>> > according to the standard, right ?

>>
>> Not exactly. The Standard requires some set of cases to alias
>> each other in a well-defined way. If the set of cases that
>> an implementation interprets as allowed aliasing is a superset
>> of the Standard-specified set, the implementation is conforming
>> as far as aliasing goes. But if the implementation-allowed set is
>> only a subset of the Standard-specified set, then there will be
>> some cases where behavior is defined as far as the Standard is
>> concerned, but the implementation might rearrange code in a way
>> that invalidates the Standard's specified semantics.
>>
>> Getting back to your question, it isn't that more cases have
>> defined behavior, it's that more cases will behave in the
>> same way that a simple implementation would. The same behaviors
>> are defined in either cases; what changes is what behaviors
>> that the Standard considers non-defined will become predictable
>> under the more tolerant set of aliasing cases.

>
> But does that bring the compiler more into conformance ?


Wider set of alias types allowed ==> more constraints on
code motion ==> code more in keeping with "naive" abstract
machine ==> smaller chance that code will violate expectations
of simple reading of standard. That isn't really "more into
conformance", since once it reaches the point of conforming
an implementation doesn't become "more conforming", but
I think you have the general idea. Make sense?


>> The gcc man page gives an example involving unions. I'm not
>> sure if the Standard means for this case to be undefined
>> behavior or not, but it is at least a gray area. So that may
>> be a starting point for you.

>
> Why are you not sure ?


Because text in the Standard unfortunately is not clear (or
arguably is definitely ambiguous) on this point.


> I used structures in my examples but I think the arguments you gave me
> still hold for unions.


Generally that's true.

> Regarding this example given by the man page:
>
> int f() {
> double d = 3.0;
> return ((union a_union *) &d)->i;
> }
>
> is not defined by the standard since '&d' doesn't point to a union
> object.


Right.

> Regarding this one:
>
> int f() {
> union a_union t;
> int *ip;
> t.d = 3.0;
> ip = &t.i;
> return *ip;
> }
>
> well I don't know, but I'm pretty sure you can come up with arguments
> that make it undefined behavior


I think what you mean is that someone else can come up with
arguments that it's undefined behavior (and I believe that's
right, someone else can). Speaking pragmatically I'm content
to just label it a gray area.


>> I have a question for you. Is your interest in strict
>> aliasing just academic, or do you expect it to make a
>> difference in some actual development you're involved
>> with? There may be a better way to solve the underlying
>> problem if there is a more specific problem to solve.

>
> No my interest is just academic, I try to avoid using aliasing stuffs
> as far I'm concerned.


Okay, well I hope my answers have been interesting as well as
informative.
 
Reply With Quote
 
Francis Moreau
Guest
Posts: n/a
 
      08-06-2010
On Aug 4, 8:04*pm, Tim Rentsch <(E-Mail Removed)> wrote:
> Francis Moreau <(E-Mail Removed)> writes:
> > [ sorry for the late answer, I was off during these last days ]

>
> > On Jul 29, 6:08 pm, Tim Rentsch <(E-Mail Removed)> wrote:
> >> Francis Moreau <(E-Mail Removed)> writes:

>
> > [ snip]

>
> >> The alignment requirement of a structure must be at least as
> >> restrictive as the alignment requirements of all of its members,
> >> including the first.

>
> > Could you point out the revelant section in the standard ?

>
> There is nothing in the Standard that says this directly. *It's a
> logical consequence of other requirements; *more specifically,
> since we can take the address of any non-bitfield member (of
> either a struct or a union), the alignment requirements for
> each member must translate into alignment requirements that
> are at least as restrictive for the struct/union as a whole.
> All the members' requirements must be met simultaneously,
> hence my statement.


Ok.

> >> >> > Passing '-fno-strict-aliasing' to GCC allows to generate no conformant
> >> >> > programs because it seems from the man page that GCC allows more
> >> >> > aliasing cases than the ones defined by the standards. Which ones
> >> >> > exactly, I don't know. Futhermore 'strict-aliasing' option is either
> >> >> > ON or OFF depending on the optimisation level. So passing '-O0' to GCC
> >> >> > can generate a program which behaves differently when compiling with '-
> >> >> > O2'.

>
> >> >> Actually it's the other way around. *Using -fno-strict-aliasing
> >> >> means _more_ cases alias, which would bring the compiler more
> >> >> into conformance, not less.

>
> >> > So you mean that these more cases alias have a defined behavior
> >> > according to the standard, right ?

>
> >> Not exactly. *The Standard requires some set of cases to alias
> >> each other in a well-defined way. *If the set of cases that
> >> an implementation interprets as allowed aliasing is a superset
> >> of the Standard-specified set, the implementation is conforming
> >> as far as aliasing goes. *But if the implementation-allowed set is
> >> only a subset of the Standard-specified set, then there will be
> >> some cases where behavior is defined as far as the Standard is
> >> concerned, but the implementation might rearrange code in a way
> >> that invalidates the Standard's specified semantics.

>
> >> Getting back to your question, it isn't that more cases have
> >> defined behavior, it's that more cases will behave in the
> >> same way that a simple implementation would. *The same behaviors
> >> are defined in either cases; *what changes is what behaviors
> >> that the Standard considers non-defined will become predictable
> >> under the more tolerant set of aliasing cases.

>
> > But does that bring the compiler more into conformance ?

>
> Wider set of alias types allowed ==> more constraints on
> code motion ==> code more in keeping with "naive" abstract
> machine ==> smaller chance that code will violate expectations
> of simple reading of standard. *That isn't really "more into
> conformance", since once it reaches the point of conforming
> an implementation doesn't become "more conforming", but
> I think you have the general idea. *Make sense?


yes, but I was confused by the "more into conformance" term you used
previously.

>
> >> The gcc man page gives an example involving unions. *I'm not
> >> sure if the Standard means for this case to be undefined
> >> behavior or not, but it is at least a gray area. *So that may
> >> be a starting point for you.

>
> > Why are you not sure ?

>
> Because text in the Standard unfortunately is not clear (or
> arguably is definitely ambiguous) on this point.
>
> > I used structures in my examples but I think the arguments you gave me
> > still hold for unions.

>
> Generally that's true.
>
> > Regarding this example given by the man page:

>
> > * * * int f() {
> > * * * * double d = 3.0;
> > * * * * return ((union a_union *) &d)->i;
> > * * * }

>
> > is not defined by the standard since '&d' doesn't point to a union
> > object.

>
> Right.
>
> > Regarding this one:

>
> > * * * int f() {
> > * * * * union a_union t;
> > * * * * int *ip;
> > * * * * t.d = 3.0;
> > * * * * ip = &t.i;
> > * * * * return *ip;
> > * * * }

>
> > well I don't know, but I'm pretty sure you can come up with arguments
> > that make it undefined behavior

>
> I think what you mean is that someone else can come up with
> arguments that it's undefined behavior (and I believe that's
> right, someone else can). *Speaking pragmatically I'm content
> to just label it a gray area.
>
> >> I have a question for you. *Is your interest in strict
> >> aliasing just academic, or do you expect it to make a
> >> difference in some actual development you're involved
> >> with? *There may be a better way to solve the underlying
> >> problem if there is a more specific problem to solve.

>
> > No my interest is just academic, I try to avoid using aliasing stuffs
> > as far I'm concerned.

>
> Okay, well I hope my answers have been interesting as well as
> informative.


as I said previously, definitively yes.

So I'll try to clarify which alias cases (that the standard considers
non defined) will become predictable with the '-fno-strict-aliasing'
switch.

But that's going to happen after a long rest

Thanks.
 
Reply With Quote
 
Tim Rentsch
Guest
Posts: n/a
 
      08-07-2010
Francis Moreau <(E-Mail Removed)> writes:

> On Aug 4, 8:04 pm, Tim Rentsch <(E-Mail Removed)> wrote:
>> Francis Moreau <(E-Mail Removed)> writes:
>> > [ sorry for the late answer, I was off during these last days ]

>>
>> > On Jul 29, 6:08 pm, Tim Rentsch <(E-Mail Removed)> wrote:

[snip]
>> >> >> Actually it's the other way around. Using -fno-strict-aliasing
>> >> >> means _more_ cases alias, which would bring the compiler more
>> >> >> into conformance, not less.

>>
>> >> > So you mean that these more cases alias have a defined behavior
>> >> > according to the standard, right ?

>>
>> >> Not exactly. The Standard requires some set of cases to alias
>> >> each other in a well-defined way. If the set of cases that
>> >> an implementation interprets as allowed aliasing is a superset
>> >> of the Standard-specified set, the implementation is conforming
>> >> as far as aliasing goes. But if the implementation-allowed set is
>> >> only a subset of the Standard-specified set, then there will be
>> >> some cases where behavior is defined as far as the Standard is
>> >> concerned, but the implementation might rearrange code in a way
>> >> that invalidates the Standard's specified semantics.

>>
>> >> Getting back to your question, it isn't that more cases have
>> >> defined behavior, it's that more cases will behave in the
>> >> same way that a simple implementation would. The same behaviors
>> >> are defined in either cases; what changes is what behaviors
>> >> that the Standard considers non-defined will become predictable
>> >> under the more tolerant set of aliasing cases.

>>
>> > But does that bring the compiler more into conformance ?

>>
>> Wider set of alias types allowed ==> more constraints on
>> code motion ==> code more in keeping with "naive" abstract
>> machine ==> smaller chance that code will violate expectations
>> of simple reading of standard. That isn't really "more into
>> conformance", since once it reaches the point of conforming
>> an implementation doesn't become "more conforming", but
>> I think you have the general idea. Make sense?

>
> yes, but I was confused by the "more into conformance" term you used
> previously.


Not the best choice of wording on my part. Perhaps "further
in the direction of being conforming" would be better, or
"more directly aligned with semantics on the abstract machine".
Oh well, at least we finally got there.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Binary tree search vs Binary search Bogdan C Programming 22 10-21-2010 09:46 PM
Hi, I want to implement a General Tree Data structure (Not Binary Tree ) which have more than 2 sub nodes? sharan C Programming 2 10-31-2007 02:58 AM
Hi, I want to implement a General Tree Data structure (Not Binary Tree ) which have more than 2 sub nodes? sharan C Programming 1 10-30-2007 11:01 PM
Hi, I want to implement a General Tree Data structure (Not Binary Tree ) which have more than 2 sub nodes? sharan C Programming 4 10-30-2007 08:21 PM
B tree, B+ tree and B* tree Stub C Programming 3 11-12-2003 01:51 PM



Advertisments