
# Can *common* struct-members of 2 different struct-types, that are the same for the first common members, be accessed via pointer cast to either struct-type?

John Reye

 05-02-2012
Assume identical common top struct members:

struct a {
    int i1;
    char c1;
    short sa1[3];
};

struct b {
    int i2;
    char c2;
    short sa2[3];

    int differ_here;
};

struct a tmp;

Are the following two lines always equivalent (as in: yielding the same
lvalue) and allowed?

tmp.c1
((struct b*)&tmp)->c1

Thanks.
- John.

James Kuyper

 05-02-2012
On 05/02/2012 12:06 PM, John Reye wrote:
> Assume identical common top struct members:
>
> struct a {
> int i1;
> char c1;
> short sa1[3];
> };
>
> struct b {
> int i2;
> char c2;
> short sa2[3];
>
> int differ_here;
> };
>
>
> struct a tmp;
>
>
>
> Are the following 2 line always equivalent (as in: yielding the same
> lvalue) and allowed:
>
> tmp.c1
> ((struct b*)&tmp)->c1

No, the behavior is undefined. It's not because the members might be at
different locations in the two structs; that's possible but unlikely.
The real reason is the anti-aliasing rules (6.5p7). Because those rules
make the behavior of such code undefined, the implementation is not
obligated to consider the possibility that an lvalue referring to a
"struct a" object refers to the same object as one that refers to a
"struct b" object. As a result, when implementing code such as the
following:

tmp.c1 = 1;
printf("%d\n", ((struct b*)&tmp)->c1);

An implementation is not required to notice that the printf() is
referring to the same object as the assignment statement. As a result,
it could, for instance, defer writing the new value to tmp.c1 until
after executing the printf() call. That's pretty unlikely in this simple
case, but gets more likely in more complicated code when aggressive
optimization is turned on.
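
A minimal sketch of where this can bite, not taken from the original question. The names store_both and demo_distinct are hypothetical, and the int members are used so that the character-type exception in 6.5p7 does not muddy the picture. Under the reading given above, an optimizer may assume the two pointers never refer to the same object:

```c
struct a { int i1; char c1; short sa1[3]; };
struct b { int i2; char c2; short sa2[3]; int differ_here; };

/* Hypothetical helper. A compiler that relies on the aliasing rules may
 * assume the store through pb cannot change pa->i1, and so may return
 * the constant 1 without reloading pa->i1 from memory. */
int store_both(struct a *pa, struct b *pb)
{
    pa->i1 = 1;
    pb->i2 = 2;
    return pa->i1;
}

/* Defined use: the two pointers refer to distinct objects. */
int demo_distinct(void)
{
    struct a x = {0};
    struct b y = {0};
    return store_both(&x, &y);
}
```

Calling store_both(&tmp, (struct b *)&tmp) is the problematic case: the caller expects 2, but nothing obliges the implementation to deliver it.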

There's a special exception that allows you to access the "common
initial sequence" of any two struct types, using either struct type,
when they are both members of the same union (6.5.2.3p6):

union ab
{
    struct a one;
    struct b two;
} pair;
pair.one.c1 = 1;
printf("%d\n", pair.two.c1);
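
For reference, a self-contained sketch of the union form. The function name read_via_other_member is hypothetical; note that the member of struct b is named c2, so that is the name used for the read:

```c
struct a { int i1; char c1; short sa1[3]; };
struct b { int i2; char c2; short sa2[3]; int differ_here; };

/* The union declaration is visible at the point of access, which is
 * what the common-initial-sequence rule (6.5.2.3p6) requires. */
union ab { struct a one; struct b two; };

int read_via_other_member(void)
{
    union ab pair;
    pair.one.c1 = 1;       /* write through struct a */
    return pair.two.c2;    /* read the corresponding member of struct b */
}
```

Printing the result with printf("%d\n", read_via_other_member()); gives 1.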

John Reye

 05-02-2012
On May 2, 6:36 pm, James Kuyper wrote:
> On 05/02/2012 12:06 PM, John Reye wrote:
>
> > Assume identical common top struct members:
> >
> > struct a {
> >   int i1;
> >   char c1;
> >   short sa1[3];
> > };
> >
> > struct b {
> >   int i2;
> >   char c2;
> >   short sa2[3];
> >
> >   int differ_here;
> > };
> >
> > struct a tmp;
> >
> > Are the following 2 line always equivalent (as in: yielding the same
> > lvalue) and allowed:
> >
> > tmp.c1
> > ((struct b*)&tmp)->c1
>
> No, the behavior is undefined. It's not because the members might be a
> different locations in the two structs; that's possible but unlikely.
> The real reason is the anti-aliasing rules (6.5p7). Because those rules
> make the behavior of such code undefined, the implementation is not
> obligated to consider the possibility that an lvalue referring to a
> "struct a" object refers to the same object as one that refers to a
> "struct b" object. As a result, when implementing code such as the
> following:
>
>         tmp.c1 = 1;
>         printf("%d\n", ((struct b*)&tmp)->c1);
>
> An implementation is not required to notice that the printf() is
> referring to the same object as the assignment statement. As a result,
> it could, for instance, defer the writing the new value to tmp.c1 until
> after executing the printf() call. That's pretty unlikely in this simple
> case, but gets more likely in more complicated code when aggressive
> optimization is turned on.
>
> There's a special exception that allows you to access the "common
> initial sequence" of any two struct types, using either struct type,
> when they are both members of the same union (6.5.2.3p6):
>
>         union ab
>         {
>                 struct a one;
>                 struct b two;
>         } pair;
>         pair.one.c1 = 1
>         printf("%d\n", pair.two.c1);

Ah thanks James. Your replies are always much appreciated.

By the way, does the union trick also work when the first part is
common but the two diverge afterwards?
Example:

struct a {
    int i1;
    char c1;
    short sa1[3];

    char u;
};

struct b {
    int i2;
    char c2;
    short sa2[3];

    int differ_here;
};

union ab
{
    struct a one;
    struct b two;
} pair;

sizeof(union ab) will be at least the size of the largest member, of course.
So some union members (e.g. the shorter struct) do not give me
access to the whole union (as described by the largest struct). Right?

James Kuyper

 05-02-2012
On 05/02/2012 01:04 PM, John Reye wrote:
> On May 2, 6:36 pm, James Kuyper wrote:
>> On 05/02/2012 12:06 PM, John Reye wrote:
>>
>>> Assume identical common top struct members:
>>>
>>> struct a {
>>> int i1;
>>> char c1;
>>> short sa1[3];
>>> };
>>>
>>> struct b {
>>> int i2;
>>> char c2;
>>> short sa2[3];
>>>
>>> int differ_here;
>>> };
>>>
>>> struct a tmp;
>>>
>>> Are the following 2 line always equivalent (as in: yielding the same
>>> lvalue) and allowed:
>>>
>>> tmp.c1
>>> ((struct b*)&tmp)->c1

Correction: that should be ->c2, presumably.

>> No, the behavior is undefined. It's not because the members might be a
>> different locations in the two structs; that's possible but unlikely.
>> The real reason is the anti-aliasing rules (6.5p7). Because those rules
>> make the behavior of such code undefined, the implementation is not
>> obligated to consider the possibility that an lvalue referring to a
>> "struct a" object refers to the same object as one that refers to a
>> "struct b" object. As a result, when implementing code such as the
>> following:
>>
>> tmp.c1 = 1;
>> printf("%d\n", ((struct b*)&tmp)->c1);

Same correction here.

>> An implementation is not required to notice that the printf() is
>> referring to the same object as the assignment statement. As a result,
>> it could, for instance, defer the writing the new value to tmp.c1 until
>> after executing the printf() call. That's pretty unlikely in this simple
>> case, but gets more likely in more complicated code when aggressive
>> optimization is turned on.
>>
>> There's a special exception that allows you to access the "common
>> initial sequence" of any two struct types, using either struct type,
>> when they are both members of the same union (6.5.2.3p6):
>>
>> union ab
>> {
>> struct a one;
>> struct b two;
>> } pair;
>> pair.one.c1 = 1
>> printf("%d\n", pair.two.c1);

Correction: that should have been pair.two.c2.

> Ah thanks James. Your replies are always much appreciated.
>
> By the way does the union trick also work, when the first part is
> common, but the 2 diverge afterwards.

It works for the entire initial common sequence, no matter how many
additional members either struct type has after the common part.
Corresponding members of the common sequence must have compatible types;
for bit-fields, they must also have the same width.
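
As a rough illustration of that rule (all names here are hypothetical, not from the thread): the two structs below share the members kind, flags and len, with matching types and bit-field widths, and then diverge.

```c
struct p { unsigned kind : 4; unsigned flags : 4; int len; double value; };
struct q { unsigned kind : 4; unsigned flags : 4; int len; const char *name; };

/* Common initial sequence: kind, flags, len. It ends at the first pair
 * of members whose types are not compatible (double vs. const char *).
 * Had struct q declared "unsigned flags : 3", the sequence would end
 * after kind, because the bit-field widths would differ. */
union pq { struct p p; struct q q; };

int read_common_part(void)
{
    union pq u;
    u.p.kind = 9;
    u.p.flags = 3;
    u.p.len = 42;
    /* Inspect the common part through the other struct type. */
    return u.q.kind * 1000 + u.q.flags * 100 + u.q.len;
}
```

Here read_common_part() returns 9342. Reading u.q.name after writing u.p.value would fall outside the common initial sequence and gets no such guarantee.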

> Example:
>
> struct a {
> int i1;
> char c1;
> short sa1[3];
>
> char u;
> };
>
> struct b {
> int i2;
> char c2;
> short sa2[3];
>
> int differ_here;
> };
>
>
>
> union ab
> {
> struct a one;
> struct b two;
> } pair;
>
>
> The sizeof(union ab) will be the maximum value, of course.
> So some union access-members (e.g. short structs), do not give me
> access to the whole union (as described by the maximum struct). Right?

Correct.
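
A small sketch of the size relationship, using the structs from the question. The function names are hypothetical, and the actual byte counts depend on the implementation, so none are claimed here:

```c
#include <stddef.h>

struct a { int i1; char c1; short sa1[3]; char u; };
struct b { int i2; char c2; short sa2[3]; int differ_here; };
union ab { struct a one; struct b two; };

/* Returns 1 if the union is at least as large as each of its members.
 * A union must be able to hold any member, so this always holds. */
int union_covers_members(void)
{
    return sizeof(union ab) >= sizeof(struct a)
        && sizeof(union ab) >= sizeof(struct b);
}

/* Bytes of the union that lie beyond the end of member "one".
 * Accessing pair.one never reaches these bytes. */
size_t bytes_beyond_one(void)
{
    return sizeof(union ab) - sizeof(struct a);
}
```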

Barry Schwarz

 05-02-2012
On Wed, 2 May 2012 09:06:22 -0700 (PDT), John Reye wrote:

>Assume identical common top struct members:
>
>struct a {
> int i1;
> char c1;
> short sa1[3];
>};
>
>struct b {
> int i2;
> char c2;
> short sa2[3];
>
> int differ_here;
>};
>
>
>struct a tmp;
>
>
>
>Are the following 2 line always equivalent (as in: yielding the same
>lvalue) and allowed:
>
>tmp.c1
>((struct b*)&tmp)->c1

Since struct b does not contain a member c1, the second line should
produce a diagnostic.

I appear to be in the minority. If you change the second line to
((struct b*)&tmp)->c2
and if the two structures are guaranteed to have the same alignment,
then I believe the requirement in 6.5.2.3-5 (which technically only
applies if the two structures are members of a union) would force the
compiler to generate the appropriate code to yield the same lvalue.
This would probably work everywhere except the DS9000.

--
Remove del for email

James Kuyper

 05-02-2012
On 05/02/2012 03:35 PM, Barry Schwarz wrote:
> On Wed, 2 May 2012 09:06:22 -0700 (PDT), John Reye wrote:
>
>> Assume identical common top struct members:
>>
>> struct a {
>> int i1;
>> char c1;
>> short sa1[3];
>> };
>>
>> struct b {
>> int i2;
>> char c2;
>> short sa2[3];
>>
>> int differ_here;
>> };
>>
>>
>> struct a tmp;
>>
>>
>>
>> Are the following 2 line always equivalent (as in: yielding the same
>> lvalue) and allowed:
>>
>> tmp.c1
>> ((struct b*)&tmp)->c1
>
> Since struct b does not contain a member c1, the second line should
> produce a diagnostic.
>
> I appear to be in the minority. If you change the second line to
> ((struct b*)&tmp)->c2
> and if the two structures are guaranteed to have the same alignment,
> then I believe the requirement in 6.5.2.3-5 (which technically only
> applies if the two structures are members of a union) would force the
> compiler to generate the appropriate code to yield the same lvalue.

It might generate a retrieval using the same offset from the base of the
struct, but the key issue is whether the value it retrieves from that
location is the one that would otherwise be considered the "current" value.

> This would probably work everywhere except the DS9000.

So the DS9000 is the only platform that would aggressively optimize
based upon the fact that a "struct a" lvalue could never, with defined
behavior, refer to the same object as a "struct b" lvalue?
I'd thought that the best modern optimizers were more aggressive than
that, at least at their highest levels.

John Reye

 05-03-2012
On May 2, 6:36 pm, James Kuyper wrote:
> the implementation is not
> obligated to consider the possibility that an lvalue referring to a
> "struct a" object refers to the same object as one that refers to a
> "struct b" object. As a result, when implementing code such as the
> following:
>
>         tmp.c1 = 1;
>         printf("%d\n", ((struct b*)&tmp)->c1);
>
> An implementation is not required to notice that the printf() is
> referring to the same object as the assignment statement.

Hmmm... I think that would be one heck of a rubbish compiler (or more
precisely *optimizing* compiler)!
If the standard allows that kind of stuff, then it simply is not
bullet-proof enough.

Because tmp occurs in both lines. Every compiler should notice that!

Even if I "obfuscate" like this:
tmp.c1 = 1;
struct b *bp = (struct b*)&tmp;
printf("%d\n", bp->c2);

I'd expect any compiler that gets it wrong, to be complete rubbish.
Why? Because it's an optimizer BUG.
Why?
Because any simple compiler that does not optimize... gets it right!
And if any simple compiler gets it right, then any optimization must
guarantee to get it right as well.

I mean: if the C standard allows one to create optimizers that result
in such ... ummm... surprises (read: "rubbish"), then the standard is
faulty in my eyes.

John Reye

 05-03-2012
In fact...

then any use of pointers at all would fail.
char a;
char *p = &a;

a = 1;
*p = 2;
printf("%d", a);

This will never print 1, unless the compiler is buggy.

In the same spirit... any compiler that gets the following wrong is
buggy:

>
> tmp.c1 = 1;
> printf("%d\n", ((struct b*)&tmp)->c2);
>
> An implementation is not required to notice that the printf() is
> referring to the same object as the assignment statement. As a result,
> it could, for instance, defer the writing the new value to tmp.c1 until
> after executing the printf() call. That's pretty unlikely in this simple
> case, but gets more likely in more complicated code when aggressive
> optimization is turned on.

It's not complicated. Rather... I suspect it would be an
optimizer-compiler bug.
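
For contrast, a sketch of why the plain pointer case is different, assuming p is meant to point at a. The function name same_type_alias is hypothetical. Here the lvalue *p has the same type as the object it refers to, so the compiler has to treat it as a possible alias of a:

```c
/* *p is an lvalue of type char referring to a char object, so the
 * compiler must honor the store through p when a is read afterwards. */
int same_type_alias(void)
{
    char a;
    char *p = &a;

    a = 1;
    *p = 2;
    return a;
}
```

This returns 2 on any conforming implementation. The struct a versus struct b case gets no such guarantee, because there the two lvalues have different, incompatible struct types.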

John Reye

 05-03-2012
Ahh on the other hand, I might have gotten carried away here.

> tmp.c1 = 1;
> printf("%d\n", ((struct b*)&tmp)->c1);

I would always avoid something like this (not because of aliasing
rules, and compiler optimization), but because of the reasons given by
the mysterious 2nd poster (copied below) and because it's completely
unnecessary!

There is simply no need to do something like this. One can always
introduce an inner struct for the common part.
So in my above arguments, I forgot that I was arguing for something
that is very bad style anyway... to cast from one struct type
to a different type like that. So what I said about "rubbish
compilers" is probably completely out of context. Sorry.

mysterious 2nd poster wrote:
> The guarantee only applies to the first member. The work around is to
> make each first field itself the same struct:
>
> struct common {
> int i; char c;
> };
>
> > struct a {
>
> struct common x1;
>
> > short sa1[3];
> > };
>
> > struct b {
>
> struct common x2;
>
> > short sa2[3];
> >
> > int differ_here;
> > };
>
> > tmp.x1.c
> > ((struct b*)&tmp)->x2.c
>
> &struct a = &struct a.x1
> &struct b = &struct b.x2
> so if &struct a = &struct b
> &struct a.x1 = &struct b.x2
> and typeof struct a.x1 = typeof struct b.x2 = typeof struct common
> therefore for each f in struct common
> &struct a.x1.f = &struct b.x2.f
>
> --
> My name Indigo Montoya. | R'lyeh 38o57'6.5''S 102o51'16''E.
> You flamed my father. | I'm whoever you want me to be.
> Prepare to be spanked. | Annoying Usenet one post at a time.
> Stop posting that! | At least I can stay in character.

John Reye

 05-03-2012
(where did my message go? OK I'll repost)

Ah I think I got carried away above. Sorry.

The main problem was that I was putting down possible optimizing
compilers, while the reality is that this statement itself is bloody
bad and should be avoided at all costs.

struct a tmp;
((struct b*)&tmp)->c2

Reason: one casts from one type to a completely different, unrelated
type.
Rather, one should use a common inner struct for the common parts
within structs. That's a simple way of not getting bitten.
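
A minimal sketch of the inner-struct approach, with hypothetical names (get_c, demo_common) and made-up values. It relies only on the guarantee that a pointer to a struct, suitably converted, points to its first member; it never casts between struct a and struct b:

```c
struct common { int i; char c; };

struct a { struct common x1; short sa1[3]; };
struct b { struct common x2; short sa2[3]; int differ_here; };

/* Hypothetical helper that only knows about the shared part. */
int get_c(const struct common *p)
{
    return p->c;
}

int demo_common(void)
{
    struct a ta = { .x1 = { .i = 10, .c = 1 } };
    struct b tb = { .x2 = { .i = 20, .c = 2 } };

    /* Either name the first member directly, or convert the struct
     * pointer to a pointer to its first member's type. */
    int from_a = get_c(&ta.x1);
    int from_b = get_c((const struct common *)&tb);

    return from_a * 10 + from_b;
}
```

Here demo_common() returns 12. Code written against struct common works for both outer types without relying on any aliasing assumption between them.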