# Effective types of union members

Francis Moreau

 11-09-2010
Hello,

6.5p6 gives a definition of effective type.

However, I can't find an answer to this question: what is the effective
type of a union's members?

For example:

union u {
int a;
float b;
};

What's the effective type of the 'u.a' and 'u.b' members?

I would say int and float respectively.

But after this:

u.b = 1.23;

is the effective type of 'u.i' still int?

Thanks
--
Francis

Mark Bluemel

 11-09-2010
On 11/09/2010 08:05 AM, Francis Moreau wrote:
> Hello,
>
> 6.5p6 gives a definition of effective type.
>
> However I can't find an answer to this question: What's the effective
> type of union's members.

What do you mean by "effective type"?

> For example:
>
> union u {
> int a;
> float b;
> };
>
> What's the effective type of 'u.a' and 'u.b' members ?

What are they declared as?

> I would say int and float respectively.

>
> But after this:
>
> u.b = 1.23;
>
> does the effective type of 'u.i' is still int ?

How would an assignment change a declaration?

Mark Bluemel

 11-09-2010
On 11/09/2010 09:02 AM, Mark Bluemel wrote:
> On 11/09/2010 08:05 AM, Francis Moreau wrote:
>> Hello,
>>
>> 6.5p6 gives a definition of effective type.
>>
>> However I can't find an answer to this question: What's the effective
>> type of union's members.

>
> What do you mean by "effective type"?

OK - I missed the reference to 6.5p6. But that says the effective type
of an object is the type it was declared with.

>
>> For example:
>>
>> union u {
>> int a;
>> float b;
>> };
>>
>> What's the effective type of 'u.a' and 'u.b' members ?

>
> What are they declared as?
>
>> I would say int and float respectively.

>
>
>>
>> But after this:
>>
>> u.b = 1.23;
>>
>> does the effective type of 'u.i' is still int ?

>
> How would an assignment change a declaration?

Ben Bacarisse

 11-09-2010
Francis Moreau <(E-Mail Removed)> writes:

> 6.5p6 gives a definition of effective type.
>
> However I can't find an answer to this question: What's the effective
> type of union's members.
>
> For example:
>
> union u {
> int a;
> float b;
> };
>
> What's the effective type of 'u.a' and 'u.b' members ?
>
> I would say int and float respectively.

So would I.

> But after this:
>
> u.b = 1.23;
>
> does the effective type of 'u.i' is still int ?

(you mean, presumably, u.a)

Yes. 6.5p6 seems clear on that point. Is there some wording there that
makes you think otherwise?

--
Ben.

Francis Moreau

 11-09-2010
Hello Ben,

Ben Bacarisse <(E-Mail Removed)> writes:

> Francis Moreau <(E-Mail Removed)> writes:
>
>> 6.5p6 gives a definition of effective type.
>>
>> However I can't find an answer to this question: What's the effective
>> type of union's members.
>>
>> For example:
>>
>> union u {
>> int a;
>> float b;
>> };
>>
>> What's the effective type of 'u.a' and 'u.b' members ?
>>
>> I would say int and float respectively.

>
> So would I.
>
>> But after this:
>>
>> u.b = 1.23;
>>
>> does the effective type of 'u.i' is still int ?

>
> (you mean, presumably, u.a)
>

Yes sorry for the typo.

> Yes. 6.5p6 seems clear on that point. Is there some wording there
> that makes you think otherwise?

Well a recent discussion I have on gcc-help mailing list about GCC's
to

Message-ID: <(E-Mail Removed)>

if you're interested.

Basically GCC's man page gives an example of code which might not work
as expected if '-fstrict-aliasing' switch is passed. Here's the piece of
code taken from my recent discussion:

> Looking again at the second example:
>
> union a_union {
>     int i;
>     double d;
> };
>
> int f() {
>     union a_union t;
>     int* ip;
>     t.d = 3.0;
>     ip = &t.i;
>     return *ip;
> }
>

We agreed that this invokes undefined behaviour, but I think the reasons
why are different.

My reason was because it's reinterpreting a double as an int which gives
undefined behaviour.

The other reason is I _think_ because of aliasing issue (again taken
from the discussion):

> could you tell me what the effective type of 't.i' object ?

int, if you can say that object exists at all: it does not have a stored
value. The stored value of t is a double with value 3.0 . You can
take its address and access it via that as "double" (or "char"), or you
can access it as the union it is. You can not access it as "int".

which actually doesn't answer the question.
--
Francis

Ben Bacarisse

 11-09-2010
Francis Moreau <(E-Mail Removed)> writes:

> Ben Bacarisse <(E-Mail Removed)> writes:

<snip>
>> Yes. 6.5p6 seems clear on that point. Is there some wording there
>> that makes you think otherwise?

>
> Well a recent discussion I have on gcc-help mailing list about GCC's
> to
>
> Message-ID: <(E-Mail Removed)>
>
> if you're interested.
>
> Basically GCC's man page gives an example of code which might not work
> as expected if '-fstrict-aliasing' switch is passed. Here's the piece of
> code taken from my recent discussion:
>
> > Looking again at the second example:
> >
> > union a_union {
> > int i;
> > double d;
> > };
> >
> > int f() {
> > union a_union t;
> > int* ip;
> > t.d = 3.0;
> > ip = &t.i;
> > return *ip;
> > }
> >
>
> We agreed that this invokes undefined behaviour, but I think the reasons
> why are different.
>
> My reason was because it's reinterpreting a double as an int which gives
> undefined behaviour.

That may or may not be undefined. If the implementation's int type has
no trap representations then all bit patterns are valid int values. In
fact the gcc manual has an example a couple of lines before this one of
re-interpreting the double as an int (by not using pointers) and it
declares that the fragment is not undefined. The gcc docs are well
placed to say this since the authors will know if there can be any trap
representations. I know this is not the key issue with this code fragment
but it's important: if the aliasing rules permit the above code, then it
won't be undefined (as far as gcc is concerned) due to the
re-interpretation.

> The other reason is I _think_ because of aliasing issue (again taken
> from the discussion):
>
> > could you tell me what the effective type of 't.i' object ?

>
> int, if you can say that object exists at all: it does not have a
> stored value. The stored value of t is a double with value 3.0 .
> You can take its address and access it via that as "double" (or
> "char"), or you can access it as the union it is. You can not
> access it as "int".
>
> which actually doesn't answer to the question.

I think the problem relates to what constitutes "the object" for the
purposes of 6.5 paragraphs 6-7.

Clearly the person you are talking to thinks that t.i might not be
considered "an object" at least not one that can be accessed via an
lvalue expression of type int at this point in the code. They consider
"the object" being accessed to be t.d since that is what is stored in
the union.

This does not conflict with the previous example in the manual which is:

| union a_union {
| int i;
| double d;
| };
|
| int f() {
| union a_union t;
| t.d = 3.0;
| return t.i;
| }
|
| The practice of reading from a different union member than the one
| most recently written to (called "type-punning") is common. Even
| with -fstrict-aliasing, type-punning is allowed, provided the
| memory is accessed through the union type. So, the code above will
| work as expected.

Here, if "the object" being accessed is t.d, then access via t.i is
valid since it is an access to "the object" (t.d) via a union that
includes, as one of its members, the effective type (double) of that
object. In the pointer case, the lvalue expression used (*ip) is not an
aggregate or union that includes double as one of its members, nor is it
an lvalue expression with a type compatible with that of "the object"
being accessed (double).

This is my best analysis of what is being claimed from what you present
here (I don't know how to access gmane via anything but the crudest
web-based interface and I have therefore not bothered to read the other
thread -- sorry).

The flip-side of the argument (which I don't hold) is that "the object"
being accessed in both examples is t.i and, since that has a declared
type (and hence effective type) of int, accessing it via t.i and *ip are
both fine.

Until I hear other arguments, I'm going with the gcc documentation
partly because they are smart people who put these examples there
after careful consideration, but mainly because the intent of the
standard seems to be that a union "holds" only one object at a time.

Another argument is this: the aliasing rules would hardly permit any
conclusions at all unless my interpretation of what is being said is
correct. A function taking

void g(int *ip, double *dp);

could not conclude that *ip and *dp don't alias storage because we could
pass

g(&t.i, &t.d);

--
Ben.

Francis Moreau

 11-10-2010
Hello Ben,

Ben Bacarisse <(E-Mail Removed)> writes:

> Francis Moreau <(E-Mail Removed)> writes:
>
>> Ben Bacarisse <(E-Mail Removed)> writes:

> <snip>
>>> Yes. 6.5p6 seems clear on that point. Is there some wording there
>>> that makes you think otherwise?

>>
>> Well a recent discussion I have on gcc-help mailing list about GCC's
>> to
>>
>> Message-ID: <(E-Mail Removed)>
>>
>> if you're interested.
>>
>> Basically GCC's man page gives an example of code which might not work
>> as expected if '-fstrict-aliasing' switch is passed. Here's the piece of
>> code taken from my recent discussion:
>>
>> > Looking again at the second example:
>> >
>> > union a_union {
>> > int i;
>> > double d;
>> > };
>> >
>> > int f() {
>> > union a_union t;
>> > int* ip;
>> > t.d = 3.0;
>> > ip = &t.i;
>> > return *ip;
>> > }
>> >
>>
>> We agreed that this invokes undefined behaviour, but I think the reasons
>> why are different.
>>
>> My reason was because it's reinterpreting a double as an int which gives
>> undefined behaviour.

>
> That may or may not be undefined. If the implementation's int type has
> no trap representations then all bit patterns are valid int values. In
> fact the gcc manual has an example a couple of lines before this one of
> re-interpreting the double as an int (by not using pointers) and it
> declares that the fragment is not undefined. The gcc doc are well
> placed to say this since the authors will know if there can be any trap
> representations. I know this not the key issue with this code fragment
> but it's important: if the aliasing rules permit the above code, then it
> won't be undefined (as far as gcc is concerned) due to the
> re-interpretation.

Ok, so such type-punning is implementation defined and in this case it's
defined, so only the aliasing rules are problematic.

>> The other reason is I _think_ because of aliasing issue (again taken
>> from the discussion):
>>
>> > could you tell me what the effective type of 't.i' object ?

>>
>> int, if you can say that object exists at all: it does not have a
>> stored value. The stored value of t is a double with value 3.0 .
>> You can take its address and access it via that as "double" (or
>> "char"), or you can access it as the union it is. You can not
>> access it as "int".
>>
>> which actually doesn't answer to the question.

>
> I think the problem relates to what constitutes "the object" for the
> purposes of 6.5 paragraphs 6-7.

Yes I think so.

Now, I would think that 't.i' and 't.d' are both accessing the same
object 't', whose effective type is union a_union.

> Clearly the person you are talking to thinks that t.i might not be
> considered "an object" at least not one that can be accessed via an
> lvalue expression of type int at this point in the code. They consider
> "the object" being accessed to be t.d since that is what is stored in
> the union.

Ok, I can buy this, but why doesn't the standard state this clearly?

This strong assumption is only made for being 'nice' with aliasing.

> This does not conflict with the previous example in the manual which is:
>
> | union a_union {
> | int i;
> | double d;
> | };
> |
> | int f() {
> | union a_union t;
> | t.d = 3.0;
> | return t.i;
> | }
> |
> | The practice of reading from a different union member than the one
> | most recently written to (called "type-punning") is common. Even
> | with -fstrict-aliasing, type-punning is allowed, provided the
> | memory is accessed through the union type. So, the code above will
> | work as expected.
>
> Here, if the "the object" being accessed is t.d, then access via t.i is
> valid since it is an access to "the object" (t.d) via a union that
> includes, as one of its members, the effective type (double) of that
> object.

ok this kind of type-punning (using a union and its members) is defined
by the standard so all is fine so far.

> In the pointer case, the lvalue expression used (*ip) is not an
> aggregate or union that includes double as one of its members, nor is
> a lvalue expression with a type compatible with that of the "the
> object" being accessed (double).

AFAIK, there's nothing in the standard that says that the effective type
of a union object can be changed according to its last stored value, is
there ?

I would say that the effective type of the object being accessed is
always 'union a_union'.

[me reading again, again and again 6.5p[67] ...]

But there is something very wrong here...

Suppose we have this:

union A { int m1; } a;
a.m1 = 1;

This is obviously something that should be defined, yet it simply doesn't
fit with 6.5p7:

Let's say that the effective type of the object being accessed, 'a',
through its member m1 is union A. (Actually, with your reasoning, I
can't say what the effective type would be, since the union object has
not been initialized yet.)

The type of the lvalue 'a.m1' is int.

1/ Is 'int' compatible with 'union A' => NO
2/ same with qualified version => NO
3/ same with signed/unsigned => NO
4/ same with signed/unsigned and qualified version of union A => NO
5/ Is 'int' an aggregate or union type => NO
6/ Is 'int' a character type => NO

So the effective type of 'a' object can't be union... and my reasoning
doesn't hold...

Therefore you're probably right but still 6.5p6 claims:

The effective type of an object for an access to its stored value is
the declared type of the object, if any.

I hate C standard

> This is my best analysis of what is being claimed from what you present
> here (I don't know how to access gmane via anything by the crudest
> web-based interface and I have therefore not bothered to read the other
> thread -- sorry).

Just subscribe to this group using news.gmane.org server, no ?

> The flip-side of the argument (which I don't hold) is that "the object"
> being accessed in both examples is t.i and, since that has a declared
> type (and hence effective type) of int, accessing it via t.i and *ip are
> both fine.
>
> Until I hear other arguments, I'm going with the gcc documentation
> partly because they are smart people who put these examples there
> after careful consideration, but mainly because the intent of the
> standard seems to be that a union "holds" only one object at a time.

Are there any plans to clarify these points in a future version?

Thanks
--
Francis

Ben Bacarisse

 11-10-2010
Francis Moreau <(E-Mail Removed)> writes:

> Ben Bacarisse <(E-Mail Removed)> writes:
>
>> Francis Moreau <(E-Mail Removed)> writes:

<snip>
>>> Basically GCC's man page gives an example of code which might not work
>>> as expected if '-fstrict-aliasing' switch is passed. Here's the piece of
>>> code taken from my recent discussion:
>>>
>>> > Looking again at the second example:
>>> >
>>> > union a_union {
>>> > int i;
>>> > double d;
>>> > };
>>> >
>>> > int f() {
>>> > union a_union t;
>>> > int* ip;
>>> > t.d = 3.0;
>>> > ip = &t.i;
>>> > return *ip;
>>> > }
>>> >
>>>
>>> We agreed that this invokes undefined behaviour, but I think the reasons
>>> why are different.
>>>
>>> My reason was because it's reinterpreting a double as an int which gives
>>> undefined behaviour.

>>
>> That may or may not be undefined. If the implementation's int type has
>> no trap representations then all bit patterns are valid int values.

<snip>

> Ok, so such type-punning is implementation defined and in this case it's
> defined so only the aliasing rules is problematic.

Sorry to nit-pick but "implementation defined" is a technical term with
an exact meaning in the C standard and it does not quite apply here. The
type punning is entirely well-defined -- the bits are re-interpreted --
but whether there are any trap representations is implementation
defined.

>>> The other reason is I _think_ because of aliasing issue (again taken
>>> from the discussion):
>>>
>>> > could you tell me what the effective type of 't.i' object ?
>>>
>>> int, if you can say that object exists at all: it does not have a
>>> stored value. The stored value of t is a double with value 3.0 .
>>> You can take its address and access it via that as "double" (or
>>> "char"), or you can access it as the union it is. You can not
>>> access it as "int".
>>>
>>> which actually doesn't answer to the question.

>>
>> I think the problem relates to what constitutes "the object" for the
>> purposes of 6.5 paragraphs 6-7.

>
> Yes I think so.
>
> Now, I would think that 't.i' and 't.d' are accessing both the same
> object 't' whose effective type is union a_union.

As you go on to discover, this can't be the case.

>> Clearly the person you are talking to thinks that t.i might not be
>> considered "an object" at least not one that can be accessed via an
>> lvalue expression of type int at this point in the code. They consider
>> "the object" being accessed to be t.d since that is what is stored in
>> the union.

>
> Ok I can buy this but why doesn't the standard state this clearly ?

Maybe it is all much clearer than either of us think. In general on
Usenet silence implies agreement, so for the moment I'll assume, rather
boldly, that all the respected posters here agree with me, but I'd be
happier to see some more opinions.

> This strong assumption is only made for being 'nice' with aliasing.
>
>> This does not conflict with the previous example in the manual which is:
>>
>> | union a_union {
>> | int i;
>> | double d;
>> | };
>> |
>> | int f() {
>> | union a_union t;
>> | t.d = 3.0;
>> | return t.i;
>> | }
>> |
>> | The practice of reading from a different union member than the one
>> | most recently written to (called "type-punning") is common. Even
>> | with -fstrict-aliasing, type-punning is allowed, provided the
>> | memory is accessed through the union type. So, the code above will
>> | work as expected.
>>
>> Here, if the "the object" being accessed is t.d, then access via t.i is
>> valid since it is an access to "the object" (t.d) via a union that
>> includes, as one of its members, the effective type (double) of that
>> object.

>
> ok this kind of type-punning (using a union and its members) is defined
> by the standard so all is fine so far.
>
>> In the pointer case, the lvalue expression used (*ip) is not an
>> aggregate or union that includes double as one of its members, nor is
>> a lvalue expression with a type compatible with that of the "the
>> object" being accessed (double).

>
> AFAIK, there's nothing in the standard that says that the effective type
> of a union object can be changed according to its last stored value, is
> there ?

No, and I did not say it could be! The effective type of the object t
is the union type; that of t.d, double; and that of t.i, int. These
don't change. The question is, does t.i exist at all? The abstract
view of a C union seems to be that of a container that can hold one of
several objects at a time. After t.d = 3.0; there is no t.i. Accessing
t.i re-interprets the bits of t.d as an int even though there is no
object t.i.

<snip>
> Therefore you're probably right but still 6.5p6 claims:
>
> The effective type of an object for an access to its stored value is
> the declared type of the object, if any.

Yes, but the point is what object is being accessed? It can't be as
simple as the apparent object -- i.e. the one whose lvalue is being used
to do the access. If so, there'd be no strict aliasing at all!

The claim being made in that thread (and it's being made by me also) is
that the object being accessed via *ip is actually t.d, and that breaks
the rules of 6.5 p7.

> I hate C standard

I think it could be clearer and I certainly welcome another opinion.

>> This is my best analysis of what is being claimed from what you present
>> here (I don't know how to access gmane via anything by the crudest
>> web-based interface and I have therefore not bothered to read the other
>> thread -- sorry).

>
> Just subscribe to this group using news.gmane.org server, no ?

I could have guessed, I suppose, that gmane would offer an NNTP interface
but I still doubt that I'll go join in. I spend too much time here as
it is!

<snip>
--
Ben.

Francis Moreau

 11-12-2010
Ben Bacarisse <(E-Mail Removed)> writes:

> Francis Moreau <(E-Mail Removed)> writes:
>
>> Ben Bacarisse <(E-Mail Removed)> writes:
>>
>>> Francis Moreau <(E-Mail Removed)> writes:

> <snip>
>>>> Basically GCC's man page gives an example of code which might not work
>>>> as expected if '-fstrict-aliasing' switch is passed. Here's the piece of
>>>> code taken from my recent discussion:
>>>>
>>>> > Looking again at the second example:
>>>> >
>>>> > union a_union {
>>>> > int i;
>>>> > double d;
>>>> > };
>>>> >
>>>> > int f() {
>>>> > union a_union t;
>>>> > int* ip;
>>>> > t.d = 3.0;
>>>> > ip = &t.i;
>>>> > return *ip;
>>>> > }
>>>> >
>>>>
>>>> We agreed that this invokes undefined behaviour, but I think the reasons
>>>> why are different.
>>>>
>>>> My reason was because it's reinterpreting a double as an int which gives
>>>> undefined behaviour.
>>>
>>> That may or may not be undefined. If the implementation's int type has
>>> no trap representations then all bit patterns are valid int values.

> <snip>
>
>> Ok, so such type-punning is implementation defined and in this case it's
>> defined so only the aliasing rules is problematic.

>
> Sorry to nit-pick but "implementation defined" is a technical term with
> an exact meaning in the C standard and it does not quite apply here. The
> type punning is entirely well-defined -- the bit are re-interpreted --
> but whether there are any trap representations is implementation
> defined.

Agreed.

[...]

> Maybe it is all much clearer than either of us think. In general on
> Usenet silence implies agreement, so for the moment I'll assume, rather
> boldly, that all the respected posters here agree with me, but I'd be
> happier to see some more opinions.

Again that's a strong assumption.

[...]

>> AFAIK, there's nothing in the standard that says that the effective type
>> of a union object can be changed according to its last stored value, is
>> there ?

> No, and I did not say it could be! The effective type of the object t
> is the union type; that of t.d, double; and that to t.i, int. These
> don't change. The question is, does t.i exist at all? The abstract
> view of a C union seems to that of a container that can hold one of
> several objects at once. After t.d = 3.0; there is no t.i. Accessing
> t.i re-interprets the bits of t.d as an int even though there is no
> object t.i.

It's a bit weird:

t.d = 3.0;
i = t.i; /* t.i exists */
ip = &t.i; /* t.i doesn't exist anymore */

isn't it ?

> <snip>
>> Therefore you're probably right but still 6.5p6 claims:
>>
>> The effective type of an object for an access to its stored value is
>> the declared type of the object, if any.

>
> Yes, but the point is what object is being accessed? It can't be as
> simple as the apparent object -- i.e. the one whose lvalue is being used
> to do the access. If so, there'd be no strict aliasing at all!
>

Well, we're talking only about union here.

Thanks
--
Francis

Johannes Schaub (litb)

 11-12-2010
Francis Moreau wrote:

> Hello,
>
> 6.5p6 gives a definition of effective type.
>
> However I can't find an answer to this question: What's the effective
> type of union's members.
>
> For example:
>
> union u {
> int a;
> float b;
> };
>
> What's the effective type of 'u.a' and 'u.b' members ?
>
> I would say int and float respectively.
>

Yes, I would say so too. They are declared subobjects.

> But after this:
>
> u.b = 1.23;
>
> does the effective type of 'u.i' is still int ?
>

Yes, u.a is always effectively int, and u.b is always effectively float.
Saying "u.b = 1.23" overwrites all or part of the memory of "u.a". However,
reading "u.a" afterwards does not violate strict aliasing because the
lvalue is of type int, and the effective type is also int.

So the only thing you need to be aware of is the value representation and
trapping bits in an int. If you read u.a and it turns out to be a trap,
behavior is undefined.

This is notably different in C++. In C++, "u.a" and "u.b" do have lvalues of
type int and float respectively. But the rules in C++ are not based on
effective type, but on dynamic types. The dynamic type changes with an
assignment: u.b reuses the memory of u.a, and makes the object accessed by
both u.a and u.b of type float. In C++ it is an aliasing violation to say
u.a afterwards.

In both C++ and C you can copy the union as a whole because the object
keeping the stored value is a member of that union type, and as such the
union type is allowed to alias the object keeping the stored value.