Pre-offsetof() question

 
 
Arthur J. O'Dwyer
      10-06-2003

As far as I know, C89/C90 did not contain the
now-standard offsetof() macro.

Did C89 mandate that structs had to have a consistent
layout? For example, consider the typical layout of
the following structure:

struct weird
{
int x; /* sizeof(int)==4 here */
double y; /* sizeof(double)==8 here */
int z;
};

Now, let's suppose that the target architecture has typical
80x86 alignment requirements, where 'int' aligns on 4-byte
boundaries and 'double' on 8-byte boundaries.
A C99 compiler might produce a layout that looked like
this:

|_x__|####|___y____|_z__|####|

sizeof (struct weird) == 24 bytes


But could a C89, pre-offsetof() compiler decide to make
the layout of the struct vary, like this:

|_x__|####|___y____|_z__| on 8-byte alignment

|_x__|___y____|####|_z__| on 4-byte alignment

sizeof (struct weird) == 20 bytes


Note that the relative ordering of the members is
preserved; each 'struct weird' has the same size in
bytes; and all objects are properly aligned for their
type. But the "weird" ordering has saved us 4 bytes
per structure!

Does C89 allow this, or is it disallowed by something
in that standard? If so, what?

TIA,
-Arthur

 
 
 
 
 
Eric Sosman
      10-06-2003
"Arthur J. O'Dwyer" wrote:
>
> As far as I know, C89/C90 did not contain the
> now-standard offsetof() macro.


Full stop: C89 invented the <stddef.h> header, and specified
that it must provide offsetof().
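
To make this concrete, here is a minimal sketch that queries the layout of the struct from the original question with the offsetof() macro from <stddef.h>. The helper name check_weird_offsets is hypothetical. The actual offsets and size depend on the implementation, so the checks cover only what the language itself requires: the first member sits at offset 0, and later members sit at increasing, non-overlapping offsets.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

struct weird {
    int    x;
    double y;
    int    z;
};

/* Hypothetical helper: checks only properties the language requires,
   not any particular padding scheme. */
void check_weird_offsets(void)
{
    size_t ox = offsetof(struct weird, x);
    size_t oy = offsetof(struct weird, y);
    size_t oz = offsetof(struct weird, z);

    assert(ox == 0);                                   /* no leading padding */
    assert(ox + sizeof(int) <= oy);                    /* y follows x */
    assert(oy + sizeof(double) <= oz);                 /* z follows y */
    assert(oz + sizeof(int) <= sizeof(struct weird));  /* z fits inside */
}
```

Each offsetof() expression is an integer constant, so it yields one value per member for the whole program.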

> Did C89 mandate that structs had to have a consistent
> layout? For example, consider the typical layout of
> the following structure:
>
> struct weird
> {
> int x; /* sizeof(int)==4 here */
> double y; /* sizeof(double)==8 here */
> int z;
> };
>
> Now, let's suppose that the target architecture has typical
> 80x86 alignment requirements, where 'int' aligns on 4-byte
> boundaries and 'double' on 8-byte boundaries.
> A C99 compiler might produce a layout that looked like
> this:
>
> |_x__|####|___y____|_z__|####|
>
> sizeof (struct weird) == 24 bytes
>
> But could a C89, pre-offsetof() compiler decide to make
> the layout of the struct vary, like this:
>
> |_x__|####|___y____|_z__| on 8-byte alignment
>
> |_x__|___y____|####|_z__| on 4-byte alignment
>
> sizeof (struct weird) == 20 bytes
>
> Note that the relative ordering of the members is
> preserved; each 'struct weird' has the same size in
> bytes; and all objects are properly aligned for their
> type. But the "weird" ordering has saved us 4 bytes
> per structure!
>
> Does C89 allow this, or is it disallowed by something
> in that standard? If so, what?


No version of the Standard describes what alignments
are to be enforced. However, the rules for compatibility
of types guarantee that the same struct type will have the
same arrangement of padding bytes in all translation units.

Could this arrangement be different depending on flags
calling for different "strictnesses" of alignment? Yes, of
course -- but this isn't a contradiction, because using a
different set of compiler flags gives you a different
implementation of C, and the Standard makes no requirement
that translation units compiled by different implementations
must interoperate.

By the way, note that your 8-byte alignment example is
faulty. If a double must be aligned to an 8-byte boundary,
the sizeof a struct containing a double must be a multiple
of 8 bytes. Otherwise, you would not be able to malloc()
an array of two such structs:

struct weird *p = malloc(2 * sizeof *p); // assume 40

0    4    8        16   20   24   28       36   40
|_x__|####|___y____|_z__|_x__|####|___y____|_z__|
^                       ^
|                       |
p                       p+1

Note that (p+1)->y is mis-aligned.
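
The array argument can be checked mechanically. The sketch below is an illustration, not something taken from the Standard: it estimates the alignment of double with a common offsetof idiom (the probe struct and both function names are hypothetical) and verifies that the y member of each element in a two-element array lands on that boundary. With a single fixed layout, that can hold only if sizeof (struct weird) is a multiple of the alignment.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdlib.h>   /* malloc, free */

struct weird {
    int    x;
    double y;
    int    z;
};

/* Hypothetical probe: the offset of d is where the compiler places a
   double after a single char, a common way to estimate its alignment. */
struct align_probe {
    char   c;
    double d;
};

size_t double_alignment(void)
{
    return offsetof(struct align_probe, d);
}

/* Returns 1 if y is suitably aligned in both elements of a 2-element
   array, 0 if not, and -1 if the allocation fails. */
int array_members_aligned(void)
{
    size_t align = double_alignment();
    struct weird *p = malloc(2 * sizeof *p);
    size_t i;
    int ok = 1;

    assert(align > 0);   /* c occupies byte 0, so d starts later */
    if (p == NULL)
        return -1;

    for (i = 0; i < 2; i++) {
        /* Byte distance of p[i].y from the start of the block; malloc
           returns storage aligned for any object type. */
        size_t dist = (size_t)((char *)&p[i].y - (char *)p);
        if (dist % align != 0)
            ok = 0;
    }

    free(p);
    return ok;
}
```

On an implementation with the 24-byte layout from the question, the two distances would be 8 and 32; other implementations may give other values.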

 
 
 
 
Arthur J. O'Dwyer
      10-06-2003

On Mon, 6 Oct 2003, Eric Sosman wrote:
>
> Arthur J. O'Dwyer wrote:
> >
> > As far as I know, C89/C90 did not contain the
> > now-standard offsetof() macro.

>
> Full stop: C89 invented the <stddef.h> header, and specified
> that it must provide offsetof().


Oops. I guess the point is moot, then.

> > struct weird
> > {
> > int x; /* sizeof(int)==4 here */
> > double y; /* sizeof(double)==8 here */
> > int z;
> > };


> > But could a C89, pre-offsetof() compiler decide to make
> > the layout of the struct vary, like this:
> >
> > |_x__|####|___y____|_z__| on 8-byte alignment
> >
> > |_x__|___y____|####|_z__| on 4-byte alignment
> >
> > sizeof (struct weird) == 20 bytes
> >
> > Note that the relative ordering of the members is
> > preserved; each 'struct weird' has the same size in
> > bytes; and all objects are properly aligned for their
> > type. But the "weird" ordering has saved us 4 bytes
> > per structure!



> No version of the Standard describes what alignments
> are to be enforced. However, the rules for compatibility
> of types guarantee that the same struct type will have the
> same arrangement of padding bytes in all translation units.


How so? (Obviously, the existence of 'offsetof' assumes
that all 'struct weird's will have the same layout -- but
would that rule be explicitly stated anywhere if 'offsetof'
didn't exist?)


> By the way, note that your 8-byte alignment example is
> faulty. If a double must be aligned to an 8-byte boundary,
> the sizeof a struct containing a double must be a multiple
> of 8 bytes.


Why? (Other than the paragraph which in N869 is 7.17#3,
that is.)

> Otherwise, you would not be able to malloc()
> an array of two such structs:
>
> struct weird *p = malloc(2 * sizeof *p); // assume 40
>
> 0    4    8        16   20   24   28       36   40
> |_x__|####|___y____|_z__|_x__|####|___y____|_z__|


Ah -- your diagram is incorrect. The "correct" layout
for two optimized (but apparently non-conforming) 'struct
weird's is:

> 0    4    8        16   20   24   28       36   40
  |_x__|####|___y____|_z__|_x__|___y____|####|_z__|
> ^                       ^
> |                       |
> p                       p+1
>
> Note that (p+1)->y is mis-aligned.


Not anymore -- not if we remove 7.17#3. I had thought
that C89 didn't have offsetof(); apparently I was
wrong. Never mind, then.

-Arthur

 
 
Eric Sosman
      10-06-2003
"Arthur J. O'Dwyer" wrote:
>
> On Mon, 6 Oct 2003, Eric Sosman wrote:
> >
> > By the way, note that your 8-byte alignment example is
> > faulty. If a double must be aligned to an 8-byte boundary,
> > the sizeof a struct containing a double must be a multiple
> > of 8 bytes.

>
> Why? (Other than the paragraph which in N869 is 7.17#3,
> that is.)
>
> > Otherwise, you would not be able to malloc()
> > an array of two such structs:
> >
> > struct weird *p = malloc(2 * sizeof *p); // assume 40
> >
> > 0    4    8        16   20   24   28       36   40
> > |_x__|####|___y____|_z__|_x__|####|___y____|_z__|

>
> Ah -- your diagram is incorrect. The "correct" layout
> for two optimized (but apparently non-conforming) 'struct
> weird's is:
>
> > 0    4    8        16   20   24   28       36   40
>   |_x__|####|___y____|_z__|_x__|___y____|####|_z__|
> > ^                       ^
> > |                       |
> > p                       p+1
> >
> > Note that (p+1)->y is mis-aligned.

>
> Not anymore -- not if we remove 7.17#3. I had thought
> that C89 didn't have offsetof(); apparently I was
> wrong. Never mind, then.


Aha! Finally, the mystery of why offsetof intruded itself
into an apparently unrelated question becomes clear. Just to
be sure I've understood you: You're wondering whether different
instances of struct weird in the same program could arrange
their padding differently. Clearly, this cannot be the case
if offsetof(struct weird, y) is single-valued.

But even without offsetof I think you can rule out such
shenanigans. True, direct assignment of struct objects might
perhaps be clever enough to play games. But memcpy() must
also work:

struct weird *p = malloc(2 * sizeof *p);
p[0].x = ...; p[0].y = ...; p[0].z = ...;
memcpy (p+1, p, sizeof *p);
assert (p[1].x == p[0].x);
assert (p[1].y == p[0].y); // the crucial point
assert (p[1].z == p[0].z);

Since memcpy() knows only the size of the data being copied
and nothing about the nature of the object those data bytes
represent, it cannot possibly know enough to "slide" the
`y' element while copying the bag of bytes from one place
to another. Similar remarks apply to realloc() and to
fwrite()/fread(), and to other type-oblivious ways of moving
data from place to place.
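
A filled-in version of the snippet above, as a minimal sketch: the member values (42, 3.25, -7) and the function name check_memcpy_copy are made up for illustration. It shows the behaviour one expects when every struct weird shares one layout; it does not settle whether the Standard's wording strictly requires that behaviour.

```c
#include <assert.h>
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* memcpy */

struct weird {
    int    x;
    double y;
    int    z;
};

/* Hypothetical helper: copies p[0] to p[1] byte for byte, then checks
   that every member reads back with the same value. */
void check_memcpy_copy(void)
{
    struct weird *p = malloc(2 * sizeof *p);

    if (p == NULL)
        return;   /* nothing to check without storage */

    p[0].x = 42;
    p[0].y = 3.25;
    p[0].z = -7;

    /* memcpy sees only sizeof *p bytes, not the struct's members. */
    memcpy(p + 1, p, sizeof *p);

    assert(p[1].x == p[0].x);
    assert(p[1].y == p[0].y);   /* the crucial point */
    assert(p[1].z == p[0].z);

    free(p);
}
```

If p[1] used a different internal arrangement from p[0], the byte-for-byte copy would leave y straddling the wrong bytes and the second assertion could fail.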

 
Arthur J. O'Dwyer
      10-06-2003

On Mon, 6 Oct 2003, Eric Sosman wrote:
>
> Aha! Finally, the mystery of why offsetof intruded itself
> into an apparently unrelated question becomes clear. Just to
> be sure I've understood you: You're wondering whether different
> instances of struct weird in the same program could arrange
> their padding differently. Clearly, this cannot be the case
> if offsetof(struct weird, y) is single-valued.


Yes! You've hit the nail on the head.

> But even without offsetof I think you can rule out such
> shenanigans. True, direct assignment of struct objects might
> perhaps be clever enough to play games. But memcpy() must
> also work:
>
> struct weird *p = malloc(2 * sizeof *p);
> p[0].x = ...; p[0].y = ...; p[0].z = ...;
> memcpy (p+1, p, sizeof *p);
> assert (p[1].x == p[0].x);
> assert (p[1].y == p[0].y); // the crucial point
> assert (p[1].z == p[0].z);


Yes, but *must* these 'assert(...)'s succeed? (Obviously
they needn't succeed if p[0].y is a trap representation,
or one of p[0],p[1] is volatile, for instance.)

Where does it say that

foo x = ...;
foo y = ...;
memcpy(&x, &y, sizeof (foo));
assert (x==y);

must necessarily succeed? I don't see anywhere, except perhaps
footnote 38 (which says that struct assignment may be done
"element-at-a-time or via memcpy"). And I don't think footnotes
are normative, even if the intent of the footnote were clearer.

-Arthur
[Remember, the whole question is moot.]

 
 
Jack Klein
      10-07-2003
On Mon, 6 Oct 2003 19:08:06 -0400 (EDT), "Arthur J. O'Dwyer"
wrote in comp.lang.c:

>
> On Mon, 6 Oct 2003, Eric Sosman wrote:
> >
> > Aha! Finally, the mystery of why offsetof intruded itself
> > into an apparently unrelated question becomes clear. Just to
> > be sure I've understood you: You're wondering whether different
> > instances of struct weird in the same program could arrange
> > their padding differently. Clearly, this cannot be the case
> > if offsetof(struct weird, y) is single-valued.

>
> Yes! You've hit the nail on the head.
>
> > But even without offsetof I think you can rule out such
> > shenanigans. True, direct assignment of struct objects might
> > perhaps be clever enough to play games. But memcpy() must
> > also work:
> >
> > struct weird *p = malloc(2 * sizeof *p);
> > p[0].x = ...; p[0].y = ...; p[0].z = ...;
> > memcpy (p+1, p, sizeof *p);
> > assert (p[1].x == p[0].x);
> > assert (p[1].y == p[0].y); // the crucial point
> > assert (p[1].z == p[0].z);

>
> Yes, but *must* these 'assert(...)'s succeed? (Obviously
> they needn't succeed if p[0].y is a trap representation,
> or one of p[0],p[1] is volatile, for instance.)
>
> Where does it say that
>
> foo x = ...;
> foo y = ...;
> memcpy(&x, &y, sizeof (foo));
> assert (x==y);
>
> must necessarily succeed? I don't see anywhere, except perhaps
> footnote 38 (which says that struct assignment may be done
> "element-at-a-time or via memcpy"). And I don't think footnotes
> are normative, even if the intent of the footnote were clearer.
>
> -Arthur
> [Remember, the whole question is moot.]


What you missed is:

========
6.2.6 Representations of types

6.2.6.1 General

1 The representations of all types are unspecified except as stated in
this subclause.

2 Except for bit-fields, objects are composed of contiguous sequences
of one or more bytes, the number, order, and encoding of which are
either explicitly specified or implementation-defined.

3 Values stored in unsigned bit-fields and objects of type unsigned
char shall be represented using a pure binary notation.

4 Values stored in non-bit-field objects of any other object type
consist of n CHAR_BIT bits, where n is the size of an object of that
type, in bytes. The value may be copied into an object of type
unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
called the object representation of the value. Values stored in
bit-fields consist of m bits, where m is the size specified for the
bit-field. The object representation is the set of m bits the
bit-field comprises in the addressable storage unit holding it. Two
values (other than NaNs) with the same object representation compare
equal, but values that compare equal may have different object
representations.
========

From C99, and note the last sentence in paragraph 4.

Even without this, it would be impossible to pass or return structures or
pointers to structures to functions in separate translation units if
an identical structure definition did not result in identically laid
out objects.
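
As a rough illustration of paragraph 4, the sketch below copies a value into an unsigned char array of the object's size and back again, then compares representations byte by byte. The function names roundtrip_double and same_representation are hypothetical, and double is used only because it is the member at issue in this thread.

```c
#include <assert.h>
#include <string.h>   /* memcpy, memcmp */

/* Copies the object representation of v into an unsigned char array
   and back again, returning the reconstructed value. */
double roundtrip_double(double v)
{
    unsigned char bytes[sizeof(double)];
    double out;

    memcpy(bytes, &v, sizeof v);       /* value -> object representation */
    memcpy(&out, bytes, sizeof out);   /* object representation -> value */

    /* The round trip restores exactly the same bytes. */
    assert(memcmp(&out, &v, sizeof v) == 0);
    return out;
}

/* Returns 1 if a and b have byte-identical object representations. */
int same_representation(double a, double b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}
```

The last clause of paragraph 4 shows up with signed zeros: 0.0 == -0.0 is true, yet on IEEE 754 implementations same_representation(0.0, -0.0) typically returns 0.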

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
 
 
Arthur J. O'Dwyer
      10-07-2003

On Tue, 7 Oct 2003, Jack Klein wrote:
>
> Arthur J. O'Dwyer wrote:
> > On Mon, 6 Oct 2003, Eric Sosman wrote:
> > >
> > > Aha! Finally, the mystery of why offsetof intruded itself
> > > into an apparently unrelated question becomes clear. Just to
> > > be sure I've understood you: You're wondering whether different
> > > instances of struct weird in the same program could arrange
> > > their padding differently. Clearly, this cannot be the case
> > > if offsetof(struct weird, y) is single-valued.

> >
> > Yes! You've hit the nail on the head.
> >
> > > But even without offsetof I think you can rule out such
> > > shenanigans. True, direct assignment of struct objects might
> > > perhaps be clever enough to play games. But memcpy() must
> > > also work:


> > Where does it say that
> >
> > foo x = ...;
> > foo y = ...;
> > memcpy(&x, &y, sizeof (foo));
> > assert (x==y);
> >
> > must necessarily succeed? I don't see anywhere, except perhaps
> > footnote 38


> What you missed is:
>
> ========
> 6.2.6 Representations of types
>
> 6.2.6.1 General
>
> 1 The representations of all types are unspecified except as stated in
> this subclause.
>
> 2 Except for bit-fields, objects are composed of contiguous sequences
> of one or more bytes, the number, order, and encoding of which are
> either explicitly specified or implementation-defined.


Okay, no problems here. The "weird" layout can be defined easily
by the implementation.

> 3 Values stored in unsigned bit-fields and objects of type unsigned
> char shall be represented using a pure binary notation.
>
> 4 Values stored in non-bit-field objects of any other object type
> consist of n CHAR_BIT bits, where n is the size of an object of that
> type, in bytes. The value may be copied into an object of type
> unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
> called the object representation of the value. Values stored in
> bit-fields consist of m bits, where m is the size specified for the
> bit-field. The object representation is the set of m bits the
> bit-field comprises in the addressable storage unit holding it. Two
> values (other than NaNs) with the same object representation compare
> equal,


Okay, this is the part I assume you mean. Well,
<devil's-advocate>
what exactly does it mean for two structs to "compare equal"?
I mean, you can't use the == operator on structs, right? And if
we can only talk about member-by-member equality, well then we'll
have to consider a *member-by-member* memcpy -- which works fine!
</devil's-advocate>
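
Since == is not defined for structs, "compare equal" has to be spelled out member by member. A minimal sketch of that idea follows; the names weird_equal and equal_despite_padding are hypothetical, and the memset fill values are arbitrary. It builds two objects whose members match even though any padding bytes were pre-filled differently.

```c
#include <assert.h>   /* assert, for callers that check the result */
#include <string.h>   /* memset */

struct weird {
    int    x;
    double y;
    int    z;
};

/* Member-by-member equality, since == is not defined for structs. */
int weird_equal(const struct weird *a, const struct weird *b)
{
    return a->x == b->x && a->y == b->y && a->z == b->z;
}

/* Builds two objects whose members match but whose padding bytes, if
   any, were pre-filled differently; returns their member-wise equality. */
int equal_despite_padding(void)
{
    struct weird a, b;

    memset(&a, 0x00, sizeof a);
    memset(&b, 0xFF, sizeof b);

    a.x = 1; a.y = 2.5; a.z = 3;
    b.x = 1; b.y = 2.5; b.z = 3;

    return weird_equal(&a, &b);
}
```

A memcmp() of the two objects might still report a difference, because padding bytes are not part of any member's value.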

> but values that compare equal may have different object
> representations.
> ========
>
> From C99, and note the last sentence in paragraph 4.


(And not in N869, right?)

> Even without this, it would be impossible to pass or return structures or
> pointers to structures to functions in separate translation units if
> an identical structure definition did not result in identically laid
> out objects.


Debatable. But irrelevant.
Remember, the "weird" layout is perfectly consistent between t.u.'s.
A compiler could say, "Okay, this struct is a candidate for
weirdification," and generate appropriate code across all t.u.'s,
easily enough.

-Arthur
[Remember, still moot.]

P.S.-- As a small on-topic note, am I completely mistaken in my
prior belief that 'offsetof' was a relatively recent addition to
C? If so, why do we get so many variations on FAQ 2.14?
 