Distance between struct members

 
 
Keith Thompson
 
      10-19-2007
Kenneth Brody <(E-Mail Removed)> writes:
> Keith Thompson wrote:

[...]
>> The real reason is that pointer subtraction invokes undefined behavior
>> if the two pointers point to distinct objects. See C99 6.5.6p9. This
>> applies even to subtraction of char* pointers, which are not affected
>> by alignment.

>
> But &s1.i4 and &s1.i3 are both pointers within s1, and therefore are
> not "distinct objects". (I suppose the typical "IMO" disclaimer may
> apply?)


Yes; as I've acknowledged, my statement above was the result of my
misreading the previous material.

> Plus, as I understand it, it is perfectly legal to overlay an array
> of unsigned chars on any object, and access any and all bytes within
> that object through this array. How is casting &s1.i4 and &s1.i3 to
> "unsigned char *" any different than overlaying an unsigned char
> array?
>
> On second thought, however, I can see that taking the addresses of
> the two as their native "int *", you can say that the two ints are
> not part of the same object, as they are not part of an array of
> ints. (Which is why they may not be a multiple-of-sizeof-int bytes
> apart.) It is the casting to "unsigned char *" which means that the
> addresses can be treated "as-if" they were part of an array of
> unsigned chars the size of the struct.


The standard's requirement isn't really that they point to "the same
object". The actual wording, in C99 6.5.6p9, is:

When two pointers are subtracted, both shall point to elements of
the same array object, or one past the last element of the array
object; the result is the difference of the subscripts of the two
array elements.

It's stated elsewhere that any object of type T can be treated as an
array of type T[1], and that any object can be treated as an array of
unsigned char. The latter lets you get away with converting the
pointers &s1.i4 and &s1.i3 to ``unsigned char*'' before subtracting
them. Other rules, which I'm too lazy to look up, allow you to do the
same thing with ``char*'' or ``signed char*''. But if i4 and i3 are
both of type int, there's no rule that lets you treat them as elements
of the same array. (If the required alignment for type int is the
same as its size, you're very likely to get away with it unless the
implementation goes out of its way to stop you, but it's still
undefined behavior.)

[...]

> I think offsetof() is the way to go here. The offset of s1.i3 is
> guaranteed to be the same as the offset of s2.i3, assuming that s1
> and s2 are the same type, and any arithmetic which arrives at that
> offset is guaranteed to be properly aligned.


Agreed. If you care about the distance in bytes between the members
i3 and i4 of some struct type, then
offsetof(struct foo, i4) - offsetof(struct foo, i3)
is a clearer way to express it.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
 
 
 
 
Keith Thompson
 
      10-19-2007
"(E-Mail Removed)" <(E-Mail Removed)> writes:
> On Oct 19, 10:28 pm, Kenneth Brody <(E-Mail Removed)> wrote:
>> I think offsetof() is the way to go here. The offset of s1.i3 is
>> guaranteed to be the same as the offset of s2.i3, assuming that s1
>> and s2 are the same type, and any arithmetic which arrives at that
>> offset is guaranteed to be properly aligned.

>
> But the offsetof() uses size_t other than "char *" or "unsigned char
> *" to designate the type of the addresses, why?


No, offsetof() uses size_t for the offset; it doesn't express any
address as a size_t.

Here's the standard's definition (C99 7.17p3):

offsetof(type, member-designator)

which expands to an integer constant expression that has type
size_t, the value of which is the offset in bytes, to the
structure member (designated by member-designator), from the
beginning of its structure (designated by type). The type and
member designator shall be such that given

static type t;

then the expression &(t.member-designator) evaluates to an address
constant. (If the specified member is a bit-field, the behavior is
undefined.)

 
 
 
 
Eric Sosman
 
      10-19-2007
(E-Mail Removed) wrote On 10/19/07 14:43,:
>
> So, the extra casts make the code in the original post legal and
> portable, doesn't it?


The code in the original post was illegal and non-portable.
The revised code in this post is almost all right: It will
print 13 and something else (because the third argument to
printf should be `*(int*)((char*)...)' instead of what you
wrote).

> #include <stdio.h>
> #include <stddef.h>
>
> struct S {
> /*...*/
> int i3;
> /*...*/
> int i7;
> };
>
> int main(void)
> {
> struct S s1 = {11, 12}, s2 = {13, 14};
> ptrdiff_t distance;
>
> distance = (char *)&s1.i7 - (char *)&s1.i3;
> printf("%d, %d\n", s2.i3, (int)*((char *)&s2.i3 + distance));
> return 0;
> }


... but I must ask: WHY do you want to do this?
If you want to print the value of s2.i7, just do it:
don't fool around with all this pointer-bashing. Even
if it is *possible* to perform an appendectomy with two
teaspoons and an eggbeater, that doesn't make it a
good idea.

--
(E-Mail Removed)
 
 
lovecreatesbea...@gmail.com
 
      10-20-2007
On Oct 20, 3:53 am, Keith Thompson <(E-Mail Removed)> wrote:
> "(E-Mail Removed)" <(E-Mail Removed)> writes:
> > On Oct 19, 10:28 pm, Kenneth Brody <(E-Mail Removed)> wrote:
> >> I think offsetof() is the way to go here. The offset of s1.i3 is
> >> guaranteed to be the same as the offset of s2.i3, assuming that s1
> >> and s2 are the same type, and any arithmetic which arrives at that
> >> offset is guaranteed to be properly aligned.

>
> > But the offsetof() uses size_t other than "char *" or "unsigned char
> > *" to designate the type of the addresses, why?

>
> No, offsetof() uses size_t for the offset; it doesn't express any
> address as a size_t.
>
> Here's the standard's definition (C99 7.17p3):
>
> offsetof(type, member-designator)
>
> which expands to an integer constant expression that has type
> size_t, the value of which is the offset in bytes, to the
> structure member (designated by member-designator), from the
> beginning of its structure (designated by type). The type and
> member designator shall be such that given
>
> static type t;
>
> then the expression &(t.member-designator) evaluates to an address


Thank you.

Why isn't it in this form

(char *) &(t.member-designator)

I read in other posts that some people said the standard
definition

#define offsetof(type, memb) ((size_t) &((type *) 0)-> memb)

implies

#define offsetof(type, memb) ((size_t) &((type *) 0)-> memb - &((type *) 0))


Isn't the following one better?

#define offsetof(type, memb) \
((size_t) ((char *) &((type *) 0)-> memb - (char *) &((type *) 0)))

> constant. (If the specified member is a bit-field, the behavior is
> undefined.)


 
 
lovecreatesbea...@gmail.com
 
      10-20-2007
On Oct 20, 4:23 am, Eric Sosman <(E-Mail Removed)> wrote:
> (E-Mail Removed) wrote On 10/19/07 14:43,:
>
>
>
> > So, the extra casts make the code in the original post legal and
> > portable, doesn't it?

>
> The code in the original post was illegal and non-portable.
> The revised code in this post is almost all right: It will
> print 13 and something else (because the third argument to
> printf should be `*(int*)((char*)...)' instead of what you
> wrote).
>


Thank you.

I wrote the third argument wrongly, thanks for the correction.

>
>
>
>
> > #include <stdio.h>
> > #include <stddef.h>

>
> > struct S {
> > /*...*/
> > int i3;
> > /*...*/
> > int i7;
> > };

>
> > int main(void)
> > {
> > struct S s1 = {11, 12}, s2 = {13, 14};
> > ptrdiff_t distance;

>
> > distance = (char *)&s1.i7 - (char *)&s1.i3;
> > printf("%d, %d\n", s2.i3, (int)*((char *)&s2.i3 + distance));
> > return 0;
> > }

>
> ... but I must ask: WHY do you want to do this?
> If you want to print the value of s2.i7, just do it:
> don't fool around with all this pointer-bashing. Even
> if it is *possible* to perform an appendectomy with two
> teaspoons and an eggbeater, that doesn't make it a
> good idea.


I didn't know these details about structs before, or about
locating struct members by offset.

Some people said that the offsetof macro written this way

#define offsetof(type, memb) ((size_t) &((type *) 0)-> memb)

dereferences a NULL /* 0 */ pointer, which is undefined behavior. That
makes me even more anxious. And why isn't it put in this form

#define offsetof(type, memb) \
((size_t) ((char *) &((type *) 0)-> memb - (char *) &((type *) 0)))

Could you please talk about this more?

 
 
Keith Thompson
 
      10-20-2007
"(E-Mail Removed)" <(E-Mail Removed)> writes:
> On Oct 20, 3:53 am, Keith Thompson <(E-Mail Removed)> wrote:
>> "(E-Mail Removed)" <(E-Mail Removed)> writes:
>> > On Oct 19, 10:28 pm, Kenneth Brody <(E-Mail Removed)> wrote:
>> >> I think offsetof() is the way to go here. The offset of s1.i3 is
>> >> guaranteed to be the same as the offset of s2.i3, assuming that s1
>> >> and s2 are the same type, and any arithmetic which arrives at that
>> >> offset is guaranteed to be properly aligned.

>>
>> > But the offsetof() uses size_t other than "char *" or "unsigned char
>> > *" to designate the type of the addresses, why?

>>
>> No, offsetof() uses size_t for the offset; it doesn't express any
>> address as a size_t.
>>
>> Here's the standard's definition (C99 7.17p3):
>>
>> offsetof(type, member-designator)
>>
>> which expands to an integer constant expression that has type
>> size_t, the value of which is the offset in bytes, to the
>> structure member (designated by member-designator), from the
>> beginning of its structure (designated by type). The type and
>> member designator shall be such that given
>>
>> static type t;
>>
>> then the expression &(t.member-designator) evaluates to an address
>> constant. (If the specified member is a bit-field, the behavior is
>> undefined.)


I restored the last two lines of the above (you quoted them at the
bottom of your followup).

> Thank you.
>
> Why isn't it in this form
>
> (char *) &(t.member-designator)


Why should it be? The point of mentioning ``&(t.member-designator)''
in the description is *only* that it must be an address constant; it's
used to specify which arguments to offsetof() are legal.

> I read in other posts that some people said the standard
> definition
>
> #define offsetof(type, memb) ((size_t) &((type *) 0)-> memb)
>
> implies
>
> #define offsetof(type, memb) ((size_t) &((type *) 0)-> memb - &((type *) 0))
>
>
> Isn't the following one better?
>
> #define offsetof(type, memb) \
> ((size_t) ((char *) &((type *) 0)-> memb - (char *) &((type *) 0)))


There is no "standard definition" of offsetof().

It's *commonly* defined as in the first form you quote above. That
definition invokes undefined behavior, but since it's part of the
implementation, it's allowed to take advantage of the vagaries of
that implementation. Subtracting
(char *) &((type*) 0)
from something that already works is neither useful nor necessary.

 
Keith Thompson
 
      10-20-2007
"(E-Mail Removed)" <(E-Mail Removed)> writes:
[...]
> Some people said that the offsetof macro written this way
>
> #define offsetof(type, memb) ((size_t) &((type *) 0)-> memb)
>
> dereferences a NULL /* 0 */ pointer, which is undefined behavior. That
> makes me even more anxious. And why isn't it put in this form
>
> #define offsetof(type, memb) \
> ((size_t) ((char *) &((type *) 0)-> memb - (char *) &((type *) 0)))
>
> Could you please talk about this more?


Have you read question 2.14 in the comp.lang.c FAQ,
<http://c-faq.com/>?

 
Eric Sosman
 
      10-20-2007
(E-Mail Removed) wrote:
> On Oct 20, 4:23 am, Eric Sosman <(E-Mail Removed)> wrote:
>>
>> ... but I must ask: WHY do you want to do this?
>> If you want to print the value of s2.i7, just do it:
>> don't fool around with all this pointer-bashing. Even
>> if it is *possible* to perform an appendectomy with two
>> teaspoons and an eggbeater, that doesn't make it a
>> good idea.

>
> I didn't know these details about structs before, or about
> locating struct members by offset.


Fine, but don't forget that the elements have names for
a reason: to make it easy to refer to them. Refer to struct
elements by their names whenever you know them -- that is,
whenever you know the type of the struct. It is possible to
get at a struct's elements using offsets and pointers and
casts, but this is best done only when you *don't* know the
struct type and hence the element names. It works, but it
has various drawbacks:

- As you have seen, it is easy to make misteaks that the
compiler will not detect.

- You lose the convenience of having the compiler keep
track of the element types. Refer to `s1.i7' and the
compiler knows it's an int, but use casts and offsets
and the compiler just has to trust your casting.

- It makes maintenance and debugging harder. If you found
that `s1.i7' is being set to a garbage value, you might
search your source for references to `i7' and put assert()
macros at each site. But you won't find cast-and-offset
references this way.

- It may make your code slower. Derive an `int*' from a
bunch of other data and store through it, and the compiler
may need to assume that every `int' variable it knows of is
a potential target. It may move register-resident values
back to memory, do your store, and then reload everything --
whereas if you'd just said `s1.i7 = 42' it would have known
that `i' and `j' and `k' were unaffected and could remain
safely and conveniently in their CPU registers.

In short, it's a technique that's available to you when it's
needed, but it's a technique of last resort. On a desert island
you might have to perform that appendectomy with two teaspoons
and an eggbeater, but it's not the method of choice.

> Some people said that the offsetof macro written this way
>
> #define offsetof(type, memb) ((size_t) &((type *) 0)-> memb)
>
> dereferences a NULL /* 0 */ pointer, which is undefined behavior. That
> makes me even more anxious. And why isn't it put in this form
>
> #define offsetof(type, memb) \
> ((size_t) ((char *) &((type *) 0)-> memb - (char *) &((type *) 0)))
>
> Could you please talk about this more?


Keith Thompson has explained this elsethread. Briefly, it's
perfectly all right for the implementation's own code to rely on
things the Standard does not guarantee, because the implementation
can rely on its own behavior.

--
Eric Sosman
(E-Mail Removed)
 
 
Martin Golding
 
      10-23-2007
On Fri, 19 Oct 2007 11:43:35 -0700, (E-Mail Removed) wrote:
[much snippage]

> So, the extra casts make the code in the original post legal and
> portable, doesn't it?


Not quite. There appears to be a bug, which invokes undefined behavior,
which is never portable. I don't know if it's a mere typo or an actual
misunderstanding.

> #include <stdio.h>
> #include <stddef.h>
>
> struct S {
> /*...*/
> int i3;
> /*...*/
> int i7;
> };
>
> int main(void)
> {
> struct S s1 = {11, 12}, s2 = {13, 14};
> ptrdiff_t distance;
>
> distance = (char *)&s1.i7 - (char *)&s1.i3;
> printf("%d, %d\n", s2.i3,


This reads the first byte of s2.i7, likely not what you intended:
> (int)*((char *)&s2.i3 + distance));

and, because it reads an int object using a char pointer, the
behavior is undefined. (It will, mostly, work.)
You probably wanted
*(int *)((char *)&s2.i3 + distance));
i.e., cast the pointer to pointer-to-int before the dereference.
If you wanted to extract the lowest addressed byte of the int,
you would use unsigned char
*((unsigned char *)&s2.i3 + distance));
the behavior of which is defined by the standard.

> return 0;
> }



With the example numbers, the code is likely to appear to have worked.
struct S s1 = {11, 12}, s2 = {13, 0x01020304};
will demonstrate the problem.


Martin
--
Martin Golding DoD #0236 | (E-Mail Removed)
Always code as if the person who ends up maintaining your code will be a
violent psychopath who knows where you live.
 