A basic (?) problem with addresses (gcc)

 
 
jacob navia
 
      12-15-2010
On 15/12/10 21:45, BartC wrote:
>
>
> "Seebs" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...
>> On 2010-12-15, Piotrne <(E-Mail Removed)> wrote:
>>> The first result is correct, but what happened to the second?

>>
>> Your code was wrong. The compiler did whatever it wanted.
>>
>>> Probably addresses have been shifted for some reason. But such
>>> constructions seem to be an elementary property of the C language,
>>> and they don't work...

>>
>> Nope. You invoked undefined behavior, the compiler caught you at it.
>>

>
> What was wrong with it? Assuming int and float are the same sizes and
> are aligned in a compatible way.
>

It is wrong because gcc with optimizations screws it up on some machines.

On my Macintosh (OS X, Intel) gcc gives the SAME results with and without
optimizations, presumably because Apple does a good job for us here.
/tmp $ gcc -v
Using built-in specs.
Target: i686-apple-darwin10
[snip]
Thread model: posix
gcc version 4.2.1 (Apple Inc. build 5664)


Using openSUSE (inside VirtualBox on the Macintosh) I obtain the same
good results with both optimized and non-optimized code.

gcc -v gives

Using built-in specs.
Target: x86_64-suse-linux
[snip]
Thread model: posix
gcc version 4.4.1 [gcc-4_4-branch revision 150839] (SUSE Linux)

The problem with many people is that they will never accept that gcc has
bugs. This is politically incorrect since gcc is GNU and GNU means

Gcc has No bUgs





 
Keith Thompson
 
      12-15-2010
jacob navia <(E-Mail Removed)> writes:
> On 15/12/10 21:45, BartC wrote:
>> "Seebs" <(E-Mail Removed)> wrote in message
>> news:(E-Mail Removed)...
>>> On 2010-12-15, Piotrne <(E-Mail Removed)> wrote:
>>>> The first result is correct, but what happened to the second?
>>>
>>> Your code was wrong. The compiler did whatever it wanted.
>>>
>>>> Probably addresses have been shifted for some reason. But such
>>>> constructions seem to be an elementary property of the C language,
>>>> and they don't work...
>>>
>>> Nope. You invoked undefined behavior, the compiler caught you at it.
>>>

>>
>> What was wrong with it? Assuming int and float are the same sizes and
>> are aligned in a compatible way.
>>

> It is wrong because gcc with optimizations screws it up on some machines

[snip]
>
> The problem with many people is that they will never accept that gcc has
> bugs. This is politically incorrect since gcc is GNU and GNU means
>
> Gcc has No bUgs
>
>


Here's the original program:

#include <stdio.h>

int main(int argc, char **argv)
{
    float x = 4.3;
    int y;

    y = *(int*)&x; /* copying of 4 bytes to int */
    x = *(float*)&y; /* and back to float */

    printf("x=%f\n",x); /* 4.3 expected here */
    printf("y=%d\n",y);
    return 0;
}

Assuming int and float have the same alignment requirements, the
pointer conversions are ok (C99 6.3.2.3p7), but 6.5p7 says:

    An object shall have its stored value accessed only by an lvalue
    expression that has one of the following types:

    -- a type compatible with the effective type of the object,

    -- a qualified version of a type compatible with the effective
       type of the object,

    -- a type that is the signed or unsigned type corresponding to
       the effective type of the object,

    -- a type that is the signed or unsigned type corresponding to
       a qualified version of the effective type of the object,

    -- an aggregate or union type that includes one of the
       aforementioned types among its members (including,
       recursively, a member of a subaggregate or contained
       union), or

    -- a character type.

The effective types of x and y are float and int, respectively
(their declared types, see 6.5p6). (Storing a value into an object
with no declared type can change its effective type; that doesn't
apply here.) The first assignment accesses the stored value of x
by an lvalue of type int; likewise, the second accesses the stored
value of y by an lvalue of type float. Both accesses violate 6.5p7,
so the program's behavior is undefined. gcc apparently assumes
that y will not be modified via an lvalue of type float, and that
x will not be modified via an lvalue of type int, and performs some
optimizations based on those assumptions.

It even warns about what it's doing:

c.c:8: warning: dereferencing type-punned pointer will break strict-aliasing rules
c.c:9: warning: dereferencing type-punned pointer will break strict-aliasing rules

I do not claim or believe for one moment that gcc is bug-free
(and I seem to recall someone here saying recently that gcc's
"strict-aliasing rules" might go beyond what the standard permits),
but in this case the bug is in the program, not in the compiler.
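
For comparison, here is a minimal sketch of one way to get the intended
round trip without touching an object through an lvalue of the wrong type:
copy the bytes with memcpy(). The helper names float_bits and bits_float
are hypothetical, uint32_t stands in for the original int, and the
compile-time size check uses C11's _Static_assert, none of which appear in
the original program.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is exactly 32 bits wide; checked at compile time (C11). */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Copy the object representation of a float into an integer. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);   /* byte copy: f is never read as an integer lvalue */
    return u;
}

/* Copy the bits back into a float; the inverse of float_bits(). */
static float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

With these helpers the two assignments become y = float_bits(x); and
x = bits_float(y);, and x comes back with the value it started with.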

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
BartC
 
      12-16-2010

"Keith Thompson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...

> int main(int argc, char **argv)
> {
> float x = 4.3;
> int y;
>
> y = *(int*)&x; /* copying of 4 bytes to int */
> x = *(float*)&y; /* and back to float */
>
> printf("x=%f\n",x); /* 4.3 expected here */
> printf("y=%d\n",y);
> return 0;
> }


> It even warns about what it's doing:
>
> c.c:8: warning: dereferencing type-punned pointer will break
> strict-aliasing rules
> c.c:9: warning: dereferencing type-punned pointer will break
> strict-aliasing rules
>
> I do not claim or believe for one moment that gcc is bug-free
> (and I seem to recall someone here saying recently that gcc's
> "strict-aliasing rules" might go beyond what the standard permits),
> but in this case the bug is in the program, not in the compiler.


How then do you do this (vaguely Fortran code) in C:

integer*4 i
real*4 a
equivalence (a,i)

This is a related problem: both i and a share the same address, and those
four bytes can be accessed as an integer or a float value.

And do it without doing any unnecessary copying (memcpy) or using unions
(not always practical, and which could anyway have the same problems).
(Assume you know the hardware would have no problems with this, and you
don't care about portability.)

--
Bartc

 
Keith Thompson
 
      12-16-2010
"BartC" <(E-Mail Removed)> writes:
> "Keith Thompson" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...
>
>> int main(int argc, char **argv)
>> {
>> float x = 4.3;
>> int y;
>>
>> y = *(int*)&x; /* copying of 4 bytes to int */
>> x = *(float*)&y; /* and back to float */
>>
>> printf("x=%f\n",x); /* 4.3 expected here */
>> printf("y=%d\n",y);
>> return 0;
>> }

>
>> It even warns about what it's doing:
>>
>> c.c:8: warning: dereferencing type-punned pointer will break
>> strict-aliasing rules
>> c.c:9: warning: dereferencing type-punned pointer will break
>> strict-aliasing rules
>>
>> I do not claim or believe for one moment that gcc is bug-free
>> (and I seem to recall someone here saying recently that gcc's
>> "strict-aliasing rules" might go beyond what the standard permits),
>> but in this case the bug is in the program, not in the compiler.

>
> How then do you do this (vaguely Fortran code) in C:
>
> integer*4 i
> real*4 a
> equivalence (a,i)
>
> This is a related problem: both i and a share the same address, and those
> four bytes can be accessed as an integer or a float value.


Use a union.

> And do it without doing any unnecessary copying (memcpy) or using unions
> (not always practical, and which could anyway have the same problems).
> (Assume you know the hardware would have no problems with this, and you
> don't care about portability.)


I think a union is the best solution. A footnote on C99 6.5.2.3p3 says:

If the member used to access the contents of a union object
is not the same as the member last used to store a value in
the object, the appropriate part of the object representation
of the value is reinterpreted as an object representation in
the new type as described in 6.2.6 (a process sometimes called
"type punning"). This might be a trap representation.

So they're not going to have the same problem (assuming you can avoid
trap representations).
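
A minimal sketch of the union approach, roughly the C counterpart of
equivalence (a,i). The names float_int and bits_via_union are hypothetical,
and uint32_t is chosen here (rather than int) because it has no padding
bits, so reading it cannot hit a trap representation. The size check uses
C11's _Static_assert.

```c
#include <stdint.h>

/* Assumption: float is exactly 32 bits wide; checked at compile time (C11). */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Both members share the same storage, like EQUIVALENCE (a,i). */
union float_int {
    float    a;
    uint32_t i;
};

/* Store through one member, read through the other. */
static uint32_t bits_via_union(float f)
{
    union float_int u;
    u.a = f;      /* last stored member is a */
    return u.i;   /* the same bytes reinterpreted as uint32_t */
}
```

The value returned depends on how the implementation represents float, so
it is well defined but not portable.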

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Seebs
 
      12-16-2010
On 2010-12-15, BartC <(E-Mail Removed)> wrote:
> What was wrong with it? Assuming int and float are the same sizes and are
> aligned in a compatible way.


It tried to read something through an lvalue of the wrong type. Ultimately,
this violates the strict aliasing rules; the compiler is allowed to ignore
the reference or do anything it wants with it.
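
By contrast, the last item in the 6.5p7 list (a character type) means the
bytes of any object may be inspected through an unsigned char lvalue. A
small sketch of that permitted case follows; the function name float_bytes
is hypothetical.

```c
#include <stddef.h>

/* Copy the bytes of *f into out[], reading them one unsigned char at a time. */
static void float_bytes(const float *f, unsigned char out[sizeof(float)])
{
    const unsigned char *p = (const unsigned char *)f;
    size_t k;

    for (k = 0; k < sizeof *f; k++)
        out[k] = p[k];   /* character-type access, allowed by 6.5p7 */
}
```

The order and meaning of the bytes still depend on the platform; only the
access itself is guaranteed to be defined.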

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / (E-Mail Removed)
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.
 
Seebs
 
      12-16-2010
On 2010-12-16, BartC <(E-Mail Removed)> wrote:
> How then do you do this (vaguely Fortran code) in C:


You don't -- it violates one of the rules. Any attempt to do this
is *necessarily* undefined behavior.

> And do it without doing any unnecessary copying (memcpy) or using unions
> (not always practical, and which could anyway have the same problems).
> (Assume you know the hardware would have no problems with this, and you
> don't care about portability.)


The way you express that is with a union. Apart from that, it's undefined
behavior and you *can't* express it in plain C.

One of the points of using C, rather than assembly, is that the language
spec defines the language in a way that, at least a little, cares about
portability.

You might be able to fake something up by declaring things with "volatile"
somewhere in them, but...

Basically, if you are assuming you know the hardware has no problems with
this, you're not writing C, but a machine-specific variant which the
compiler may not support, and isn't obliged to.
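
For readers who really are targeting one compiler: GCC documents two
escape hatches that sit outside standard C, the -fno-strict-aliasing
option and the may_alias type attribute (Clang accepts both as well). The
sketch below uses the attribute; int_alias and float_as_int are
hypothetical names, and the code assumes int and float have the same size
and compatible alignment, as the thread does.

```c
/* Compiler extension (GCC, Clang), NOT standard C:
   accesses through this type are allowed to alias objects of any type. */
typedef int __attribute__((__may_alias__)) int_alias;

/* Assumption: int and float have the same size; checked at compile time (C11). */
_Static_assert(sizeof(int) == sizeof(float), "int and float must have the same size");

/* Read the bytes of *f as an int without a copy. */
static int float_as_int(const float *f)
{
    return *(const int_alias *)f;
}
```

A compiler that does not implement the attribute is free to reject this,
which is exactly the trade-off described above.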

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / (E-Mail Removed)
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.
 
BartC
 
      12-16-2010
"Keith Thompson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "BartC" <(E-Mail Removed)> writes:


>> How then do you do this (vaguely Fortran code) in C:
>>
>> integer*4 i
>> real*4 a
>> equivalence (a,i)
>>
>> This is a related problem: both i and a share the same address, and those
>> four bytes can be accessed as an integer or a float value.

>
> Use a union.


OK. But apart from the inconvenience of wrapping these things in unions then
having to use field selection to access the data, how do you do something
like this:

integer*4 i(20)
real*8 a
equivalence (a,i(7))

(So the 8 bytes starting at i(7) are shared with the floating point number.)

The OP's method (perhaps wrapped in a macro) would have been ideal for this:

#define asdouble(x) (*(double*)&(x))

asdouble(i[7]);


> I think a union is the best solution. A footnote on C99 6.5.2.3p3 says:
>
> If the member used to access the contents of a union object
> is not the same as the member last used to store a value in
> the object, the appropriate part of the object representation
> of the value is reinterpreted as an object representation in
> the new type as described in 6.2.6 (a process sometimes called
> "type punning"). This might be a trap representation
>
> So they're not going to have the same problem (assuming you can avoid
> trap representations).


I must have got the idea somewhere that you could only read out the same
member that was last written.

--
Bartc

 
Jens Thoms Toerring
 
      12-16-2010
BartC <(E-Mail Removed)> wrote:
> "Keith Thompson" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed)...
> > "BartC" <(E-Mail Removed)> writes:


> >> How then do you do this (vaguely Fortran code) in C:
> >>
> >> integer*4 i
> >> real*4 a
> >> equivalence (a,i)
> >>
> >> This is a related problem: both i and a share the same address, and those
> >> four bytes can be accessed as an integer or a float value.

> >
> > Use a union.


> OK. But apart from the inconvenience of wrapping these things in unions then
> having to use field selection to access the data, how do you do something
> like this:


> integer*4 i(20)
> real*8 a
> equivalence (a,i(7))


> (So the 8 bytes starting at i(7) are shared with the floating point number.)


You don't. The EQUIVALENCE stuff in FORTRAN is just a horrible hack
IMHO (besides computed GOTOs and COMMON blocks, it's one of the most
effective ways to write completely obfuscated FORTRAN programs).
I never went too far with FORTRAN (actually my first language, but
then quickly forgotten), so how does FORTRAN deal with this when
on a certain system a real*8 must be 8-byte aligned but an integer
only 4-byte aligned? Then accessing 'a' might be "interesting"
whenever i(7) does not land on an 8-byte boundary. To make it
transparent to the programmer (and not result in a SIGBUS) the
compiler would have to do something equivalent to a memcpy() to a
temporary (correctly aligned for real) variable each time 'a' is
accessed...

But then you can get a similar effect in C anyway with memcpy(),
just the normal syntax of the language doesn't support it for good
reasons IMHO. Why make something inherently broken (unless under
some very special circumstances) easy to do?
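
A sketch of that memcpy() approach for the i(7) case. The names
load_double and store_double are hypothetical, int32_t stands in for
integer*4, and double is assumed to be 64 bits wide (checked with C11's
_Static_assert). Fortran's i(7) corresponds to idx 6 with C's zero-based
indexing, and the caller must make sure idx + 1 is still inside the array.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: a double occupies exactly two int32_t slots. */
_Static_assert(sizeof(double) == 2 * sizeof(int32_t), "double must be 64 bits");

/* Read the double whose bytes occupy i[idx] and i[idx + 1]. */
static double load_double(const int32_t *i, size_t idx)
{
    double a;
    memcpy(&a, &i[idx], sizeof a);   /* works even if &i[idx] is only 4-byte aligned */
    return a;
}

/* Overwrite i[idx] and i[idx + 1] with the bytes of a. */
static void store_double(int32_t *i, size_t idx, double a)
{
    memcpy(&i[idx], &a, sizeof a);
}
```

Because the size is a compile-time constant, optimizing compilers
typically turn such a memcpy() into a plain load or store where the
hardware allows it, so the copy need not cost anything in practice.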

> > I think a union is the best solution. A footnote on C99 6.5.2.3p3 says:
> >
> > If the member used to access the contents of a union object
> > is not the same as the member last used to store a value in
> > the object, the appropriate part of the object representation
> > of the value is reinterpreted as an object representation in
> > the new type as described in 6.2.6 (a process sometimes called
> > "type punning"). This might be a trap representation
> >
> > So they're not going to have the same problem (assuming you can avoid
> > trap representations).


> I must have got the idea somewhere that you could only read out the same
> member that was last written.


I would guess the standard as cited by Keith (with its emphasis on the
problem with trap representations) is pretty clear, i.e. if you're
lucky (no trap representation) it "works". What "works" actually
means is another question - if you e.g. try to read the value of
a float as an int then, of course, what you get will depend on the
bit representation of floats and ints on that system. So the result
will be inherently system dependent - but then it already is because
for this to somehow "work" requires that a float and an int have the
same size.
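
To illustrate how representation-dependent the result is, here is a sketch
that splits a float into its fields under the explicit assumption that
float uses the IEEE 754 binary32 layout (1 sign bit, 8 exponent bits, 23
fraction bits). The struct float_fields and the function decode_binary32
are hypothetical names; on a system with a different float format the
numbers would mean something else.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide; the field layout below further
   assumes IEEE 754 binary32, which the C standard does not require. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

struct float_fields {
    unsigned sign;       /* 1 bit */
    unsigned exponent;   /* 8 bits, biased by 127 */
    uint32_t fraction;   /* 23 bits */
};

static struct float_fields decode_binary32(float f)
{
    struct float_fields r;
    uint32_t u;

    memcpy(&u, &f, sizeof u);          /* defined way to get at the bits */
    r.sign     = (unsigned)(u >> 31);
    r.exponent = (unsigned)((u >> 23) & 0xffu);
    r.fraction = u & 0x7fffffu;
    return r;
}
```

Under that assumption, 1.0f decodes to sign 0, biased exponent 127 and
fraction 0.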
Regards, Jens
--
\ Jens Thoms Toerring ___ (E-Mail Removed)
\__________________________ http://toerring.de
 
lawrence.jones@siemens.com
 
      12-16-2010
BartC <(E-Mail Removed)> wrote [re. unions]:
>
> I must have got the idea somewhere that you could only read out the same
> member that was last written.


C89. The rules were changed in C99 to bless what everyone expected and
all known implementations did anyway.
--
Larry Jones

I'm getting disillusioned with these New Years. -- Calvin
 
Seebs
 
      12-16-2010
On 2010-12-16, BartC <(E-Mail Removed)> wrote:
> OK. But apart from the inconvenience of wrapping these things in unions then
> having to use field selection to access the data, how do you do something
> like this:


> integer*4 i(20)
> real*8 a
> equivalence (a,i(7))


> (So the 8 bytes starting at i(7) are shared with the floating point number.)


You don't. C doesn't support or allow for overlap like this, so far as I
know.

> I must have got the idea somewhere that you could only read out the same
> member that was last written.


You can only read the same member that was last written if you want to know
what you'll get.

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / (E-Mail Removed)
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.
 