warning of breaking strict-aliasing rules

 
 
Noob
 
      04-10-2012
Hello,

I'm supposed to "clean up" code that does things which the
standard frowns upon, such as

typedef long int LONG;
typedef unsigned long ULONG;
{
    unsigned char csw[ 80 ] = { 0 };
    fill_array(csw);
    LONG sign = *(LONG*)&csw[0];
    ULONG tag = *(ULONG *)&csw[4];
    LONG residue = *(LONG*)&csw[8];
}

AFAIU, there are several problems with this code.

1. Since csw is a byte array, it might not be correctly
aligned for long accesses.

2. I think the compiler is allowed to assume that a given
object cannot be accessed through two incompatible pointers
(Is this what aliasing refers to?)

Strangely, my compiler (gcc) complains only about the
first dereference:

warning: dereferencing type-punned pointer will break strict-aliasing rules

AFAICT, all three lines have the same issue, right?

It seems the compiler should complain "equally" about the
three dereferences. Do you agree?

As for the fix, I think all that is needed is, e.g.
LONG sign;
memcpy(&sign, csw+0, sizeof sign);
/* etc */
Do you agree?

Regards.
 
 
James Kuyper
 
      04-10-2012
On 04/10/2012 10:11 AM, Noob wrote:
> Hello,
>
> I'm supposed to "clean up" code that does things which the
> standard frowns upon, such as
>
> typedef long int LONG;
> typedef unsigned long ULONG;
> {
> unsigned char csw[ 80 ] = { 0 };
> fill_array(csw);
> LONG sign = *(LONG*)&csw[0];
> ULONG tag = *(ULONG *)&csw[4];
> LONG residue = *(LONG*)&csw[8];
> }
>
> AFAIU, there are several problems with this code.
>
> 1. Since csw is a byte array, it might not be correctly
> aligned for long accesses.


Correct.

> 2. I think the compiler is allowed to assume that a given
> object cannot be accessed through two incompatible pointers
> (Is this what aliasing refers to?)


Correct.
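
To illustrate point 2, here is a minimal sketch of the kind of assumption an optimizing compiler is permitted to make. The function name store_and_reload is made up for this example and does not come from the code under discussion.

```c
#include <assert.h>

/* int and float are incompatible types, so a compiler is allowed to
   assume that *i and *f designate different objects.  It may therefore
   return the constant 1 without reloading *i after the store to *f. */
static int store_and_reload(int *i, float *f)
{
    /* Precondition: the two pointers must not refer to the same storage. */
    assert((const void *)i != (const void *)f);

    *i = 1;
    *f = 2.0f;
    return *i;
}
```

Calling such a function with both pointers aimed at the same storage (through a cast like the ones in the original code) has undefined behavior, which is why the result can change with the optimization level.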

> Strangely, my compiler (gcc) complains only about the
> first dereference:
>
> warning: dereferencing type-punned pointer will break strict-aliasing rules
>
> AFAICT, all three lines have the same issue, right?
>
> It seems the compiler should complain "equally" about the
> three dereferences. Do you agree?


That seems reasonable, but the C standard only requires one diagnostic
for a program, no matter how many separate reasons there might be why a
diagnostic is required. The diagnostic doesn't have to contain any
useful information; it doesn't have to be in a language you (or anyone
else) know how to read. The requirement could be met by causing a light
to blink red. Providing anything more than that is a matter of "Quality
of Implementation" (QoI) which is outside the scope of the standard.

Only the gcc developers can actually change this.

> As for the fix, I think all that is needed is, e.g.
> LONG sign;
> memcpy(&sign, csw+0, sizeof sign);
> /* etc */
> Do you agree?


Yes.
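
Spelled out a little further, the memcpy approach might look like the sketch below. The helpers read_long and read_ulong are hypothetical names, and the data is assumed to be in the host's own representation of long. Note that offsets 4 and 8 from the original code only describe adjacent fields when sizeof(long) is 4.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long int LONG;
typedef unsigned long ULONG;

/* Copy sizeof(LONG) bytes starting at buf[offset] into a LONG object.
   memcpy copies the bytes one by one, so the source needs no particular
   alignment and no object is accessed through an incompatible type. */
static LONG read_long(const unsigned char *buf, size_t len, size_t offset)
{
    LONG value;
    assert(offset <= len && sizeof value <= len - offset); /* stay in bounds */
    memcpy(&value, buf + offset, sizeof value);
    return value;
}

/* Same idea for the unsigned field. */
static ULONG read_ulong(const unsigned char *buf, size_t len, size_t offset)
{
    ULONG value;
    assert(offset <= len && sizeof value <= len - offset); /* stay in bounds */
    memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

With these, the original three lines would become something like sign = read_long(csw, sizeof csw, 0); tag = read_ulong(csw, sizeof csw, 4); and so on.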
 
 
Noob
 
      04-10-2012
James Kuyper wrote:

> Noob wrote:
>
>> typedef long int LONG;
>> typedef unsigned long ULONG;
>> {
>> unsigned char csw[ 80 ] = { 0 };
>> fill_array(csw);
>> LONG sign = *(LONG*)&csw[0];
>> ULONG tag = *(ULONG *)&csw[4];
>> LONG residue = *(LONG*)&csw[8];
>> }


[snip]

>> warning: dereferencing type-punned pointer will break strict-aliasing rules
>>
>> It seems the compiler should complain "equally" about the
>> three dereferences. Do you agree?

>
> That seems reasonable, but the C standard only requires one diagnostic
> for a program, no matter how many separate reasons there might be why a
> diagnostic is required.


When you say "one diagnostic for a program" does that mean if I fix
the first line, the compiler should complain about the second?

> The diagnostic doesn't have to contain any
> useful information; it doesn't have to be in a language you (or anyone
> else) know how to read. The requirement could be met by causing a light
> to blink red. Providing anything more than that is a matter of "Quality
> of Implementation" (QoI) which is outside the scope of the standard.


Roger that.
 
James Kuyper
 
      04-10-2012
On 04/10/2012 10:54 AM, Noob wrote:
> James Kuyper wrote:
>
>> Noob wrote:
>>
>>> typedef long int LONG;
>>> typedef unsigned long ULONG;
>>> {
>>> unsigned char csw[ 80 ] = { 0 };
>>> fill_array(csw);
>>> LONG sign = *(LONG*)&csw[0];
>>> ULONG tag = *(ULONG *)&csw[4];
>>> LONG residue = *(LONG*)&csw[8];
>>> }

>
> [snip]
>
>>> warning: dereferencing type-punned pointer will break strict-aliasing rules
>>>
>>> It seems the compiler should complain "equally" about the
>>> three dereferences. Do you agree?

>>
>> That seems reasonable, but the C standard only requires one diagnostic
>> for a program, no matter how many separate reasons there might be why a
>> diagnostic is required.

>
> When you say "one diagnostic for a program" does that mean if I fix
> the first line, the compiler should complain about the second?


Actually, I wasn't paying enough attention. This code contains serious
defects, and it's a good thing that gcc provides the option of giving
you warnings about them, and it would be more reasonable for it to warn
in all three places. However, what's defective about the code is that
the behavior is undefined. It's not a syntax error or a constraint
violation, so no diagnostic is required. Therefore my comment, while
true in isolation, doesn't apply in this context.
 
tom st denis
 
      04-10-2012
On Apr 10, 10:28 am, James Kuyper wrote:
> Only the gcc developers can actually change this.
>
> > As for the fix, I think all that is needed is, e.g.
> > LONG sign;
> > memcpy(&sign, csw+0, sizeof sign);
> > /* etc */
> > Do you agree?

>
> Yes.


What? That's not how to load an integer type from a char array. He
will NEED to do something like

sign = csw[0] | (csw[1] << 8) | ... ;

Or whatever endianness the data is meant to be in.
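
For reference, a byte-wise load of this kind could look like the following sketch. It assumes the field is a 32-bit unsigned value stored least significant byte first; load_le32 is a hypothetical helper name, not something from the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit value from four bytes, least significant byte first.
   Each byte is converted to uint32_t before shifting: a plain
   unsigned char would be promoted to int, and shifting a value of 0x80
   or more left by 24 would then overflow a 32-bit int. */
static uint32_t load_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

The result does not depend on the byte order or alignment requirements of the machine running the code. Converting the result to a signed type is a separate step, and for values above INT32_MAX that conversion is implementation-defined.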

Tom
 
James Kuyper
 
      04-10-2012
On 04/10/2012 11:57 AM, tom st denis wrote:
> On Apr 10, 10:28 am, James Kuyper wrote:
>> Only the gcc developers can actually change this.
>>
>>> As for the fix, I think all that is needed is, e.g.
>>> LONG sign;
>>> memcpy(&sign, csw+0, sizeof sign);
>>> /* etc */
>>> Do you agree?

>>
>> Yes.

>
> What? That's not how to load an integer type from a char array. He
> will NEED to do something like
>
> sign = csw[0] | (csw[1] << 8) | ... ;
>
> Or whatever endianness the data is meant to be in.


I was assuming that he was reading in an integer object in the same
format used by the implementation that compiled the code, possibly even
by another part of the same program. Keep in mind that the code he's
updating is currently working; it just needs updating. If the data
source was using a different format, that wouldn't have been the case.

We've been given no information about what restrictions there are on the
input data format; they might be quite strict, or non-existent. Unless
they're very strict, the function needs additional information to
determine which conversions (such as the ones you describe) are needed.
 
Kaz Kylheku
 
      04-10-2012
On 2012-04-10, Noob <root@127.0.0.1> wrote:
> Hello,
>
> I'm supposed to "clean up" code that does things which the
> standard frowns upon, such as


This is generally a waste of time unless there is an economic justification,
such as: a paying customer wants to use the program on platform X and it does
not work.

If the program works now, it's not going to translate to any tangible benefit
to the end user, and with any change, there is the risk of screwing it up
more.

> warning: dereferencing type-punned pointer will break strict-aliasing rules


GCC supports type punning code which breaks strict aliasing rules.
You just have to give it the right option.

GCC implicitly enables -fstrict-aliasing at optimization levels -O2 and above,
but you can override that with -fno-strict-aliasing later on the command line.

This band-aid fix is way less effort than mucking around with the code.
 
Noob
 
      04-11-2012
Kaz Kylheku wrote:

> Noob wrote:
>
>> I'm supposed to "clean up" code that does things which the
>> standard frowns upon, such as

>
> This is generally a waste of time unless there is an economic justification,
> such as: a paying customer wants to use the program on platform X and it does
> not work.
>
> If the program works now, it's not going to translate to any tangible benefit
> to the end user, and with any change, there is the risk of screwing it up
> more.
>
>> warning: dereferencing type-punned pointer will break strict-aliasing rules

>
> GCC supports type punning code which breaks strict aliasing rules.
> You just have to give it the right option.
>
> GCC implicitly enables -fstrict-aliasing at optimization levels -O2 and above,
> but you can override that with -fno-strict-aliasing later on the command line.
>
> This band-aid fix is way less effort than mucking around with the code.


I think I agree with your advice. I'll just tweak the command line.
 
Noob
 
      04-11-2012
Tom wrote:

> Noob wrote:
>
>> As for the fix, I think all that is needed is, e.g.
>> LONG sign;
>> memcpy(&sign, csw+0, sizeof sign);
>> /* etc */

>
> What? That's not how to load an integer type from a char array.
> He will NEED to do something like
>
> sign = csw[0] | (csw[1] << 8) | ... ;
>
> Or whatever endianness the data is meant to be in.


I will NEED no such thing.
It's a little-endian protocol on a little-endian CPU.
No byte-shuffling is required.
 
Tim Rentsch
 
      05-07-2012
Noob <root@127.0.0.1> writes:

> Hello,
>
> I'm supposed to "clean up" code that does things which the
> standard frowns upon, such as
>
> typedef long int LONG;
> typedef unsigned long ULONG;
> {
> unsigned char csw[ 80 ] = { 0 };
> fill_array(csw);
> LONG sign = *(LONG*)&csw[0];
> ULONG tag = *(ULONG *)&csw[4];
> LONG residue = *(LONG*)&csw[8];
> }
>
> AFAIU, there are several problems with this code.
>
> 1. Since csw is a byte array, it might not be correctly
> aligned for long accesses.
>
> 2. I think the compiler is allowed to assume that a given
> object cannot be accessed through two incompatible pointers
> (Is this what aliasing refers to?)
>
> Strangely, my compiler (gcc) complains only about the
> first dereference:
>
> warning: dereferencing type-punned pointer will break strict-aliasing rules
>
> AFAICT, all three lines have the same issue, right?


In fact, there are several issues, and the gcc complaint is
misleading in this regard. That said, the three lines all have
all the issues.

Point of information: the phrase 'strict-aliasing rules' is
specific to gcc, and is different from what the C Standard
requires (the 'effective type' rules). So gcc's warning
is, in a very important sense, disguising the real problem.


> It seems the compiler should complain "equally" about the
> three dereferences. Do you agree?


Yes, and it should complain in a way that is more
indicative of the problem as the C Standard views it.

> As for the fix, I think all that is needed is, e.g.
> LONG sign;
> memcpy(&sign, csw+0, sizeof sign);
> /* etc */
> Do you agree?


I have to disagree with Kaz's advice in his response. Choosing a
gcc option to disallow "strict aliasing" (whatever that may be,
since gcc defines it however they please, and there is no clear
statement of what the definition is) is NOT the right way to
solve this problem. Using memcpy(), as you show, is one
alternative approach. Here is another:

union {
    unsigned char csw[80];
    LONG longs[ 80 / sizeof (LONG) ];
    ULONG ulongs[ 80 / sizeof (ULONG) ];
} stuff;

fill_array( stuff.csw );
LONG sign = stuff.longs[0];
ULONG tag = stuff.ulongs[1];
LONG residue = stuff.longs[2];

This makes the behavior well-defined (assuming of course the data
read into stuff.csw is represented appropriately). In particular
it takes care of alignment problems, and satisfies the effective
type rules (aka "aliasing rules" or "anti-aliasing rules"). Then
you can use -fstrict-aliasing, and if gcc complains, you will
know that it is gcc, and not the C Standard, that is causing the
problem. A benefit of this approach is that casts and 'void *'
are avoided. A downside is that byte offsets like 4, 8, etc. will
need to be converted to their "in-type" indices, but that
conversion can probably be mostly automated and should not be too
bad. And if you ever want to move the code to a platform where
'sizeof (long) != 4', the transition will be much easier.
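
As a sketch of that last point, here is a variant of the union written with fixed-width types, on the assumption that the fields really are 32 bits wide (the thread only implies this through the offsets 0, 4, 8). The names csw_buffer, csw_fields and csw_decode are hypothetical, and the data is still assumed to arrive in the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CSW_SIZE = 80 };

/* The byte view is what gets filled in; the 32-bit views are what gets
   read.  Being a union, the object is suitably aligned for every member. */
union csw_buffer {
    unsigned char bytes[CSW_SIZE];
    int32_t       i32[CSW_SIZE / sizeof(int32_t)];
    uint32_t      u32[CSW_SIZE / sizeof(uint32_t)];
};

struct csw_fields {
    int32_t  sign;
    uint32_t tag;
    int32_t  residue;
};

/* Read the first three fields through the union members. */
static struct csw_fields csw_decode(const union csw_buffer *buf)
{
    struct csw_fields f;
    assert(buf != NULL);
    f.sign    = buf->i32[0];  /* byte offset 0 */
    f.tag     = buf->u32[1];  /* byte offset 4 */
    f.residue = buf->i32[2];  /* byte offset 8 */
    return f;
}
```

The in-type indices 0, 1, 2 stay tied to byte offsets 0, 4, 8 regardless of how wide long is on the target.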
 