Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Union and strict aliasing

Reply
Thread Tools

Union and strict aliasing

 
 
Maxim Fomin
Guest
Posts: n/a
 
      07-28-2012
Problem: there is need to allocate space for some object of unknown quantity and I know, that in "99% cases" it would be less than N items. In such cases it is better to use fixed-size array, however I don't want to impose limitations and cover all possible cases (when quantity is bigger that N). Asa trade-off I use union of fixed-size array with single element:

union u
{
type *arr[N]; /* type is opaque structure */
type *ptr;
};

If quantity is bigger than N, I assign allocated memory to ptr.

However, consider following example program (uhack.c):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct data
{
int val;
char *align;
} d0 = {0, "hello"},
d1 = {1, "world"},
d2 = {2, "from"},
d3 = {3, "a"},
d4 = {4, "simple"},
d5 = {5, "c"},
d6 = {6, "program"};

union
{
struct data *arr[7];
struct data *ptr;
} u = {{&d0, &d1, &d2, &d3, &d4, &d5, &d6}};

int main(void)
{
int i;
struct data *new_ptr;
for (i = 0; i < 7; i++) {
printf("%s\n", u.ptr[i].align);
}
new_ptr = malloc(9 * sizeof *new_ptr);
memcpy(new_ptr, u.ptr, 7 * sizeof *new_ptr);
u.ptr = new_ptr;
for (i = 0; i < 7; i++) {
printf("%s\n", u.ptr[i].align);
}
return 0;
}

Compiling with gcc (gcc -Wall -Wextra -pedantic -std=c99 -O0 -o gccO0 -Wstrict-aliasing -fstrict-aliasing uhack.c) gives expected result, but raising optimization level (gcc -Wall -Wextra -pedantic -std=c99 -O1 -o gccO1 -Wstrict-aliasing -fstrict-aliasing uhack.c) gives binary which segfaults onmy machine.

I don't understand where "strict aliasing rules" are broken (assuming that raising optimization level for given program reveals such errors):
- union trick unlikely is a source of problem, since it overlaps compatibletypes (even same): an object and first element of fixed-size array of suchobjects
- union u is likely to be initialized right;
- padding and trailing bits are also unlikely to cause problem, since members are accessed through . and ->

In addition, clang (clang -Wall -Wextra -pedantic -std=c99 -O3 -o clangO3-Wstrict-aliasing -fstrict-aliasing uhack.c) for O3 produces runnable binary, so I suppose there is somewhere tricky UB which I cannot find.
 
Reply With Quote
 
 
 
 
Ike Naar
Guest
Posts: n/a
 
      07-28-2012
On 2012-07-28, Maxim Fomin <(E-Mail Removed)> wrote:
> Problem: there is need to allocate space for some object of unknown
> quantity and I know, that in "99% cases" it would be less than N
> items. In such cases it is better to use fixed-size array, however I
> don't want to impose limitations and cover all possible cases (when
> quantity is bigger that N). As a trade-off I use union of fixed-size
> array with single element:
>
> union u
> {
> type *arr[N]; /* type is opaque structure */
> type *ptr;
> };
>
> If quantity is bigger than N, I assign allocated memory to ptr.
>
> However, consider following example program (uhack.c):
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> struct data
> {
> int val;
> char *align;
> } d0 = {0, "hello"},
> d1 = {1, "world"},
> d2 = {2, "from"},
> d3 = {3, "a"},
> d4 = {4, "simple"},
> d5 = {5, "c"},
> d6 = {6, "program"};
>
> union
> {
> struct data *arr[7];
> struct data *ptr;
> } u = {{&d0, &d1, &d2, &d3, &d4, &d5, &d6}};
>
> int main(void)
> {
> int i;
> struct data *new_ptr;
> for (i = 0; i < 7; i++) {
> printf("%s\n", u.ptr[i].align);
> }
> new_ptr = malloc(9 * sizeof *new_ptr);
> memcpy(new_ptr, u.ptr, 7 * sizeof *new_ptr);
> u.ptr = new_ptr;
> for (i = 0; i < 7; i++) {
> printf("%s\n", u.ptr[i].align);
> }
> return 0;
> }


Strange things happening here.
This program relies on the assumption that the variables
d0,d1,d2,d3,d4,d5,d6 are allocated contiguously and in that order.
The compiler is not obliged to honour this assumption.
See what happens when you swap, say, the lines

d1 = {1, "world"},
d2 = {2, "from"},

to get

d2 = {2, "from"},
d1 = {1, "world"},

In theory this should yield an equivalent program, but
you might find it changes the output printed by the program.

If, for whatever reason, the compiler decides to insert
padding between any of d0,d1,d2,d3,d4,d5,d6, the program
is likely to crash.

Note that the initialization

} u = {{&d0, &d1, &d2, &d3, &d4, &d5, &d6}};

is bogus (except for the initialization of the first element),
if the initialization were changed to, say,

} u = {{&d0, 0, 0, 0, 0, 0, 0}};

it would not change the working of the program, so why are
you taking the trouble to initialize u.arr[1..6] at all?
 
Reply With Quote
 
 
 
 
Ben Bacarisse
Guest
Posts: n/a
 
      07-29-2012
Maxim Fomin <(E-Mail Removed)> writes:

> Problem: there is need to allocate space for some object of unknown
> quantity and I know, that in "99% cases" it would be less than N
> items. In such cases it is better to use fixed-size array, however I
> don't want to impose limitations and cover all possible cases (when
> quantity is bigger that N). As a trade-off I use union of fixed-size
> array with single element:
>
> union u
> {
> type *arr[N]; /* type is opaque structure */
> type *ptr;
> };
>
> If quantity is bigger than N, I assign allocated memory to ptr.
>
> However, consider following example program (uhack.c):
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> struct data
> {
> int val;
> char *align;
> } d0 = {0, "hello"},
> d1 = {1, "world"},
> d2 = {2, "from"},
> d3 = {3, "a"},
> d4 = {4, "simple"},
> d5 = {5, "c"},
> d6 = {6, "program"};
>
> union
> {
> struct data *arr[7];
> struct data *ptr;
> } u = {{&d0, &d1, &d2, &d3, &d4, &d5, &d6}};
>
> int main(void)
> {
> int i;
> struct data *new_ptr;
> for (i = 0; i < 7; i++) {
> printf("%s\n", u.ptr[i].align);
> }


<screeching brakes>
Whoa!

Forget all the problems you might have later, you're off the rails
already. If I could easily draw a diagram, it would be simpler to
explain, but let me try in words.

u.ptr occupies the same space as u.arr[0] -- they overlap -- and since
they have the same type, you can use them in similar ways. This means
that writing u.ptr[i] is the same as writing u.arr[0][i] which is not at
all what you thought (I image). As Ike has said, if d1 happens to
follow d0 in memory, u.ptr[1], u.arr[0][1], u.arr[1] and &d2 will all
have the same value and things will, deceptively, look OK.

I don't think you can do what you are trying to do starting from here.
Given the limited example, it's hard to say what's best. I'd re-write
you example just using a simple pointer that either points to a static
array or to an allocated one, but the fact that you are not already
doing that suggests that your real usage is more complicated than your
posted example. I feel more details are needed.

> new_ptr = malloc(9 * sizeof *new_ptr);
> memcpy(new_ptr, u.ptr, 7 * sizeof *new_ptr);
> u.ptr = new_ptr;
> for (i = 0; i < 7; i++) {
> printf("%s\n", u.ptr[i].align);
> }
> return 0;
> }


<snip>
--
Ben.
 
Reply With Quote
 
Phil Carmody
Guest
Posts: n/a
 
      07-29-2012
Maxim Fomin <(E-Mail Removed)> writes:
> Problem: there is need to allocate space for some object of unknown quantity and I know, that in "99% cases" it would be less than N items. In such cases it is better to use fixed-size array, however I don't want to impose limitations and cover all possible cases (when quantity is bigger that N). As a trade-off I use union of fixed-size array with single element:
>
> union u
> {
> type *arr[N]; /* type is opaque structure */
> type *ptr;
> };


What's wrong with just coding the possibility of needing to
look elsewhere for more?

struct block_of_N_typeptr {
type *array[N];
struct block_of_N_typeptr *next;
};

But even that's higher-tech than you really need - what's wrong with
just realloc()?

Phil
--
> I'd argue that there is much evidence for the existence of a God.

Pics or it didn't happen.
-- Tom (/. uid 822)
 
Reply With Quote
 
Maxim Fomin
Guest
Posts: n/a
 
      08-02-2012
воскресенье, 29 июля 2012*г., 3:13:29 UTC+4 пользователь Ike Naar написал:
> On 2012-07-28, Maxim Fomin <(E-Mail Removed)> wrote:
> Strange things happening here.
>
> This program relies on the assumption that the variables
>
> d0,d1,d2,d3,d4,d5,d6 are allocated contiguously and in that order.
>
> The compiler is not obliged to honour this assumption.
>
>


<skip>

Agree, but I found another problem - using u.ptr[i].align
instead of u.arr[i]->align. Because this expression was
probably considered as intentional, the code may be
reconsidered as conforming.

>
> Note that the initialization
>
>
>
> } u = {{&d0, &d1, &d2, &d3, &d4, &d5, &d6}};
>
>
>
> is bogus (except for the initialization of the first element),
>
> if the initialization were changed to, say,
>
>
>
> } u = {{&d0, 0, 0, 0, 0, 0, 0}};
>
>
>
> it would not change the working of the program, so why are
>
> you taking the trouble to initialize u.arr[1..6] at all?


 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Strict aliasing and Q2.6 in the FAQ Conor F C Programming 7 09-20-2011 09:55 PM
Strict aliasing and Q2.6 in the FAQ Carveone C Programming 0 09-19-2011 04:11 PM
Strict aliasing and buffer handling Francois Duranleau C++ 20 06-21-2011 11:43 PM
char and strict aliasing Paul Brettschneider C++ 4 07-18-2008 12:22 PM
Strict Pointer Aliasing Question Bryan Parkoff C++ 2 01-15-2004 06:43 PM



Advertisments