Velocity Reviews > contiguity of arrays

# contiguity of arrays

Steve Kobes
Guest
Posts: n/a

 09-25-2004
Is this legal? Must it print 4?

int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
printf("%d\n", *(b + 3));

--Steve

Al Bowers
Guest
Posts: n/a

 09-25-2004

Steve Kobes wrote:
> Is this legal? Must it print 4?
>
> int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
> printf("%d\n", *(b + 3));
>
> --Steve

Yes it is legal and will print 4.
This array describes a contiguously allocated nonempty set of objects of
type int.
b = a[0] positions the int *b to the beginning of the array object as
would b = &a[0][0].

#include <stdio.h>

int main(void)
{
int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];

printf("a[2][2] = ((1,2},{3,4}};\nint *b;\n"
"Using b = a[0]. *(b + 3) = %d\n", *(b + 3));
b = &a[0][0];
printf("Using b = &a[0][0]. *(b + 3) = %d\n",*(b+3));
printf("The pointer values &a[0][0] and a[0] are%sequal\n",
(&a[0][0] == a[0])?" ":" not ");
return 0;
}

--
Al Bowers
Tampa, Fl USA
mailto: http://www.velocityreviews.com/forums/(E-Mail Removed) (remove the x to send email)
http://www.geocities.com/abowers822/

Chris Torek
Guest
Posts: n/a

 09-25-2004
>Steve Kobes wrote:
>> Is this legal? Must it print 4?
>>
>> int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>> printf("%d\n", *(b + 3));

In article <news:(E-Mail Removed)>
Al Bowers <(E-Mail Removed)> wrote:
>Yes it is legal and will print 4.

I believe there are those in comp.std.c, at least, who will disagree
with you.

>This array describes a contiguously allocated nonempty set of objects of
>type int.
>b = a[0] positions the int *b to the beginning of the array object as
>would b = &a[0][0].

In particular, "b" points to the first of the two consecutive "int"s
{1, 2}. These two consecutive "int"s are indeed followed immediately
in memory by the two consecutive "int"s {3, 4} -- but a compiler
is allowed, through arbitrarily magical means, to insert bounds-checking
on array subscripts and pointer arithmetic. (Remember that these
two are quite similar: subscripting is simply address arithmetic
followed by pointer indirection, and pointer indirection is simply
subscripting -- *p and p[0] are identical.[%] Note that we are allowed
to compute &array[N] but not follow the resulting pointer, so there
is a slight hitch in using the same code for both -- you also need
a boolean flag "this address arithmetic will be followed by
indirection, or not", to handle this in any bounds-checking code.)

Suppose that the compiler does in fact do this magical bounds-checking,
so that it "knows" that b points to the first of only *two* ints stored
in the array a[0]. In this case, b[0] and b[1] are allowed, but b[2]
is rejected with an "out of bounds array subscript" error.

(Personally, I find it all quite unlikely, but it does seem to be
allowed. I also think that, if one writes a compiler with the
decidedly non-magical bounds-checking done via either "fat pointers"
or some kind of runtime table lookups, the bounds for b should run
from 0 to 3 inclusive, in this case, rather than 0 to 1. But these
are my opinions of "the way it ought to work" rather than "what
the letter of the C standard says".)

In practice, I do not believe anyone has yet found an implementation
where b[3] (or equivalently, *(b + 3)) is caught as an error. The
most likely place to find this kind of runtime[%%] checking is in
C compilers for Lisp machines, and I have not had access to one of
those in years.

[% As a result, you can always replace (*p) with p[0], including
in expressions like (*p).field, which is more usually written as
p->field. Hence the -> operator is replaceable with "[0].", e.g.,
instead of:

if (a->b->c == 42)

you can write:

if (a[0].b[0].c == 42)

This example just goes to show that C has a plethora of operators. ]

[%% This particular case can be caught at compile time, but in
general one would want to test the subscript in b[i], or the offset
in (b+i), at runtime, both because "i" might be difficult or
impossible to predict, and also because "b" might also be difficult
or impossible to predict:

int a1[10], a2[2];
int *b;
...
if (user_input == 42)
b = a1;
else if (user_input == 6 * 9)
b = a2;
...
do_something(b[3]);

Here, a general-purpose bounds-checker might do something vaguely
like this:

struct bounds *bp;

bp = bounds_lookup(b);
if (bp == NULL)
__runtime_error("invalid pointer");
if (bp->upper_limit < 3)
__runtime_error("out of bounds array subscript");
tmp = b[3];
do_something(tmp);

to handle the call to do_something(). The bounds_lookup() function
compares the supplied pointer value against a table of all active
arrays -- all "global variables" that have array type, all outstanding
malloc()s, and all local variables that are still live on the current
stack frame, for instance -- and finds the corresponding array
bounds information. We can do this without concerning ourselves
with ordinary simple pointers (&localvar, where localvar is *not*
an array) because we have a nonzero subscript. If this were using
b[i] the code would have to do:

if (i != 0) {
struct bounds *bp;
... as before ...
} else
tmp = *b; /* cannot do bounds checking on zero subscripts */
do_something(tmp);

becaues each ordinary variable "acts like" an array of size 0.
Alternatively, bounds_lookup() might actually have bounds-table
information for *all* variables whose address is taken, instead
of just all arrays.

The bounds-checking above needs to handle pointer arithmetic as
well, so that if we do "b++", the upper limit is reduced by one
and the lower limit goes from 0 to -1. (There are other methods
for handling this that are probably more practical in real
bounds-checkers; this version is just for illustration.)

Bounds-checking tends to slow down programs quite a bit, and is an
area in which good compiler-level optimization is important. If
a loop is guaranteed to run from 0 to N-1 inclusive, a single
bounds-check on N-1 before entering the loop at all replaces many
runtime tests. If the bounds-check can be done at compile time,
then omitted entirely from the runtime code, that is even better.]
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

pete
Guest
Posts: n/a

 09-25-2004
Chris Torek wrote:
>
> >Steve Kobes wrote:
> >> Is this legal? Must it print 4?
> >>
> >> int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
> >> printf("%d\n", *(b + 3));

>
> In article <news:(E-Mail Removed)>
> Al Bowers <(E-Mail Removed)> wrote:
> >Yes it is legal and will print 4.

>
> I believe there are those in comp.std.c, at least, who will disagree
> with you.

If it was

int a[2][2] = {{1, 2}, {3, 4}}, *b = (int *)a;
^^^^^^^
then I believe you could access b[3].

--
pete

Barry Schwarz
Guest
Posts: n/a

 09-25-2004
On 25 Sep 2004 17:21:37 GMT, Chris Torek <(E-Mail Removed)> wrote:

>>Steve Kobes wrote:
>>> Is this legal? Must it print 4?
>>>
>>> int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>>> printf("%d\n", *(b + 3));

>
>In article <news:(E-Mail Removed)>
>Al Bowers <(E-Mail Removed)> wrote:
>>Yes it is legal and will print 4.

>
>I believe there are those in comp.std.c, at least, who will disagree
>with you.

I could not follow your discussion which followed so I don't know
exactly which point I'm about to disagree with. Instead I will try to
present an argument that I believe shows the original question about
legality must be answered in the affirmative.

For an array q (not declared as a function parameter),
sizeof q / sizeof *q must evaluate to the number of elements in b.

For a pointer p to type T, the expression p+1 must evaluate to the
same value as (T*)((char*)p + sizeof(T))

Therefore

sizeof a must evaluate to the same value as 2*sizeof a[0]

sizeof a[0] and sizeof a[1] both must evaluate to the same value
as 2*sizeof a[0][0].

sizeof a[0][0] must evaluate to the same value as sizeof(int)

sizeof a must evaluate to the same value as 4*sizeof(int)

Since a contains 4 int and is exactly large enough to contain 4 int,
these 4 int must be located at b, b+1, b+2, and b+3.

snip

<<Remove the del for email>>

Mark McIntyre
Guest
Posts: n/a

 09-25-2004
On 25 Sep 2004 22:48:34 GMT, in comp.lang.c , Barry Schwarz
<(E-Mail Removed)> wrote:

>On 25 Sep 2004 17:21:37 GMT, Chris Torek <(E-Mail Removed)> wrote:
>
>>>Steve Kobes wrote:
>>>> Is this legal? Must it print 4?
>>>>
>>>> int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
>>>> printf("%d\n", *(b + 3));

>>
>>In article <news:(E-Mail Removed)>
>>Al Bowers <(E-Mail Removed)> wrote:
>>>Yes it is legal and will print 4.

>>
>>I believe there are those in comp.std.c, at least, who will disagree
>>with you.

>
>I could not follow your discussion which followed so I don't know
>exactly which point I'm about to disagree with. Instead I will try to
>present an argument that I believe shows the original question about
>legality must be answered in the affirmative.

(snip argument which is based on the size of the objects, and 1-d array
arithmetic. ) The size argument is spurious. The compiler is allowed to
tell you the object is size 4, even if all 4 members were in separate
memory arrays in different solar systems. Theres no limit to the internal
magic the compiler can perform.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>

----== Posted via Newsfeed.Com - Unlimited-Uncensored-Secure Usenet News==----
http://www.newsfeed.com The #1 Newsgroup Service in the World! >100,000 Newsgroups
---= 19 East/West-Coast Specialized Servers - Total Privacy via Encryption =---

pete
Guest
Posts: n/a

 09-25-2004
Barry Schwarz wrote:
>
> On 25 Sep 2004 17:21:37 GMT, Chris Torek <(E-Mail Removed)> wrote:
>
> >>Steve Kobes wrote:
> >>> Is this legal? Must it print 4?
> >>>
> >>> int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
> >>> printf("%d\n", *(b + 3));

> >
> >In article <news:(E-Mail Removed)>
> >Al Bowers <(E-Mail Removed)> wrote:
> >>Yes it is legal and will print 4.

> >
> >I believe there are those in comp.std.c, at least, who will disagree
> >with you.

>
> I could not follow your discussion which followed so I don't know
> exactly which point I'm about to disagree with. Instead I will try to
> present an argument that I believe shows the original question about
> legality must be answered in the affirmative.
>
> For an array q (not declared as a function parameter),
> sizeof q / sizeof *q must evaluate to the number of elements in b.
>
> For a pointer p to type T, the expression p+1 must evaluate to the
> same value as (T*)((char*)p + sizeof(T))
>
> Therefore
>
> sizeof a must evaluate to the same value as 2*sizeof a[0]
>
> sizeof a[0] and sizeof a[1] both must
> evaluate to the same value as 2*sizeof a[0][0].
>
> sizeof a[0][0] must evaluate to the same value as sizeof(int)
>
> sizeof a must evaluate to the same value as 4*sizeof(int)
>
> Since a contains 4 int and is exactly large enough to contain 4 int,
> these 4 int must be located at b, b+1, b+2, and b+3.

The problem is that b, with (b = a[0])
is pointed at the first element of a two element array.

If you have

int a = 0, b = 0;

then, you can't do this

if (a + 1 == b) { /* The condition check is valid */
printf("%d\n", a[1]); /* Then printf call isn't valid */
}

Whether or not there is an int object at (a + 1) is not the point.
Overunning the bounds of the object, is the point.

--
pete

pete
Guest
Posts: n/a

 09-26-2004
pete wrote:
>
> Barry Schwarz wrote:
> >
> > On 25 Sep 2004 17:21:37 GMT, Chris Torek <(E-Mail Removed)> wrote:
> >
> > >>Steve Kobes wrote:
> > >>> Is this legal? Must it print 4?
> > >>>
> > >>> int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
> > >>> printf("%d\n", *(b + 3));
> > >
> > >In article <news:(E-Mail Removed)>
> > >Al Bowers <(E-Mail Removed)> wrote:
> > >>Yes it is legal and will print 4.
> > >
> > >I believe there are those in comp.std.c, at least, who will disagree
> > >with you.

> >
> > I could not follow your discussion which followed so I don't know
> > exactly which point I'm about to disagree with. Instead I will try to
> > present an argument that I believe shows the original question about
> > legality must be answered in the affirmative.
> >
> > For an array q (not declared as a function parameter),
> > sizeof q / sizeof *q must evaluate to the number of elements in b.
> >
> > For a pointer p to type T, the expression p+1 must evaluate to the
> > same value as (T*)((char*)p + sizeof(T))
> >
> > Therefore
> >
> > sizeof a must evaluate to the same value as 2*sizeof a[0]
> >
> > sizeof a[0] and sizeof a[1] both must
> > evaluate to the same value as 2*sizeof a[0][0].
> >
> > sizeof a[0][0] must evaluate to the same value as sizeof(int)
> >
> > sizeof a must evaluate to the same value as 4*sizeof(int)
> >
> > Since a contains 4 int and is exactly large enough to contain 4 int,
> > these 4 int must be located at b, b+1, b+2, and b+3.

>
> The problem is that b, with (b = a[0])
> is pointed at the first element of a two element array.
>
> If you have
>
> int a = 0, b = 0;
>
> then, you can't do this
>
> if (a + 1 == b) { /* The condition check is valid */
> printf("%d\n", a[1]); /* Then printf call isn't valid */
> }

Major typos there.

Should be:

if (&a + 1 == &b) { /* The condition check is valid */
printf("%d\n", (&a)[1]); /* Then printf call isn't valid */
}

> Whether or not there is an int object at (a + 1) is not the point.
> Overunning the bounds of the object, is the point.

--
pete

E. Robert Tisdale
Guest
Posts: n/a

 09-26-2004
Steve Kobes wrote:

> Is this legal? Must it print 4?
>
> int a[2][2] = {{1, 2}, {3, 4}}, *b = a[0];
> printf("%d\n", *(b + 3));

Despite the fact that this works everywhere,
neither C89 or C99 "guarantees" this.

You can reference *(b+0) and *(b+1) and
you can compute (b+2) but you can't reference *(b+2).
It isn't "legal" to compute (b+3) much less reference *(b+3).

If this "equivalence" is important to you,
I don't think that you need to be too concerned
about what the standard says.

Michael Mair
Guest
Posts: n/a

 09-26-2004
Hiho,

> (snip argument which is based on the size of the objects, and 1-d array
> arithmetic. ) The size argument is spurious. The compiler is allowed to
> tell you the object is size 4, even if all 4 members were in separate
> memory arrays in different solar systems. Theres no limit to the internal
> magic the compiler can perform.

True. But I have yet to see the compiler that will correctly wrap _all_
calls to memcpy() and all other functions getting size_t and pointer
arguments for this case.
It is much easier to build non-broken programs the other way.

Cheers
Michael

 Thread Tools

 Posting Rules You may not post new threads You may not post replies You may not post attachments You may not edit your posts BB code is On Smilies are On [IMG] code is On HTML code is OffTrackbacks are On Pingbacks are On Refbacks are Off Forum Rules

 Similar Threads Thread Thread Starter Forum Replies Last Post Philipp Java 21 01-20-2009 08:33 AM Bill Reyn C++ 3 06-22-2004 12:01 PM Alexandra Stehman Java 5 06-17-2004 06:04 PM kelvSYC C Programming 2 09-26-2003 06:52 AM Mantorok Redgormor C Programming 4 09-11-2003 02:08 AM

Advertisments