# A question about arrays and valid pointers

Tony Johansson

 08-26-2003
Hello experts!

I have a question about understanding. I have read a book that says:
If you have an array of integers, you know that a pointer to one element
past the end of the array is legal as an address, but must not be
dereferenced. That's OK.
The address of one element past the end of the array is also greater than
the address of the last element. That's also OK.

The book says: "Don't assume that a pointer to the element just before the
start of an array is legal."
I think this for loop should be valid, but the book says no, because the
loop has a problem:
the loop terminates only when ptr becomes less than a. And that's not
guaranteed to be a legal address, which means the comparison may fail.

for (ptr = &a[num-1]; ptr >= a; ptr--)
    printf("%i\n", *ptr);

I think, don't you, that the address of the element just before the start of
an array must be lower than the address of the first element?

//Tony

Martin Dickopp

 08-26-2003
"Tony Johansson" writes:

> Hello experts!
>
> I have a question about understanding. I have read a book that says:
> If you have an array of integers, you know that a pointer to one element
> past the end of the array is legal as an address, but must not be
> dereferenced. That's OK.
> The address of one element past the end of the array is also greater than
> the address of the last element. That's also OK.
>
> The book says: "Don't assume that a pointer to the element just before the
> start of an array is legal."

That's correct. An expression which yields a pointer anywhere before the
array causes undefined behavior, even if the pointer is never dereferenced.

> I think this for loop should be valid, but the book says no, because the
> loop has a problem:
> the loop terminates only when ptr becomes less than a. And that's not
> guaranteed to be a legal address, which means the comparison may fail.
>
> for (ptr = &a[num-1]; ptr >= a; ptr--)
>     printf("%i\n", *ptr);

This loop causes undefined behavior, so *anything* may happen. However,
the problem is easy to avoid if you slightly rewrite your code. Here is
a correct program that outputs the elements of an array in reverse order:

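#include <stddef.h>
#include <stdio.h>

int main (void)
{
    const int *ptr, a [] = {0, 1, 2, 3, 4, 5};
    const size_t num = sizeof a / sizeof a [0];

    /* Start one past the end (a valid pointer value) and decrement
       before each dereference, so ptr never goes below a. */
    for (ptr = &a [num]; ptr > a; )
        printf ("%i\n", *--ptr);

    return 0;
}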
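
If an index is more convenient than a pointer, the same idea carries over.
The sketch below is only an illustration: reverse_copy and its parameters
are made-up names, not something from the book or from the program above.
It decrements the index inside the loop test, before the element is
accessed, so nothing before the first element is ever addressed.

```c
#include <stddef.h>

/* Hypothetical helper: copies src[0..n-1] into dst in reverse order.
   The test "i-- > 0" compares first and decrements afterwards, so the
   body only ever sees indices n-1 down to 0. When the loop exits, i has
   wrapped around, which is well defined for the unsigned type size_t. */
void reverse_copy(const int *src, int *dst, size_t n)
{
    size_t i, out = 0;

    for (i = n; i-- > 0; )
        dst[out++] = src[i];
}
```

With src holding {0, 1, 2, 3, 4, 5} and n equal to 6, dst ends up as
{5, 4, 3, 2, 1, 0}.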

> I think, don't you, that the address of the element just before the start of
> an array must be lower than the address of the first element?

No, I don't. There is no guarantee that a "lower" address exists; the
first array element could well be located at the beginning of the valid
address space. Or the machine could use some obscure addressing scheme which
makes the "lower" address unrepresentable altogether. An example of the
latter is an x86 PC in real mode.

Martin

Brett Frankenberger

 08-26-2003
Tony Johansson wrote:
>
>I think, don't you, that the address of the element just before the start of
>an array must be lower than the address of the first element?

No. I don't think that, because the standard doesn't require it. So while
some implementations, or maybe even most implementations, might work that
way, there's no guarantee that they all will.

For one example, consider an array of, say, 20-byte structures starting at
address 16. Most implementations would compute the address of the "-1"
element as (16-20), or -4. If they treated pointers as unsigned 32-bit
integers in comparisons, then -4 would actually be 0xFFFFFFFC, which is
greater than 16, so the test ptr >= a would still be true at the point
where the loop is supposed to stop. (An implementation isn't required to
work that way, of course. I'm just offering it as a potential example of
how your code might fail in the real world.)
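
To make the arithmetic concrete, here is a small sketch that models
addresses as 32-bit unsigned integers. That model is an assumption for
illustration only: element_before and loop_would_continue are hypothetical
helpers, and a real implementation need not behave this way. Unsigned
integers are used because their wrap-around is well defined, unlike the
pointer arithmetic being imitated.

```c
#include <stdint.h>

/* Address of the element "before" base, computed modulo 2^32. */
uint32_t element_before(uint32_t base, uint32_t elem_size)
{
    return base - elem_size;
}

/* Imitates the loop test "ptr >= a" once ptr has stepped below a.
   Returns 1 if the comparison would still be true. */
int loop_would_continue(uint32_t base, uint32_t elem_size)
{
    return element_before(base, elem_size) >= base;
}
```

For base 16 and elem_size 20, element_before gives 0xFFFFFFFC and
loop_would_continue gives 1.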

-- Brett

Glen Herrmannsfeldt

 08-27-2003

"Martin Dickopp" wrote:

(snip)
(someone wrote)

> > I think, don't you, that the address of the element just before the start
> > of an array must be lower than the address of the first element?
>
> No, I don't. There is no guarantee that a "lower" address exists; the
> first array element could well be located at the beginning of the valid
> address space. Or the machine could use some obscure addressing scheme
> which makes the "lower" address unrepresentable altogether. An example of
> the latter is an x86 PC in real mode.

In either real or protected mode, it is normal for only the offset to be
modified in pointer arithmetic.

So yes, the offset could be zero; it would then wrap to some large number,
which would then not be less than the start address of the array.
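
As a rough model of that (a simplification, not how any particular
compiler lays out its pointers), the sketch below keeps a segment and a
16-bit offset and changes only the offset when stepping back. The names
far_ptr, step_back and offset_at_or_after are made up for the example.

```c
#include <stdint.h>

/* Simplified stand-in for a segmented pointer. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Steps back by the given number of bytes. Only the offset changes,
   and it wraps modulo 65536; the segment is left alone. */
struct far_ptr step_back(struct far_ptr p, uint16_t bytes)
{
    p.offset = (uint16_t)(p.offset - bytes);
    return p;
}

/* Compares offsets only, as a comparison within one segment might. */
int offset_at_or_after(struct far_ptr p, struct far_ptr start)
{
    return p.offset >= start.offset;
}
```

Stepping back 2 bytes from offset 0 gives offset 0xFFFE in the same
segment, and offset_at_or_after still reports 1 against the start.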

Note, though, that this problem is not unique to pointers. The following
loop never terminates, because an unsigned i is always >= 0:

unsigned int i;

for (i = 10; i >= 0U; i--) printf("%u\n", i);
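
One way to write an unsigned countdown that does stop is to test for zero
before decrementing. This is just a sketch; countdown and its out
parameter are hypothetical names, and out is assumed to have room for
start + 1 values.

```c
/* Writes start, start-1, ..., 0 into out and returns how many values
   were written. */
unsigned countdown(unsigned start, unsigned *out)
{
    unsigned i = start, n = 0;

    for (;;) {
        out[n++] = i;
        if (i == 0)
            break;      /* stop before i-- would wrap to UINT_MAX */
        i--;
    }
    return n;
}
```

Starting from 10 this writes eleven values, 10 down to 0, and returns 11.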

-- glen

Martin Dickopp

 08-27-2003
"Glen Herrmannsfeldt" writes:

> "Martin Dickopp" wrote:
>
> (snip)
> (someone wrote)
>
> > > I think, don't you, that the address of the element just before the
> > > start of an array must be lower than the address of the first element?
> >
> > No, I don't. There is no guarantee that a "lower" address exists; the
> > first array element could well be located at the beginning of the valid
> > address space. Or the machine could use some obscure addressing scheme
> > which makes the "lower" address unrepresentable altogether. An example
> > of the latter is an x86 PC in real mode.
>
> In either real or protected mode, it is normal for only the offset to be
> modified in pointer arithmetic.
>
> So yes, the offset could be zero; it would then wrap to some large number,
> which would then not be less than the start address of the array.
>
> Note, though, that this problem is not unique to pointers:
>
> unsigned int i;
>
> for (i = 10; i >= 0U; i--) printf("%u\n", i);

While unsigned integers have well-defined wrap-around semantics in C,
pointers don't. I was merely constructing an example to show that the
expectation the OP had about a certain type of undefined behavior is not
only theoretically unjustified, but that a real processor exists on which
this expectation would not necessarily hold.
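
The unsigned half of that statement can be checked directly, as in the
sketch below (unsigned_zero_wraps_to_max is a made-up name). There is no
portable way to show the pointer half, since merely forming the pointer
before the array is already undefined.

```c
#include <limits.h>

/* Returns 1 if decrementing an unsigned zero yields UINT_MAX.
   The standard guarantees this: unsigned arithmetic is reduced
   modulo UINT_MAX + 1. */
int unsigned_zero_wraps_to_max(void)
{
    unsigned int u = 0U;

    u--;
    return u == UINT_MAX;
}
```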

Martin

Glen Herrmannsfeldt

 08-28-2003

"Martin Dickopp" wrote:
> "Glen Herrmannsfeldt" writes:
>
> > "Martin Dickopp" wrote:
> >
> > (snip)
> >
> > > No, I don't. There is no guarantee that a "lower" address exists; the
> > > first array element could well be located at the beginning of the
> > > valid address space. Or the machine could use some obscure addressing
> > > scheme which makes the "lower" address unrepresentable altogether. An
> > > example of the latter is an x86 PC in real mode.
> >
> > In either real or protected mode, it is normal for only the offset to be
> > modified in pointer arithmetic.
> >
> > So yes, the offset could be zero; it would then wrap to some large
> > number, which would then not be less than the start address of the array.
> >
> > Note, though, that this problem is not unique to pointers:
> >
> > unsigned int i;
> >
> > for (i = 10; i >= 0U; i--) printf("%u\n", i);
>
> While unsigned integers have well-defined wrap-around semantics in C,
> pointers don't. I was merely constructing an example to show that the
> expectation the OP had about a certain type of undefined behavior is not
> only theoretically unjustified, but that a real processor exists on which
> this expectation would not necessarily hold.

Actually, though, it is usual for blocks returned by malloc() to have the
length and other allocation-related information stored just before the
address returned. It isn't required, but it is a popular implementation. In
16-bit small model, it is usual not to return the address 0, as that is the
NULL pointer. Many C libraries check at program exit to see whether the
value at address 0 was modified, and report it.

In the 16-bit OS/2 days, though, I would sometimes, instead of calling
malloc(), call the OS/2 memory allocator directly and use the resulting
segment selector with an offset of zero. In that case, such a loop would
fail. Otherwise, I don't know of any allocators that actually return an
offset of zero. Not that that has anything to do with whether it is legal.
One could obviously write the same loop with a larger decrement, in which
case it would wrap.

Rules are rules, and there are good reasons for this one. Still, I don't
know of any actual, popular implementations where it fails.

-- glen