Velocity Reviews - Computer Hardware Reviews

Dereference an array pointer... UB?

 
 
Old Wolf
 
      02-12-2008
On Feb 12, 7:21 am, "Tomás Ó hÉilidhe" wrote:
> Do you think we can reach any kind of consensus on whether the
> following code's behaviour is undefined by the Standard?
>
> int my_array[5];
>
> int const *const pend = *(&my_array + 1);


&X + 1 is a pointer to one-past-the-end.
Dereferencing such a pointer causes UB.
Doesn't matter what data type the pointer is.
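
For comparison, here is a minimal sketch of getting the same end pointer without applying unary * to &my_array + 1. The helper names end_of_my_array and element_count are made up for illustration and do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>

static int my_array[5];

/* One-past-the-end pointer formed by ordinary element arithmetic.
   my_array converts to &my_array[0]; adding the element count gives a
   pointer that may be formed and compared, but not read through. */
static const int *end_of_my_array(void)
{
    return my_array + sizeof my_array / sizeof my_array[0];
}

/* Distance from the first element to the end pointer, in elements. */
static ptrdiff_t element_count(void)
{
    const int *pend = end_of_my_array();
    assert(pend >= my_array);   /* both point into, or just past, my_array */
    return pend - my_array;
}
```

Nothing here indirects through a pointer to the nonexistent sixth element or to a nonexistent second array, so the question raised in the thread does not come up.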
 
Tomás Ó hÉilidhe
 
      02-12-2008
Old Wolf:

> &X + 1 is a pointer to one-past-the-end.
> Dereferencing such a pointer causes UB.
> Doesn't matter what data type the pointer is.



That's a very superficial way of looking at it.

The REASON why it's UB to dereference a pointer to one-past-the-last is
because it could result in an out-of-bounds memory access.

With a pointer to an array, nothing happens when you dereference it -- all
that happens is that you've got an expression of type int[X] rather than
int(*)[X].
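
One context where only the type of the expression matters is the operand of sizeof, which is not evaluated unless its type is a variable length array. The sketch below is an illustration of that narrow case only, and the function name pointee_size is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static int my_array[5];

/* The operand of sizeof is not evaluated here, so no indirection
   happens at run time; only the type int[5] is inspected. */
static size_t pointee_size(void)
{
    assert(sizeof *(&my_array + 1) == sizeof my_array);
    return sizeof *(&my_array + 1);
}
```

This says nothing about contexts where the expression is evaluated, which is what the rest of the thread argues about.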

--
Tomás Ó hÉilidhe
 
Tomás Ó hÉilidhe
 
      02-12-2008
Tomás Ó hÉilidhe:

> With a pointer to an array, nothing happens when you dereference it --
> all that happens is that you've got an expression of type int[X] rather
> than int(*)[X].



In fact, I'd go one step further to say that the following should be legal:


int (*parr)[X] = (int(*)[X])798797; /* Some random address (but which
                                       doesn't cause a trap) */

*parr;


--
Tomás Ó hÉilidhe
 
Thad Smith
 
      02-12-2008
Tomás Ó hÉilidhe wrote:
> Old Wolf:
>
>> &X + 1 is a pointer to one-past-the-end.
>> Dereferencing such a pointer causes UB.
>> Doesn't matter what data type the pointer is.

>
> That's a very superficial way of looking at it.
>
> The REASON why it's UB to dereference a pointer to one-past-the-last is
> because it could result in an out-of-bounds memory access.


Perhaps your point is that the Standard /should/ have defined a behavior,
but didn't. I agree with that.

My reading is that a unary * applied to a function pointer is defined. A
unary * applied to a pointer to an object is defined. There are no other
cases defined for the unary * operator. Since &X+1 technically isn't a
pointer to an object, *(&X+1) is undefined by omission.
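
The two defined cases can be sketched as follows. This is only an illustration of the reading above, with made-up function names (twice, call_through_pointer, read_through_pointer).

```c
#include <assert.h>

static int twice(int n) { return 2 * n; }

/* Case 1: the operand points to a function. *fp is a function
   designator, so (*fp)(n) calls the same function as fp(n). */
static int call_through_pointer(int n)
{
    int (*fp)(int) = twice;
    assert(fp == &twice);   /* twice and &twice give the same pointer */
    return (*fp)(n);
}

/* Case 2: the operand points to an object. *p is an lvalue that
   designates x, so reading it yields the value of x. */
static int read_through_pointer(void)
{
    int x = 42;
    int *p = &x;
    return *p;
}
```

A one-past-the-end pointer such as &X + 1 fits neither case, which is the omission being pointed out.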

--
Thad
 
Keith Thompson
 
      02-12-2008
"Tomás Ó hÉilidhe" writes:
> Old Wolf:
>> &X + 1 is a pointer to one-past-the-end.
>> Dereferencing such a pointer causes UB.
>> Doesn't matter what data type the pointer is.

>
> That's a very superficial way of looking at it.
>
> The REASON why it's UB to dereference a pointer to one-past-the-last is
> because it could result in an out-of-bounds memory access.


The reason why it's UB is that the standard doesn't define the
behavior. (Though you've correctly described the rationale for what
the standard says.)

> With a pointer to an array, nothing happens when you dereference it -- all
> that happens is that you've got an expression of type int[X] rather than
> int(*)[X].


An expression of array type is converted to a pointer. There has to
be something to convert in the first place.
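
A small sketch of that conversion on an array that does exist, using hypothetical helper names. It shows the usual case, where the array expression becomes a pointer to its first element, and the two common exceptions, the operands of sizeof and unary &.

```c
#include <assert.h>
#include <stddef.h>

static int a[5];

/* In most contexts the expression a is converted to &a[0]. */
static int decays_to_first_element(void)
{
    int *p = a;
    assert((void *)&a == (void *)p);   /* same address, different type */
    return p == &a[0];
}

/* As the operand of sizeof or unary &, a is not converted:
   sizeof a is the size of the whole array, and &a has type int (*)[5]. */
static int keeps_array_type(void)
{
    return sizeof a == 5 * sizeof(int)
        && sizeof &a == sizeof(int (*)[5]);
}
```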

--
Keith Thompson (The_Other_Keith)
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Keith Thompson
 
      02-12-2008
"Tomás Ó hÉilidhe" writes:
> Tomás Ó hÉilidhe:
>> With a pointer to an array, nothing happens when you dereference it --
>> all that happens is that you've got an expression of type int[X] rather
>> than int(*)[X].

>
> In fact, I'd go one step further to say that the following should be legal:
>
> int (*parr)[X] = (int(*)[X])798797; /* Some random address (but which
> doesn't cause a trap) */
> *parr;


You're certainly free to argue that it *should* be legal.

Actually, "legal" isn't the right word. It's not a syntax error or a
constraint violation, so it's "legal" in the sense that no diagnostic
is required. The question is whether the standard defines the
behavior.

*parr is an lvalue. If it doesn't designate an object, then the
behavior of evaluating *parr is undefined. As always, the consequence
of undefined behavior can include doing nothing, or doing just what
you wanted it to do.

--
Keith Thompson (The_Other_Keith)
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Tomás Ó hÉilidhe
 
      02-12-2008
Keith Thompson:

> An expression of array type is converted to a pointer. There has to
> be something to convert in the first place.



Yes but an array type isn't a value -- which is the very reason why
arrays decay to a pointer to their first element, so that we can actually
get a value out of them.

--
Tomás Ó hÉilidhe
 
Keith Thompson
 
      02-12-2008
"Tomás Ó hÉilidhe" writes:
> Keith Thompson:
>> An expression of array type is converted to a pointer. There has to
>> be something to convert in the first place.

>
> Yes but an array type isn't a value -- which is the very reason why
> arrays decay to a pointer to their first element, so that we can actually
> get a value out of them.


To quibble over your choice of words, of course an array type isn't a
value; an array type is a type. (I'm not picking on you, but
precision is important.)

Presumably what you meant is that there's no such thing as an array
value. I think the standard is vague on this point, but I disagree;
there *is* such a thing as an array value. The language just provides
very few contexts in which array values become visible.

C99 3.17 defines a "value" as the "precise meaning of the contents of
an object when interpreted as having a specific type". I don't see
how that excludes arrays. (It does seem to exclude the result of
evaluating a non-lvalue expression, but that's a separate issue.)

There clearly are struct values. Structs can be assigned, passed as
function arguments, and returned as function results, all by copying
the value. A struct value consists of the values of its members;
for example, given:
struct { int x; int y; } obj = { 10, 20 };
the value of obj consists of the int values 10 and 20. A struct with
a member of array type has a value that includes the value of the
array member; that value consists of the values of the array's
elements.
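
As a rough illustration of an array value travelling inside a struct value, the sketch below copies a struct with an array member by assignment. The function name copy_is_independent is made up for this example.

```c
#include <assert.h>

struct s {
    int x;
    int y[2];
};

/* Assigning a struct copies every member, including the array member,
   so a later change to the copy does not show up in the original. */
static int copy_is_independent(void)
{
    struct s a = { 10, { 20, 30 } };
    struct s b = a;             /* copies a.x, a.y[0] and a.y[1] */

    assert(b.x == a.x);
    b.y[1] = 99;
    return a.y[1] == 30 && b.y[0] == 20 && b.y[1] == 99;
}
```

The same assignment written directly between two arrays would not compile, which is one of the few places where the language hides array values.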

Here's something to chew on. It probably says something about the
original question, but I'm not sure what.

int main(void)
{
    struct s {
        int x;
        int y[2];
    };
    volatile struct s obj = { 10, { 20, 30 } };

    obj;   /* Computes and discards the value of obj.
              Must access obj.x, obj.y[0], and obj.y[1]. */

    obj.x; /* Computes and discards the value of obj.x.
              Must access obj.x. */

    obj.y; /* Computes and discards the address of obj.y[0].
              Must this access obj.y[0] and obj.y[1]?
              *May* it do so?
              C&V? */

    return 0;
}

--
Keith Thompson (The_Other_Keith)
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Thad Smith
 
      02-13-2008
Tomás Ó hÉilidhe wrote:
> Old Wolf:
>
>> &X + 1 is a pointer to one-past-the-end.
>> Dereferencing such a pointer causes UB.
>> Doesn't matter what data type the pointer is.

>
>
> That's a very superficial way of looking at it.
>
> The REASON why it's UB to dereference a pointer to one-past-the-last is
> because it could result in an out-of-bounds memory access.


I would say that the reason that the behavior is undefined is that the
committee didn't realize (or appreciate) the potential utility of defining
the meaning of the unary * operator on pointer values derived from pointers
to objects, but not themselves a pointer to an object.

--
Thad
 
Kaz Kylheku
 
      02-13-2008
On Feb 11, 10:21 am, "Tomás Ó hÉilidhe" wrote:
> Do you think we can reach any kind of consensus on whether the
> following code's behaviour is undefined by the Standard?
>
> int my_array[5];
>
> int const *const pend = *(&my_array + 1);


You may have a pointer one element past the last element of an array
object. However, my_array as a whole is not an element of an array. So
&my_array + 1 is invalid.

What you are doing is similar to computing p below:

int i, j[1];
int *p = &i + 1; // not right, i is not an array object
int *q = j + 1; // okay, since j is an array object

We can fix this in your example, similarly to the trick with j above:
use a one-element array.

But the dereference conundrum is still there:

int my_array[1][5];
int *p = my_array[1];

The problem is clearer now: you're trying to create pointer-based
access to a nonexistent array. The expression my_array[0] refers to a
valid array element, which is an array of 5 ints. But there is no such
array as my_array[1]. This my_array[1] expression has the /type/
``array of 5 int'', but it's not an object. You're allowed to point to
it as a unit, but that's it.

We can show the problem in these two steps:

int my_array[1][5];
int (*q)[5] = my_array + 1;

Now q is a ``pointer to an array of 5 int'', correctly aimed one
element past the end of an array object. So far so good.
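
What is still fine to do with q at this point can be sketched like this: form it, compare it, and subtract it, all without indirecting through it. The function name rows_before_q is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static int my_array[1][5];

/* q points one element (one int[5]) past the end of my_array.
   Forming q, comparing it and subtracting from it are defined;
   *q and q[0] are deliberately not used. */
static ptrdiff_t rows_before_q(void)
{
    int (*q)[5] = my_array + 1;
    assert(q > my_array);
    return q - my_array;
}
```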

What we're trying to do next is effectively the same as:

int *p = q[0];

We've been given a finger, and want to take the hand. Not happy with
having a pointer one element past the end of an array object, we want
a pointer to the first element of that nonexistent element.

In fact the pointer we're trying to compute points to the same
location as &my_array[0][5], which is allowed, and has the same type.
One element past the end of my_array[0] would appear to be the same
nonexistent thing as the first element of my_array[1] (indeed it has
the same type and address) but the semantics is subtly different.

But if q[0] is okay, why not &q[0][0]? If decay cancels out bad
dereferencing, then address-of can also cancel out more bad
dereferencing. And now you open the door to &q[0][1]. If we can point
to the first element of a nonexistent array of 5 int, why not the
second? It's because we know that the justification for the first
element is that it's really one element past the end of something.
However, we didn't arrive at it that way.

/How/ we arrive at a value can determine whether or not it is correct,
not just the final value itself. If I have two int objects i and j,
and perform arithmetic on &i so that the result points to j, that's
not correct, even though the result is indistinguishable from the
correct value &j.

Fact is, a bounds checking compiler could be designed to enforce the
semantic rule that dereferencing an out-of-bounds pointer is not
allowed under any circumstances, and consequently that array-to-
pointer decay can only happen over a valid array object.
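
A very rough sketch of the kind of check such an implementation might carry around, written as ordinary C rather than as compiler internals. Everything here (struct checked_ptr, checked_read, checked_advance) is invented for illustration and does not describe any real bounds-checking compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat pointer": base address, element count, and the
   current position, where index == count means one past the end. */
struct checked_ptr {
    int *base;
    size_t count;
    size_t index;
};

/* Returns 1 and stores the element in *out if p designates an element.
   Returns 0 at the one-past-the-end position, so it is never read. */
static int checked_read(struct checked_ptr p, int *out)
{
    assert(p.index <= p.count);
    if (p.index == p.count)
        return 0;
    *out = p.base[p.index];
    return 1;
}

/* Advances p by n elements. Returns 0 and leaves p unchanged if the
   result would lie beyond one past the end. */
static int checked_advance(struct checked_ptr *p, size_t n)
{
    assert(p->index <= p->count);
    if (n > p->count - p->index)
        return 0;
    p->index += n;
    return 1;
}
```

With a representation like this, moving to the one-past-the-end position succeeds while any read there is refused, which mirrors the rule described above.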

> Considering the syntax of the language, then we definitely do
> dereference an invalid pointer... but if we consider the mechanics of the
> language, then we know that nothing "happens" when we dereference a pointer
> to an array, because arrays are dealt with in terms of pointers.


We could also argue that ``nothing'' happens when you merely increment
a pointer out of bounds.
 