# lvalues and rvalues

Keith Thompson

 05-02-2010
Nicklas Karlsson <(E-Mail Removed)> writes:
> On May 2, 8:37 pm, Keith Thompson <(E-Mail Removed)> wrote:
>
>> The operand of the unary "&" operator must be an lvalue; it designates
>> the object whose address is to be taken. The expression ``&whatever''
>> is not an lvalue; its result is an ordinary value of some pointer type.

>
> Yes, I do understand that the & operation results in an ordinary
> value, in the end.
>
>> If you refer to its "starting address", I suggest that you're
>> thinking about this on too low a level.

>
>> No, the lvalue does not evaluate to an address. Designating an object
>> and taking the address of an object are two quite different things.
>> (In the generated machine code, evaluating an lvalue might involve
>> computing an address, but we're talking about C, not machine code.)

>
> Yes, I might be thinking of it at too low a level, you are right. I was
> thinking of how an lvalue usually evaluates to designate an object,
> then that object has its value read or has a value written to it. But
> the lvalue required for the & operation most likely won't evaluate to
> identify the entire object that it could in another context, but
> merely takes the address of the object's first byte (that is, the
> address identifies an N-byte (usually 1-byte) object).

Perhaps I'm misunderstanding you, but this is incorrect.

Let's assume sizeof(int)==4. Given:
    int arr[10];
    int i = 5;
the expression ``arr[i]'' is an lvalue that designates an
element of the array. This lvalue is of type int, which means
that it designates an object of type int, whose size is 4 bytes.
Applying unary "&" to this value gives us ``&arr[i]'', a non-lvalue
expression whose value is of type int*. It is the address of the
int object, *not* the address of its first byte. (If you wanted
the address of its first byte rather than of the entire int object,
you could write ``(char*)&arr[i]'', among other possibilities.)

>> I don't understand this. Â*Can you give an example?

>
> Well, lets assume the following:
> 1. The variable is in memory

Ok.

> 3. The variable is declared as int and an int is 4 bytes.

Ok. (You swapped 2 and 3, but that doesn't matter.)

> 2. An address is an address to a region of storage that can store 1
> byte of data

And this is where you go wrong. I think you're using the term
"address" to refer to a machine-level address. That's not what the
word "address" means in C, and in particular it's not what the unary
"&" operator yields.

A C "address" is a value of pointer type. A pointer type always
specifies a type that it points to (possibly void, possibly
incomplete, possibly a function type, but it's always some C type).

C pointer values are typically *implemented* as machine-level
addresses (though I could cite exceptions to that), but logically a
C pointer is always a pointer to something of some particular type,
and if the pointed-to type is an object type then it has a specific
size associated with it. That size isn't 1 byte unless it just
happens to be a type whose size is 1 byte.

> Now, if I did "var = 1;" the expression "var" would evaluate to
> identify the object to where the value "1" should be written, if I do
> "printf("%d", var);" the expression "var" would evaluate to identify
> the object and read the objects value.

Right. In the first case, the lvalue ``var'' appears in a context
that requires an lvalue, so it designates the object. In the
second it appears in a context that doesn't require an lvalue,
so it's "converted to the value stored in the designated object
(and is no longer an lvalue)" (C99 6.3.2.1p2).

> Now if I did "&var" the lvalue
> would evaluate to the address to the objects first byte, and since the
> objects size is not used the address itself only identifies a 1 byte
> big object, not the entire object that the lvalue *could* identify if
> used in another context (and if the object was bigger than 1 byte).

No, ``&var'' yields the address *of the entire object*, and is of type
pointer-to-int, *not* pointer-to-byte.

Now the generated machine code and in-memory representation for a
pointer to an int is likely (but by no means certain) to be identical to
those for a pointer to the first byte of the same int. Type information
is discarded during the process of translating C source code to machine
code. Similarly, the generated code and in-memory representation for
    float x = 3.1415927;
and
    unsigned int x = 0x40490fdb;
are likely to be identical, but *in C* they're conceptually very
different. We still say that x is of type float in the first case
and of type unsigned int in the second.

> (Sorry for this quoting style, I cut off the rest of the text already
> and cannot go back)
> "The operand of the unary "&" operator must be an lvalue; it
> designates
> the object whose address is to be taken."
>
> Right, thats the entire point, the lvalue won't (unless the size if
> the same as the object behind an address) evaluate to identify the
> entire object (on my computers, it evaluates to an address), it will
> only evaluate to identify parts of what's required to fully identify
> an the object (it misses size, so the object's size isn't known,
> therefore the object is not fully identified), namely the starting
> address of the object.

Can you provide a concrete example, in C code, where an lvalue
designates the first byte of an object rather than the entire
object? What exactly do you mean when you say that it designates
the first byte?

(A minor point: I prefer to use the term "designate" rather than
"identify". They probably mean the same thing, but the standard
uses the term "designate".)

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Nicklas Karlsson

 05-02-2010
On May 2, 10:57 pm, Keith Thompson <(E-Mail Removed)> wrote:

> Perhaps I'm misunderstanding you, but this is incorrect.
>
> Let's assume sizeof(int)==4. Given:
>     int arr[10];
>     int i = 5;
> the expression ``arr[i]'' is an lvalue that designates an
> element of the array. This lvalue is of type int, which means
> that it designates an object of type int, whose size is 4 bytes.
> Applying unary "&" to this value gives us ``&arr[i]'', a non-lvalue
> expression whose value is of type int*. It is the address of the
> int object, *not* the address of its first byte. (If you wanted
> the address of its first byte rather than of the entire int object,
> you could write ``(char*)&arr[i]'', among other possibilities.)
>

(char*)&arr[i] is the same as &arr[i] except instead of int* its char*

I see your point, a memory based object has an address (which I prefer
to call starting address) and a size, so yes, doing & gives the
objects starting address (that is, the objects address), but, since
the size is not used, the address is on its own, and therefore most
likely only identifies 1 byte (this byte could be a part of the bigger
object). But since the lvalue will not evaluate to an instruction
using the size of the object, the lvalue does not evaluate to identify
the *entire* object, even tho it *would* in another context.

I find it very hard to state my point, I'm not sure why, but I can try
to simplify it:

The lvalue has the type "int" and a starting address, so in most cases
it evaluates to something that keeps track of this size, because the
object is a certain size, maybe 4 bytes, if the lvalue evaluates to
the objects address, it does not fully identify the object, because
the object has a size.

> Can you provide a concrete example, in C code, where an lvalue
> designates the first byte of an object rather than the entire
> object? What exactly do you mean when you say that it designates
> the first byte?

In C code, no, I cannot. What I meant when i said that it only
designates the first byte is basically, the lvalue will only evaluate
to the objects address, and that address on itself most likely only
identifies 1 byte of storage. Why? Because an object has a size, and
if that size is bigger than 1 byte (or whatever the storage at the
address can store) it only identifies that object (the object at that
address), not the entire object that might be for example 4 bytes big.

bart.c

 05-02-2010
Nicklas Karlsson wrote:
> On May 2, 10:57 pm, Keith Thompson <(E-Mail Removed)> wrote:

> (char*)&arr[i] is the same as &arr[i] except instead of int* its char*
>
> I see your point, a memory based object has an address (which I prefer
> to call starting address) and a size, so yes, doing & gives the
> objects starting address (that is, the objects address), but, since
> the size is not used, the address is on its own, and therefore most
> likely only identifies 1 byte (this byte could be a part of the bigger
> object).

C also needs to work on machines where a byte pointer and an int pointer
to the same location could have different address representations.

(For example, a machine that can only address 32-bit words, where C's char
is 8 bits, might need a different pointer format for chars.)

So it might not be possible to have an address 'on its own' without also
knowing what type it points to.

--
Bartc

Keith Thompson

 05-02-2010
Nicklas Karlsson <(E-Mail Removed)> writes:
> On May 2, 10:57 pm, Keith Thompson <(E-Mail Removed)> wrote:
>
>> Perhaps I'm misunderstanding you, but this is incorrect.
>>
>> Let's assume sizeof(int)==4. Given:
>>     int arr[10];
>>     int i = 5;
>> the expression ``arr[i]'' is an lvalue that designates an
>> element of the array. This lvalue is of type int, which means
>> that it designates an object of type int, whose size is 4 bytes.
>> Applying unary "&" to this value gives us ``&arr[i]'', a non-lvalue
>> expression whose value is of type int*. It is the address of the
>> int object, *not* the address of its first byte. (If you wanted
>> the address of its first byte rather than of the entire int object,
>> you could write ``(char*)&arr[i]'', among other possibilities.)
>>

>
> (char*)&arr[i] is the same as &arr[i] except instead of int* its char*
>
> I see your point, a memory based object has an address (which I prefer
> to call starting address) and a size, so yes, doing & gives the
> objects starting address (that is, the objects address), but, since
> the size is not used, the address is on its own, and therefore most
> likely only identifies 1 byte (this byte could be a part of the bigger
> object). But since the lvalue will not evaluate to an instruction
> using the size of the object, the lvalue does not evaluate to identify
> the *entire* object, even tho it *would* in another context.

Well, all I can say is that you're mistaken.

In C (which is what we discuss in this newsgroup), an "address" isn't
just the address of a byte. All addresses (equivalently, all values of
pointer type) have both a value and a type. A non-null value of type
int* is the address of an object of type int. Of the entire object,
*not* of its first byte.

That's what the word "address" means in C.

[...]

>> Can you provide a concrete example, in C code, where an lvalue
>> designates the first byte of an object rather than the entire
>> object? What exactly do you mean when you say that it designates
>> the first byte?

>
> In C code, no, I cannot.

And that's the point!

It's true that an int* value is *typically* represented, in a running
program compiled from C source, as the machine address of the first
byte of the int object. But that's not what it means in C.

And I've actually worked on a system (the Cray T90) where an int*
is represented as the machine-level address of a 64-bit word, and a
char* is not a machine address at all. Since the hardware cannot
address 8-bit bytes, a char* pointer consists of a word pointer
with 3 bits of offset information stored in the high-order bits.
This was implemented entirely in software. (It happens that a
pointer to the first byte of a word has the same representation
as a pointer to the containing word, but that needn't be the case;
if the high-order 3 bits weren't available, the offset would have
to be stored separately, with sizeof(char*) > sizeof(int*).)

> What I meant when i said that it only
> designates the first byte is basically, the lvalue will only evaluate
> to the objects address, and that address on itself most likely only
> identifies 1 byte of storage. Why? Because an object has a size, and
> if that size is bigger than 1 byte (or whatever the storage at the
> address can store) it only identifies that object (the object at that
> address), not the entire object that might be for example 4 bytes big.

Nope.

Again, an lvalue doesn't evaluate to the address of an object. It
*designates* an object, which is a subtly different thing. And it
designates the entire object, not the object's first byte.

Given:
    int x;
    x = 42;
``x'' in the assignment is an lvalue. It doesn't evaluate to the
address of x; there is no expression or subexpression of type int*
anywhere in sight. It doesn't designate the first byte of x, it
designates x.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Nicklas Karlsson

 05-02-2010
On May 3, 1:04 am, Keith Thompson <(E-Mail Removed)> wrote:

> Well, all I can say is that you're mistaken.
>
> In C (which is what we discuss in this newsgroup), an "address" isn't
> just the address of a byte. All addresses (equivalently, all values of
> pointer type) have both a value and a type. A non-null value of type
> int* is the address of an object of type int. Of the entire object,
> *not* of its first byte.
>
> That's what the word "address" means in C.

Okay thanks, that makes sense.

> And that's the point!

I saw that one coming

> Again, an lvalue doesn't evaluate to the address of an object. It
> *designates* an object, which is a subtly different thing. And it
> designates the entire object, not the object's first byte.

I see, so even tho the operand of the & operator may be implemented as
evaluating to an address to a byte, as far as C is concerned it
evaluated to identify an object that later has its address taken?

It even makes sense to think the expression is looked at (evaluated)
and it is found out what object is being identified, and then that
objects address is taken.

Tim Rentsch

 05-03-2010
Keith Thompson <(E-Mail Removed)> writes:

> Nicklas Karlsson <(E-Mail Removed)> writes:
>> On May 2, 10:57 pm, Keith Thompson <(E-Mail Removed)> wrote:
>>
>>> Perhaps I'm misunderstanding you, but this is incorrect.
>>>
>>> Let's assume sizeof(int)==4. Given:
>>> int arr[10];
>>> int i = 5;
>>> the expression ``arr[i]'' is an lvalue that designates an
>>> element of the array. This lvalue is of type int, which means
>>> that it designates an object of type int, whose size is 4 bytes.
>>> Applying unary "&" to this value gives us ``&arr[i]'', a non-lvalue
>>> expression whose value is of type int*. It is the address of the
>>> int object, *not* the address of its first byte. (If you wanted
>>> the address of its first byte rather than of the entire int object,
>>> you could write ``(char*)&arr[i]'', among other possibilities.)
>>>

>>
>> (char*)&arr[i] is the same as &arr[i] except instead of int* its char*
>>
>> I see your point, a memory based object has an address (which I prefer
>> to call starting address) and a size, so yes, doing & gives the
>> objects starting address (that is, the objects address), but, since
>> the size is not used, the address is on its own, and therefore most
>> likely only identifies 1 byte (this byte could be a part of the bigger
>> object). But since the lvalue will not evaluate to an instruction
>> using the size of the object, the lvalue does not evaluate to identify
>> the *entire* object, even tho it *would* in another context.

>
> Well, all I can say is that you're mistaken.
>
> In C (which is what we discuss in this newsgroup), an "address" isn't
> just the address of a byte. All addresses (equivalently, all values of
> pointer type) have both a value and a type. A non-null value of type
> int* is the address of an object of type int. Of the entire object,
> *not* of its first byte.
>
> That's what the word "address" means in C.

I feel obliged to offer a dissenting opinion. The word "address"
is not defined in the C standard. (Perhaps it's defined in one of
the normative references? I don't know.) I wouldn't say the
Standard uses the term as synonymous or interchangeable with
pointer (or non-null pointers, if that distinction matters); in
particular AFAIIA the Standard never talks about addresses as
having a type or refers to the type of an address, or anything
related to size information. As I read the Standard it usually
uses the term address to mean something like "the abstract value of
a (char *) or (void *) that points to the first byte of an object",
sort of like 3 or 5 for (int). In any case what "address" means in
the C Standard is a matter of opinion, since it isn't defined in
the Standard, or perhaps it's defined in one of the normative
references, in which case it certainly doesn't mean the same thing
as "pointer" since the normative references do not have to do
specifically with C.

Roughly speaking, "address" in the C Standard means a "machine
address in the C abstract machine".

>>> Can you provide a concrete example, in C code, where an lvalue
>>> designates the first byte of an object rather than the entire
>>> object? What exactly do you mean when you say that it designates
>>> the first byte?

>>
>> In C code, no, I cannot.

>
> And that's the point!
>
> It's true that an int* value is *typically* represented, in a running
> program compiled from C source, as the machine address of the first
> byte of the int object. But that's not what it means in C.
>
> And I've actually worked on a system (the Cray T90) where an int*
> is represented as the machine-level address of a 64-bit word, and a
> char* is not a machine address at all. Since the hardware cannot
> address 8-bit bytes, a char* pointer consists of a word pointer
> with 3 bits of offset information stored in the high-order bits.
> This was implemented entirely in software. (It happens that a
> pointer to the first byte of a word has the same representation
> as a pointer to the containing word, but that needn't be the case;
> if the high-order 3 bits weren't available, the offset would have
> to be stored separately, with sizeof(char*) > sizeof(int*).)

The confusion here is about which machine is being referred to. A
"machine address" is an address in an actual machine (in this case
a Cray T90). An "address" (or "C address") is an address in the C
abstract machine.

>> What I meant when i said that it only
>> designates the first byte is basically, the lvalue will only evaluate
>> to the objects address, and that address on itself most likely only
>> identifies 1 byte of storage. Why? Because an object has a size, and
>> if that size is bigger than 1 byte (or whatever the storage at the
>> address can store) it only identifies that object (the object at that
>> address), not the entire object that might be for example 4 bytes big.

>
> Nope.
>
> Again, an lvalue doesn't evaluate to the address of an object. It
> *designates* an object, which is a subtly different thing. And it
> designates the entire object, not the object's first byte.

The two statements are not incompatible. Evaluating an lvalue
expression computes the address of an object (not "evaluates to the
address of" but "computes the address of"). The lvalue expression,
when evaluated, also designates an object. There are two parts to
designating an object, namely: (1) its runtime address, and (2)
its type (which normally also implies a size). The information in
part (2) is compile-time information, it doesn't need to be
computed at runtime. The information in part (1) is run-time
information, and needs to be computed somehow, so the object can be
read or stored into. In many cases that "computation" is trivial,
but the computation does need to occur, because some identifiers
refer to different objects (located at different addresses) at
different points in a program's execution, even though it's the
same identifier in the program source.

It's also true that lvalues can designate bitfields but this is a
minor matter; it simply means that in addition to the address (of
the addressable unit in which the bitfield resides) and the type
there is compile-time information about the starting bit position
and width of the bitfield. Some address still must be computed.

> Given:
> int x;
> x = 42;
> ``x'' in the assignment is an lvalue. It doesn't evaluate to the
> address of x; there is no expression or subexpression of type int*
> anywhere in sight. It doesn't designate the first byte of x, it
> designates x.

Whether there is an (int*) pointer is irrelevant; addresses are
not the same as pointers. It's clear that in evaluating the
assignment 'x = 42;' the address of 'x' is needed to be able to
store the right-hand-side value. It only makes sense to think that
evaluating the left-hand-side lvalue 'x' will compute this address.

At some level the views here are just questions of terminology; I
don't think there's any real disagreement about what happens (if we
ignore for a moment which particular words are used to describe what
happens). What does "address" mean? In my opinion the Standard reads
more naturally if "address" is taken to mean an address in the C
abstract machine; a "pointer" is then one of many (depending on what
type is being referenced) particular representations of an address (or
"non-address" for a null pointer) _plus_ some compile-time information
expressing what type is being pointed at. An lvalue, when evaluated,
designates an object: part of designating an object is compile-time
information that is directly encoded in the machine instructions
corresponding to the expression in question; the other part of
designating an object is run-time information (where the object is)
that must be computed (in some cases) at run time. It's natural
to call that location information the address of the object.
As I read the Standard that's just how the term "address" is used.

Keith Thompson

 05-03-2010
Tim Rentsch <(E-Mail Removed)> writes:
> Keith Thompson <(E-Mail Removed)> writes:
>> Nicklas Karlsson <(E-Mail Removed)> writes:

[...]
>> Well, all I can say is that you're mistaken.
>>
>> In C (which is what we discuss in this newsgroup), an "address" isn't
>> just the address of a byte. All addresses (equivalently, all values of
>> pointer type) have both a value and a type. A non-null value of type
>> int* is the address of an object of type int. Of the entire object,
>> *not* of its first byte.
>>
>> That's what the word "address" means in C.

>
> I feel obliged to offer a dissenting opinion. The word "address"
> is not defined in the C standard. (Perhaps it's defined in one of
> the normative references? I don't know.) I wouldn't say the
> Standard uses the term as synonymous or interchangeable with
> pointer (or non-null pointers, if that distinction matters); in
> particular AFAIIA the Standard never talks about addresses as
> having a type or refers to the type of an address, or anything
> related to size information. As I read the Standard it usually
> uses the term address to mean something like "the abstract value of
> a (char *) or (void *) that points to the first byte of an object",
> sort of like 3 or 5 for (int). In any case what "address" means in
> the C Standard is a matter of opinion, since it isn't defined in
> the Standard, or perhaps it's defined in one of the normative
> references, in which case it certainly doesn't mean the same thing
> as "pointer" since the normative references do not have to do
> specifically with C.
>
> Roughly speaking, "address" in the C Standard means a "machine
> address in the C abstract machine".

So we have three distinct concepts: a machine address (virtual,
physical, whatever) on the actual hardware, a "machine address"
in the C abstract machine (which points only to the first byte of
an object), and a pointer value which points to an entire object.
I don't think the second concept is either necessary or clearly
stated in the Standard.

And if you want a pointer to a byte, you already have char*
and friends.

N1256 6.5.3.2p3:
    The unary & operator yields the address of its operand. If
    the operand has type ‘‘_type_’’, the result has type
    ‘‘pointer to _type_’’.

(The original C99 standard has "returns" rather than "yields".)

IMHO this very strongly suggests that the "address" has a particular
pointer type. (I'm assuming that what the operator "yields" is exactly
the same as its "result"; since the result has type "pointer to _type_",
I conclude that the address has type "pointer to _type_".)

I just took a (very quick) look at all occurrences of the word "address"
in N1256. The discussion of "addresses that are particular multiples of
a byte address" might suggest that an object's address is the address of
its first byte, but I don't think it's a strong implication. I see
nothing in the standard that's inconsistent with my interpretation.

On the other hand, it's not stated explicitly (though I think 6.5.3.2p3
comes close).

I'm admittedly biased by my opinion that my interpretation of the
meaning of a C "address" makes sense -- but of course that's a *good*
bias. :-}

[...]
>> Again, an lvalue doesn't evaluate to the address of an object. It
>> *designates* an object, which is a subtly different thing. And it
>> designates the entire object, not the object's first byte.

>
> The two statements are not incompatible. Evaluating an lvalue
> expression computes the address of an object (not "evaluates to the
> address of" but "computes the address of"). The lvalue expression,
> when evaluated, also designates an object. There are two parts to
> designating an object, namely: (1) its runtime address, and (2)
> its type (which normally also implies a size). The information in
> part (2) is compile-time information, it doesn't need to be
> computed at runtime. The information in part (1) is run-time
> information, and needs to be computed somehow, so the object can be
> read or stored into. In many cases that "computation" is trivial,
> but the computation does need to occur, because some identifiers
> refer to different objects (located at different addresses) at
> different points in a program's execution, even though it's the
> same identifier in the program source.
>
> It's also true that lvalues can designate bitfields but this is a
> minor matter; it simply means that in addition to the address (of
> the addressable unit in which the bitfield resides) and the type
> there is compile-time information about the starting bit position
> and width of the bitfield. Some address still must be computed.

I don't think it's a minor matter. Lvalues can designate things that
don't have addresses, both bitfields and register objects. Making
address computation part of the evaluation of an lvalue makes the
definition more complicated; it computes the address of the object, or
it computes the address of the addressable unit in which the bitfield
resides, or it does whatever it does for a register object. And the
standard doesn't say that evaluating an lvalue has anything to do with
the address of the designated object.

I find it simpler just to say that an lvalue designates an object. How
that designation is used depends on the context (some contexts do
require the object to have an address). How that use is *implemented*
is up to the implementation, and is outside the scope of the language
standard.

[...]

> At some level the views here are just questions of terminology; I
> don't think there's any real disagreement about what happens (if we
> ignore for a moment which particular words are used to describe what
> happens).

Agreed.

> What does "address" mean? In my opinion the Standard reads
> more naturally if "address" is taken to mean an address in the C
> abstract machine; a "pointer" is then one of many (depending on what
> type is being referenced) particular representations of an address (or
> "non-address" for a null pointer) _plus_ some compile-time information
> expressing what type is being pointed at. An lvalue, when evaluated,
> designates an object: part of designating an object is compile-time
> information that is directly encoded in the machine instructions
> corresponding to the expression in question; the other part of
> designating an object is run-time information (where the object is)
> that must be computed (in some cases) at run time. It's natural
> to call that location information the address of the object.
> As I read the Standard that's just how the term "address" is used.

Again, I think that 6.5.3.2p3 is the closest the standard comes to
defining "address" (it's describing the "address-of" operator, after
all), and I think it says that the address has a pointer type.

But I agree that it's not 100% clear, and your interpretation is
probably consistent with wording of the standard. (It's possible
that the members of the Committee weren't 100% clear on this
themselves.) But I still like my interpretation better.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Tim Rentsch

 05-05-2010
Keith Thompson <(E-Mail Removed)> writes:

> Tim Rentsch <(E-Mail Removed)> writes:
>> Keith Thompson <(E-Mail Removed)> writes:
>>> Nicklas Karlsson <(E-Mail Removed)> writes:

> [...]
>>> Well, all I can say is that you're mistaken.
>>>
>>> In C (which is what we discuss in this newsgroup), an "address" isn't
>>> just the address of a byte. All addresses (equivalently, all values of
>>> pointer type) have both a value and a type. A non-null value of type
>>> int* is the address of an object of type int. Of the entire object,
>>> *not* of its first byte.
>>>
>>> That's what the word "address" means in C.

>>
>> I feel obliged to offer a dissenting opinion. The word "address"
>> is not defined in the C standard. (Perhaps it's defined in one of
>> the normative references? I don't know.) I wouldn't say the
>> Standard uses the term as synonymous or interchangeable with
>> pointer (or non-null pointers, if that distinction matters); in
>> particular AFAIIA the Standard never talks about addresses as
>> having a type or refers to the type of an address, or anything
>> related to size information. As I read the Standard it usually
>> uses the term address to mean something like "the abstract value of
>> a (char *) or (void *) that points to the first byte of an object",
>> sort of like 3 or 5 for (int). In any case what "address" means in
>> the C Standard is a matter of opinion, since it isn't defined in
>> the Standard, or perhaps it's defined in one of the normative
>> references, in which case it certainly doesn't mean the same thing
>> as "pointer" since the normative references do not have to do
>> specifically with C.
>>
>> Roughly speaking, "address" in the C Standard means a "machine
>> address in the C abstract machine".

>
> So we have three distinct concepts: a machine address (virtual,
> physical, whatever) on the actual hardware, a "machine address"
> in the C abstract machine (which points only to the first byte of
> an object), and a pointer value which points to an entire object.

I count only two concepts -- address, and C pointer. Different
kinds of machines use different sorts of addresses; it doesn't
seem like a big leap for an actual machine and the C abstract
machine to each have their own kind of addresses.

> I don't think the second concept is either necessary or clearly
> stated in the Standard.

Do we know if the term "address" is defined in the normative
references? At some level I'd be surprised if it were not.

> And if you want a pointer to a byte, you already have char*
> and friends.

Ahhh, but that isn't quite the same as an address. Granted,
in the C abstract machine, we expect (char*) to have the same
_resolution_ as an address, but they still aren't the same
thing.
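
To make "same resolution, not the same thing" concrete, here is a
minimal sketch (the function names are mine, purely illustrative).
An int* and the char* obtained from it refer to the same location,
yet they designate objects of different sizes:

```c
#include <assert.h>
#include <stddef.h>

/* 1 if &x and (char *)&x compare equal once both are converted to void *. */
int same_location(void)
{
    int x = 0;
    int *ip = &x;            /* designates the whole int object */
    char *cp = (char *)&x;   /* designates only its first byte  */
    return (void *)ip == (void *)cp;
}

/* Size of the object each kind of pointer designates.
   sizeof does not evaluate its operand, so the null pointers are safe. */
size_t int_target_size(void)  { int *ip = NULL;  return sizeof *ip; }
size_t char_target_size(void) { char *cp = NULL; return sizeof *cp; }

/* Illustrative check of the three facts above. */
void demo_same_location(void)
{
    assert(same_location());
    assert(char_target_size() == 1);
    assert(int_target_size() == sizeof(int));
}
```

Whether one calls the shared location "the address" and the typed
value "the pointer" is exactly the terminology question at issue;
the code behaves the same under either reading.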

> N1256 6.5.3.2p3:
> The unary & operator yields the address of its operand. If
> the operand has type ''_type_'', the result has type
> ''pointer to _type_''.
>
> (The original C99 standard has "returns" rather than "yields".)
>
> IMHO this very strongly suggests that the "address" has a particular
> pointer type. (I'm assuming that what the operator "yields" is exactly
> the same as its "result"; since the result has type "pointer to _type_",
> I conclude that the address has type "pointer to _type_".)
>
> I just took a (very quick) look at all occurrences of the word "address"
> in N1256. The discussion of "addresses that are particular multiples of
> a byte address" might suggest that an object's address is the address of
> its first byte, but I don't think it's a strong implication. I see
> nothing in the standard that's inconsistent with my interpretation.
>
> On the other hand, it's not stated explicitly (though I think 6.5.3.2p3
> comes close).
>
> I'm admittedly biased by my opinion that my interpretation of the
> meaning of a C "address" makes sense -- but of course that's a *good*
> bias.
>
> [...]
>>> Again, an lvalue doesn't evaluate to the address of an object. It
>>> *designates* an object, which is a subtly different thing. And it
>>> designates the entire object, not the object's first byte.

>>
>> The two statements are not incompatible. Evaluating an lvalue
>> expression computes the address of an object (not "evaluates to the
>> address of" but "computes the address of"). The lvalue expression,
>> when evaluated, also designates an object. There are two parts to
>> designating an object, namely: (1) its runtime address, and (2)
>> its type (which normally also implies a size). The information in
>> part (2) is compile-time information, it doesn't need to be
>> computed at runtime. The information in part (1) is run-time
>> information, and needs to be computed somehow, so the object can be
>> read or stored into. In many cases that "computation" is trivial,
>> but the computation does need to occur, because some identifiers
>> refer to different objects (located at different addresses) at
>> different points in a program's execution, even though it's the
>> same identifier in the program source.
>>
>> It's also true that lvalues can designate bitfields but this is a
>> minor matter; it simply means that in addition to the address (of
>> the addressable unit in which the bitfield resides) and the type
>> there is compile-time information about the starting bit position
>> and width of the bitfield. Some address still must be computed.

>
> I don't think it's a minor matter. Lvalues can designate things that
> don't have addresses, both bitfields and register objects. Making
> address computation part of the evaluation of an lvalue makes the
> definition more complicated; it computes the address of the object, or
> it computes the address of the addressable unit in which the bitfield
> resides, or it does whatever it does for a register object. And the
> standard doesn't say that evaluating an lvalue has anything to do with
> the address of the designated object.

Oh, but the Standard talks explicitly about the "addressable units"
that bitfields reside within; for example, the address of a struct
or union points at the "addressable unit" of bitfield members (only
the first such member for structs, obviously).
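
A rough sketch of what I mean, under the assumption that the two
struct types below stand in for real ones (they and the function
names are invented for illustration). A pointer to a struct,
suitably converted, points at its first member. A bitfield has no
address of its own, but it does live inside the bytes of the
enclosing object:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct plain { int first; int second; };

/* Hypothetical bit-field layout, for illustration only. */
struct flags { unsigned a : 3; unsigned b : 5; };

/* 1 if the struct's address, suitably converted, equals the address
   of its first member. */
int struct_points_at_first_member(void)
{
    struct plain p = { 1, 2 };
    return (void *)&p == (void *)&p.first;
}

/* &f.a would be a constraint violation, but storing to f.a changes
   at least one byte of the enclosing struct object. */
int bitfield_lives_inside_struct(void)
{
    struct flags f;
    unsigned char zero[sizeof f];

    memset(&f, 0, sizeof f);
    memset(zero, 0, sizeof zero);
    f.a = 5;
    return memcmp(&f, zero, sizeof f) != 0;
}

/* Illustrative check of both observations. */
void demo_struct_address(void)
{
    assert(offsetof(struct plain, first) == 0);
    assert(struct_points_at_first_member());
    assert(bitfield_lives_inside_struct());
}
```

Which byte of the struct changes, and how large the addressable
unit is, are up to the implementation; the sketch only relies on
the bitfield being stored somewhere within the object.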

As for registers, the C abstract machine wisely chose to put
its registers in addressable memory so that they have addresses
just like any other object. It's only because they realized
that actual machines aren't as well-designed as the C abstract
machine that the Standard disallows applying & to a register
variable.

> I find it simpler just to say that an lvalue designates an object. How
> that designation is used depends on the context (some contexts do
> require the object to have an address). How that use is *implemented*
> is up to the implementation, and is outside the scope of the language
> standard.

Again I think the two notions are not incompatible. A pointer
(which includes a type) clearly has more information than an
"address", used in the sense of a machine address. A pointer
(that isn't == NULL) designates an object; part of designating
an object is its address, and another part of designating an
object is other information (most of which usually is known
at compile time, but that's not especially important).
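
The extra information a pointer carries shows up in arithmetic.
Another small sketch (again the names are mine, not anything from
the Standard): starting from the same location, stepping by one
moves a different number of bytes depending on the pointed-to
type, and that type is known at compile time.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes between p and p + 1 when p is an int *. */
size_t int_step(void)
{
    int arr[2] = { 0, 0 };
    int *p = &arr[0];
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Bytes between q and q + 1 when q is a char * into the same array. */
size_t char_step(void)
{
    int arr[2] = { 0, 0 };
    char *q = (char *)&arr[0];
    return (size_t)((q + 1) - q);
}

/* Illustrative check: same starting location, different step sizes. */
void demo_steps(void)
{
    assert(int_step() == sizeof(int));
    assert(char_step() == 1);
}
```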

>> At some level the views here are just questions of terminology; I
>> don't think there's any real disagreement about what happens (if we
>> ignore for a moment which particular words are used to describe what
>> happens).

>
> Agreed.
>
>> What does "address" mean? In my opinion the Standard reads
>> more naturally if "address" is taken to mean an address in the C
>> abstract machine; a "pointer" is then one of many (depending on what
>> type is being referenced) particular representations of an address (or
>> "non-address" for a null pointer) _plus_ some compile-time information
>> expressing what type is being pointed at. An lvalue, when evaluated,
>> designates an object: part of designating an object is compile-time
>> information that is directly encoded in the machine instructions
>> corresponding to the expression in question; the other part of
>> designating an object is run-time information (where the object is)
>> that must be computed (in some cases) at run time. It's natural
>> to call that location information the address of the object.
>> As I read the Standard that's just how the term "address" is used.

>
> Again, I think that 6.5.3.2p3 is the closest the standard comes to
> defining "address" (it's describing the "address-of" operator, after
> all), and I think it says that the address has a pointer type.
>
> But I agree that it's not 100% clear, and your interpretation is
> probably consistent with wording of the standard. (It's possible
> that the members of the Committee weren't 100% clear on this
> themselves.) But I still like my interpretation better.

Personally I think it makes more sense to take "address" in
the Standard to mean the same sense that it has in "machine
address", but meant more abstractly, like an address in the
C abstract machine. However, as long as we agree that the
Standard itself doesn't definitely address (no pun intended)
the question, and allows either reading, I think we agree on
the most important point.
