
# Re: Compatibility question

BartC
Guest
Posts: n/a

 10-22-2012
"Les Cargill" <(E-Mail Removed)> wrote in message
news:k64gjh$ptl$(E-Mail Removed)...
> BartC wrote:

>> You modify a, and find that b has also changed! Yet if a and b were
>> numbers, you could 'modify' a after b=a, and b is not affected!)
>>
>> This is pass-by-reference taken to extremes.
>>

>
>
> That's not too different from:
>
> short a[] = {10, 20, 30};
> short *b = a;

But you've explicitly said that b is a pointer, so subsequent behaviour is
not unexpected.

(Although, because C doesn't need an explicit dereference for b, it can
cause a bit of confusion when the types of a and b are not immediately
visible to the reader. That's just another quirk of the language that has
both pros and cons.)
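
A minimal sketch of the aliasing under discussion, in standard C. The function name `alias_demo` is made up for illustration; the array and pointer are the ones from the quoted snippet.

```c
#include <assert.h>

/* Hypothetical helper: writes through the pointer b and returns
   what the original array a now holds in its first element. */
short alias_demo(void)
{
    short a[] = {10, 20, 30};
    short *b = a;        /* b points at a[0]; no elements are copied */

    b[0] = 9999;         /* indexing b needs no explicit '*' ...      */
    assert(*b == a[0]);  /* ... but it designates the same object     */
    return a[0];
}
```

Calling `alias_demo()` returns 9999: the write through `b` is visible through `a`, which is exactly what the declaration `short *b` announces.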

--
bartc

BartC
Guest
Posts: n/a

 10-22-2012

"James Kuyper" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On 10/22/2012 05:44 PM, BartC wrote:
> ...
>> (What confuses *me* in some languages (it seems most of them these days)
>> is stuff like this:
>>
>> a=[10,20,30]
>> b=a
>>
>> a[0]=9999 # modify a
>>
>> print a
>> print b
>>
>> You modify a, and find that b has also changed! Yet if a and b were
>> numbers, you could 'modify' a after b=a, and b is not affected!)
>>
>> This is pass-by-reference taken to extremes.

>
> I agree, but it's a viable choice, so long as the language also provides
> some mechanism for creating a copy of 'a' rather than a reference to
> 'a'. Do each of the languages you're thinking of have some such mechanism?
>

Probably. But you have to look it up. (Googling suggests it might be
something like b=copy.deepcopy(a) for my Python example, but each language
is different.)

(It gets worse too: try adding the line c=[a]*100 to this example. After a
is modified, c now has 100 'copies' of the new a. Now do this:

c[0][0]=45

You'll find that all the other 99 elements of c have changed too, and that
is the point where you start tearing your hair out!

I can see where they're coming from: if a and b were file handles for
example, then clearly b=a only makes a copy of the handle, not of the file!
But that's an explicit use of handles and references; it's not implicit and
dependent on what exactly a and b might be, as it is in these
languages.

When I implement this stuff, I always do deep copies - of data that is
managed by the language - but then I also allow everything to be mutable.)
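
For C readers, the c=[a]*100 behaviour can be modelled roughly with an array of pointers. This is only a sketch of the sharing, not a description of how any particular interpreter works; `shared_rows_demo` and `COPIES` are names invented here.

```c
#include <assert.h>
#include <stddef.h>

enum { COPIES = 100 };

/* Hypothetical model of c=[a]*100: a hundred references to ONE array. */
int shared_rows_demo(void)
{
    int a[3] = {10, 20, 30};
    int *c[COPIES];

    for (size_t i = 0; i < COPIES; i++)
        c[i] = a;            /* every slot refers to the same storage */

    c[0][0] = 45;            /* write through the first reference     */

    assert(a[0] == 45);      /* the one underlying array has changed  */
    return c[99][0];         /* so the last "copy" sees it too        */
}
```

`shared_rows_demo()` returns 45. Written out in C the sharing is visible in the declaration `int *c[COPIES]`; the complaint above is that in the other languages nothing in the source says so.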

--
bartc

BartC
Guest
Posts: n/a

 10-22-2012
"Steve Thompson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> On Mon, Oct 22, 2012 at 10:52:47AM -0700, Keith Thompson wrote:

>> I'm not sure what you mean when you say `a` "is a pointer in
>> the implementation", especially after you mention "the fact that
>> variables defined as structures were not pointers".

>
> Meaning that under the hood in the C compiler, 'a' is really a
> pointer.

Not really. &a is a pointer, but a is just a value (of the struct).

I suppose if the struct were huge, a compiler might see if it could get away
with using a pointer to the struct instead. But for smaller structs it's not
a problem to manipulate them by value. (And in any case it's up to the
programmer whether large structs are handled by value or not.)

Also consider a function like this:

struct point add(struct point a, struct point b) {
    struct point c = {a.x+b.x, a.y+b.y};
    return c;
}

and calling it like this:

struct point p,q,r,s,t;

how much work do you think it's going to be to manage all those behind the
scenes pointers, which might even require memory allocations to keep track
of? It's easier to keep intermediate values on the call stack.
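
A sketch of what such a nested call could look like. The two-member layout of `struct point` and the wrapper `sum4` are assumptions made for this example, not taken from the post.

```c
#include <assert.h>

struct point { int x, y; };   /* assumed two-member layout for this sketch */

struct point add(struct point a, struct point b)
{
    struct point c = { a.x + b.x, a.y + b.y };
    return c;
}

/* Hypothetical nested call: each intermediate sum is a temporary value,
   with no pointer or allocation appearing anywhere in the source. */
struct point sum4(struct point q, struct point r,
                  struct point s, struct point t)
{
    return add(add(q, r), add(s, t));
}
```

With q={1,2}, r={3,4}, s={5,6}, t={7,8}, `sum4` yields {16,20}. How the temporaries are stored is up to the compiler; the source only ever deals in values.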

> True enough, but the difference is that a structure cannot be used for
> arithmetic operations of any kind in C. That makes it fundamentally
> different from simple type variables. It may be meaningful to say that
> structures passed to functions as values is a hack in the C language.
> It fills a use-case scenario that will be found (or should be found)
> only in trivial programs, IMHO.

See the example above.

>> > These days I'm thinking more and more about cache-line usage, so I am
>> > concerned with preserving the 'hotness' of my data. Copying
>> > structures around a whole lot is hostile to this paradigm, so that is
>> > one language feature I am happy to avoid.

>>
>> If a structure is smaller than a pointer, or not much bigger, it makes
>> perfect sense to pass it around by value.

>
> Most structures are considerably larger than a pointer in practice.
> My personal opinion is that it is a waste to pass them around by
> value in most cases.

It depends. If there are 100 fields and you only access 1 or 2, then it
probably is wasteful. If you have to access most of the fields anyway, then
it's not so bad (and you can avoid a level of indirection.)

--
bartc

Keith Thompson
Guest
Posts: n/a

 10-22-2012
Steve Thompson <(E-Mail Removed)> writes:
> On Mon, Oct 22, 2012 at 10:52:47AM -0700, Keith Thompson wrote:
>> Steve Thompson <(E-Mail Removed)> writes:
>> [...]
>> > No annoyance taken, although now that I think of it it does show an
>> > instance of overloaded semantics, which can be confusing. When I was
>> > less experienced with C, I was confused by the fact that variables
>> > defined as structures were not pointers:
>> >
>> > struct point {
>> > int x, y, z;
>> > };
>> >
>> > struct point a;
>> >
>> >
>> > One cannot reference 'a' (as such) in an expression even though it is
>> > a pointer in the implementation because C allows one to use structures
>> > as arguments to a function. I'd never do that in my own code, but
>> > some people may like the fact that they can pass complex data
>> > structures around as if they were simple variables. I think it wastes
>> > resources, but of course that is less of an issue today than it was
>> > when C was first developed.

>>
>> I'm not sure what you mean when you say `a` "is a pointer in
>> the implementation", especially after you mention "the fact that
>> variables defined as structures were not pointers".

>
> Meaning that under the hood in the C compiler, 'a' is really a
> pointer. The logic of the compiler causes it to use the structure
> data as specified in the standard, but conceptually 'a' is still a
> pointer to the data in the specific structure identified by 'a'. I
> found it very confusing when I was learning C.

Frankly, I think you're still confused.

`a` is not in any sense a pointer. It's one of several things,
depending on what level of abstraction you're looking at:

- An identifier that's the name of an object of type `struct point`;
- An object of type `struct point`;
- An lvalue expression that designates an object of type `struct point`;
- A non-lvalue expression that evaluates to the value of an object of
type `struct point`, a value that consists of the values of the
members.

If you want a pointer to `a`, you can construct one easily enough
by writing `&a`. A compiler *might* use pointers internally to
handle some operations on `a`, but there's no reason at all to
assume that it has to. For example, here's the code generated by
gcc for an assignment `b = a;`, where `b` and `a` are both of type
`struct point`:

movl 56(%esp), %eax
movl %eax, 68(%esp)
movl 60(%esp), %eax
movl %eax, 72(%esp)
movl 64(%esp), %eax
movl %eax, 76(%esp)

And here's the code generated for an assignment `y = x;`, where `y` and
`x` are of type `long double`:

movl 16(%esp), %eax
movl 20(%esp), %edx
movl 24(%esp), %ecx
movl %eax, 32(%esp)
movl %edx, 36(%esp)
movl %ecx, 40(%esp)

Looks pretty similar to me. Is a long double object "really" a pointer?

[...]

> True enough, but the difference is that a structure cannot be used for
> arithmetic operations of any kind in C. That makes it fundamentally
> different from simple type variables. It may be meaningful to say that
> structures passed to functions as values is a hack in the C language.
> It fills a use-case scenario that will be found (or should be found)
> only in trivial programs, IMHO.

Ok, you can't do arithmetic on structures. You can't use "%" on
floating-point types. You *can* assign structures and pass them as
function arguments, with exactly the same syntax and semantics as the
corresponding operations on scalar types.
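
A small sketch of that point, using the `struct point` from earlier in the thread next to a plain `int`. The function name `struct_copy_demo` is invented for illustration.

```c
#include <assert.h>

struct point { int x, y, z; };

/* Hypothetical demo: assignment copies a struct just as it copies an int. */
int struct_copy_demo(void)
{
    struct point a = {10, 20, 30};
    struct point b;
    int n = 10, m;

    b = a;          /* copies all three members */
    m = n;          /* copies the int           */

    a.x = 9999;     /* modify the originals ... */
    n = 9999;

    assert(m == 10);
    return b.x;     /* ... and the copies are untouched */
}
```

`struct_copy_demo()` returns 10: after `b = a`, changing `a` has no effect on `b`, which is the behaviour BartC expected of the list example.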

[...]

There's a close relationship between arrays and pointers (though it's
important to remember that *arrays are not pointers*; see section 6 of
the comp.lang.c FAQ <http://www.c-faq.com>). There is no such special
relationship at the language or compiler level between structures and
pointers.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson
Guest
Posts: n/a

 10-23-2012
"BartC" <(E-Mail Removed)> writes:
[...]
>
> struct {
> char r,g,b,a;
> } c;

(unsigned char would probably make more sense.)

> It would be less efficient to pass a pointer to this value, than to pass the
> value itself (especially when the pointer could well be wider). Returning
> such a struct as a value from a function is also far simpler.
>
> In any case C gives you the choice of pass-by-value or pass-by-reference -
> for structs.

I presume what you mean by that is that you can define a function that
takes a pointer to a struct, like this:

struct rgba {
unsigned char r, g, b, a;
};

void func(struct rgba *arg);

struct rgba obj;
func(&obj);

If the parameter is declared to be of type "struct rgba", it's
*always* passed by value.

I presume you know that, but other readers might mistakenly infer
that you can tell the compiler to pass arguments by reference
(which is possible in some other languages).
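
To make the difference concrete, here is a sketch with one function of each kind. The names `clear_red_by_value` and `clear_red_by_pointer` are hypothetical.

```c
#include <assert.h>

struct rgba { unsigned char r, g, b, a; };

/* Receives a copy: the change is invisible to the caller. */
unsigned char clear_red_by_value(struct rgba arg)
{
    arg.r = 0;
    return arg.r;
}

/* Receives a pointer (itself passed by value): the change
   reaches the caller's object. */
void clear_red_by_pointer(struct rgba *arg)
{
    arg->r = 0;
}
```

Given `struct rgba obj = {255, 128, 64, 255};`, calling `clear_red_by_value(obj)` leaves `obj.r` at 255, while `clear_red_by_pointer(&obj)` sets it to 0. The caller has to write `&obj` explicitly; the declaration alone never turns a `struct rgba` parameter into a reference.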

[...]

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>

Ian Collins
Guest
Posts: n/a

 10-23-2012
On 10/23/12 12:42, Steve Thompson wrote:
> On Mon, Oct 22, 2012 at 10:44:11PM +0100, BartC wrote:
>> "Steve Thompson"<(E-Mail Removed)> wrote in message
>> news:(E-Mail Removed)...
>>
>>> When I was
>>> less experienced with C, I was confused by the fact that variables
>>> defined as structures were not pointers:
>>>
>>> struct point {
>>> int x, y, z;
>>> };
>>>
>>> struct point a;
>>>
>>>
>>> One cannot reference 'a' (as such) in an expression even though it is
>>> a pointer in the implementation because C allows one to use structures
>>> as arguments to a function. I'd never do that in my own code, but
>>> some people may like the fact that they can pass complex data
>>> structures around as if they were simple variables. I think it wastes
>>> resources, but of course that is less of an issue today than it was
>>> when C was first developed.

>>
>>
>> struct {
>> char r,g,b,a;
>> } c;
>>
>> It would be less efficient to pass a pointer to this value, than to pass the
>> value itself (especially when the pointer could well be wider). Returning
>> such a struct as a value from a function is also far simpler.

>
> I have my doubts about that, at least on anything like a contemporary
> x86 machine.

Try it, you may be surprised.

--
Ian Collins

BartC
Guest
Posts: n/a

 10-23-2012
"Keith Thompson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> "BartC" <(E-Mail Removed)> writes:

>> In any case C gives you the choice of pass-by-value or
>> pass-by-reference - for structs.

>
> I presume what you mean by that is that you can define a function that
> takes a pointer to a struct, like this:
>
> struct rgba {
> unsigned char r, g, b, a;
> };
>
> void func(struct rgba *arg);
>
> struct rgba obj;
> func(&obj);
>
> If the parameter is declared to be of type "struct rgba", it's
> *always* passed by value.
>
> I presume you know that, but other readers might mistakenly infer
> that you can tell the compiler to pass arguments by reference
> (which is possible in some other languages).

Yes, pass-by-reference is emulated by using a pointer to the data. C
*thinks* it's pass-by-value (the value of the pointer).

--
Bartc

Ben Bacarisse
Guest
Posts: n/a

 10-23-2012
Steve Thompson <(E-Mail Removed)> writes:

> On Mon, Oct 22, 2012 at 09:06:39PM +0100, Ben Bacarisse wrote:
>> Steve Thompson <(E-Mail Removed)> writes:
>>
>> > On Mon, Oct 22, 2012 at 10:52:47AM -0700, Keith Thompson wrote:
>> >> Steve Thompson <(E-Mail Removed)> writes:
>> >> [...]
>> >> > No annoyance taken, although now that I think of it it does show an
>> >> > instance of overloaded semantics, which can be confusing. When I was
>> >> > less experienced with C, I was confused by the fact that variables
>> >> > defined as structures were not pointers:
>> >> >
>> >> > struct point {
>> >> > int x, y, z;
>> >> > };
>> >> >
>> >> > struct point a;
>> >> >
>> >> >
>> >> > One cannot reference 'a' (as such) in an expression even though it is
>> >> > a pointer in the implementation because C allows one to use structures
>> >> > as arguments to a function. I'd never do that in my own code, but
>> >> > some people may like the fact that they can pass complex data
>> >> > structures around as if they were simple variables. I think it wastes
>> >> > resources, but of course that is less of an issue today than it was
>> >> > when C was first developed.
>> >>
>> >> I'm not sure what you mean when you say `a` "is a pointer in
>> >> the implementation", especially after you mention "the fact that
>> >> variables defined as structures were not pointers".
>> >
>> > Meaning that under the hood in the C compiler, 'a' is really a
>> > pointer. The logic of the compiler causes it to use the structure
>> > data as specified in the standard, but conceptually 'a' is still a
>> > pointer to the data in the specific structure identified by 'a'. I
>> > found it very confusing when I was learning C.

>>
>> I am not surprised. I would have found it totally baffling when I was
>> learning C. I find it baffling even now, having learnt C!
>>
>> Whatever goes on under the hood (and I'd dispute that 'a' is "really a
>> pointer") it's much easier to learn a language in its own terms, at
>> least as far as that's possible.

>
> Well, no. 'a' is never a pointer as such in the terms of the language
> syntax, but its treatment by the compiler internally is nominally
> equivalent to the way it handles a pointer, with the exception that
> the language grammar forbids its use in the program text as a pointer.

I've been around Usenet long enough to recognise the signs. You are now
100% committed to this point of view and I don't think there's much
point in continuing to debate it.

> In a simplistic sense, all variables are pointers to a segment of
> program data, modifiable or otherwise. In the real world, variable
> values may move around in and from RAM to CPU registers and back while
> being identified as a single consistent entity. This helps to
> illustrate the difference between a language and its implementation.
>
>> In C terms, 'a' names an object and 'a' is a (modifiable) lvalue
>> expression, just as if it had been declared to be an int. Trying to
>> reconcile this with some model that says it's a pointer (when it isn't
>> one) is bound to be confusing.

>
> I was never happy with the term lvalue or rvalue. My brain wants to
> think in terms of pointers, variables, and values. Structures defined
> as automatic variables don't fit all that neatly in that
> classification.

Then surely that's a sign that your "pointers, variables, and values"
classification is unsuitable?

--
Ben.

BartC
Guest
Posts: n/a

 10-23-2012
"Steve Thompson" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...

> Hypothetically, C might have been designed differently:
>
> struct point {
> int x, y, z;
> };
>
> struct point a, b;
>
> a.x = 1;
> a.y = 2;
> a.z = 3;

If you're going to the trouble of redesigning the language, at least make it
possible to write:

a={1,2,3};

> With two distinct assignment operators, one might get a copied
> structure, or a pointer to the original.
>
> Copy:
> b = a;
>
> By reference:
> b := a;

C currently assigns by value, and always does a shallow copy. The above is
handled as follows:

struct point *A,*B;

A=B; /* copy by reference */
*A=*B; /* copy by value */

Where 'reference' and 'value' refer to the targets of the pointers. However
the copy by value is only one level deep. (To do a proper deep copy involves
recursing through any nested pointers in the struct, which can point to any
other kinds of objects. It gets hairy, and would be an entirely different
kind of language. I don't know if that's what you had in mind.)
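
A sketch of the one-level-deep problem, using a struct that holds a pointer. The type `named_point` and the function `named_point_deep_copy` are invented for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct named_point {
    int x, y;
    char *label;    /* points at storage outside the struct */
};

/* Hypothetical deep copy: duplicates the label as well as the members.
   Returns 0 on success, -1 if allocation fails. The caller frees
   dst->label when done. */
int named_point_deep_copy(struct named_point *dst,
                          const struct named_point *src)
{
    size_t len = strlen(src->label) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, src->label, len);
    *dst = *src;            /* shallow copy of every member ...      */
    dst->label = copy;      /* ... then replace the shared pointer   */
    return 0;
}
```

A plain `b = a;` on this type copies `x`, `y` and the pointer, so both structs share one label string. The deep copy has to be written by hand, and would need to recurse if the pointed-to data held pointers of its own.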

--
Bartc

Keith Thompson
Guest
Posts: n/a

 10-23-2012
Steve Thompson <(E-Mail Removed)> writes:
> On Mon, Oct 22, 2012 at 09:06:39PM +0100, Ben Bacarisse wrote:

[...]
>> Whatever goes on under the hood (and I'd dispute that 'a' is "really a
>> pointer") it's much easier to learn a language in its own terms, at
>> least as far as that's possible.

>
> Well, no. 'a' is never a pointer as such in the terms of the language
> syntax, but its treatment by the compiler internally is nominally
> equivalent to the way it handles a pointer, with the exception that
> the language grammar forbids its use in the program text as a pointer.

I don't believe that's true. Do you have some evidence that it is?

> In a simplistic sense, all variables are pointers to a segment of
> program data, modifiable or otherwise. In the real world, variable
> values may move around in and from RAM to CPU registers and back while
> being identified as a single consistent entity. This helps to
> illustrate the difference between a language and its implementation.

So a char object is really a pointer to char, an int object is
really a pointer to int, and a char* object is really a char**?

I think you're simply adding an extra -- and unnecessary -- level
of indirection, and of confusion.

There may also be some confusion between pointer *objects* and
pointer *values* (addresses).

If I define an object of struct type and perform some operation
on it (say, a simple assignment), the generated code might very
well refer to the address of the object. There is no *object*
of pointer type, either explicitly or implicitly.

Struct objects are very often manipulated via pointers (simulating
by-reference argument passing, for example), but that's a coding
convention, not anything inherent to the language.

A struct object is a struct object, just as a scalar object is a
scalar object. The distinction you see between structs and scalars
doesn't exist. (Certainly there's a distinction, but it's not the
one you're describing.)

>> In C terms, 'a' names an object and 'a' is a (modifiable) lvalue
>> expression, just as if it had been declared to be an int. Trying to
>> reconcile this with some model that says it's a pointer (when it isn't
>> one) is bound to be confusing.

>
> I was never happy with the term lvalue or rvalue. My brain wants to
> think in terms of pointers, variables, and values. Structures defined
> as automatic variables don't fit all that neatly in that classification.

C's definitions of the terms lvalue and rvalue are not consistent with
their original meanings.

Originally, an expression could be evaluated in two different ways,
depending on the context in which it appears. Given an assignment:

int x = 42;
int y;
y = x;

The expression `x` (on the right side of the assignment, thus
"rvalue") is evaluated *for its rvalue*, which is 42, whereas
the expression `y` (on the left side, thus "lvalue") is evaluated
*for its lvalue*, which is the identity of the object named "y",
regardless of what value it happens to contain. (Either can be a
more complex expression, for example `arr[func(42)].member = abs(x)
+ 42;`, so there can be some non-trivial computation involved in
determining an lvalue.)

The C standard changed the meaning of "lvalue", so it refers to the
expression `y` itself, rather than to the result of evaluating it.
The standard doesn't use the term "rvalue" other than in a footnote,
which mentions in passing that an rvalue is the result of an
expression.

See N1337 6.3.2.1.

To oversimplify, an lvalue in C is an expression that designates
an object. Note carefully that *designating* an object needn't
involve generating a *pointer* to an object, or equivalently
computing the address of an object. A simple example:

int x = 42;
register int y;
y = x;

`y` is an lvalue, but it refers to an object that has no address.
The same thing happens if `y` is a bit field.
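
A sketch showing both cases side by side. The struct `flags` and the function `no_address_demo` are made up for illustration.

```c
#include <assert.h>

struct flags {
    unsigned int ready : 1;   /* bit-field: &f.ready is not allowed */
    unsigned int count : 7;
};

/* Hypothetical demo: assigning to lvalues whose address cannot be taken. */
int no_address_demo(void)
{
    int x = 42;
    register int y;           /* &y is not allowed either */
    struct flags f = {0, 0};

    y = x;                    /* y is an lvalue on the left of '='  */
    f.count = 5;              /* so is f.count                      */
    f.ready = 1;              /* and f.ready                        */

    return y + (int)f.count + (int)f.ready;
}
```

`no_address_demo()` returns 48. All three assignments are ordinary uses of an lvalue, yet writing `&y` or `&f.count` would be rejected by the compiler, so "lvalue" cannot simply mean "something with a pointer behind it".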

Dropping the "register" keyword doesn't make any real conceptual
difference -- nor does making `x` and `y` structures rather than
scalars.

You're right: "Structures defined as automatic variables don't fit
all that neatly in that classification" (of pointers, variables,
and values). The problem is not with the language, it's with your
classification.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>