Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > What is Expressiveness in a Computer Language

Reply
Thread Tools

What is Expressiveness in a Computer Language

 
 
Joachim Durchholz
Guest
Posts: n/a
 
      07-14-2006
Marshall schrieb:
> void foo() {
> int i = 0;
> int j = 0;
> j = 1;
> i = 2;
> // check value of j here. It is still 1, no matter what you filled
> // in above.
> // The assignment to i cannot be made to affect the value of j.
> }
>
> Those two local primitive variables cannot be made to have the same
> identity.


Sure. To state it more clearly, they cannot be aliases.

> But you can update them, so this is an example of mutability
> without the possibility of identity.


You're being a bit sloppy with terminology here. "Identity" in the
phrase above refers to two entities, in the sense of "i and j cannot be
identical".
Identity is actually a property of a single entity, namely that what
remains constant regardless of what you do with the entity.

Regards,
Jo
 
Reply With Quote
 
 
 
 
Joachim Durchholz
Guest
Posts: n/a
 
      07-14-2006
Marshall schrieb:
> By your definition, "pointer" and "variable" are synonyms. That doesn't
> seem like a good idea to me. (What if i and j end up in registers?
> I have not heard it said that registers have addresses.)


There are no registers in the JVM ;-P

More specifically, where Joe said "pointer" he really meant "reference".
i refers to a variable, and you really want to make sure that the first
i refers to the same variable as the second i, so whatever is referred
to by i is really an identity.

>>> Those two local primitive variables cannot be made to have the same
>>> identity. But you can update them, so this is an example of mutability
>>> without the possibility of identity.

>> The identity is temporal: You use the same variable name at two
>> different times. Do you intend for the second `i' to mean the same
>> variable as the first `i'?

>
> Okay, so "i" and "i" have the same identity.


Vacuous but true.

> I suppose you could argue that the language's namespace is an
> address-space, and variable names are addresses.


Correct.

> At this point, though, you've made the concept of identity so broad
> that it is now necessarily a property of all languages that use named
> values, whether updatable or not. I think you would also have to call
> "3" a void pointer by this scheme.


It's not a void pointer, it's a pointer to an integer, but you're
correct: "3", when written in a program text, is a reference to the
successor of the successor of the successor of zero.

If you find that silly and would like to stick with equality, then
you're entirely correct. Natural numbers are entirely immutable by
definition, and I'm definitely not the first one to observe that
identity and equality are indistinguishable for immutable objects.

> Clearly there is *some* difference between a language which allows
> explicit pointers and thus aliasing and one that doesn't.


You can have aliasing without pointers; e.g. arrays are fully sufficient.
If i = j, then a [i] and a [j] are aliases of the same object.

(After I observed that, I found it no longer a surprise that array
optimizations are what Fortran compiler teams sink most time into. The
aliasing problems are *exactly* the same as those with pointers in C -
though in practice, the Fortranistas have the advantage that the
compiler will usually know that a [i] and b [j] cannot be aliases of
each other, so while they have the same problems, the solutions give the
Fortranistas more leverage.)

> What term would you use? First-class variables?


I think it's more a quasi-continuous spectrum.

On one side, we have alias-free toy languages (no arrays, no pointers,
no call-by-reference, just to name the most common sources of aliasing).

SQL is slightly more dangerous: it does have aliases, but they are
rarely a problem (mostly because views aren't a very common thing, and
even if they are used, they aren't usually so abstract that they really
hide the underlying dependencies).

Next are languages with arrays and call-by-reference. "Pascal without
pointers" would be a candidate.
Here, aliasing occurs, and it can and does hide behind abstraction
barriers, but it's not a serious problem unless the software becomes
*really* large.

The end of the line are languages that use references to mutable data
everywhere. All OO languages are a typical example of that.

Regards,
Jo
 
Reply With Quote
 
 
 
 
Marshall
Guest
Posts: n/a
 
      07-14-2006
Joachim Durchholz wrote:
> Marshall schrieb:
> > What about my example of SQL? Mutation, no pointers, no aliasing.
> > Yet: useful.

>
> Sorry, but SQL does have aliasing.


Well. I suppose we do not have an agreed upon definition
of aliasing, so it is hard to evaluate either way. I would
propose using the same one you used for identity:

if there are two variables and modifying one also modifies
the other, then there is aliasing between them.

I avoided mentioning equality to include, for example,
having an array i that is an alias to a subset of array j.
Please feel free to critque this definition, and/or supply
an alternative.


> E.g. if you have records that have name="John", surname="Doe", the
> statements
> SELECT * FROM persons WHERE name = "John"
> and
> SELECT * FROM persons WHERE name = "Doe"
> are aliases of each other.
>
> The alias is actually in the WHERE clause.


Not by my definition, because there is only one variable here.


> And this *can* get you into
> trouble if you have something that does
> UPDATE ... WHERE name = "John"
> and
> UPDATE ... WHERE surname = "Doe"
> e.g. doing something with the Johns, then updating the names of all
> Does, and finally restoring the Johns (but not recognizing that changing
> the names of all Does might have changed your set of Johns).


The fact that some person might get confused about the
semantics of what they are doing does not indicate aliasing.
It is easy enough to do an analysis of your updates and
understand what will happen; this same analysis is impossible
with two arbitrary pointers, unless one has a whole-program
trace. That strikes me as a significant difference.


> Conceptually, this is just the same as having two different access path
> to the same memory cell. Or accessing the same global variable through a
> call-by-reference parameter and via its global name.


There are similarities, but they are not the same.


> BTW with views, you get not just aliasing but what makes aliasing really
> dangerous. Without views, you can simply survey all the queries that you
> are working with and lexically compare table and field names to see
> whether there's aliasing. With views, the names that you see in a
> lexical scope are not enough to determine aliasing.
> E.g. if you use a view that builds upon the set of Johns but aren't
> aware of that (possibly due to abstraction barriers), and you change the
> name field of all Does, then you're altering the view without a chance
> to locally catch the bug. That's just as bad as with any pointer
> aliasing problem.


It is certainly aliasing, but I would not call it "just as bad." There
are
elaborate declarative constraint mechanisms in place, for example.
And the definition of the view itsef is a declarative, semantic
entity. Whereas a pointer is an opaque, semantics-free address.


Marshall

 
Reply With Quote
 
Marshall
Guest
Posts: n/a
 
      07-14-2006
Andreas Rossberg wrote:
> Marshall wrote:
> >
> > After all, what are the alternatives? Purely-functional
> > languages remove themselves from a large class of
> > problems that I consider important: data management.

>
> Maybe, but I have yet to see how second-class variables are really more
> adequate in dealing with it.
>
> And note that even with second-class state you can still have aliasing
> issues - you just need mutable arrays and pass around indices. Keys in
> databases are a more general form of the same problem.


So for array a, you would claim that "a[5]" is an alias for
(a part of) "a"? That seems to stretch the idea of aliasing
to me. With these two expressions, it is obvious enough
what is going on; with two arbitrary pointers, it is not.

It seems to me the source of the problem is the opacity
of pointers, but perhaps I am missing something.


> > I have explored the OO path to its bitter end and am
> > convinced it is not the way. So what is left? Uniqueness
> > types and logic programming, I suppose. I enjoy logic
> > programming but it doesn't seem quite right. But notice:
> > no pointers there! And it doesn't seem to suffer from the
> > lack.

>
> Uh, aliasing all over the place! Actually, I think that logic
> programming, usually based on deep unification, brings by far the worst
> incarnation of aliasing issues to the table.


Hmmm. Can you elaborate just a bit?


Marshall

 
Reply With Quote
 
Marshall
Guest
Posts: n/a
 
      07-14-2006
Joachim Durchholz wrote:
> Marshall schrieb:
> > void foo() {
> > int i = 0;
> > int j = 0;
> > j = 1;
> > i = 2;
> > // check value of j here. It is still 1, no matter what you filled
> > // in above.
> > // The assignment to i cannot be made to affect the value of j.
> > }
> >
> > Those two local primitive variables cannot be made to have the same
> > identity.

>
> Sure. To state it more clearly, they cannot be aliases.


Yes.


> > But you can update them, so this is an example of mutability
> > without the possibility of identity.

>
> You're being a bit sloppy with terminology here. "Identity" in the
> phrase above refers to two entities, in the sense of "i and j cannot be
> identical".
> Identity is actually a property of a single entity, namely that what
> remains constant regardless of what you do with the entity.


It is a fair critque. I'll try to be more precise.


Marshall

 
Reply With Quote
 
Marshall
Guest
Posts: n/a
 
      07-14-2006
Joachim Durchholz wrote:
>
> You can have aliasing without pointers; e.g. arrays are fully sufficient.
> If i = j, then a [i] and a [j] are aliases of the same object.


I am having a hard time with this very broad definition of aliasing.
Would we also say that a[1+1] and a[2] are aliases? It seems
to me, above, that we have only a, and with only one variable
there can be no aliasing.

A further question:

given a 32 bit integer variable x, and offsets i and j (equal as in
the above example) would you say that

x &= (1 << i)
and
x &= (1 << j)

are aliased expressions for setting a particular bit in x?

I am not being facetious; I am trying to understand the limits
of your definition for aliasing.


> (After I observed that, I found it no longer a surprise that array
> optimizations are what Fortran compiler teams sink most time into. The
> aliasing problems are *exactly* the same as those with pointers in C -
> though in practice, the Fortranistas have the advantage that the
> compiler will usually know that a [i] and b [j] cannot be aliases of
> each other, so while they have the same problems, the solutions give the
> Fortranistas more leverage.)


I don't understand this paragraph. On the one hand, it seems you
are saying that C and Fortran are identically burdened with the
troubles caused by aliasing, and then a moment later it seems
you are saying the situation is distinctly better with Fortran.


> > What term would you use? First-class variables?

>
> I think it's more a quasi-continuous spectrum.
>
> On one side, we have alias-free toy languages (no arrays, no pointers,
> no call-by-reference, just to name the most common sources of aliasing).
>
> SQL is slightly more dangerous: it does have aliases, but they are
> rarely a problem (mostly because views aren't a very common thing, and
> even if they are used, they aren't usually so abstract that they really
> hide the underlying dependencies).
>
> Next are languages with arrays and call-by-reference. "Pascal without
> pointers" would be a candidate.
> Here, aliasing occurs, and it can and does hide behind abstraction
> barriers, but it's not a serious problem unless the software becomes
> *really* large.
>
> The end of the line are languages that use references to mutable data
> everywhere. All OO languages are a typical example of that.


Now with this, it appears you are agreeing that SQL has an advantage
vis-a-vis aliasing compared to OO languages. Yes? If so, we are
agreeing on the part I care about, and the specifics of just what
we call aliasing are not so important to me.


Marshall

 
Reply With Quote
 
rossberg@ps.uni-sb.de
Guest
Posts: n/a
 
      07-14-2006
Marshall wrote:
> Andreas Rossberg wrote:
> >
> > And note that even with second-class state you can still have aliasing
> > issues - you just need mutable arrays and pass around indices. Keys in
> > databases are a more general form of the same problem.

>
> So for array a, you would claim that "a[5]" is an alias for
> (a part of) "a"? That seems to stretch the idea of aliasing
> to me.


Not at all, I'd say. In my book, aliasing occurs whenever you have two
different "lvalues" that denote the same mutable cell. I think it would
be rather artificial to restrict the definition to lvalues that happen
to be variables or dereferenced pointers.

So of course, a[i] aliases a[j] when i=j, which in fact is a well-known
problem in some array or matrix routines (e.g. copying array slices
must always consider overlapping slices as special cases).

> > Uh, aliasing all over the place! Actually, I think that logic
> > programming, usually based on deep unification, brings by far the worst
> > incarnation of aliasing issues to the table.

>
> Hmmm. Can you elaborate just a bit?


Logic variables immediately become aliases when you unify them.
Afterwards, after you bind one, you also bind the other - or fail,
because both got already bound the other way round. Unfortunately,
unification also is a deep operation, that is, aliasing can be induced
into variables deeply nested into some terms that happen to get
unified, possibly without you being aware of any of this (the
unification, nor the variable containment). To avoid accidental
aliasing you essentially must keep track of all potential unifications
performed by any given predicate (including transitively, by its
subgoals), which totally runs square to all concerns of modularity and
abstraction.

I've never been much of a logic programmer myself, but our language
group has a dedicated LP and CP background, and people here have
developed very strong opinions about the adequateness of unrestricted
logic variables and particularly unification in practice. I remember
somebody calling Prolog "the worst imperative language ever invented".

- Andreas

 
Reply With Quote
 
Joe Marshall
Guest
Posts: n/a
 
      07-14-2006

Marshall wrote:
> Joe Marshall wrote:
> > Marshall wrote:
> > >
> > > Consider the following Java fragment:
> > >
> > > void foo() {
> > > int i = 0;
> > > int j = 0;
> > >
> > > // put any code here you want
> > >
> > > j = 1;
> > > i = 2;
> > > // check value of j here. It is still 1, no matter what you filled in
> > > above.
> > > // The assignment to i cannot be made to affect the value of j.
> > >
> > > }

> >
> > True, but you have hidden the pointers. Semantically, the identifiers
> > i and j refer not to integers but to locations that hold integers. The
> > assignment modifies the location.

>
> What the implementation looks like shouldn't affect how we speak
> of the logical model. In the above code, there are no pointers.
>
> By your definition, "pointer" and "variable" are synonyms. That doesn't
> seem like a good idea to me. (What if i and j end up in registers?
> I have not heard it said that registers have addresses.)


You probably won't like this, but....

The example you give above is a bit ambiguous as to whether or not it
shows `true' mutation. So what do I mean by `true' mutation? The code
above could be transformed by alpha-renaming into an equivalent version
that does no mutation:

void foo() {
int i = 0;
int j = 0;

// put any code here you want

int jj = 1;
int ii = 2;
// rename all uses of i to ii and j to jj after this point.

}

Once I've done the renaming, we're left with a pure function. I claim
that this sort of `mutation' is uninteresting because it can be
trivially eliminated through renaming. Since we can eliminate it
through renaming, we can use a simplified semantics that does not
involve a `store', and use a substitution model of evaluation.

The more interesting kind of mutation cannot be eliminated through
renaming. The semantic model requires the notion of a stateful
`store', and the model is not referentially transparent: substitution
does not work. These qualities are what make mutation problematic. My
claim is that this kind of mutation cannot be had without some model of
pointers and references.

There is a further distinguishing characteristic: Can I write a
program that detects the mutation (and acts differently depending on
whether there is mutation or not)? It is unclear from your example
above whether you intended this to be possible, but certainly there is
no way for a caller of foo to tell that foo reassigns an internal
variable. If there a procedure is extrinsically a pure function, then
there is no real meaning to any `mutation' that goes on inside; there
is always a way to turn the internal mutation into a pure function
through alpha-renaming.

>
> Clearly there is *some* difference between a language which allows
> explicit pointers and thus aliasing and one that doesn't. What term
> would you use? First-class variables?


My argument to you in re this statement:
> Again, I disagree: it is posible to have mutability without
> pointers/identity/objects.


is that mutability without a notion of pointers, identity, or objects
is trivially eliminated through alpha renaming and thus isn't the
`mutability' of interest.

 
Reply With Quote
 
Joachim Durchholz
Guest
Posts: n/a
 
      07-14-2006
Marshall schrieb:
> Joachim Durchholz wrote:
>> Marshall schrieb:
>>> What about my example of SQL? Mutation, no pointers, no aliasing.
>>> Yet: useful.

>> Sorry, but SQL does have aliasing.

>
> Well. I suppose we do not have an agreed upon definition
> of aliasing, so it is hard to evaluate either way. I would
> propose using the same one you used for identity:
>
> if there are two variables and modifying one also modifies
> the other, then there is aliasing between them.


I think that's an adequate example.
For a definition, I'd say it's a bit too narrow unless we use a fairly
broad definition for "variable". I.e. in a C context, I'd say that a->b
is a variable, too, as would be foo(blah)->x.

> I avoided mentioning equality to include, for example,
> having an array i that is an alias to a subset of array j.


This would mean that there's aliasing between, say, a list of
transaction records and the balance of the account (since modifying the
list of transactions will change the balance, unless the object isn't
properly encapsulated).
For purposes of this discussion, it's probably fair to say "that's a
form of aliasing, too", even though it's quite indirect.

>> E.g. if you have records that have name="John", surname="Doe", the
>> statements
>> SELECT * FROM persons WHERE name = "John"
>> and
>> SELECT * FROM persons WHERE name = "Doe"
>> are aliases of each other.


Arrrrrgh... I made a most braindamaged, stupid mistake here. The second
SELECT should have used the *surname* field, so it would be

>> SELECT * FROM persons WHERE surname = "Doe"


Then, if there's a record that has name = "John", surname = "Doe", the
two WHERE clauses have aliasing in the form of overlapping result sets.

>> The alias is actually in the WHERE clause.

>
> Not by my definition, because there is only one variable here.


Sorry, my wording was sloppy.

I meant to say that 'in SQL, you identify records via clauses like WHERE
name = "John", i.e. WHERE clauses are a kind of identity'.

This is still not precise - the identity of an SQL record isn't
explicitly accessible (except the nonstandard OID facility that some SQL
engines offer). Nevertheless, they do have an identity, and there's a
possibility of aliasing - if you change all Johns, you may also change a
Doe.

>> And this *can* get you into
>> trouble if you have something that does
>> UPDATE ... WHERE name = "John"
>> and
>> UPDATE ... WHERE surname = "Doe"
>> e.g. doing something with the Johns, then updating the names of all
>> Does, and finally restoring the Johns (but not recognizing that changing
>> the names of all Does might have changed your set of Johns).

>
> The fact that some person might get confused about the
> semantics of what they are doing does not indicate aliasing.
> It is easy enough to do an analysis of your updates and
> understand what will happen; this same analysis is impossible
> with two arbitrary pointers, unless one has a whole-program
> trace. That strikes me as a significant difference.


Sure. I said that aliases in SQL aren't as bad as in other programs.

Once you get abstraction mixed in, the analysis becomes less
straightforward. In the case of SQL, views are such an abstraction
facility, and they indeed can obscure what you're doing and make the
analysis more difficult. If it's just SQL we're talking about, you
indeed have to look at the whole SQL to check whether there's a view
that may be involved in the queries you're analysing, so the situation
isn't *that* different from pointers - it's just not a problem because
the amount of code is so tiny!

>> Conceptually, this is just the same as having two different access path
>> to the same memory cell. Or accessing the same global variable through a
>> call-by-reference parameter and via its global name.

>
> There are similarities, but they are not the same.


What are the relevant differences? How does the semantics of a WHERE
clause differ from that of a pointer, in terms of potential aliasing?

My line of argument would be this:
Pointers can be simulated using arrays, so it's no surprise that arrays
can emulate all the aliasing of pointers.
Arrays can be simulated using WHERE (with SELECT and UPDATE), so I'd say
that SQL can emulate all the aliasing of arrays, and hence that of pointers.

>> BTW with views, you get not just aliasing but what makes aliasing really
>> dangerous. Without views, you can simply survey all the queries that you
>> are working with and lexically compare table and field names to see
>> whether there's aliasing. With views, the names that you see in a
>> lexical scope are not enough to determine aliasing.
>> E.g. if you use a view that builds upon the set of Johns but aren't
>> aware of that (possibly due to abstraction barriers), and you change the
>> name field of all Does, then you're altering the view without a chance
>> to locally catch the bug. That's just as bad as with any pointer
>> aliasing problem.

>
> It is certainly aliasing, but I would not call it "just as bad." There
> are
> elaborate declarative constraint mechanisms in place, for example.


Yes, but they can give you unexpected results.
It smells a bit like closing the gates after the horses are out.

On the other hand, *if* there is identity and updates, no matter how we
handle them, there must be *some* way to deal with the problems that
ensue. Having declarative constraints doesn't sound like the worst way
to do that.

> And the definition of the view itsef is a declarative, semantic
> entity. Whereas a pointer is an opaque, semantics-free address.


Oh, that's a very BCPL view. It was still somewhat true for K&R C, but
it's not really applicable to ANSI C, not at all to C++, and even less
to languages that associate well-encapsulated types with pointers (such
as Ada, Eiffel, or Java).
Unless, of course, you mean "address" if you say "pointer"

Regards,
Jo
 
Reply With Quote
 
Joachim Durchholz
Guest
Posts: n/a
 
      07-14-2006
Marshall schrieb:
> Joachim Durchholz wrote:
>> You can have aliasing without pointers; e.g. arrays are fully sufficient.
>> If i = j, then a [i] and a [j] are aliases of the same object.

>
> I am having a hard time with this very broad definition of aliasing.
> Would we also say that a[1+1] and a[2] are aliases? It seems
> to me, above, that we have only a, and with only one variable
> there can be no aliasing.


a[1] is a variable, too. You can assign values to it.

Inside function foo when called as foo(a[1]), it may even be referenced
with yet another name and updated in isolation; foo may not even know
it's an array element. (This is sort-of true in C, and definitely true
in languages that don't have pointer arithmetic. If you pass a[1] to a
call-by-reference parameter in Pascal, the callee has no way to access
the parameter as if it were an array element.)

> A further question:
>
> given a 32 bit integer variable x, and offsets i and j (equal as in
> the above example) would you say that
>
> x &= (1 << i)
> and
> x &= (1 << j)
>
> are aliased expressions for setting a particular bit in x?


Yes.

> I am not being facetious; I am trying to understand the limits
> of your definition for aliasing.


Understood and appreciated.

>> (After I observed that, I found it no longer a surprise that array
>> optimizations are what Fortran compiler teams sink most time into. The
>> aliasing problems are *exactly* the same as those with pointers in C -
>> though in practice, the Fortranistas have the advantage that the
>> compiler will usually know that a [i] and b [j] cannot be aliases of
>> each other, so while they have the same problems, the solutions give the
>> Fortranistas more leverage.)

>
> I don't understand this paragraph. On the one hand, it seems you
> are saying that C and Fortran are identically burdened with the
> troubles caused by aliasing, and then a moment later it seems
> you are saying the situation is distinctly better with Fortran.


That's exactly the situation.
Aliasing is present in both language, but in Fortran, the guarantee that
two names aren't aliased are easier to come by.
Even more non-aliasing guarantees exist if the language has a more
expressive type system. In most cases, if you know that two expressions
have different types, you can infer that they cannot be aliases.

Of course, the life of compiler writers if of less importance to us; I
added them only because the reports of Fortran compilers having aliasing
problems due to array indexes made me aware that pointers aren't the
only source of aliasing. That was quite a surprise to me then.

> Now with this, it appears you are agreeing that SQL has an advantage
> vis-a-vis aliasing compared to OO languages. Yes?


Sure.
I think that making an aliasing bug in SQL is just as easy as in any
pointer-ridden language, but transactions and constraints (plus the
considerably smaller typical size of SQL code) make it far easier to
diagnose any such bugs.

> If so, we are agreeing on the part I care about, and the specifics of
> just what we call aliasing are not so important to me.


I'm not sure whether I'm getting the same mileage out of this, but the
relevance of a problem is obviously a function on what one is doing on a
day-to-day basis, so I can readily agree with that

Regards,
Jo
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
What is Expressiveness in a Computer Language Xah Lee Java 640 07-22-2006 03:50 PM
What is Expressiveness in a Computer Language Xah Lee Python 671 07-22-2006 03:50 PM
A language-agnostic language Ed Java 24 03-27-2006 08:19 PM
[perl-python] text pattern matching, and expressiveness Xah Lee Python 4 02-11-2005 09:11 PM
[perl-python] text pattern matching, and expressiveness Xah Lee Perl Misc 1 02-07-2005 08:18 PM



Advertisments