Velocity Reviews


Eelco 12-17-2011 02:38 PM

Pythonification of the asterisk-based collection packing/unpacking syntax
 

This is a follow-up discussion on my earlier PEP suggestion. I've
integrated the insights collected during the previous discussion, and
tried to regroup my arguments for a second round of feedback. Thanks
to everybody who gave useful feedback the last time.

PEP Proposal: Pythonification of the asterisk-based collection packing/
unpacking syntax.

This proposal intends to expand upon the existing collection packing
and unpacking syntax. By that we mean the following related Python
constructs:

head, *tail = somesequence
#pack the remainder of the unpacking of somesequence into a list called tail
def foo(*args): pass
#pack the unbound positional arguments into a tuple called args
def foo(**kwargs): pass
#pack the unbound keyword arguments into a dict called kwargs
foo(*args)
#unpack the sequence args into positional arguments
foo(**kwargs)
#unpack the mapping kwargs into keyword arguments
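
For readers who want to run these, here is a minimal sketch of the same constructs in current Python 3. The name foo comes from the listing above; the sample values and the extra variable names are made up for illustration.

```python
# Sample data is illustrative only; it is not from the original post.
somesequence = [1, 2, 3, 4]

# Extended unpacking: the remainder is packed into a list.
head, *tail = somesequence

def foo(*args, **kwargs):
    # args arrives as a tuple, kwargs as a dict.
    return args, kwargs

# Unpack a sequence into positional arguments and a mapping into
# keyword arguments at the call site.
packed_args, packed_kwargs = foo(*tail, **{"sep": "-"})

print(head, tail)                  # 1 [2, 3, 4]
print(packed_args, packed_kwargs)  # (2, 3, 4) {'sep': '-'}
```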

We suggest that these constructs have the following shortcomings that
could be remedied.

The syntax is unnecessarily cryptic, and out of line with Python's
preference for explicit syntax. One can not state in a single line what
the asterisk operator does; this is highly context dependent, and is
devoid of that ‘for line in file’ pythonic obviousness. From the
perspective of a Python outsider, the only hint as to what *args means
is by loose analogy with the C-way of handling variable arguments.

The current syntax, in its terseness, leaves something to be desired in
terms of flexibility. While a tuple might be the logical choice to pack
positional arguments in the vast majority of cases, it need not be
true that a list is always the preferred choice to repack an unpacked
sequence, for instance.


Type constraints:

In case the asterisk is not used to signal unpacking, but rather to
signal packing, its semantics is essentially that of a type
constraint. The statement:

head, tail = sequence

Signifies regular unpacking. However, if we add an asterisk, as in:

head, *tail = sequence

We demand that tail not be just any python object, but rather a list.
This changes the semantics from normal unpacking, to unpacking and
then repacking all but the head into a list.

It may be somewhat counter-intuitive to think of this as a type
constraint, since python is after all a weakly-typed language. But the
current usage of asterisks is an exception to that rule. For those
who are unconvinced, please consider the analogy to the following
simple C# code:

var foo = 3;

An ‘untyped’ object foo is created (actually, its type will be
inferred from its rhs as an integer).

float foo = 3;

By giving foo a type-constraint of float instead, the semantics are
modified; foo is no longer the integer 3, but gets silently cast to
3.0. This is a simple example, but conceptually entirely analogous to
what happens when one places an asterisk before an lvalue in Python.
It means ‘be a list, and adjust your behavior accordingly’, versus ‘be
a float, and adjust your behavior accordingly’.

The aim of this PEP, is that this type-constraint syntax is expanded
upon. We should be careful here to distinguish with providing optional
type constraints throughout python as a whole; this is not our aim.
This concept has been considered before, but the benefits have not been
found to outweigh the costs. http://www.artima.com/weblogs/viewpost.jsp?thread=86641
Our primary aim is the niche of collection packing/unpacking, but if
further generalizations can be made without increasing the cost, those
are most welcome. To reiterate: what is proposed is nothing radical;
merely to replace the asterisk-based type constraints with a more
explicit type constraint.

Currently favored alternative syntax:

Both for the sake of explicitness and flexibility, we consider it
desirable that the name of the collection type is used directly in any
collection packing statement. Annotating a variable declaration with a
collection type name should signal collection packing. This
association between a collection type name and a variable declaration
can be accomplished in many ways; for now, we suggest
collectionname::collectiontype for packing, and ::collectionname for
unpacking.

Examples of use:
head, tail::tuple = ::sequence
def foo(args::list, kwargs::dict): pass
foo(::args, ::kwargs)
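
The proposed `::` forms are not valid Python. As a rough sketch of what they would replace, the following shows how the same results are obtained today, with an explicit conversion after the default packing. The sample values and the body of foo are assumptions made for illustration.

```python
sequence = range(5)

# Proposed: head, tail::tuple = ::sequence
head, *tail = sequence   # tail is packed as a list first
tail = tuple(tail)       # then converted by hand

# Proposed: def foo(args::list, kwargs::dict): pass
def foo(*args, **kwargs):
    args = list(args)    # positional arguments arrive as a tuple
    return args, kwargs  # kwargs is already a dict

# Proposed: foo(::args, ::kwargs)
args, kwargs = [1, 2], {"key": "value"}
result = foo(*args, **kwargs)
```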

The central idea is to replace annotations with asterisks by
annotations with collection type names, but note that we have opted
for several other minor alterations of the existing syntax that seem
natural given the proposed changes.

First of all, explicitly mentioning the type of the collection
involved eliminates the need to have two symbols, * and **. Which
variable captures the positional arguments and which captures the
keyword arguments can be inferred from the collection type they model,
mapping or sequence. The rare case of collections that both model a
sequence and a mapping can either be excluded or handled by assigning
precedence for one type or the other.

A double colon before a collection signals unpacking. As with
declarations, there is no genuine need to have a different operator
for sequence and mapping types, although if such a demand exists, it
would not be hard to accommodate. A double colon in front of the
collection is congruent with the asterisk syntax, and nicely
emphasizes this unpacking operation being the symmetric counterpart of
the packing operation, which is signalled by the same symbols to the
right of the identifier. Since we are going to make the double
colon (or whatever the symbol) a general collection packing/unpacking
marker, we feel it makes sense to allow it to be used to
explicitly signify unpacking, even when as much is implied by the
syntax on the left hand side, to preserve symmetry with the syntax
inside function calls.

Summarizing, what this syntax achieves, in loose order of perceived
importance:
Simplicity: we have reduced a set of rather arbitrary rules concerning
the syntax and semantics of the asterisk (does it construct a list or
a tuple?) to a single general symbol: the double colon is the
collection packing/unpacking annotation symbol, and that is all there
is to know about it.
Readability: the proposed syntax reads like a book: args-list and
kwargs-dict, unlike the more cryptic asterisk syntax. We avoid extra
lines of code in the event another sequence or mapping type than the
one returned by default is required.
Efficiency: by declaring the desired collection type, it can be
constructed in the optimal way from the given input, rather than
requiring a conversion after the default collection type is
constructed.
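
To illustrate the kind of conversion the efficiency point refers to, here is a sketch in current Python. The first form builds the default list and then converts it; the second consumes an iterator directly into the wanted type. Whether the proposed syntax would actually be faster is not established here, and collections.deque is only an example target type.

```python
from collections import deque

sequence = range(5)

# Default packing, followed by a conversion to the wanted type.
head, *tail = sequence
tail_converted = deque(tail)

# Building the wanted type directly from an iterator,
# with no intermediate list.
it = iter(sequence)
head_direct = next(it)
tail_direct = deque(it)
```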

A double colon is suggested, since the single colon is already
taken by the function annotation syntax in Python 3. This is somewhat
unfortunate: programming should come before meta-programming, so it
should rather be the other way around. On the one hand, having both :
and :: as variable declaration annotation symbols is a nice
unification; on the other hand, a syntax more easily visually
distinguished from function annotations can be defended. For increased
backwards compatibility the asterisk could be used, but sandwiched
between two identifiers it looks like a multiplication. But many
other symbols would do, such as @ or !.

Steven D'Aprano 12-17-2011 05:00 PM

Re: Pythonification of the asterisk-based collection packing/unpacking syntax
 
On Sat, 17 Dec 2011 06:38:22 -0800, Eelco wrote:

> One can not state in a single line what the asterisk
> operator does;


Multiplication, exponentiation, sequence packing/unpacking, and varargs.


--
Steven

Roy Smith 12-17-2011 05:14 PM

Re: Pythonification of the asterisk-based collection packing/unpacking syntax
 
In article <4eeccabe$0$29979$c3e8da3$5496439d@news.astraweb.com>,
Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:

> On Sat, 17 Dec 2011 06:38:22 -0800, Eelco wrote:
>
> > One can not state in a single line what the asterisk
> > operator does;

>
> Multiplication, exponentiation, sequence packing/unpacking, and varargs.


Import wildcarding?

Chris Angelico 12-17-2011 05:18 PM

Re: Pythonification of the asterisk-based collection packing/unpacking syntax
 
On Sun, Dec 18, 2011 at 4:14 AM, Roy Smith <roy@panix.com> wrote:
> Import wildcarding?


That's not an operator, any more than it is when used in filename
globbing. The asterisk _character_ has many meanings beyond those of
the operators * and **.

ChrisA

Eelco 12-17-2011 08:11 PM

Re: Pythonification of the asterisk-based collection packing/unpacking syntax
 
On Dec 17, 6:18 pm, Chris Angelico <ros...@gmail.com> wrote:
> On Sun, Dec 18, 2011 at 4:14 AM, Roy Smith <r...@panix.com> wrote:
> > Import wildcarding?

>
> That's not an operator, any more than it is when used in filename
> globbing. The asterisk _character_ has many meanings beyond those of
> the operators * and **.
>
> ChrisA


To cut short this line of discussion; I meant the asterisk symbol
purely in the context of collection packing/unpacking. Of course it
has other uses too.

Even that single use requires a whole paragraph to explain completely;
when does it result in a tuple or a list, when is unpacking implicit
and when not, why * versus **, and so on.

Steven D'Aprano 12-17-2011 11:20 PM

Re: Pythonification of the asterisk-based collection packing/unpacking syntax
 
On Sat, 17 Dec 2011 12:11:04 -0800, Eelco wrote:

> > One can not state in a single line what the asterisk
> > operator does;

....
> To cut short this line of discussion; I meant the asterisk symbol purely
> in the context of collection packing/unpacking. Of course it has other
> uses too.
>
> Even that single use requires a whole paragraph to explain completely;
> when does it result in a tuple or a list, when is unpacking implicit and
> when not, why * versus **, and so on.


Do you think that this paragraph will become shorter if you change the
spelling * to something else?

It takes more than one line to explain list comprehensions, context
managers, iterators, range(), and import. Why should we care if * and **
also take more than one paragraph? Even if you could get it down to a
single line, what makes you think that such extreme brevity is a good
thing?

You might not be able to explain them in a single line, but you can
explain them pretty succinctly:

Varargs: Inside a function parameter list, * collects an arbitrary
number of positional arguments into a tuple. When calling functions,
* expands any iterator into positional arguments. In both cases, **
does the same thing for keyword arguments.

Extended iterator unpacking: On the left hand side of an assignment,
* collects multiple values from the right hand side into a list.


Let's see you do better with your suggested syntax. How concisely can you
explain the three functions?

Don't forget the new type coercions (not constraints, as you keep calling
them) you're introducing. It boggles my mind that you complain about the
complexity of existing functionality, and your solution involves
*increasing* the complexity with more functionality.



--
Steven

Steven D'Aprano 12-18-2011 12:59 AM

Re: Pythonification of the asterisk-based collection packing/unpacking syntax
 
On Sat, 17 Dec 2011 06:38:22 -0800, Eelco wrote:

> Type constraints:
>
> In case the asterisk is not used to signal unpacking, but rather to
> signal packing, its semantics is essentially that of a type constraint.


"Type constraint" normally refers to type restrictions on *input*: it is
a restriction on what types are accepted. When it refers to output, it is
not normally a restriction, therefore "constraint" is inappropriate.
Instead it is normally described as a coercion, cast or conversion.
Automatic type conversions are the opposite of a constraint: it is a
loosening of restrictions. "I don't have to use a list, I can use any
sequence or iterator".


In iterator unpacking, it is the *output* which is a list, not a
restriction on input: in the statement:

head, *tail = sequence

tail may not exist before the assignment, and so describing this as a
constraint on the type of tail is completely inappropriate.



> The statement:
>
> head, tail = sequence
>
> Signifies regular unpacking. However, if we add an asterisk, as in:
>
> head, *tail = sequence
>
> We demand that tail not be just any python object, but rather a list.


We don't demand anything, any more than when we say:

for x in range(1, 100):

we "demand" that x is not just any python object, but rather an int.

Rather, we accept what we're given: in case of range and the for loop, we
are given an int. In the case of extended tuple unpacking, we are given a
list.



> This changes the semantics from normal unpacking, to unpacking and then
> repacking all but the head into a list.


Aside: iterator unpacking is more general than just head/tail unpacking.

>>> a, b, *var, c, d, e = range(10)
>>> print(a, b, c, d, e, var)
0 1 7 8 9 [2, 3, 4, 5, 6]


You are jumping to conclusions about implementation details which aren't
supported by the visible behaviour. What evidence do you have that
iterator unpacking creates a tuple first and then converts it to a list?


> It may be somewhat counter-intuitive to think of this as a type
> constraint, since python is after all a weakly-typed language.


The usual test of a weakly-typed language is that "1"+1 succeeds (and
usually gives 2), as in Perl but not Python. I believe you are confusing
weak typing with dynamic typing, a common mistake.
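
A short sketch of that distinction in Python 3 follows. The variable names are illustrative only.

```python
# Dynamic typing: a name can be rebound to objects of different types.
value = 1
value = "one"

# No implicit str/int coercion: adding them raises TypeError
# rather than producing 2 or "11".
try:
    result = "1" + 1
except TypeError as exc:
    result = None
    error_name = type(exc).__name__

print(error_name)  # TypeError
```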


[...]
> The aim of this PEP, is that this type-constraint syntax is expanded
> upon. We should be careful here to distinguish with providing optional
> type constraints throughout python as a whole; this is not our aim.


Iterator unpacking is no more about type constraints than is len().



--
Steven

Chris Angelico 12-18-2011 02:45 AM

Re: Pythonification of the asterisk-based collection packing/unpacking syntax
 
On Sun, Dec 18, 2011 at 11:59 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> The usual test of a weakly-typed language is that "1"+1 succeeds (and
> usually gives 2), as in Perl but not Python. I believe you are confusing
> weak typing with dynamic typing, a common mistake.


I'd go stronger than "usually" there. If "1"+1 results in "11", then
that's not weak typing but rather a convenient syntax for
stringification - if every object can (or must) provide a to-string
method, and concatenating anything to a string causes it to be
stringified, then it's still strongly typed.

Or is a rich set of automated type-conversion functions evidence of
weak typing? And if so, then where is the line drawn - is upcasting of
int to float weak?

ChrisA

Evan Driscoll 12-18-2011 03:03 AM

Re: Pythonification of the asterisk-based collection packing/unpacking syntax
 
On 12/17/2011 20:45, Chris Angelico wrote:
> I'd go stronger than "usually" there. If "1"+1 results in "11", then
> that's not weak typing but rather a convenient syntax for
> stringification - if every object can (or must) provide a to-string
> method, and concatenating anything to a string causes it to be
> stringified, then it's still strongly typed.
>
> Or is a rich set of automated type-conversion functions evidence of
> weak typing? And if so, then where is the line drawn - is upcasting of
> int to float weak?
>
> ChrisA

Sorry, I just subscribed to the list so am stepping in mid-conversation,
but "strong" vs "weak" typing does not have a particularly well-defined
meaning. There are at least three very different definitions you'll find
people use which are almost pairwise orthogonal in theory, if less so in
practice. There's a great mail to a Perl mailing list I've seen [1]
where someone lists *eight* definitions (albeit with a couple pairs of
definitions that are only slightly different).

I like to use it in the "automated conversion" sense, because I feel
like the other possible definitions are covered by other terms
(static/dynamic, and safe/unsafe). And in that sense, I think that
thinking of languages as "strong" *or* "weak" is a misnomer; it's a
spectrum. (Actually even a spectrum is simplifying things -- it's more
like a partial order.)

Something like ML or Haskell, which does not even allow integer to
double promotions, is very strong typing. Something like Java, which
allows some arithmetic conversion and also automatic stringification (a
la "1" + 1) is somewhere in the middle of the spectrum. Personally I'd
put Python even weaker on account of things such as '[1,2]*2' and '1 <
True' being allowed, but on the other hand it doesn't allow "1"+1.
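
For reference, a small sketch of what those three expressions do in Python 3. The variable names are made up for illustration.

```python
repeated = [1, 2] * 2                # sequence repetition: [1, 2, 1, 2]
compared = 1 < True                  # allowed; False, because True == 1
bool_is_int = isinstance(True, int)  # bool is a subclass of int

# Mixing str and int in an addition is rejected.
try:
    "1" + 1
    mixed_add_allowed = True
except TypeError:
    mixed_add_allowed = False
```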

Evan

[1]
http://groups.google.com/group/comp....b5f256ea7bfadb
(though I don't think I've seen all of those)




Evan Driscoll 12-18-2011 03:08 AM

Re: Pythonification of the asterisk-based collection packing/unpacking syntax
 
On 12/17/2011 21:03, Evan Driscoll wrote:
> Personally I'd put Python even weaker on account of things such as
> '[1,2]*2' and '1 < True' being allowed, but on the other hand it
> doesn't allow "1"+1.


Not to mention duck typing, which under my definition I'd argue is
pretty much the weakest of typing that you can apply to structure-like
types which I can think of.

Evan





