Multi-dimensional list initialization

 
 
Steven D'Aprano
      11-07-2012
On Tue, 06 Nov 2012 14:41:24 -0800, Andrew Robinson wrote:

> Yes. But this isn't going to cost any more time than figuring out
> whether or not the list multiplication is going to cause quirks, itself.
> Human psychology *tends* (it's a FAQ!) to automatically assume the
> purpose of the list multiplication is to pre-allocate memory for the
> equivalent (using lists) of a multi-dimensional array. Note the OP even
> said "4d array".


I'm not entirely sure what your point is here. The OP screwed up -- he
didn't generate a 4-dimensional array. He generated a 2-dimensional
array. If his intuition about the number of dimensions is so poor, why
should his intuition about list multiplication be treated as sacrosanct?

As they say, the only truly intuitive interface is the nipple. There are
many places where people's intuition about programming fails. And many
places where Fred's intuition is the opposite of Barney's intuition.

Even more exciting, there are places where people's intuition is
*inconsistent*, where they expect a line of code to behave differently
depending on their intention, rather than on the code. And intuition is
often sub-optimal: e.g. isn't it intuitively obvious that "42" + 1 should
give 43? (Unless it is intuitively obvious that it should give 421.)
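
(For reference, a minimal sketch assuming Python 3: the language picks neither answer and raises an error instead, so the programmer has to state which meaning is intended. The variable names are illustrative only.)

```python
# Python refuses to guess between the two "intuitive" answers.
raised = False
try:
    "42" + 1
except TypeError:
    raised = True

# The intended meaning has to be spelled out explicitly.
as_number = int("42") + 1      # arithmetic
as_text = "42" + str(1)        # concatenation
print(raised, as_number, as_text)
```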

So while I prefer intuitively obvious behaviour where possible, it is not
the holy grail, and I am quite happy to give it up.


> The OP's original construction was simple, elegant, easy to read and
> very commonly done by newbies learning the language because it's
> *intuitive*. His second try was still intuitive, but less easy to read,
> and not as elegant.


Yes. And list multiplication is one of those areas where intuition is
suboptimal -- it produces a worse outcome overall, even if one minor use-
case gets a better outcome.

I'm not disputing that [[0]*n]*m is intuitively obvious and easy. I'm
disputing that this matters. Python would be worse off if list
multiplication behaved intuitively.

An analogy: the intuitively obvious thing to do with a screw is to bang
it in with a hammer. It's long, thin, has a point at the end, and a flat
head that just screams "hit me". But if you do the intuitive thing, your
carpentry will be *much worse* than the alternatives -- a hammered in
screw holds much less strongly than either a nail or a screwed in screw.
The surface area available for gripping is about 2% compared to a nail
and about 0.01% compared to a screw used correctly.

Having list multiplication copy has consequences beyond 2D arrays. Those
consequences make the intuitive behaviour you are requesting a negative
rather than a positive. If that means that newbie programmers have to
learn not to hammer screws in, so be it. It might be harder, slower, and
less elegant to drill a pilot hole and then screw the screw in, but the
overall result is better.


>> * Consistency of semantics is better than a plethora of special
>> cases. Python has a very simple and useful rule: objects should not
>> be copied unless explicitly requested to be copied. This is much
>> better than having to remember whether this operation or that
>> operation makes a copy. The answer is consistent:

>
> Bull. Even in the last thread I noted the range() object produces
> special cases.
> >>> range(0,5)[1]
> 1
> >>> range(0,5)[1:3]
> range(1, 3)


What's the special case here? What do you think is copied?

You take a slice of a tuple, you get a new tuple.

You take a slice of a list, you get a new list.

You take a slice of a range object, you get a new range object.

I'm honestly not getting what you think is inconsistent about this.
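
The three statements above can be checked directly. This is a small sketch assuming Python 3, where range() returns a range object; the names samples and slice_types_match are made up for the illustration.

```python
# Slicing any of these sequences yields a new object of the same type.
samples = [(0, 1, 2, 3, 4), [0, 1, 2, 3, 4], range(0, 5)]

for seq in samples:
    piece = seq[1:3]
    print(type(seq).__name__, "->", repr(piece))

slice_types_match = all(type(s[1:3]) is type(s) for s in samples)
```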



> The principle involved is that it gives you what you *usually* want;


Who is the "you" that decides what "you" usually want? And how do they
know what is "usual"?

Two-dimensional arrays in Python using lists are quite rare. Anyone who
is doing serious numeric work where they need 2D arrays is using numpy,
not lists. There are millions of people using Python, so it's hardly
surprising that once or twice a year some newbie trips over this. But
it's not something that people tend to trip over again and again and
again, like C's "assignment is an expression" misfeature.


> I read some of the documentation on why Python 3 chose to implement it
> this way.


What documentation is this? Because this is a design decision that goes
all the way back to at least Python 1.5:

[steve@ando ~]$ python1.5
Python 1.5.2 (#1, Aug 27 2012, 09:09:1 [GCC 4.1.2 20080704 (Red Hat
4.1.2-52)] on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> x = [[0]*5]*3
>>> x[0][1] = 99
>>> x
[[0, 99, 0, 0, 0], [0, 99, 0, 0, 0], [0, 99, 0, 0, 0]]


So I expect the design decision for Python 3 was "we made the right
decision before, there's no need to change it".



>> (pardon me for belabouring the point here)
>>
>> Q: Does [0]*10 make ten copies of the integer object? A: No, list
>> multiplication doesn't make copies of elements.

>
> Neither would my idea for the vast majority of things on your first
> list.


Um, yes? The point is that "vast majority" is not "everything". Hence,
your suggested behaviour is inconsistent.
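
The "no copies" rule is easy to confirm with identity checks. A minimal sketch, assuming Python 3 and using a plain object() as a stand-in for any element type:

```python
marker = object()          # any object at all, mutable or not
row = [marker] * 10

# Ten references, one object.
same_object = all(item is marker for item in row)
distinct_ids = len({id(item) for item in row})
print(same_object, distinct_ids)
```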



> Q: What about [[]]*10?
> A: No, the elements are never copied.
>
> YES! For the obvious reason that such a construction is making mutable
> lists that the user wants to populate later. If they *didn't* want to
> populate them later, they ought to have used tuples -- which take less
> overhead. Who even does this thing you are suggesting?!


Who knows? Who cares? Nobody does:

n -= n

instead of just n=0, but that doesn't mean that we should give it some
sort of special meaning different from n -= m. If it turns out that the
definition of list multiplication is such that NOBODY, EVER, uses [[]]*n,
that is *still* not a good reason for special-casing it. All it means is
that this will be a less-obscure example of the billions of things which
can be done in Python but nobody wants to.

You have quoted from the Zen of Python a few times in this post. Perhaps
you missed one of the most critical ones?

Special cases aren't special enough to break the rules.

There are perfectly good ways to generate a 2D array out of lists, and
even better reasons not to use lists for that in the first place. (Numpy
arrays are much better suited for serious work.)
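
One of those ways, sketched here for Python 3, is a list comprehension that builds a fresh inner list per row. The names rows, cols, shared and independent are illustrative only.

```python
rows, cols = 3, 5

shared = [[0] * cols] * rows                      # one inner list, referenced 3 times
independent = [[0] * cols for _ in range(rows)]   # a fresh inner list per row

shared[0][1] = 99
independent[0][1] = 99

print(shared)       # every row shows the 99
print(independent)  # only the first row shows the 99
```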


>> Q: What about other mutable objects like sets or dicts? A: No, the
>> elements are never copied.

>
> They aren't list multiplication compatible in any event! It's a total
> nonsense objection.


I'm afraid you've just lost an awful lot of credibility there.

py> x = [{}]*5
py> x
[{}, {}, {}, {}, {}]
py> x[0]['key'] = 1
py> x
[{'key': 1}, {'key': 1}, {'key': 1}, {'key': 1}, {'key': 1}]

And similarly for any other mutable object.

If you don't understand that lists can contain other mutable objects
apart from lists, then you really shouldn't be discussing this issue.


>> Your proposal throws away consistency for a trivial benefit on a rare
>> use-case, and replaces it with a bunch of special cases:

>
> RARE!!!! You are NUTS!!!!


Yes, rare. I base that on about 15 years of Python coding and many
thousands (tens of thousands?) of hours on Python forums like this one.
What's your opinion based on?

List multiplication is rare enough, but when it is used, it is usually
used to generate a 1D array like this:

values = [None]*n # or 0 is another popular starting value

Using it twice to generate a 2D array is even rarer.


>> Q: How about if I use delegation to proxy a list? A: Oh no, they
>> definitely won't be copied.

>
> Give an example usage of why someone would want to do this. Then we can
> discuss it.


Proxying objects is hardly a rare scenario. Delegation is less common
since you can subclass built-ins, but it is still used. It is a standard
design pattern.
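
As a rough illustration of what delegation looks like, here is a hypothetical ListProxy class (not from any library, written for Python 3). List multiplication has no way to know the proxy wraps a list, so it shares the proxy like any other object.

```python
class ListProxy:
    """Hypothetical delegating wrapper around a list (illustration only)."""

    def __init__(self, target=None):
        self._target = [] if target is None else target

    def __getattr__(self, name):
        # Called only when normal lookup fails; forward to the wrapped list.
        return getattr(self._target, name)

    # Special methods are looked up on the type, so they need explicit forwarding.
    def __len__(self):
        return len(self._target)

    def __getitem__(self, index):
        return self._target[index]


proxies = [ListProxy()] * 3
proxies[0].append("x")      # append is delegated to the wrapped list

print([len(p) for p in proxies])
```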


>> Losing consistency in favour of saving a few characters for something
>> as uncommon as list multiplication is a poor tradeoff. That's why this
>> proposal has been rejected again and again and again every time it has
>> been suggested.

>
> Please link to the objection being proposed to the developers, and their
> reasoning for rejecting it.
> I think you are exaggerating.


Python is a twenty year old language. Do you really think this is the
first time somebody has noticed it?

It's hard to search for discussions on the dev list, because the obvious
search terms bring up many false positives. But here are a couple of bug
reports closed as "won't fix":

http://bugs.python.org/issue1408
http://bugs.python.org/issue12597

I suspect it is long past time for a PEP so this can be rejected once and
for all.


>> List multiplication [x]*n is conceptually equivalent to: <snip>
>> This is nice and simple and efficient.

> No it isn't efficient. It's *slow* when done as in your example.


Well of course it is slow*er* when you move it from low-level C to high
level Python, but it is still fast.
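
The snipped code is not reproduced here. Purely as an assumption about what a pure-Python stand-in could look like (Python 3, hypothetical name repeat_list), the following sketch shows that repetition only needs to append references:

```python
def repeat_list(items, n):
    """Hypothetical pure-Python stand-in for items * n (not the snipped code)."""
    result = []
    for _ in range(n):
        result.extend(items)   # appends references; nothing is copied
    return result


inner = []
built = repeat_list([inner], 3)
print(all(element is inner for element in built))
```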

>> Copying other objects is slow and inefficient. Keeping list
>> multiplication consistent, and fast, is MUCH more important than making
>> it work as expected for the rare case of 2D arrays:

>
> I don't think so -- again, look at range(); it was made to work
> inconsistent for a "common" case.


You mentioned range before, but it isn't clear to me what you think is
inconsistent about it.


> Besides, 2D arrays are *not* rare and people *have* to copy internals of
> them very often.


So you say.


> The copy speed will be the same or *faster*, and the typing less -- and
> the psychological mistakes *less*, the elegance more.


You think that it is *faster* to copy a list than to make a new pointer
to it? Your credibility is not looking too good here.


> It's hardly going to confuse anyone to say that lists are copied with
> list multiplication, but the elements are not.


Well, that confuses me. What about a list where the elements are lists?
Are they copied?

What about other mutable objects? Are they copied?

What about mutable objects which are uncopyable, like file objects?
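
To make the file-object case concrete, here is a small sketch assuming Python 3. It opens os.devnull only so that the example has a real file object to work with. Sharing the reference works, while copy.deepcopy() refuses.

```python
import copy
import os

with open(os.devnull) as handle:
    holders = [handle] * 3          # fine: three references to one file object
    try:
        copy.deepcopy(handle)       # an element-copying multiply would need this
        copy_failed = False
    except TypeError:
        copy_failed = True

print(holders[0] is holders[2], copy_failed)
```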


> Every time someone passes a list to a function, they *know* that the
> list is passed by value -- and the elements are passed by reference.


And there goes the last of your credibility. *You* might "know" this, but
that doesn't make it so.

Python does not use either call-by-value or call-by-reference, and it
certainly doesn't use different calling conventions for different
arguments or parts of arguments. Everything is passed using the same
calling convention. Start here:

http://mail.python.org/pipermail/tut...er/080505.html
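
A short sketch of the single calling convention in action, assuming Python 3. The function names mutate and rebind are invented for the example.

```python
def mutate(items):
    items.append("added")      # changes the object the caller also sees

def rebind(items):
    items = ["replaced"]       # rebinds the local name only
    return items

data = ["original"]
mutate(data)
after_mutate = list(data)

rebind(data)
after_rebind = list(data)

print(after_mutate, after_rebind)
```

If lists were passed by value, the append would not be visible to the caller. If they were passed by reference, the rebinding would be.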


> People in Python are USED to lists being "the" way to weird behavior
> that other languages don't do.


Python's calling behaviour is identical to that used by languages
including Java (excluding unboxed primitives) and Ruby, to mention only
two.

You're starting to shout and yell, so perhaps it's best if I finish this
here.


--
Steven
 
 
rusi
      11-07-2012
On Nov 7, 5:26 am, MRAB <(E-Mail Removed)> wrote:
> I prefer the term "reference semantics".


Ha! That hits the nail on the head.

To go back to the OP:

On Nov 5, 11:28 am, Demian Brecht <(E-Mail Removed)> wrote:
> So, here I was thinking "oh, this is a nice, easy way to initialize a 4D matrix" (running 2.7.3, non-core libs not allowed):
>
> m = [[None] * 4] * 4
>
> The way to get what I was after was:
>
> m = [[None] * 4, [None] * 4, [None] * 4, [None] * 4]
>
> (Obviously, I could have just hardcoded the initialization, but I'm too lazy to type all that out.)
>
> The behaviour I encountered seems a little contradictory to me. [None] * 4 creates four distinct elements in a single array while [[None] * 4] * 4 creates one distinct array of four distinct elements, with three references to it:
>
> >>> a = [None] * 4
> >>> a[0] = 'a'
> >>> a
> ['a', None, None, None]
>
> >>> m = [[None] * 4] * 4
> >>> m[0][0] = 'm'
> >>> m
> [['m', None, None, None], ['m', None, None, None], ['m', None, None, None], ['m', None, None, None]]
>
> Is this expected behaviour and if so, why? In my mind either result makes sense, but the inconsistency is what throws me off.
>


m=[[None] * 2] * 3

is the same as

m=[[None]*2, [None]*2, [None]*2]

until one starts doing things like

m[0][0] = 'm'

So don't do it!

And to get Python to help you by saying the same thing that I am saying, do
m=((None) * 2) * 3
(well, almost... it's a bit more messy in practice)
m=(((None,) * 2),)*3

After that try assigning to m[0][0] and python will kindly say NO!
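
Spelled out as a runnable sketch (Python 3 assumed; the flag name assigned is just for illustration), the tuple version still shares its rows but rejects the assignment:

```python
m = ((None,) * 2,) * 3      # same sharing as the list version, but immutable

try:
    m[0][0] = 'm'
    assigned = True
except TypeError:
    assigned = False

print(m, assigned)
```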

tl;dr version:
reference semantics is ok
assignment is ok (well up to a point)
assignment + reference semantics is not
 
 
Steven D'Aprano
      11-07-2012
On Wed, 07 Nov 2012 00:23:44 +0000, MRAB wrote:

>> Incorrect. Python uses what is commonly known as call-by-object, not
>> call-by-value or call-by-reference. Passing the list by value would
>> imply that the list is copied, and that appends or removes to the list
>> inside the function would not affect the original list. This is not
>> what Python does; the list inside the function and the list passed in
>> are the same list. At the same time, the function does not have access
>> to the original reference to the list and cannot reassign it by
>> reassigning its own reference, so it is not call-by-reference semantics
>> either.
>>

> I prefer the term "reference semantics".



Oh good, because what the world needs is yet another name for the same
behaviour.

- call by sharing
- call by object sharing
- call by object reference
- call by object
- call by value, where "values" are references
(according to the Java community)
- call by reference, where "references" refer to objects, not variables
(according to the Ruby community)
- reference semantics


Anything else?

http://en.wikipedia.org/wiki/Evaluat...all_by_sharing




--
Steven
 
 
Roy Smith
      11-07-2012
In article <5099ec1d$0$21759$c3e8da3$(E-Mail Removed) om>,
Steven D'Aprano <(E-Mail Removed)> wrote:

> On Wed, 07 Nov 2012 00:23:44 +0000, MRAB wrote:
>
> >> Incorrect. Python uses what is commonly known as call-by-object, not
> >> call-by-value or call-by-reference. Passing the list by value would
> >> imply that the list is copied, and that appends or removes to the list
> >> inside the function would not affect the original list. This is not
> >> what Python does; the list inside the function and the list passed in
> >> are the same list. At the same time, the function does not have access
> >> to the original reference to the list and cannot reassign it by
> >> reassigning its own reference, so it is not call-by-reference semantics
> >> either.
> >>

> > I prefer the term "reference semantics".

>
>
> Oh good, because what the world needs is yet another name for the same
> behaviour.
>
> - call by sharing
> - call by object sharing
> - call by object reference
> - call by object
> - call by value, where "values" are references
> (according to the Java community)
> - call by reference, where "references" refer to objects, not variables
> (according to the Ruby community)
> - reference semantics
>
>
> Anything else?
>
> http://en.wikipedia.org/wiki/Evaluat...all_by_sharing


Call by social network? The called function likes the object.
Depending on how it feels, it can also comment on some of the object's
attributes.
 
 
Gregory Ewing
      11-07-2012
Roy Smith wrote:
> Call by social network? The called function likes the object.
> Depending on how it feels, it can also comment on some of the object's
> attributes.


And then finds that it has inadvertently shared all its
private data with other functions accessing the object.

--
Greg
 
 
Gregory Ewing
      11-07-2012
If anything is to be done in this area, it would be better
as an extension of list comprehensions, e.g.

[[None times 5] times 10]

which would be equivalent to

[[None for _i in xrange(5)] for _j in xrange(10)]

--
Greg
 
 
Jussi Piitulainen
      11-07-2012
Steven D'Aprano writes:
> On Wed, 07 Nov 2012 00:23:44 +0000, MRAB wrote:


> > I prefer the term "reference semantics".

>
> Oh good, because what the world needs is yet another name for the
> same behaviour.
>
> - call by sharing
> - call by object sharing
> - call by object reference
> - call by object
> - call by value, where "values" are references
> (according to the Java community)
> - call by reference, where "references" refer to objects, not variables
> (according to the Ruby community)
> - reference semantics
>
> Anything else?
>
> http://en.wikipedia.org/wiki/Evaluat...all_by_sharing


Something else:

There's a call-by-* versus pass-by-* distinction, where the call-by-*
would be rather different from any of the above:

- call-by-value is what most languages now use: argument expressions
are reduced to values before they are passed to the function /
procedure / method / whatever.

- call-by-name was something Algol 60 had by default: something like
evaluating the argument expression every time its value is needed

- call-by-need: argument expression is reduced to a value the first
time its value is needed (if ever)

- call-by-lazy (increasingly silly terminology, and I don't quite have
an idea what it means in contrast to call-by-need)

The modern confusions would then be mostly over the pass-by-* family,
invariably using call-by-value in the above sense. The terminology for
these tends to produce more heat than light, but I think the relevant
distinctions are mostly just these:

- can one modify the argument effectively [Python: yes]

- can one modify the parameter with abandon [Python: don't]

- can one swap [Python: no]

- possibly: is it expensive to pass large objects? [Python: no]

The actual rule in Scheme, Java, and Python is the same simple and
sane rule: what are passed are values (argument expressions are fully
evaluated before the actual call takes place), parameter passing does
not involve any (observable) copying, and the arguments are bound to
fresh variables (no aliasing of variables).
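
Two of the distinctions listed above ("can one swap" and "is it expensive to pass large objects") can be probed with a short sketch, assuming Python 3. The helper names swap and same_object are hypothetical.

```python
def swap(a, b):
    a, b = b, a                # swaps the local names only

def same_object(param, expected_id):
    return id(param) == expected_id   # no copy was made on the way in

x, y = [1], [2]
swap(x, y)
swap_visible = (x, y) == ([2], [1])

big = list(range(1000))
passed_without_copy = same_object(big, id(big))
print(swap_visible, passed_without_copy)
```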

Different communities use different words. Sometimes they use the same
words about different things, resulting in more heat than light.

(I'd have a few more things in the something-else department, but this
is already much longer than I thought. Ends.)
 
 
wxjmfauth@gmail.com
      11-07-2012
On Wednesday, 7 November 2012 02:55:10 UTC+1, Steven D'Aprano wrote:

> Two-dimensional arrays in Python using lists are quite rare. Anyone who
> is doing serious numeric work where they need 2D arrays is using numpy,
> not lists. There are millions of people using Python, so it's hardly
> surprising that once or twice a year some newbie trips over this. But
> it's not something that people tend to trip over again and again and
> again, like C's "assignment is an expression" misfeature.


--------------------


>>> from vecmat6 import *
>>> from vmio5 import *

Traceback (most recent call last):
File "<eta last command>", line 1, in <module>
ImportError: No module named vmio5
>>> from vmio6 import *
>>> from svdecomp6 import *
>>> mm = NewMat(3, 3)
>>> mm

[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
>>> mm[0][0] = 1.0; mm[0][1] = 2.0; mm[0][2] = 3.0
>>> mm[1][0] = 11.0; mm[1][1] = 12.0; mm[1][2] = 13.0
>>> mm[2][0] = 21.0; mm[2][1] = 22.0; mm[2][2] = 23.0
>>> pr(mm, 'mm=')

mm=
( 1.00000e+000 2.00000e+000 3.00000e+000 )
( 1.10000e+001 1.20000e+001 1.30000e+001 )
( 2.10000e+001 2.20000e+001 2.30000e+001 )
>>> aa, b, cc = SVDecomp(mm)
>>> pr(aa, 'aa=')

aa=
( -8.08925e-002 -9.09280e-001 4.08248e-001 )
( -4.77811e-001 -3.24083e-001 -8.16497e-001 )
( -8.74730e-001 2.61114e-001 4.08248e-001 )
>>> pr(b, 'b=')

b=
( 4.35902e+001 1.37646e+000 1.93953e-016 )
>>> pr(cc, 'cc=')

cc=
( -5.43841e-001 7.33192e-001 4.08248e-001 )
( -5.76726e-001 2.68499e-002 -8.16497e-001 )
( -6.09610e-001 -6.79492e-001 4.08248e-001 )
>>> bb = VecToDiagMat(b)
>>> cct = TransposeMat(cc)
>>> oo = MatMulMatMulMat(aa, bb, cct)
>>> pr(oo, 'aa * bb * cct=')

aa * bb * cct=
( 1.00000e+000 2.00000e+000 3.00000e+000 )
( 1.10000e+001 1.20000e+001 1.30000e+001 )
( 2.10000e+001 2.20000e+001 2.30000e+001 )
>>>
>>> # or
>>> oo

[[0.9999999999999991, 1.9999999999999993, 2.9999999999999982],
[10.999999999999995, 11.99999999999999, 12.999999999999996],
[20.999999999999986, 21.999999999999975, 22.999999999999986]]



jmf

 
 
Ethan Furman
      11-07-2012
Oscar Benjamin wrote:
> On Nov 7, 2012 5:41 AM, "Gregory Ewing" <(E-Mail Removed)
> <(E-Mail Removed)>> wrote:
> >
> > If anything is to be done in this area, it would be better
> > as an extension of list comprehensions, e.g.
> >
> > [[None times 5] times 10]
> >
> > which would be equivalent to
> >
> > [[None for _i in xrange(5)] for _j in xrange(10)]

>
> I think you're right that the meaning of list-int multiplication
> can't/shouldn't be changed in this way.
>
> A multidimensional list comprehension would be useful even for people
> who are using numpy as it's common to use a list comprehension to
> initialise a numpy array.
>
> A more modest addition for the limited case described in this thread
> could be to use exponentiation:
>
> >>> [0] ** (2, 3)
> [[0, 0, 0], [0, 0, 0]]


What would happen with

--> [{}] ** (2, 3)

or

--> [my_custom_container()] ** (2, 3)

?
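
One way to sidestep the question, offered as an assumption rather than anything proposed in the thread, is to pass a factory callable instead of a prototype element. Then nothing ever needs to be copied. The helper name nested is hypothetical (Python 3):

```python
def nested(shape, factory):
    """Hypothetical helper: nested lists of the given shape, one factory() call per cell."""
    if not shape:
        return factory()
    return [nested(shape[1:], factory) for _ in range(shape[0])]


grid = nested((2, 3), dict)      # six distinct dicts
grid[0][0]['key'] = 1
print(grid)
```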

~Ethan~
 
 
Ethan Furman
      11-07-2012
After this post the only credibility you have left (with me, anyway) is that you seem to be willing
to learn. So learn the way Python works before you try to reimplement it.

~Ethan~
 