dictionary initialization

 
 
Weiguang Shi
Guest
Posts: n/a
 
      11-25-2004
Hi,

With awk, I can do something like
$ echo 'hello' |awk '{a[$1]++}END{for(i in a)print i, a[i]}'

That is, a['hello'] was not there but allocated and initialized to
zero upon reference.

With Python, I got
>>> b={}
>>> b[1] = b[1] +1
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 1

That is, I have to initialize b[1] explicitly in the first place.

Personally, I think

a[i]++

in awk is much more elegant than

if i in a: a[i] += 1
else: a[i] = 1

I wonder how the latter is justified in Python.

Thanks,
Weiguang
 
 
 
 
 
Weiguang Shi
Guest
Posts: n/a
 
      11-25-2004
Hi,

In article <(E-Mail Removed)>, Caleb Hattingh wrote:
> ...
>Dict entries accessed with 'string' keys,

Not necessarily. And doesn't make a difference in my question.

> ...
>
>Which feature specifically do you want justification for?

Have it your way: string-indexed dictionaries.

>>> a={}
>>> a['1']+=1
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: '1'

a['1'], when it is referenced, is detected as non-existent but is not
automatically initialized so that it exists before 1 is added to its
value.

Weiguang
 
 
 
 
 
Bengt Richter
Guest
Posts: n/a
 
      11-25-2004
On Thu, 25 Nov 2004 18:38:17 +0000 (UTC), (E-Mail Removed) (Weiguang Shi) wrote:

>Hi,
>
>With awk, I can do something like
> $ echo 'hello' |awk '{a[$1]++}END{for(i in a)print i, a[i]}'
>
>That is, a['hello'] was not there but allocated and initialized to
>zero upon reference.
>
>With Python, I got
> >>> b={}
> >>> b[1] = b[1] +1
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> KeyError: 1
>
>That is, I have to initialize b[1] explicitly in the first place.
>
>Personally, I think
>
> a[i]++
>
>in awk is much more elegant than
>
> if i in a: a[i] += 1
> else: a[i] = 1
>
>I wonder how the latter is justified in Python.
>

You wrote it, so you have to "justify" it

While I agree that ++ and -- are handy abbreviations, and creating a key by default
makes for concise notation, a[i]++ means you have to make some narrow assumptions -- i.e.,
that you want to create a zero integer start value. You can certainly make a dict subclass
that behaves that way if you want it:

>>> class D(dict):
...     def __getitem__(self, i):
...         if i not in self: self[i] = 0
...         return dict.__getitem__(self, i)
...
>>> dink = D()
>>> dink
{}
>>> dink['a'] +=1
>>> dink
{'a': 1}
>>> dink['a'] +=1
>>> dink
{'a': 2}
>>> dink['b']
0
>>> dink['b']
0
>>> dink
{'a': 2, 'b': 0}
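
A variation on the subclass above, as a sketch in Python 3 syntax rather than the 2004 interpreter used in this thread: when a dict subclass defines a `__missing__` method, `dict` calls it for failed `d[key]` lookups, so `__getitem__` itself need not be overridden. The class name `ZeroDict` is made up for illustration.

```python
class ZeroDict(dict):
    """dict whose missing keys spring into existence with the value 0."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only when `key` is absent.
        self[key] = 0
        return 0


counts = ZeroDict()
counts['a'] += 1
counts['a'] += 1
looked_up = counts['b']      # a plain read also creates the key
print(counts)                # {'a': 2, 'b': 0}
print(counts.get('c'))       # None: .get() bypasses __missing__
```

As with the `__getitem__` version, merely reading a key is enough to insert it, which may or may not be what you want.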


Otherwise the usual ways are along the lines of

>>> d = {}
>>> d.setdefault('hello',[0])[0] += 1
>>> d
{'hello': [1]}
>>> d.setdefault('hello',[0])[0] += 1
>>> d
{'hello': [2]}

Or
>>> d['hi'] = d.get('hi', 0) + 1
>>> d
{'hi': 1, 'hello': [2]}
>>> d['hi'] = d.get('hi', 0) + 1
>>> d
{'hi': 2, 'hello': [2]}
>>> d['hi'] = d.get('hi', 0) + 1
>>> d
{'hi': 3, 'hello': [2]}

Or
>>> for x in xrange(3):
...     try: d['yo'] += 1
...     except KeyError: d['yo'] = 1
...     print d
...
{'hi': 3, 'hello': [2], 'yo': 1}
{'hi': 3, 'hello': [2], 'yo': 2}
{'hi': 3, 'hello': [2], 'yo': 3}
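
The standard library later gained ready-made versions of these idioms: `collections.defaultdict` (Python 2.5) and `collections.Counter` (Python 2.7 and 3.1). A minimal sketch in Python 3 syntax follows; the sample word list is invented for illustration.

```python
from collections import Counter, defaultdict

words = ['hello', 'hi', 'hello', 'yo', 'hi', 'hello']

# defaultdict(int) calls int() -> 0 for a missing key, so += 1 just works.
tally = defaultdict(int)
for w in words:
    tally[w] += 1

# Counter does the same counting in one call.
counted = Counter(words)

print(dict(tally))            # {'hello': 3, 'hi': 2, 'yo': 1}
print(counted['hello'])       # 3
print(counted['missing'])     # 0, and the key is not inserted
```

In both cases the zero start value is still chosen explicitly by the programmer, by picking `int` as the factory or by picking `Counter`.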

Regards,
Bengt Richter
 
 
Weiguang Shi
Guest
Posts: n/a
 
      11-25-2004
Hi,

In article <(E-Mail Removed)>, Bengt Richter wrote:
> On Thu, 25 Nov 2004 18:38:17 +0000 (UTC), (E-Mail Removed)
> (Weiguang Shi) wrote:
>You wrote it, so you have to "justify" it

I guess

>While I agree that ++ and -- are handy abbreviations, and creating a
>key by default makes for concise notation, a[i]++ means you have to
>make some narrow assumptions ...

Right, though generalization can be painful for the uninitiated/newbie.

>You can certainly make a dict subclass that behaves that way if you
>want it:
> ...

This is nice even for someone as hopelessly lazy as me.

>
>Otherwise the usual ways are along the lines of
>...

I would happily avoid them all.

Thanks a lot,
Weiguang
 
 
Dan Perl
Guest
Posts: n/a
 
      11-25-2004
I don't know awk, so I don't know how your awk statement works.

Even when it comes to the Python statements, I'm not sure exactly what the
design intentions were in this case, but I can see at least one
justification. Python being dynamically typed, b[1] can be of any type, so
you have to initialize b[1] to give it a type, and only then does adding
something to it make sense. Otherwise, since the 'add' operation is not
implemented for all types, 'b[1]+1' may not even be allowed.

You're saying that in awk a['hello'] is initialized to 0. That would not be
justified in python. The type of b[1] is undetermined until initialization
and I don't see why it should be an int by default.
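
To make the typing argument concrete, here is a small sketch in Python 3 syntax (the helper name `can_add_one` is made up for illustration). Whether `value + 1` is allowed depends entirely on what `value` already is, so no single default suits every dictionary.

```python
def can_add_one(value):
    """Return True if `value + 1` is a legal operation for this value."""
    try:
        value + 1
    except TypeError:
        return False
    return True


# Numbers accept + 1; strings, lists and None raise TypeError.
for candidate in (0, 2.5, 'text', [1, 2], None):
    print(repr(candidate), can_add_one(candidate))
```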

Dan
 
 
Weiguang Shi
Guest
Posts: n/a
 
      11-25-2004
In article <(E-Mail Removed)>, Dan Perl wrote:
>I don't know awk, so I don't know how your awk statement works.

It doesn't hurt to give it a try

>
>Even when it comes to the python statements, I'm not sure exactly what the
> ...

I see your point.

>
>You're saying that in awk a['hello'] is initialized to 0.

More than that; I said awk recognizes a['hello']++ as an
arithmetic operation, initializes a['hello'] to 0, and adds one to
it. (This is all a guess. I didn't implement gawk. But you see my point.)

> That would not be justified in python. The type of b[1] is
> undetermined until initialization and I don't see why it should be
> an int by default.

In my example, it was b[1]+=1. "+=1" should at least tell Python two
things: this is an add operation and one of the operands is an
integer. Based on these, shouldn't Python be able to insert the pair
"1:0" into b before doing the increment?

Weiguang
 
 
Weiguang Shi
Guest
Posts: n/a
 
      11-25-2004
Hi,

In article <(E-Mail Removed)>, Caleb Hattingh wrote:
> ...
> ***
> # You *must* use a={}, just start as below
> '>>> a={}

Yeah I know. I can live with that.

> '>>> a['1']=0
> '>>> a['1']+=1

Right here. You have to say a['1'] = 0 before you can say a['1'] +=1.
Python does not do the former for you. That's what I'm asking a
justification for.

Regards,
Weiguang
 
 
Peter Hansen
Guest
Posts: n/a
 
      11-25-2004
Weiguang Shi wrote:
> In article <(E-Mail Removed)>, Dan Perl wrote:
>>That would not be justified in python. The type of b[1] is
>>undetermined until initialization and I don't see why it should be
>>an int by default.

>
> In my example, it was b[1]+=1. "+=1" should at least tell Python two
> things: this is an add operation and one of the operands is an
> integer.


Why would it tell Python that?

>>> b = {1: 2.5}
>>> b[1] += 1
>>> b
{1: 3.5}

So at this point, it can clearly be either an integer or
a float. Doubtless it could also be an object which
overloads the += operator with integer arguments, though
what it might actually do is anyone's guess.
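
A sketch of the last case mentioned above, in Python 3 syntax. The class `Log` is hypothetical: it overloads `+=` via `__iadd__` so that "adding" an integer records it in a list instead of doing arithmetic.

```python
class Log:
    """Hypothetical type whose += records operands rather than summing."""

    def __init__(self):
        self.entries = []

    def __iadd__(self, other):
        self.entries.append(other)
        return self          # += rebinds the target to this return value


b = {1: 2.5, 2: 7, 3: Log()}
for key in b:
    b[key] += 1              # same statement, three different meanings

print(b[1], b[2], b[3].entries)   # 3.5 8 [1]
```

Nothing in the statement `b[key] += 1` itself says which of these behaviours is intended; only the value already stored under the key does.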

-Peter
 
 
Berthold Höllmann
Guest
Posts: n/a
 
      11-25-2004
(E-Mail Removed) (Weiguang Shi) writes:

> Hi,
>
> With awk, I can do something like
> $ echo 'hello' |awk '{a[$1]++}END{for(i in a)print i, a[i]}'
>
> That is, a['hello'] was not there but allocated and initialized to
> zero upon reference.
>
> With Python, I got
> >>> b={}
> >>> b[1] = b[1] +1
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> KeyError: 1
>
> That is, I have to initialize b[1] explicitly in the first place.
>
> Personally, I think
>
> a[i]++
>
> in awk is much more elegant than
>
> if i in a: a[i] += 1
> else: a[i] = 1
>
> I wonder how the latter is justified in Python.


It isn't

>>> a={}
>>> a[1] = a.get(1, 0) + 1
>>> a
{1: 1}
>>> a[1] = a.get(1, 0) + 1
>>> a
{1: 2}
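
Applied to the awk one-liner from the original post, the same `get` idiom can be sketched as follows (Python 3 syntax; the function name `count_first_fields` and the sample input are assumptions for illustration). Like `a[$1]++`, it counts occurrences of the first whitespace-separated field of each line.

```python
def count_first_fields(lines):
    """Count how often each first field occurs, like awk's a[$1]++."""
    counts = {}
    for line in lines:
        fields = line.split()
        if not fields:
            # awk would count these under the empty key; this sketch skips them.
            continue
        key = fields[0]
        counts[key] = counts.get(key, 0) + 1
    return counts


result = count_first_fields(['hello', 'hello world', 'hi there'])
for key, n in result.items():
    print(key, n)
```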

Regards
Berthold
--
(E-Mail Removed) / <http://höllmanns.de/>
(E-Mail Removed) / <http://starship.python.net/crew/bhoel/>
 
 
Josiah Carlson
Guest
Posts: n/a
 
      11-25-2004

(E-Mail Removed) (Weiguang Shi) wrote:
>
> In article <(E-Mail Removed)>, Dan Perl wrote:
> >I don't know awk, so I don't know how your awk statement works.

> It doesn't hurt to give it a try
>
> >
> >Even when it comes to the python statements, I'm not sure exactly what the
> > ...

> I see your point.
>
> >
> >You're saying that in awk a['hello'] is initialized to 0.

> More than that; I said awk recognizes a['hello']++ as an
> arithmetic operation, initializes a['hello'] to 0, and adds one to
> it. (This is all a guess. I didn't implement gawk. But you see my point.)
>
> > That would not be justified in python. The type of b[1] is
> > undetermined until initialization and I don't see why it should be
> > an int by default.

> In my example, it was b[1]+=1. "+=1" should at least tell Python two
> things: this is an add operation and one of the operands is an
> integer. Based on these, shouldn't Python be able to insert the pair
> "1:0" into b before doing the increment?


As Peter has already mentioned, since b[1] doesn't exist until you
assign it, the type of b[1] is ambiguous.

The reason Python doesn't do automatic assignments on unknown access is
due to a few Python 'Zens'

>>> import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Specifically:
Explicit is better than implicit.
(you should assign what you want, not expect Python to know what you
want)
Special cases aren't special enough to break the rules.
(incrementing nonexistent values in a dictionary shouldn't be any
different from accessing nonexistent values)
In the face of ambiguity, refuse the temptation to guess.
(what class/value should the nonexistent value initialize to?)


Learn the zens. Any time you have a design question about Python,
check the zens, then check google, then check here.
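
The "Errors should never pass silently" point can be illustrated with a sketch in Python 3 syntax (`collections.defaultdict` postdates this thread, and the key names are invented). With automatic initialization, a misspelled key quietly becomes a new entry; with a plain dict the same typo raises `KeyError` immediately.

```python
from collections import defaultdict

auto = defaultdict(int)
auto['apples'] += 1
auto['aples'] += 1           # typo: silently creates a second key

plain = {'apples': 1}
try:
    plain['aples'] += 1      # same typo: fails loudly
    typo_caught = False
except KeyError:
    typo_caught = True

print(sorted(auto))          # ['aples', 'apples']
print(typo_caught)           # True
```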

- Josiah

 
 
 
 