Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > How much slower is dict indexing vs. list indexing?

Reply
Thread Tools

How much slower is dict indexing vs. list indexing?

 
 
Emin
Guest
Posts: n/a
 
      01-11-2007
Dear Experts,

How much slower is dict indexing vs. list indexing (or indexing into a
numpy array)? I realize that looking up a value in a dict should be
constant time, but does anyone have a sense of what the overhead will
be in doing a dict lookup vs. indexing into a list? Some ad hoc tests
I've done indicate that the overhead is less than 15% (i.e., dict
lookups seem to take no more than 15% longer than indexing into a list
and there doesn't seem to be much difference in indexing into a list
vs. a numpy array).

The reason I ask is because I'm wondering how hard I should work to
compute and store an index into a list to avoid repeated hash table
lookups. From my tests, it looks like the answer is basically "don't
bother". Does anyone have information, thoughts, or comments on this?

Thanks,
-Emin

 
Reply With Quote
 
 
 
 
Steve Holden
Guest
Posts: n/a
 
      01-11-2007
Emin wrote:
> Dear Experts,
>
> How much slower is dict indexing vs. list indexing (or indexing into a
> numpy array)? I realize that looking up a value in a dict should be
> constant time, but does anyone have a sense of what the overhead will
> be in doing a dict lookup vs. indexing into a list? Some ad hoc tests
> I've done indicate that the overhead is less than 15% (i.e., dict
> lookups seem to take no more than 15% longer than indexing into a list
> and there doesn't seem to be much difference in indexing into a list
> vs. a numpy array).
>
> The reason I ask is because I'm wondering how hard I should work to
> compute and store an index into a list to avoid repeated hash table
> lookups. From my tests, it looks like the answer is basically "don't
> bother". Does anyone have information, thoughts, or comments on this?
>

I think "don't bother" is a good conclusion. Tim Peters has led the
charge to (as he might put it) "optimize the snot" out of the dict
implementation. Anything you do in Python to speed up access is likely
to add more to execution time than you might save by not exercising the
C-based dict lookup code.

What technique were you thinking of to look up the cached index values
in Python, just as a matter of curiosity? Storing them in a dict? It
would be hard to think of a faster way ...

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com

 
Reply With Quote
 
 
 
 
Emin
Guest
Posts: n/a
 
      01-12-2007
On Jan 11, 5:53 pm, Steve Holden <s...@holdenweb.com> wrote:
>
> What technique were you thinking of to look up the cached index values
> in Python, just as a matter of curiosity? Storing them in a dict? It
> would be hard to think of a faster way ...


I didn't have anything fancy in mind. I was just wondering whether it
makes sense to replace a block of code like

data = {'a' : 1, 'b' :2, 'c' : 3}
for i in someLargeList:
for name in ['a','b','c']:
result.append(data[name])

with somthing like

data = {'a' : 1, 'b' :2, 'c' : 3}
dataValues = [data[k] for k in ['a','b','c']
for i in someLargeList:
for index in [0,1,2]:
result.append(dataValues[index])

It sounds like it's probably not worth the effort in general, but might
be for extremely ultra-critical parts of code.

Thanks

 
Reply With Quote
 
skip@pobox.com
Guest
Posts: n/a
 
      01-12-2007

Emin> It sounds like it's probably not worth the effort in general, but
Emin> might be for extremely ultra-critical parts of code.

Even in extremely ultra-critical parts of code I doubt the loss of
readability would be worth it. If there are situations where you can really
gain by switching from a natural indexing scheme to lists, there are
probably other places in your code where you will gain just as much benefit
without the corresponding loss of readability. Indexing lists only appears
to be about twice as fast as indexing dicts:

% timeit.py -s "data = {'a' : 1, 'b' :2, 'c' : 3}" "for k in 'abc': x = data[k]"
100000 loops, best of 3: 4.61 usec per loop
% timeit.py -s "data = [1, 2, 3]" "for k in [0, 1, 2]: x = data[k]"
100000 loops, best of 3: 2.97 usec per loop

If you're worried about regaining a couple microseconds per index you
probably shouldn't be using Python.

Skip
 
Reply With Quote
 
Paul McGuire
Guest
Posts: n/a
 
      01-12-2007
"Emin" <> wrote in message
news: ups.com...
> On Jan 11, 5:53 pm, Steve Holden <s...@holdenweb.com> wrote:
>>
>> What technique were you thinking of to look up the cached index values
>> in Python, just as a matter of curiosity? Storing them in a dict? It
>> would be hard to think of a faster way ...

>
> I didn't have anything fancy in mind. I was just wondering whether it
> makes sense to replace a block of code like
>
> data = {'a' : 1, 'b' :2, 'c' : 3}
> for i in someLargeList:
> for name in ['a','b','c']:
> result.append(data[name])
>
> with somthing like
>
> data = {'a' : 1, 'b' :2, 'c' : 3}
> dataValues = [data[k] for k in ['a','b','c']
> for i in someLargeList:
> for index in [0,1,2]:
> result.append(dataValues[index])
>


[Your as-posted code doesn't run, you are missing a trailing ']' in your
list comprehension. ]

So what you want is this?
1. Get the values from the data dict in order of their key, ['a','b','c']
(which is not the same as getting the data.values() list, which is in
unpredictable order)
2. For every element in some larger list, append each of the elements in
order from step 1 to some other result list.

First of all, realize that:
> for index in [0,1,2]:
> result.append(dataValues[index])

is the same as
result.extend(dataValues)
assuming that dataValues has exactly 3 elements in it.

Second, why are you iterating over someLargeList? You are doing nothing
with i, and neither the data dict nor the dataValues list changes as you
iterate.

This will do the job more quickly, I should think:

data = {'a' : 1, 'b' :2, 'c' : 3}
dataValues = [data[k] for k in ['a','b','c']]
result.extend( dataValues * len(someLargeList) )

-- Paul


 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Re: How include a large array? Edward A. Falk C Programming 1 04-04-2013 08:07 PM
Re: HELP! Ping getting slower and slower and slower! Paul Computer Information 2 04-03-2012 05:58 PM
Indexing services under Windows XP SP2 - Can I disable MS Indexing Service to hasten Google's OR does Google Desktop uses this MS Indexing Service? ricardodefaria Computer Support 6 08-05-2007 04:14 AM
XPath queries getting slower and slower... Andre Charbonneau Java 0 02-15-2005 05:04 PM
Dialup modem connects at slower and slower speeds Sharon Sharp Computer Support 7 10-13-2004 02:51 PM



Advertisments