Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > collections.Counter surprisingly slow

Reply
Thread Tools

collections.Counter surprisingly slow

 
 
Stefan Behnel
Guest
Posts: n/a
 
      07-30-2013
Stefan Behnel, 30.07.2013 08:39:
> Serhiy Storchaka, 29.07.2013 21:37:
>> 29.07.13 20:19, Ian Kelly написав(ла):
>>> On Mon, Jul 29, 2013 at 5:49 AM, Joshua Landau wrote:
>>>> Also, couldn't Counter just extend from defaultdict?
>>>
>>> It could, but I expect the C helper function in 3.4 will be faster
>>> since it doesn't even need to call __missing__ in the first place.

>>
>> I'm surprised, but the Counter constructor with commented out import of
>> this accelerator is faster (at least for some data).

>
> Read my post. The accelerator doesn't take the fast path for dicts as
> Counter is only a subtype of dict, not exactly a dict. That means that it
> raises and catches a KeyError exception for each new value that it finds,
> and that is apparently more costly than the overhead of calling get().
>
> So, my expectation is that it's faster for highly repetitive data and
> slower for mostly unique data.
>
> Maybe a "fast_dict_lookup" option for the accelerator that forces the fast
> path would fix this. The Counter class, just like many (most?) other
> subtypes of dict, definitely doesn't need the fallback behaviour.


Or rather drop the fallback path completely. It's not worth having code
duplication if it's not predictable up-front (before looking at the data)
if it will help or not.

http://bugs.python.org/issue18594

Stefan


 
Reply With Quote
 
 
 
 
Serhiy Storchaka
Guest
Posts: n/a
 
      07-30-2013
29.07.13 14:49, Joshua Landau написав(ла):
> I find it hard to agree that counter should be optimised for the
> unique-data case, as surely it's much more oft used when there's a point
> to counting?


Different methods are faster for different data. LBYL approach is best
for the mostly unique data case, while EAFP approach is best for the
mostly repeated data case. In general case a performance of particular
method is a function of its performances in this two extreme cases. When
it slow for one of extreme case it can be slow in a number of
intermediate cases.

> Also, couldn't Counter just extend from defaultdict?


Unfortunately this only will slowdown it.

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
*.py source file surprisingly deleted with Pydev/Eclipse. Who elseexperienced this ? Nebur Python 9 04-05-2008 07:41 AM
OT, but very interesting - and a surprisingly good read! Tony Sperling Windows 64bit 0 09-25-2007 11:30 PM
On board Nvdia 6100 graphics does surprisingly well thingy NZ Computing 4 09-04-2007 05:05 AM
Sorting collections based on nested collections Doug Poland Java 9 09-27-2003 10:46 PM
InnerProperty Persistance for Collections containing other Collections mutex ASP .Net Building Controls 0 07-27-2003 02:45 PM



Advertisments