Velocity Reviews - Computer Hardware Reviews

finding most common elements between thousands of multiple arrays.

 
 
Raymond Hettinger
 
      07-08-2009
[Scott David Daniels]
> import heapq
>
> def most_frequent(arr, N):
>     '''Return the top N (freq, val) elements in arr'''
>     counted = frequency(arr) # get an iterator for freq-val pairs
>     heap = []
>     # First, just fill up the array with the first N distinct pairs
>     for i in range(N):
>         try:
>             heap.append(next(counted))
>         except StopIteration:
>             break # If we run out here, no need for a heap
>     else:
>         # more to go, switch to a min-heap, and replace the least
>         # element every time we find something better
>         heapq.heapify(heap)
>         for pair in counted:
>             if pair > heap[0]:
>                 heapq.heapreplace(heap, pair)
>     return sorted(heap, reverse=True) # put most frequent first.


In Py2.4 and later, see heapq.nlargest().
In Py3.1, see collections.Counter(data).most_common(n)
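
A minimal sketch of both suggestions, for illustration only. The function names and the sample data are made up here, and Counter stands in for the frequency() helper from the quoted code, which is not shown in this post. The first version feeds (freq, val) pairs to heapq.nlargest(); the second uses Counter.most_common() and flips its (val, freq) pairs to match the quoted function's return shape.

```python
import heapq
from collections import Counter


def most_frequent_nlargest(arr, n):
    """Return the top n (freq, val) pairs in arr, most frequent first."""
    counts = Counter(arr)  # stands in for the thread's frequency() helper
    pairs = ((freq, val) for val, freq in counts.items())
    # nlargest keeps only n pairs at a time, like the hand-written min-heap
    return heapq.nlargest(n, pairs)


def most_frequent_counter(arr, n):
    """Same result shape, using Counter.most_common()."""
    # most_common() yields (val, freq); flip to (freq, val)
    return [(freq, val) for val, freq in Counter(arr).most_common(n)]


data = [3, 1, 3, 2, 3, 1, 5]  # hypothetical sample data
print(most_frequent_nlargest(data, 2))
print(most_frequent_counter(data, 2))
```

The two versions can order ties differently: nlargest() compares whole (freq, val) tuples, so equal frequencies are broken by value (and the values must then be comparable), while most_common() keeps equal counts in first-seen order.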


Raymond
 