Returning histogram-like data for items in a list
Hi there,
I have a list:

    L1 = [1,1,1,2,2,3]

How can I easily turn this into a list of tuples where the first element is the list element and the second is the number of times it occurs in the list (I think that this is referred to as a histogram), i.e.:

    L2 = [(1,3),(2,2),(3,1)]

I was doing something like:

    myDict = {}
    for i in L1:
        myDict.setdefault(i,[]).append(i)

then doing this:

    L2 = []
    for k, v in myDict.iteritems():
        L2.append((k, len(v)))

This works but I sort of feel like there ought to be an easier way, rather than to have to store the list elements, when all I want is a count of them. Would anyone care to comment?

I also tried this trick, where locals()['_[1]'] refers to the list comprehension itself as it gets built, but it gave me unexpected results:

    >>> L2 = [(i, len(i)) for i in L2 if not i in locals()['_[1]']]
    >>> L2
    [((1, 3), 2), ((2, 2), 2), ((3, 1), 2)]

i.e. I don't understand why each tuple is being counted as well.

Regards,

Ric
Re: Returning histogram-like data for items in a list
Ric Deez wrote:
> Hi there,
>
> I have a list:
> L1 = [1,1,1,2,2,3]
>
> How can I easily turn this into a list of tuples where the first element
> is the list element and the second is the number of times it occurs in
> the list (I think that this is referred to as a histogram):
>
> i.e.:
>
> L2 = [(1,3),(2,2),(3,1)]

    >>> import itertools
    >>> L1 = [1,1,1,2,2,3]
    >>> L2 = [(key, len(list(group))) for key, group in itertools.groupby(L1)]
    >>> L2
    [(1, 3), (2, 2), (3, 1)]

--
Michael Hoffman
Re: Returning histogram-like data for items in a list
"Michael Hoffman" <cam.ac.uk@mh391.invalid> wrote:
> Ric Deez wrote:
> > Hi there,
> >
> > I have a list:
> > L1 = [1,1,1,2,2,3]
> >
> > How can I easily turn this into a list of tuples where the first element
> > is the list element and the second is the number of times it occurs in
> > the list (I think that this is referred to as a histogram):
> >
> > i.e.:
> >
> > L2 = [(1,3),(2,2),(3,1)]
>
> >>> import itertools
> >>> L1 = [1,1,1,2,2,3]
> >>> L2 = [(key, len(list(group))) for key, group in itertools.groupby(L1)]
> >>> L2
> [(1, 3), (2, 2), (3, 1)]
> --
> Michael Hoffman

This is correct if the original list items are grouped together; to be on the safe side, sort it first:

    L2 = [(key, len(list(group))) for key, group in itertools.groupby(sorted(L1))]

Or if you care about performance rather than number of lines, use this:

    def hist(seq):
        h = {}
        for i in seq:
            try:
                h[i] += 1
            except KeyError:
                h[i] = 1
        return h.items()

George
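To illustrate the caveat about grouping: itertools.groupby only merges consecutive equal items, so an unsorted list yields one entry per run rather than one per distinct value. A minimal sketch, where the helper name run_lengths and the sample list are made up for illustration:

```python
import itertools

def run_lengths(seq):
    """Return (item, length) for each run of consecutive equal items."""
    return [(key, len(list(group))) for key, group in itertools.groupby(seq)]

unsorted_items = [1, 2, 1, 1, 2, 3]  # hypothetical sample, not from the thread

# Without sorting, the value 1 shows up twice because it forms two runs.
print(run_lengths(unsorted_items))          # [(1, 1), (2, 1), (1, 2), (2, 1), (3, 1)]

# Sorting first puts equal items next to each other.
print(run_lengths(sorted(unsorted_items)))  # [(1, 3), (2, 2), (3, 1)]
```

Sorting first costs O(n log n) and requires that the items can be compared with each other, which the dictionary-based version does not.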
Re: Returning histogram-like data for items in a list
Adding to George's reply, if you want slightly more performance, you can avoid the exception with something like

    def hist(seq):
        h = {}
        for i in seq:
            h[i] = h.get(i,0)+1
        return h.items()

Jeethu Rao
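As an aside for readers on later Python versions (2.7 and 3.x): the standard library there provides collections.Counter, which packages the same one-pass dictionary counting. It is not what the posters in this thread used; the function name hist_counter below is made up for illustration.

```python
from collections import Counter

def hist_counter(seq):
    """Count hashable items in one pass and return sorted (item, count) pairs."""
    # sorted() is only here to give a predictable order; it assumes the
    # items can be compared with each other.
    return sorted(Counter(seq).items())

print(hist_counter([1, 1, 1, 2, 2, 3]))  # [(1, 3), (2, 2), (3, 1)]
```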
Re: Returning histogram-like data for items in a list
Ric Deez wrote:
> Hi there,
>
> I have a list:
> L1 = [1,1,1,2,2,3]
>
> How can I easily turn this into a list of tuples where the first element
> is the list element and the second is the number of times it occurs in
> the list (I think that this is referred to as a histogram):
>
> i.e.:
>
> L2 = [(1,3),(2,2),(3,1)]
>
> I was doing something like:
>
> myDict = {}
> for i in L1:
>     myDict.setdefault(i,[]).append(i)
>
> then doing this:
>
> L2 = []
> for k, v in myDict.iteritems():
>     L2.append((k, len(v)))
>
> This works but I sort of feel like there ought to be an easier way,

If you don't care about order (but your solution isn't guaranteed to preserve order either...):

    L2 = dict([(item, L1.count(item)) for item in L1]).items()

But this may be inefficient if the list is large, so...

    def hist(seq):
        d = {}
        for item in seq:
            if not item in d:
                d[item] = seq.count(item)
        return d.items()

> I also tried this trick, where locals()['_[1]'] refers to the list

Not sure I understand how that one works... But anyway, please avoid this kind of horror unless you're engaged in a WORN context with a perl-monger !-).
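On the unexpected result in the original post: as typed in that session, the comprehension loops over L2 (which already held the tuples) rather than L1, so each i is a tuple and len(i) is the tuple's length, 2. The locals()['_[1]'] name was an undocumented implementation detail of older CPython 2 list comprehensions and does not work in Python 3. The sketch below reproduces the result without the trick, then shows one explicit way to get counts in order of first appearance; the name hist_ordered and the second sample list are made up for illustration.

```python
L2 = [(1, 3), (2, 2), (3, 1)]

# Iterating over L2 yields tuples, so len(i) is always 2.
print([(i, len(i)) for i in L2])  # [((1, 3), 2), ((2, 2), 2), ((3, 1), 2)]

def hist_ordered(seq):
    """Return (item, count) pairs in order of first appearance."""
    counts = {}
    order = []  # remembers the first-seen order explicitly
    for item in seq:
        if item not in counts:
            counts[item] = 0
            order.append(item)
        counts[item] += 1
    return [(item, counts[item]) for item in order]

print(hist_ordered([2, 1, 2, 3, 1, 2]))  # [(2, 3), (1, 2), (3, 1)]
```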
Re: Returning histogram-like data for items in a list
"jeethu_rao" <jeethurao@gmail.com> wrote:
> Adding to George's reply, if you want slightly more performance, you
> can avoid the exception with something like
>
> def hist(seq):
>     h = {}
>     for i in seq:
>         h[i] = h.get(i,0)+1
>     return h.items()
>
> Jeethu Rao

The performance penalty of the exception is imposed only the first time a distinct item is found. So unless you have a huge list of distinct items, I seriously doubt that this is faster at any measurable rate.

George
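One way to check this on your own data is to time both variants. A rough sketch using the standard timeit module; the names hist_try and hist_get and the two sample lists are made up for illustration, and no timings are quoted here because they depend on the machine and the Python version.

```python
import timeit

def hist_try(seq):
    """try/except variant: an exception is raised once per distinct item."""
    h = {}
    for i in seq:
        try:
            h[i] += 1
        except KeyError:
            h[i] = 1
    return list(h.items())

def hist_get(seq):
    """dict.get variant: a default value replaces the exception."""
    h = {}
    for i in seq:
        h[i] = h.get(i, 0) + 1
    return list(h.items())

few_distinct = [n % 10 for n in range(10000)]  # many repeats, 10 distinct items
all_distinct = list(range(10000))              # every item is new

for name, data in [("few distinct", few_distinct), ("all distinct", all_distinct)]:
    t_try = timeit.timeit(lambda: hist_try(data), number=20)
    t_get = timeit.timeit(lambda: hist_get(data), number=20)
    print("%s: try/except %.4fs, get %.4fs" % (name, t_try, t_get))
```

The two sample lists are chosen to bracket the cases discussed above: one where the exception is raised rarely, and one where it is raised for every item.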
Re: Returning histogram-like data for items in a list
"Ric Deez" <deez@next-level.com.au> wrote in message news:dbpat7$28o$1@nnrp.waia.asn.au...
> I have a list:
> L1 = [1,1,1,2,2,3]
> How can I easily turn this into a list of tuples where the first element
> is the list element and the second is the number of times it occurs in
> the list (I think that this is referred to as a histogram):

For ease of reading (but not efficiency) I like:

    hist = [(x,L1.count(x)) for x in set(L1)]

See http://aspn.activestate.com/ASPN/Coo.../Recipe/277600

Alan Isaac