Home  Forums  Reviews  Guides  Newsgroups  Register  Search 
Thread Tools 
Charlotte Henkle 


 
Steven Bethard
Guest
Posts: n/a

Charlotte Henkle <charlotte <at> fgm.com> writes:
> I'm pondering how to count the number of times an item appears in > total in a nested list. How about this: >>> myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']] >>> def count(item): .... if not isinstance(item, list): .... return {item:1} .... counts = {} .... for i in item: .... for key, ct in count(i).items(): .... counts[key] = counts.get(key, 0) + ct .... return counts .... >>> count(myList) {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1} Steve 




Steven Bethard 


 
Bryan
Guest
Posts: n/a

Steven Bethard wrote:
> Charlotte Henkle <charlotte <at> fgm.com> writes: > >>I'm pondering how to count the number of times an item appears in >>total in a nested list. > > > How about this: > > >>>>myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']] >>>>def count(item): > > ... if not isinstance(item, list): > ... return {item:1} > ... counts = {} > ... for i in item: > ... for key, ct in count(i).items(): > ... counts[key] = counts.get(key, 0) + ct > ... return counts > ... > >>>>count(myList) > > {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1} > > Steve > or maybe a less general approach might work if the nested list is always one deep: >>> myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']] >>> tmp = [] >>> d = {} >>> for item in myList: tmp += item >>> for key in tmp: d[key] = d.get(key, 0) + 1 >>> d {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1} bryan 




Bryan 
Steven Bethard
Guest
Posts: n/a

Bryan <belred1 <at> yahoo.com> writes:
> or maybe a less general approach might work if the nested list is always one > deep: > > >>> myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']] > >>> tmp = [] > >>> d = {} > >>> for item in myList: tmp += item > >>> for key in tmp: d[key] = d.get(key, 0) + 1 > >>> d > {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1} Yeah, if that's the case, you don't really even need to construct a temporary list: >>> myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']] >>> counts = {} >>> for lst in myList: .... for item in lst: .... counts[item] = counts.get(item, 0) + 1 .... >>> counts {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1} Steve 




Steven Bethard 
Alex Martelli
Guest
Posts: n/a

Charlotte Henkle <(EMail Removed)> wrote:
> Hello; > > I'm pondering how to count the number of times an item appears in > total in a nested list. For example: > > myList=[[a,b,c,d],[a,f,g,h],[a,b,x,y]] > > I'd like to know that a appeared three times, and b appeared twice, > and the rest appeard only once. Two problems enmeshed, here: how to count items of various values if you have a way to iterate over the values; and how to iterate on a nested list. You may code them together, but factoring them as two reusable functions is preferable  you'll end up with more reusable building blocks than by coding them as one monolithic function. Python is good at allowing such factorings. There may be a third subproblem, about how best to _present_ the counts  hard to say from your statement about the problem, whether that's a problem, but, see later. You're not telling us what is the type of the items. If they're strings or numbers, basically anything that can be a key to a dictionary, the counting problem is best solved by building a dictionary: def itemcount(sequence): d = {} for item in sequence: d[item] = 1 + d.get(item, 0) return d this kind of function is known as an 'accumulator': given a sequence it loops over it computing/building some resulting object and returns it. Not a specific Python term, but a nice one to keep in mind. You don't clarify the structure of your list  is it always two levels as you present it, could it be more, are all items always going to be at the same level or could some level have a mixture of items and sublists? Taking the simplest hypothesis  always two levels and all items at the second level, the first one being made up of sublists only: def itemseq(nested_list): for sublist in nested_list: for item in sublist: yield item this kind of function is known as a 'simple generator': it produces a sequence you may loop on with a for statement, pass to an accumulator, etc. This is a specific Python term, and the keyword that lets you tell whether a function is a generator is 'yield'. So, given these two functions  or enhanced equivalents if your list or its items don't meet these hypotheses  the dict of counts is easy to obtain: myCounts = itemcount(itemseq(myList)) Now for the presentation issue  you say you want to receive the information "that a appeared three times, and b appeared twice, and the rest appeard only once". This appears to mean that you want to get a sort of items by count, with items appearing only once just summarized as a group, not listed individually. If I correctly divined your intention, this might help: def summarize_counts(counts_dict): number_of_singletons = 0 count_per_item_list = [] for item, count in counts_dict.iteritems(): if count == 1: number_of_singletons += 1 else: count_per_item_list.append((count, item)) count_per_item_list.sort() for count, item in count_per_item_list: print 'Item %r appeared %d times' % (item, count) print '%d items appeared only once each' % number_of_singletons So, overall, if I've read your mind correctly, summarize_counts(itemcount(itemseq(myList))) should do exactly what you required. However, mindreading is always an iffy business (that's part of why the Zen of Python says "explicit is better than implicit".... So, if you can clarify the various ambiguous and uncertain details of your specs, which forced me to guess (violating another Zen injunction: "in the face of ambiguity, resist the temptation to guess"), over and over, at what exactly you meant, I'm confident that the various pieces presented here can be enhanced to meet whatever your requirements actually _are_. Alex 




Alex Martelli 
Michael Hoffman
Guest
Posts: n/a

If you want to be able to use any iterable, you can use itertools.chain().
>>> import itertools >>> myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']] >>> counts = {} >>> for item in itertools.chain(*myList): .... counts[item] = counts.get(item, 0) + 1 .... >>> counts {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1}  Michael Hoffman 




Michael Hoffman 
Charlotte Henkle
Guest
Posts: n/a

Ahhhh...thank you all!
Actually, I wanted to be able to count the items in the list to check my thinking on a problem, and as it turns out, my thinking was incorrect. So I have a follow up question now Some background: I was given a problem by a friend. He is in a tennis group that has 6 members. They play once a week for 10 weeks. They were having trouble generating a schedule that was random but fair (IE everyone gets to play a base number of times , and the mod is evenly and randomly distributed). I decided that I wanted to abstract as much of the problem as possible so that it would be reusable for other groups (IE to solve it for games with N players, Y times over X weeks). And then I decided I wanted this to be my second program in python, so forgive my clumsiness My first pass through was short and simple: Figure out the total number of games that will be played, and then use something like this: gamePlayers=random.sample(templist, players_per_game) to fill up all the game slots. Neat, simple. The problem I found with this solution was that it didn't give me a weighted list for remainder. The same person could get in all of the "extra" games. So instead I decided to fill games from my list of players by removal until I had a number of players left less than a full game, and then have a backup list. The back up list would act like a queue, but if you were already in the game, it would take the next guy. If you got into a game from the backup list, you were sent to the end of the line. My 2 lines grew to 50 plus Sadly, this isn't quite working either. Now that I can print out the players, I see that generally things work, but every once in a while, I get an unexpected result. For example, with 6 players, 10 weeks, 2 games per week, I would expect combinations of: {'a': 13, 'c': 14, 'b': 14, 'e': 13, 'd': 13, 'f': 13} (4 13s and 2 14s) But sometimes I get: {'a': 13, 'c': 14, 'b': 14, 'e': 14, 'd': 12, 'f': 13} (2 13s, 3 14s and a 12) That 12 breaks the planned even distribution. I suspect the problem comes from the random pulling in the first part, but I'm not sure. I also feel that some sections (espcially the print) don't have a "pythongrace", so I would love some suggestions to make them more...slithery? To make a long story longer, here's the code: ....#! /usr/bin/env python .... ....#A program to randomly fill a tennis schedule ....# ....#The original theory looked like this: ....# gamePlayers=random.sample(templist, players_per_game) ....# print gamePlayers ....# ....#But that didn't give a weighted list for extra games .... ....import random .... ....#Eventually these will get set dynamically ....number_of_weeks=10 ....players=['a', 'b', 'c', 'd', 'e', 'f'] ....games_per_week=2 ....players_per_game=4 ....games=number_of_weeks*games_per_week .... ....#this will be used to pull off "extra game" players ....backupList=players[:] ....random.shuffle(backupList) .... ....#a templist so we can modify it. ....templist=players[:] .... ....#our finished product: ....finishedList=[] .... ....while len(finishedList)!=games: .... if len(templist)>=players_per_game: .... gamePlayers=[] .... while len(gamePlayers)!=players_per_game: .... randomNumber=random.randint(0, len(templist)1) .... potentialPlayer=templist.pop(randomNumber) .... gamePlayers.append(potentialPlayer) .... finishedList.append(gamePlayers) .... else: .... gamePlayers=templist .... print "I am the leftover game players", gamePlayers .... print "I am the list of backup players", backupList .... count=0 .... while len(gamePlayers)!=players_per_game: .... print "I am a potential player" .... potentialPlayer=backupList[count] .... print potentialPlayer .... print "checking to see if I'm in the game" .... if potentialPlayer not in gamePlayers: .... print "I do not think the player is in the game" .... print "I am the backup list", backupList .... potentialPlayer=backupList.pop(count) .... gamePlayers.append(potentialPlayer) .... backupList.append(potentialPlayer) .... print "I am the backup list after reorder", backupList .... print "I am the gamePlayers after test and insertion", gamePlayers .... .... else: .... print "I think that player is in the game" .... count+=1 .... finishedList.append(gamePlayers) .... templist=players[:] .... ....#count the list (thank you, Steve! ....http://groups.google.com/groups?dq=&...oogle%2BSearch .... ....def count(item): .... if not isinstance(item, list): .... return {item:1} .... counts = {} .... for i in item: .... for key, ct in count(i).items(): .... counts[key] = counts.get(key, 0) + ct .... return counts .... ....def printList(weeks, games, list): .... x=0 .... while x!=weeks: .... y=0 .... print "Week: ", x+1 .... while y<games: .... print "Game ",y+1, " players are ", list[y] .... y+=1 .... x+=1 .... ....#printing out and counting the final list .... ....printList(number_of_weeks, games_per_week, finishedList) ....print finishedList.count("a") 




Charlotte Henkle 
Alex Martelli
Guest
Posts: n/a

Charlotte Henkle <(EMail Removed)> wrote:
... > I was given a problem by a friend. He is in a tennis group that has 6 > members. They play once a week for 10 weeks. They were having > trouble generating a schedule that was random but fair (IE everyone > gets to play a base number of times , and the mod is evenly and > randomly distributed). How many of the 6 members can play on each of those onceweekly occasions? > My first pass through was short and simple: Figure out the total > number of games that will be played, and then use something like this: > > gamePlayers=random.sample(templist, players_per_game) > > to fill up all the game slots. Neat, simple. > > The problem I found with this solution was that it didn't give me a > weighted list for remainder. The same person could get in all of the > "extra" games. Yep, you're being a bit "too random" there  sampling with repetition. Consider random.shuffle: puts the players in arbitrary order but without repetition. Just pick whatever number at a time, say from the back  doing so from the front would be just as good, but slower, so it might as well be from the back. When you're trying to pick more players than are left in the shuffled list you're peeling stuff from, make sure those get in then reshuffle the original list minus those who are already in, for the next time. Ah, natural language is SO complicated, let's put it in pseudocode. First, you'll need sets, since obviously there are set operations involved (e.g., "minus those who are already in" is goofy language for a set difference operation). So let's make sure we have sets. In Python 2.4, they're builtin; in 2.3, they're in module sets of the standard library. To ensure you run optimally in either release, star your program with: import random try: set=set except NameError: from sets import Set as set Now we're ready to write our pseudocode. At each step, i.e. each time another weekly game is planned, we'll have: some list X of players who haven't played yet in this 'round' in shuffled order; a request for N players for tonight; and of course the original set P of players. If N is less than the items in X we're in an easy case: if N < len(X): yield X[N:] del X[N:] i.e. we just yield the last N items of list X then remove those same items from further consideration in this round. If more players are going to play tonight, than there are items in X, it's a little bit more complicated. We need to prepare a shuffled list of all players except those who are alreayd in X...: else: newX = list(Pset(X)) random.shuffle(newX) and then yield the contents of X and the needed Nlen(X) items of newX: fromNewX = Nlen(X) yield X + newX[fromNewX:] finally, we need to remove from newX the items we just yielded, and set the remainder as the new value of X for the next evening of play: X = newX[:fromNewX] Great, but how do we get started? Heh, funny, if X is empty we're just in a starting position  len(X) is 0 so we're gonna prepare newX, and we're gonna prepare it from the whole of set P... just as we want. So all the initialization we need is to make sure that P _is_ a set and that X is empty: P = set(P) X = [] Great  now that we have our pseudocode, how do we turn it into actual working code? That's where Python shines, for the pseudocode we jotted down to help our thinking as we reasoned about the problem IS working Python code. We could rename X, P and N, but apart from that, here is the generator we need...: def choose_players(P, N): assert len(P) >= N > 0 P = set(P) X = [] while True: if N < len(X): yield X[N:] del X[N:] else: newX = list(Pset(X)) random.shuffle(newX) fromNewX = Nlen(X) yield X + newX[fromNewX:] X = newX[:fromNewX] I've added a check that N makes sense (0 or fewer players, or more than the club's membership, being needed each time, obviously makes no sense!) and an unending loop around the whole thing. This is a nonterminating generator  it won't ever stop iterating by itself; it needs to be called the appropriate number of times. I find that more useful than passing the generator a parameter telling it how many times to loop (you'd just need to add such a parameter T and change the while to 'for i in xrange(T):'  but if you need to do that, you could just as well use iterator.islice or whatever to limit the numbers of steps over the generator...). Here's an example use...: for i, players in enumerate(choose_players(range(13), 4))): print players if i > 9: break Here there are 13 members, named 0 to 12, and each week 4 can play, for 11 weeks. You can run it a few times to eyeball it for correctness, but of course you'll want to check it out better eventually, for example: for number_of_tests in range(10): number_of_plays = 13*[0] for i, players in enumerate(choose_players(range(13), 4)): for p in players: number_of_plays[p] += 1 if i > 9: break for i in range(max(number_of_plays)+1): print '%d:%d'%(i, number_of_plays.count(i)), You'll soon see that the results aren't as even as we wished, e.g.: kallisti:~/downloads alex$ python2.3 pla.py 0:0 1:0 2:2 3:4 4:7 0:0 1:0 2:1 3:6 4:6 0:0 1:0 2:0 3:8 4:5 0:0 1:0 2:0 3:8 4:5 0:0 1:0 2:2 3:4 4:7 0:0 1:0 2:1 3:6 4:6 0:0 1:0 2:1 3:6 4:6 0:0 1:0 2:0 3:8 4:5 0:0 1:0 2:0 3:8 4:5 0:0 1:0 2:0 3:8 4:5 ooops  the 44 available 'slots' are NOT always divided "three apiece and a fourth one for 5 lucky ones of the 13 members"  sometimes one or even two members play only twice! So what's wrong with the pseudode we had so nicely worked out...? Hmmm, clearly the 'left over' 13th player who first gets to play on the 4th night isn't given a chance to play again soon enough... he should go right back into the new X. Can we fix that? We can sure try, just change in the above: newX = list(Pset(X)) random.shuffle(newX) fromNewX = Nlen(X) yield X + newX[fromNewX:] X = newX[:fromNewX] into: newX = list(Pset(X)) random.shuffle(newX) fromNewX = Nlen(X) yield X + newX[fromNewX:] X += newX[:fromNewX] random.shuffle(X) Ah, NOW our tests are more satisfactory  huge runs of 0:0 1:0 2:0 3:8 4:5 just as we wanted! Of course, that's no mathematical _proof_ that our procedure is correct  indeed, it's QUITE interesting to provide such a proof (but I have no space left in the margins of this post to do so.... I hope that by showing you how a little program can evolve in real life, from specs to thinking in pseudocode to Python, eyeballtesting, better testing (with counting, correction of some problem, ..., I may have helped you on your way to "thinking like a programmer"!) Alex 




Alex Martelli 
Charlotte Henkle
Guest
Posts: n/a

> To make a long story longer, here's the code:
Whoops...Correction: ....#! /usr/bin/env python ....#Charlotte Henkle .... ....#A program to randomly fill a tennis schedule ....# ....#The original theory looked like this: ....# gamePlayers=random.sample(templist, players_per_game) ....# print gamePlayers ....# ....#But that didn't give a weighted list for extra games .... ....import random .... ....#Eventually these will get set dynamically ....number_of_weeks=10 ....players=['a', 'b', 'c', 'd', 'e', 'f', 'g'] ....games_per_week=2 ....players_per_game=4 ....games=number_of_weeks*games_per_week .... ....#this will be used to pull off "extra game" players ....backupList=players[:] ....random.shuffle(backupList) .... ....#a templist so we can modify it. ....templist=players[:] .... ....#our finished product: ....finishedList=[] .... ....while len(finishedList)!=games: .... if len(templist)>=players_per_game: .... gamePlayers=[] .... while len(gamePlayers)!=players_per_game: .... randomNumber=random.randint(0, len(templist)1) .... potentialPlayer=templist.pop(randomNumber) .... gamePlayers.append(potentialPlayer) .... finishedList.append(gamePlayers) .... else: .... gamePlayers=templist .... print "I am the leftover game players", gamePlayers .... print "I am the list of backup players", backupList .... count=0 .... while len(gamePlayers)!=players_per_game: .... print "I am a potential player " .... potentialPlayer=backupList[count] .... print potentialPlayer .... print "checking to see if I'm in the game" .... if potentialPlayer not in gamePlayers: .... print "I do not think the player is in the game" .... print "I am the backup list", backupList .... potentialPlayer=backupList.pop(count) .... gamePlayers.append(potentialPlayer) .... backupList.append(potentialPlayer) .... print "I am the backup list after reorder", backupList .... print "I am the gamePlayers after test and insertion", gamePlayers .... .... else: .... print "I think that player is in the game" .... count+=1 .... finishedList.append(gamePlayers) .... templist=players[:] .... ....#count the list (thank you, Steve! ....http://groups.google.com/groups?dq=&...oogle%2BSearch .... ....def count(item): .... if not isinstance(item, list): .... return {item:1} .... counts = {} .... for i in item: .... for key, ct in count(i).items(): .... counts[key] = counts.get(key, 0) + ct .... return counts .... ....def printList(weeks, games, list): .... x=0 .... y=0 .... index=0 .... while x!=weeks: .... print "Week: ", x+1 .... y=0 .... while y<games: .... print "Game ",y+1, " players are ", list[index] .... y+=1 .... index+=1 .... x+=1 .... ....#printing out and counting the final list .... ....printList(number_of_weeks, games_per_week, finishedList) ....print count(finishedList) 




Charlotte Henkle 


 
Thread Tools  


Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
locate items in matrix (index of lists of lists)  Alexzive  Python  6  03202009 06:14 PM 
Page File counter and Private Bytes Counter  George2  C++  1  01312008 09:27 AM 
newbie/ merging lists of lists with items in common  ardief  Python  14  02032007 04:23 AM 
List of lists of lists of lists...  =?UTF8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==?=  Python  5  05152006 11:47 AM 
Session("counter") vs. ViewState("counter")...a newbie question  The Eeediot  ASP .Net  3  12222004 09:31 PM 
Powered by vBulletin®. Copyright ©2000  2014, vBulletin Solutions, Inc..
SEO by vBSEO ©2010, Crawlability, Inc. 