Weighted "random" selection from list of lists

 
 
Jesse Noller
 
      10-08-2005
Hello -

I'm probably missing something here, but I have a problem where I am
populating a list of lists like this:

list1 = [ 'a', 'b', 'c' ]
list2 = [ 'dog', 'cat', 'panda' ]
list3 = [ 'blue', 'red', 'green' ]

main_list = [ list1, list2, list3 ]

Once main_list is populated, I want to build a sequence from items
within the lists, "randomly" with a defined percentage of the sequence
coming from the various lists. For example, if I want a 6-item
sequence, I might want:

60% from list 1 (main_list[0])
30% from list 2 (main_list[1])
10% from list 3 (main_list[2])

I know how to pull a random sequence (using random()) from the lists,
but I'm not sure how to pick it with the desired percentages.

Any help is appreciated, thanks

-jesse
 
 
 
 
 
Ron Adam
 
      10-08-2005
Jesse Noller wrote:


> 60% from list 1 (main_list[0])
> 30% from list 2 (main_list[1])
> 10% from list 3 (main_list[2])
>
> I know how to pull a random sequence (using random()) from the lists,
> but I'm not sure how to pick it with the desired percentages.
>
> Any help is appreciated, thanks
>
> -jesse


Just add up the total of all lists.

total = len(list1)+len(list2)+len(list3)
n1 = int(round(.60 * total)) # number from list 1
n2 = int(round(.30 * total)) # number from list 2
n3 = int(round(.10 * total)) # number from list 3

You'll need to decide how to handle rounding and the case where a list has too few items in it.
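
One way to make those counts concrete, as a rough sketch only. The helper names exact_counts and exact_mix are made up for this example, and two choices here are assumptions rather than anything stated in the thread: leftover slots go to the largest fractional remainders, and items are drawn with replacement so a short list never runs out.

```python
import random

def exact_counts(percents, length):
    """Split `length` slots across lists in proportion to integer percents.

    Assumes the percents sum to 100. Uses the largest-remainder rule so the
    counts always add up to `length`.
    """
    scaled = [p * length for p in percents]      # slot shares, in hundredths
    counts = [s // 100 for s in scaled]          # whole slots per list
    leftover = length - sum(counts)
    # Hand the leftover slots to the lists with the largest fractional parts.
    by_remainder = sorted(range(len(scaled)),
                          key=lambda i: scaled[i] % 100, reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts

def exact_mix(lists, percents, length, rng=random):
    """Build a shuffled sequence with an exact number of items per list."""
    result = []
    for items, n in zip(lists, exact_counts(percents, length)):
        # Drawing with replacement means a short list can never run out.
        result.extend(rng.choice(items) for _ in range(n))
    rng.shuffle(result)  # interleave the items from the different lists
    return result

main_list = [['a', 'b', 'c'],
             ['dog', 'cat', 'panda'],
             ['blue', 'red', 'green']]
sequence = exact_mix(main_list, [60, 30, 10], 10)
```

For a 10-item sequence this gives exactly 6, 3 and 1 items per list. For a 6-item sequence the same percentages work out to 4, 2 and 0, which shows why exact matching gets awkward for short sequences.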

Cheers,
Ron
 
 
 
 
 
Peter Otten
 
      10-08-2005
Jesse Noller wrote:

> I'm probably missing something here, but I have a problem where I am
> populating a list of lists like this:
>
> list1 = [ 'a', 'b', 'c' ]
> list2 = [ 'dog', 'cat', 'panda' ]
> list3 = [ 'blue', 'red', 'green' ]
>
> main_list = [ list1, list2, list3 ]
>
> Once main_list is populated, I want to build a sequence from items
> within the lists, "randomly" with a defined percentage of the sequence
> coming from the various lists. For example, if I want a 6 item
> sequence, I might want:
>
> 60% from list 1 (main_list[0])
> 30% from list 2 (main_list[1])
> 10% from list 3 (main_list[2])
>
> I know how to pull a random sequence (using random()) from the lists,
> but I'm not sure how to pick it with the desired percentages.



If the percentages can be normalized to small integers, just make a
pool where each list is repeated according to its weight, e.g.
list1 occurs 6 times, list2 3 times, and list3 once:

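import random

pools = [list1, list2, list3]
weights = [6, 3, 1]
sample_size = 10

# Repeat each pool according to its weight.
weighted_pools = []
for p, w in zip(pools, weights):
    weighted_pools.extend([p] * w)

# Pick a pool (weighted), then pick an item uniformly from that pool.
sample = [random.choice(random.choice(weighted_pools))
          for _ in range(sample_size)]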


Another option is to use bisect() to choose a pool:

import bisect
import random

pools = [list1, list2, list3]
sample_size = 10

def isum(items, sigma=0.0):
    # Yield the running total of items.
    for item in items:
        sigma += item
        yield sigma

cumulated_weights = list(isum([60, 30, 10], 0))
sigma = cumulated_weights[-1]

sample = []
for _ in range(sample_size):
    # random.random()*sigma lies in [0, sigma), so bisect returns a valid index.
    pool = pools[bisect.bisect(cumulated_weights, random.random()*sigma)]
    sample.append(random.choice(pool))

(all code untested)
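
For readers on Python 3.6 or later (which did not exist when this thread was written), the standard library's random.choices accepts a weights argument and can stand in for either the repeated pool or the bisect lookup. A minimal sketch, reusing the lists from the original question:

```python
import random

list1 = ['a', 'b', 'c']
list2 = ['dog', 'cat', 'panda']
list3 = ['blue', 'red', 'green']

pools = [list1, list2, list3]
weights = [60, 30, 10]
sample_size = 10

# Choose a pool for each slot according to the weights...
chosen_pools = random.choices(pools, weights=weights, k=sample_size)
# ...then pick one item uniformly from each chosen pool.
sample = [random.choice(pool) for pool in chosen_pools]
```

As with the two versions above, the percentages hold only on average, not exactly for any one sample.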

Peter
 
 
Scott David Daniels
 
      10-08-2005
Jesse Noller wrote:
<paraphrased>
> Once main_list is populated, I want to build a sequence from items
> within the lists, "randomly" with a defined percentage of the sequence
> coming from the various lists. For example:
> 60% from list 1 (main_list[0]), 30% from list 2 (main_list[1]), 10% from list 3 (main_list[2])



import bisect, random
main_list = [['a', 'b', 'c'],
             ['dog', 'cat', 'panda'],
             ['blue', 'red', 'green']]
weights = [60, 30, 10]

# Build the running totals of the weights: [60, 90, 100].
cumulative = []
total = 0
for value in weights:
    total += value
    cumulative.append(total)

for i in range(20):
    # Scale a random number to [0, total) and find which weight band it falls in.
    score = random.random() * total
    index = bisect.bisect(cumulative, score)
    print(random.choice(main_list[index]), end=' ')
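
To see how the bisect lookup maps a score onto a list, here is a small worked check on the cumulative totals for the weights 60, 30, 10. The helper name list_index_for and the probe scores are picked for illustration only.

```python
import bisect

cumulative = [60, 90, 100]   # running totals of the weights 60, 30, 10

def list_index_for(score):
    """Map a score in [0, 100) to the index of the list it selects."""
    return bisect.bisect(cumulative, score)

# Scores below 60 select list 0, scores from 60 up to (but not including) 90
# select list 1, and scores from 90 up select list 2.
boundaries = [(s, list_index_for(s)) for s in (0, 59.9, 60, 89.9, 90, 99.9)]
# boundaries == [(0, 0), (59.9, 0), (60, 1), (89.9, 1), (90, 2), (99.9, 2)]
```

Because random.random() never returns 1.0, the score stays below the last total and the index is always valid.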


--
-Scott David Daniels

 
 
Steven D'Aprano
 
      10-09-2005
On Sat, 08 Oct 2005 12:48:26 -0400, Jesse Noller wrote:

> Once main_list is populated, I want to build a sequence from items
> within the lists, "randomly" with a defined percentage of the sequence
> coming from the various lists. For example, if I want a 6 item
> sequence, I might want:
>
> 60% from list 1 (main_list[0])
> 30% from list 2 (main_list[1])
> 10% from list 3 (main_list[2])


If you are happy enough to match the percentages statistically rather than
exactly, simply do something like this:

import random

def pick_item(main_list):  # function name is illustrative
    pr = random.random()
    if pr < 0.6:
        list_num = 0
    elif pr < 0.9:
        list_num = 1
    else:
        list_num = 2
    return random.choice(main_list[list_num])

or however you want to extract an item.

On average, 60% of the items will come from list1, 30% from list2 and 10%
from list3, but for small numbers of trials the observed proportions may
differ significantly.
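
A quick simulation makes the small-sample effect visible. This is only a sketch: the function names, the seed and the trial counts are arbitrary choices for the example, not part of the suggestion above.

```python
import random
from collections import Counter

def pick_list_num(rng):
    """Same thresholds as above: 60% list 0, 30% list 1, 10% list 2."""
    pr = rng.random()
    if pr < 0.6:
        return 0
    elif pr < 0.9:
        return 1
    return 2

def observed_shares(trials, seed):
    """Return the fraction of picks that landed on each list."""
    rng = random.Random(seed)   # seeded so the run is repeatable
    counts = Counter(pick_list_num(rng) for _ in range(trials))
    return [counts[i] / trials for i in range(3)]

small = observed_shares(10, seed=1)       # can be far from 0.6 / 0.3 / 0.1
large = observed_shares(100000, seed=1)   # settles close to 0.6 / 0.3 / 0.1
print(small, large)
```

With only 10 trials each pick shifts a share by a full 0.1, so results such as 0.8/0.2/0.0 are quite possible. With 100000 trials the shares normally land within a fraction of a percentage point of the targets.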



--
Steven.

 