- **Python**
(*http://www.velocityreviews.com/forums/f43-python.html*)

- - **getting a submatrix of all true**
(*http://www.velocityreviews.com/forums/t319169-getting-a-submatrix-of-all-true.html*)

getting a submatrix of all trueI have a largish data set (1000 observations x 100 floating point variables), and some of the of the data are missing. I want to try a variety of clustering, neural network, etc, algorithms on the data, and to keep life simple I want to reduce the dimensions of the matrix so that I have no missing values, since not all the algorithms are able to handle them and there is sufficient redundancy in the variables that I can afford to lose some. I am currently using a hack that works, but it makes me wonder if there is an optimal solution. I define optimal as the removal of rows and columns such that there are no missing values and max(numRows*numCols). My current approach is to drop rows (observations) that have more than some prespecified number of missing variables, and then drop the columns (variables) of the reduced data set that have any missing values. I chose the threshold for dropping a row by eyeballing the distribution of number of missing variables per observation, pick a number on the low end of the distribution, and dropping the rows that exceed the threshold. Another way of formulating the question: for a sparse boolean matrix (sparse on True), what is the optimal way to remove rows and columns so that the total number of elements in the matrix is maximal and there are no True values left. Example: 0 0 0 0 0 0 candidate sub matrix has 12 elements 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 candidate submatrix has 15 elements 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 candidate submatrix has 8 elements 0 0 0 0 I want to programatically extract the 15 element matrix Following the approach described above, I get the desired answer in the example below, though this is a hack solution and I have the feeling there is a better one. from Numeric import nonzero, array, take, sum X = array([[1, 0, 0, 0, 1], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 1, 0, 0]]) goodObsInd = nonzero(sum(X,1)<2) # observations with < 2 missing variables X = take(X, goodObsInd) # drop the bad goodVarInd = nonzero(sum(X)==0) # variables with no missing data X = take(X, goodVarInd, 1 ) # drop the bad variables print X John Hunter |

Re: getting a submatrix of all trueOn Wed, 02 Jul 2003 14:16:57 -0500, John Hunter <jdhunter@ace.bsd.uchicago.edu> wrote:
> >I have a largish data set (1000 observations x 100 floating point >variables), and some of the of the data are missing. I want to try a >variety of clustering, neural network, etc, algorithms on the data, >and to keep life simple I want to reduce the dimensions of the matrix >so that I have no missing values, since not all the algorithms are >able to handle them and there is sufficient redundancy in the >variables that I can afford to lose some. > >I am currently using a hack that works, but it makes me wonder if >there is an optimal solution. I define optimal as the removal of rows >and columns such that there are no missing values and >max(numRows*numCols). > >My current approach is to drop rows (observations) that have more than >some prespecified number of missing variables, and then drop the >columns (variables) of the reduced data set that have any missing >values. I chose the threshold for dropping a row by eyeballing the >distribution of number of missing variables per observation, pick a >number on the low end of the distribution, and dropping the rows that >exceed the threshold. > >Another way of formulating the question: for a sparse boolean matrix >(sparse on True), what is the optimal way to remove rows and columns >so that the total number of elements in the matrix is maximal and >there are no True values left. > > >Example: > > 0 0 0 > 0 0 0 candidate sub matrix has 12 elements > 0 0 0 > 0 0 0 > >1 0 0 0 1 >0 0 0 0 0 0 0 0 0 0 >0 0 0 0 0 0 0 0 0 0 candidate submatrix has 15 elements >0 0 0 0 0 0 0 0 0 0 >0 0 1 0 0 > > 0 0 > 0 0 candidate submatrix has 8 elements > 0 0 > 0 0 > >I want to programatically extract the 15 element matrix If I understand your original optimality definition, that would be suboptimal. I.e., how about the 16-element matrix? (x's mark the corresponding zeroes) 1 0 0 0 1 x x 0 x x x x 0 x x x x 0 x x x x 1 x x Or do they have to be adjacent? > >Following the approach described above, I get the desired answer in >the example below, though this is a hack solution and I have the >feeling there is a better one. > > from Numeric import nonzero, array, take, sum > > X = array([[1, 0, 0, 0, 1], > [0, 0, 0, 0, 0], > [0, 0, 0, 0, 0], > [0, 0, 0, 0, 0], > [0, 0, 1, 0, 0]]) > > goodObsInd = nonzero(sum(X,1)<2) # observations with < 2 missing variables > X = take(X, goodObsInd) # drop the bad > > goodVarInd = nonzero(sum(X)==0) # variables with no missing data > X = take(X, goodVarInd, 1 ) # drop the bad variables > > print X > Brute force seems to work on this little example. Maybe it can be memoized and optimized and/or whatever to handle your larger matrix fast enough? ====< submatrix.py >================================ X = [ [1, 0, 0, 0, 1], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 1, 0, 0] ] def optimrcs(mx, rows=[], cols=[]): if not rows or not cols: return 0,[],[] maxscore = 0 for r in rows: if not mx[r].count(1): continue for c in cols: if not mx[r][c]: continue # eval column submatrix vs row submatrix colsxthis = cols[:]; colsxthis.remove(c) colscore, rset, cset = optimrcs(mx, rows, colsxthis) if colscore>maxscore: maxscore, maxrows, maxcols = colscore, rset, cset rowsxthis = rows[:]; rowsxthis.remove(r) rowscore, rset, cset = optimrcs(mx, rowsxthis, cols) if rowscore >= maxscore: maxscore, maxrows, maxcols = rowscore, rset, cset if not maxscore: return len(rows)*len(cols), rows, cols return maxscore, maxrows,maxcols if __name__ == '__main__': score, rowsel, colsel = optimrcs(X, range(5), range(5)) print 'Score = %s, rows = %s, cols = %s' % (score,rowsel,colsel) for r in range(5): row = X[r] for c in range(5): if r in rowsel and c in colsel: print '<%s>'% row[c], else: print ' %s '% row[c], ================================================== == Running it results thus: [21:24] C:\pywk\clp>submatrix.py Score = 16, rows = [1, 2, 3, 4], cols = [0, 1, 3, 4] 1 0 0 0 1 <0> <0> 0 <0> <0> <0> <0> 0 <0> <0> <0> <0> 0 <0> <0> <0> <0> 1 <0> <0> Regards, Bengt Richter |

Re: getting a submatrix of all true"John Hunter" <jdhunter@ace.bsd.uchicago.edu> wrote in message news:mailman.1057173484.29754.python-list@python.org... > > I have a largish data set (1000 observations x 100 floating point > variables), and some of the of the data are missing. All too typical -- missing data are the bane of statistics. > I want to try a > variety of clustering, neural network, etc, algorithms on the data, > and to keep life simple I want to reduce the dimensions of the matrix > so that I have no missing values, since not all the algorithms are > able to handle them and there is sufficient redundancy in the > variables that I can afford to lose some. Statisticians have tried a variety of approaches. Googling for ' statistics "missing data" 'will give you some leads if you want. Terry J. Reedy |

Re: getting a submatrix of all true>>>>> "Bengt" == Bengt Richter <bokr@oz.net> writes:
Bengt> If I understand your original optimality definition, that Bengt> would be suboptimal. I.e., how about the 16-element Bengt> matrix? (x's mark the corresponding zeroes) Bengt> Or do they have to be adjacent? No, in general they won't be. I missed that one. Just goes to show that my "solution" is not one, which I knew. But it does efficiently deliver good solutions, where good means large but not optimal. Bengt> Brute force seems to work on this little example. Maybe it Bengt> can be memoized and optimized and/or whatever to handle Bengt> your larger matrix fast enough? Thanks for the example. Unfortunately, it is too slow even for moderate size matrices (30,10). I've been running it for over two hours for a 30x10 matrix and it hasn't finished. And my data are 1000x100! Here is a little code to generate larger matrices for testing.... from Numeric import zeros, array, Int, put, reshape from RandomArray import uniform numRows, numCols = 30,10 numMissing = int(0.05*numRows*numCols) # 5% missing X = zeros( (300,)) ind = uniform(0, numRows*numCols, (numMissing,)).astype(Int) put(X,ind,1) X = reshape(X, (numRows,numCols)) # turn it into a list for your code.... X = [ [val for val in row] for row in X] for row in X: print row Last night, I began to formulate the problem as a logic statement, hoping this would give me an idea of how to proceed. But no progress yet. But I have come to the conclusion that with P ones, brute force requires 2^P combinations. With 1000x100 with 5% missing that gives me 2^5000. Not good. Thanks for your suggestion, though. If you have any more thoughts, let me know. John Hunter |

Re: getting a submatrix of all true>>>>> "Terry" == Terry Reedy <tjreedy@udel.edu> writes:
Terry> Statisticians have tried a variety of approaches. Googling Terry> for ' statistics "missing data" 'will give you some leads Terry> if you want. I have done some searching. I'm familiar with the common methods (delete every row that contains any missing, replace missing via mean or regression or something clever) but haven't seen any discussion of dropping variables and observations together to yield data sets with no missing values. Have you seen something like this? John Hunter |

Re: getting a submatrix of all true"John Hunter" <jdhunter@ace.bsd.uchicago.edu> wrote in message news:mailman.1057250709.8853.python-list@python.org... > >>>>> "Terry" == Terry Reedy <tjreedy@udel.edu> writes: > > > Terry> Statisticians have tried a variety of approaches. Googling > Terry> for ' statistics "missing data" 'will give you some leads > Terry> if you want. > > I have done some searching. I'm familiar with the common methods > (delete every row that contains any missing, replace missing via mean > or regression or something clever) but haven't seen any discussion of > dropping variables and observations together to yield data sets with > no missing values. Have you seen something like this? There are also calculation methods for at least some analyses that allow for missing data . One of the google hits is for the book Statistical Analysis with Missing Data. I have not seen it, but it is not unique. As I hinted, there are no really nice solutions to missing data. I have done both row and column deletion. Sometimes I have done multiple analyses with different deletion strategies: once with enough vars deleted so all or most cases are complete, and again with enough cases deleted so that all or most var are complete. I would start by counting (with a program) the number of missing values for each row and then construction the frequency distribution thereof. Then the same for the columns, with the addition of a correlation table or tables. One thing one can do with vars is to combine some to make a composite measure. For instance, if three variables more-or-less measure the same thing, one can combine (perhaps by the mean of those present) to make one variable that is present if any of the three are, so it is only missing for cases (rows) that are missing all three. This type of work requires that you look at the variables and consider their meaning, rather than just inputing them into a blind proceedure that consisders all vars to be the same. Terry J. Reedy |

Re: getting a submatrix of all truebokr@oz.net (Bengt Richter) wrote in message news:<be0b84$89a$0@216.39.172.122>...
> On Wed, 02 Jul 2003 14:16:57 -0500, John Hunter <jdhunter@ace.bsd.uchicago.edu> wrote: > >I am currently using a hack that works, but it makes me wonder if > >there is an optimal solution. I define optimal as the removal of rows > >and columns such that there are no missing values and > >max(numRows*numCols). There must be! For each missing entry you can choose to drop either the row or the column - giving a maximum of 2^N possible ways for N missing elements. Given that the order in which the rows/columns are deleted is unimportant the number of possibilities to test is less than that, but the difficulty is that your decision about row or column for one element affects the decision for other elements if they share rows or columns. Graph theoretic approaches probably help - it's similar to reordering matrix elements to avoid fill in when decomposing sparse matrices. Depending on N, you might be able to test all possibilities in a reasonable time, but I have a nasty feeling you need to check all 2^M to be sure of an optimal solution, where M is the subset of N which either a share a row or column with another problematic element. > >Another way of formulating the question: for a sparse boolean matrix > >(sparse on True), what is the optimal way to remove rows and columns > >so that the total number of elements in the matrix is maximal and > >there are no True values left. Sadly I can only give you a method which is certainly not optimal, but at least finds something which is probably not too bad, and fairly quickly. Uses the Numeric module to speed up, so that a few thousand by a few hundred is quickly computable. I'm curious to know if it does much better or worse than other algorithms. Interesting problem! Cheers, Jon ================================================== =================== from Numeric import * def pickrcs(a,debug=0): b=array(a,copy=1) # local copy of array is modified dr=zeros(b.shape[0]) # Arrays to note deleted rows... dc=zeros(b.shape[1]) # ... and columns sizeleft=[b.shape[0],b.shape[1]] # Remaining dimensions while 1: # Keep deleting till none left sr=sum(b,1) # Column scores = ij's to remove sc=sum(b,0) # Row scores mr=argmax(sr) # index highest row score mc=argmax(sc) # index highest column score if sr[mr]==0 and sc[mc]==0 : break # stop when nothing to delete # pick the row/column to delete fewest useful elements if sizeleft[1]-sr[mr] > sizeleft[0]-sc[mc]: b[:,mc]=0 # Zero out column dc[mc]=1 # tags column as deleted sizeleft[1]-=1 else: b[mr,:]=0 # Zero out row dr[mr]=1 sizeleft[0]-=1 # end of deletions loop - should be no missing elements now # # paranoia!, check if anything was left after deletions if sum(dr)<b.shape[0] and sum(dc)<b.shape[1]: dr=compress(dr==0,range(b.shape[0])) # gives return format of dc=compress(dc==0,range(b.shape[1])) # optimrcs posted by Bengt score=dr.shape[0]*dc.shape[0] # Richter return score,dr,dc else: print "Sorry, didn't manage to let anything survive" return 0,0,0 # test code X = [ [1, 0, 0, 0, 1], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 1, 0, 0] ] if __name__ == '__main__': a=array(X) score, rowsel, colsel = pickrcs(a) print "score=",score for r in range(5): row = X[r] for c in range(5): if r in rowsel and c in colsel: print '<%s>'% row[c], else: print ' %s '% row[c], # now try for a larger problem - use a random pattern import RandomArray,time start=time.time() x0=1000 # problem size x1=600 a=RandomArray.random((x0,x1)) a=where(a>0.999,1,0) # ~ 0.1% of elements are 1 print "Using a large random matrix" print "Number of non-zeros=",sum(ravel(a)),"in",x0,"x",x1,"matrix" score, rowsel, colsel = pickrcs(a) print 'Score = %s' % (score) print 'Number of rows deleted=',a.shape[0]-rowsel.shape[0] print 'Number of columns deleted=',a.shape[1]-colsel.shape[0] print time.time()-start," seconds" ==sample output (of course second part is random!)=============== score= 16 1 0 0 0 1 <0> <0> 0 <0> <0> <0> <0> 0 <0> <0> <0> <0> 0 <0> <0> <0> <0> 1 <0> <0> Using a large random matrix Number of non-zeros= 607 in 1000 x 600 matrix Score = 331776 Number of rows deleted= 424 Number of columns deleted= 24 3.38999998569 seconds |

Re: getting a submatrix of all trueJohn Hunter <jdhunter@ace.bsd.uchicago.edu> wrote:
>Another way of formulating the question: for a sparse boolean matrix >(sparse on True), what is the optimal way to remove rows and columns >so that the total number of elements in the matrix is maximal and >there are no True values left. After having read Bengts code (and scraping the parts of my exploded head from the floor) and after later getting the hint about the size of the problem being 2**N, it finally dawned on me that the problem would be related to getting all possible combinations of a range. The following code has the advantage that it generates solutions by index, for example "M[i]" returns the score and the row and column information for index i. (note that is not the same as finding the optimal solution, but it still could be handy to be able to generate a specific one by index) However for small matrices it is possible to do exhaustive searching with "max(M)". Maybe if this would be combined with some heuristic algorithm it would be better (genetic algorithm?) Code below (needs Python 2.3, very lightly tested), I hope this helps, Anton class Combinations: def __init__(self,n,k): self.n,self.k,self.count = n,k,self.noverk(n,k) def __getitem__(self,index): #combination number index if index > self.count - 1: raise IndexError res,x,rest = [],self.n,index for i in range(self.k,0,-1): while self.noverk(x,i) > rest : x -= 1 rest -= self.noverk(x,i) res.append(x) return res def noverk(self,n,k): return reduce(lambda a,b: a*(n-b)/(b+1),range(k),1) class AllCombinations: def __init__(self,n): self.n, self.count = n, 2**n def __getitem__(self,index): #combination number index for all k in combinations(n,k) if index > self.count - 1: raise IndexError n,rest = self.n,index for k in range(n+1): c = Combinations(n,k) if rest - c.count < 0: return c[rest] rest -= c.count class ScoreMatrix: def __init__(self, X): self.AC = AllCombinations(len(X)) self.X = X def __getitem__(self,i): #score selected rows and columns by index rows = self.AC[i][::-1] if rows: Y = [self.X[i] for i in rows] Z = [(i,z) for i,z in enumerate(zip(*Y)) if 1 not in z] cols = [i for i,z in Z] score = sum(map(len,[z for i,z in Z])) return score,rows,cols else: return 0,[],[] def test(): X = [ [1, 0, 0, 0, 1], [0, 0, 0, 0, 0], [0, 0, 0, 1, 0], [0, 0, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 1, 0, 0], ] M = ScoreMatrix(X) sc,rows,cols = max(M) print sc for i,r in enumerate(X): for j,c in enumerate(r): if i in rows and j in cols: print c, else: print 'x', if __name__=='__main__': test() output: 16 x x x x x 0 0 x 0 0 x x x x x 0 0 x 0 0 0 0 x 0 0 0 0 x 0 0 |

Re: getting a submatrix of all trueJohn Hunter <jdhunter@ace.bsd.uchicago.edu> wrote:
>>>>>> "Bengt" == Bengt Richter <bokr@oz.net> writes: > Bengt> Brute force seems to work on this little example. Maybe it > Bengt> can be memoized and optimized and/or whatever to handle > Bengt> your larger matrix fast enough? > >Thanks for the example. Unfortunately, it is too slow even for >moderate size matrices (30,10). I've been running it for over two >hours for a 30x10 matrix and it hasn't finished. And my data are >1000x100! I posted a somewhat long version before (inspired by Bengts idea) but I remembered an easier way to generate all combinations, and also I noticed that there is freedom in choosing to analyze rows or columns, which can be computationally advantageous. Below is a version that depends mostly on 2**N where N is the smallest of those two values. Unfortunately 2**100 is still way to big a number, but my code can now routinely solve 100x15 matrices ... >Last night, I began to formulate the problem as a logic statement, >hoping this would give me an idea of how to proceed. But no progress >yet. But I have come to the conclusion that with P ones, brute force >requires 2^P combinations. With 1000x100 with 5% missing that gives >me 2^5000. Not good. IMO it depends (except for the amount time spent in copying data of course) on the smallest of the number of rows or columns, so that's 2**100 in this case. Maybe for some matrices, your number is preferable? >Thanks for your suggestion, though. If you have any more thoughts, >let me know. I hope you don't mind me following this too :-) Nice problem, and I'm not so sure about my math as I sound above, if anyone can prove me right or wrong I'd be happy anyway ... Anton --- class ScoreMatrix: def __init__(self, X): n1,n2 = len(X),len(X[0]) if n2<n1: self.X = zip(*X) self.rotated = True n = n2 else: self.X = X n = n1 self.rotated = False self.R = range(n) self.count = 2**n def __getitem__(self,i): #score selected rows and columns by index if not (-1<i<self.count): raise IndexError rows =[x for j,x in enumerate(self.R) if 1<<j &i] if rows: Y = [self.X[i] for i in rows] Z = [(i,z) for i,z in enumerate(zip(*Y)) if 1 not in z] cols = [i for i,z in Z] score = sum(map(len,[z for i,z in Z])) if self.rotated: return score,cols,rows else: return score,rows,cols else: return 0,[],[] def test(): from random import random,randint f = .05 r,c = 100,15 X=[] for i in range(r): X.append([]) for j in range(c): X[i].append(int(random()<f)) M = ScoreMatrix(X) highscore = 0,[],[] for i,sc in enumerate(M): if sc > highscore: mi,highscore = i,sc sc,rows,cols = highscore print "maximum score: %s" %(sc) print "index of this score: %s" %(mi) print "tabledata:" for i,r in enumerate(X): for j,c in enumerate(r): if i in rows and j in cols: print c, else: print 'x', if __name__=='__main__': test() |

Re: getting a submatrix of all trueFolllowing Bengt and anton@vredegoor.doge.nl (Anton Vredegoor)'s leads,
the following code can be fast (at times). It is quite sensitive to the probability of non-zeroness, (.01 works well, the .o5 is nowhere near so nice). I get 2.25 min to do 25 x 25 at .05 2.75 min to do 30 x 30 at .05 It gets slow _very_ fast, but gets good numbers if the probability is low .25 min to do 45 x 45 at .01 1.5 min to do 50 x 50 at .01 -Scott David Daniels Scott.Daniels@Acm.Org (not fully back up yet, though). Here's the code: ################################ # myopt.py __version__ = '0.3' try: assert list(enumerate('ab')) == [(0, 'a'), (1, 'b')] except NameError: def enumerate(iterable): # Not exact, but close enough lst = list(iterable) return zip(range(len(lst)), lst) def bitcount(row): '''Return the number of on bits in the integer argsA''' assert row >= 0 result = 0 while row: result += 1 lsbit = row & -row row ^= lsbit return result def rowencode(vector): '''convert from a buncha numbers to a bit-representation''' bit = 1L result = 0 for element in vector: if element: result |= bit bit <<= 1 return result def rowdecode(value, mask=-1): '''convert from a bit-rep to a buncha numbers (at least as long as mask)''' if mask < 0: mask = value assert value >= 0 result = [] while mask or value: result.append(value & 1) mask >>= 1 value >>= 1 return result class Answer(object): '''An answer represents a result calculation.''' __slots__ = 'rows', 'colmask', '_score' totalrequests = 0 totalcalcs = 0 def __init__(self, colmask, rows): '''The columns in colmask are the ones we keep.''' self.colmask = colmask self.rows = rows self._score = None def score(self): '''Calculate the score lazily''' self.__class__.totalrequests += 1 if self._score is None: self.__class__.totalcalcs += 1 self._score = bitcount(self.colmask) * len(self.rows) return self._score def __repr__(self): return '%s(%d:%s, %x):%s' % (self.__class__.__name__, len(self.rows), self.rows, self.colmask, self._score) totalcalls = 0 def choose(rows, keepcols, N=0, keeprows=None): '''Choose rows and columns to keep. Return an Answer for the choice''' global totalcalls totalcalls += 1 if keeprows is None: keeprows = [] try: while 0 == rows[N] & keepcols: keeprows.append(N) N += 1 except IndexError: return Answer(keepcols, keeprows) # a difference: some kept columns in this row are non-0 # must drop either those columns or this row # Calculate result if we keep this row (drop problem columns) withrow = choose(rows, keepcols & ~rows[N], N+1, keeprows + [N]) # Calculate result if we drop this row skiprow = choose(rows, keepcols, N+1, keeprows) # Choose the better answer (keep the row if everything is equal). if (withrow.colmask == skiprow.colmask or withrow.score() >= skiprow.score()): return withrow else: return skiprow # The data used from the example X = [ [1, 0, 0, 0, 1], [0, 0, 0, 0, 0], [0, 0, 0, 1, 0], [0, 0, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 1, 0, 0] ] def printrep(row, symbols, mask=0): '''A row representing a single row (column-masked by mask)''' assert mask >= 0 result = [] for element in row: result.append(symbols[(1 & mask) * 2 + (element != 0)]) mask >>= 1 assert mask == 0 # mask doesn't extend beyond data. return ''.join(result) def printanswer(data, rows, keepcols): '''Print the represented row''' toohuge = len(data) rowqueue = rows + [toohuge] rowqueue.reverse() nextrow = rowqueue.pop() for rownumber, row in enumerate(data): if nextrow > rownumber: # This row was zapped print '#', printrep(row, '01OI', keepcols) else: assert rownumber == nextrow # This row was kept nextrow = rowqueue.pop() print '_', printrep(row, '01~@', keepcols) assert nextrow == toohuge and not rowqueue def getanswer(data): '''Calculate the best-cut for a particular matrix''' columns = max([len(row) for row in data]) rowdata = [rowencode(row) for row in data] return choose(rowdata, (1L << columns) - 1) def main(data=X): global totalcalls totalcalls = 0 answer = getanswer(data) print 'Requested: %s, Calculated: %s,' % ( Answer.totalrequests, Answer.totalcalcs), print 'answer: %r,' % answer, print 'Score: %s' % answer.score() print 'Total choose calls required: %s' % totalcalls printanswer(data, answer.rows, answer.colmask) def rangen(rows, columns, probability=0.05): '''create a rows by columns data table with 1s at the given probability''' import random result = [] for row in range(rows): result.append([int(random.random() < probability) for column in range(columns)]) return result if __name__ == '__main__': import sys assert getanswer([[0]]).score() == 1 assert getanswer([[0,1], [1,0]]).score() == 1 assert getanswer([[0,1,0], [1,0,0]]).score() == 2 if len(sys.argv) < 2: main(X) else: args = sys.argv[1:] if '.' in args[-1]: assert len(args) > 1 probability = float(args.pop()) else: probability = .2 rows = int(args[0]) if len(args) == 1: cols = rows else: assert len(args) == 2 cols = int(args[1]) main(rangen(rows, cols, probability)) |

All times are GMT. The time now is 09:26 AM. |

Powered by vBulletin®. Copyright ©2000 - 2014, vBulletin Solutions, Inc.

SEO by vBSEO ©2010, Crawlability, Inc.